llms.txt

llms.txt is a proposed plain-text file, placed at a site's root, that points AI crawlers and language models to the pages and facts the site considers most important. It is written in Markdown so a model can read it without parsing the full HTML of every page.

The idea is simple: rather than make a model wade through navigation, scripts, and boilerplate, you hand it a curated map of your best content with short descriptions and links. The format is deliberately plain. A heading names the site, a sentence or two describes what it does, and a list of links groups the pages worth reading, each with a one-line summary. It mirrors the spirit of robots.txt and sitemap.xml, but it is a community proposal, not an agreed standard, and no major AI provider has confirmed it changes how their systems crawl, rank, or cite.

The honest read is that llms.txt is cheap to ship and has little measurable effect today. The large answer engines build their understanding from pages they already fetch and from corroboration across the wider web, not from a self-declared file that any site could fill with marketing claims. Treating it as a ranking lever is a misread. Treating it as low-cost hygiene that may matter later, once tooling and conventions settle, is the fair position.

Consider a Shopify merchant selling merino base layers. They publish llms.txt at the store root listing their sizing guide, their wash-and-care page, and three top product pages, each with a short description such as "Lightweight 150gsm crew, sizes XS to XXL". This is tidy and free to maintain. What it will not do on its own is make ChatGPT or Perplexity recommend the brand when a shopper asks for a warm winter base layer. Those systems weigh the live product pages, the reviews on them, and what independent sites and forums say, not the merchant's own summary of itself.
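As a rough sketch, the merchant's file could be generated with a few lines of Python. The store name, the example.com URLs, and every summary other than the crew description quoted above are hypothetical placeholders. The blockquote used for the site description is an assumption based on the layout the llms.txt proposal suggests, not a confirmed requirement.

```python
from dataclasses import dataclass


@dataclass
class Link:
    title: str
    url: str
    summary: str


def render_llms_txt(site_name: str, description: str,
                    sections: dict[str, list[Link]]) -> str:
    """Render a site heading, a short description, and grouped link lists as Markdown."""
    lines = [f"# {site_name}", "", f"> {description}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for link in links:
            # One list item per page: linked title, then a one-line summary.
            lines.append(f"- [{link.title}]({link.url}): {link.summary}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical store data; only the crew description comes from the example above.
BASE = "https://example.com"
sections = {
    "Guides": [
        Link("Sizing guide", f"{BASE}/pages/sizing", "How to pick a size"),
        Link("Wash and care", f"{BASE}/pages/care", "Washing and drying merino"),
    ],
    "Products": [
        Link("Lightweight Crew", f"{BASE}/products/lightweight-crew",
             "Lightweight 150gsm crew, sizes XS to XXL"),
        Link("Midweight Zip", f"{BASE}/products/midweight-zip",
             "Placeholder summary for a second product"),
        Link("Base Layer Leggings", f"{BASE}/products/leggings",
             "Placeholder summary for a third product"),
    ],
}

llms_txt = render_llms_txt(
    "Example Merino Co.",
    "Merino base layers for cold-weather use.",
    sections,
)
print(llms_txt)
```

The returned string is what would be saved as llms.txt. How it gets served from the store root depends on the platform, and the function only formats text: it makes no claim about how any crawler treats the result.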

For AI search and answer engines, the file matters mainly as a signal of intent rather than a source of truth. A model can read a self-description, but it has every reason to discount claims a brand makes about itself and to favour evidence it can corroborate elsewhere. That asymmetry shapes how Google AI Overviews, Perplexity, and similar tools decide what to cite: they reward facts that hold up across sources, not assertions filed at a known path.

What actually drives AI citation is more mundane: clean, crawlable pages, clear and specific answers near the top of each one, structured data where it fits, and the same facts confirmed by independent sources a model already trusts. Reviews are one of the strongest forms of that corroboration, because they are customer language about the product, sitting on the product page, repeated across many buyers. Getting existing reviews readable, indexable, and cited by search and AI is the gap BeyondReviews closes, and it does far more for visibility than any single text file at the root. Ship llms.txt if you like the hygiene of it, then spend the real effort on the pages and proof a model can verify.
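To make "structured data where it fits" concrete, here is a minimal sketch that builds schema.org Product markup, including reviews and an aggregate rating, as JSON-LD for embedding in a product page. The product details, reviewer names, ratings, and URL are made up for illustration, and nothing here implies that any particular answer engine reads or rewards this markup.

```python
import json


def product_jsonld(name: str, description: str, url: str,
                   reviews: list[dict]) -> dict:
    """Build a schema.org Product object with its reviews and an average rating."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "description": description,
        "url": url,
        "review": [
            {
                "@type": "Review",
                "author": {"@type": "Person", "name": r["author"]},
                "reviewRating": {"@type": "Rating", "ratingValue": r["rating"]},
                "reviewBody": r["body"],
            }
            for r in reviews
        ],
    }
    if reviews:
        ratings = [r["rating"] for r in reviews]
        data["aggregateRating"] = {
            "@type": "AggregateRating",
            "ratingValue": round(sum(ratings) / len(ratings), 1),
            "reviewCount": len(ratings),
        }
    return data


def as_script_tag(data: dict) -> str:
    """Wrap the JSON-LD in a script tag suitable for a page template."""
    # Escape "</" so review text cannot close the script element early;
    # "\/" is a valid JSON escape for "/".
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical product and reviews, for illustration only.
reviews = [
    {"author": "Sample Buyer A", "rating": 5, "body": "Warm without bulk."},
    {"author": "Sample Buyer B", "rating": 4, "body": "Runs slightly long."},
]
markup = product_jsonld(
    "Lightweight Crew",
    "Lightweight 150gsm crew, sizes XS to XXL",
    "https://example.com/products/lightweight-crew",
    reviews,
)
tag = as_script_tag(markup)
print(tag)
```

The markup should only restate reviews that are visible on the same page, so the structured data and the customer language a model can read agree with each other.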
