What Is llms.txt? AI Crawler & LLM Rules Explained
Learn what llms.txt is, why it matters for AI and SEO, and how The SEO Guide Book uses it to set clear rules for Large Language Models.
About llms.txt
The llms.txt file is an experimental, community-driven standard that aims to give website owners more control over how Large Language Models (LLMs) and AI crawlers interact with their content. It works in a similar way to robots.txt, but instead of targeting traditional search engines, it is designed to communicate preferences to AI systems such as ChatGPT, Claude, Perplexity, Gemini, and others.
Why Does llms.txt Matter?
AI systems increasingly use web content to train their models or generate answers. However, there is currently no official global standard to control how this content is used. By publishing an llms.txt file, site owners can:
- Declare whether their content can be used for AI training.
- Indicate if summaries are permitted and under what conditions.
- Require attribution when content is referenced or quoted.
- Promote transparency and ethical use of online content.
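For illustration, a minimal llms.txt covering these points might look like the following. Since the format has no fixed specification yet, the directive names here (AI-Usage, Summaries, Attribution, Canonical, Last-Updated) are illustrative conventions, not an official syntax:

```text
# llms.txt — AI usage preferences for example.com
# Directive names are illustrative; there is no fixed specification yet.

AI-Usage: no-training            # do not use this content to train AI models
Summaries: allowed-with-attribution
Attribution: required
Canonical: https://example.com/
Last-Updated: 2025-01-01
```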
The SEO Guide Book’s Approach
We believe in open knowledge sharing — but also in protecting the work of creators. That’s why our llms.txt file clearly states:
- Our content must not be used to train commercial AI models without permission.
- Summarisation is allowed only if proper source attribution is given.
- Attribution must include a direct link to The SEO Guide Book.
This ensures that while AI systems may reference our resources, credit is always given where it’s due.
What Website Owners Should Know
If you run a website, adding an llms.txt file is optional, but it signals your stance on AI usage. While compliance isn’t guaranteed today, adoption is growing — and by acting early, you set the rules for how your work should be treated in the AI era.
At minimum, we recommend that site owners use:
- robots.txt for traditional search engines
- llms.txt for AI crawlers
- meta tags such as noai and noimageai
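The meta directives in that list are typically added as page-level tags. A hedged example follows; note that noai and noimageai are community conventions (popularised by art platforms), not part of any official robots standard, so support varies by crawler:

```html
<!-- Page-level opt-out signals; honoured by some AI crawlers, ignored by others -->
<meta name="robots" content="noai, noimageai">
```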
Final Thoughts
The llms.txt standard is still experimental, but we believe it represents a step towards a more balanced relationship between content creators and AI systems. By publishing our own file, we’re making our preferences clear — and encouraging others to do the same.
FAQs: llms.txt for AI crawlers
What is llms.txt?
llms.txt is an experimental text file placed at the root of a website to state your preferences for how Large Language Models and AI crawlers may use your content. It is similar in spirit to robots.txt but is targeted at AI systems rather than traditional search engine crawlers.
Is llms.txt an official standard?
No. It is not an official or universally enforced standard. Compliance depends on whether a given AI crawler chooses to read and respect it. You should pair it with established controls such as robots.txt and meta directives like noai and noimageai.
Where should I place the file?
Place the file at your site root so it is accessible at https://example.com/llms.txt. Keep it as plain UTF-8 text with Unix line endings where possible.
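Because the file is plain text, tooling around it can stay simple. As a sketch, a parser for one plausible shape of the file — assuming one "Key: value" directive per line with "#" comments, which is an assumption on our part since no official grammar exists:

```python
# Sketch: read simple "Key: value" directives from llms.txt-style text.
# Assumption: one directive per line; "#" starts a comment; blank lines ignored.
# (There is no official llms.txt grammar, so real files may differ.)

def parse_llms_txt(text: str) -> dict[str, str]:
    directives = {}
    for raw_line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        directives[key.strip()] = value.strip()
    return directives

example = """\
# AI usage preferences
AI-Usage: no-training
Attribution: required
"""

print(parse_llms_txt(example))
# → {'AI-Usage': 'no-training', 'Attribution': 'required'}
```

Note the comment-stripping on "#" would mangle directive values containing URL fragments; a production parser would need a more careful rule.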
Does llms.txt affect Google rankings?
No. It does not influence organic rankings. It is a disclosure of AI usage preferences, not a ranking signal.
How does llms.txt differ from robots.txt?
robots.txt is a long-standing convention for web crawlers used by search engines. llms.txt is a newer, community-led convention aimed at LLMs and AI scrapers. Use both if you want to communicate to both audiences.
Can I block AI training with llms.txt?
You can declare a preference such as AI-Usage: no-training, but enforcement is voluntary. For stronger protection, combine this with robots.txt user-agent blocks for known AI bots and page-level meta directives. Consider Terms of Use updates for legal clarity.
Which AI user-agents should I mention?
Commonly referenced agents include GPTBot, CCBot, ClaudeBot, PerplexityBot, Google-Extended, Amazonbot and others. Maintain a short, well-commented list and review it periodically.
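A short, well-commented robots.txt group covering these agents could look like the sketch below. The agent tokens are the ones named above; verify current names against each vendor's documentation before relying on them, as they change over time:

```text
# robots.txt — opt known AI crawlers out of the whole site
User-agent: GPTBot           # OpenAI
User-agent: CCBot            # Common Crawl
User-agent: ClaudeBot        # Anthropic
User-agent: PerplexityBot    # Perplexity
User-agent: Google-Extended  # Google AI training control
User-agent: Amazonbot        # Amazon
Disallow: /
```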
Should I use llm.txt or llms.txt?
There is no single agreed filename. Some sites use llm.txt; others use llms.txt. Pick one, document it on an explainer page and keep a consistent approach. You may optionally host both files with identical content to maximise discovery.
How can I encourage attribution?
State an attribution requirement explicitly, for example Attribution: required, and include a canonical link. You may also add a human-readable policy page and require attribution in your Terms of Use.
How do I keep the file up to date?
Review it quarterly or when new AI crawlers emerge. Log changes with a short “Last Updated” line and keep directives concise to avoid ambiguity.