Hello,

llms.txt is a proposal that aims to standardize how websites expose their content to LLMs. I think the Apache Camel website is structured well enough that we can expose its content in this form with little effort. The /llms.txt file is similar to a sitemap, but it is written in markdown and designed for LLM consumption.
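To give a rough idea of the shape of such a file, here is a minimal sketch in Python. It assumes the layout described in the llms.txt proposal: an H1 title, a blockquote summary, then H2 sections containing markdown link lists. The Page class, the render_llms_txt function, the summary text and the link note are hypothetical names and values chosen for illustration; this is not the actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """One documentation page to list in llms.txt (hypothetical helper type)."""
    title: str
    url: str
    note: str = ""


def render_llms_txt(site_name, summary, sections):
    """Render an llms.txt-style markdown index.

    sections maps a heading to a list of Page entries.
    """
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for page in pages:
            entry = f"- [{page.title}]({page.url})"
            if page.note:
                entry += f": {page.note}"
            lines.append(entry)
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Summary and note below are illustrative placeholders.
    text = render_llms_txt(
        "Apache Camel",
        "Documentation index for LLM consumption.",
        {
            "Languages": [
                Page(
                    "Simple language",
                    "https://camel.apache.org/components/next/languages/simple-language.html.md",
                    "expression language reference",
                )
            ]
        },
    )
    print(text)
```

The point of the sketch is only that the index is plain markdown with links to markdown pages, so both crawlers and coding agents can read it without parsing HTML.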
I see two main benefits here:

1. When LLMs are trained, our documentation can be easily crawled and indexed through the standardized llms.txt format.
2. Coding agents like Gemini CLI, Claude Code, Cursor, etc. can use the llms.txt and markdown pages directly to provide accurate Apache Camel information.

Implementation attempt on camel-website:

- After Antora generates the HTML pages, a Gulp task converts them to markdown.
- The public folder now contains both an HTML and a markdown file for every page, e.g. advice-with.html and advice-with.html.md.
- The markdown files are cleaned up: only the important article section is extracted (no nav, headers, footers).
- An /llms.txt file is generated at the root with an overview and structure.

Results:

- 5,355+ markdown pages are generated automatically during the build.
- Almost all HTML pages can be accessed as markdown by appending .md to the URL (the .md after .html is just a proposal; there are no established best practices around it, so any input is welcome).

This way, both the HTML documentation, like https://camel.apache.org/components/next/languages/simple-language.html, and the corresponding markdown content, https://camel.apache.org/components/next/languages/simple-language.html.md, will be exposed.

This should make the Apache Camel documentation much more accessible to AI tools and future LLM training.

Draft Pull Request: https://github.com/apache/camel-website/pull/1437

Any feedback or suggestions are welcome!

Regards,
Federico
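
P.S. For anyone who wants to try the markdown pages from a script or an agent, here is a minimal sketch of how a client could derive the markdown URL from an HTML URL. It assumes the proposed convention of appending .md after .html is kept; the markdown_url function name is hypothetical, and the rule would need to change if a different convention is chosen.

```python
from urllib.parse import urlsplit, urlunsplit


def markdown_url(html_url: str) -> str:
    """Map an HTML page URL to its markdown twin by appending '.md' to the path.

    Assumes the proposed '<page>.html.md' convention; query strings and
    fragments are preserved unchanged.
    """
    parts = urlsplit(html_url)
    if not parts.path.endswith(".html"):
        raise ValueError(f"not an .html page: {html_url}")
    return urlunsplit(parts._replace(path=parts.path + ".md"))


if __name__ == "__main__":
    print(markdown_url(
        "https://camel.apache.org/components/next/languages/simple-language.html"
    ))
```

Appending to the path rather than to the whole URL string keeps anchors such as #some-section working on the derived URL.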
