[ https://issues.apache.org/jira/browse/CAMEL-14952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17147637#comment-17147637 ]
Aemie commented on CAMEL-14952:
-------------------------------

[~aashnajena] if you want, I added *apache_demo.json*; you can check it. As far as I tested, it works for both components and manuals.

> Better search on the website
> ----------------------------
>
>                 Key: CAMEL-14952
>                 URL: https://issues.apache.org/jira/browse/CAMEL-14952
>             Project: Camel
>          Issue Type: Improvement
>          Components: website
>            Reporter: Zoran Regvart
>            Priority: Major
>         Attachments: BH4D9OD16A_apache_camel_20200608-20200614_no_result_searches.csv,
>                      List_Of_Crawled_Pages_by_DocSearch.txt, apache_demo.json, camel.json,
>                      image-2020-06-13-14-39-08-776.png, list_of_crawled_pages.txt,
>                      sitemap-camel.png
>
>
> We use [Algolia|http://algolia.com/] for the search functionality on the
> website using their [free plan|https://www.algolia.com/for-open-source/] for
> Open Source projects. The index is built by Algolia's crawler using
> [DocSearch|https://docsearch.algolia.com/].
> When this was done we built our own UI on top of the Algolia JavaScript API,
> as one of the requirements is that clients use Algolia's JavaScript clients.
> We did not use the Algolia UI as at that point it was a rather large
> JavaScript dependency to add and it would have slowed down the loading of
> the website.
> We also have some [initial
> work|https://github.com/apache/camel-website/pull/74] on creating our own
> Algolia index at build time.
> The current search doesn't seem to index the whole website: some results
> don't appear in the search, and it looks like most of the content from
> Antora is not indexed. For example, searching for {{removeHeader}} does not
> find the [FAQ
> entry|https://camel.apache.org/manual/latest/faq/how-to-avoid-sending-some-or-all-message-headers.html].
> There's also a list of failed searches on the Algolia dashboard that we can
> use to benchmark the search.
> What we need is to build the search index over the whole content. The
> approach taken in [#74|https://github.com/apache/camel-website/pull/74] is a
> good start for Hugo-generated content. We need to expand that to
> Antora-built content as well.
> This search index would be built at website build time and would include
> both Hugo and Antora content in the same file, or possibly in several files
> if we use multi-index search. More on how indexes are built can be seen in
> the [Algolia
> documentation|https://www.algolia.com/doc/guides/sending-and-managing-data/prepare-your-data/].
> We need to figure out what data to send and how to integrate this with
> Antora. For Hugo we have a good idea from
> [#74|https://github.com/apache/camel-website/pull/74]. Importantly, the
> structure needs to be the same for both. One good source of inspiration on
> building the index for Antora content is the [Lunr.js
> integration|https://github.com/Mogztter/antora-lunr].
> We need to build the index with the search UI in mind, i.e. the index needs
> to contain the data we wish to present in the UI as well as enough content
> for Algolia to be able to perform the search. So starting with a mockup of
> what we wish to present/utilize in the search UI and deriving the data
> structure for the index from that would be a good start.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
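
To make the "same structure for Hugo and Antora content" point from the description more concrete, here is a rough sketch of what a shared record shape could look like. It is only an illustration: the field names (objectID, source, snippet and so on), the helper functions, the page titles and the page text are all assumptions made up for this example, not something taken from #74 or from camel.json. Only the FAQ URL comes from the issue.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass
class SearchRecord:
    """One index entry; every field name here is a hypothetical choice."""
    objectID: str   # stable identifier derived from the page URL
    source: str     # which generator produced the page: "hugo" or "antora"
    url: str        # link the search UI would navigate to
    title: str      # headline shown in the result list
    snippet: str    # short excerpt shown under the title
    content: str    # full text the search engine matches against


def make_record(source: str, url: str, title: str, text: str,
                snippet_len: int = 160) -> SearchRecord:
    """Build a record with the same shape regardless of the generator."""
    normalized = " ".join(text.split())  # collapse whitespace and newlines
    object_id = hashlib.sha1(url.encode("utf-8")).hexdigest()
    return SearchRecord(object_id, source, url, title,
                        normalized[:snippet_len], normalized)


def build_index(pages: list) -> str:
    """Serialize all pages into a single JSON array of records."""
    records = [asdict(make_record(**page)) for page in pages]
    return json.dumps(records, indent=2)


# Example input; titles and text are invented for illustration only.
pages = [
    {"source": "hugo",
     "url": "https://example.org/blog/",
     "title": "Blog",
     "text": "News  and\nannouncements."},
    {"source": "antora",
     "url": "https://camel.apache.org/manual/latest/faq/"
            "how-to-avoid-sending-some-or-all-message-headers.html",
     "title": "How to avoid sending some or all message headers?",
     "text": "Use removeHeader to drop a header."},
]
index_json = build_index(pages)
```

Starting from the UI mockup, the fields shown in a result (title, snippet, url) and the field used for matching (content) would be decided first, and both generators would then emit exactly that shape. Whether one file or several indexes is used is left open here.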