[ https://issues.apache.org/jira/browse/SOLR-10299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16064927#comment-16064927 ]
Cassandra Targett commented on SOLR-10299:
------------------------------------------

Parsing the raw content is one approach that might be successful. Indexing the generated HTML is another option. Seeing what happens with {{bin/post}} on the HTML files would be another simple experiment to try. I'm not sure indexing the HTML would be preferable, but it would reflect what end-users see. We don't do this yet, but someday we will have raw content files that do not stand alone but are snippets included inside another file, which together become a single HTML page.

The harder questions IMO are going to be how to integrate it with the CMS, how to keep the index up to date, the facet options, the end-user UI, etc.

bq. One thing that might help in the short term could be enabling fuzzy search mentioned on https://github.com/christian-fei/Simple-Jekyll-Search ? the search.json file we have doesn't mention it and the docs doesn't specify whether it is true or false by default

As I've mentioned a few times to the list(s), we're currently using JavaScript to generate the title-keyword lookup that's in use now. That doesn't come from Jekyll, but from an open-source Jekyll theme that I borrowed for the basic layout of the pages. That JavaScript _can_ index the body when it's generated, but its author notes in his documentation that doing so can cause problems. I never had time to try it to see what these problems are, so I can't speak to it being a satisfactory stopgap. I'll guess, though, that the problems are related to performance, relevance, and proper parsing of text (only, you know, all the problems that we know plague inadequate attempts at full-text search).

If you are interested, though, here are the docs for the keyword lookup that's currently in place: http://idratherbewriting.com/documentation-theme-jekyll/mydoc_search_configuration.html.
You will see immediately the similarities between that site and ours ;)

I saw the Simple-Jekyll-Search project early on, but I suspect it's also going to be inadequate, for reasons similar to why the current JavaScript solution is inadequate. Since the theme I used already had a JavaScript-based lookup, I didn't bother to investigate another solution, in favor of other issues that needed to be dealt with. Perhaps it's worth a look, I'm not sure.

By the way, the title-keyword lookup was 100% intended as *the* stopgap solution. I knew it would be unsatisfactory, but I also know that despite all I know of Solr, I cannot carry the majority of the weight to make this feature happen.

> Provide search for online Ref Guide
> -----------------------------------
>
>                 Key: SOLR-10299
>                 URL: https://issues.apache.org/jira/browse/SOLR-10299
>             Project: Solr
>          Issue Type: Sub-task
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: documentation
>            Reporter: Cassandra Targett
>
> The POC to move the Ref Guide off Confluence did not address providing
> full-text search of the page content. Not because it's hard or impossible,
> but because there were plenty of other issues to work on.
> The current HTML page design provides a title index, but to replicate the
> current Confluence experience, the online version(s) need to provide a
> full-text search experience.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org