[ https://issues.apache.org/jira/browse/SOLR-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668682#comment-16668682 ]
Cassandra Targett commented on SOLR-12746:
------------------------------------------

I wrote up in the README in the branch that all you should need is the {{slim}} gem. You're running into a similar problem I had when I tried to integrate {{asciidoctor-html5s}} directly, so I did not integrate that project directly; I only copied the templates themselves.

However, I listed my own gems and realized that I've been running my tests with that gem installed; removing it causes the errors you probably saw about it being missing. I'll add it back in and take a look at what's happening and fix the README accordingly. Thanks for trying it out to find this.

> Ref Guide HTML output should adhere to more standard HTML5
> ----------------------------------------------------------
>
> Key: SOLR-12746
> URL: https://issues.apache.org/jira/browse/SOLR-12746
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Components: documentation
> Reporter: Cassandra Targett
> Assignee: Cassandra Targett
> Priority: Major
>
> The default HTML produced by Jekyll/Asciidoctor adds a lot of extra {{<div>}}
> tags to the content which break up our content into very small chunks. This
> is acceptable to a casual website reader as far as it goes, but any Reader
> view in a browser or another type of content extraction system that uses a
> similar "readability" scoring algorithm is going to either miss a lot of
> content or fail to display the page entirely.
> To see what I mean, take a page like
> https://lucene.apache.org/solr/guide/7_4/language-analysis.html and enable
> Reader View in your browser (I used Firefox; Steve Rowe told me offline
> Safari would not even offer the option on the page for him). You will notice
> a lot of missing content. It's almost like someone selected sentences at
> random.
> Asciidoctor has a long-standing issue to provide a better, more
> semantic-oriented HTML5 output, but it has not been resolved yet:
> https://github.com/asciidoctor/asciidoctor/issues/242
> Asciidoctor does provide a way to override the default output templates by
> providing your own in Slim, HAML, ERB or any other template language
> supported by Tilt (none of which I know yet). There are some samples
> available via the Asciidoctor project which we can borrow, but it's otherwise
> unknown as of yet what parts of the output are causing the worst of the
> problems. This issue is to explore how to fix it to improve this part of the
> HTML reading experience.
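
For anyone who wants to see the wrapper problem described in the issue in concrete terms, here is a rough Python sketch that reports how many {{<div>}} elements enclose each {{<p>}} in an HTML fragment. It is only an illustration: the two fragments are hand-written assumptions (the first is modeled on the kind of nesting the default output tends to produce, the second on a flatter layout), the helper name {{paragraph_div_depths}} is made up for this example, and the count is not the scoring algorithm that any browser's Reader View actually uses.

```python
from html.parser import HTMLParser


class DivDepthParser(HTMLParser):
    """Record how many <div> elements are open when each <p> starts."""

    def __init__(self):
        super().__init__()
        self.div_depth = 0          # number of currently open <div> tags
        self.paragraph_depths = []  # div depth observed at each <p>

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.div_depth += 1
        elif tag == "p":
            self.paragraph_depths.append(self.div_depth)

    def handle_endtag(self, tag):
        if tag == "div" and self.div_depth > 0:
            self.div_depth -= 1


def paragraph_div_depths(html):
    """Hypothetical helper: return the <div> nesting depth of every <p>."""
    parser = DivDepthParser()
    parser.feed(html)
    parser.close()
    return parser.paragraph_depths


# Hand-written fragments for illustration only (assumptions, not real Ref Guide output).
WRAPPED = (
    '<div class="sect1"><div class="sectionbody">'
    '<div class="paragraph"><p>First.</p></div>'
    '<div class="paragraph"><p>Second.</p></div>'
    '</div></div>'
)
SEMANTIC = '<section><p>First.</p><p>Second.</p></section>'

print(paragraph_div_depths(WRAPPED))   # [3, 3]
print(paragraph_div_depths(SEMANTIC))  # [0, 0]
```

Running something like this over a generated page before and after switching templates might give a quick signal of whether the custom templates are reducing the nesting, though it says nothing definite about how a given Reader View will treat the page.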