[ 
https://issues.apache.org/jira/browse/SOLR-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16659870#comment-16659870
 ] 

Cassandra Targett commented on SOLR-12746:
------------------------------------------

I've now updated the {{jira/solr-12746}} branch to master as of last night, 
added a couple more CSS fixes, added a license reference to NOTICE.txt [1], and 
updated the README and the {{dev-tools/scripts/jenkins.build.sh}} script for the 
proper Slim version as mentioned in the earlier comment [2].

[~arafalov], I think you were interested in this issue last week?

I think this is ready to go. I'll check it out a bit more before committing - 
thoughts/reviews are welcome.

[1] - I'm not sure I really needed to include the license, for three reasons: 1) 
we aren't distributing the templates at all, just the output of the templates; 
2) I borrowed only the templates, while the project they are from includes much 
more; and 3) I also modified the templates to make integration easier, so they 
aren't the same as the originals. Out of an abundance of caution and respect for 
the original author, I included a mention in NOTICE.txt anyway.

[2] - The need to define the Slim version is temporary. After I mentioned to 
the Asciidoctor project that I had the problem and that downgrading Slim fixed 
it, they were able to identify the Slim API changes in Slim's v4.0 release that 
caused the problem. Asciidoctor's future 1.5.8 release (which we'll consume in 
some way, eventually) will include the fix. This is the issue that has the fix: 
https://github.com/asciidoctor/asciidoctor/issues/2928. The error is harmless, 
just alarming, so if anyone is using Slim 4.x and sees the error, they can 
continue without any problems. Downgrading just allows us to avoid having to 
see it 30+ times for every HTML build.

> Ref Guide HTML output should adhere to more standard HTML5
> ----------------------------------------------------------
>
>                 Key: SOLR-12746
>                 URL: https://issues.apache.org/jira/browse/SOLR-12746
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: documentation
>            Reporter: Cassandra Targett
>            Assignee: Cassandra Targett
>            Priority: Major
>
> The default HTML produced by Jekyll/Asciidoctor adds a lot of extra {{<div>}} 
> tags that break up our content into very small chunks. This 
> is acceptable to a casual website reader as far as it goes, but any Reader 
> view in a browser or another type of content extraction system that uses a 
> similar "readability" scoring algorithm is going to either miss a lot of 
> content or fail to display the page entirely.
> To see what I mean, take a page like 
> https://lucene.apache.org/solr/guide/7_4/language-analysis.html and enable 
> Reader View in your browser (I used Firefox; Steve Rowe told me offline 
> Safari would not even offer the option on the page for him). You will notice 
> a lot of missing content. It's almost like someone selected sentences at 
> random.
> Asciidoctor has a long-standing issue to provide better, more 
> semantically oriented HTML5 output, but it has not been resolved yet: 
> https://github.com/asciidoctor/asciidoctor/issues/242
> Asciidoctor does provide a way to override the default output templates by 
> providing your own in Slim, HAML, ERB or any other template language 
> supported by Tilt (none of which I know yet). There are some samples 
> available via the Asciidoctor project which we can borrow, but it's otherwise 
> unknown as of yet what parts of the output are causing the worst of the 
> problems. This issue is to explore how to fix it to improve this part of the 
> HTML reading experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
