Re: VOTE RC0 Release apache-solr-ref-guide-4.5.pdf
The TODO list is now empty (except for a shelved item), so that clears up the stuff I found.

Steve

On Sep 26, 2013, at 1:28 PM, Chris Hostetter hossman_luc...@fucit.org wrote:

> Awesome work steve! I collected all of this up into a scratch page, let's see how many we can burn through easily and then post another RC...
>
> https://cwiki.apache.org/confluence/display/solr/Internal+-+TODO+List
>
> -Hoss

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
Re: VOTE RC0 Release apache-solr-ref-guide-4.5.pdf
I have just 3 chars to contribute: WOW

Otis
Re: VOTE RC0 Release apache-solr-ref-guide-4.5.pdf
Hi,

SOLR-3076 went into this release, but the documentation for how to support Block Join in Solr is not present.

In the ref guide there is a section called Other Parsers (https://cwiki.apache.org/confluence/display/solr/Other+Parsers). We should add BlockJoinChildQParser and BlockJoinParentQParser. We should also add an example of how to index childDocs in XML to make use of BlockJoin in Solr.

I can document them right now, but where should I post it? If someone can give me access to the Confluence I could add it there. My confluence username is [varunthacker]

On Thu, Sep 26, 2013 at 1:06 AM, Chris Hostetter hossman_luc...@fucit.org wrote:

> Please vote to release the following artifacts as the Apache Solr Reference Guide for 4.5...
>
> https://dist.apache.org/repos/dist/dev/lucene/solr/ref-guide/apache-solr-ref-guide-4.5-RC0/
>
> $ cat apache-solr-ref-guide-4.5-RC0/apache-solr-ref-guide-4.5.pdf.sha1
> ee40215d30f264d663f723ea2196b72b8cc5effc apache-solr-ref-guide-4.5.pdf
>
> (When reviewing the PDF, please don't hesitate to point out any typos or formatting glitches or any other problems of subject matter. Re-spinning a new RC is trivial, so in my opinion the bar is very low in terms of what things are worth fixing before release.)
>
> -Hoss

--
Regards,
Varun Thacker
http://www.vthacker.in/
Re: VOTE RC0 Release apache-solr-ref-guide-4.5.pdf
Hi Varun,

Thanks, good catch!

Permission to edit the Reference Guide directly is only granted to Lucene/Solr committers - see
https://cwiki.apache.org/confluence/display/solr/Internal+-+Maintaining+Documentation#Internal-MaintainingDocumentation-WhoCanEditThisDocumentation

For small additions/corrections, non-committers can add a comment on a page in the section that is closest to where the content should go, and then a committer can put the content where it belongs. But for larger stuff, it's better to create a JIRA issue, and attach the content there.

Steve

On Sep 26, 2013, at 5:48 AM, Varun Thacker varunthacker1...@gmail.com wrote:

> Hi,
>
> SOLR-3076 went into this release, but the documentation for how to support Block Join in Solr is not present.
>
> In the ref guide there is a section called Other Parsers (https://cwiki.apache.org/confluence/display/solr/Other+Parsers). We should add BlockJoinChildQParser and BlockJoinParentQParser. We should also add an example of how to index childDocs in XML to make use of BlockJoin in Solr.
>
> I can document them right now, but where should I post it? If someone can give me access to the Confluence I could add it there. My confluence username is [varunthacker]
Re: VOTE RC0 Release apache-solr-ref-guide-4.5.pdf
Hi Steve,

No problems. I've created SOLR-5275 for this.

On Thu, Sep 26, 2013 at 3:26 PM, Steve Rowe sar...@gmail.com wrote:

> Hi Varun,
>
> Thanks, good catch!
>
> Permission to edit the Reference Guide directly is only granted to Lucene/Solr committers - see
> https://cwiki.apache.org/confluence/display/solr/Internal+-+Maintaining+Documentation#Internal-MaintainingDocumentation-WhoCanEditThisDocumentation
>
> For small additions/corrections, non-committers can add a comment on a page in the section that is closest to where the content should go, and then a committer can put the content where it belongs. But for larger stuff, it's better to create a JIRA issue, and attach the content there.
>
> Steve

--
Regards,
Varun Thacker
http://www.vthacker.in/
Re: VOTE RC0 Release apache-solr-ref-guide-4.5.pdf
Except for #1/#34 - internal links to beginning-of-page sections point one page earlier than they should - and #8/#41 - missing Thai and Polish chars - which I don't know how to fix, I'll try to address the other items on this (um, very long) list of mostly minor stuff I found:

0. All examples in the exported PDF have an extra blank line at the top. I was able to eliminate these from this page https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=32604227 (What is an analyzer?) by eliminating the newline between the initial {code …} line and the first line of the examples. This doesn't have any apparent effect on the layout of the page on the wiki, but the PDF export of that page no longer has the extra blank lines. Any objections to switching all {code} examples in the guide like this?

1. Pg 2: The section links from the TOC all take you to the previous page, rather than to the top of the page where the section starts. (Same behavior on OS X Preview, and under Windows, on Firefox's built-in PDF viewer and on Adobe Reader.) This looks like a general problem - see e.g. #34.

2. Pg 68: Stray asterisks in the analyzer tags in the fieldType example under Analysis Phases, apparently to make the surrounded text bold (which also didn't happen).

3. Pg 69: The solr.KeywordTokenizerFactory example is missing one quotation mark from each of the left and right hand sides.

4. Pg 70: Under solr.TokenizerFactory, there is a bogus StandardTokenizer link in the sentence "There aren't any filters that use StandardTokenizer's types" - the link is to the non-existent StandardTokenizer page on the Solr wiki. (It might be useful to systematically link stuff like this to the corresponding Lucene or Solr javadocs, but this should probably be templated or scripted, so that the version-specific links are handled properly.)

5. Pg 71: Under Standard Tokenizer, the email addresses recognition claim is false, and Internet domain name recognition isn't validation per se, e.g. google.supercomputername will be tokenized as a single token along with google.com. The Out example output needs fixup accordingly. I see that the Classic Tokenizer section on pg 72 has the verbatim email/domain text; for ClassicTokenizer, the email claim is true, but it has the same issue with internet domain names as StandardTokenizer.

6. Pg 74: The NGram Tokenizer example output should be (bicy, bicyc, icyc, icycl, cycl, cycle, ycle) instead of all of the 4grams before the 5grams (I think this class's behavior was changed in 4.4 by LUCENE-5042).

7. Pg 75: The ICU tokenizer rulefiles argument is missing.

8. Pg 75: The ICU Tokenizer's In input and Out output are completely missing the Thai text that's visible on the wiki.

9. Pg 75: Missing spaces in the Regular Expression Pattern Tokenizer's group attribute description, at the boundaries between the first two sentences: token(s).The and tokens.Non-negative.

10. Pg 72, 76, 77, etc.: Many analysis components' factory class names should be styled with a fixed-width font.

11. Pg 77: UAX29 URL Email Tokenizer recognizes not only .com Internet domain names, but also domain names including any other valid top-level domain (i.e., unlike StandardTokenizer and ClassicTokenizer, domain names are validated against the white list drawn from the IANA Root Zone database http://www.internic.net/zones/root.zone as of the last time ant gen-tld was performed and the tokenizer was generated.)

12. Pg 77: UAX29 tokenizer: file::// should be file://

13. Pg 77: UAX29 tokenizer's URL and EMAIL type names are missing angle brackets.

14. Pg 77: UAX29 tokenizer's maxTokenLength attribute name should be styled with a fixed-width font.

15. Pg 78: In the example demonstrating how arguments can be given to filter elements via attributes, there is a stray asterisk, apparently intended to bold the surrounding text, which also didn't work: *min=2 max=7/

16. Pg 79: The ASCII Folding Filter's Out output should have the accent stripped from the á (giving a) and the ASCII character value adjusted to match (ASCII character 97).

17. Pg 81: The Edge N-gram Filter's 4-6 gram size example Out should be (four, scor, score, twen, twent, twenty) - some of these are missing.

18. Pg 83: The ICU Normalizer 2 Filter example should include the name and mode attributes in the filter element.

19. Pg 87: Stray asterisks in both of the N-Gram Filter examples: *minGramSize=...

20. Pg 87: The N-Gram Filter 3-5 gram size example Out output should be (fou, four, our, sco, scor, score, cor, core, ore) - rather than ordering by gram size, output is now ordered first by position and then by gram size.

21. Pg 88: Stray asterisk in the first occurrence only example of the Pattern Replace Filter: *replace=first.

22. Pg 89: encoder argument to the Phonetic Filter has surrounding double curly brackets instead of being styled with a fixed-width font.

23. Pg 90: It should be mentioned on Porter Stem
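The position-then-size ordering described in items 6, 17 and 20 can be sketched in a few lines of Python. This is only an illustration of the ordering, not Lucene's implementation: the function names are made up for this sketch, and the inputs ("bicycle", "four score", "four score and twenty") are assumptions inferred from the expected outputs listed above.

```python
def ngrams(text, min_size, max_size):
    """Return n-grams ordered by start position, then by gram size."""
    grams = []
    for start in range(len(text)):
        for size in range(min_size, max_size + 1):
            # Skip grams that would run past the end of the text.
            if start + size <= len(text):
                grams.append(text[start:start + size])
    return grams


def edge_ngrams(text, min_size, max_size):
    """Return only the n-grams anchored at the start of the text."""
    return [text[:size] for size in range(min_size, max_size + 1)
            if size <= len(text)]


def per_token(func, phrase, min_size, max_size):
    """Apply an n-gram function to each whitespace-separated token in turn."""
    out = []
    for token in phrase.split():
        out.extend(func(token, min_size, max_size))
    return out


# Assumed inputs, chosen to match the outputs quoted in the list.
print(ngrams("bicycle", 4, 5))
print(per_token(ngrams, "four score", 3, 5))
print(per_token(edge_ngrams, "four score and twenty", 4, 6))
```

Tokens shorter than the minimum gram size (such as "and" in the last call) simply produce no output, which is why only four, score and twenty contribute grams there.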
Re: VOTE RC0 Release apache-solr-ref-guide-4.5.pdf
Thanks Steve.

I'll only address a couple of your specific issues inline. We can split the rest of the list if you'd like, but I think a lot of them are on the same page in the wiki (although multiple pages in the PDF) - let me know.

Cassandra

On Thu, Sep 26, 2013 at 7:29 AM, Steve Rowe sar...@gmail.com wrote:

> 0. All examples in the exported PDF have an extra blank line at the top. I was able to eliminate these from this page https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=32604227 (What is an analyzer?) by eliminating the newline between the initial {code …} line and the first line of the examples. This doesn't have any apparent effect on the layout of the page on the wiki, but the PDF export of that page no longer has the extra blank lines. Any objections to switching all {code} examples in the guide like this?

CT: Is it that horrible? There are dozens and dozens of code examples, and it will take a while for someone to fix all of them. Since I edit in wiki markup mode, I've always found it easier to add the line break so my eyes can find the samples faster. That said, ease of use for users is more important than my convenience, so if you think it's badly distracting, then it's worth trying to fix it.

An alternative might be to try to change the CSS that produces the code examples - the problem is that the default styling for the PDF includes some padding, and then puts in the newline. Fiddling with the CSS is painful though - we can't see the interim HTML and it's essentially trial and error, over and over. So, it's essentially one of two annoying choices: edit all the code examples by hand, or generate the PDF x-dozen times to maybe find out the CSS approach won't work.

> 1. Pg 2: The section links from the TOC all take you to the previous page, rather than to the top of the page where the section starts. (Same behavior on OS X Preview, and under Windows, on Firefox's built-in PDF viewer and on Adobe Reader.) This looks like a general problem - see e.g. #34.

CT: This is essentially a known problem (see my comment: https://issues.apache.org/jira/browse/SOLR-4886?focusedCommentId=13703660#comment-13703660, last bullet point). The way the PDF is created is that Confluence creates the entire document in an HTML page, which includes bookmark tags right before the different heading levels. When the PDF is then generated, a rule is applied to insert a page-break before all h2 headings. That leaves the bookmark orphaned on the previous page. I have never found a solution to this problem - you can't edit the HTML and you don't have any control over where the bookmark tags in the HTML are put before the HTML is converted to PDF. The only solution is to never have page breaks, which I think severely diminishes readability.

> 2. Pg 68: Stray asterisks in the analyzer tags in the fieldType example under Analysis Phases, apparently to make the surrounded text bold (which also didn't happen).

CT: BTW, it never will - code examples are rendered verbatim, without any of the styling normally applied.

> 43. Pg 106: Language-Specific Factories: Catalan, Danish, Irish and Romanian are missing from the covered languages; Catalan and Irish should include ElisionFilterFactory in their examples - there are articles lists in Lucene's {Catalan,Irish}Analyzer.

CT: A general note about the languages and examples - there used to be examples that were incorrect and so were removed, which might account for some of the gaps. There's an open issue you'll want to look at before diving in: https://issues.apache.org/jira/browse/SOLR-5031.
Re: VOTE RC0 Release apache-solr-ref-guide-4.5.pdf
Hi Cassandra,

On Sep 26, 2013, at 9:15 AM, Cassandra Targett casstarg...@gmail.com wrote:

> I'll only address a couple of your specific issues inline. We can split the rest of the list if you'd like, but I think a lot of them are on the same page in the wiki (although multiple pages in the PDF) - let me know.

I'll try to do them all myself, but if it looks like it's going to take more than one day, I'll ask for help.

> On Thu, Sep 26, 2013 at 7:29 AM, Steve Rowe sar...@gmail.com wrote:
>
>> 0. All examples in the exported PDF have an extra blank line at the top. I was able to eliminate these from this page https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=32604227 (What is an analyzer?) by eliminating the newline between the initial {code …} line and the first line of the examples. This doesn't have any apparent effect on the layout of the page on the wiki, but the PDF export of that page no longer has the extra blank lines. Any objections to switching all {code} examples in the guide like this?
>
> CT: Is it that horrible? There are dozens and dozens of code examples, and it will take a while for someone to fix all of them. Since I edit in wiki markup mode, I've always found it easier to add the line break so my eyes can find the samples faster. That said, ease of use for users is more important than my convenience, so if you think it's badly distracting, then it's worth trying to fix it.

For me it's somewhere between annoying and badly distracting, but this will of course depend on the viewer.

> An alternative might be to try to change the CSS that produces the code examples - the problem is that the default styling for the PDF includes some padding, and then puts in the newline. Fiddling with the CSS is painful though - we can't see the interim HTML and it's essentially trial and error, over and over.

I'll take a look at the CSS - this is the one, right?: https://cwiki.apache.org/confluence/spaces/flyingpdf/viewpdfstyleconfig.action?key=solr

About the interim HTML, I found this description of how to get it: https://confluence.atlassian.com/display/CONF35/Exporting+Confluence+Pages+and+Spaces+to+HTML

>> 1. Pg 2: The section links from the TOC all take you to the previous page, rather than to the top of the page where the section starts. (Same behavior on OS X Preview, and under Windows, on Firefox's built-in PDF viewer and on Adobe Reader.) This looks like a general problem - see e.g. #34.
>
> CT: This is essentially a known problem (see my comment: https://issues.apache.org/jira/browse/SOLR-4886?focusedCommentId=13703660#comment-13703660, last bullet point). The way the PDF is created is that Confluence creates the entire document in an HTML page, which includes bookmark tags right before the different heading levels. When the PDF is then generated, a rule is applied to insert a page-break before all h2 headings. That leaves the bookmark orphaned on the previous page. I have never found a solution to this problem - you can't edit the HTML and you don't have any control over where the bookmark tags in the HTML are put before the HTML is converted to PDF. The only solution is to never have page breaks, which I think severely diminishes readability.

Thanks for the explanation. I agree about page breaks being more important than off-by-one-page link targets. I wonder if there is some CSS trick to put the page break before the target a element instead of the h2 section.

>> 2. Pg 68: Stray asterisks in the analyzer tags in the fieldType example under Analysis Phases, apparently to make the surrounded text bold (which also didn't happen).
>
> CT: BTW, it never will - code examples are rendered verbatim, without any of the styling normally applied.

Hmm, so there's no way to apply any formatting at all? That's too bad.

>> 43. Pg 106: Language-Specific Factories: Catalan, Danish, Irish and Romanian are missing from the covered languages; Catalan and Irish should include ElisionFilterFactory in their examples - there are articles lists in Lucene's {Catalan,Irish}Analyzer.
>
> CT: A general note about the languages and examples - there used to be examples that were incorrect and so were removed, which might account for some of the gaps. There's an open issue you'll want to look at before diving in: https://issues.apache.org/jira/browse/SOLR-5031.

Thanks for the pointer.

Steve
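The off-by-one-page link targets discussed under item 1 can be reproduced with a toy pagination model. The sketch below is not how Confluence's PDF exporter works internally; the element kinds, labels and the `paginate` helper are all hypothetical, and whether the real stylesheet can place the break before the anchor is exactly the open question in this thread.

```python
def paginate(elements, break_before):
    """Assign a page number to each (kind, label) element in document order.

    A new page starts before every element whose kind equals `break_before`,
    except when that element is the very first one in the document.
    """
    page = 1
    placed = []
    for index, (kind, label) in enumerate(elements):
        if kind == break_before and index > 0:
            page += 1
        placed.append((kind, label, page))
    return placed


def pages_for(placed, label):
    """Map each element kind carrying `label` to the page it landed on."""
    return {kind: page for kind, name, page in placed if name == label}


# Hypothetical document: each bookmark anchor sits right before its heading.
doc = [
    ("p", "cover"),
    ("anchor", "intro"), ("h2", "intro"), ("p", "intro text"),
    ("anchor", "tokenizers"), ("h2", "tokenizers"), ("p", "tokenizer text"),
]

# Break before each h2: the anchor is left behind on the previous page.
print(pages_for(paginate(doc, "h2"), "tokenizers"))

# Break before each anchor instead: anchor and heading share a page.
print(pages_for(paginate(doc, "anchor"), "tokenizers"))
```

In the first call the anchor and its heading end up on different pages, which is the orphaned-bookmark behaviour described above; in the second they land on the same page.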
Re: VOTE RC0 Release apache-solr-ref-guide-4.5.pdf
On Thu, Sep 26, 2013 at 8:59 AM, Steve Rowe sar...@gmail.com wrote:

> I'll try to do them all myself, but if it looks like it's going to take more than one day, I'll ask for help.

OK, let me know.

> I'll take a look at the CSS - this is the one, right?: https://cwiki.apache.org/confluence/spaces/flyingpdf/viewpdfstyleconfig.action?key=solr
>
> About the interim HTML, I found this description of how to get it: https://confluence.atlassian.com/display/CONF35/Exporting+Confluence+Pages+and+Spaces+to+HTML

My first reaction was that it wouldn't work: the HTML export exports the selected pages into a .zip file of HTML files (one file for each wiki page). The interim HTML for the PDF is one big single HTML file. They're different exports, using different stylesheets.

However, it would make sense if the HTML was similar, so I took a look with my own Confluence instance and the two exports use many of the same divs for the same elements. It's not 1:1, but you could at least figure out what the right divs are. The big difference will be heading levels - the PDF flattens them all depending on the page hierarchy. There are also stylesheets in place that you don't see and default rules that are applied if you haven't overridden them. And then I also think there are some styles put into the HTML itself that would override anything in the CSS.

A few weeks ago I was working on a number of possible changes to the PDF, the formatting of code samples being one of them, but after two days working on it, I gave up for now. It really isn't fun to work on.

>>> 1. Pg 2: The section links from the TOC all take you to the previous page, rather than to the top of the page where the section starts. (Same behavior on OS X Preview, and under Windows, on Firefox's built-in PDF viewer and on Adobe Reader.) This looks like a general problem - see e.g. #34.
>>
>> CT: This is essentially a known problem (see my comment: https://issues.apache.org/jira/browse/SOLR-4886?focusedCommentId=13703660#comment-13703660, last bullet point). The way the PDF is created is that Confluence creates the entire document in an HTML page, which includes bookmark tags right before the different heading levels. When the PDF is then generated, a rule is applied to insert a page-break before all h2 headings. That leaves the bookmark orphaned on the previous page. I have never found a solution to this problem - you can't edit the HTML and you don't have any control over where the bookmark tags in the HTML are put before the HTML is converted to PDF. The only solution is to never have page breaks, which I think severely diminishes readability.
>
> Thanks for the explanation. I agree about page breaks being more important than off-by-one-page link targets. I wonder if there is some CSS trick to put the page break before the target a element instead of the h2 section.

>>> 2. Pg 68: Stray asterisks in the analyzer tags in the fieldType example under Analysis Phases, apparently to make the surrounded text bold (which also didn't happen).
>>
>> CT: BTW, it never will - code examples are rendered verbatim, without any of the styling normally applied.
>
> Hmm, so there's no way to apply any formatting at all? That's too bad.

You can apply syntax formatting based on the language of the example, but not inline formatting to highlight specific lines - one way I've gotten around that in other places is to enable line numbers to display in the example, and then call out the line numbers in the text.

Cassandra
Re: VOTE RC0 Release apache-solr-ref-guide-4.5.pdf
On Thu, Sep 26, 2013 at 5:48 AM, Varun Thacker varunthacker1...@gmail.com wrote:

> SOLR-3076 went into this release, but the documentation for how to support Block Join in Solr is not present.

IMO, it's a work in progress / experimental. It doesn't necessarily need to be in the normal ref guide at this point, but if anything gets added it should probably be marked as experimental and potentially subject to change.

-Yonik
http://lucidworks.com
Re: VOTE RC0 Release apache-solr-ref-guide-4.5.pdf
Cassandra,

On Sep 26, 2013, at 10:39 AM, Cassandra Targett casstarg...@gmail.com wrote:

>> I'll take a look at the CSS - this is the one, right?: https://cwiki.apache.org/confluence/spaces/flyingpdf/viewpdfstyleconfig.action?key=solr
>>
>> About the interim HTML, I found this description of how to get it: https://confluence.atlassian.com/display/CONF35/Exporting+Confluence+Pages+and+Spaces+to+HTML
>
> My first reaction was that it wouldn't work: the HTML export exports the selected pages into a .zip file of HTML files (one file for each wiki page). The interim HTML for the PDF is one big single HTML file. They're different exports, using different stylesheets.
>
> However, it would make sense if the HTML was similar, so I took a look with my own Confluence instance and the two exports use many of the same divs for the same elements. It's not 1:1, but you could at least figure out what the right divs are. The big difference will be heading levels - the PDF flattens them all depending on the page hierarchy. There are also stylesheets in place that you don't see and default rules that are applied if you haven't overridden them. And then I also think there are some styles put into the HTML itself that would override anything in the CSS.
>
> A few weeks ago I was working on a number of possible changes to the PDF, the formatting of code samples being one of them, but after two days working on it, I gave up for now. It really isn't fun to work on.

I added the following to the PDF stylesheet:

    /* trim leading blank line from pre-formatted code blocks */
    div.codeContentpre { margin-top: -6px; }

and it seems to do the trick - the top and bottom vertical whitespace look balanced to me now on two individual pages I exported. I'll export the whole thing now and look at every box to make sure this isn't doing the wrong thing somewhere.

Steve
Re: VOTE RC0 Release apache-solr-ref-guide-4.5.pdf
Awesome work steve! I collected all of this up into a scratch page, let's see how many we can burn through easily and then post another RC...

https://cwiki.apache.org/confluence/display/solr/Internal+-+TODO+List

-Hoss