[ https://issues.apache.org/jira/browse/LUCENE-644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Uwe Schindler closed LUCENE-644.
--------------------------------

       Resolution: Fixed
    Fix Version/s: 2.9

Closing since FastVectorHighlighter was added in Lucene 2.9.

> Contrib: another highlighter approach
> -------------------------------------
>
>                 Key: LUCENE-644
>                 URL: https://issues.apache.org/jira/browse/LUCENE-644
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/highlighter
>            Reporter: Ronnie Kolehmainen
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: FulltextHighlighter.java, FulltextHighlighter.java,
> FulltextHighlighterTest.java, FulltextHighlighterTest.java, svn-diff.patch,
> svn-diff.patch, TokenSources.java, TokenSources.java.diff
>
>
> Mark Harwood's highlighter package is a great contribution to Lucene, I've
> used it a lot! However, when you have *large* documents (fields),
> highlighting can be quite time-consuming if you increase the number of bytes
> to analyze with setMaxDocBytesToAnalyze(int). The default value of 50k is
> often too low for indexed PDFs etc., which results in empty highlight
> strings.
> This is an alternative approach that uses only term position vectors to
> build fragment info objects. A StringReader can then read the relevant
> fragments and skip() between them. This is a lot faster. Also, this method
> uses the *entire* field for finding the best fragments, so you're always
> guaranteed to get a highlight snippet.
> Because this method only works with fields that have term positions stored,
> one can check whether it works for a particular field using the following
> code (taken from TokenSources.java):
>         TermFreqVector tfv = (TermFreqVector) reader.getTermFreqVector(docId, field);
>         if (tfv != null && tfv instanceof TermPositionVector)
>         {
>             // use FulltextHighlighter
>         }
>         else
>         {
>             // use standard Highlighter
>         }
> Someone else might find this useful so I'm posting the code here.
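
For readers without the attachments at hand, here is a minimal sketch of the "read the relevant fragments and skip() between them" step described in the issue. It is not the attached FulltextHighlighter code. The class FragmentReaderSketch, the Fragment holder and the readFragments method are hypothetical names, and the sketch assumes the fragment character offsets have already been derived from the term position vector and are sorted and non-overlapping.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class FragmentReaderSketch {

    /** Hypothetical holder for one fragment's character offsets: [start, end). */
    static final class Fragment {
        final int start;
        final int end;

        Fragment(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Reads only the given fragments from the field text, skipping over
     * everything in between. Assumes the fragments are sorted by start
     * offset and do not overlap.
     */
    static List<String> readFragments(String text, List<Fragment> fragments)
            throws IOException {
        List<String> result = new ArrayList<>();
        try (Reader reader = new StringReader(text)) {
            long pos = 0; // current position of the reader in the text
            for (Fragment f : fragments) {
                // Skip the uninteresting text before this fragment.
                long toSkip = f.start - pos;
                while (toSkip > 0) {
                    long skipped = reader.skip(toSkip);
                    if (skipped <= 0) {
                        return result; // reached the end of the text
                    }
                    toSkip -= skipped;
                    pos += skipped;
                }
                // Read exactly the fragment's characters (fewer at end of text).
                char[] buf = new char[f.end - f.start];
                int filled = 0;
                while (filled < buf.length) {
                    int n = reader.read(buf, filled, buf.length - filled);
                    if (n < 0) {
                        break;
                    }
                    filled += n;
                }
                pos += filled;
                result.add(new String(buf, 0, filled));
            }
        }
        return result;
    }
}
```

Calling readFragments with the stored field value and the chosen fragments returns just the fragment strings, which could then be wrapped in highlight markup. The text between fragments is skipped rather than analyzed, which is the property the issue description relies on for its speed-up.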