[ https://issues.apache.org/jira/browse/LUCENE-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12718773#action_12718773 ]
Michael McCandless commented on LUCENE-1685:
--------------------------------------------

Why not deprecate QueryScorer? It's buggy, and leaving it in there, with such a juicy name, looking like the right choice, just makes Lucene's (highlighter's) quality look bad. Correctness trumps performance.

And then the javadocs should clearly favor SpanScorer... and I would include a clear code fragment showing how to use it all, in context. EG this is what LIA2 currently has, which is fine to copy/modify/etc. to get into the javadocs:

{code}
public void testHits() throws Exception {
  IndexSearcher searcher = new IndexSearcher(TestUtil.getBookIndexDirectory());
  TermQuery query = new TermQuery(new Term("title", "action"));
  TopDocs hits = searcher.search(query, 10);

  Highlighter highlighter = new Highlighter(null);
  Analyzer analyzer = new SimpleAnalyzer();

  for (int i = 0; i < hits.scoreDocs.length; i++) {
    Document doc = searcher.doc(hits.scoreDocs[i].doc);
    String title = doc.get("title");

    TokenStream stream = TokenSources.getAnyTokenStream(searcher.getIndexReader(),
                                                        hits.scoreDocs[i].doc,
                                                        "title",
                                                        doc,
                                                        analyzer);
    SpanScorer scorer = new SpanScorer(query, "title", new CachingTokenFilter(stream));
    Fragmenter fragmenter = new SimpleSpanFragmenter(scorer);
    highlighter.setFragmentScorer(scorer);
    highlighter.setTextFragmenter(fragmenter);

    String fragment = highlighter.getBestFragment(stream, title);

    System.out.println(fragment);
  }
}
{code}

It would also be nice to simplify that usage, eg, is there some way to not have to make a SpanScorer (and, by extension, fragmenter) per query, but instead make it up-front and add a setter for the new TokenStream for each doc? (Having to create Highlighter(null) is awkward).

Or I suppose we could simply make a new Highlighter, SpanScorer, SimpleSpanFragmenter per-hit, but that seems wasteful.

> Make the Highlighter use SpanScorer by default
> ----------------------------------------------
>
>                 Key: LUCENE-1685
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1685
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Mark Miller
>            Assignee: Mark Miller
>            Priority: Minor
>             Fix For: 2.9
>
>
> I've always thought this made sense, but frankly, it took me a year to get
> the SpanScorer included with Lucene at all, so I was pretty much ready to
> move on after it got in, rather than push for it as a default.
> I think it makes sense as the default in Solr as well, and I mentioned that
> back when it was put in, but alas, it's an option there as well.
> The Highlighter package has no back compat req, but custom has been
> conservative - one reason I haven't pushed for this change before. Might be
> best to actually make the switch in 3? I could go either way - as is, I know
> a bunch of people use it, but I'm betting it's the large minority. It has
> never been listed in a changes entry and it's not in LIA 1, so you pretty much
> have to stumble upon it, and figure out what it's for.
> I'll point out again that it's just as fast as the standard scorer for any
> clause of a query that is not position sensitive. Position sensitive query
> clauses will obviously be somewhat slower to highlight, but that is because
> they will be highlighted correctly rather than ignoring position.
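
To make the "highlighted correctly rather than ignoring position" point concrete, here is a rough, self-contained toy in plain Java. It is only a sketch and does not use Lucene at all: the class PhraseHighlightSketch and both of its methods are hypothetical names invented for this illustration, and whitespace tokenization stands in for a real Analyzer. One method marks every occurrence of any phrase term (roughly how a term-only scorer behaves on a phrase query), the other marks terms only where they appear consecutively in phrase order.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: these are NOT Lucene classes or Lucene APIs.
final class PhraseHighlightSketch {

    // Term-only highlighting: marks every token equal to any phrase term,
    // no matter where it occurs in the text.
    static String highlightTerms(String text, List<String> phrase) {
        Set<String> terms = new HashSet<>(phrase);
        String[] tokens = text.split(" ");
        boolean[] hit = new boolean[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            hit[i] = terms.contains(tokens[i].toLowerCase());
        }
        return render(tokens, hit);
    }

    // Position-aware highlighting: marks tokens only where the phrase terms
    // occur consecutively and in order.
    static String highlightPhrase(String text, List<String> phrase) {
        String[] tokens = text.split(" ");
        boolean[] hit = new boolean[tokens.length];
        for (int start = 0; start + phrase.size() <= tokens.length; start++) {
            boolean match = true;
            for (int j = 0; j < phrase.size(); j++) {
                if (!tokens[start + j].toLowerCase().equals(phrase.get(j))) {
                    match = false;
                    break;
                }
            }
            if (match) {
                Arrays.fill(hit, start, start + phrase.size(), true);
            }
        }
        return render(tokens, hit);
    }

    // Rebuilds the text, wrapping each marked token in <B>...</B>.
    private static String render(String[] tokens, boolean[] hit) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            if (i > 0) {
                out.append(' ');
            }
            out.append(hit[i] ? "<B>" + tokens[i] + "</B>" : tokens[i]);
        }
        return out.toString();
    }
}
```

For the phrase "in action" against the text "lucene in action shows search in depth and action items", highlightTerms also marks the stray "in" and "action" later in the sentence, while highlightPhrase marks only the adjacent pair. The position-aware version does extra work per candidate start position, which is the same trade-off described above: slower for position sensitive clauses, but correct.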