I've reworked the highlighter package to address two issues: the inability to
pass field names to analyzers, and the lack of a way to limit tokenization of
large docs. I've also refactored it to be more modular, so that folks can
provide alternative implementations of the main functions (tokenizing,
fragmenting and scoring) if required.

This is not backwards compatible with earlier releases, but the new version
should provide a much more robust framework going forward.
If people feel comfortable with this version, I am happy to put it in the
sandbox. Any feedback is appreciated.

Code here:
http://www.inperspective.com/lucene/highlighter2/highlighter2.zip

Javadocs here:
http://www.inperspective.com/lucene/highlighter2/index.html

Quick code example:

  // ramDir, FIELD_NAME and analyzer are assumed to be defined elsewhere
  IndexReader reader = IndexReader.open(ramDir);
  IndexSearcher searcher = new IndexSearcher(ramDir);
  Query query = QueryParser.parse("Kenne*", FIELD_NAME, analyzer);
  query = query.rewrite(reader); // required to expand search terms
  Hits hits = searcher.search(query);

  Highlighter highlighter = new Highlighter(new QueryScorer(query));
  for (int i = 0; i < hits.length(); i++)
  {
    String text = hits.doc(i).get(FIELD_NAME);
    TokenStream tokenStream = analyzer.tokenStream(FIELD_NAME, new StringReader(text));
    // Get the 3 best fragments and separate them with "..."
    String result = highlighter.getBestFragments(tokenStream, text, 3, "...");
    System.out.println(result);
  }
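
For anyone wondering what "alternative implementations" of fragmenting and
scoring might look like, here is a minimal, self-contained sketch in plain
Java. It does not use the highlighter2 classes at all: FragmentSketch,
FragmentScorer, TermCountScorer and the fixed word-count fragmenting are
made-up names and simplifications for illustration only, and it assumes
Java 8 or later.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustrative only: none of these types are part of the highlighter package.
public class FragmentSketch {

    /** Hypothetical plug-in point: decides how good a fragment is. */
    interface FragmentScorer {
        float score(String fragment);
    }

    /** Scores a fragment by counting words that exactly match a query term. */
    static class TermCountScorer implements FragmentScorer {
        private final Set<String> terms = new HashSet<>();

        TermCountScorer(String... queryTerms) {
            for (String t : queryTerms) {
                terms.add(t.toLowerCase(Locale.ROOT));
            }
        }

        @Override
        public float score(String fragment) {
            float hits = 0;
            for (String word : fragment.toLowerCase(Locale.ROOT).split("\\W+")) {
                if (terms.contains(word)) {
                    hits++;
                }
            }
            return hits;
        }
    }

    /** Splits text into fragments of at most wordsPerFragment words. */
    static List<String> fragment(String text, int wordsPerFragment) {
        String[] words = text.trim().split("\\s+");
        List<String> fragments = new ArrayList<>();
        for (int i = 0; i < words.length; i += wordsPerFragment) {
            int end = Math.min(i + wordsPerFragment, words.length);
            fragments.add(String.join(" ", Arrays.copyOfRange(words, i, end)));
        }
        return fragments;
    }

    /** Joins up to maxFragments fragments with a non-zero score, best first. */
    static String bestFragments(String text, FragmentScorer scorer,
                                int wordsPerFragment, int maxFragments,
                                String separator) {
        List<String> matching = new ArrayList<>();
        for (String f : fragment(text, wordsPerFragment)) {
            if (scorer.score(f) > 0) {
                matching.add(f);
            }
        }
        // List.sort is stable, so equally scored fragments keep document order.
        matching.sort((a, b) -> Float.compare(scorer.score(b), scorer.score(a)));
        int n = Math.min(maxFragments, matching.size());
        return String.join(separator, matching.subList(0, n));
    }

    public static void main(String[] args) {
        String text = "John Kennedy was elected president. The weather was mild "
                + "that year. Kennedy and Kennedy family lived in Boston.";
        FragmentScorer scorer = new TermCountScorer("kennedy");
        System.out.println(bestFragments(text, scorer, 5, 3, "..."));
    }
}
```

Swapping in a different FragmentScorer (say, one that weights rarer terms more
heavily) changes which fragments win without touching the fragmenting code,
which is the kind of separation the refactoring is aiming for.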


Cheers
Mark
