Hi Lucene Developers Using org.apache.lucene.search.highlight.Highlighter SRC for Search
The Package.html displays something like this String text = hits.doc(i).get(FIELD_NAME); TokenStream tokenStream=analyzer.tokenStream(FIELD_NAME,new StringReader(text)); On using this SRC My Code Raises an "NullPointerException " [ The text on hits.doc(i) is returning this exception ] I have a piece of code "(refrence from Orielly.com) CustomAnalyser " an being using it other then org.apache.lucene.analysis.standard.StandardAnalyzer() , 1) In the first case [CustomAnalyzer() ] the text returns me NULL , the Hits return me 707. 2) In second case [ StandardAnalyzer() ] No hits are encountered , the Hits return's me 0. 3) But on using a normal SearchFiles from demo ( org.apache.lucene.demo) revels all the correct 707 hits probables. Please somebody look into this........ Karthik -----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Saturday, May 22, 2004 12:34 AM To: [EMAIL PROTECTED] Subject: Re: org.apache.lucene.search.highlight.Highlighter Hi Claude, that example code you provided is out of date. For all concerned - the highlighter code was refactored about a month ago and then moved into the Sandbox. Want the latest version? - get the latest code from the sandbox CVS. Want the latest docs? - Run javadoc on the above. There is a basic example of highlighter use in the package-level javadocs and more extensive examples in the JUnit test that accompanies the source code. Hope this helps clarify things. Mark ps Bruce, I know you were interested in providing an alternative Fragmenter implementation for the highlighter that detects sentence boundaries. You may want to look at LingPipe which has "a heuristic sentence boundary detector". ( http://threattracker.com:8080/lingpipe-demo/demo.html ) I took a quick look at it but it has its own tokenizer that would be difficult to make work with the tokenstream used to identify query terms. At least the code gives some examples of the heuristics involved in detecting sentence boundaries. For my own apps I find the standard Fragmenter implementation suffices. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]