Hi
I am seeing some strange behaviour with the highlighter and I'm wondering if
anyone else is experiencing this. In certain instances no summary is
generated: I perform the search, the search returns the correct document, and
I can see that the Lucene document contains the text in the field. However,
after doing:
SimpleHTMLFormatter simpleHTMLFormatter =
    new SimpleHTMLFormatter("<span class=\"highlight\"><b>", "</b></span>");
// required for highlighting
Query query2 = multiSearcher.rewrite(query);
Highlighter highlighter =
    new Highlighter(simpleHTMLFormatter, new QueryScorer(query2));
...
String text = doc.get(FieldNameEnum.BODY.getDescription());
TokenStream tokenStream = analyzer.tokenStream(
    FieldNameEnum.BODY.getDescription(), new StringReader(text));
String result = highlighter.getBestFragments(tokenStream, text, 3, "...");
the string result is empty. Strangely, if I try a different term that exists
in the same document, I do get a summary. For example, I have a Word document
that contains the terms "document" and "aspectj". If I search for "document"
I get the correct document back but no highlighted summary; however, if I
search for "aspectj" I get the same document with a highlighted summary.
Just to mention, I do rewrite the original query before performing the
highlighting.
I'm not sure what I'm missing here. Any help would be appreciated.
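In case it helps narrow things down, this is roughly how I intend to dump what
the analyzer actually produces for the body text (a sketch only, assuming the
Lucene 2.4-era TokenStream API; analyzer, text and FieldNameEnum are the same
objects as in the code above). If "document" is dropped or altered during
analysis, that might explain the empty result:

// iterate the token stream and print each term the analyzer emits
TokenStream ts = analyzer.tokenStream(
    FieldNameEnum.BODY.getDescription(), new StringReader(text));
Token token;
while ((token = ts.next()) != null) {
    System.out.println(new String(token.termBuffer(), 0, token.termLength()));
}
ts.close();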
Cheers
Amin
On Sat, Mar 7, 2009 at 4:32 PM, Amin Mohammed-Coleman <[email protected]> wrote:
> Hi
> Got it working! Thanks again for your help!
>
>
> Amin
>
>
> On Sat, Mar 7, 2009 at 12:25 PM, Amin Mohammed-Coleman
> <[email protected]> wrote:
>
>> Thanks! The final piece that I needed to do for the project!
>> Cheers
>>
>> Amin
>>
>> On Sat, Mar 7, 2009 at 12:21 PM, Uwe Schindler <[email protected]> wrote:
>>
>>> > Cool. I will use compression and store the text in the index. Is there
>>> > anything special I need to do for decompressing the text? I presume I
>>> > can just do doc.get("content")?
>>> > Thanks for your advice all!
>>>
>>> No, just use Field.Store.COMPRESS when adding the field to the index and
>>> Document.get() when fetching; the decompression is done automatically.
>>>
>>> You may wonder why compression isn't simply enabled for all fields: it is
>>> an overhead for very small and short fields, so you should only use it for
>>> large contents (it's the same as compressing very small files with
>>> ZIP/GZIP: such files mostly end up larger than without compression).
>>>
>>> Uwe
>>>
>>>
>>>
>>
>
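P.S. For completeness, the compressed storage Uwe describes in the quoted
thread above ended up looking roughly like this on my side (a sketch, assuming
the Lucene 2.4 Field API; the field name "content" and the variable names are
placeholders):

// index time: analyze the text for search, but compress the stored value
doc.add(new Field("content", contents, Field.Store.COMPRESS, Field.Index.ANALYZED));

// search time: the stored value comes back decompressed automatically
String body = hitDoc.get("content");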