Re: Lucene Highlighting and Dynamic Summaries

markharw00d Wed, 11 Mar 2009 11:12:33 -0700

If you can supply a JUnit test that recreates the problem I think we can start to make progress on this.



Amin Mohammed-Coleman wrote:

Hi
Apologies for resending this mail. Just wondering if anyone has experienced the below. I'm not sure if this could happen due to the nature of the document. It does seem strange that one term search returns a summary while another does not, even though the same document is being returned.
I'm asking this so I can code around it if this is normal.


Apologies again for resending this mail.

Cheers

Amin


On 9 Mar 2009, at 07:50, Amin Mohammed-Coleman <[email protected]> wrote:
Hi
I am seeing some strange behaviour with the highlighter and I'm wondering if anyone else is experiencing this. In certain instances I don't get a summary being generated. I perform the search and the search returns the correct document. I can see that the Lucene document contains the text in the field. However after doing:
    SimpleHTMLFormatter simpleHTMLFormatter = new SimpleHTMLFormatter("<span class=\"highlight\"><b>", "</b></span>");
    // required for highlighting
    Query query2 = multiSearcher.rewrite(query);
    Highlighter highlighter = new Highlighter(simpleHTMLFormatter, new QueryScorer(query2));
    ...

    String text = doc.get(FieldNameEnum.BODY.getDescription());
    TokenStream tokenStream = analyzer.tokenStream(FieldNameEnum.BODY.getDescription(), new StringReader(text));
    String result = highlighter.getBestFragments(tokenStream, text, 3, "...");
the string result is empty. This is very strange: if I try a different term that exists in the document then I get a summary. For example, I have a Word document that contains the terms "document" and "aspectj". If I search for "document" I get the correct document but no highlighted summary. However, if I search using "aspectj" I get the same document with a highlighted summary.
Just to mention, I do rewrite the original query before performing the highlighting.
I'm not sure what I'm missing here.  Any help would be appreciated.

Cheers
Amin
On Sat, Mar 7, 2009 at 4:32 PM, Amin Mohammed-Coleman <[email protected]> wrote:
Hi

Got it working!  Thanks again for your help!


Amin
On Sat, Mar 7, 2009 at 12:25 PM, Amin Mohammed-Coleman <[email protected]> wrote:
Thanks!  The final piece that I needed to do for the project!

Cheers

Amin

On Sat, Mar 7, 2009 at 12:21 PM, Uwe Schindler <[email protected]> wrote:
> cool.  i will use compression and store in index. is there anything
> special i need to do for decompressing the text? i presume i can just do
> doc.get("content")?
> thanks for your advice all!

No, just use Field.Store.COMPRESS when adding to the index and Document.get()
when fetching. The decompression is done automatically.
You may wonder why not enable compression for all fields. The reason is that
compression adds overhead for very small, short fields, so you should only use
it for large content (it's the same as compressing very small files with
ZIP/GZIP: those files mostly end up larger than without compression).

Uwe
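
The point about small fields can be checked without Lucene at all. The sketch below is only an illustration: it uses the JDK's java.util.zip.Deflater as a stand-in for whatever Field.Store.COMPRESS does internally (an assumption), and the class name, method name, and sample strings are made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

public class CompressionOverheadDemo {

    /** Returns how many bytes DEFLATE (zlib format) produces for the given input. */
    static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[1024];
        int total = 0;
        // Drain the compressor; only the byte count matters here, not the bytes.
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical short field value, e.g. an identifier.
        byte[] shortValue = "doc-42".getBytes(StandardCharsets.UTF_8);

        // Hypothetical large field value, e.g. a document body.
        StringBuilder body = new StringBuilder();
        for (int i = 0; i < 500; i++) {
            body.append("Lucene highlighting and dynamic summaries. ");
        }
        byte[] longValue = body.toString().getBytes(StandardCharsets.UTF_8);

        System.out.println("short: " + shortValue.length + " -> " + deflatedSize(shortValue) + " bytes");
        System.out.println("long:  " + longValue.length + " -> " + deflatedSize(longValue) + " bytes");
    }
}
```

The short value comes out larger than it went in because of the fixed header and checksum bytes, while the long, repetitive body shrinks considerably. That is the trade-off described above.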


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
------------------------------------------------------------------------





