Just an FYI, Lucene 2.9 has FastVectorHighlighter:

http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc/all/org/apache/lucene/search/vectorhighlight/package-summary.html

Features

   * fast for large docs
   * support N-gram fields
   * support phrase-unit highlighting with slops
   * need Java 1.5
   * highlight fields need to be TermVector.WITH_POSITIONS_OFFSETS
   * take into account query boost to score fragments
   * support colored highlight tags
   * pluggable FragListBuilder
   * pluggable FragmentsBuilder

Unfortunately, Solr hasn't incorporated it yet:

https://issues.apache.org/jira/browse/SOLR-1268

Koji


ravi.gidwani wrote:
Hey Matt:
I have been facing the same issue. I have a text field that I highlight
along with other fields (maybe 10 other fields). But if I enable
highlighting on this text field, which contains a large number of
characters/words (> 100,000 characters), performance suffers. Queries
return in about 15-20 seconds with this field enabled in highlighting,
compared to less than a second without it.
I did try termvector=true, but I did not see any performance gain either.
Just wondering if you were able to solve your issue or tweak the
performance in any other way.
BTW, I use Solr 1.3.

~Ravi .

goodieboy wrote:
Thanks Otis. I added termVector="true" for those fields, but there isn't a
noticeable difference. So, just to be a little more clear, the dynamic
fields I'm adding... there might be hundreds. Do you see this as a
problem?

Thanks,
Matt

On Fri, May 15, 2009 at 7:48 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:

Matt,

I believe indexing those fields that you will use for highlighting with
term vectors enabled will make things faster (and your index a bit
bigger).


Otis --
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
From: Matt Mitchell <goodie...@gmail.com>
To: solr-user@lucene.apache.org
Sent: Friday, May 15, 2009 5:08:23 PM
Subject: highlighting performance

Hi,

I'm experimenting with highlighting and am noticing a big drop in
performance with my setup. I have documents that use quite a few dynamic
fields (20-30). The fields are multiValued stored/indexed text fields,
each with a few paragraphs' worth of text. My hl.fl param is set to *_t

What kinds of things can I tweak to make this faster? Is it because I'm
highlighting so many different fields?

Thanks,
Matt
Quoted from: http://www.nabble.com/highlighting-performance-tp23567323p23713406.html
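
For anyone landing on this thread with the same question: one thing to try, besides term vectors, is narrowing what the highlighter has to look at on each request. The sketch below only builds a request URL. It assumes the standard Solr highlighting parameters hl, hl.fl, hl.snippets, hl.fragsize and hl.maxAnalyzedChars are available in your version (check the docs for your release). The host, the field names title_t and body_t, and the default values are made up for illustration and are not taken from anyone's setup in this thread.

```python
from urllib.parse import urlencode


def build_highlight_url(base_url, query, fields="*_t",
                        max_chars=10000, snippets=1, fragsize=100):
    """Build a Solr select URL with highlighting parameters.

    The defaults here are illustrative assumptions, not Solr's defaults.
    """
    params = [
        ("q", query),
        ("hl", "true"),
        # Listing fields explicitly instead of a glob such as *_t means
        # fewer fields are processed per document.
        ("hl.fl", fields),
        ("hl.snippets", str(snippets)),
        ("hl.fragsize", str(fragsize)),
        # Caps how many characters of each field the highlighter analyzes,
        # which matters most for very large text fields.
        ("hl.maxAnalyzedChars", str(max_chars)),
    ]
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical host and field names.
    print(build_highlight_url(
        "http://localhost:8983/solr/select",
        "body_t:lucene",
        fields="title_t,body_t",
    ))
```

Whether this helps depends on the data: with hundreds of dynamic fields, an explicit hl.fl list is probably the first thing to measure, and a lower hl.maxAnalyzedChars trades missed matches late in a long field for speed.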




