Hi,
Thanks for the answer!
I added some logging around stemming, and what I can see is that a lot of
tokens are stemmed during highlighting. That is the strange part, since I
don't understand why a highlighter would need to stem again.
Anyway, my documents are not really large, just a few kilobytes, but thanks
for the suggestion.
If you could help me with how to skip stemming for highlighting, that
would be great!
Thanks,
Gyuri
2011/7/29 Mike Sokolov
> I'm not sure I would identify stemming as the culprit here.
>
> Do you have very large documents? If so, there is a patch for FVH
> committed to limit the number of phrases it looks at; see hl.phraseLimit,
> but this won't be available until 3.4 is released.
> You can also limit the amount of each document that is analyzed by the
> regular Highlighter using maxDocCharsToAnalyze (and maybe this applies to
> FVH? not sure)
>
> Using RegexFragmenter is also probably slower than something like
> SimpleFragmenter.
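>
> A minimal sketch of the two suggestions above as solrconfig.xml defaults,
> assuming the standard Highlighter is in use. The handler name and the
> character limit are illustrative assumptions, not values from this thread;
> hl.maxAnalyzedChars is the Solr request parameter that caps how much of
> each document is analyzed, and hl.fragmenter=gap selects Solr's
> SimpleFragmenter-based gap fragmenter instead of the regex one.
>
> ```xml
> <!-- Hypothetical handler name; use the handler your queries actually hit. -->
> <requestHandler name="/select" class="solr.SearchHandler">
>   <lst name="defaults">
>     <str name="hl">true</str>
>     <!-- "gap" instead of "regex": avoids per-fragment regex matching. -->
>     <str name="hl.fragmenter">gap</str>
>     <!-- Illustrative limit: analyze at most this many characters per document. -->
>     <int name="hl.maxAnalyzedChars">10000</int>
>   </lst>
> </requestHandler>
> ```
>
> The same parameters can also be passed per request (for example
> &hl.fragmenter=gap) to compare timings before changing the config.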
>
> There is work to implement faster highlighting for Solr/Lucene, but it
> depends on some basic changes to the search architecture so it might be a
> while before that becomes available. See
> https://issues.apache.org/jira/browse/LUCENE-3318 if you're interested in
> following that development.
>
> -Mike
>
>
> On 07/29/2011 04:55 AM, Orosz György wrote:
>
>> Dear all,
>>
>> I am quite new to Solr and would like to ask for your help.
>> I am developing an application which should be able to highlight the
>> results of a query. For this I am using the regex fragmenter:
>>
>>> class="org.apache.solr.highlight.RegexFragmenter">
>>
>> 500
>> 0.5
>> </str>
>>