Re: Highlighting uses lots of memory and eventually slows down Solr

2011-12-19 Thread Pranav Prakash
No response! Bumping this up.

*Pranav Prakash*

temet nosce

Twitter http://twitter.com/pranavprakash | Blog http://blog.myblive.com |
Google http://www.google.com/profiles/pranny



Highlighting uses lots of memory and eventually slows down Solr

2011-12-09 Thread Pranav Prakash
Hi Group,

I would like to enable highlighting for search, and I have the fields indexed
with the following schema (Solr 3.4):

<fieldType name="text_commongrams" class="solr.TextField">
  <analyzer>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"
            protected="protwords.txt"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.CommonGramsFilterFactory" words="stopwords_en.txt"
            ignoreCase="true"/>
    <filter class="solr.StopFilterFactory" words="stopwords_en.txt"
            ignoreCase="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="1" catenateNumbers="1"
            catenateAll="0" preserveOriginal="1"/>
  </analyzer>
</fieldType>

<field name="transcript" type="text_commongrams" indexed="true" stored="true"
       termVectors="true" termPositions="true" termOffsets="true"/>

<dynamicField name="*_en" type="text_commongrams" indexed="true" stored="true"
              termVectors="true" termPositions="true" termOffsets="true"/>

And the following config

<highlighting>
  <fragmenter name="gap" class="org.apache.solr.highlight.GapFragmenter"
              default="true">
    <lst name="defaults">
      <int name="hl.fragsize">100</int>
    </lst>
  </fragmenter>
  <fragmenter name="regex" class="org.apache.solr.highlight.RegexFragmenter">
    <lst name="defaults">
      <int name="hl.fragsize">20</int>
      <float name="hl.regex.slop">0.5</float>
      <str name="hl.regex.pattern">[-\w ,/\n\']{20,200}</str>
    </lst>
  </fragmenter>
  <formatter name="html" class="org.apache.solr.highlight.HtmlFormatter"
             default="true">
    <lst name="defaults">
      <str name="hl.simple.pre"><![CDATA[ <strong> ]]></str>
      <str name="hl.simple.post"><![CDATA[ </strong> ]]></str>
    </lst>
  </formatter>
</highlighting>

The problem is that when I turn on highlighting, I run into memory issues.
Memory usage on the system climbs steadily until it consumes nearly all
available memory (I don't receive OOM errors; there is always about 300 MB
free). The machine has 48 GiB of memory in total. My index size is 138 GiB,
and there are about 10 million documents in the index.

I also get the following warning, but I am not sure how to make the change it
asks for.

WARNING: Deprecated syntax found. <highlighting/> should move to
<searchComponent/>
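
For reference, a minimal sketch of what the warning appears to ask for, assuming the layout used in the Solr 3.x example solrconfig.xml: the existing <highlighting> block is nested inside a searchComponent of class solr.HighlightComponent. The component name "highlight" is taken from that example and is an assumption here, only the first fragmenter is shown, and nothing in this thread establishes that the move changes memory usage.

```xml
<!-- Sketch only: wraps the existing <highlighting> section in a
     searchComponent, as the deprecation warning suggests.
     name="highlight" is assumed from the Solr example config. -->
<searchComponent class="solr.HighlightComponent" name="highlight">
  <highlighting>
    <fragmenter name="gap" class="org.apache.solr.highlight.GapFragmenter"
                default="true">
      <lst name="defaults">
        <int name="hl.fragsize">100</int>
      </lst>
    </fragmenter>
    <!-- the regex fragmenter and html formatter from the config above
         would move here unchanged -->
  </highlighting>
</searchComponent>
```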

My Solr log with highlighting turned on looks something like this

[core0] webapp=/solr path=/select
params={mm=390%25&qf=title^2&hl.simple.pre=<strong>&hl.fl=title,transcript,transcript_en&wt=ruby&hl=true&rows=12&defType=dismax&fl=id,title,description&debugQuery=false&start=0&q=asdfghjkl&bf=recip(ms(NOW,created_at),1.88e-11,1,1)&hl.simple.post=</strong>&ps=50}

Any help on this would be greatly appreciated. Thanks in advance!

*Pranav Prakash*

temet nosce

Twitter http://twitter.com/pranavprakash | Blog http://blog.myblive.com |
Google http://www.google.com/profiles/pranny