Hi Group,

I would like to enable highlighting for search. The fields are indexed with the following schema (Solr 3.4):

<fieldType name="text_commongrams" class="solr.TextField">
  <analyzer>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.CommonGramsFilterFactory" words="stopwords_en.txt" ignoreCase="true"/>
    <filter class="solr.StopFilterFactory" words="stopwords_en.txt" ignoreCase="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" preserveOriginal="1"/>
  </analyzer>
</fieldType>

<field name="transcript" type="text_commongrams" indexed="true" stored="true" termVectors="true" termPositions="true" termOffsets="true"/>
<dynamicField name="*_en" type="text_commongrams" indexed="true" stored="true" termVectors="true" termPositions="true" termOffsets="true"/>

And the following config:

<highlighting>
  <fragmenter name="gap" class="org.apache.solr.highlight.GapFragmenter" default="true">
    <lst name="defaults">
      <int name="hl.fragsize">100</int>
    </lst>
  </fragmenter>
  <fragmenter name="regex" class="org.apache.solr.highlight.RegexFragmenter">
    <lst name="defaults">
      <int name="hl.fragsize">20</int>
      <float name="hl.regex.slop">0.5</float>
      <str name="hl.regex.pattern">[-\w ,/\n\"']{20,200}</str>
    </lst>
  </fragmenter>
  <formatter name="html" class="org.apache.solr.highlight.HtmlFormatter" default="true">
    <lst name="defaults">
      <str name="hl.simple.pre"><![CDATA[<strong>]]></str>
      <str name="hl.simple.post"><![CDATA[</strong>]]></str>
    </lst>
  </formatter>
</highlighting>

The problem is that when I turn on highlighting, I face memory issues.
Memory usage on the system keeps climbing until it consumes nearly all available memory. I don't receive OOM errors; there is always about 300 MB of memory left free. The machine has 48 GiB of memory in total. My index size is 138 GiB and there are about 10 million documents in the index.

I also get the following warning, but I am not sure how to address it:

WARNING: Deprecated syntax found. <highlighting/> should move to <searchComponent/>

My Solr log with highlighting turned on looks something like this:

[core0] webapp=/solr path=/select params={mm=3<90%25&qf=title^2&hl.simple.pre=<strong>&hl.fl=title,transcript,transcript_en&wt=ruby&hl=true&rows=12&defType=dismax&fl=id,title,description&debugQuery=false&start=0&q=asdfghjkl&bf=recip(ms(NOW,created_at),1.88e-11,1,1)&hl.simple.post=</strong>&ps=50}

Any help on this would be greatly appreciated. Thanks in advance!

Pranav Prakash

"temet nosce"

Twitter <http://twitter.com/pranavprakash> | Blog <http://blog.myblive.com> | Google <http://www.google.com/profiles/pranny>
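
A note on the deprecation warning: in the example solrconfig.xml that ships with Solr 3.x, the <highlighting> block is nested inside a <searchComponent> declared with the highlight component class instead of sitting at the top level. The fragment below is an untested sketch of that move, reusing the settings posted above. The component name "highlight" and the class solr.HighlightComponent follow the stock example config and are assumptions relative to this setup. It is meant to address only the warning and may well have no effect on the memory growth.

```xml
<!-- Sketch only: the <highlighting> settings from the post, wrapped in a
     searchComponent as the deprecation warning asks. The name and class of
     the component are taken from the stock Solr 3.x example config. -->
<searchComponent class="solr.HighlightComponent" name="highlight">
  <highlighting>
    <fragmenter name="gap" class="org.apache.solr.highlight.GapFragmenter" default="true">
      <lst name="defaults">
        <int name="hl.fragsize">100</int>
      </lst>
    </fragmenter>
    <fragmenter name="regex" class="org.apache.solr.highlight.RegexFragmenter">
      <lst name="defaults">
        <int name="hl.fragsize">20</int>
        <float name="hl.regex.slop">0.5</float>
        <str name="hl.regex.pattern">[-\w ,/\n\"']{20,200}</str>
      </lst>
    </fragmenter>
    <formatter name="html" class="org.apache.solr.highlight.HtmlFormatter" default="true">
      <lst name="defaults">
        <str name="hl.simple.pre"><![CDATA[<strong>]]></str>
        <str name="hl.simple.post"><![CDATA[</strong>]]></str>
      </lst>
    </formatter>
  </highlighting>
</searchComponent>
```

The old top-level <highlighting> element would be removed from solrconfig.xml once this component is in place, so the settings are not declared twice.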