On 9/23/2011 1:45 AM, Pranav Prakash wrote:
Maybe I am wrong, but my intention in using both of them is this: first, I want to use phrase queries, so I used CommonGramsFilterFactory. Second, I don't want those stopwords in my index, so I used StopFilterFactory to remove them.

CommonGrams is not necessary for phrase queries. If you have a super-dense index with very large documents, it will reduce the amount of memory used by Solr, which can make phrase queries faster. It comes at a large expense in disk space, because your index gets considerably larger. The trade-off in index size vs. memory usage may not be worth it. For an index like the Hathi Trust, the trade-off is worthwhile.
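To illustrate why the index grows, here is a rough Python sketch of the idea behind CommonGrams. It is not the Lucene implementation: the function name `common_grams`, the sample sentence, and the word list are all made up for the example, and it ignores token positions. It keeps every single term and adds a bigram, joined with an underscore, wherever a common word sits next to another term.

```python
def common_grams(tokens, common_words):
    """Simplified illustration: keep every unigram and add a bigram
    for each adjacent pair in which at least one token is a common word."""
    out = []
    for i, tok in enumerate(tokens):
        out.append(tok)
        if i + 1 < len(tokens):
            nxt = tokens[i + 1]
            if tok in common_words or nxt in common_words:
                out.append(f"{tok}_{nxt}")
    return out


# Hypothetical input: four tokens, two of them treated as common words.
tokens = "the rain in spain".split()
print(common_grams(tokens, {"the", "in"}))
# ['the', 'the_rain', 'rain', 'rain_in', 'in', 'in_spain', 'spain']
```

Four input tokens become seven indexed terms in this sketch, which is where the extra disk space goes. The payoff is that a phrase query containing a common word can match on a much rarer bigram instead of walking the long posting list of the common word.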

term    frequency
to      26164
and     25804
the     25566
of      25022
a       24918
in      24590
for     23646
n       23588
with    23055
is      22510

Is this typical of your production index size, or just a test? With numbers this low, neither CommonGrams nor StopFilter is really necessary. I suspect that these are probably test numbers, though.


  Did you delete the index and do a full reindex after you changed your schema?

Yup I did that a couple of times

I don't know what's going on here, but it sounds like your config might not be saying what you think it's saying. It might be a good idea to include your entire schema.xml and the name of the field that you are looking at for term frequency.

Thanks,
Shawn
