On 9/23/2011 1:45 AM, Pranav Prakash wrote:
Maybe I am wrong, but my intention in using both of them is this: first,
I want to use phrase queries, so I used CommonGramsFilterFactory. Second,
I don't want those stopwords in my index, so I used StopFilterFactory
to remove them.
CommonGrams is not necessary for phrase queries. If you have a
super-dense index with very large documents, it will reduce the amount
of memory used by Solr, which can make phrase queries faster. It comes
at a large expense in disk space, because your index gets considerably
larger. The tradeoff in index size vs. memory usage may not be worth it.
For an index like the Hathi Trust, the tradeoff is worthwhile.
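
For reference, here is a rough sketch of the kind of analyzer chain being
discussed. It is not your actual config: the field type name
text_commongrams, the choice of tokenizer, and the stopwords.txt file name
are all assumptions made for illustration, so adjust them to match your
schema.xml.

```xml
<!-- Sketch only: the fieldType name, tokenizer choice, and stopwords.txt are assumptions -->
<fieldType name="text_commongrams" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Emits bigrams such as "the_cat" in addition to the single terms -->
    <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <!-- Runs after CommonGrams: drops the single stopwords, leaves the bigrams -->
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Query-side counterpart of CommonGramsFilterFactory -->
    <filter class="solr.CommonGramsQueryFilterFactory" words="stopwords.txt" ignoreCase="true"/>
  </analyzer>
</fieldType>
```

Filter order matters in a chain like this. If StopFilterFactory came
before CommonGramsFilterFactory, the stopwords would already be gone
before any bigrams could be formed from them.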
term    frequency
to      26164
and     25804
the     25566
of      25022
a       24918
in      24590
for     23646
n       23588
with    23055
is      22510
Is this typical of your production index size, or just a test? With
numbers this low, neither CommonGrams nor StopFilter is really
necessary. I suspect that these are probably test numbers, though.
Did you delete the index and do a full reindex after you changed your
schema?
Yup, I did that a couple of times.
I don't know what's going on here, but it sounds like your config might
not be saying what you think it's saying. It might be a good idea to
include your entire schema.xml and the name of the field that you are
looking at for term frequency.
Thanks,
Shawn