I'm using the spellchecker to offer suggestions, and collations
are giving me good results, but they turn out to be very slow if
my original query has any FQs in it.  With no FQs we can do 100
maxCollationTries in about 10 ms, but as maxCollationTries and the
number of FQs increase, response times climb very quickly.

         1    10    20    50   100 MaxCollationTries
0FQs     8     9    10    11    10
1FQ     11   160   599  1597  1668
2FQs    20   346  1163  3360  3361
3FQs    29   474  1852  5039  5095
4FQs    36   589  2463  6797  6807

All times are QTime in ms.

See that top row?  With no FQs, 50 maxCollationTries comes back
almost instantly.  Add just one FQ, though, and things go bad, and
they get worse with each FQ I add.  Also note that the times seem
to level off between 50 and 100 maxCollationTries.
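
To put a number on "worse", here is a small Python sketch that takes
the QTimes from the table above and computes how much each additional
FQ adds at a given maxCollationTries value.  The names TRIES, QTIMES
and added_cost_per_fq are made up for this illustration; only the
numbers come from the measurements.

```python
# QTimes in ms, copied from the table above.
# Keys: number of FQs (0..4); list positions follow TRIES.
TRIES = [1, 10, 20, 50, 100]
QTIMES = {
    0: [8, 9, 10, 11, 10],
    1: [11, 160, 599, 1597, 1668],
    2: [20, 346, 1163, 3360, 3361],
    3: [29, 474, 1852, 5039, 5095],
    4: [36, 589, 2463, 6797, 6807],
}


def added_cost_per_fq(tries):
    """Extra QTime (ms) contributed by each additional FQ at one tries value."""
    col = TRIES.index(tries)
    return [QTIMES[n][col] - QTIMES[n - 1][col] for n in range(1, 5)]


for t in TRIES:
    print(t, added_cost_per_fq(t))
```

At 50 and 100 tries every extra FQ adds roughly 1.6 to 1.8 seconds,
so the cost looks close to linear in the number of FQs.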

Here's a query that I've been using as a test:

df=title_tracings_t&
fl=flrid,nodeid,title_tracings_t&
q=bagdad+AND+diaries+AND+-parent_tracings:(bagdad+AND+diaries)&
spellcheck.q=bagdad+AND+diaries&
rows=4&
wt=xml&
sort=popular_score+desc,+grouping+asc,+copyrightyear+desc,+flrid+asc&
spellcheck=true&
spellcheck.dictionary=direct&
spellcheck.onlyMorePopular=false&
spellcheck.count=15&
spellcheck.extendedResults=false&
spellcheck.collate=true&
spellcheck.maxCollations=10&
spellcheck.maxCollationTries=50&
spellcheck.collateExtendedResults=true&
spellcheck.alternativeTermCount=5&
spellcheck.maxResultsForSuggest=10&
debugQuery=off&
fq=((grouping:"1"+OR+grouping:"2"+OR+grouping:"3")+OR+solrtype:"N")&
fq=((item_source:"F"+OR+item_source:"B"+OR+item_source:"M")+OR+solrtype:"N")&
fq={!tag%3Dgrouping}((grouping:"1"+OR+grouping:"2")+OR+solrtype:"N")&
fq={!tag%3Dlanguagecode}(languagecode:"eng"+OR+solrtype:"N")&

The only thing that changes between tests is the value of
spellcheck.maxCollationTries and how many FQs are at the end.
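
For anyone who wants to reproduce the grid, a rough Python sketch of
the test loop follows.  It is an illustration rather than the exact
script used for the numbers above.  Assumptions: SOLR_URL is a
placeholder endpoint, the helper names (build_query, parse_qtime,
qtime_for) are invented here, and FQs are dropped from the end of the
list, which may not match the order used in the original runs.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

# Hypothetical endpoint; substitute the real host, core, and handler.
SOLR_URL = "http://localhost:8983/solr/select"

# Fixed parameters from the test query (URL-decoded).
BASE_PARAMS = [
    ("df", "title_tracings_t"),
    ("fl", "flrid,nodeid,title_tracings_t"),
    ("q", "bagdad AND diaries AND -parent_tracings:(bagdad AND diaries)"),
    ("spellcheck.q", "bagdad AND diaries"),
    ("rows", "4"),
    ("wt", "xml"),
    ("sort", "popular_score desc, grouping asc, copyrightyear desc, flrid asc"),
    ("spellcheck", "true"),
    ("spellcheck.dictionary", "direct"),
    ("spellcheck.onlyMorePopular", "false"),
    ("spellcheck.count", "15"),
    ("spellcheck.extendedResults", "false"),
    ("spellcheck.collate", "true"),
    ("spellcheck.maxCollations", "10"),
    ("spellcheck.collateExtendedResults", "true"),
    ("spellcheck.alternativeTermCount", "5"),
    ("spellcheck.maxResultsForSuggest", "10"),
    ("debugQuery", "off"),
]

# The four filter queries from the test query (URL-decoded).
FQS = [
    '((grouping:"1" OR grouping:"2" OR grouping:"3") OR solrtype:"N")',
    '((item_source:"F" OR item_source:"B" OR item_source:"M") OR solrtype:"N")',
    '{!tag=grouping}((grouping:"1" OR grouping:"2") OR solrtype:"N")',
    '{!tag=languagecode}(languagecode:"eng" OR solrtype:"N")',
]


def build_query(tries, n_fqs):
    """Query string with the given maxCollationTries and the first n_fqs FQs."""
    params = list(BASE_PARAMS)
    params.append(("spellcheck.maxCollationTries", str(tries)))
    params.extend(("fq", fq) for fq in FQS[:n_fqs])
    return urlencode(params)


def parse_qtime(xml_text):
    """Pull QTime (ms) out of the responseHeader of an XML response."""
    root = ET.fromstring(xml_text)
    node = root.find("./lst[@name='responseHeader']/int[@name='QTime']")
    return int(node.text)


def qtime_for(tries, n_fqs):
    """Run one query against SOLR_URL and return its QTime in ms."""
    with urlopen(SOLR_URL + "?" + build_query(tries, n_fqs)) as resp:
        return parse_qtime(resp.read())
```

Calling qtime_for(t, n) for t in (1, 10, 20, 50, 100) and n in 0..4
would fill in a table like the one above, against a live server.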

Am I doing something wrong?  Do the collation internals not handle
FQs correctly?  The lookup and hit counts on filterCache seem to be
increasing just fine: N lookups produce N hits, so I don't think
caching is the problem.

We'd really like to be able to use the spellchecker.  The results
with only 10-20 maxCollationTries aren't nearly as good as what we
get at 100, but we can't afford the slow response time at 100, and
we can't do without the FQs either.

Thanks,
Andy


--
Andy Lester => a...@petdance.com => www.petdance.com => AIM:petdance
