Great!

We have some very long queries, where students paste in entire homework
problems. One of them was 1051 words, and many are over 100 words, so
LUCENE-4571 could help.

In the Jira discussion, I saw some comments about handling the most sparse 
lists first. We did something like that in the Infoseek Ultra engine about 
twenty years ago. Short termlists (documents matching a term) were processed 
first, which kept the in-memory lists of matching docs small. It also allowed 
early short-circuiting for no-hits queries.
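
Roughly, the shortest-first idea looks like the sketch below. This is only an
illustration for an all-terms-required (AND) query, not the Infoseek code and
not what LUCENE-4571 does. The function name intersect_shortest_first and the
dict-of-lists index are made up for the example.

```python
from typing import Dict, List, Set


def intersect_shortest_first(index: Dict[str, List[int]], terms: List[str]) -> Set[int]:
    """Return the IDs of documents that contain every term (an AND query).

    `index` is a hypothetical in-memory inverted index mapping each term
    to its termlist, the list of IDs of documents matching that term.
    """
    # A term that is missing from the index has an empty termlist.
    termlists = [index.get(term, []) for term in terms]
    if not termlists:
        return set()

    # Process the shortest (sparsest) termlists first, so the working
    # set of matching docs starts small and can only shrink.
    termlists.sort(key=len)

    matches = set(termlists[0])
    for termlist in termlists[1:]:
        if not matches:
            # No-hits query: stop without reading the longer lists.
            break
        matches.intersection_update(termlist)
    return matches
```

The working set is never larger than the shortest termlist, and a term with no
matching documents sorts to the front, so a no-hits query exits before any of
the long lists are touched.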

What would be a high mm value, 75%?

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/


On Sep 4, 2014, at 11:52 PM, Mikhail Khludnev <mkhlud...@griddynamics.com> 
wrote:

> indeed https://issues.apache.org/jira/browse/LUCENE-4571
> my feeling is it gives a significant gain at high mm values.
> 
> 
> 
> On Fri, Sep 5, 2014 at 3:01 AM, Walter Underwood <wun...@wunderwood.org>
> wrote:
> 
>> Are there any speed advantages to using “mm”? I can imagine pruning the
>> set of matching documents early, which could help, but is that (or
>> something else) done?
>> 
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/
>> 
>> 
>> 
> 
> 
> -- 
> Sincerely yours
> Mikhail Khludnev
> Principal Engineer,
> Grid Dynamics
> 
> <http://www.griddynamics.com>
> <mkhlud...@griddynamics.com>
