[jira] Issue Comment Edited: (LUCENE-1483) Change IndexSearcher multisegment searches to search each individual segment using a single HitCollector

2009-01-23 Thread Mark Miller (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1201#action_1201
]

markrmil...@gmail.com edited comment on LUCENE-1483 at 1/23/09 9:13 AM:
--

bq. I was looking after the initial warmup, but noticed no difference. Maybe
the string field I used was not distinct enough. What is a good number for a
noticeable speed improve (50% distinct terms?).

He's not talking about speed after the warm-up; he's saying the warm-up itself
should be faster based on that.

It's because of this:

The old way, if you had 5 segments with unique term counts of 50,000, 6000,
6000, 5, 5, we would try to load all 62,010 terms for every segment:
5 x 62,010 = 310,050.

With the new way, we load 50,000 terms for the first, 6000 for the next, then
6000, then 5 and 5: total of 62,010.

Even though most of the 50,000 won't be found in the 5 term segment, it still
takes a long time to check them all. So the more unique terms and the more
segments, the worse the problem got.
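A minimal sketch of that arithmetic in Java. The class and method names are made up for illustration and are not Lucene code, and it assumes (as the example numbers do) that the segments share no terms, so the combined term count is just the sum.

```java
// Hypothetical helper, not part of Lucene: counts term lookups under each scheme.
final class TermLoadCost {

    // Old way: the combined term list is checked against every segment.
    static long multiReaderLookups(int[] uniqueTermsPerSegment) {
        long allTerms = 0;
        for (int n : uniqueTermsPerSegment) {
            allTerms += n;
        }
        return allTerms * uniqueTermsPerSegment.length;
    }

    // New way: each segment only enumerates its own terms.
    static long perSegmentLookups(int[] uniqueTermsPerSegment) {
        long total = 0;
        for (int n : uniqueTermsPerSegment) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] segments = {50_000, 6_000, 6_000, 5, 5};
        System.out.println(multiReaderLookups(segments)); // 310050
        System.out.println(perSegmentLookups(segments));  // 62010
    }
}
```

The gap grows with both the number of segments and the number of unique terms, which matches the point made above.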

*edit*
little fix on those numbers

was (Author: markrmil...@gmail.com):
bq. I was looking after the initial warmup, but noticed no difference.
Maybe the string field I used was not distinct enough. What is a good number
for a noticeable speed improve (50% distinct terms?).

Hes not saying after the warm up, but that the warm up should be faster based
on that.

Its because of this:

The old way, if you had 5 segments with unique terms distributions of 50,000,
6000, 6000, 5, 5, then for the old way, we would try to load all 50,000 terms
for every segment - 5 x 5 - 250,000.

With the new way, we load 50,000 terms for the first, 6000 for the next, then
6000, then 5 and 5: total of 62,000.

Even though most of the 50,000 wont be found in the 5 term segment, it still
takes a long time to check them all. So the more unique terms and the more
segments, the worse the problem got.

Change IndexSearcher multisegment searches to search each individual segment
using a single HitCollector

Key: LUCENE-1483
URL: https://issues.apache.org/jira/browse/LUCENE-1483
Project: Lucene - Java
Issue Type: Improvement
Affects Versions: 2.9
Reporter: Mark Miller
Priority: Minor
Attachments: LUCENE-1483-partial.patch, LUCENE-1483.patch,
LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch,
LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch,
LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch,
LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch,
LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch,
LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch,
LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch,
LUCENE-1483.patch, LUCENE-1483.patch, sortBench.py, sortCollate.py

This issue changes how an IndexSearcher searches over multiple segments. The
current method of searching multiple segments is to use a MultiSegmentReader
and treat all of the segments as one. This causes filters and FieldCaches to
be keyed to the MultiReader and makes reopen expensive. If only a few
segments change, the FieldCache is still loaded for all of them.

This patch changes things by searching each individual segment one at a time,
but sharing the HitCollector used across each segment. This allows
FieldCaches and Filters to be keyed on individual SegmentReaders, making
reopen much cheaper. FieldCache loading over multiple segments can be much
faster as well - with the old method, all unique terms for every segment are
enumerated against each segment - because of the likely logarithmic change in
terms per segment, this can be very wasteful. Searching individual segments
avoids this cost. The term/document statistics from the multireader are used
to score results for each segment.
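
A rough Java sketch of the "one collector shared across segments" idea. Everything here is a hypothetical stand-in rather than the patch's actual classes: Segment just holds pre-computed matching doc ids, and rebasing segment-local ids with a running offset (docBase) is an assumption about how hits from separate segments are put into one id space.

```java
import java.util.ArrayList;
import java.util.List;

final class PerSegmentSearchSketch {

    /** Hypothetical stand-in for a segment: its size and the segment-local ids that match. */
    static final class Segment {
        final int maxDoc;
        final int[] matchingDocs;

        Segment(int maxDoc, int... matchingDocs) {
            this.maxDoc = maxDoc;
            this.matchingDocs = matchingDocs;
        }
    }

    /** Hypothetical collector that is reused for every segment. */
    static final class SharedCollector {
        private int docBase;
        final List<Integer> hits = new ArrayList<>();

        // Called before each segment so local ids can be rebased.
        void setNextSegment(int docBase) {
            this.docBase = docBase;
        }

        void collect(int segmentDoc) {
            hits.add(docBase + segmentDoc);
        }
    }

    /** Searches the segments one at a time, feeding all hits to the same collector. */
    static List<Integer> search(List<Segment> segments) {
        SharedCollector collector = new SharedCollector();
        int docBase = 0;
        for (Segment segment : segments) {
            collector.setNextSegment(docBase);
            for (int doc : segment.matchingDocs) {
                collector.collect(doc);
            }
            docBase += segment.maxDoc;
        }
        return collector.hits;
    }
}
```

Because each segment is visited on its own, anything cached per segment (a filter, a FieldCache entry) only needs rebuilding for segments that actually changed.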

When sorting, it's more difficult to use a single HitCollector for each sub
searcher. Ordinals are not comparable across segments. To account for this, a
new field-sort-enabled HitCollector is introduced that is able to collect and
sort across segments (because of its ability to compare ordinals across
segments). This TopFieldCollector class will collect the values/ordinals for
a given segment, and upon moving to the next segment, translate any
ordinals/values so that they can be compared against the values for the new
segment. This is done lazily.
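
One way such a translation could work, sketched in Java. This is an illustration under assumptions, not the patch's implementation: it assumes each segment exposes its terms as a sorted String array, it maps a queued value onto the new segment's ordinals with a binary search, and it leaves out the lazy part. The names translate and compareSlotToDoc are hypothetical.

```java
import java.util.Arrays;

final class OrdTranslationSketch {

    /**
     * Maps a value collected in an earlier segment onto the ordinals of a new segment.
     * Returns the index of the value in sortedTerms if it is present; otherwise the
     * index of the largest term below it (-1 if the value sorts before every term).
     */
    static int translate(String value, String[] sortedTerms) {
        int idx = Arrays.binarySearch(sortedTerms, value);
        // binarySearch returns -(insertionPoint) - 1 when the key is absent.
        return idx >= 0 ? idx : -idx - 2;
    }

    /**
     * Compares a translated queue slot with a document from the current segment.
     * slotExact says whether the slot's value actually exists in this segment.
     */
    static int compareSlotToDoc(int slotOrd, boolean slotExact, int docOrd) {
        if (slotOrd != docOrd) {
            return Integer.compare(slotOrd, docOrd);
        }
        // An inexact slot sorts just after the term it was mapped onto.
        return slotExact ? 0 : 1;
    }
}
```

After translation, hits in the new segment can be compared against queued entries by integer ordinal alone, which is the point of converting rather than comparing strings for every hit.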

All in all, the switch seems to provide numerous performance benefits, in
both sorted and non-sorted search. We were seeing a good loss on indices with
lots of segments (1000?) and certain

[jira] Issue Comment Edited: (LUCENE-1483) Change IndexSearcher multisegment searches to search each individual segment using a single HitCollector

2009-01-23 Thread Uwe Schindler (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666801#action_12666801
]

thetaphi edited comment on LUCENE-1483 at 1/23/09 4:07 PM:

bq. So null -- I cannot be split into sub-readers; empty array -- I am a null
reader; array.length > 0 -- I do have sequential sub-readers?

This is a good optimization. If a MultiReader returned null instead of an
empty array, it wouldn't be a problem (the empty reader would simply be searched
and yield no results). But returning an empty array is better in this case. So
gatherSubReaders() should only check for null, adding the parent reader itself
to the List in that case, and do the recursion in all other cases.
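A minimal Java sketch of that rule. The Reader interface below is a hypothetical stand-in that models only the three-way contract from the quoted question (null, empty array, non-empty array); it is not the real IndexReader class, and the method body is a guess at the shape of gatherSubReaders(), not the patch itself.

```java
import java.util.ArrayList;
import java.util.List;

final class GatherSubReadersSketch {

    /** Hypothetical reader type; only the sub-reader contract is modeled. */
    interface Reader {
        /** null: cannot be split; empty array: empty reader; otherwise the sub-readers. */
        Reader[] getSequentialSubReaders();
    }

    /** Collects the readers that cannot be split any further, in order. */
    static void gatherSubReaders(List<Reader> leaves, Reader reader) {
        Reader[] subReaders = reader.getSequentialSubReaders();
        if (subReaders == null) {
            // Not splittable: search this reader itself.
            leaves.add(reader);
        } else {
            // An empty array simply contributes nothing.
            for (Reader sub : subReaders) {
                gatherSubReaders(leaves, sub);
            }
        }
    }

    public static void main(String[] args) {
        Reader leaf = () -> null;
        Reader empty = () -> new Reader[0];
        Reader top = () -> new Reader[] {leaf, empty};
        List<Reader> leaves = new ArrayList<>();
        gatherSubReaders(leaves, top);
        System.out.println(leaves.size());
    }
}
```

With the empty-array convention, an empty reader drops out of the list without a special case, which is why only the null check is needed.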

was (Author: thetaphi):
bq. So null -- I cannot be split into sub-readers; empty array -- I am a
null reader; array.length > 0 -- I do have sequential sub-readers?


[jira] Issue Comment Edited: (LUCENE-1483) Change IndexSearcher multisegment searches to search each individual segment using a single HitCollector

2009-01-22 Thread Mark Miller (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666163#action_12666163
]

markrmil...@gmail.com edited comment on LUCENE-1483 at 1/22/09 9:58 AM:
--

Nice work Mike - pretty polished. I've spent a little time looking it over, but
I'm going to look more tonight. Everything is looking pretty good to me.

Not sure what to name that new class, but here are some ideas:

TopScoreDocCollector
TopHitCollector
TopResultCollector
TopMatchCollector
TopCollector
TopScoreCollector

Could be a low score, so that last one is odd, but I guess the low would kind
of be the top...

*edit*
Never mind...I was thinking lowest score could be considered top match, but it
wouldn't be the case with this HitCollector implementation, so I guess it makes
as much sense as any of the others.

was (Author: markrmil...@gmail.com):
Nice work Mike - pretty polished. I've spent a little time looking it over,
but I'm going to look more tonight. Everything looking pretty good to me.

Not sure what to name that new class, but here are some ideas:

TopScoreDocCollector
TopHitCollector
TopResultCollector
TopMatchCollector
TopCollector
TopScoreCollector

Could be a low score, so that last one is odd, but I guess the low would kind
of be the top...


[jira] Issue Comment Edited: (LUCENE-1483) Change IndexSearcher multisegment searches to search each individual segment using a single HitCollector

2009-01-18 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664984#action_12664984
 ] 

markrmil...@gmail.com edited comment on LUCENE-1483 at 1/18/09 2:19 PM:
--

My previous results had a few oddities going with them (I was loosely playing
around). Being a little more careful, here is an example of the difference, and
the hotspots. Timings are probably not completely comparable as my comp couldn't
keep up profiling the second version very well - it's much slower without
profiling as well though:

Index is 60 docs, 46 segments, 63849 unique terms.

Load the fieldcache on one multireader

||method||time||invocations||
|FieldCacheImpl.createValue|156536(98%)|1|
|MultiTermDocs.next()|148499(93.5%)|621803|
|MultiTermDocs(int)|140397(88.4%)|1002938|
|SegmentTermDocs.seek(Term)|138332(87.1%)|1002938|

Load the fieldcache on each sub reader of the multireader, one at a time

||method||time||invocations||
|FieldCacheImpl.createValue|7815(80.4%)|46|
|SegmentTermDocs.next()|3315(34.1%)|642046|
|SegmentTermEnum.next()|1936(19.9%)|42046|
|SegmentTermDocs.seek(TermEnum)|874(9%)|42046|


*edit*
wrong values





  was (Author: markrmil...@gmail.com):
My previous results had a few oddities going with them (I was loosely 
playing around). Being a little more careful, here is an example of the 
difference, and the hotspots. Timings are probably not completely comparable as 
my comp couldnt keep up profiling the second version very well - its much 
slower without profiling as well though:

Index is 60 docs, 46 segments, 63849 unique terms.

Load the fieldcache on one multireader

||method||time||invocations||
|FieldCacheImpl.createValue|156536(98%)|1|
|MultiTermDocs.next()|148499(93.5%)|621803|
|MutliTermDocs(int)|140397(88.4%)|1002938|
|SegmentTermDocs.seek(Term)|138332(87.1%)|1002938|

load the fieldcache on each sub reader of the multireader, one at a time

||method||time||invocations||
|FieldCacheImpl.createValue|7815(80.4%)|46|
|SegmentTermDocs.next()|3315(34.1%)|642046|
|SegmentTermEnum.next()|1936(19.9%)|42046|
|SegmentTermDocs.seek(TermEnum)|874(9%)|42046|


Unique terms per segment:
21312,41837,41843,41849,41854,41860,41865,41870,41878,41883,41888,41894,41902,41906,41910,41912,41916,41921,41924
41930,41932,41936,41943,41947,41951,41956,41960,41964,41970,41974,41979,41982,41989,41994,41999,42002,42005
42007,42011,42016,42020,42026,42033,42039,42044,42046




  
 Change IndexSearcher multisegment searches to search each individual segment 
 using a single HitCollector
 

 Key: LUCENE-1483
 URL: https://issues.apache.org/jira/browse/LUCENE-1483
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 2.9
Reporter: Mark Miller
Priority: Minor
 Attachments: LUCENE-1483-partial.patch, LUCENE-1483.patch, 
 LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
 LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
 LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
 LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
 LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
 LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
 LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
 sortBench.py, sortCollate.py


 FieldCache and Filters are forced down to a single segment reader, allowing 
 for individual segment reloading on reopen.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org

[jira] Issue Comment Edited: (LUCENE-1483) Change IndexSearcher multisegment searches to search each individual segment using a single HitCollector

2009-01-18 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664984#action_12664984
 ] 

markrmil...@gmail.com edited comment on LUCENE-1483 at 1/18/09 8:20 PM:
--

My previous results had a few oddities going with them (I was loosely playing
around). Being a little more careful, here is an example of the difference, and
the hotspots. Timings are probably not completely comparable as my comp couldn't
keep up profiling the second version very well - it's much slower without
profiling as well though:

Index is 60 docs, 46 segments

Load the fieldcache on one multireader

||method||time||invocations||
|FieldCacheImpl.createValue|156536(98%)|1|
|MultiTermDocs.next()|148499(93.5%)|621803|
|MultiTermDocs(int)|140397(88.4%)|1002938|
|SegmentTermDocs.seek(Term)|138332(87.1%)|1002938|

Load the fieldcache on each sub reader of the multireader, one at a time

||method||time||invocations||
|FieldCacheImpl.createValue|7815(80.4%)|46|
|SegmentTermDocs.next()|3315(34.1%)|642046|
|SegmentTermEnum.next()|1936(19.9%)|42046|
|SegmentTermDocs.seek(TermEnum)|874(9%)|42046|


*edit*
wrong values





  was (Author: markrmil...@gmail.com):
My previous results had a few oddities going with them (I was loosely 
playing around). Being a little more careful, here is an example of the 
difference, and the hotspots. Timings are probably not completely comparable as 
my comp couldnt keep up profiling the second version very well - its much 
slower without profiling as well though:

Index is 60 docs, 46 segments, 63849 unique terms.

Load the fieldcache on one multireader

||method||time||invocations||
|FieldCacheImpl.createValue|156536(98%)|1|
|MultiTermDocs.next()|148499(93.5%)|621803|
|MutliTermDocs(int)|140397(88.4%)|1002938|
|SegmentTermDocs.seek(Term)|138332(87.1%)|1002938|

load the fieldcache on each sub reader of the multireader, one at a time

||method||time||invocations||
|FieldCacheImpl.createValue|7815(80.4%)|46|
|SegmentTermDocs.next()|3315(34.1%)|642046|
|SegmentTermEnum.next()|1936(19.9%)|42046|
|SegmentTermDocs.seek(TermEnum)|874(9%)|42046|


*edit*
wrong values




  

[jira] Issue Comment Edited: (LUCENE-1483) Change IndexSearcher multisegment searches to search each individual segment using a single HitCollector

2009-01-13 Thread Mark Miller (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663607#action_12663607
]

markrmil...@gmail.com edited comment on LUCENE-1483 at 1/13/09 7:03 PM:
--

Disregarding any missing gains with those simple policies, the rest of those
numbers actually look pretty good! Still some problems here and there (large
queue size still sticky), but overall some solid gains as well.

orddem seems to be best in most cases currently - maybe we can tweak that a
little more somehow. Where it's not better, or not much worse, is with a single
segment. That result is interesting, because both policies beat it nicely, and
it's because they simply use straight ord on the first segment. But ordsubord
seems to outperform the policies. That doesn't make sense. It's largely the
same, but should be a tad slower if anything. Other results match up so nicely,
it seems like it might not be noise, in which case, weird.

was (Author: markrmil...@gmail.com):
Disregarding any any with those simple policies, the rest of those numbers
actually look pretty good! Still some problems here and there (large queue size
still sticky), but overall some solid gains as well.

orddem seems to be best in most cases currently - maybe we can tweak that a
little more somehow. Where its not better, or not much worse, is with a single
segment. That result is interesting, because both policies beat it nicely, and
its because they simple use straight ord on the first segment. But ordsubord
seems to outperform the policies. That doesn't make sense. Its largely the
same, but should be a tad slower if anything. Other results match up so nicely,
it seems like it might not be noise, in which case, weird.


[jira] Issue Comment Edited: (LUCENE-1483) Change IndexSearcher multisegment searches to search each individual segment using a single HitCollector

2009-01-08 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662038#action_12662038
 ] 

markrmil...@gmail.com edited comment on LUCENE-1483 at 1/8/09 9:15 AM:
-

It's the ORDSUBORD again (which I don't think we will use) and the two Policies.
Odd because it's the last hit of 10 that fails for all 3. I'll ferret it out
tonight.

- Mark

*EDIT*

yup...always the last entry that's wrong no matter the queue size - for all 3,
which is odd because ORD_SUBORD doesn't have too much of a relationship to the
two policies. Will be a fun one.

  was (Author: markrmil...@gmail.com):
Its the ORDSUBORD again (which I don't think we will use) and the two 
Policies. Odd because its the last hit of  10 that fails for all 3. I'll ferret 
it out tonight.

- Mark
  

[jira] Issue Comment Edited: (LUCENE-1483) Change IndexSearcher multisegment searches to search each individual segment using a single HitCollector

2009-01-06 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661160#action_12661160
 ] 

markrmil...@gmail.com edited comment on LUCENE-1483 at 1/6/09 6:57 AM:
-

bq. Mark, I see 3 testcase failures in TestSort if I pretend that 
SortField.STRING means STRING_ORD - do you see that?

Yeah, sorry. That STRING_ORD custom comparator policy is just a joke really, so
I only really tested it on the StringSort test. It's just not initializing the
ords along with the values on switching. Making ords package private so that it
can be changed (and changing it) fixes things. Not sure about new constructors
or package private for that part of the switch...

bq. I think we should fix TestSort so that it runs N times, each time using a 
different STRING sort method, to make sure we are covering all these methods?

Yeah, this makes sense in any case. I just keep switching them by hand as I 
work on them.

  was (Author: markrmil...@gmail.com):
bq. Mark, I see 3 testcase failures in TestSort if I pretend that 
SortField.STRING means STRING_ORD - do you see that?

Yeah, sorry. That STRING_ORD custom comparator is just a joke really, so I only 
really tested it on the StringSort test. It's just not initing the ords along 
with the values on switching. Making ords package private so that it can be 
changed (and changing it) fixes things. Not sure about new constructors or 
package private for that part of the switch...

bq. I think we should fix TestSort so that it runs N times, each time using a 
different STRING sort method, to make sure we are covering all these methods?

Yeah, this makes sense in any case. I just keep switching them by hand as I 
work on them.
  

[jira] Issue Comment Edited: (LUCENE-1483) Change IndexSearcher multisegment searches to search each individual segment using a single HitCollector

2009-01-02 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12660322#action_12660322
 ] 

markrmil...@gmail.com edited comment on LUCENE-1483 at 1/2/09 6:24 AM:
-

So what looks like a promising strategy?

Off the top I am thinking something as simple as:

start with ORD with no fallback on the largest.
if the next segments are fairly large, use ORD_VAL
if the segments get somewhat smaller, move to ORD_DEM

Oddly, I've seen VAL perform well in certain situations, so maybe it has its 
place, but I don't know where yet.

*edit*

Oh, yeah, queue size should also play a role in the switching
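
A toy Java sketch of the switching idea above. Everything in it is an assumption for illustration: the Strategy enum just names the ORD, ORD_VAL and ORD_DEM comparators mentioned in the comment, the 25% cutoff for "fairly large" is invented, segments are assumed to be visited largest first, and queue size is not modeled at all.

```java
final class ComparatorPolicySketch {

    /** Names taken from the discussion; the comparators themselves are not modeled. */
    enum Strategy { ORD, ORD_VAL, ORD_DEM }

    // Hypothetical cutoff: a segment at least this fraction of the largest is "fairly large".
    static final double LARGE_FRACTION = 0.25;

    /**
     * Picks a comparator for a segment.
     * segmentIndex is the position in visiting order, with 0 being the largest segment.
     */
    static Strategy choose(int segmentIndex, int segmentSize, int largestSegmentSize) {
        if (segmentIndex == 0) {
            return Strategy.ORD; // plain ordinals, no fallback, on the largest segment
        }
        if (segmentSize >= LARGE_FRACTION * largestSegmentSize) {
            return Strategy.ORD_VAL;
        }
        return Strategy.ORD_DEM;
    }

    public static void main(String[] args) {
        System.out.println(choose(0, 1000, 1000));
        System.out.println(choose(1, 500, 1000));
        System.out.println(choose(2, 10, 1000));
    }
}
```

A real policy would need measured cutoffs and would also take queue size into account, as noted in the edit.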

  was (Author: markrmil...@gmail.com):
So what looks like a promising strategy?

Off the top I am thinking something as simple as:

start with ORD with no fallback on the largest.
if the next segments are fairly large, use ORD_VAL
if the segments get somewhat smaller, move to ORD_DEM

Oddly, I've seen VAL perform well in certain situations, so maybe it has its 
place, but I don't know where yet.
  
