On 6/13/07, Chris Hostetter <[EMAIL PROTECTED]> wrote:
: and grab all the ids via TermDocs.read().  But because there is no
: offset, MultiTermDocs returns a list from each sub-segment, forcing me
: to copy each partial int[] filled into my full int[].

it seems like at the very least, MultiTermDocs.read(int[],int[]) could
allocate new arrays to pass to current.read and do the array copying for
you.

That would slow down the other use case though... someone "streaming"
or incrementally handling batches of ids (like TermScorer does...)

: If one could specify a start/end into the array, this copying could be
: avoided (see partial and untested patch below).

what use case are you thinking of where specifying the end would be handy?

I didn't have one in mind, except it's pretty much free.
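
For readers following along, here is a minimal sketch of the offset-aware
read being discussed. It is not Lucene code: SegmentPostings,
MultiSegmentPostings, and the four-argument read() are hypothetical
stand-ins, assumed only for illustration. The multi-segment reader lets each
segment write straight into the caller's arrays, so no partial int[] has to
be copied; it still makes a second pass to add the segment's doc id base.

```java
/** Hypothetical stand-in for one segment's postings; not a Lucene class. */
class SegmentPostings {
    private final int[] docs;   // segment-local doc ids
    private final int[] freqs;  // term frequency for each doc
    private int pos = 0;        // next unread entry

    SegmentPostings(int[] docs, int[] freqs) {
        this.docs = docs;
        this.freqs = freqs;
    }

    /**
     * Offset-aware bulk read (assumed signature): writes up to len entries
     * starting at outDocs[offset] / outFreqs[offset], returns the count written.
     */
    int read(int[] outDocs, int[] outFreqs, int offset, int len) {
        int n = Math.min(len, docs.length - pos);
        System.arraycopy(docs, pos, outDocs, offset, n);
        System.arraycopy(freqs, pos, outFreqs, offset, n);
        pos += n;
        return n;
    }
}

/** Hypothetical multi-segment reader that fills one caller-owned array. */
class MultiSegmentPostings {
    private final SegmentPostings[] segments;
    private final int[] starts; // doc id base of each segment
    private int current = 0;    // segment currently being read

    MultiSegmentPostings(SegmentPostings[] segments, int[] starts) {
        this.segments = segments;
        this.starts = starts;
    }

    int read(int[] outDocs, int[] outFreqs, int offset, int len) {
        int filled = 0;
        while (filled < len && current < segments.length) {
            int from = offset + filled;
            // The segment writes directly into the caller's arrays.
            int n = segments[current].read(outDocs, outFreqs, from, len - filled);
            if (n == 0) {       // segment exhausted, move to the next one
                current++;
                continue;
            }
            // Second pass: convert segment-local ids to global ids.
            int base = starts[current];
            for (int i = from; i < from + n; i++) {
                outDocs[i] += base;
            }
            filled += n;
        }
        return filled;
    }
}
```

With an "end" (here, len) the caller can also reserve part of the array for
something else, which is about the only use the extra parameter has in this
sketch.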

Am I being naive, or would it also be useful to pass "base" to
current.read(int[],int[],...) so it's added to the docIds before they are
put in the array and we don't have to iterate over them again?

Yeah, that occurred to me also... but I decided to take baby steps :-)
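
A rough sketch of the "base" idea, again with hypothetical names
(RebasingPostings and its five-argument read() are assumptions, not an
existing Lucene API): the segment adds the base while it fills the array, so
the caller never has to iterate over the ids a second time.

```java
/** Hypothetical segment reader that rebases doc ids while copying. */
class RebasingPostings {
    private final int[] docs;   // segment-local doc ids
    private final int[] freqs;  // term frequency for each doc
    private int pos = 0;        // next unread entry

    RebasingPostings(int[] docs, int[] freqs) {
        this.docs = docs;
        this.freqs = freqs;
    }

    /**
     * Writes up to len entries starting at index offset, adding base to each
     * doc id as it is stored. Returns the number of entries written.
     */
    int read(int[] outDocs, int[] outFreqs, int offset, int len, int base) {
        int n = Math.min(len, docs.length - pos);
        for (int i = 0; i < n; i++) {
            outDocs[offset + i] = docs[pos + i] + base; // single pass
            outFreqs[offset + i] = freqs[pos + i];
        }
        pos += n;
        return n;
    }
}
```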

Yeah, the other thing that occurred to me is to also have a reverse of
the interface, a-la HitCollector. The JVM would probably need to
inline the call at runtime for this to be competitive though.
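
To make the "reverse" interface concrete, here is a small sketch in the
spirit of HitCollector. DocFreqCollector and PushPostings are hypothetical
names assumed for illustration; nothing like them is claimed to exist in
Lucene. The reader pushes each (doc, freq) pair to a callback instead of
filling arrays, which is why the per-entry call would need to be inlined to
compete with a bulk read.

```java
/** Hypothetical callback, loosely modelled on the HitCollector idea. */
interface DocFreqCollector {
    void collect(int doc, int freq);
}

/** Hypothetical segment reader that pushes entries to a collector. */
class PushPostings {
    private final int[] docs;   // segment-local doc ids
    private final int[] freqs;  // term frequency for each doc
    private final int base;     // doc id base of this segment

    PushPostings(int[] docs, int[] freqs, int base) {
        this.docs = docs;
        this.freqs = freqs;
        this.base = base;
    }

    /** Calls the collector once per posting, with the global doc id. */
    void collectAll(DocFreqCollector collector) {
        for (int i = 0; i < docs.length; i++) {
            collector.collect(docs[i] + base, freqs[i]);
        }
    }
}
```

The caller decides where the ids go (an array at any offset, a bitset, a
running count), so no offset or base parameter is needed on the reader side.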

: My feeling is that this is probably too specialized to warrant adding
: an additional method on the interface, so I won't open a JIRA for it.
: I brought it up in case anyone cared
: to argue otherwise.

I say benchmark it ... if it has any serious benefits absolutely add it to
the interface.  If it means revving up to Lucene 3.0 so be it.

The catch is that normal lucene searching would probably not be
sped up at all... only custom code that needs to use TermDocs.read()
in a specific way.

Just a general interface reminder to everyone... passing offset + len
along with array parameters is generally a very good thing.

-Yonik
