Re: start + end for TermDocs.read()

Chris Hostetter Tue, 12 Jun 2007 23:49:12 -0700

: and grab all the ids via TermDocs.read().  But because there is no
: offset, MultiTermDocs returns a list from each sub-segment, forcing me
: to copy each partial int[] filled into my full int[].


It seems like, at the very least, MultiTermDocs.read(int[],int[]) could
allocate temporary arrays to pass to current.read() and do the array
copying for you.
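
A rough, untested sketch of what I mean, in plain Java. The class names
(SegmentDocsSketch, CopyingMultiDocsSketch) and the "bases" array are
made up for illustration and are not the actual Lucene classes; the
only thing borrowed from the real API is the read(int[],int[]) shape,
which fills from index 0 and returns a count:

```java
/** Hypothetical stand-in for one segment's TermDocs; not Lucene's actual class. */
final class SegmentDocsSketch {
    private final int[] docs;   // segment-local doc ids
    private final int[] freqs;  // matching term frequencies
    private int pos = 0;        // next unread posting

    SegmentDocsSketch(int[] docs, int[] freqs) {
        this.docs = docs;
        this.freqs = freqs;
    }

    /** Same shape as read(int[], int[]): fills from index 0, returns the count. */
    int read(int[] outDocs, int[] outFreqs) {
        int n = Math.min(outDocs.length, docs.length - pos);
        System.arraycopy(docs, pos, outDocs, 0, n);
        System.arraycopy(freqs, pos, outFreqs, 0, n);
        pos += n;
        return n;
    }
}

/** Hypothetical multi-segment reader that hides the copying from the caller. */
final class CopyingMultiDocsSketch {
    private final SegmentDocsSketch[] segments;
    private final int[] bases;  // assumed: doc id offset of each segment
    private int current = 0;

    CopyingMultiDocsSketch(SegmentDocsSketch[] segments, int[] bases) {
        this.segments = segments;
        this.bases = bases;
    }

    /** Keeps reading across segments until the caller's arrays are full. */
    int read(int[] docs, int[] freqs) {
        int filled = 0;
        while (filled < docs.length && current < segments.length) {
            int room = docs.length - filled;
            // Temporary arrays sized to the remaining room in the caller's arrays.
            int[] tmpDocs = new int[room];
            int[] tmpFreqs = new int[room];
            int n = segments[current].read(tmpDocs, tmpFreqs);
            if (n == 0) {       // this segment is exhausted, move to the next
                current++;
                continue;
            }
            for (int i = 0; i < n; i++) {
                docs[filled + i] = tmpDocs[i] + bases[current];
            }
            System.arraycopy(tmpFreqs, 0, freqs, filled, n);
            filled += n;
        }
        return filled;
    }
}
```

This still copies every posting once more than strictly necessary; it
just moves the copy out of the caller's code.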

: If one could specify a start/end into the array, this copying could be
: avoided (see partial and untested patch below).

What use case are you thinking of where specifying the end would be handy?
Wouldn't you always want it to read as much as possible as long as the
arrays aren't full?

Am I being naive, or would it also be useful to pass "base" to
current.read(int[],int[],...) so it's added to the docIds before they are
put in the array and we don't have to iterate over them again?
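
Roughly what I have in mind, again as an untested plain-Java sketch
rather than a patch. The four-argument read(docs, freqs, offset, base)
does not exist in Lucene today; it and the class names below are
hypothetical, and I've left out "end" on the assumption that the reader
should simply fill whatever room is left:

```java
/** Hypothetical per-segment reader with a proposed offset/base variant of read(). */
final class OffsetSegmentDocsSketch {
    private final int[] docs;   // segment-local doc ids
    private final int[] freqs;  // matching term frequencies
    private int pos = 0;        // next unread posting

    OffsetSegmentDocsSketch(int[] docs, int[] freqs) {
        this.docs = docs;
        this.freqs = freqs;
    }

    /**
     * Proposed variant (an assumption, not an existing method): writes
     * starting at outDocs[offset] and adds base to each doc id on the way in.
     */
    int read(int[] outDocs, int[] outFreqs, int offset, int base) {
        int n = Math.min(outDocs.length - offset, docs.length - pos);
        for (int i = 0; i < n; i++) {
            outDocs[offset + i] = docs[pos + i] + base;
            outFreqs[offset + i] = freqs[pos + i];
        }
        pos += n;
        return n;
    }
}

/** Hypothetical multi-segment reader that needs no temporary arrays. */
final class OffsetMultiDocsSketch {
    private final OffsetSegmentDocsSketch[] segments;
    private final int[] bases;  // assumed: doc id offset of each segment
    private int current = 0;

    OffsetMultiDocsSketch(OffsetSegmentDocsSketch[] segments, int[] bases) {
        this.segments = segments;
        this.bases = bases;
    }

    /** Fills the caller's arrays directly, one pass per posting. */
    int read(int[] docs, int[] freqs) {
        int filled = 0;
        while (filled < docs.length && current < segments.length) {
            int n = segments[current].read(docs, freqs, filled, bases[current]);
            if (n == 0) {       // this segment is exhausted, move to the next
                current++;
            } else {
                filled += n;
            }
        }
        return filled;
    }
}
```

Each posting is written exactly once, with the base already applied, so
there is no second loop over the doc ids and no intermediate int[].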

: My feeling is that this is probably too specialized to warrant adding
: an additional method on the interface, so I won't open a JIRA for it.
: I brought it up in case anyone cared
: to argue otherwise.

I say benchmark it ... if it has any serious benefits, absolutely add it to
the interface.  If it means revving up to Lucene 3.0, so be it.

(For something like SortComparatorSource or FieldSelector, where we expect
clients who use the interfaces to *implement* the interface, I might be
more hesitant, but I'm guessing the number of people who write their own
TermDocs impls is pretty small, and they would probably be understanding
of a new method in the API if it allowed for serious performance benefits.)




-Hoss

