Sorry, Gmail misbehaved and sent the message before I had finished.
Anyway,

I have a large index of about 30M docs.
Every doc has a date field filled in (at day resolution), and there are
about 1000 distinct dates.

So out of curiosity I indexed the date field two ways:
1) as a regular field named "date", with the date value set for each doc.
2) as a payload: a new term "_payload:_val", with the date (as a long, i.e.
an 8-byte array) stored in the payload for each doc.

I then loaded the dates into a long[] of length maxDoc, and the performance
was surprising:
loading from the payloads is 7 times slower than loading through FieldCache.

At first I thought it was because of the byte[8]-to-long conversion for
each doc, so I changed the loader to copy the raw bytes into a
byte[8*maxDoc] without converting. The result was the same.
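
A minimal sketch of the two variants described above (decode each 8-byte
payload to a long, or copy the raw bytes into one packed byte[8*maxDoc] and
decode later). It assumes big-endian byte order, which the message does not
specify. The class and method names (PayloadLongCodec, encode, decode,
store, valueAt) are hypothetical, and no Lucene API is involved:

```java
import java.nio.ByteBuffer;

// Hypothetical helper, for illustration only; not taken from the original code.
class PayloadLongCodec {

    // Encode a long as 8 bytes. ByteBuffer defaults to big-endian order,
    // which is an assumption here.
    static byte[] encode(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // Decode 8 big-endian bytes starting at 'offset' back into a long.
    static long decode(byte[] buf, int offset) {
        long v = 0L;
        for (int i = 0; i < Long.BYTES; i++) {
            v = (v << 8) | (buf[offset + i] & 0xFFL);
        }
        return v;
    }

    // Variant without per-doc conversion: copy the raw payload bytes into
    // the slot for docId inside one packed byte[8 * maxDoc] array.
    static void store(byte[] packed, int docId, byte[] payload) {
        System.arraycopy(payload, 0, packed, docId * Long.BYTES, Long.BYTES);
    }

    // Decode lazily, only when a doc's value is actually needed.
    static long valueAt(byte[] packed, int docId) {
        return decode(packed, docId * Long.BYTES);
    }

    public static void main(String[] args) {
        int maxDoc = 4;                          // tiny stand-in for the real index
        long[] days = {13972L, 13973L, 0L, -1L}; // made-up day values
        byte[] packed = new byte[Long.BYTES * maxDoc];
        for (int doc = 0; doc < maxDoc; doc++) {
            store(packed, doc, encode(days[doc]));
        }
        for (int doc = 0; doc < maxDoc; doc++) {
            System.out.println(doc + " -> " + valueAt(packed, doc));
        }
    }
}
```

Either way the per-doc work is a handful of shifts or an 8-byte copy, which
is consistent with the timing not changing when the conversion was removed.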

I then did another experiment:
I lowered the number of distinct dates to a small number, e.g. 50, and
timed the field cache load. It took much longer than it did with 1000 dates.

I did some profiling, and the profiler points to TermPositions.next,
TermPositions.nextPosition and TermPositions.getPayload as the culprits.

I would have expected the payload approach to always be faster. Any ideas?

Thanks
-John

On Thu, Apr 3, 2008 at 7:27 AM, John Wang <[EMAIL PROTECTED]> wrote:

> Hi:
>
>
