Re: Iterating TermsEnum for Long field produces zero values at the end

Michael McCandless Mon, 17 Nov 2014 10:31:35 -0800

This is expected: those are the "prefix" (lower-precision) terms, which
sort after all the full-precision numeric terms.

But I'm not sure why you see 0s ... the bytes should be unique for
every term you get back from the TermsEnum.
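
To make the prefix terms concrete, here is a minimal, Lucene-free sketch of
how a trie-encoded long field produces extra terms. It is a simplified model,
not Lucene's actual byte encoding: the names PrefixTermsDemo, Term and
indexTerms are hypothetical, and the precisionStep of 16 is an assumption
(I believe that is the LongField default from 4.9 on). Each value is indexed
once per shift, terms sort by shift first, and decoding a shifted term fills
the dropped low bits with zeros. For small ids this gives three shared prefix
terms that all decode to 0, which would explain the output in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class PrefixTermsDemo {

    /** One indexed term: how many low bits were dropped, and the remaining sortable bits. */
    public static final class Term implements Comparable<Term> {
        public final int shift;
        public final long bits;

        Term(int shift, long bits) {
            this.shift = shift;
            this.bits = bits;
        }

        /** Terms sort by shift first, so all full-precision (shift 0) terms come first. */
        @Override
        public int compareTo(Term other) {
            if (shift != other.shift) {
                return Integer.compare(shift, other.shift);
            }
            return Long.compareUnsigned(bits, other.bits);
        }

        /** Shift the bits back and undo the sign flip; dropped low bits come back as zeros. */
        public long decode() {
            return (bits << shift) ^ Long.MIN_VALUE;
        }
    }

    /** Returns the unique terms for the given values, in sorted (enumeration) order. */
    public static List<Term> indexTerms(long[] values, int precisionStep) {
        TreeSet<Term> terms = new TreeSet<>();
        for (long value : values) {
            long sortable = value ^ Long.MIN_VALUE; // flip sign bit so unsigned order matches numeric order
            for (int shift = 0; shift < 64; shift += precisionStep) {
                terms.add(new Term(shift, sortable >>> shift));
            }
        }
        return new ArrayList<>(terms);
    }

    public static void main(String[] args) {
        // Prints five shift=0 terms decoding to 1..5, then shift=16, 32 and 48, each decoding to 0.
        for (Term t : indexTerms(new long[] {1, 2, 3, 4, 5}, 16)) {
            System.out.println("shift=" + t.shift + " decoded=" + t.decode());
        }
    }
}
```

In real 4.x code the same filter should be available without re-implementing
anything: as far as I recall, NumericUtils.getPrefixCodedLongShift(bytesRef)
returns the shift of a term (stop at the first non-zero one), and
NumericUtils.filterPrefixCodedLongs(termsEnum) wraps a TermsEnum so it only
returns full-precision terms. Check the javadocs for your version before
relying on either.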

Mike McCandless

http://blog.mikemccandless.com


On Mon, Nov 17, 2014 at 10:39 AM, Barry Coughlan <b.coughl...@gmail.com> wrote:
> Hi all,
>
> I'm using 4.10.2. I have a Long "id" field. Each document has one "id"
> value. I am creating a look-up between Lucene's internal document id and my
> "id" values by enumerating the inverted index:
>
>     private long[] cacheDocIds() throws IOException {
>         long[] ourIds = new long[reader.maxDoc()];
>
>         Bits liveDocs = MultiFields.getLiveDocs(reader);
>         Fields fields = MultiFields.getFields(reader);
>         Terms terms = fields.terms("id");
>
>         TermsEnum iterator = terms.iterator(null);
>         BytesRef bytesRef = null;
>         while ((bytesRef = iterator.next()) != null) {
>             DocsEnum docsEnum = iterator.docs(liveDocs, null, DocsEnum.FLAG_NONE);
>
>             int luceneId = docsEnum.nextDoc();
>             long ourId = NumericUtils.prefixCodedToLong(bytesRef);
>             System.out.println(luceneId + " " + ourId);
>             ourIds[luceneId] = ourId;
>         }
>
>         return ourIds;
>     }
>
> With 5 documents (1, 2, 3, 4, 5) I get this output from the above code:
>
> 0 1
> 1 2
> 2 3
> 3 4
> 4 5
> 0 0
> 0 0
> 0 0
>
> I don't understand why there are three zeroes at the end.
>
> - reader.maxDoc is 5 and no documents have been deleted.
> - I have tried this with a varying number of documents and there are always
> three zeroes at the end.
> - I tried changing version to Lucene 4.10.0 and Lucene 4.9 and the same
> behavior occurs.
>
> I can work around this, but I'm just curious whether this behavior is
> expected.
>
> Regards,
> Barry

