It is expected: those are the "prefix" terms, which come after all the full-precision numeric terms.
But I'm not sure why you see 0s ... the bytes should be unique for
every term you get back from the TermsEnum.

Mike McCandless

http://blog.mikemccandless.com

On Mon, Nov 17, 2014 at 10:39 AM, Barry Coughlan <b.coughl...@gmail.com> wrote:
> Hi all,
>
> I'm using 4.10.2. I have a Long "id" field. Each document has one "id"
> value. I am creating a look-up between Lucene's internal document id and my
> "id" values by enumerating the inverted index:
>
> private long[] cacheDocIds() throws IOException {
>     long[] ourIds = new long[reader.maxDoc()];
>
>     Bits liveDocs = MultiFields.getLiveDocs(reader);
>     Fields fields = MultiFields.getFields(reader);
>     Terms terms = fields.terms("id");
>
>     TermsEnum iterator = terms.iterator(null);
>     BytesRef bytesRef = null;
>     while ((bytesRef = iterator.next()) != null) {
>         DocsEnum docsEnum = iterator.docs(liveDocs, null, DocsEnum.FLAG_NONE);
>
>         int luceneId = docsEnum.nextDoc();
>         long ourId = NumericUtils.prefixCodedToLong(bytesRef);
>         System.out.println(luceneId + " " + ourId);
>         ourIds[luceneId] = ourId;
>     }
>
>     return ourIds;
> }
>
> With 5 documents (1, 2, 3, 4, 5) I get this output from the above code:
>
> 0 1
> 1 2
> 2 3
> 3 4
> 4 5
> 0 0
> 0 0
> 0 0
>
> I don't understand why there are three zeroes at the end.
>
> - reader.maxDoc is 5 and no documents have been deleted.
> - I have tried this with a varying number of documents and there are always
>   three zeroes at the end.
> - I tried changing version to Lucene 4.10.0 and Lucene 4.9 and the same
>   behavior occurs.
>
> I can work around this with but I'm just curious if this behavior is
> expected?
>
> Regards,
> Barry
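
One possible explanation for the 0s, sketched with plain Java and no Lucene
dependency: each prefix term keeps only the high bits of the value, so
decoding it fills the dropped low bits with zeros. The class and method names
below are hypothetical, and the precision step of 16 is an assumption (I
believe it is the default for long fields in 4.9 and later, which would give
exactly three prefix terms per value). The sign-bit flip mirrors how I
understand the prefix coding to work; treat this as a simplified model, not
Lucene's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone model of prefix-coded long terms (not Lucene code).
class PrefixTermDemo {
    // Assumed: the sign bit is flipped so that encoded values sort as unsigned.
    private static final long SIGN_FLIP = 0x8000000000000000L;

    /** Value that a term stored at the given shift decodes back to. */
    public static long decodeAtShift(long value, int shift) {
        long storedBits = (value ^ SIGN_FLIP) >>> shift; // low `shift` bits are dropped
        return (storedBits << shift) ^ SIGN_FLIP;         // dropped bits come back as 0
    }

    /** Decoded values of all terms for one value, full precision (shift 0) first. */
    public static List<Long> decodedTerms(long value, int precisionStep) {
        List<Long> decoded = new ArrayList<>();
        for (int shift = 0; shift < 64; shift += precisionStep) {
            decoded.add(decodeAtShift(value, shift));
        }
        return decoded;
    }

    public static void main(String[] args) {
        // Assumed precision step of 16: shifts 0, 16, 32, 48.
        for (long id = 1; id <= 5; id++) {
            System.out.println(id + " -> " + decodedTerms(id, 16));
        }
    }
}
```

Under these assumptions every id from 1 to 5 decodes to 0 at shifts 16, 32
and 48, and since all five ids share the same prefix term at each shift, the
enumeration would return three extra terms in total. Their bytes would still
be distinct from each other because the shift is part of the encoded term; only
the decoded values coincide. In the real loop, checking
NumericUtils.getPrefixCodedLongShift(bytesRef) and stopping at the first
non-zero shift should skip the prefix terms, though I have not run that
against 4.10.2.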