Hi all,
I'm using Lucene 4.10.2. I have a Long id field. Each document has one id
value. I am creating a look-up between Lucene's internal document id and my
id values by enumerating the inverted index:
private long[] cacheDocIds() throws IOException {
long[] ourIds = new
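
For readers following along, the idea of building a docId-to-id look-up by walking an inverted index can be sketched without Lucene. This is only a toy model: the `TreeMap` standing in for the term dictionary, the `-1` sentinel, and the class and method signature are assumptions for illustration, not the poster's truncated code or Lucene's API.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

public class DocIdLookupSketch {
    /**
     * Builds ourIds[docId] = id value from a toy inverted index
     * (term value -> doc ids containing it). Hypothetical model only.
     */
    public static long[] cacheDocIds(TreeMap<Long, int[]> postings, int maxDoc) {
        long[] ourIds = new long[maxDoc];
        Arrays.fill(ourIds, -1L); // -1 marks "no id seen" in this sketch
        // Terms are visited in sorted order, as a terms enumeration would do.
        for (Map.Entry<Long, int[]> entry : postings.entrySet()) {
            for (int doc : entry.getValue()) {
                ourIds[doc] = entry.getKey();
            }
        }
        return ourIds;
    }

    public static void main(String[] args) {
        TreeMap<Long, int[]> postings = new TreeMap<>();
        postings.put(1001L, new int[] {2});
        postings.put(1002L, new int[] {0});
        postings.put(1003L, new int[] {1});
        System.out.println(Arrays.toString(cacheDocIds(postings, 3)));
    }
}
```

Since each document has exactly one id value, every slot of the array is written once in this model.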
Could someone point me to how to order docIds as per
http://wiki.apache.org/lucene-java/ImproveSearchingSpeed
Limit usage of stored fields and term vectors. Retrieving these from the
index is quite costly. Typically you should only
It is expected: those are the prefix terms, which come after all the
full-precision numeric terms.
But I'm not sure why you see 0s ... the bytes should be unique for
every term you get back from the TermsEnum.
Mike McCandless
http://blog.mikemccandless.com
On Mon, Nov 17, 2014 at 10:39 AM,
Hi,
It is expected: those are the prefix terms, which come after all the
full-precision numeric terms.
But I'm not sure why you see 0s ... the bytes should be unique for every term
you get back from the TermsEnum.
That's easy to explain:
The lower precision terms at the end have more
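
The explanation above is cut off, so the following is only a guess at what it illustrates: a simplified model of how lower-precision terms of a numeric field drop low-order bits. The precision step of 4, the sample ids, and the `truncate` helper are assumptions; the model also ignores the sign-bit handling and byte encoding that Lucene applies.

```java
public class PrefixTermSketch {
    /**
     * Value represented by a term at the given shift in this simplified model:
     * the low `shift` bits are dropped. Valid for 0 <= shift < 64.
     */
    public static long truncate(long value, int shift) {
        return (value >>> shift) << shift;
    }

    public static void main(String[] args) {
        int precisionStep = 4; // assumed for illustration only
        long[] ids = {1L, 7L, 18L, 300L};
        for (int shift = 0; shift <= 12; shift += precisionStep) {
            StringBuilder line = new StringBuilder("shift=" + shift + ":");
            for (long id : ids) {
                line.append(' ').append(truncate(id, shift));
            }
            System.out.println(line);
        }
    }
}
```

In this model, small ids collapse to 0 once the shift exceeds their bit length, which may be why decoding the trailing prefix terms as full values yields 0s. A look-up built from the terms would want to stop at the first term whose shift is greater than 0.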
Makes sense, thanks. I switched the implementation to a FieldCache with no
noticeable performance difference:
private Longs cacheDocIds() throws IOException {
AtomicReader wrapped = SlowCompositeReaderWrapper.wrap(reader);
Longs vals = FieldCache.DEFAULT.getLongs(wrapped, id, false);
It's better to use doc values than field cache, if you can.
Mike McCandless
http://blog.mikemccandless.com
On Mon, Nov 17, 2014 at 2:55 PM, Barry Coughlan b.coughl...@gmail.com wrote:
Makes sense, thanks. I switched the implementation to a FieldCache with no
noticeable performance
Hi Vijay,
...sorting the documents you need to retrieve by docID order first...
means sorting them by their 'document number' which is the value in the
'scoreDoc.doc' field and is the value that the reader takes to 'retrieve' the
document from the index. If you write a comparator to sort the
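
A minimal sketch of the sorting step described above, assuming Java 8 syntax and a hypothetical `Hit` class that stands in for Lucene's hit object (only the `doc` field comes from the message; everything else is illustrative):

```java
import java.util.Arrays;
import java.util.Comparator;

public class SortByDocIdSketch {
    /** Hypothetical stand-in for a search hit; `doc` is the document number. */
    public static final class Hit {
        public final int doc;
        public final float score;

        public Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    /** Returns a copy of the hits ordered by ascending document number. */
    public static Hit[] sortByDoc(Hit[] hits) {
        Hit[] copy = Arrays.copyOf(hits, hits.length);
        Arrays.sort(copy, Comparator.comparingInt((Hit h) -> h.doc));
        return copy;
    }

    public static void main(String[] args) {
        Hit[] byScore = {new Hit(42, 1.0f), new Hit(3, 0.9f), new Hit(17, 0.5f)};
        for (Hit h : sortByDoc(byScore)) {
            System.out.println(h.doc); // retrieve stored fields in this order
        }
    }
}
```

Sorting a copy leaves the original relevance order intact, so results can still be presented by score after the documents have been loaded.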
Hi,
I am finding that Lucene slows down a lot when bigger and bigger
doc/pos files are merged... While that is normally the case, the worrying part
is that all my data is in RAM. The version is 4.6.1.
Some sample statistics taken after instrumenting the SortingAtomicReader
code, as we use a