Robert Muir created LUCENE-5729:
-----------------------------------

             Summary: explore random-access methods to IndexInput
                 Key: LUCENE-5729
                 URL: https://issues.apache.org/jira/browse/LUCENE-5729
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Robert Muir


Traditionally, Lucene's I/O is geared toward sequentially reading lists of 
postings, but for random-access data such as docvalues, that model just adds 
overhead.

So today we hack around it by doing random access with seek+readXXX, but this 
is inefficient (additional checks by the JDK that we don't need).

As a hack, I added the following to IndexInput, changed the direct packed ints 
decode to use them, and implemented them in MMapDirectory:
{code}
byte readByte(long pos) --> ByteBuffer.get(pos)
short readShort(long pos) --> ByteBuffer.getShort(pos)
int readInt(long pos) --> ByteBuffer.getInt(pos)
long readLong(long pos) --> ByteBuffer.getLong(pos)
{code}
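
To make the difference concrete, here is a minimal sketch using only java.nio.ByteBuffer, not Lucene's actual classes. The class and method names (AbsoluteReadSketch, seekThenReadLong, readLongAt) are made up for illustration. It contrasts the seek+read pattern, which updates the buffer's position, with an absolute read, which leaves the position untouched.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration class; not part of Lucene.
class AbsoluteReadSketch {

    // seek+readXXX style: set the position, then do a relative read.
    // The relative read advances the buffer's position as a side effect.
    static long seekThenReadLong(ByteBuffer buf, int pos) {
        buf.position(pos);
        return buf.getLong();
    }

    // Absolute style: read at the given index without touching the position.
    static long readLongAt(ByteBuffer buf, int pos) {
        return buf.getLong(pos);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(32);
        buf.putLong(8, 0x1122334455667788L); // absolute put, position stays 0

        long viaAbsolute = readLongAt(buf, 8);
        System.out.println(buf.position());  // still 0

        long viaSeek = seekThenReadLong(buf, 8);
        System.out.println(buf.position());  // now 16

        System.out.println(viaAbsolute == viaSeek); // same value either way
    }
}
```

Both paths return the same value; the absolute read simply skips the position bookkeeping. Note that ByteBuffer's absolute getters take an int index, so a real implementation over files larger than 2 GB would have to map a long position onto the right buffer first.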

This gives a ~30% performance improvement for docvalues (numerics, sorting 
strings, etc.).

We should do a few things first before working on this (LUCENE-5728: use slice 
api in decode, pad packed ints so we only have one I/O call ever, etc etc) but 
I think we need to figure out such an API.

It could either be on IndexInput like my hack (this is similar to the 
ByteBuffer API, with both relative and absolute methods), or we could have a 
separate API. But I guess arguably IOContext exists to supply hints too, so I 
don't know which is the way to go.
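
For the "separate API" option, a rough sketch of one possible shape follows. The names RandomAccessReader and ByteBufferRandomAccessReader are hypothetical, chosen only for illustration, and the implementation assumes a single ByteBuffer (so positions must fit in an int). It is not a proposal for the final design.

```java
import java.nio.ByteBuffer;

// Hypothetical separate interface exposing only absolute reads.
interface RandomAccessReader {
    byte readByte(long pos);
    short readShort(long pos);
    int readInt(long pos);
    long readLong(long pos);
}

// Hypothetical implementation over a single ByteBuffer.
// Math.toIntExact throws ArithmeticException if pos does not fit in an int.
final class ByteBufferRandomAccessReader implements RandomAccessReader {
    private final ByteBuffer buf;

    ByteBufferRandomAccessReader(ByteBuffer buf) {
        this.buf = buf;
    }

    @Override public byte readByte(long pos)   { return buf.get(Math.toIntExact(pos)); }
    @Override public short readShort(long pos) { return buf.getShort(Math.toIntExact(pos)); }
    @Override public int readInt(long pos)     { return buf.getInt(Math.toIntExact(pos)); }
    @Override public long readLong(long pos)   { return buf.getLong(Math.toIntExact(pos)); }
}

class SeparateApiSketch {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putInt(4, 123);    // absolute writes, position stays 0
        buf.putLong(8, 456L);

        RandomAccessReader reader = new ByteBufferRandomAccessReader(buf);
        System.out.println(reader.readInt(4L));
        System.out.println(reader.readLong(8L));
    }
}
```

The tradeoff is the one described above: putting the methods on IndexInput keeps everything in one place, as ByteBuffer does, while a separate interface lets consumers such as docvalues ask only for the random-access view.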


