[jira] Commented: (LUCENE-2793) Directory createOutput and openInput should take an IOContext

Robert Muir (JIRA) Sun, 05 Dec 2010 07:06:39 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966963#action_12966963
 ]


Robert Muir commented on LUCENE-2793:
-------------------------------------

There is another problem we should solve here: the buffer size problem.

This is completely broken at the moment for custom directories. Here's an example:
I wanted to set the default buffer size to 4096 (since I measured roughly a 20%
improvement with it for my directory impl).

Looking at the APIs, you would think that you simply override the openInput that
takes no buffer size, like this:
{noformat}
  @Override
  public IndexInput openInput(String name) throws IOException {
    return openInput(name, 4096);
  }
{noformat}

Unfortunately, this doesn't work at all! Instead, you have to do something like 
this for it to actually "work":
{noformat}
  @Override
  public IndexInput openInput(String name, int bufferSize) throws IOException {
    ensureOpen();
    return new IndexInput(name, Math.max(bufferSize, 4096));
  }
{noformat}

The problem is that, throughout Lucene's APIs, the directory's "default" is never 
used; instead, the static BufferedIndexInput.BUFFER_SIZE is passed everywhere, 
e.g. in SegmentReader.get:

{noformat}
  public static SegmentReader get(boolean readOnly, SegmentInfo si,
      int termInfosIndexDivisor) throws CorruptIndexException, IOException {
    return get(readOnly, si.dir, si, BufferedIndexInput.BUFFER_SIZE, true,
        termInfosIndexDivisor);
  }
{noformat}
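
The effect can be reproduced outside Lucene with a minimal sketch. Every class name here (SketchDirectory, NaiveDirectory, ClampingDirectory, BufferSizeDemo) is a hypothetical stand-in rather than a Lucene class, and the static default of 1024 is an assumed value for illustration. Only the shape of the two openInput overloads follows the description above; the methods return the buffer size that would be used instead of a real input.

```java
// Hypothetical stand-in for a directory with two openInput overloads.
class SketchDirectory {
    // Stand-in for the static default that callers pass explicitly (assumed value).
    static final int STATIC_BUFFER_SIZE = 1024;

    // Convenience overload: the only place a per-directory default could live.
    int openInput(String name) {
        return openInput(name, STATIC_BUFFER_SIZE);
    }

    // Returns the buffer size that would be used, instead of a real input.
    int openInput(String name, int bufferSize) {
        return bufferSize;
    }
}

// Naive customization: override only the overload without a buffer size.
class NaiveDirectory extends SketchDirectory {
    @Override
    int openInput(String name) {
        return openInput(name, 4096);
    }
}

// Workaround: enforce the minimum in the overload that callers really use.
class ClampingDirectory extends SketchDirectory {
    @Override
    int openInput(String name, int bufferSize) {
        return super.openInput(name, Math.max(bufferSize, 4096));
    }
}

class BufferSizeDemo {
    // Mimics library code that always passes the static default explicitly,
    // so the one-argument overload is never reached.
    static int openLikeLibrary(SketchDirectory dir, String name) {
        return dir.openInput(name, SketchDirectory.STATIC_BUFFER_SIZE);
    }

    public static void main(String[] args) {
        // The naive override is bypassed; the clamping override takes effect.
        System.out.println(openLikeLibrary(new NaiveDirectory(), "example"));
        System.out.println(openLikeLibrary(new ClampingDirectory(), "example"));
    }
}
```

Because the caller names the two-argument overload directly, the override in NaiveDirectory is never consulted, while ClampingDirectory sees every call.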

So I think Lucene's APIs should never specify a buffer size: we should remove it 
completely from the codecs API, and it should be *replaced* with IOContext.
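
As a rough sketch of what that replacement could look like: callers say only why they are opening a file, and the directory alone maps that to a buffer size. All names here (SketchIOContext, ContextDirectory, TunedDirectory, bufferSizeFor) and the sizes 1024 and 4096 are assumptions for illustration, not an actual or proposed Lucene API.

```java
// Hypothetical context: describes why a file is opened, not how to buffer it.
enum SketchIOContext { SEARCH, MERGE }

// Hypothetical directory whose callers never mention a buffer size.
class ContextDirectory {
    // The directory owns the mapping from context to buffer size (assumed values).
    int bufferSizeFor(SketchIOContext context) {
        return context == SketchIOContext.MERGE ? 4096 : 1024;
    }

    // Returns the buffer size that would be used, instead of a real input.
    int openInput(String name, SketchIOContext context) {
        return bufferSizeFor(context);
    }
}

// A custom directory changes its default in exactly one place.
class TunedDirectory extends ContextDirectory {
    @Override
    int bufferSizeFor(SketchIOContext context) {
        return Math.max(super.bufferSizeFor(context), 4096);
    }
}

class ContextDemo {
    public static void main(String[] args) {
        ContextDirectory dir = new TunedDirectory();
        // The caller passes intent only; the directory's own default applies.
        System.out.println(dir.openInput("example", SketchIOContext.SEARCH));
    }
}
```

With this shape there is no static default for library code to pass, so a directory's own choice cannot be bypassed.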


> Directory createOutput and openInput should take an IOContext
> -------------------------------------------------------------
>
>                 Key: LUCENE-2793
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2793
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Store
>            Reporter: Michael McCandless
>
> Today for merging we pass down a larger readBufferSize than for searching 
> because we get better performance.
> I think we should generalize this to a class (IOContext), which would hold 
> the buffer size, but then could hold other flags like DIRECT (bypass OS's 
> buffer cache), SEQUENTIAL, etc.
> Then, we can make the DirectIOLinuxDirectory fully usable because we would 
> only use DIRECT/SEQUENTIAL during merging.
> This will require fixing how IW pools readers, so that a reader opened for 
> merging is not then used for searching, and vice versa.  Really, it's only 
> all the open file handles that need to be different -- we could in theory 
> share del docs, norms, etc, if that were somehow possible.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

