[ https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966963#action_12966963 ]
Robert Muir commented on LUCENE-2793:
-------------------------------------

There is another problem we should solve here, and that is the buffer size problem. This is totally broken at the moment for custom directories. Here's an example.

I wanted to set the default buffer size to 4096 (I measured about a 20% improvement for my directory impl). Looking at the APIs, you would think that you simply override the openInput that takes no buffer size, like this:

{noformat}
@Override
public IndexInput openInput(String name) throws IOException {
  return openInput(name, 4096);
}
{noformat}

Unfortunately this doesn't work at all! Instead you have to do something like this for it to actually "work":

{noformat}
@Override
public IndexInput openInput(String name, int bufferSize) throws IOException {
  ensureOpen();
  return new IndexInput(name, Math.max(bufferSize, 4096));
}
{noformat}

The problem is that, throughout Lucene's APIs, the directory's "default" is never used. Instead the static BufferedIndexInput.BUFFER_SIZE is used everywhere, e.g. SegmentReader.get:

{noformat}
public static SegmentReader get(boolean readOnly, SegmentInfo si, int termInfosIndexDivisor) throws CorruptIndexException, IOException {
  return get(readOnly, si.dir, si, BufferedIndexInput.BUFFER_SIZE, true, termInfosIndexDivisor);
}
{noformat}

So I think Lucene's APIs should never specify a buffer size. We should remove it completely from the codecs API, and it should be *replaced* with IOContext.

> Directory createOutput and openInput should take an IOContext
> -------------------------------------------------------------
>
> Key: LUCENE-2793
> URL: https://issues.apache.org/jira/browse/LUCENE-2793
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Store
> Reporter: Michael McCandless
>
> Today for merging we pass down a larger readBufferSize than for searching
> because we get better performance.
> I think we should generalize this to a class (IOContext), which would hold
> the buffer size, but then could hold other flags like DIRECT (bypass OS's
> buffer cache), SEQUENTIAL, etc.
> Then, we can make the DirectIOLinuxDirectory fully usable because we would
> only use DIRECT/SEQUENTIAL during merging.
> This will require fixing how IW pools readers, so that a reader opened for
> merging is not then used for searching, and vice versa. Really, it's only
> all the open file handles that need to be different -- we could in theory
> share del docs, norms, etc, if that were somehow possible.
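
To make the IOContext idea concrete, here is a rough, self-contained sketch of how it might look. Nothing here is Lucene's actual API: IOContextSketch, Flag, IOContext, SketchDirectory, SketchInput and TunedDirectory are all hypothetical names, and the 1024/4096 sizes are placeholder values. The point is only that callers pass a context describing why they open a file, and the directory alone decides the buffer size, so a custom directory's default cannot be bypassed by a static constant.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch only: none of these types are Lucene's actual API.
final class IOContextSketch {

    // Hints mentioned in the issue description.
    enum Flag { DIRECT, SEQUENTIAL }

    // Describes why a file is opened, instead of passing a raw bufferSize.
    static final class IOContext {
        static final IOContext SEARCH = new IOContext(EnumSet.noneOf(Flag.class));
        static final IOContext MERGE = new IOContext(EnumSet.of(Flag.DIRECT, Flag.SEQUENTIAL));

        final Set<Flag> flags;

        IOContext(EnumSet<Flag> flags) {
            this.flags = Collections.unmodifiableSet(EnumSet.copyOf(flags));
        }
    }

    // Stand-in for an IndexInput; it only records what it was opened with.
    static final class SketchInput {
        final String name;
        final int bufferSize;

        SketchInput(String name, int bufferSize) {
            this.name = name;
            this.bufferSize = bufferSize;
        }
    }

    // Stand-in for a Directory: the directory, not the caller, picks the size.
    static class SketchDirectory {
        // Placeholder defaults: larger buffer for sequential (merge) reads.
        protected int bufferSizeFor(IOContext context) {
            return context.flags.contains(Flag.SEQUENTIAL) ? 4096 : 1024;
        }

        final SketchInput openInput(String name, IOContext context) {
            return new SketchInput(name, bufferSizeFor(context));
        }
    }

    // A custom directory raises its minimum buffer size in exactly one place.
    static final class TunedDirectory extends SketchDirectory {
        @Override
        protected int bufferSizeFor(IOContext context) {
            return Math.max(super.bufferSizeFor(context), 4096);
        }
    }

    public static void main(String[] args) {
        SketchDirectory base = new SketchDirectory();
        SketchDirectory tuned = new TunedDirectory();
        System.out.println(base.openInput("some_file", IOContext.SEARCH).bufferSize);
        System.out.println(tuned.openInput("some_file", IOContext.SEARCH).bufferSize);
        System.out.println(tuned.openInput("some_file", IOContext.MERGE).bufferSize);
    }
}
```

With this shape, code like SegmentReader.get would pass a context through rather than BufferedIndexInput.BUFFER_SIZE, and the single override in TunedDirectory would apply to every open call.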