[ https://issues.apache.org/jira/browse/LUCENE-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032936#comment-13032936 ]
Earwin Burrfoot commented on LUCENE-3092: ----------------------------------------- Chris, I don't like the idea of expanding IOContext again and again, but this case seems in line with intended purporse - give Directory implementation hints as to what we're going to do with it. I don't like events either. They look fragile and binding them to threads is a WTF. With all our pausing/unpausing magic there's no guarantee merge will end on the same thread it started on. bq. Stuff like FlushPolicy could take information about concurrent merges and hold of flushes for a little while if memory allows it etc. Coordinating access to shared resource (IO subsystem) with events is very awkward. Ok, your FlushPolicy receives events from MergePolicy and holds flushes during merge. _Now, when a flush is in progress, should FlushPolicy notify MergePolicy so it can hold its merges?_ It goes downhill from there. What if FP and MP fire events simultaneously? :) What should other listeners do? Try looking at a bigger picture. Merges are not your problem. Neither are flushes. Your problem is that several threads try to take their dump on disk simultaneously (for whatever reason, you don't really care). So what we need is an arbitration mechanism for Directory writes. A mechanism located presumably @ Directory level (eg, we don't need to throttle anything when writing to RAMDir). One possible implementation is that we add a constructor parameter to FSDirectory specifying desired level of IO parallelism, and then it keeps track of its IndexOutputs and stalls writes selectively. We can also add 'expectedWriteSize' to IOContext, so the Directory may favor shorter writes over bigger ones. Instead of 'expectedWriteSize' we can use 'priority'. > NRTCachingDirectory, to buffer small segments in a RAMDir > --------------------------------------------------------- > > Key: LUCENE-3092 > URL: https://issues.apache.org/jira/browse/LUCENE-3092 > Project: Lucene - Java > Issue Type: Improvement > Components: Store > Reporter: Michael McCandless > Priority: Minor > Fix For: 3.2, 4.0 > > Attachments: LUCENE-3092-listener.patch, LUCENE-3092.patch > > > I created this simply Directory impl, whose goal is reduce IO > contention in a frequent reopen NRT use case. > The idea is, when reopening quickly, but not indexing that much > content, you wind up with many small files created with time, that can > possibly stress the IO system eg if merges, searching are also > fighting for IO. > So, NRTCachingDirectory puts these newly created files into a RAMDir, > and only when they are merged into a too-large segment, does it then > write-through to the real (delegate) directory. > This lets you spend some RAM to reduce I0. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org