[jira] [Updated] (LUCENE-3092) NRTCachingDirectory, to buffer small segments in a RAMDir
[ https://issues.apache.org/jira/browse/LUCENE-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-3092:
---
    Attachment: LUCENE-3092.patch

New patch; it fixes the issue Simon hit (it was just a bug in the test, which was using a silly MergePolicy that ignored partial optimize). The test now passes with the patch from LUCENE-3100 applied.

I think this is ready to commit, after LUCENE-3100 is in.

> NRTCachingDirectory, to buffer small segments in a RAMDir
> ---
>
>                 Key: LUCENE-3092
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3092
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Priority: Minor
>             Fix For: 3.2, 4.0
>
>         Attachments: LUCENE-3092-listener.patch, LUCENE-3092.patch, LUCENE-3092.patch, LUCENE-3092.patch, LUCENE-3092.patch
>
>
> I created this simple Directory impl, whose goal is to reduce IO
> contention in a frequent-reopen NRT use case.
> The idea is, when reopening quickly, but not indexing that much
> content, you wind up with many small files created over time, which can
> stress the IO system, e.g. if merges and searching are also
> fighting for IO.
> So, NRTCachingDirectory puts these newly created files into a RAMDir,
> and only when they are merged into a too-large segment does it
> write through to the real (delegate) directory.
> This lets you spend some RAM to reduce IO.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
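The caching idea in the issue description can be illustrated with a small toy model. This is only a sketch, not the code in the attached patch and not the Lucene Directory API: `ToyCachingDirectory`, its methods, and the per-file byte threshold are all hypothetical names chosen for illustration, and files are written in one call so their size is known up front, which a real Directory implementation cannot assume.

```java
import java.io.ByteArrayOutputStream;
import java.util.HashMap;
import java.util.Map;

/** Toy model only: two in-memory maps stand in for the RAMDir and the delegate directory. */
class ToyCachingDirectory {
    private final Map<String, byte[]> ramCache = new HashMap<>();
    private final Map<String, byte[]> delegate = new HashMap<>();
    private final int maxCachedBytesPerFile; // hypothetical threshold for "too large"

    ToyCachingDirectory(int maxCachedBytesPerFile) {
        this.maxCachedBytesPerFile = maxCachedBytesPerFile;
    }

    /** Small files stay in RAM; anything over the threshold goes straight to the delegate. */
    void write(String name, byte[] contents) {
        delete(name); // make sure a name never lives in both stores
        if (contents.length <= maxCachedBytesPerFile) {
            ramCache.put(name, contents);
        } else {
            delegate.put(name, contents);
        }
    }

    /** Reads check the cache first, then fall back to the delegate; null if absent. */
    byte[] read(String name) {
        byte[] cached = ramCache.get(name);
        return cached != null ? cached : delegate.get(name);
    }

    void delete(String name) {
        ramCache.remove(name);
        delegate.remove(name);
    }

    /** Stand-in for a segment merge: concatenates the sources into target, then drops them. */
    void merge(String target, String... sources) {
        ByteArrayOutputStream merged = new ByteArrayOutputStream();
        for (String source : sources) {
            byte[] part = read(source);
            if (part == null) {
                throw new IllegalArgumentException("no such file: " + source);
            }
            merged.write(part, 0, part.length);
        }
        write(target, merged.toByteArray());
        for (String source : sources) {
            delete(source);
        }
    }

    boolean isCached(String name) {
        return ramCache.containsKey(name);
    }

    boolean isOnDelegate(String name) {
        return delegate.containsKey(name);
    }
}
```

With a threshold of 8 bytes, two 5-byte files written through `write` stay in the cache; merging them produces a 10-byte file that lands on the delegate, and the two small files never touch it.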
[jira] [Updated] (LUCENE-3092) NRTCachingDirectory, to buffer small segments in a RAMDir
[ https://issues.apache.org/jira/browse/LUCENE-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-3092:
---
    Attachment: LUCENE-3092.patch

Sorry, the last patch was wrong; this one should be right.
[jira] [Updated] (LUCENE-3092) NRTCachingDirectory, to buffer small segments in a RAMDir
[ https://issues.apache.org/jira/browse/LUCENE-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-3092:
---
    Attachment: LUCENE-3092.patch

New patch. I reduced the over-synchronized methods (hopefully not too much!), improved the javadocs (added an example usage), added a CHANGES entry, and added a test case.

But: the test case currently fails, due to LUCENE-3100.
[jira] [Updated] (LUCENE-3092) NRTCachingDirectory, to buffer small segments in a RAMDir
[ https://issues.apache.org/jira/browse/LUCENE-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Male updated LUCENE-3092:
---
    Attachment: LUCENE-3092-listener.patch

Patch which roughly does what I suggested. It is just a proof of concept, since everything is currently tied to CMS (when it should work with any MS).

It introduces a MergeEvent and a MergeListener. MergeEvents are fired by CMS before and after a merge is done. NRTCachingDirectory implements MergeListener and does its stuff when the events fire.

There are dangers with calling listeners in a finally block but, as I say, this is just a POC.
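The before/after event pattern described in this comment, and the concern about listeners running in a finally block, can be sketched roughly as follows. This is not the code in LUCENE-3092-listener.patch: `ToyMergeEvent`, `ToyMergeListener`, and `ToyMergeScheduler` are hypothetical names, and catching and recording listener exceptions is one assumed way to keep a failing listener from masking the merge's own exception.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical event type; the names are illustrative, not taken from the patch. */
final class ToyMergeEvent {
    enum Phase { BEFORE, AFTER }

    final Phase phase;
    final String segmentName;

    ToyMergeEvent(Phase phase, String segmentName) {
        this.phase = phase;
        this.segmentName = segmentName;
    }
}

/** Hypothetical listener interface that a caching directory could implement. */
interface ToyMergeListener {
    void onMergeEvent(ToyMergeEvent event);
}

/** Toy scheduler that fires an event before and after each merge. */
class ToyMergeScheduler {
    private final List<ToyMergeListener> listeners = new ArrayList<>();

    /** Listener exceptions are recorded here instead of being rethrown. */
    final List<RuntimeException> listenerFailures = new ArrayList<>();

    void addListener(ToyMergeListener listener) {
        listeners.add(listener);
    }

    void merge(String segmentName, Runnable mergeWork) {
        fire(new ToyMergeEvent(ToyMergeEvent.Phase.BEFORE, segmentName));
        try {
            mergeWork.run();
        } finally {
            // AFTER fires even if the merge throws. Because fire() never throws,
            // an exception from the merge itself still propagates to the caller.
            fire(new ToyMergeEvent(ToyMergeEvent.Phase.AFTER, segmentName));
        }
    }

    private void fire(ToyMergeEvent event) {
        for (ToyMergeListener listener : listeners) {
            try {
                listener.onMergeEvent(event);
            } catch (RuntimeException e) {
                // Record the failure and keep notifying the remaining listeners.
                listenerFailures.add(e);
            }
        }
    }
}
```

If `fire` let listener exceptions escape from the finally block, a listener bug would replace the original merge failure as the exception the caller sees, which is one of the dangers the comment alludes to.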
[jira] [Updated] (LUCENE-3092) NRTCachingDirectory, to buffer small segments in a RAMDir
[ https://issues.apache.org/jira/browse/LUCENE-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-3092:
---
         Priority: Minor  (was: Major)
    Fix Version/s: 4.0
                   3.2
[jira] [Updated] (LUCENE-3092) NRTCachingDirectory, to buffer small segments in a RAMDir
[ https://issues.apache.org/jira/browse/LUCENE-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-3092:
---
    Attachment: LUCENE-3092.patch

Patch.

The patch adds NRTCachingDir in core, but if/when we commit this I think we should put it in contrib/module instead.

I think it's working correctly: commit works fine (it flushes all cached files to the real dir on sync). But it needs a test case, and I'm not going to have time in the near future to write one, so I wanted to open this issue to get it out there.
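The commit behavior mentioned here (cached files are flushed to the real directory on sync) could look roughly like the toy model below. It is a sketch under stated assumptions, not the patch: `ToySyncingCache` and its methods are hypothetical, and a plain map stands in for the real directory, so nothing is actually made durable.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: a map-backed cache in front of a map-backed "real" directory. */
class ToySyncingCache {
    private final Map<String, byte[]> ramCache = new HashMap<>();
    private final Map<String, byte[]> delegate = new HashMap<>();

    /** New files land in RAM first. */
    void writeCached(String name, byte[] contents) {
        ramCache.put(name, contents);
    }

    /** Moves every cached file to the delegate, so a commit only references files stored there. */
    void sync() {
        // Copy the names first so the cache can be modified while iterating.
        List<String> names = new ArrayList<>(ramCache.keySet());
        for (String name : names) {
            delegate.put(name, ramCache.remove(name));
        }
    }

    /** Reads check the cache first, then fall back to the delegate; null if absent. */
    byte[] read(String name) {
        byte[] cached = ramCache.get(name);
        return cached != null ? cached : delegate.get(name);
    }

    int cachedCount() {
        return ramCache.size();
    }

    int delegateCount() {
        return delegate.size();
    }
}
```

After `sync()` returns, the cache is empty and every file is readable from the delegate, which is the property a commit relies on.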