[jira] [Updated] (LUCENE-3092) NRTCachingDirectory, to buffer small segments in a RAMDir

2011-05-17 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3092:
---

Attachment: LUCENE-3092.patch

New patch, fixes the issue Simon hit (was just a bug in the test -- it was 
using a silly MergePolicy that ignored partial optimize).

Test now passes w/ the patch from LUCENE-3100.

I think this is ready to commit, after LUCENE-3100 is in.

> NRTCachingDirectory, to buffer small segments in a RAMDir
> -
>
> Key: LUCENE-3092
> URL: https://issues.apache.org/jira/browse/LUCENE-3092
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: core/store
>Reporter: Michael McCandless
>Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3092-listener.patch, LUCENE-3092.patch, 
> LUCENE-3092.patch, LUCENE-3092.patch, LUCENE-3092.patch
>
>
> I created this simple Directory impl, whose goal is to reduce IO
> contention in a frequent-reopen NRT use case.
> The idea is, when reopening quickly, but not indexing that much
> content, you wind up with many small files created over time, that can
> possibly stress the IO system, e.g. if merges and searching are also
> fighting for IO.
> So, NRTCachingDirectory puts these newly created files into a RAMDir,
> and only when they are merged into a too-large segment does it then
> write-through to the real (delegate) directory.
> This lets you spend some RAM to reduce IO.
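
The routing idea in the description can be sketched as a toy model. This is
not the code from the patch and not Lucene's API: the class ToyCachingStore,
its two byte thresholds, and the plain maps standing in for the RAMDir and the
delegate directory are all assumptions made for illustration. In particular, a
per-file size limit stands in here for "merged into a too-large segment".

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: two maps stand in for the RAM directory and the delegate.
class ToyCachingStore {
    private final Map<String, byte[]> ram = new HashMap<>();
    private final Map<String, byte[]> delegate = new HashMap<>();
    private final long maxFileBytes;   // hypothetical: largest file worth caching
    private final long maxCachedBytes; // hypothetical: total RAM budget
    private long cachedBytes = 0;

    ToyCachingStore(long maxFileBytes, long maxCachedBytes) {
        this.maxFileBytes = maxFileBytes;
        this.maxCachedBytes = maxCachedBytes;
    }

    // Small files stay in RAM while the budget allows; anything else
    // is written through to the delegate.
    void write(String name, byte[] data) {
        remove(name); // drop any older copy so the byte count stays accurate
        if (data.length <= maxFileBytes
                && cachedBytes + data.length <= maxCachedBytes) {
            ram.put(name, data);
            cachedBytes += data.length;
        } else {
            delegate.put(name, data);
        }
    }

    void remove(String name) {
        byte[] old = ram.remove(name);
        if (old != null) {
            cachedBytes -= old.length;
        }
        delegate.remove(name);
    }

    boolean isCached(String name) { return ram.containsKey(name); }

    boolean isOnDelegate(String name) { return delegate.containsKey(name); }
}
```

With a 10-byte file limit and a 15-byte budget, an 8-byte file is cached, a
100-byte file goes straight to the delegate, and a second 8-byte file also
goes to the delegate because the budget would be exceeded.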

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-3092) NRTCachingDirectory, to buffer small segments in a RAMDir

2011-05-14 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3092:
---

Attachment: LUCENE-3092.patch

Sorry, the last patch was wrong -- this one should be right.

> NRTCachingDirectory, to buffer small segments in a RAMDir
> -
>
> Key: LUCENE-3092
> URL: https://issues.apache.org/jira/browse/LUCENE-3092
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Store
>Reporter: Michael McCandless
>Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3092-listener.patch, LUCENE-3092.patch, 
> LUCENE-3092.patch, LUCENE-3092.patch



[jira] [Updated] (LUCENE-3092) NRTCachingDirectory, to buffer small segments in a RAMDir

2011-05-14 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3092:
---

Attachment: LUCENE-3092.patch

New patch.

I reduced the over-synchronized methods (hopefully not too much!), improved 
jdocs (added an example usage), added CHANGES entry, and added a test case.

But: the test case currently fails, due to LUCENE-3100.

> NRTCachingDirectory, to buffer small segments in a RAMDir
> -
>
> Key: LUCENE-3092
> URL: https://issues.apache.org/jira/browse/LUCENE-3092
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Store
>Reporter: Michael McCandless
>Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3092-listener.patch, LUCENE-3092.patch, 
> LUCENE-3092.patch



[jira] [Updated] (LUCENE-3092) NRTCachingDirectory, to buffer small segments in a RAMDir

2011-05-12 Thread Chris Male (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Male updated LUCENE-3092:
---

Attachment: LUCENE-3092-listener.patch

Patch which roughly does what I suggested.  Just a proof-of-concept since 
everything is currently tied to CMS (ConcurrentMergeScheduler), when it should 
work with any MergeScheduler.

Introduces a MergeEvent and MergeListener.  MergeEvents are fired by CMS before 
and after a merge is done.  NRTCachingDirectory implements MergeListener and 
does its work when the events fire.

There are dangers with calling listeners in a finally block but, as I say, this 
is just a POC.
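
A rough sketch of the listener shape described above, for readers who do not
have the patch to hand. The names ToyMergePhase, ToyMergeEvent,
ToyMergeListener and ToyMergeScheduler are hypothetical; the patch's actual
MergeEvent and MergeListener signatures are not shown in this thread, so
treat this as one possible reading rather than the patch itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical event shape: which phase fired, and for which segment.
enum ToyMergePhase { BEFORE, AFTER }

final class ToyMergeEvent {
    final ToyMergePhase phase;
    final String segmentName;

    ToyMergeEvent(ToyMergePhase phase, String segmentName) {
        this.phase = phase;
        this.segmentName = segmentName;
    }
}

interface ToyMergeListener {
    void onMergeEvent(ToyMergeEvent event);
}

// Stand-in for a merge scheduler that notifies listeners around each merge.
class ToyMergeScheduler {
    private final List<ToyMergeListener> listeners = new ArrayList<>();

    void addListener(ToyMergeListener listener) {
        listeners.add(listener);
    }

    void merge(String segmentName, Runnable mergeWork) {
        fire(new ToyMergeEvent(ToyMergePhase.BEFORE, segmentName));
        try {
            mergeWork.run();
        } finally {
            // If a listener throws here, its exception replaces any exception
            // thrown by mergeWork. That is one possible hazard of notifying
            // from a finally block (an interpretation, not stated in the thread).
            fire(new ToyMergeEvent(ToyMergePhase.AFTER, segmentName));
        }
    }

    private void fire(ToyMergeEvent event) {
        for (ToyMergeListener listener : listeners) {
            listener.onMergeEvent(event);
        }
    }
}
```

A caching directory would register itself as a listener and decide, on the
AFTER event, whether the merged segment should leave the cache.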

> NRTCachingDirectory, to buffer small segments in a RAMDir
> -
>
> Key: LUCENE-3092
> URL: https://issues.apache.org/jira/browse/LUCENE-3092
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Store
>Reporter: Michael McCandless
>Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3092-listener.patch, LUCENE-3092.patch



[jira] [Updated] (LUCENE-3092) NRTCachingDirectory, to buffer small segments in a RAMDir

2011-05-12 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3092:
---

     Priority: Minor  (was: Major)
Fix Version/s: 4.0
               3.2

> NRTCachingDirectory, to buffer small segments in a RAMDir
> -
>
> Key: LUCENE-3092
> URL: https://issues.apache.org/jira/browse/LUCENE-3092
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Store
>Reporter: Michael McCandless
>Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3092.patch



[jira] [Updated] (LUCENE-3092) NRTCachingDirectory, to buffer small segments in a RAMDir

2011-05-12 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3092:
---

Attachment: LUCENE-3092.patch

Patch.

The patch adds NRTCachingDir in core, but if/when we commit this I
think we should put it in contrib/module instead.  I think it's
working correctly -- commit works fine (it flushes all cached files to
the real dir on sync) -- but it needs a test case, and I'm not going to
have time in the near future to write one, so I wanted to open this issue
to get it out there.
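
The flush-on-sync behaviour mentioned above can be pictured with a small toy
model. The class ToyFlushOnSyncCache and its methods are made up for this
sketch and are not taken from the patch; plain maps stand in for the RAM
cache and the real directory.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: shows cached files moving to the delegate on sync.
class ToyFlushOnSyncCache {
    private final Map<String, byte[]> ram = new HashMap<>();
    private final Map<String, byte[]> delegate = new HashMap<>();

    // Hold a newly written file in RAM only.
    void cache(String name, byte[] data) {
        ram.put(name, data);
    }

    // On sync, copy every cached file to the delegate and empty the cache,
    // so that nothing a commit depends on lives only in RAM.
    void sync() {
        delegate.putAll(ram);
        ram.clear();
    }

    int cachedCount() { return ram.size(); }

    boolean isOnDelegate(String name) { return delegate.containsKey(name); }
}
```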


> NRTCachingDirectory, to buffer small segments in a RAMDir
> -
>
> Key: LUCENE-3092
> URL: https://issues.apache.org/jira/browse/LUCENE-3092
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Store
>Reporter: Michael McCandless
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3092.patch