[jira] Commented: (LUCENE-2161) Some concurrency improvements for NRT

2010-06-02 Thread Shay Banon (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12874454#action_12874454 ]

Shay Banon commented on LUCENE-2161:


Thanks!

 Some concurrency improvements for NRT
 -

 Key: LUCENE-2161
 URL: https://issues.apache.org/jira/browse/LUCENE-2161
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9.3, 3.0.2, 3.1, 4.0

 Attachments: LUCENE-2161.patch


 Some concurrency improvements for NRT
 I found and fixed some silly thread bottlenecks that affect NRT:
   * Multi/DirectoryReader.numDocs is synchronized, I think so that only 1
 thread computes numDocs if it's -1.  I removed this sync, and made
 numDocs volatile instead.  Yes, multiple threads may compute
 numDocs for the first time, but I think that's harmless?
   * Fixed BitVector's ctor to set count to 0 on creating a new BV, and
 clone to copy the count over; this saves CPU computing the count
 unnecessarily.
   * Also strengthened the assertions done in SR (SegmentReader), testing
 the deleted docs count.
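
The BitVector change in the second bullet can be illustrated with a minimal sketch. CountingBitVector below is a hypothetical, simplified stand-in and not Lucene's BitVector: it only shows the idea of caching the set-bit count, starting it at 0 for a fresh vector, and carrying it over on clone so that neither operation triggers a recount.

```java
// Hypothetical stand-in for illustration; not Lucene's actual BitVector.
final class CountingBitVector {
    private final byte[] bits;
    private final int size;
    private int count; // cached number of set bits; -1 means "unknown, recount on demand"

    CountingBitVector(int size) {
        this.size = size;
        this.bits = new byte[(size + 7) >>> 3];
        this.count = 0; // a fresh vector has no set bits, so the count is already known
    }

    private CountingBitVector(byte[] bits, int size, int count) {
        this.bits = bits;
        this.size = size;
        this.count = count;
    }

    void set(int bit) {
        if (bit < 0 || bit >= size) {
            throw new IndexOutOfBoundsException("bit=" + bit);
        }
        bits[bit >> 3] |= 1 << (bit & 7);
        count = -1; // invalidate the cache; the next count() call recomputes it
    }

    int count() {
        if (count == -1) {
            int c = 0;
            for (byte b : bits) {
                c += Integer.bitCount(b & 0xFF);
            }
            count = c;
        }
        return count;
    }

    @Override
    public CountingBitVector clone() {
        // Copy the cached count along with the bits, so the clone does not recount.
        return new CountingBitVector(bits.clone(), size, count);
    }
}
```

In this sketch, calling count() on a new vector or on a clone of an already-counted vector returns the cached value without scanning the byte array.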
 I also found an annoying thread bottleneck caused by CMS
 (ConcurrentMergeScheduler).  Whenever CMS hits the max running merges
 (default recently changed from 3 to 1) and the merge policy wants to
 launch another merge, it forces the incoming thread to wait until one
 of the BG threads finishes.
 This is a basic crude throttling mechanism -- you force the mutators
 (whoever is causing new segments to appear) to stop, so that merging
 can catch up.
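
The throttling described above can be sketched roughly as follows. This is an illustration only, assuming a hypothetical MergeThrottle class built on java.util.concurrent.Semaphore; it is not how CMS is actually implemented, it just shows why the thread that asks for one merge too many ends up stalled.

```java
import java.util.concurrent.Semaphore;

// Hypothetical illustration of "block the incoming thread once the
// maximum number of merges is running"; not ConcurrentMergeScheduler's code.
final class MergeThrottle {
    private final Semaphore slots;

    MergeThrottle(int maxRunningMerges) {
        this.slots = new Semaphore(maxRunningMerges);
    }

    // Called by the thread that wants to start a merge. If all slots are
    // taken, the caller waits here until some merge calls endMerge().
    void beginMerge() {
        slots.acquireUninterruptibly();
    }

    // Non-blocking variant: reports whether a slot was free.
    boolean tryBeginMerge() {
        return slots.tryAcquire();
    }

    // Called when a background merge finishes, freeing one slot.
    void endMerge() {
        slots.release();
    }
}
```

With a limit of 1, a second beginMerge() call does not return until the first merge ends, which is the stall the description goes on to discuss when the caller happens to be the thread opening an NRT reader.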
 Unfortunately, when stressing NRT, that thread is the one that's
 opening a new NRT reader.
 So, the first serious problem happens when you call .reopen() on your
 NRT reader -- this call simply forwards to IW.getReader if the reader
 is an NRT reader.  But, because DirectoryReader.doReopen is
 synchronized, this has the horrible effect of holding the monitor lock
 on your main IR for as long as that call is blocked.  In my test, this
 blocked all searches (since each search uses incRef/decRef, still
 sync'd until LUCENE-2156, at least).
 I fixed this by making doReopen sync on this only if it's not simply
 forwarding to IW.getReader.  So that's a good step forward: searches
 are no longer blocked while trying to reopen to a new NRT reader.
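
A minimal sketch of that locking change, under stated assumptions: ReopenSketch and its nrtSource function are hypothetical stand-ins for the reader and for IW.getReader, and the returned strings are placeholders for real readers. The point is only that the potentially slow forwarding call runs outside the reader's monitor, while the non-NRT path stays synchronized.

```java
import java.util.function.Function;

// Hypothetical illustration; not DirectoryReader's actual code.
final class ReopenSketch {
    // Stand-in for IndexWriter.getReader(); null when this reader is not NRT.
    private final Function<ReopenSketch, String> nrtSource;
    private int refCount = 1;

    ReopenSketch(Function<ReopenSketch, String> nrtSource) {
        this.nrtSource = nrtSource;
    }

    String reopen() {
        if (nrtSource != null) {
            // NRT case: this call may block behind merge throttling, so it
            // runs without holding this reader's monitor.
            return nrtSource.apply(this);
        }
        // Non-NRT case: keep the original synchronized behavior.
        synchronized (this) {
            return "reopened-from-directory";
        }
    }

    // Stand-ins for the per-search incRef/decRef, which use the same monitor
    // and would therefore stall if reopen() held it while blocked.
    synchronized void incRef() {
        refCount++;
    }

    synchronized void decRef() {
        refCount--;
    }

    synchronized int refCount() {
        return refCount;
    }
}
```

Passing a function that checks Thread.holdsLock on the reader is one way to confirm that the NRT path is entered without the monitor held.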
 However... it doesn't fix the problem that when an immense merge is
 off and running, opening an NRT reader could hit a tremendous delay
 because CMS blocks it.  The BalancedSegmentMergePolicy should help
 here, by avoiding such immense merges.
 But I think we should also pursue an improvement to CMS.  E.g., if it
 has 2 merges running, where one is huge and one is tiny, it ought to
 increase the thread priority of the tiny one.  I think with such a
 change we could increase the max thread count again, to prevent this
 starvation.  I'll open a separate issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2161) Some concurrency improvements for NRT

2010-06-01 Thread Michael McCandless (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12874253#action_12874253 ]

Michael McCandless commented on LUCENE-2161:


Shay, it will be backported to 3.0.2.




[jira] Commented: (LUCENE-2161) Some concurrency improvements for NRT

2010-05-30 Thread Shay Banon (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873475#action_12873475 ]

Shay Banon commented on LUCENE-2161:


Mike, is there a reason why this is not backported to 3.0.2?

 Some concurrency improvements for NRT
 -

 Key: LUCENE-2161
 URL: https://issues.apache.org/jira/browse/LUCENE-2161
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9.3, 4.0

 Attachments: LUCENE-2161.patch





[jira] Commented: (LUCENE-2161) Some concurrency improvements for NRT

2009-12-15 Thread Earwin Burrfoot (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790663#action_12790663 ]

Earwin Burrfoot commented on LUCENE-2161:
-

Remove volatile from numDocs? All threads will hit some other sync sooner or 
later and see the value computed by the first (?few?).

 Some concurrency improvements for NRT
 -

 Key: LUCENE-2161
 URL: https://issues.apache.org/jira/browse/LUCENE-2161
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 3.1

 Attachments: LUCENE-2161.patch





[jira] Commented: (LUCENE-2161) Some concurrency improvements for NRT

2009-12-15 Thread Michael McCandless (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790675#action_12790675 ]

Michael McCandless commented on LUCENE-2161:


bq. Remove volatile from numDocs? All threads will hit some other sync sooner 
or later and see the value computed by the first (?few?).

Yeah, I guess this would be fine.  Even if they don't see the value (i.e. they 
still see the stale -1), it's harmless if they recompute it and re-overwrite 
it.  So I'll remove volatile.
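
The reasoning in this exchange can be sketched as a lazily computed count with no synchronization and no volatile. LazyDocCount and its int[] of per-sub-reader counts are assumptions made for illustration, not Lucene's reader classes. The race is benign here because every thread computes the same value and int writes are atomic in Java; a thread that still sees the stale -1 simply recomputes and overwrites it.

```java
// Hypothetical illustration; not Lucene's Multi/DirectoryReader.
final class LazyDocCount {
    private final int[] subReaderDocCounts;
    private int numDocs = -1; // -1 means "not computed yet"; deliberately not volatile

    LazyDocCount(int[] subReaderDocCounts) {
        this.subReaderDocCounts = subReaderDocCounts.clone();
    }

    int numDocs() {
        int n = numDocs; // read the field once, so the check and the return agree
        if (n == -1) {
            n = 0;
            for (int count : subReaderDocCounts) {
                n += count;
            }
            numDocs = n; // several threads may store here, all with the same value
        }
        return n;
    }
}
```

Reading the field into a local before the check matters in this sketch: it keeps a single call from testing one value of the field and returning another.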
