[jira] Commented: (LUCENE-2161) Some concurrency improvements for NRT
[ https://issues.apache.org/jira/browse/LUCENE-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12874454#action_12874454 ]

Shay Banon commented on LUCENE-2161:

Thanks!

Some concurrency improvements for NRT

Key: LUCENE-2161
URL: https://issues.apache.org/jira/browse/LUCENE-2161
Project: Lucene - Java
Issue Type: Improvement
Components: Index
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
Fix For: 2.9.3, 3.0.2, 3.1, 4.0
Attachments: LUCENE-2161.patch

Some concurrency improvements for NRT

I found and fixed some silly thread bottlenecks that affect NRT:

* Multi/DirectoryReader.numDocs is synchronized, I think so that only 1 thread computes numDocs if it's -1. I removed this sync and made numDocs volatile instead. Yes, multiple threads may compute numDocs for the first time, but I think that's harmless?
* Fixed BitVector's ctor to set count to 0 on creating a new BV, and clone to copy the count over; this saves CPU computing the count unnecessarily.
* Also strengthened assertions done in SR, testing the deleted docs count.

I also found an annoying thread bottleneck that happens due to CMS. Whenever CMS hits the max running merges (default changed from 3 to 1 recently), and the merge policy now wants to launch another merge, it forces the incoming thread to wait until one of the BG threads finishes. This is a basic, crude throttling mechanism -- you force the mutators (whoever is causing new segments to appear) to stop, so that merging can catch up.

Unfortunately, when stressing NRT, that thread is the one that's opening a new NRT reader. So, the first serious problem happens when you call .reopen() on your NRT reader -- this call simply forwards to IW.getReader if the reader was an NRT reader. But, because DirectoryReader.doReopen is synchronized, this had the horrible effect of holding the monitor lock on your main IR. In my test, this blocked all searches (since each search uses incRef/decRef, still sync'd until LUCENE-2156, at least). I fixed this by making doReopen sync on this only if it's not simply forwarding to IW.getReader. So that's a good step forward. This prevents searches from being blocked while trying to reopen to a new NRT reader.

However... it doesn't fix the problem that when an immense merge is off and running, opening an NRT reader could hit a tremendous delay because CMS blocks it. The BalancedSegmentMergePolicy should help here... by avoiding such immense merges.

But, I think we should also pursue an improvement to CMS. E.g., if it has 2 merges running, where one is huge and one is tiny, it ought to increase the thread priority of the tiny one. I think with such a change we could increase the max thread count again, to prevent this starvation. I'll open a separate issue.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
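The doReopen change described in the issue text above can be illustrated with a minimal sketch. This is not the LUCENE-2161 patch: SketchWriter, SketchReader and ReopenSketch are hypothetical stand-ins for IndexWriter and DirectoryReader, and only the locking shape is meant to match the description (forward to the writer without holding the reader's monitor; sync on this only on the other path).

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for IndexWriter; only getReader() matters here. */
final class SketchWriter {
    private final AtomicInteger generation = new AtomicInteger();

    /** In the real system this call may block for a long time while merges are throttled. */
    SketchReader getReader() {
        return new SketchReader(this, generation.incrementAndGet());
    }
}

/** Hypothetical stand-in for DirectoryReader. */
final class SketchReader {
    private final SketchWriter writer; // non-null only for an NRT reader
    final int generation;
    private int refCount = 1;

    SketchReader(SketchWriter writer, int generation) {
        this.writer = writer;
        this.generation = generation;
    }

    // Searches briefly take the reader's monitor, as incRef/decRef are described as doing.
    synchronized void incRef() { refCount++; }
    synchronized void decRef() { refCount--; }

    /** Deliberately not a synchronized method. */
    SketchReader reopen() {
        if (writer != null) {
            // NRT path: forward to the writer without holding this reader's monitor,
            // so a slow getReader() cannot block incRef/decRef on this reader.
            return writer.getReader();
        }
        synchronized (this) {
            // Non-NRT path: still guarded by the reader's own monitor.
            return new SketchReader(null, generation + 1);
        }
    }
}

final class ReopenSketch {
    public static void main(String[] args) {
        SketchWriter writer = new SketchWriter();
        SketchReader nrt = writer.getReader();
        SketchReader newer = nrt.reopen();
        System.out.println(nrt.generation + " -> " + newer.generation);
    }
}
```

The point of the design is only where the monitor is taken: a thread stuck inside the writer no longer owns the lock that concurrent searches need.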
[jira] Commented: (LUCENE-2161) Some concurrency improvements for NRT
[ https://issues.apache.org/jira/browse/LUCENE-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12874253#action_12874253 ]

Michael McCandless commented on LUCENE-2161:

Shay it will be backported to 3.0.2.
[jira] Commented: (LUCENE-2161) Some concurrency improvements for NRT
[ https://issues.apache.org/jira/browse/LUCENE-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12873475#action_12873475 ]

Shay Banon commented on LUCENE-2161:

Mike, is there a reason why this is not backported to 3.0.2?

Fix For: 2.9.3, 4.0
[jira] Commented: (LUCENE-2161) Some concurrency improvements for NRT
[ https://issues.apache.org/jira/browse/LUCENE-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12790663#action_12790663 ]

Earwin Burrfoot commented on LUCENE-2161:

Remove volatile from numDocs? All threads will hit some other sync sooner or later and see the value computed by the first (?few?).

Fix For: 3.1
[jira] Commented: (LUCENE-2161) Some concurrency improvements for NRT
[ https://issues.apache.org/jira/browse/LUCENE-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12790675#action_12790675 ]

Michael McCandless commented on LUCENE-2161:

bq. Remove volatile from numDocs? All threads will hit some other sync sooner or later and see the value computed by the first (?few?).

Yeah I guess this would be fine. Even if they don't see the value (ie they still see the stale -1), it's harmless if they recompute it and re-overwrite it. So I'll remove volatile.

Fix For: 3.1
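The unsynchronized, non-volatile numDocs cache discussed in this exchange can be sketched as below. LazyNumDocs and its segmentNumDocs array are hypothetical names, not Lucene classes, and the actual patch may differ. The sketch assumes what the comments above rely on: the computation is idempotent, so a thread that still sees the stale -1 simply recomputes and overwrites the field with the same value. Reading the field once into a local is an extra precaution added here, not something stated in the thread.

```java
/** Hypothetical stand-in for a multi-segment reader with a lazily cached doc count. */
final class LazyNumDocs {
    private final int[] segmentNumDocs; // per-segment counts, fixed at construction
    private int numDocs = -1;           // plain field: neither volatile nor lock-guarded

    LazyNumDocs(int... segmentNumDocs) {
        this.segmentNumDocs = segmentNumDocs.clone();
    }

    /** No synchronization: several threads may compute the sum, all get the same result. */
    int numDocs() {
        int n = numDocs; // single read of the shared field
        if (n == -1) {
            n = 0;
            for (int docs : segmentNumDocs) {
                n += docs;
            }
            numDocs = n; // racy publish; int writes are atomic in Java
        }
        return n;
    }
}
```

The trade-off is a possible duplicated computation on first use in exchange for removing a monitor acquisition from every numDocs call.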