[
https://issues.apache.org/jira/browse/SOLR-6136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Timothy Potter updated SOLR-6136:
-
Attachment: SOLR-6136.patch
Here's a patch based largely on Brandon's original patch, using wait /
notifyAll instead of the spin lock in blockUntilFinished. As mentioned above,
VisualVM shows good evidence of this improvement in that the amount of CPU
spent in the block method is negligible with this patch (and very noticeable
without it).
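
For anyone unfamiliar with the pattern, here is a minimal sketch of replacing a busy-wait loop with wait / notifyAll. It is not the code from the attached patch: the RunnerTracker class, its method names, and the choice to return false on interrupt are all hypothetical and serve only to illustrate the idea.

```java
// Hypothetical illustration only; not taken from SOLR-6136.patch.
class RunnerTracker {
    private final Object lock = new Object();
    private int activeRunners = 0; // guarded by lock

    // Called when a background runner begins work.
    void runnerStarted() {
        synchronized (lock) {
            activeRunners++;
        }
    }

    // Called when a runner completes; wakes every thread parked in blockUntilFinished().
    void runnerFinished() {
        synchronized (lock) {
            activeRunners--;
            lock.notifyAll();
        }
    }

    // Parks the caller instead of spinning. The while loop re-checks the
    // condition after each wake-up, which also guards against spurious wake-ups.
    // Returns false if the waiting thread was interrupted (an assumption of this sketch).
    boolean blockUntilFinished() {
        synchronized (lock) {
            while (activeRunners > 0) {
                try {
                    lock.wait(); // releases the monitor and sleeps, using no CPU
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return true;
        }
    }

    int activeRunners() {
        synchronized (lock) {
            return activeRunners;
        }
    }

    public static void main(String[] args) {
        RunnerTracker tracker = new RunnerTracker();
        tracker.runnerStarted();
        new Thread(() -> {
            try {
                Thread.sleep(100); // simulate a runner draining its queue
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            tracker.runnerFinished();
        }).start();
        boolean done = tracker.blockUntilFinished();
        System.out.println("finished=" + done + ", active=" + tracker.activeRunners());
    }
}
```

A spin lock would instead poll the counter in a tight loop, which is the kind of CPU burn that shows up in a profiler; with wait(), the blocked thread is descheduled until a runner signals completion.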
I've also included a first cut at a unit test for ConcurrentUpdateSolrServer
(CUSS). There are probably more things we can do to exercise the logic in CUSS,
so let me know if you have any other ideas for the unit test.
Brandon - please try this patch out in your environment if possible. I plan to
commit this to trunk and backport it to the 4x branch in a few days, after
keeping an eye on things in Jenkins.
> ConcurrentUpdateSolrServer includes a Spin Lock
> ---
>
> Key: SOLR-6136
> URL: https://issues.apache.org/jira/browse/SOLR-6136
> Project: Solr
> Issue Type: Bug
> Components: SolrCloud
>Affects Versions: 4.6, 4.6.1, 4.7, 4.7.1, 4.7.2, 4.8, 4.8.1
>Reporter: Brandon Chapman
>Assignee: Timothy Potter
>Priority: Critical
> Attachments: SOLR-6136.patch, wait___notify_all.patch
>
>
> ConcurrentUpdateSolrServer.blockUntilFinished() includes a Spin Lock. This
> causes an extremely high amount of CPU to be used on the Cloud Leader during
> indexing.
> Here is a summary of our system testing.
> Importing data on Solr4.5.0:
> Throughput gets as high as 240 documents per second.
> [tomcat@solr-stg01 logs]$ uptime
> 09:53:50 up 310 days, 23:52, 1 user, load average: 3.33, 3.72, 5.43
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 9547 tomcat 21 0 6850m 1.2g 16m S 86.2 5.0 1:48.81 java
> Importing data on Solr4.7.0 with no replicas:
> Throughput peaks at 350 documents per second.
> [tomcat@solr-stg01 logs]$ uptime
> 10:03:44 up 311 days, 2 min, 1 user, load average: 4.57, 2.55, 4.18
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 9728 tomcat 23 0 6859m 2.2g 28m S 62.3 9.0 2:20.20 java
> Importing data on Solr4.7.0 with replicas:
> Throughput peaks at 30 documents per second because the Solr machine is out
> of CPU.
> [tomcat@solr-stg01 logs]$ uptime
> 09:40:04 up 310 days, 23:38, 1 user, load average: 30.54, 12.39, 4.79
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 9190 tomcat 17 0 7005m 397m 15m S 198.5 1.6 7:14.87 java