[jira] [Comment Edited] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection
[ https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15865912#comment-15865912 ]

Andrzej Bialecki edited comment on SOLR-10130 at 2/14/17 3:09 PM:
---

I haven't been able to reproduce such a drastic slowdown using simple benchmarks. Example results from indexing with the {{post}} tool, fairly representative of several runs on each branch:

{code}
* branch_6_3
real    4m14.804s
user    0m0.883s
sys     0m2.279s

* branch_6_4
real    5m0.987s
user    0m0.910s
sys     0m2.276s

* jira/solr-10130
real    4m38.097s
user    0m0.881s
sys     0m2.287s
{code}

The profiler does show that one of the hotspots on branch_6_4 is the {{Meter.mark}} code that is called in {{org.apache.solr.core.MetricsDirectoryFactory$MetricsInput.readByte}}. In my test the profiler showed that this consumes ~3% CPU, which is indeed something that we should avoid and turn off by default. However, this still doesn't explain the order-of-magnitude slowdown reported above.

[~emaijala] and [~wunder] - please apply the above patch in your environment and see what the impact is. It makes sense to make this change anyway, so I'm going to apply this or a similar version to all affected branches, but maybe there's more we can do here.

> Serious performance degradation in Solr 6.4.1 due to the new metrics collection
> ---
>
>                 Key: SOLR-10130
>                 URL: https://issues.apache.org/jira/browse/SOLR-10130
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: metrics
>    Affects Versions: 6.4.1
>         Environment: CentOS 7, OpenJDK 1.8.0 update 111
>            Reporter: Ere Maijala
>            Assignee: Andrzej Bialecki
>            Priority: Blocker
>              Labels: perfomance
>         Attachments: SOLR-10130.patch, solr-8983-console-f1.log
>
> We've stumbled on serious performance issues after upgrading to Solr 6.4.1.
> It looks like the new metrics collection system in MetricsDirectoryFactory is
> causing a major slowdown. This happens with an index configuration that, as
> far as I can see, has no metrics-specific configuration and uses
> luceneMatchVersion 5.5.0. In practice a moderate load will completely bog
> down the server, with Solr threads constantly using up all CPU capacity
> (600% on a 6-core machine) under a load where we normally see an average
> load of < 50%.
> I took stack traces (I'll attach them) and noticed that the threads are
> spending time in com.codahale.metrics.Meter.mark. I tested building Solr
> 6.4.1 with the metrics collection disabled in MetricsDirectoryFactory getByte
> and getBytes methods and was unable to reproduce the issue.
> As far as I can see there are several issues:
> 1. Collecting metrics on every single byte read is slow.
> 2. Having it enabled by default is not a good idea.
> 3. The comment "enable coarse-grained metrics by default" at
> https://github.com/apache/lucene-solr/blob/branch_6x/solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java#L104
> implies that only coarse-grained metrics should be enabled by default, and
> this contradicts collecting metrics on every single byte read.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
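Points 1 and 2 of the issue description above can be illustrated with a minimal Java sketch. This is not the Solr code: {{CountingMeter}} and {{MeteredInput}} are hypothetical stand-ins for {{com.codahale.metrics.Meter}} and {{MetricsDirectoryFactory$MetricsInput}}, and the meter here only counts calls. It shows that metering inside a per-byte read produces one meter update per byte, while a bulk read produces one update per call and a disabled meter produces none.

```java
import java.util.concurrent.atomic.LongAdder;

class PerByteMeteringSketch {

    /** Hypothetical stand-in for a meter: counts mark() calls and the total marked. */
    static final class CountingMeter {
        private final LongAdder calls = new LongAdder();
        private final LongAdder total = new LongAdder();

        void mark(long n) {
            calls.increment();
            total.add(n);
        }

        long calls() { return calls.sum(); }
        long total() { return total.sum(); }
    }

    /** Hypothetical input over a byte array; a null meter means metering is off. */
    static final class MeteredInput {
        private final byte[] data;
        private final CountingMeter meter;
        private int pos;

        MeteredInput(byte[] data, CountingMeter meter) {
            this.data = data;
            this.meter = meter;
        }

        byte readByte() {
            if (meter != null) {
                meter.mark(1); // one meter update for every single byte read
            }
            return data[pos++];
        }

        void readBytes(byte[] dst, int offset, int len) {
            System.arraycopy(data, pos, dst, offset, len);
            pos += len;
            if (meter != null) {
                meter.mark(len); // one meter update for the whole bulk read
            }
        }
    }

    /** Reads the array one byte at a time; returns how many mark() calls were made. */
    static long markCallsByteByByte(byte[] data, boolean metricsEnabled) {
        CountingMeter meter = new CountingMeter();
        MeteredInput in = new MeteredInput(data, metricsEnabled ? meter : null);
        for (int i = 0; i < data.length; i++) {
            in.readByte();
        }
        return meter.calls();
    }

    /** Reads the array in a single bulk call; returns how many mark() calls were made. */
    static long markCallsBulk(byte[] data, boolean metricsEnabled) {
        CountingMeter meter = new CountingMeter();
        MeteredInput in = new MeteredInput(data, metricsEnabled ? meter : null);
        in.readBytes(new byte[data.length], 0, data.length);
        return meter.calls();
    }

    public static void main(String[] args) {
        byte[] data = new byte[1 << 20]; // 1 MiB of zero bytes
        System.out.println("per-byte reads, metrics on:  " + markCallsByteByByte(data, true));
        System.out.println("bulk read, metrics on:       " + markCallsBulk(data, true));
        System.out.println("per-byte reads, metrics off: " + markCallsByteByByte(data, false));
    }
}
```

The sketch only counts updates; it does not model what each update costs in the real meter, or contention between threads, so it says nothing about the size of the slowdown on a given system.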
[jira] [Comment Edited] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection
[ https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15865470#comment-15865470 ]

Ishan Chattopadhyaya edited comment on SOLR-10130 at 2/14/17 9:39 AM:
---

I ran some benchmarks, with and without this patch.

Benchmarking suite: https://github.com/chatman/solr-upgrade-tests/blob/master/BENCHMARKS.md
Environment: packet.net, Type 0 server (https://www.packet.net/bare-metal/servers/type-0/)

{code}
6.4.1 Without patch
java -cp target/org.apache.solr.tests.upgradetests-0.0.1-SNAPSHOT-jar-with-dependencies.jar:. org.apache.solr.tests.upgradetests.SimpleBenchmarks -v 72f75b2503fa0aa4f0aff76d439874feb923bb0e -Nodes 1 -Shards 1 -Replicas 1 -numDocs 10 -threads 6 -benchmarkType generalIndexing
Indexing times: 188,190

6.4.1 With patch
java -cp target/org.apache.solr.tests.upgradetests-0.0.1-SNAPSHOT-jar-with-dependencies.jar:. org.apache.solr.tests.upgradetests.SimpleBenchmarks -v 72f75b2503fa0aa4f0aff76d439874feb923bb0e -patchUrl https://issues.apache.org/jira/secure/attachment/12852444/SOLR-10130.patch -Nodes 1 -Shards 1 -Replicas 1 -numDocs 10 -threads 6 -benchmarkType generalIndexing
Indexing times: 171,165
{code}
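The indexing times from the two SimpleBenchmarks runs above (188,190 without the patch; 171,165 with it) can be compared with a quick calculation. This is a minimal sketch: the class and method names are hypothetical, and it assumes the two values per configuration are comparable timings in the same unit, which the output does not state.

```java
import java.util.Locale;

class BenchmarkDelta {

    /** Arithmetic mean of the given values. */
    static double mean(double... values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /** Relative change of candidate against baseline, in percent (negative means faster). */
    static double percentChange(double baseline, double candidate) {
        return (candidate - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        double withoutPatch = mean(188, 190); // "6.4.1 Without patch" indexing times
        double withPatch = mean(171, 165);    // "6.4.1 With patch" indexing times
        System.out.println(String.format(Locale.ROOT,
                "mean without patch: %.1f", withoutPatch));   // 189.0
        System.out.println(String.format(Locale.ROOT,
                "mean with patch:    %.1f", withPatch));      // 168.0
        System.out.println(String.format(Locale.ROOT,
                "change:             %.1f%%",
                percentChange(withoutPatch, withPatch)));     // -11.1%
    }
}
```

With only two runs per configuration this is a rough indication, roughly an 11% reduction in mean indexing time with the patch, rather than a statistically solid measurement.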