[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-04-27 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15987488#comment-15987488
 ] 

Shawn Heisey commented on SOLR-10130:
-

Metrics were just a theory; it sounds like that's not it.  Thanks [~ab] for the 
assist.

> Serious performance degradation in Solr 6.4.1 due to the new metrics 
> collection
> ---
>
> Key: SOLR-10130
> URL: https://issues.apache.org/jira/browse/SOLR-10130
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: metrics
> Affects Versions: 6.4, 6.4.1
> Environment: Centos 7, OpenJDK 1.8.0 update 111
> Reporter: Ere Maijala
> Assignee: Andrzej Bialecki
> Priority: Blocker
> Labels: perfomance
> Fix For: 6.4.2, master (7.0)
>
> Attachments: SOLR-10130.patch, SOLR-10130.patch, solr-8983-console-f1.log
>
>
> We've stumbled on serious performance issues after upgrading to Solr 6.4.1.
> It looks like the new metrics collection system in MetricsDirectoryFactory is
> causing a major slowdown. This happens with an index configuration that, as
> far as I can see, has no metrics-specific configuration and uses
> luceneMatchVersion 5.5.0. In practice, a moderate load completely bogs down
> the server, with Solr threads constantly using all CPU capacity (600% on a
> 6-core machine) in a situation where we normally see an average load of
> < 50%.
> I took stack traces (I'll attach them) and noticed that the threads are
> spending time in com.codahale.metrics.Meter.mark. I tested building Solr
> 6.4.1 with the metrics collection disabled in the MetricsDirectoryFactory
> getByte and getBytes methods and was unable to reproduce the issue.
> As far as I can see there are several issues:
> 1. Collecting metrics on every single byte read is slow.
> 2. Having it enabled by default is not a good idea.
> 3. The comment "enable coarse-grained metrics by default" at
> https://github.com/apache/lucene-solr/blob/branch_6x/solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java#L104
> implies that only coarse-grained metrics should be enabled by default, which
> contradicts collecting metrics on every single byte read.
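
To illustrate point 1 of the description, here is a minimal sketch of why marking a metric per byte costs so much more than marking it per bulk read. It does not use Solr's MetricsDirectoryFactory or the com.codahale.metrics classes; MeteredReadSketch and CountingMeter are hypothetical stand-ins, and a real meter probably does more work per call than this simple counter.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical stand-in for an instrumented byte source; not Solr's actual code. */
final class MeteredReadSketch {

    /** Simplified meter that also records how often it was updated. */
    static final class CountingMeter {
        final LongAdder total = new LongAdder();    // bytes accounted for
        final LongAdder updates = new LongAdder();  // number of mark() calls

        void mark(long n) {
            total.add(n);
            updates.increment();
        }
    }

    /** Marks the meter once for every byte, as a per-byte read hook would. */
    static long readPerByte(byte[] data, CountingMeter meter) {
        long sum = 0;
        for (byte b : data) {
            meter.mark(1);
            sum += b;
        }
        return sum;
    }

    /** Marks the meter once for the whole buffer, as a bulk read hook would. */
    static long readBulk(byte[] data, CountingMeter meter) {
        long sum = 0;
        for (byte b : data) {
            sum += b;
        }
        meter.mark(data.length);
        return sum;
    }
}
```

Both variants account for the same number of bytes, but the per-byte variant updates the metric once per byte, so the bookkeeping cost grows with the data volume rather than with the number of read calls.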



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-04-27 Thread Matthew Sporleder (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15987240#comment-15987240
 ] 

Matthew Sporleder commented on SOLR-10130:
--

Both installs are running the Sun JDK, java version "1.8.0_45".




[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-04-27 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15987160#comment-15987160
 ] 

Andrzej Bialecki  commented on SOLR-10130:
--

Did you change JDK version between these two installs? I found an old issue 
(but still open!) that may indicate it's a JDK bug: 
https://github.com/netty/netty/issues/327 . There are other similar reports for 
Jetty, but for older versions ... can't say whether that's relevant here.

However, what these stacktraces do NOT show is anything related to metrics.




[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-04-27 Thread Matthew Sporleder (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15987079#comment-15987079
 ] 

Matthew Sporleder commented on SOLR-10130:
--

Not sure I have the tooling right now for a full drill down, but here are some 
examples of a thread dump:
{code}
"qtp968514068-37953" - Thread t@37953
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
        - locked <2e101d2c> (a sun.nio.ch.Util$2)
        - locked <59c1f901> (a java.util.Collections$UnmodifiableSet)
        - locked <54c6a926> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101)
        at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.select(ManagedSelector.java:243)
        at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:191)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:249)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
        at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
        - None

"qtp968514068-37952" - Thread t@37952
   java.lang.Thread.State: TIMED_WAITING
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <2d78e562> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
        at org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:563)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.access$800(QueuedThreadPool.java:48)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
        at java.lang.Thread.run(Thread.java:745)
{code}

Most threads are in that TIMED_WAITING state, and according to VisualVM most 
CPU time is spent in org.eclipse.jetty.util.BlockingArrayQueue.poll.
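
For a quick summary of this kind without a profiler, the standard java.lang.management API can count threads by state and report per-thread CPU time. The sketch below is only an illustration: ThreadStateSummary is a hypothetical helper name, and it inspects the JVM it runs in, so using it against a running Solr instance would assume some way of running it inside that JVM or reading the same MXBean over JMX.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical helper: summarizes threads of the current JVM. */
final class ThreadStateSummary {

    /** Counts live threads by state, limited to names starting with namePrefix. */
    static Map<Thread.State, Integer> countStates(String namePrefix) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds())) {
            if (info == null) {
                continue; // thread exited between the two calls
            }
            if (!info.getThreadName().startsWith(namePrefix)) {
                continue;
            }
            counts.merge(info.getThreadState(), 1, Integer::sum);
        }
        return counts;
    }

    /** CPU time of one thread in milliseconds, or -1 if it is not available. */
    static long cpuMillis(long threadId) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return -1;
        }
        long nanos = mx.getThreadCpuTime(threadId);
        return nanos < 0 ? -1 : nanos / 1_000_000;
    }
}
```

Calling countStates("qtp") would restrict the summary to threads named like the Jetty pool threads in the dump above.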



[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-04-27 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15986913#comment-15986913
 ] 

Andrzej Bialecki  commented on SOLR-10130:
--

I don't think this should make much of a difference: 
{{InstrumentedQueuedThreadPool}} only exposes gauges, which basically don't add 
CPU overhead unless accessed, and {{InstrumentedHandler}} collects only a few 
specific metrics, so the overhead should also be minimal, on the order of 
microseconds per request.
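
As a rough illustration of why a gauge costs nothing until it is read, the sketch below models a gauge as a stored callback. GaugeSketch and its nested Gauge class are hypothetical stand-ins, not the metrics library's or Jetty's own classes; the point is only that request handling never invokes the callback.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical sketch: a gauge is a callback that runs only when someone reads it. */
final class GaugeSketch {

    static final class Gauge<T> {
        private final Supplier<T> source;

        Gauge(Supplier<T> source) {
            this.source = source;
        }

        T getValue() {
            return source.get();
        }
    }

    /** Simulates requests and gauge reads; returns how often the gauge callback ran. */
    static int evaluationsAfter(int requests, int reads) {
        AtomicInteger evaluations = new AtomicInteger();
        AtomicInteger busyThreads = new AtomicInteger();
        Gauge<Integer> busy = new Gauge<>(() -> {
            evaluations.incrementAndGet();
            return busyThreads.get();
        });
        for (int i = 0; i < requests; i++) {
            busyThreads.incrementAndGet(); // request starts
            busyThreads.decrementAndGet(); // request ends; the gauge is never touched
        }
        for (int i = 0; i < reads; i++) {
            busy.getValue(); // e.g. a reporter or an admin page polling the value
        }
        return evaluations.get();
    }
}
```

In this model the number of callback runs depends only on how often the value is polled, not on the request rate.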

A drill-down into these threads to find their hot-spots would be useful.




[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-04-27 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15986859#comment-15986859
 ] 

Shawn Heisey commented on SOLR-10130:
-

Have a question related to this issue.  Somebody on the IRC channel running 
6.4.2 is seeing continued performance degradation compared to 4.x.  They were 
running an earlier 6.4.x release, until they were advised about this issue.

Looking at the utilization for threads, the top threads on 6.4.2 are all named 
starting with qtp, which I believe means they are Jetty threads.

https://gist.github.com/msporleder-work/7313ebedbdab2e178ca0aa2e889d006b

If I'm not mistaken, we enabled container-level metrics with the changes that 
went into 6.4.0.  If that's true, do we perhaps have those metrics dialed up to 
11?




[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-03-15 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15926179#comment-15926179
 ] 

Shawn Heisey commented on SOLR-10130:
-

The reassignment was accidental, fixed.  I hate the fact that Jira responds 
with real actions to just typing on the keyboard.  I sometimes forget which 
window has the focus, assume that an SSH session I can clearly see is active, 
and find that I'm giving unknown commands to something that accepts keypresses 
as commands, like Thunderbird or Jira.




[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-23 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881006#comment-15881006
 ] 

Ishan Chattopadhyaya commented on SOLR-10130:
-

Adding a link to https://issues.apache.org/jira/browse/SOLR-10182 for backing 
out the changes that caused these perf degradations.




[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-17 Thread Ere Maijala (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871772#comment-15871772
 ] 

Ere Maijala commented on SOLR-10130:


I still don't have proper benchmarks, but I've tested enough to say with fair 
confidence that this is fixed for us.




[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-16 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870760#comment-15870760
 ] 

Erick Erickson commented on SOLR-10130:
---

How on earth did you get a 4.7G tlog? It looks like you somehow didn't commit, 
shut the node down, and are now replaying a ton of docs (well, how much does 
4.7G weigh anyway?) from the tlog.

So, simple test:
1> wait for the node to come up.
2> ensure you've issued a hard commit.
3> try restarting.

My claim is that the restart will take a reasonable time and that the slowness 
you're seeing is a result of somehow shutting down without doing a commit. Of 
course, depending on your autocommit interval, you may not need to do the hard 
commit before restarting...

Best,
Erick
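
One way to check this before a restart is to measure how much data sits in the replica's tlog directory. The following is a minimal sketch using only the JDK's file API; TlogSize is a hypothetical helper name, and the directory passed in is whatever tlog path applies to the replica in question.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

/** Hypothetical helper: total size in bytes of the regular files under a directory. */
final class TlogSize {

    static long directoryBytes(Path dir) throws IOException {
        try (Stream<Path> files = Files.walk(dir)) {
            return files
                .filter(Files::isRegularFile)
                .mapToLong(p -> {
                    try {
                        return Files.size(p);
                    } catch (IOException e) {
                        // mapToLong cannot throw checked exceptions
                        throw new UncheckedIOException(e);
                    }
                })
                .sum();
        }
    }
}
```

If the reasoning above holds, the reported size should be small after a hard commit, and a value in the gigabytes would suggest a long replay on the next startup.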




[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-16 Thread Walter Underwood (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870751#comment-15870751
 ] 

Walter Underwood commented on SOLR-10130:
-

This might be part of it:

[wunder@new-solr-c01.test3]# ls -lh /solr/data/questions_shard2_replica1/data/tlog/
total 4.7G
-rw-r--r-- 1 bin bin 4.7G Feb 13 11:04 tlog.000
[wunder@new-solr-c01.test3]# du -sh /solr/data/questions_shard2_replica1/data/*
8.4G    /solr/data/questions_shard2_replica1/data/index
4.0K    /solr/data/questions_shard2_replica1/data/snapshot_metadata
4.7G    /solr/data/questions_shard2_replica1/data/tlog


Last Modified: 3 days ago
Num Docs: 3683075
Max Doc: 3683075
Heap Memory Usage: -1
Deleted Docs: 0
Version: 2737
Segment Count: 26
Optimized: yes
Current: yes



> Serious performance degradation in Solr 6.4.1 due to the new metrics 
> collection
> ---
>
> Key: SOLR-10130
> URL: https://issues.apache.org/jira/browse/SOLR-10130
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: metrics
>Affects Versions: 6.4.1
> Environment: Centos 7, OpenJDK 1.8.0 update 111
>Reporter: Ere Maijala
>Assignee: Andrzej Bialecki 
>Priority: Blocker
>  Labels: perfomance
> Fix For: master (7.0), 6.4.2
>
> Attachments: SOLR-10130.patch, SOLR-10130.patch, 
> solr-8983-console-f1.log
>
>
> We've stumbled on serious performance issues after upgrading to Solr 6.4.1. 
> Looks like the new metrics collection system in MetricsDirectoryFactory is 
> causing a major slowdown. This happens with an index configuration that, as 
> far as I can see, has no metrics specific configuration and uses 
> luceneMatchVersion 5.5.0. In practice a moderate load will completely bog 
> down the server with Solr threads constantly using up all CPU (600% on 6 core 
> machine) capacity with a load that normally  where we normally see an average 
> load of < 50%.
> I took stack traces (I'll attach them) and noticed that the threads are 
> spending time in com.codahale.metrics.Meter.mark. I tested building Solr 
> 6.4.1 with the metrics collection disabled in MetricsDirectoryFactory getByte 
> and getBytes methods and was unable to reproduce the issue.
> As far as I can see there are several issues:
> 1. Collecting metrics on every single byte read is slow.
> 2. Having it enabled by default is not a good idea.
> 3. The comment "enable coarse-grained metrics by default" at 
> https://github.com/apache/lucene-solr/blob/branch_6x/solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java#L104
>  implies that only coarse-grained metrics should be enabled by default, and 
> this contradicts collecting metrics on every single byte read.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-16 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870734#comment-15870734
 ] 

Andrzej Bialecki  commented on SOLR-10130:
--

bq. Is there any way you could pull/build a new version of Solr 6.4 (or apply 
the patch on this JIRA locally) and try? I'd hate to have the 6.4.2 release get 
out (coming soon, due to this) and not have fixed a different issue.

I concur. [~wunder] - we are not sure whether your situation is caused by the 
issue fixed here or by some other bug. It would be very helpful if you could 
try a build that contains this patch to see if it solves the problem in your 
environment.




[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-16 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870444#comment-15870444
 ] 

Erick Erickson commented on SOLR-10130:
---

Ah, ok. I take it this is a replica with a bunch of data in it? Although this 
doesn't make sense: there shouldn't be that much work to fire up a core absent 
tlog replay and the like, but it sounds like you're far beyond that, so I'm 
missing something.

Is there any way you could pull/build a new version of Solr 6.4 (or apply the 
patch on this JIRA locally) and try? I'd hate to have the 6.4.2 release get out 
(coming soon, due to this) and not have fixed a different issue.




[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-16 Thread Walter Underwood (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870429#comment-15870429
 ] 

Walter Underwood commented on SOLR-10130:
-

I'm looking at how long the core is marked "recovering" in the cloud view of 
the admin UI.

There shouldn't be any recovery. The server process is restarted hours after 
the most recent update. I think this is how long it takes to get the core 
loaded and ready for search. Startup time, really.




[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-16 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870379#comment-15870379
 ] 

Erick Erickson commented on SOLR-10130:
---

[~wunder]: If it's easy, could you try a manual fetchindex? You can do that 
even in cloud mode. See: 
https://cwiki.apache.org/confluence/display/solr/Index+Replication#IndexReplication-HTTPAPICommandsfortheReplicationHandler

Or maybe just see if the logs show that this very long recovery happens when 
you have a full recovery, i.e. you're copying the full index down from the 
leader/master...
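
For reference, a manual fetchindex is an HTTP request to the core's replication handler. The Python sketch below only builds the request URL and does not send it. The host and port are placeholders, and the core name is borrowed from elsewhere in this thread. The {{command=fetchindex}} and optional {{masterUrl}} parameters are taken from the Index Replication page linked above, so check them against the Solr version you run.

```python
from urllib.parse import urlencode


def fetchindex_url(solr_base, core, master_url=None):
    """Build the replication-handler URL for a manual fetchindex.

    solr_base and core are placeholders for your own host and core name.
    master_url, if given, is passed as the optional masterUrl parameter.
    """
    params = {"command": "fetchindex"}
    if master_url:
        params["masterUrl"] = master_url
    return f"{solr_base.rstrip('/')}/{core}/replication?{urlencode(params)}"


# Hypothetical host; the resulting URL can be requested with curl or a browser.
url = fetchindex_url("http://localhost:8983/solr", "questions_shard2_replica1")
print(url)
```

Timing this request on a 6.3 node and a 6.4 node would show whether the slow recovery is tied to copying the full index.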








[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-16 Thread Walter Underwood (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870198#comment-15870198
 ] 

Walter Underwood commented on SOLR-10130:
-

Also, recovery is much, much slower in 6.4. Each core is about 8 GB. After a 
server process restart, the core is recovering for a few minutes in 6.3, but 
for about a half hour in 6.4.




[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-16 Thread Walter Underwood (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870165#comment-15870165
 ] 

Walter Underwood commented on SOLR-10130:
-

The slowdown is impressive under heavy query load. Here are two load benchmarks 
with a 16 node cluster, c4.8xlarge instances (36 CPUs, 60 GB RAM), 15.7 million 
docs, 4 shards, replication factor 4 using production query logs. These are 
very long text queries, up to 40 words. Benchmark runs for two or three hours, 
depending on my patience. Java 8u121, G1 collector.

6.4.0 with 1000 requests/minute is running out of CPU. Median and 95th 
percentile response times for an ngram/prefix match are 7.5 and 9.8 seconds. 
For a word match, they are 11 and 25.4 seconds.

6.3.0 with 6000 rpm, the times are 0.4 and 2.7 seconds, and 0.7 and 4.3 
seconds, respectively. CPU usage is under 50%.

Short version, 6.4 is 10X slower than 6.3 handling 1/6 the load. 
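
As a rough check on the "10X" figure, the response times quoted above can be compared directly. This Python sketch only divides the reported 6.4.0 numbers by the 6.3.0 ones. The dictionary keys are labels made up here, and the comparison does not correct for the two runs using different request rates.

```python
# Response times in seconds, as reported above: (median, 95th percentile).
solr_640 = {"ngram_prefix": (7.5, 9.8), "word": (11.0, 25.4)}  # 1000 requests/minute
solr_630 = {"ngram_prefix": (0.4, 2.7), "word": (0.7, 4.3)}    # 6000 requests/minute


def slowdown(new, old):
    """Return per-query-type (median ratio, p95 ratio) of new over old."""
    return {
        kind: (new[kind][0] / old[kind][0], new[kind][1] / old[kind][1])
        for kind in new
    }


ratios = slowdown(solr_640, solr_630)
for kind, (median_ratio, p95_ratio) in ratios.items():
    print(f"{kind}: median {median_ratio:.1f}x, p95 {p95_ratio:.1f}x")

# Fraction of the 6.3.0 request rate that the 6.4.0 run was handling.
load_fraction = 1000 / 6000
```

The medians come out at roughly 19x and 16x and the 95th percentiles at roughly 3.6x and 5.9x, so "10X" is a fair round number for a run handling one sixth of the request rate.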




[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-16 Thread Alessandro Benedetti (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869753#comment-15869753
 ] 

Alessandro Benedetti commented on SOLR-10130:
-

Thanks for the clear explanation!
Good spot!




[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-16 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869702#comment-15869702
 ] 

Andrzej Bialecki  commented on SOLR-10130:
--

A prefix query that matches many terms causes many seek & read ops, which meant 
that the instrumentation in 
{{org.apache.solr.core.MetricsDirectoryFactory$MetricsInput.readByte}} was 
called for every small read. This normally wouldn't matter for regular 
Directory implementations because they use caching extensively, precisely to 
avoid the overhead of reading single bytes, but {{MetricsDirectory}}, being a 
wrapper on top of any Directory implementation, couldn't benefit from this 
caching and still maintain the read/write counters. The overhead of an 
individual {{Meter.mark}} call is on the order of microseconds, but invoking it 
a few million times resulted in a significant slowdown. 
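
The effect described above can be illustrated with a toy model. The Python sketch below is not Solr code: {{Meter}}, {{PerByteMeteredInput}}, and {{BatchedMeteredInput}} are hypothetical stand-ins, and batching the counter is just one generic way to cut per-call overhead, not necessarily what the patch on this issue does. It shows that a wrapper which marks a meter on every single-byte read makes one meter call per byte, whereas a wrapper that accumulates locally makes a single call.

```python
import io


class Meter:
    """Toy stand-in for a metrics meter; tracks the total and the number of mark() calls."""

    def __init__(self):
        self.count = 0
        self.calls = 0

    def mark(self, n=1):
        self.count += n
        self.calls += 1


class PerByteMeteredInput:
    """Marks the meter on every single-byte read."""

    def __init__(self, data, meter):
        self._stream = io.BytesIO(data)
        self._meter = meter

    def read_byte(self):
        b = self._stream.read(1)
        if b:
            self._meter.mark(1)
        return b

    def close(self):
        pass


class BatchedMeteredInput:
    """Counts bytes locally and reports them to the meter once, on close."""

    def __init__(self, data, meter):
        self._stream = io.BytesIO(data)
        self._meter = meter
        self._pending = 0

    def read_byte(self):
        b = self._stream.read(1)
        if b:
            self._pending += 1
        return b

    def close(self):
        if self._pending:
            self._meter.mark(self._pending)
            self._pending = 0


def drain(inp):
    """Read the input one byte at a time until it is exhausted, then close it."""
    while inp.read_byte():
        pass
    inp.close()


data = bytes(100_000)  # 100,000 zero bytes
per_byte, batched = Meter(), Meter()
drain(PerByteMeteredInput(data, per_byte))
drain(BatchedMeteredInput(data, batched))
print(per_byte.calls, batched.calls)
```

Both meters end with the same byte count, but the per-byte wrapper pays the meter's cost once per byte. At around a microsecond per call, a few million calls add up to seconds of pure bookkeeping.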




[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-16 Thread Alessandro Benedetti (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869625#comment-15869625
 ] 

Alessandro Benedetti commented on SOLR-10130:
-

What was causing the query-time slowdown for prefix queries?
Has this been discovered?




[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-15 Thread Walter Underwood (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868973#comment-15868973
 ] 

Walter Underwood commented on SOLR-10130:
-

I don't have hard numbers, but core recovery after a restart with 6.4.0 was 
taking a really long time. Maybe 30 minutes. Back-reved to 6.3.0, it is maybe 
five minutes.





[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-15 Thread Henrik (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868222#comment-15868222
 ] 

Henrik commented on SOLR-10130:
---

We just deployed the latest from branch_6_4 (a9eb001f44) and our systems are 
performing normally again.  Thanks for your work on this [~ab]!




[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-15 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867912#comment-15867912
 ] 

Andrzej Bialecki  commented on SOLR-10130:
--

You can of course build Solr yourself from {{branch_6_4}}, which contains the 
patch. I don't know of any specific timeline for 6.4.2, but this is a pretty 
serious issue so I think we should do it soon - let's discuss this on the 
mailing lists.




[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-15 Thread bidorbuy (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867900#comment-15867900
 ] 

bidorbuy commented on SOLR-10130:
-

[~ab] would I have to do a Solr build myself to get the patch in or should I 
rather wait for 6.4.2 (if so, any indication of when it would be released)?




[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-15 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867867#comment-15867867
 ] 

ASF subversion and git services commented on SOLR-10130:


Commit b6f49dc1fb4ad6ef890ae1d09f6d4c0584bb6f64 in lucene-solr's branch 
refs/heads/master from [~ab]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=b6f49dc ]

SOLR-10130 Serious performance degradation in Solr 6.4.1 due to the
new metrics collection.





[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-15 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867817#comment-15867817
 ] 

ASF subversion and git services commented on SOLR-10130:


Commit 835c96ba97a01c61978535c0e8fe34708755dc28 in lucene-solr's branch 
refs/heads/branch_6x from [~ab]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=835c96b ]

SOLR-10130 Serious performance degradation in Solr 6.4.1 due to the
new metrics collection.





[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-15 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867744#comment-15867744
 ] 

ASF subversion and git services commented on SOLR-10130:


Commit a9eb001f44ca846b64d4ed6e46af316fe12ce3d0 in lucene-solr's branch 
refs/heads/branch_6_4 from [~ab]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a9eb001 ]

SOLR-10130 Serious performance degradation in Solr 6.4.1 due to the
new metrics collection.





[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-15 Thread bidorbuy (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867647#comment-15867647
 ] 

bidorbuy commented on SOLR-10130:
-

Same issue here. Solr 6.2.0 worked perfectly fine, but CPU is thrashing on 
Solr 6.4.1. I didn't see this bug report and logged a duplicate, 
https://issues.apache.org/jira/browse/SOLR-10140, which shows the slowdown in 
comparison.

In our case, Solr 6.4.1 works perfectly fine under production load for about 1 
hour and then CPU starts thrashing. From the New Relic reports you will see 
that Solr 6.4.1 consumes substantially more CPU than prior versions.




[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-14 Thread Ere Maijala (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867402#comment-15867402
 ] 

Ere Maijala commented on SOLR-10130:


I don't have proper benchmarks at hand, but I can support others' findings 
about the serious query performance degradation. I suppose it's easily 
overlooked when testing with light concurrency, but with enough concurrent 
queries being served it gets CPU-heavy. We use queries with a lot of filters, 
so that may play a role too. I'll see if I can come up with reproducible-enough 
test results from our actual queries.




[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-14 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866888#comment-15866888
 ] 

Ishan Chattopadhyaya commented on SOLR-10130:
-

{code}
Prefix query times for 6.4.1 with SOLR-10130 patch
-
java -cp 
target/org.apache.solr.tests.upgradetests-0.0.1-SNAPSHOT-jar-with-dependencies.jar:.
 org.apache.solr.tests.upgradetests.SimpleBenchmarks -v 
72f75b2503fa0aa4f0aff76d439874feb923bb0e -patchUrl 
https://issues.apache.org/jira/secure/attachment/12852444/SOLR-10130.patch 
-Nodes 1 -Shards 1 -Replicas 1 -numDocs 1 -threads 4 -benchmarkType 
generalQuerying

Got results for prefix queries: 1
Max time (prefix queries): 1716
Total time (prefix queries): 852266
{code}




[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-14 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866806#comment-15866806
 ] 

Ishan Chattopadhyaya commented on SOLR-10130:
-

I could reproduce a 1.6x slowdown for prefix queries.

Benchmarking suite: 
https://github.com/chatman/solr-upgrade-tests/blob/master/BENCHMARKS.md
Environment: packet.net, Type 0 server 
(https://www.packet.net/bare-metal/servers/type-0/)
{code}
Prefix query times for 6.4.1

java -cp 
target/org.apache.solr.tests.upgradetests-0.0.1-SNAPSHOT-jar-with-dependencies.jar:.
 org.apache.solr.tests.upgradetests.SimpleBenchmarks -v 
72f75b2503fa0aa4f0aff76d439874feb923bb0e -Nodes 1 -Shards 1 -Replicas 1 
-numDocs 1 -threads 4 -benchmarkType generalQuerying

Got results for prefix queries: 1
Max time (prefix queries): 2156ms
Total time (prefix queries): 1324856ms

Prefix query times for 6.3.0

java -cp 
target/org.apache.solr.tests.upgradetests-0.0.1-SNAPSHOT-jar-with-dependencies.jar:.
 org.apache.solr.tests.upgradetests.SimpleBenchmarks -v 
6fa26fe8553b7b65dee96da741f2c1adf4cb6216 -patchUrl 
http://147.75.108.131/LUCENE-7651.patch -Nodes 1 -Shards 1 -Replicas 1 -numDocs 
1 -threads 4 -benchmarkType generalQuerying

Got results for prefix queries: 1
Max time (prefix queries): 1358ms
Total time (prefix queries): 839534ms
{code}

Notes:
1. The -threads parameter here sets the number of indexing threads; the number 
of querying threads is 4 times that, i.e. 16 in this case.
2. Total time is the sum of all times, as reported in the response header's 
"QTime". Max time is the QTime for the worst performing query.
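
As a rough illustration of how these two figures relate to the "1.6x" claim, here is a minimal Java sketch. It assumes per-query QTime values are available as an array; the class and method names (QTimeSummary, totalMillis, maxMillis, slowdown) are made up for this example and are not part of the benchmarking suite. The sample array in main is invented; only the totals and maxima passed to slowdown come from the output above.

```java
import java.util.Arrays;

// Hypothetical helper, not taken from solr-upgrade-tests.
final class QTimeSummary {

    /** "Total time": the sum of all per-query QTime values, in milliseconds. */
    static long totalMillis(long[] qTimes) {
        return Arrays.stream(qTimes).sum();
    }

    /** "Max time": the QTime of the worst performing query (0 if there are none). */
    static long maxMillis(long[] qTimes) {
        return Arrays.stream(qTimes).max().orElse(0L);
    }

    /** Ratio of a candidate figure to a baseline figure; above 1.0 means slower. */
    static double slowdown(long baseline, long candidate) {
        return (double) candidate / baseline;
    }

    public static void main(String[] args) {
        long[] sample = {12, 7, 30, 5}; // invented QTimes, for illustration only
        System.out.println("total=" + totalMillis(sample) + "ms max=" + maxMillis(sample) + "ms");

        // Figures reported above: 6.3.0 as baseline, 6.4.1 as candidate.
        System.out.printf("total time ratio: %.2f%n", slowdown(839534L, 1324856L));
        System.out.printf("max time ratio:   %.2f%n", slowdown(1358L, 2156L));
    }
}
```

Both ratios come out a little under 1.6, which matches the slowdown quoted at the top of this comment.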




[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-14 Thread Walter Underwood (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866092#comment-15866092
 ] 

Walter Underwood commented on SOLR-10130:
-

I have a JMeter-based load script I can share. It replays access logs. I reload 
the collection to clear caches, run warming queries, then run queries at a 
controlled rate. Afterwards, it calculates percentiles.

This was a test of 6.4.1. Really slow. The errors are usually log lines with 
queries so long that they are truncated and end up with bad syntax. There is 
one column per request handler, so these results are for /auto, /mobile, 
/select, and /srp.

Mon Feb 13 12:01:29 PST 2017 ; INFO testing solr-cloud.test.cheggnet.com:8983
Mon Feb 13 12:01:29 PST 2017 ; INFO testing with 2000 requests/min
Mon Feb 13 12:01:29 PST 2017 ; INFO testing with 24 requests
Mon Feb 13 12:01:29 PST 2017 : splitting log into cache warming (first 2000 
lines) and benchmark for 
/home/wunder/2016-12-12-peak-questions-traffic-clean.log
Mon Feb 13 12:01:36 PST 2017 : starting cache warming to 
solr-cloud.test.cheggnet.com:8983
Mon Feb 13 12:24:29 PST 2017 : starting benchmarking to 
solr-cloud.test.cheggnet.com:8983
Mon Feb 13 12:24:29 PST 2017 : benchmark should run for 120 minutes
Mon Feb 13 12:24:29 PST 2017 : to get a count of requests sent so far, use "wc 
-l out-32688.jtl"
Mon Feb 13 14:55:01 PST 2017 : WARNING 207 error responses from 
solr-cloud.test.cheggnet.com
Mon Feb 13 14:55:01 PST 2017 : INFO Removing 207 error responses from JMeter 
output file before analysis
Mon Feb 13 14:55:01 PST 2017 : analyzing results
/home/wunder/search-test/load-test
Mon Feb 13 14:55:04 PST 2017 : 25th percentiles are 3151.0,3389.0,9329.0,5647.0
Mon Feb 13 14:55:04 PST 2017 : medians are 6101.0,10579.0,18692.0,8780.0
Mon Feb 13 14:55:04 PST 2017 : 75th percentiles are 
6871.0,12499.0,25000.0,12580.0
Mon Feb 13 14:55:04 PST 2017 : 90th percentiles are 
7593.0,13481.0,27623.0,14068.0
Mon Feb 13 14:55:04 PST 2017 : 95th percentiles are 
8079.0,14039.0,28566.0,16606.0
Mon Feb 13 14:55:04 PST 2017 : full results are in test.csv
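
For readers who want to reproduce this kind of summary from their own response-time samples, here is a minimal Java sketch of a percentile calculation. It uses the nearest-rank definition, which is an assumption: the load script's own method is not shown in this comment and may interpolate differently. The class name LatencyPercentiles and the sample values are invented for the example.

```java
import java.util.Arrays;

// Hypothetical helper; not the load script's actual code.
final class LatencyPercentiles {

    /**
     * Nearest-rank percentile: the smallest sample such that at least
     * p percent of all samples are less than or equal to it.
     */
    static double percentile(double[] samples, double p) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        double[] sorted = samples.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length); // 1-based rank
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        double[] responseTimes = {50, 10, 40, 20, 30}; // invented samples
        for (double p : new double[] {25, 50, 75, 90, 95}) {
            System.out.println(p + "th percentile: " + percentile(responseTimes, p));
        }
    }
}
```

Running it once per request handler would give one column per handler, as in the output above.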




[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-14 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866077#comment-15866077
 ] 

Ishan Chattopadhyaya commented on SOLR-10130:
-

Thanks Yonik. I'm working on query performance benchmarks for this.

> Serious performance degradation in Solr 6.4.1 due to the new metrics 
> collection
> ---
>
> Key: SOLR-10130
> URL: https://issues.apache.org/jira/browse/SOLR-10130
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: metrics
>Affects Versions: 6.4.1
> Environment: Centos 7, OpenJDK 1.8.0 update 111
>Reporter: Ere Maijala
>Assignee: Andrzej Bialecki 
>Priority: Blocker
>  Labels: perfomance
> Fix For: master (7.0), 6.4.2
>
> Attachments: SOLR-10130.patch, solr-8983-console-f1.log
>
>
> We've stumbled on serious performance issues after upgrading to Solr 6.4.1. 
> Looks like the new metrics collection system in MetricsDirectoryFactory is 
> causing a major slowdown. This happens with an index configuration that, as 
> far as I can see, has no metrics-specific configuration and uses 
> luceneMatchVersion 5.5.0. In practice a moderate load will completely bog 
> down the server, with Solr threads constantly using up all CPU capacity (600% 
> on a 6-core machine) under a load where we normally see an average load of 
> < 50%.
> I took stack traces (I'll attach them) and noticed that the threads are 
> spending time in com.codahale.metrics.Meter.mark. I tested building Solr 
> 6.4.1 with the metrics collection disabled in MetricsDirectoryFactory getByte 
> and getBytes methods and was unable to reproduce the issue.
> As far as I can see there are several issues:
> 1. Collecting metrics on every single byte read is slow.
> 2. Having it enabled by default is not a good idea.
> 3. The comment "enable coarse-grained metrics by default" at 
> https://github.com/apache/lucene-solr/blob/branch_6x/solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java#L104
>  implies that only coarse-grained metrics should be enabled by default, and 
> this contradicts collecting metrics on every single byte read.






[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-14 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866067#comment-15866067
 ] 

Yonik Seeley commented on SOLR-10130:
-

It's just a matter of how many small IOs are involved in your request.
I was easily able to reproduce a 5x slowdown with a prefix query that matches 
many terms.







[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-14 Thread Walter Underwood (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865980#comment-15865980
 ] 

Walter Underwood commented on SOLR-10130:
-

Sorry, the 6000 rpm was with 6.2.1, not 6.4.0.

I've backrev'ed the cluster to 6.3.0 and I'll be running load benchmarks today.








[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-14 Thread Henrik (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865945#comment-15865945
 ] 

Henrik commented on SOLR-10130:
---

We've also seen performance degradation with SolrCloud on 6.4.1, as I've posted 
on solr-user ( 
http://lucene.472066.n3.nabble.com/Performance-degradation-after-upgrading-from-6-2-1-to-6-4-1-td4320226.html
 ):

Here are a couple of graphs.  As you can see, 6.4.1 was introduced 2/10 
12:00: 

https://www.dropbox.com/s/qrc0wodain50azz/solr1.png?dl=0
https://www.dropbox.com/s/sdk30imm8jlomz2/solr2.png?dl=0
https://www.dropbox.com/s/rgd8bq86i3c5mga/solr2b.png?dl=0

These are two very different usage scenarios: 

* Solr1 has constant updates and very volatile data (30 minutes TTL, 20 
shards with no replicas, across 8 servers).  Requests in the 99th percentile 
went from ~400ms to 1000-1500ms. (Hystrix cutoff at 1.5s) 

* Solr2 is a more traditional instance with long-lived data (updated once a 
day, 24 shards with 2 replicas, across 8 servers).  Requests in the 99th 
percentile went from ~400ms to at least 1s. (Hystrix cutoff at 1s) 








[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-14 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865912#comment-15865912
 ] 

Andrzej Bialecki  commented on SOLR-10130:
--

I haven't been able to reproduce such a drastic slowdown using simple benchmarks 
- example results from indexing using {{post}} tool, fairly representative of 
several runs on each branch:
{code}
* branch_6_3

* branch_6_4

* jira/solr-10130

{code}
Profiler indeed shows that one of the hotspots on branch_6_4 is the 
{{Meter.mark}} code that is called in 
{{org.apache.solr.core.MetricsDirectoryFactory$MetricsInput.readByte}}. In my 
test the profiler showed that this consumes ~ 3% CPU, which is indeed something 
that we should avoid and turn off by default.

However, this still doesn't explain the order of magnitude slowdown reported 
above.

[~emaijala] and [~wunder] - please apply the above patch in your environment 
and see what the impact is. It makes sense to make this change anyway, so I'm 
going to apply this or a similar version to all affected branches, but maybe 
there's more we can do here.
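
For illustration only, here is a minimal sketch of the pattern under discussion: a wrapper that updates a meter on every single {{readByte}} call. It does not use Solr or the codahale library. {{ByteSource}}, {{CountingMeter}}, {{ArraySource}} and {{MeteredSource}} are hypothetical stand-ins invented for this example, and the counter here is probably cheaper than a real {{Meter.mark}}, so treat it as a picture of the call pattern rather than a measurement.

```java
import java.util.concurrent.atomic.LongAdder;

public class PerByteMeteringSketch {

    /** Minimal byte source; a hypothetical stand-in for an index input. */
    interface ByteSource {
        byte readByte();
    }

    /** Hypothetical stand-in for a meter: it only counts marked events. */
    static final class CountingMeter {
        private final LongAdder count = new LongAdder();
        void mark() { count.increment(); }
        long getCount() { return count.sum(); }
    }

    /** Reads bytes from an in-memory array. */
    static final class ArraySource implements ByteSource {
        private final byte[] data;
        private int pos;
        ArraySource(byte[] data) { this.data = data; }
        @Override public byte readByte() { return data[pos++]; }
    }

    /** Decorator that updates the meter once per byte read. */
    static final class MeteredSource implements ByteSource {
        private final ByteSource delegate;
        private final CountingMeter meter;
        MeteredSource(ByteSource delegate, CountingMeter meter) {
            this.delegate = delegate;
            this.meter = meter;
        }
        @Override public byte readByte() {
            meter.mark(); // one shared-counter update per byte
            return delegate.readByte();
        }
    }

    /** Reads n bytes one at a time and returns their sum. */
    static long sum(ByteSource in, int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += in.readByte();
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] data = new byte[1 << 20];
        CountingMeter meter = new CountingMeter();
        sum(new MeteredSource(new ArraySource(data), meter), data.length);
        System.out.println("bytes read: " + data.length
                + ", meter updates: " + meter.getCount());
    }
}
```

The point of the sketch is only that the number of meter updates equals the number of bytes read, so any per-update cost is multiplied by the read volume of the request.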

> Serious performance degradation in Solr 6.4.1 due to the new metrics 
> collection
> ---
>
> Key: SOLR-10130
> URL: https://issues.apache.org/jira/browse/SOLR-10130
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: metrics
> Affects Versions: 6.4.1
> Environment: Centos 7, OpenJDK 1.8.0 update 111
> Reporter: Ere Maijala
> Assignee: Andrzej Bialecki
> Priority: Blocker
> Labels: perfomance
> Attachments: SOLR-10130.patch, solr-8983-console-f1.log






[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-14 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865584#comment-15865584
 ] 

Ishan Chattopadhyaya commented on SOLR-10130:
-

6.3.0 was faster (same as 6.4.1 with patch).

{code}
java -cp target/org.apache.solr.tests.upgradetests-0.0.1-SNAPSHOT-jar-with-dependencies.jar:. \
  org.apache.solr.tests.upgradetests.SimpleBenchmarks \
  -v 6fa26fe8553b7b65dee96da741f2c1adf4cb6216 \
  -patchUrl http://147.75.108.131/LUCENE-7651.patch \
  -Nodes 1 -Shards 1 -Replicas 1 -numDocs 10 -threads 6 \
  -benchmarkType generalIndexing

Indexing times: 168,167
{code}


> Serious performance degradation in Solr 6.4.1 due to the new metrics 
> collection
> ---
>
> Key: SOLR-10130
> URL: https://issues.apache.org/jira/browse/SOLR-10130
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: metrics
> Affects Versions: 6.4.1
> Environment: Centos 7, OpenJDK 1.8.0 update 111
> Reporter: Ere Maijala
> Assignee: Andrzej Bialecki
> Priority: Blocker
> Labels: perfomance
> Attachments: SOLR-10130.patch, solr-8983-console-f1.log






[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-14 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865510#comment-15865510
 ] 

Ishan Chattopadhyaya commented on SOLR-10130:
-

6.4.0 shows very similar numbers to 6.4.1.

{code}
6.4.0 Without patch

java -cp target/org.apache.solr.tests.upgradetests-0.0.1-SNAPSHOT-jar-with-dependencies.jar:. \
  org.apache.solr.tests.upgradetests.SimpleBenchmarks \
  -v 680153de29c5b01d4a8afad88d4a7b84ab01e145 \
  -Nodes 1 -Shards 1 -Replicas 1 -numDocs 10 -threads 6 \
  -benchmarkType generalIndexing

Indexing times: 191,184
{code}

> Serious performance degradation in Solr 6.4.1 due to the new metrics 
> collection
> ---
>
> Key: SOLR-10130
> URL: https://issues.apache.org/jira/browse/SOLR-10130
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: metrics
> Affects Versions: 6.4.1
> Environment: Centos 7, OpenJDK 1.8.0 update 111
> Reporter: Ere Maijala
> Assignee: Andrzej Bialecki
> Priority: Blocker
> Labels: perfomance
> Attachments: SOLR-10130.patch, solr-8983-console-f1.log






[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-14 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865492#comment-15865492
 ] 

Andrzej Bialecki  commented on SOLR-10130:
--

I would've expected a much larger difference with this patch, if this indeed 
was the cause of the slowdown - the patch completely turns off metrics 
collection at directory and index level.

bq. With 6.4.0, we were handling 6000 requests/minute. With 6.4.1 it is 1000 rpm
[~wunder] This is odd, too - the same metrics code is present in both 6.4.1 and 
6.4.0, with the same defaults, so I would expect that both versions should show 
similar performance. Could you please collect some stacktraces (or sample / 
profile) to verify that you see the same hotspots as [~emaijala] ?

> Serious performance degradation in Solr 6.4.1 due to the new metrics 
> collection
> ---
>
> Key: SOLR-10130
> URL: https://issues.apache.org/jira/browse/SOLR-10130
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: metrics
> Affects Versions: 6.4.1
> Environment: Centos 7, OpenJDK 1.8.0 update 111
> Reporter: Ere Maijala
> Assignee: Andrzej Bialecki
> Priority: Blocker
> Labels: perfomance
> Attachments: SOLR-10130.patch, solr-8983-console-f1.log






[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-14 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865470#comment-15865470
 ] 

Ishan Chattopadhyaya commented on SOLR-10130:
-

I ran some benchmarks, with and without this patch.

{code}
Benchmarking suite: https://github.com/chatman/solr-upgrade-tests/blob/master/BENCHMARKS.md
Environment: packet.net, Type 0 server (https://www.packet.net/bare-metal/servers/type-0/)

6.4.1 Without patch

java -cp target/org.apache.solr.tests.upgradetests-0.0.1-SNAPSHOT-jar-with-dependencies.jar:. \
  org.apache.solr.tests.upgradetests.SimpleBenchmarks \
  -v 72f75b2503fa0aa4f0aff76d439874feb923bb0e \
  -Nodes 1 -Shards 1 -Replicas 1 -numDocs 10 -threads 6 \
  -benchmarkType generalIndexing

Indexing times: 188,190

6.4.1 With patch

java -cp target/org.apache.solr.tests.upgradetests-0.0.1-SNAPSHOT-jar-with-dependencies.jar:. \
  org.apache.solr.tests.upgradetests.SimpleBenchmarks \
  -v 72f75b2503fa0aa4f0aff76d439874feb923bb0e \
  -patchUrl https://issues.apache.org/jira/secure/attachment/12852444/SOLR-10130.patch \
  -Nodes 1 -Shards 1 -Replicas 1 -numDocs 10 -threads 6 \
  -benchmarkType generalIndexing

Indexing times: 171,165
{code}
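
To put the two pairs of indexing times above side by side, here is a small sketch that averages each pair and computes the relative reduction. The units of the reported times are not stated in the comment, so the result is only a ratio; the class and method names are made up for this example.

```java
import java.util.Locale;

public class IndexingTimeComparison {

    /** Arithmetic mean of the given values. */
    static double mean(double... xs) {
        double total = 0;
        for (double x : xs) {
            total += x;
        }
        return total / xs.length;
    }

    /** Relative reduction of {@code after} versus {@code before}, in percent. */
    static double percentReduction(double before, double after) {
        return (before - after) / before * 100.0;
    }

    public static void main(String[] args) {
        double withoutPatch = mean(188, 190); // 6.4.1 without patch
        double withPatch = mean(171, 165);    // 6.4.1 with patch
        System.out.println(String.format(Locale.ROOT,
                "mean without patch: %.1f, with patch: %.1f, reduction: %.1f%%",
                withoutPatch, withPatch, percentReduction(withoutPatch, withPatch)));
    }
}
```

With these inputs the means are 189 and 168, a reduction of roughly 11%, which is far from the order-of-magnitude slowdown reported elsewhere in this issue. Two runs per configuration is a very small sample, so the figure is indicative at best.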

> Serious performance degradation in Solr 6.4.1 due to the new metrics 
> collection
> ---
>
> Key: SOLR-10130
> URL: https://issues.apache.org/jira/browse/SOLR-10130
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: metrics
> Affects Versions: 6.4.1
> Environment: Centos 7, OpenJDK 1.8.0 update 111
> Reporter: Ere Maijala
> Assignee: Andrzej Bialecki
> Priority: Blocker
> Labels: perfomance
> Attachments: SOLR-10130.patch, solr-8983-console-f1.log






[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-13 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15864585#comment-15864585
 ] 

Andrzej Bialecki  commented on SOLR-10130:
--

bq. Does disabling metrics fix it or do we need to go back to 6.4.0?
Unfortunately no, these metrics are always turned on both in 6.4.0 and in 
6.4.1. I'll upload a patch that disables this by default and allows turning it 
on via a solrconfig.xml knob.

> Serious performance degradation in Solr 6.4.1 due to the new metrics 
> collection
> ---
>
> Key: SOLR-10130
> URL: https://issues.apache.org/jira/browse/SOLR-10130
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: metrics
> Affects Versions: 6.4.1
> Environment: Centos 7, OpenJDK 1.8.0 update 111
> Reporter: Ere Maijala
> Assignee: Andrzej Bialecki
> Priority: Blocker
> Labels: perfomance
> Attachments: solr-8983-console-f1.log






[jira] [Commented] (SOLR-10130) Serious performance degradation in Solr 6.4.1 due to the new metrics collection

2017-02-13 Thread Walter Underwood (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15863948#comment-15863948
 ] 

Walter Underwood commented on SOLR-10130:
-

I’m seeing similar problems here. With 6.4.0, we were handling 6000 
requests/minute. With 6.4.1 it is 1000 rpm with median response times around 
2.5 seconds. I also switched to the G1 collector. I’m going to back that out 
and retest today to see if the performance comes back.

Does disabling metrics fix it or do we need to go back to 6.4.0?

wunder

> Serious performance degradation in Solr 6.4.1 due to the new metrics 
> collection
> ---
>
> Key: SOLR-10130
> URL: https://issues.apache.org/jira/browse/SOLR-10130
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: metrics
> Affects Versions: 6.4.1
> Environment: Centos 7, OpenJDK 1.8.0 update 111
> Reporter: Ere Maijala
> Priority: Blocker
> Labels: perfomance
> Attachments: solr-8983-console-f1.log


