On 6/8/2018 8:17 AM, Markus Jelsma wrote:
> Our local test environment mini cluster goes nuts right after startup. It is 
> a two node/shard/replica collection that starts up normally if only one node 
> starts up.  But as soon as the second node attempts to join the cluster, both 
> nodes go crazy, creating thousands of threads with identical stack traces.
>
> "qtp1458849419-4738" - Thread t@4738
>    java.lang.Thread.State: TIMED_WAITING
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for <6ee32168> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
>         at org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
>         at org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:600)
>         at org.eclipse.jetty.util.thread.QueuedThreadPool.access$800(QueuedThreadPool.java:49)
>         at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:663)
>         at java.lang.Thread.run(Thread.java:748)
>
>    Locked ownable synchronizers:
>         - None
>
> It does not always happen, but most of the time I am unable to boot the 
> cluster normally. Sometimes, apparently right now for the first time, the GUI 
> is still accessible.
>
> Is this a known issue?

It's not a problem that I've heard of.  There are no Solr classes in the
stack trace, only Jetty and Java classes.  I won't try to tell you that a
bug in Solr can't be the root cause, because it definitely can.  The
threads appear to be created by Jetty, but the supplied info doesn't
indicate WHY it's happening.

Presumably there's a previous version you've used where this problem did
NOT happen.  What version would that be?

Can you share the solr.log file from both nodes when this happens? 
There might be a clue there.

It sounds like you probably have a small number of collections in the
dev cluster.  Can you confirm that?

Thanks,
Shawn
