That is only INFO level, so I suspect it is not bad...

If that Overseer closed, another node should have picked up where it left
off. Do you see that in another node's log?

Generally an Overseer close means a node or cluster restart.

This can cause a lot of DOWN state publishing. If it's a cluster restart, a
lot of those DOWN publishes are not processed until the cluster is started
back up - which can lead to the Overseer being overwhelmed and things not
responding fast enough. You should be able to see an active Overseer
working on publishing those states though (it shows that at INFO logging
level).

If the Overseer is simply down and another did not take over, that is just
some kind of bug. If it's overwhelmed, 5.x is much, much faster,
and SOLR-7281 should also help, but that is no real help for 4.x at this
point.

Anyway, the key is: what is the active Overseer doing? Is there no active
Overseer? Or is it busy trying to push through a backlog of operations?
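
One rough way to answer that is to pull every Overseer line out of each
node's log and compare timestamps. The sketch below is only an illustration:
it assumes the log layout matches the single sample line quoted further down
in this thread (date, time, thread name, message). The function name and the
second sample line are made up for the example.

```python
import re

# Assumed layout, inferred from the one sample line quoted in this thread:
# "<yyMMdd> <HH:mm:ss.SSS> <thread> <message>"
LINE_RE = re.compile(r"^(\d{6}) (\d{2}:\d{2}:\d{2}\.\d{3}) (\S+) (.*)$")


def overseer_events(lines):
    """Yield (date, time, thread, message) for lines whose message mentions Overseer."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and "Overseer" in m.group(4):
            yield m.groups()


sample = [
    # Taken from the quoted message in this thread.
    "160201 11:26:36.380 localhost-startStop-1 Overseer (id=null) closing",
    # Hypothetical unrelated line, included only to show the filtering.
    "160201 11:26:37.001 qtp-1 some unrelated message",
]

events = list(overseer_events(sample))
for date, time, thread, message in events:
    print(date, time, thread, message)
```

Running this over the log from every node around the time of the close should
show whether any node logged Overseer activity after that point, or whether
the Overseer lines simply stop.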

- Mark

On Wed, Feb 3, 2016 at 8:46 PM hawk <antonyaugus...@hotmail.com> wrote:

> Thanks Mark.
>
> I was able to search "Overseer" in the solr logs around the time frame of
> the condition. This particular message was from the leader node of the
> shard.
>
> 160201 11:26:36.380 localhost-startStop-1 Overseer (id=null) closing
>
> Also I found this message in the zookeeper logs.
>
> 11:26:35,218 [myid:02] - INFO [ProcessThread(sid:2
> cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when
> processing sessionid:0x15297c0fe2e3f2d type:create cxid:0x3
> zxid:0xf0001be48
> txntype:-1 reqpath:n/a Error Path:/overseer Error:KeeperErrorCode =
> NodeExists for /overseer
>
> Any thoughts what these messages suggest?
>
-- 
- Mark
about.me/markrmiller
