Shawn,
thank you, your help is very much appreciated.

We had already changed the OS configuration before the last crash, so I
think that the problem is not there.

ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 257683
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65535
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 65535
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
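
One caveat: `ulimit -a` reports the limits of the shell it is run in, which
are not necessarily the limits of the Solr process that is already running.
On Linux the limits of a running process can be read from
/proc/<pid>/limits. A minimal sketch for parsing that file is below; the
function name `parse_proc_limits` is only illustrative, and the PID of the
Solr process has to be looked up separately.

```python
import re


def parse_proc_limits(text):
    """Parse the contents of /proc/<pid>/limits (Linux) into a dict.

    Returns {limit name: (soft limit, hard limit)} with values kept as
    strings, since some of them are the literal word "unlimited".
    """
    limits = {}
    for line in text.splitlines()[1:]:  # skip the header row
        # Columns are separated by runs of two or more spaces.
        parts = re.split(r"\s{2,}", line.strip())
        if len(parts) >= 3:
            limits[parts[0]] = (parts[1], parts[2])
    return limits
```

For example, `parse_proc_limits(open("/proc/<pid>/limits").read())` with the
Solr PID filled in, then compare the "Max open files" and "Max processes"
entries with the values above.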


When does Solr try to delete a znode? I'm sorry, I understand nothing about
this process, and it is the only point that seems suspicious to me.
Do you think that it can cause an inconsistency leading to the OOM problem?

> > Just this message below, can you help me to understand what this
> > message means?
> >
> > 2019-12-12 10:00:23,662 [myid:] - INFO  [ProcessThread(sid:0
> > cport:2181)::PrepRequestProcessor@653] - Got user-level KeeperException
> > when processing sessionid:0x1000071b8ec4adb type:delete cxid:0x10
> > zxid:0xafc6 txntype:-1 reqpath:n/a Error
> >
> Path:/overseer_elect/election/72058082471721304-192.168.0.61:8983_solr-n_0000000018
> > Error:KeeperErrorCode = NoNode for
> >
> /overseer_elect/election/72058082471721304-192.168.0.61:8983_solr-n_0000000018
>
> Solr tried to delete a znode from zookeeper and that deletion failed
> because the znode did not exist.
>
> I can't offer much about WHY it didn't exist, but my best guess is that
> it would have been created by the thread that Solr could not start.
>
>

Right after the INFO message above, the ZK log starts logging thousands of
blocks of lines like the one below, where it seems that ZK creates and
closes thousands of sessions.

"""

2019-12-12 10:00:58,591 [myid:] - INFO  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2181:ZooKeeperServer@948] - Client attempting to establish
new session at /192.168.0.31:49351

2019-12-12 10:01:48,038 [myid:] - INFO  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@215] - Accepted socket connection
from /192.168.0.31:50118

2019-12-12 10:09:03,370 [myid:] - INFO  [SyncThread:0:ZooKeeperServer@693]
- Established session 0x1000071b8ec5013 with negotiated timeout 15000 for
client /192.168.0.31:52474

2019-12-12 10:09:45,631 [myid:] - WARN  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2181:NIOServerCnxn@376] - Unable to read additional data
from client sessionid 0x1000071b8ec5013, likely client has closed socket

2019-12-12 10:09:45,631 [myid:] - INFO  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1040] - Closed socket connection for
client /192.168.0.31:52474 which had sessionid 0x1000071b8ec5013

2019-12-12 10:09:58,473 [myid:] - INFO  [SessionTracker:ZooKeeperServer@354]
- Expiring session 0x1000071b8ec5013, timeout of 15000ms exceeded

2019-12-12 10:09:58,473 [myid:] - INFO  [ProcessThread(sid:0
cport:2181)::PrepRequestProcessor@487] - Processed session termination for
sessionid: 0x1000071b8ec5013
"""


Again, I really don't know how ZK and Solr integrate, and I am trying to
follow the logs to find the problem. My application is written in Python
and, as far as I have inspected, it is not the origin of the problem.
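
To follow the ZK log more systematically, something along these lines could
count how many sessions each client address establishes and how many of them
expire. It is only a sketch: the regular expressions are taken from the log
lines quoted above, it assumes each entry is on a single line in the real log
file (the mail client wrapped them here), and the function name
`summarize_sessions` is just illustrative.

```python
import re
from collections import Counter

# Patterns copied from the ZooKeeper log excerpts quoted above.
ESTABLISHED = re.compile(
    r"Established session (0x[0-9a-f]+) with negotiated timeout \d+ "
    r"for client /([\d.]+):\d+"
)
EXPIRING = re.compile(r"Expiring session (0x[0-9a-f]+), timeout of \d+ms exceeded")


def summarize_sessions(lines):
    """Return (established, expired) Counters keyed by client IP."""
    owner = {}               # session id -> client IP that opened it
    established = Counter()
    expired = Counter()
    for line in lines:
        match = ESTABLISHED.search(line)
        if match:
            session_id, ip = match.groups()
            owner[session_id] = ip
            established[ip] += 1
            continue
        match = EXPIRING.search(line)
        if match:
            # Sessions opened before the log starts have no known owner.
            expired[owner.get(match.group(1), "unknown")] += 1
    return established, expired
```

Passing the open log file to `summarize_sessions` and printing
`established.most_common()` would show which host is opening the sessions,
which should tell whether they come from a Solr node or from the Python
application.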


Thank you,
Koji
