Still unclear how and where the same session id is being re-used. Any thoughts ?
The ROOT region was just bouncing around state's since the RS would be marked as dead whenever these timeouts occur and thus the ROOT region will be moved to a different server. Once the ROOT gets assigned to a RS which had less handlers (< 15K), it stopped bouncing around. I am surprised bumping up handlers and having 0 traffic on the cluster can cause this issue. -Viral On Mon, Apr 29, 2013 at 1:23 PM, Viral Bajaria <viral.baja...@gmail.com>wrote: > On Mon, Apr 29, 2013 at 2:49 AM, Ted Yu <yuzhih...@gmail.com> wrote: > >> After each zookeeper reconnect, I saw same session Id (0x703e...) >> >> What version of zookeeper are you using ? Can you search zookeeper log >> for this session Id to see what happened ? >> >> Thanks >> > > Zookeeper version is 3.4.5, following are the logs from 2 zookeeper > servers. The first one was the first time ever the regionserver connected > to ZK and after that all of them are retries. I doubt the issue is on the > ZK side since I have 3 other services running which run fine with the same > quorum and have no issues. > > I feel I am hitting some other bug with HBase when the number of handlers > is increased by a lot. Anyone seen a high number of handlers in any > production deployment out there ? > > Thanks, > Viral > > ===server with first session=== > 2013-04-29 07:40:55,574 [myid:1072011376] - INFO > [CommitProcessor:1072011376:ZooKeeperServer@595] - Established session > 0x703e48a8cfd81be6 with negotiated timeout 40000 for client / > 10.155.234.3:53814 > EndOfStreamException: Unable to read additional data from client sessionid > 0x703e48a8cfd81be6, likely client has closed socket > 2013-04-29 07:43:47,737 [myid:1072011376] - INFO [NIOServerCxn.Factory: > 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for > client /10.155.234.3:53814 which had sessionid 0x703e48a8cfd81be6 > 2013-04-29 07:46:29,679 [myid:1072011376] - INFO [NIOServerCxn.Factory: > 0.0.0.0/0.0.0.0:2181:ZooKeeperServer@832] - Client attempting to renew > session 0x703e48a8cfd81be6 at /10.155.234.3:53831 > 2013-04-29 07:46:29,679 [myid:1072011376] - INFO [NIOServerCxn.Factory: > 0.0.0.0/0.0.0.0:2181:Learner@107] - Revalidating client: > 0x703e48a8cfd81be6 > 2013-04-29 07:46:29,680 [myid:1072011376] - INFO > [QuorumPeer[myid=1072011376]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@595] - > Established session 0x703e48a8cfd81be6 with negotiated timeout 40000 for > client /10.155.234.3:53831 > EndOfStreamException: Unable to read additional data from client sessionid > 0x703e48a8cfd81be6, likely client has closed socket > 2013-04-29 07:49:10,401 [myid:1072011376] - INFO [NIOServerCxn.Factory: > 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for > client /10.155.234.3:53831 which had sessionid 0x703e48a8cfd81be6 > 2013-04-29 07:55:53,441 [myid:1072011376] - INFO [NIOServerCxn.Factory: > 0.0.0.0/0.0.0.0:2181:ZooKeeperServer@832] - Client attempting to renew > session 0x703e48a8cfd81be6 at /10.155.234.3:53854 > 2013-04-29 07:55:53,441 [myid:1072011376] - INFO [NIOServerCxn.Factory: > 0.0.0.0/0.0.0.0:2181:Learner@107] - Revalidating client: > 0x703e48a8cfd81be6 > 2013-04-29 07:55:53,442 [myid:1072011376] - INFO > [QuorumPeer[myid=1072011376]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@595] - > Established session 0x703e48a8cfd81be6 with negotiated timeout 40000 for > client /10.155.234.3:53854 > EndOfStreamException: Unable to read additional data from client sessionid > 0x703e48a8cfd81be6, likely client has closed socket > 2013-04-29 07:58:33,947 [myid:1072011376] - INFO [NIOServerCxn.Factory: > 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for > client /10.155.234.3:53854 which had sessionid 0x703e48a8cfd81be6 > > ===server with subsequent sessions=== > 2013-04-29 07:44:28,733 [myid:54242244232] - INFO [NIOServerCxn.Factory: > 0.0.0.0/0.0.0.0:2181:ZooKeeperServer@832] - Client attempting to renew > session 0x703e48a8cfd81be6 at /10.155.234.3:51353 > 2013-04-29 07:44:28,734 [myid:54242244232] - INFO [NIOServerCxn.Factory: > 0.0.0.0/0.0.0.0:2181:ZooKeeperServer@595] - Established session > 0x703e48a8cfd81be6 with negotiated timeout 40000 for client / > 10.155.234.3:51353 > EndOfStreamException: Unable to read additional data from client sessionid > 0x703e48a8cfd81be6, likely client has closed socket > 2013-04-29 07:46:29,206 [myid:54242244232] - INFO [NIOServerCxn.Factory: > 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for > client /10.155.234.3:51353 which had sessionid 0x703e48a8cfd81be6 > 2013-04-29 07:53:52,553 [myid:54242244232] - INFO [NIOServerCxn.Factory: > 0.0.0.0/0.0.0.0:2181:ZooKeeperServer@832] - Client attempting to renew > session 0x703e48a8cfd81be6 at /10.155.234.3:51376 > 2013-04-29 07:53:52,553 [myid:54242244232] - INFO [NIOServerCxn.Factory: > 0.0.0.0/0.0.0.0:2181:ZooKeeperServer@595] - Established session > 0x703e48a8cfd81be6 with negotiated timeout 40000 for client / > 10.155.234.3:51376 > EndOfStreamException: Unable to read additional data from client sessionid > 0x703e48a8cfd81be6, likely client has closed socket > 2013-04-29 07:55:53,049 [myid:54242244232] - INFO [NIOServerCxn.Factory: > 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for > client /10.155.234.3:51376 which had sessionid 0x703e48a8cfd81be6 > 2013-04-29 08:02:36,000 [myid:54242244232] - INFO > [SessionTracker:ZooKeeperServer@325] - Expiring session > 0x703e48a8cfd81be6, timeout of 40000ms exceeded > 2013-04-29 08:02:36,001 [myid:54242244232] - INFO > [ProcessThread(sid:54242244232 cport:-1)::PrepRequestProcessor@476] - > Processed session termination for sessionid: 0x703e48a8cfd81be6 > 2013-04-29 08:02:36,283 [myid:54242244232] - INFO [NIOServerCxn.Factory: > 0.0.0.0/0.0.0.0:2181:ZooKeeperServer@832] - Client attempting to renew > session 0x703e48a8cfd81be6 at /10.155.234.3:51398 > 2013-04-29 08:02:36,284 [myid:54242244232] - INFO [NIOServerCxn.Factory: > 0.0.0.0/0.0.0.0:2181:ZooKeeperServer@588] - Invalid session > 0x703e48a8cfd81be6 for client /10.155.234.3:51398, probably expired > 2013-04-29 08:02:36,284 [myid:54242244232] - INFO [NIOServerCxn.Factory: > 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for > client /10.155.234.3:51398 which had sessionid 0x703e48a8cfd81be6 > >