Hi, I have a problem with ZooKeeper in Giraph. After the session has been established, the client loses its connection in ~1 minute, although the log shows the negotiated timeout is 600000 ms, i.e. 10 minutes. What are the possible reasons?

14/04/08 16:55:22 INFO mapred.JobClient: Running job: job_201404081444_0018
14/04/08 16:55:22 INFO zookeeper.ClientCnxn: Opening socket connection to server compute-0-13.local/10.1.255.241:22181. Will not attempt to authenticate using SASL (unknown error)
14/04/08 16:55:22 INFO zookeeper.ClientCnxn: Socket connection established to compute-0-13.local/10.1.255.241:22181, initiating session
14/04/08 16:55:22 INFO zookeeper.ClientCnxn: Session establishment complete on server compute-0-13.local/10.1.255.241:22181, sessionid = 0x14543567f5e0009, negotiated timeout = 600000
......
......
14/04/08 16:57:02 INFO job.JobProgressTracker: Data from 8 workers - Compute superstep 2: 0 out of 4847571 vertices computed; 0 out of 64 partitions computed; min free memory on worker 2 - 216.01MB, average 287.75MB
14/04/08 16:57:07 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x14543567f5e0009, likely server has closed socket, closing socket connection and attempting reconnect
14/04/08 16:57:09 INFO zookeeper.ClientCnxn: Opening socket connection to server compute-0-13.local/10.1.255.241:22181. Will not attempt to authenticate using SASL (unknown error)
14/04/08 16:57:09 WARN zookeeper.ClientCnxn: Session 0x14543567f5e0009 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused

I tried tuning the GC settings of Hadoop, but that did not help. Any hints?

Best Regards,
Suijian