Hi,
I have recently run into a ZooKeeper "Client session timed out" problem.
It occurs after Storm has been receiving data from the Kafka spout for a
period of time.
The relevant log output follows.
"2016-04-20 09:29:56 s.k.ZkCoordinator [INFO] Task [1/1] Refreshing partition
manager connections
2016-04-20 09:29:56 s.k.DynamicBrokersReader [INFO] Read partition info from
zookeeper: GlobalPartitionInformation{topic=ods_prod_offer_inst_day,
partitionMap={0=134.130.70.125:9092}}
2016-04-20 09:29:56 s.k.KafkaUtils [INFO] Task [1/1] assigned
[Partition{host=134.130.70.125:9092, topic=ods_prod_offer_inst_day,
partition=0}]
2016-04-20 09:29:56 s.k.ZkCoordinator [INFO] Task [1/1] Deleted partition
managers: []
2016-04-20 09:29:56 s.k.ZkCoordinator [INFO] Task [1/1] New partition managers:
[]
2016-04-20 09:29:56 s.k.ZkCoordinator [INFO] Task [1/1] Finished refreshing
2016-04-20 09:31:00 o.a.s.s.o.a.z.ClientCnxn [INFO] Client session timed out,
have not heard from server in 30476ms for sessionid 0x153f3f2eaae158c, closing
socket connection and attempting reconnect
2016-04-20 09:31:00 o.a.z.ClientCnxn [INFO] Client session timed out, have not
heard from server in 31321ms for sessionid 0x2753a634ca7160d7, closing socket
connection and attempting reconnect"
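
To make the silence intervals in these messages easier to compare against the configured timeouts, here is a minimal sketch that pulls them out of a worker log. It assumes the message wording is exactly as in the excerpt above; the names `TIMEOUT_RE` and `find_session_timeouts` are made up for illustration and are not part of Storm or ZooKeeper.

```python
import re

# Matches the ZooKeeper client message seen in the log excerpt above.
# Assumption: the wording "have not heard from server in <N>ms for
# sessionid <hex>" is stable across the log being scanned.
TIMEOUT_RE = re.compile(
    r"have not heard from server in (\d+)ms for sessionid (0x[0-9a-fA-F]+)"
)


def find_session_timeouts(log_text):
    """Return a list of (silence_ms, session_id) tuples, in log order."""
    # The pasted log is hard-wrapped, so collapse all whitespace runs
    # (including newlines) into single spaces before matching.
    flat = " ".join(log_text.split())
    return [(int(ms), sid) for ms, sid in TIMEOUT_RE.findall(flat)]


sample = """2016-04-20 09:31:00 o.a.s.s.o.a.z.ClientCnxn [INFO] Client session timed out,
have not heard from server in 30476ms for sessionid 0x153f3f2eaae158c, closing
socket connection and attempting reconnect
2016-04-20 09:31:00 o.a.z.ClientCnxn [INFO] Client session timed out, have not
heard from server in 31321ms for sessionid 0x2753a634ca7160d7, closing socket
connection and attempting reconnect"""

for silence_ms, session_id in find_session_timeouts(sample):
    print(session_id, silence_ms)
```

On the two messages above this reports silences of 30476 ms and 31321 ms, one per session.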
After a while, the Storm worker goes down and restarts.
I have already tried the following:
1. Set vm.swappiness to 0 to avoid swapping.
2. Increased the ZooKeeper timeouts:
storm.zookeeper.session.timeout: 60000
storm.zookeeper.connection.timeout: 50000
3. Modified the GC-related JVM parameters:
-XX:CMSInitiatingOccupancyFraction=60 -XX:-CMSConcurrentMTEnabled
Does anyone know how to solve this problem?
ght230