[ https://issues.apache.org/jira/browse/ZOOKEEPER-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Patrick Hunt updated ZOOKEEPER-2315: ------------------------------------ Assignee: Lin Yiqun > Change client connect zk service timeout log level from Info to Warn level > -------------------------------------------------------------------------- > > Key: ZOOKEEPER-2315 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2315 > Project: ZooKeeper > Issue Type: Improvement > Components: java client > Affects Versions: 3.4.6 > Reporter: Lin Yiqun > Assignee: Lin Yiqun > Priority: Minor > Fix For: 3.4.7, 3.5.2, 3.6.0 > > Attachments: ZOOKEEPER-2315.001.patch, ZOOKEEPER-2315.002.patch > > > Recently my the resourmanager of my hadoop cluster is fail suddenly,so I > look into the rsourcemanager log.But the log is not helpful for me to direct > find the reson until I found the zk timeout info log record. > {code} > 2015-11-06 06:34:11,257 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: > Assigned container container_1446016482901_292094_01_000140 of capacity > <memory:1024, vCores:1> on host mofa2089:41361, which has 30 containers, > <memory:31744, vCores:30> used and <memory:9216, vCores:10> available after > allocation > 2015-11-06 06:34:11,266 INFO org.apache.zookeeper.ClientCnxn: Unable to > reconnect to ZooKeeper service, session 0x24f4fd5118e5c6e has expired, > closing socket connection > 2015-11-06 06:34:11,271 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: > container_1446016482901_292094_01_000105 Container Transitioned from RUNNING > to COMPLETED > 2015-11-06 06:34:11,271 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: > Completed container: container_1446016482901_292094_01_000105 in state: > COMPLETED event:FINISHED > 2015-11-06 06:34:11,271 INFO > org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=dongwei > OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS > APPID=application_1446016482901_292094 > CONTAINERID=container_1446016482901_292094_01_000105 > 2015-11-06 06:34:11,271 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: > Released container container_1446016482901_292094_01_000105 of capacity > <memory:1024, vCores:1> on host mofa010079:50991, which currently has 29 > containers, <memory:33792, vCores:29> used and <memory:7168, vCores:11> > available, release resources=true > 2015-11-06 06:34:11,271 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: > Application attempt appattempt_1446016482901_292094_000001 released container > container_1446016482901_292094_01_000105 on node: host: mofa010079:50991 > #containers=29 available=<memory:7168, vCores:11> used=<memory:33792, > vCores:29> with event: FINISHED > 2015-11-06 06:34:11,272 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: > container_1446016482901_292094_01_000141 Container Transitioned from NEW to > ALLOCATED > 2015-11-06 06:34:11,272 INFO > org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=dongwei > OPERATION=AM Allocated Container TARGET=SchedulerApp > RESULT=SUCCESS APPID=application_1446016482901_292094 > CONTAINERID=container_1446016482901_292094_01_000141 > 2015-11-06 06:34:11,272 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: > Assigned container container_1446016482901_292094_01_000141 of capacity > <memory:1024, vCores:1> on host mofa010079:50991, which has 30 containers, > <memory:34816, vCores:30> used and <memory:6144, vCores:10> available after > allocation > 2015-11-06 06:34:11,295 WARN > org.apache.hadoop.yarn.server.resourcemanager.amlauncher.ApplicationMasterLauncher: > > org.apache.hadoop.yarn.server.resourcemanager.amlauncher.ApplicationMasterLauncher$LauncherThread > interrupted. Returning. > 2015-11-06 06:34:11,296 INFO org.apache.hadoop.ipc.Server: Stopping server on > 8032 > 2015-11-06 06:34:11,297 INFO org.apache.hadoop.ipc.Server: Stopping IPC > Server Responder > 2015-11-06 06:34:11,297 INFO org.apache.hadoop.ipc.Server: Stopping server on > 8030 > 2015-11-06 06:34:11,297 INFO org.apache.hadoop.ipc.Server: Stopping IPC > Server listener on 8032 > 2015-11-06 06:34:11,298 INFO org.apache.hadoop.ipc.Server: Stopping IPC > Server Responder > 2015-11-06 06:34:11,298 INFO org.apache.hadoop.ipc.Server: Stopping server on > 8031 > 2015-11-06 06:34:11,298 INFO org.apache.hadoop.ipc.Server: Stopping IPC > Server listener on 8030 > 2015-11-06 06:34:11,300 INFO org.apache.hadoop.ipc.Server: Stopping IPC > Server listener on 80312015-11-06 06:34:11,300 INFO > org.apache.hadoop.ipc.Server: Stopping IPC Server Responder > {code} > The problem is solved,but it's too difficult to find the connect zk service > time out info from so many info log records.And we will easily to ignore > these records.So we should chang these zk seesion timeout log level form info > level to warn. -- This message was sent by Atlassian JIRA (v6.3.4#6332)