[ https://issues.apache.org/jira/browse/YARN-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
shenhong updated YARN-1062:
---------------------------

    Description:
In our cluster, MRAppMaster takes a long time to initialize TaskAttempts. Log output like the following continues for one minute:

2013-07-17 11:28:06,328 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11012.yh.aliyun.com to /r01f11
2013-07-17 11:28:06,357 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11004.yh.aliyun.com to /r01f11
2013-07-17 11:28:06,383 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r03b05042.yh.aliyun.com to /r03b05
2013-07-17 11:28:06,384 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1373523419753_4543_m_000000_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2013-07-17 11:28:06,415 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r03b02006.yh.aliyun.com to /r03b02
2013-07-17 11:28:06,436 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r02f02045.yh.aliyun.com to /r02f02
2013-07-17 11:28:06,457 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r02f02034.yh.aliyun.com to /r02f02
2013-07-17 11:28:06,457 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1373523419753_4543_m_000001_0 TaskAttempt Transitioned from NEW to UNASSIGNED

The reason is that resolving one host to its rack takes almost 25 ms (we resolve hosts to racks with a Python script). Our HDFS cluster has more than 4000 datanodes, so a job with a large input takes a long time to initialize its TaskAttempts.
Is there a good way to solve this problem?

  was:
In our cluster, MRAppMaster takes a long time to initialize TaskAttempts. Log output like the following continues for one minute:

2013-07-17 11:28:06,328 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11012.yh.aliyun.com to /r01f11
2013-07-17 11:28:06,357 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11004.yh.aliyun.com to /r01f11
2013-07-17 11:28:06,383 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r03b05042.yh.aliyun.com to /r03b05
2013-07-17 11:28:06,384 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1373523419753_4543_m_000000_0 TaskAttempt Transitioned from NEW to UNASSIGNED

The reason is that resolving one host to its rack takes almost 25 ms. Our HDFS cluster has more than 4000 datanodes, so a job with a large input takes a long time to initialize its TaskAttempts.
Is there a good way to solve this problem?


> MRAppMaster take a long time to init taskAttempt
> ------------------------------------------------
>
>                 Key: YARN-1062
>                 URL: https://issues.apache.org/jira/browse/YARN-1062
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: applications
>    Affects Versions: 0.23.6
>            Reporter: shenhong
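
The description attributes the delay to a cost of roughly 25 ms per host-to-rack lookup through a Python script. One possible direction, offered only as a sketch: if the rack can be derived from the hostname itself, the script needs no external lookup and can answer for several hostnames in one invocation (Hadoop's script-based mapping can pass more than one hostname per call). The naming rule below (rack = first six characters of the hostname) is an assumption inferred from the log lines above, and the function names are hypothetical; nothing here is the reporter's actual script, and whether it removes the 25 ms cost on a given cluster is untested.

```python
#!/usr/bin/env python
"""Hypothetical topology script: derive the rack from the hostname alone."""
import re
import sys

# Assumption inferred from the log lines: a host such as r01f11012.yh.aliyun.com
# belongs to rack /r01f11, i.e. the first six characters of the hostname.
HOST_PATTERN = re.compile(r"^(r\d{2}[a-z]\d{2})\d+(\.|$)")

# Fallback for hosts that do not follow the assumed naming scheme.
DEFAULT_RACK = "/default-rack"


def resolve_rack(host):
    """Return the rack path for one hostname, or DEFAULT_RACK if it does not match."""
    match = HOST_PATTERN.match(host.lower())
    return "/" + match.group(1) if match else DEFAULT_RACK


def resolve_all(hosts):
    """Resolve every hostname passed in a single invocation, preserving order."""
    return [resolve_rack(host) for host in hosts]


def estimated_seconds(num_hosts, per_host_ms=25.0):
    """Total time spent when each host costs per_host_ms and lookups run serially."""
    return num_hosts * per_host_ms / 1000.0


if __name__ == "__main__":
    # Emit one rack per input hostname, in the same order as the arguments.
    print("\n".join(resolve_all(sys.argv[1:])))
```

With the figures from the description, estimated_seconds(4000) gives 100.0, so resolving every datanode once at 25 ms each would take about 100 seconds when done serially. That is the same order of magnitude as the one minute of log output reported.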