Hello
My cluster's mapreduce.map.cpu.vcores setting is 3.
[email protected]
From: Hitesh Shah
Date: 2015-01-29 05:36
To: user
Subject: Re: tez map task and reduce task stay pending forever
Hello
Thanks for tracking down the issue to the vcores setting. Let me dig into that.
Some initial questions:
- Do you know if YARN has been configured to schedule on both memory and
vcores, i.e. using the DominantResourceCalculator?
- I am assuming that the max vcores per container is 1 but the job is
configured to request more, hence the hang. The most likely fix for this is
probably to check the max resource settings allowed by YARN before allowing a
Job/DAG to be submitted. Does the problem still show up if YARN is configured
to allow containers with 2 vcores, i.e. after raising the max allocation
setting for vcores in the RM? (A sketch of the relevant settings follows.)
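
For reference, a minimal sketch of scheduling on both resources with the
Capacity Scheduler; the property names are stock YARN, but the values are
illustrative rather than taken from this thread:

<!-- capacity-scheduler.xml: account for vcores as well as memory -->
<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>

<!-- yarn-site.xml: per-container vcore cap; raise it to allow 2-vcore containers -->
<property>
  <name>yarn.scheduler.maximum-allocation-vcores</name>
  <value>2</value>
</property>
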
thanks
— Hitesh
On Jan 27, 2015, at 9:24 PM, [email protected] wrote:
> I tested again and found that if I set mapreduce.map.cpu.vcores > 1, the job
> will hang. Very similar to https://issues.apache.org/jira/browse/TEZ-704
>
> [email protected]
>
> From: [email protected]
> Date: 2015-01-28 10:29
> To: user
> Subject: Re: Re: tez map task and reduce task stay pending forever
> Oh yeah, I fixed the problem.
> I added this config to my hive-site.xml:
> <property>
>   <name>yarn.app.mapreduce.am.resource.mb</name>
>   <value>1024</value>
> </property>
> <property>
>   <name>yarn.app.mapreduce.am.resource.cpu-vcores</name>
>   <value>1</value>
> </property>
> <property>
>   <name>yarn.app.mapreduce.am.command-opts</name>
>   <value>-Djava.net.preferIPv4Stack=true -Xmx825955249</value>
> </property>
> <property>
>   <name>mapreduce.map.java.opts</name>
>   <value>-Djava.net.preferIPv4Stack=true -Xmx825955249</value>
> </property>
> <property>
>   <name>mapreduce.reduce.java.opts</name>
>   <value>-Djava.net.preferIPv4Stack=true -Xmx825955249</value>
> </property>
> <property>
>   <name>mapreduce.map.memory.mb</name>
>   <value>1024</value>
> </property>
> <property>
>   <name>mapreduce.map.cpu.vcores</name>
>   <value>1</value>
> </property>
> <property>
>   <name>mapreduce.reduce.memory.mb</name>
>   <value>1024</value>
> </property>
> <property>
>   <name>mapreduce.reduce.cpu.vcores</name>
>   <value>1</value>
> </property>
> And configured my tez-site.xml with just:
> <property>
>   <name>tez.lib.uris</name>
>   <value>${fs.defaultFS}/apps/tez-0.5.3/tez-0.5.3-minimal.tar.gz</value>
> </property>
> <property>
>   <name>tez.use.cluster.hadoop-libs</name>
>   <value>true</value>
> </property>
>
> Everything is OK now.
> I think some of the configs in my cluster were too large.
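>
> As an aside (a common rule of thumb, not something stated in this thread):
> the JVM heap in the *.java.opts settings is usually kept at roughly 80% of
> the matching *.memory.mb container size, leaving headroom for off-heap
> usage so YARN does not kill the container for exceeding its limit. For the
> 1024 MiB containers above, that pairing would look like:
>
> <property>
>   <name>mapreduce.map.java.opts</name>
>   <!-- ~80% of mapreduce.map.memory.mb (1024 MiB); illustrative value -->
>   <value>-Djava.net.preferIPv4Stack=true -Xmx819m</value>
> </property>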
>
> [email protected]
>
> From: [email protected]
> Date: 2015-01-28 10:24
> To: user
> Subject: Re: Re: tez map task and reduce task stay pending forever
> No. With set hive.execution.engine=mr, it still hangs...
>
> [email protected]
>
> From: Jianfeng (Jeff) Zhang
> Date: 2015-01-28 10:11
> To: user
> Subject: Re: Re: tez map task and reduce task stay pending forever
> Can you run this query successfully using Hive on MR?
>
>
>
> Best Regards,
> Jeff Zhang
>
>
> On Wed, Jan 28, 2015 at 10:01 AM, [email protected] <[email protected]>
> wrote:
>
> I checked the Tez documentation on the HDP page
> http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.7/bk_installing_manually_book/content/rpm-chap-tez_configure_tez.html.
>
> The tez.am.resource.memory.mb default value is 1536.
> My Hadoop yarn.app.mapreduce.am.resource.mb value is 5734 MiB.
>
> Could this configuration mismatch be causing the problem?
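>
> If the AM size is indeed the issue, a minimal sketch of pinning it down
> explicitly in tez-site.xml, rather than inheriting a large MR-derived value
> (1536 matches the documented default; pick a value your queues can grant):
>
> <property>
>   <name>tez.am.resource.memory.mb</name>
>   <value>1536</value>
> </property>
>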
> [email protected]
>
> From: [email protected]
> Date: 2015-01-27 17:59
> To: user
> Subject: Re: Re: tez map task and reduce task stay pending forever
> Sorry Gopal V, I made a mistake: my mapreduce.map.memory.mb config is 2867.
>
> [email protected]
>
> From: [email protected]
> Date: 2015-01-27 17:58
> To: user
> Subject: Re: Re: tez map task and reduce task stay pending forever
> Hello Gopal V,
> I checked my CDH config and found mapreduce.map.memory.mb is 2876.
> [email protected]
>
> From: [email protected]
> Date: 2015-01-27 17:31
> To: user
> Subject: Re: Re: tez map task and reduce task stay pending forever
>
> I checked the hivetez.log. No kill request was triggered by Hive.
> [email protected]
>
> From: Gopal V
> Date: 2015-01-27 17:17
> To: user
> Cc: [email protected]
> Subject: Re: tez map task and reduce task stay pending forever
> On 1/27/15, 12:50 AM, [email protected] wrote:
> > hive 0.14.0, tez 0.5.3, hadoop 2.3.0-cdh5.0.2
> > hive> select * from p_city order by id;
> > Query ID = zhoushugang_20150127163434_da70d957-6ac4-4b8b-a484-42b593838076
> ...
> > --------------------------------------------------------------------------
> >   VERTICES     STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> > --------------------------------------------------------------------------
> >   Map 1        INITED      1          0        0        1       0       0
> >   Reducer 2    INITED      1          0        0        1       0       0
>
> It looks like all container requests are pending/unresponsive.
>
> I see a container request in the log with
>
> 2015-01-27 15:43:15,434 INFO [TaskSchedulerEventHandlerThread]
> rm.YarnTaskSchedulerService: Allocation request for task:
> attempt_1419300485749_371785_1_00_000000_0 with request:
> Capability[<memory:2867, vCores:3>]Priority[2] host:
> yhd-jqhadoop11.int.yihaodian.com rack: null
> ...
> 2015-01-27 15:43:17,635 INFO [DelayedContainerManager]
> rm.YarnTaskSchedulerService: Releasing held container as either there
> are pending but unmatched requests or this is not a session,
> containerId=container_1419300485749_371785_01_000002, pendingTasks=1,
> isSession=true. isNew=true
>
> That seems to indicate that a container allocation request was made, but the
> YARN ResourceManager never responded with a container (or gave back the
> wrong container?).
>
> Does the container size of 2867 suggest where that value might be coming from?
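>
> For context: if either number in Capability[<memory:2867, vCores:3>] exceeds
> the RM's per-container maximum allocation, the request can sit pending and
> never be granted. Those caps live in yarn-site.xml; the values below are
> illustrative, not taken from this cluster:
>
> <property>
>   <name>yarn.scheduler.maximum-allocation-mb</name>
>   <value>8192</value>
> </property>
> <property>
>   <name>yarn.scheduler.maximum-allocation-vcores</name>
>   <value>4</value>
> </property>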
>
> Cheers,
> Gopal
>