Re: ACCEPTED: waiting for AM container to be allocated, launched and register with RM

Sunil Govind Mon, 22 Aug 2016 09:44:35 -0700

Hi Ram

The RM logs look fine, and per the config the RM is running on port 8030.
I am not very sure about the Oozie-side config you mentioned. I suggest
checking the configuration on that side more closely and debugging there.
I will also let other community folks pitch in if they have a different
opinion.
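
A minimal sketch of one way to check this, not a definitive diagnosis: the Python script below parses a Hadoop-style configuration XML (for example the job.xml of the launched job, or yarn-site.xml) and reports what yarn.resourcemanager.scheduler.address would resolve to. It assumes the stock YARN defaults, where yarn.resourcemanager.hostname is 0.0.0.0 and the scheduler address is ${yarn.resourcemanager.hostname}:8030. The function names and the sample XML (with master01 borrowed from the job.properties in this thread) are illustrative only.

```python
import re
import xml.etree.ElementTree as ET

# Stock defaults from yarn-default.xml (assumed unchanged on this cluster).
YARN_DEFAULTS = {
    "yarn.resourcemanager.hostname": "0.0.0.0",
    "yarn.resourcemanager.scheduler.address": "${yarn.resourcemanager.hostname}:8030",
}


def load_properties(xml_text):
    """Return {name: value} from a Hadoop <configuration> document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def resolve(name, props):
    """Look up a property, falling back to the defaults and expanding ${var}."""
    merged = {**YARN_DEFAULTS, **props}
    value = merged.get(name, "")
    # Expand ${other.property} references; stop once nothing changes.
    for _ in range(10):
        expanded = re.sub(r"\$\{([^}]+)\}",
                          lambda m: merged.get(m.group(1), m.group(0)), value)
        if expanded == value:
            break
        value = expanded
    return value


def scheduler_address(xml_text):
    """Effective scheduler address for the given configuration text."""
    return resolve("yarn.resourcemanager.scheduler.address",
                   load_properties(xml_text))


# Hypothetical samples: one config with no RM settings, one with the hostname set.
EMPTY_CONF = "<configuration></configuration>"
FIXED_CONF = """<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master01</value>
  </property>
</configuration>"""

for label, conf in (("empty", EMPTY_CONF), ("hostname set", FIXED_CONF)):
    addr = scheduler_address(conf)
    note = "  <-- falls back to 0.0.0.0" if addr.startswith("0.0.0.0") else ""
    print(f"{label}: {addr}{note}")
```

If the "empty" case matches what the AM log prints, the configuration that reaches the AM through Oozie is probably missing the RM hostname or scheduler address, even though the cluster-side yarn-site.xml has it.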


Thanks
Sunil

On Mon, Aug 22, 2016 at 8:57 PM rammohan ganapavarapu <
rammohanga...@gmail.com> wrote:

> Any thoughts on the logs and config I have shared?
>
> On Aug 21, 2016 8:32 AM, "rammohan ganapavarapu" <rammohanga...@gmail.com>
> wrote:
>
>> So in job.properties, what should the jobTracker property be: the RM
>> ip:port, or the scheduler port, which is 8030? If I use 8030 I get an
>> unknown protocol (protocol buffer) error.
>>
>> On Aug 21, 2016 7:37 AM, "Sunil Govind" <sunil.gov...@gmail.com> wrote:
>>
>>> Hi.
>>>
>>> It seems to be an Oozie issue. From the conf, the RM scheduler is running
>>> on port 8030, but your job.properties is using 8032. I suggest you
>>> double-check your Oozie configuration and confirm that it is set up
>>> correctly to contact the RM. Sharing a link as well:
>>>
>>> https://discuss.zendesk.com/hc/en-us/articles/203355837-How-to-run-a-MapReduce-jar-using-Oozie-workflow
>>>
>>> Thanks
>>> Sunil
>>>
>>>
>>> On Sun, Aug 21, 2016 at 8:41 AM rammohan ganapavarapu <
>>> rammohanga...@gmail.com> wrote:
>>>
>>>> Please find attached the config that I got from the YARN UI, along with
>>>> the AM and RM logs. I only see it connecting to 0.0.0.0:8030 when I submit
>>>> the job using Oozie; if I submit it with yarn jar it works fine, as I
>>>> posted earlier.
>>>>
>>>> Here is my Oozie job.properties file. I have a Java class that just
>>>> prints.
>>>>
>>>> nameNode=hdfs://master01:8020
>>>> jobTracker=master01:8032
>>>> workflowName=EchoJavaJob
>>>> oozie.use.system.libpath=true
>>>>
>>>> queueName=default
>>>> hdfsWorkflowHome=/user/uap/oozieWorkflows
>>>>
>>>> workflowPath=${nameNode}${hdfsWorkflowHome}/${workflowName}
>>>> oozie.wf.application.path=${workflowPath}
>>>>
>>>> Please let me know if you find any clue as to why it is trying to
>>>> connect to 0.0.0.0:8030.
>>>>
>>>> Thanks,
>>>> Ram
>>>>
>>>>
>>>> On Fri, Aug 19, 2016 at 11:54 PM, Sunil Govind <sunil.gov...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Ram
>>>>>
>>>>> From the console log, as Rohith said, the AM is looking for the RM at
>>>>> 8030, so please confirm the RM port once.
>>>>> Could you also please share the AM and RM logs?
>>>>>
>>>>> Thanks
>>>>> Sunil
>>>>>
>>>>> On Sat, Aug 20, 2016 at 10:36 AM rammohan ganapavarapu <
>>>>> rammohanga...@gmail.com> wrote:
>>>>>
>>>>>> Yes, I did configure it.
>>>>>>
>>>>>> On Aug 19, 2016 7:22 PM, "Rohith Sharma K S" <
>>>>>> ksrohithsha...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi
>>>>>>>
>>>>>>> From the discussion below and the AM logs, I see that the AM container
>>>>>>> has launched but is not able to connect to the RM.
>>>>>>>
>>>>>>> This looks like a configuration issue on your side. Would you check
>>>>>>> your job.xml jar to see whether *yarn.resourcemanager.scheduler.address*
>>>>>>> has been configured?
>>>>>>>
>>>>>>> Essentially, this address is required by MRAppMaster to connect to the
>>>>>>> RM for heartbeats. If you do not configure it, the default value will
>>>>>>> be used, i.e. port 8030.
>>>>>>>
>>>>>>>
>>>>>>> Thanks & Regards
>>>>>>> Rohith Sharma K S
>>>>>>>
>>>>>>> On Aug 20, 2016, at 7:02 AM, rammohan ganapavarapu <
>>>>>>> rammohanga...@gmail.com> wrote:
>>>>>>>
>>>>>>> Even if the cluster doesn't have enough resources, it shouldn't be
>>>>>>> connecting to "/0.0.0.0:8030", right? It should connect to my
>>>>>>> <RM_HOST:8030>; I am not sure why it is trying to connect to
>>>>>>> 0.0.0.0:8030.
>>>>>>>
>>>>>>> I have verified the config and removed traces of 0.0.0.0, but still no
>>>>>>> luck.
>>>>>>>
>>>>>>> org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8030
>>>>>>>
>>>>>>> If anyone has any clue, please share.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Ram
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Aug 19, 2016 at 2:32 PM, rammohan ganapavarapu <
>>>>>>> rammohanga...@gmail.com> wrote:
>>>>>>>
>>>>>>>> When I submit a job using yarn it seems to work; I guess it only
>>>>>>>> fails with Oozie. Not sure what is missing.
>>>>>>>>
>>>>>>>> yarn jar /uap/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar pi 20 1000
>>>>>>>> Number of Maps  = 20
>>>>>>>> Samples per Map = 1000
>>>>>>>> .
>>>>>>>> .
>>>>>>>> .
>>>>>>>> Job Finished in 19.622 seconds
>>>>>>>> Estimated value of Pi is 3.14280000000000000000
>>>>>>>>
>>>>>>>> Ram
>>>>>>>>
>>>>>>>> On Fri, Aug 19, 2016 at 11:46 AM, rammohan ganapavarapu <
>>>>>>>> rammohanga...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> OK, I used yarn-utils.py to get the correct values for my cluster,
>>>>>>>>> updated those properties, and restarted the RM and NM, but still no
>>>>>>>>> luck. I am not sure what I am missing; any other insights would help.
>>>>>>>>>
>>>>>>>>> Below are my properties from yarn-site.xml and mapred-site.xml.
>>>>>>>>>
>>>>>>>>> python yarn-utils.py -c 24 -m 63 -d 3 -k False
>>>>>>>>>  Using cores=24 memory=63GB disks=3 hbase=False
>>>>>>>>>  Profile: cores=24 memory=63488MB reserved=1GB usableMem=62GB disks=3
>>>>>>>>>  Num Container=6
>>>>>>>>>  Container Ram=10240MB
>>>>>>>>>  Used Ram=60GB
>>>>>>>>>  Unused Ram=1GB
>>>>>>>>>  yarn.scheduler.minimum-allocation-mb=10240
>>>>>>>>>  yarn.scheduler.maximum-allocation-mb=61440
>>>>>>>>>  yarn.nodemanager.resource.memory-mb=61440
>>>>>>>>>  mapreduce.map.memory.mb=5120
>>>>>>>>>  mapreduce.map.java.opts=-Xmx4096m
>>>>>>>>>  mapreduce.reduce.memory.mb=10240
>>>>>>>>>  mapreduce.reduce.java.opts=-Xmx8192m
>>>>>>>>>  yarn.app.mapreduce.am.resource.mb=5120
>>>>>>>>>  yarn.app.mapreduce.am.command-opts=-Xmx4096m
>>>>>>>>>  mapreduce.task.io.sort.mb=1024
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>     <property>
>>>>>>>>>       <name>mapreduce.map.memory.mb</name>
>>>>>>>>>       <value>5120</value>
>>>>>>>>>     </property>
>>>>>>>>>     <property>
>>>>>>>>>       <name>mapreduce.map.java.opts</name>
>>>>>>>>>       <value>-Xmx4096m</value>
>>>>>>>>>     </property>
>>>>>>>>>     <property>
>>>>>>>>>       <name>mapreduce.reduce.memory.mb</name>
>>>>>>>>>       <value>10240</value>
>>>>>>>>>     </property>
>>>>>>>>>     <property>
>>>>>>>>>       <name>mapreduce.reduce.java.opts</name>
>>>>>>>>>       <value>-Xmx8192m</value>
>>>>>>>>>     </property>
>>>>>>>>>     <property>
>>>>>>>>>       <name>yarn.app.mapreduce.am.resource.mb</name>
>>>>>>>>>       <value>5120</value>
>>>>>>>>>     </property>
>>>>>>>>>     <property>
>>>>>>>>>       <name>yarn.app.mapreduce.am.command-opts</name>
>>>>>>>>>       <value>-Xmx4096m</value>
>>>>>>>>>     </property>
>>>>>>>>>     <property>
>>>>>>>>>       <name>mapreduce.task.io.sort.mb</name>
>>>>>>>>>       <value>1024</value>
>>>>>>>>>     </property>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>      <property>
>>>>>>>>>       <name>yarn.scheduler.minimum-allocation-mb</name>
>>>>>>>>>       <value>10240</value>
>>>>>>>>>     </property>
>>>>>>>>>
>>>>>>>>>      <property>
>>>>>>>>>       <name>yarn.scheduler.maximum-allocation-mb</name>
>>>>>>>>>       <value>61440</value>
>>>>>>>>>     </property>
>>>>>>>>>
>>>>>>>>>      <property>
>>>>>>>>>       <name>yarn.nodemanager.resource.memory-mb</name>
>>>>>>>>>       <value>61440</value>
>>>>>>>>>     </property>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Ram
>>>>>>>>>
>>>>>>>>> On Thu, Aug 18, 2016 at 11:14 PM, tkg_cangkul <
>>>>>>>>> yuza.ras...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Maybe this link can serve as a reference for tuning the cluster:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> http://jason4zhu.blogspot.co.id/2014/10/memory-configuration-in-hadoop.html
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 19/08/16 11:13, rammohan ganapavarapu wrote:
>>>>>>>>>>
>>>>>>>>>> Do you know what properties to tune?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Ram
>>>>>>>>>>
>>>>>>>>>> On Thu, Aug 18, 2016 at 9:11 PM, tkg_cangkul <
>>>>>>>>>> yuza.ras...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I think that's because you don't have enough resources. You can
>>>>>>>>>>> tune your cluster config to maximize your resources.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 19/08/16 11:03, rammohan ganapavarapu wrote:
>>>>>>>>>>>
>>>>>>>>>>> I don't see anything odd except this; I am not sure whether I
>>>>>>>>>>> need to worry about it or not.
>>>>>>>>>>>
>>>>>>>>>>> 2016-08-19 03:29:26,621 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8030
>>>>>>>>>>> 2016-08-19 03:29:27,646 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
>>>>>>>>>>> 2016-08-19 03:29:28,647 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> It keeps printing this in the app container logs.
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Aug 18, 2016 at 8:20 PM, tkg_cangkul <
>>>>>>>>>>> yuza.ras...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Maybe you can check the logs on port 8088 in your browser; that
>>>>>>>>>>>> is the RM UI. Just choose your job ID and then check the logs.
>>>>>>>>>>>>
>>>>>>>>>>>> On 19/08/16 10:14, rammohan ganapavarapu wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Sunil,
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you for your input. Below are my server metrics for the RM.
>>>>>>>>>>>> I have also attached the RM UI view of the capacity scheduler
>>>>>>>>>>>> resources. How else can I find out?
>>>>>>>>>>>>
>>>>>>>>>>>> {
>>>>>>>>>>>>       "name": "Hadoop:service=ResourceManager,name=QueueMetrics,q0=root",
>>>>>>>>>>>>       "modelerType": "QueueMetrics,q0=root",
>>>>>>>>>>>>       "tag.Queue": "root",
>>>>>>>>>>>>       "tag.Context": "yarn",
>>>>>>>>>>>>       "tag.Hostname": "hadoop001",
>>>>>>>>>>>>       "running_0": 0,
>>>>>>>>>>>>       "running_60": 0,
>>>>>>>>>>>>       "running_300": 0,
>>>>>>>>>>>>       "running_1440": 0,
>>>>>>>>>>>>       "AppsSubmitted": 1,
>>>>>>>>>>>>       "AppsRunning": 0,
>>>>>>>>>>>>       "AppsPending": 0,
>>>>>>>>>>>>       "AppsCompleted": 0,
>>>>>>>>>>>>       "AppsKilled": 0,
>>>>>>>>>>>>       "AppsFailed": 1,
>>>>>>>>>>>>       "AllocatedMB": 0,
>>>>>>>>>>>>       "AllocatedVCores": 0,
>>>>>>>>>>>>       "AllocatedContainers": 0,
>>>>>>>>>>>>       "AggregateContainersAllocated": 2,
>>>>>>>>>>>>       "AggregateContainersReleased": 2,
>>>>>>>>>>>>       "AvailableMB": 64512,
>>>>>>>>>>>>       "AvailableVCores": 24,
>>>>>>>>>>>>       "PendingMB": 0,
>>>>>>>>>>>>       "PendingVCores": 0,
>>>>>>>>>>>>       "PendingContainers": 0,
>>>>>>>>>>>>       "ReservedMB": 0,
>>>>>>>>>>>>       "ReservedVCores": 0,
>>>>>>>>>>>>       "ReservedContainers": 0,
>>>>>>>>>>>>       "ActiveUsers": 0,
>>>>>>>>>>>>       "ActiveApplications": 0
>>>>>>>>>>>>     },
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Aug 18, 2016 at 6:49 PM, Sunil Govind <
>>>>>>>>>>>> sunil.gov...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi
>>>>>>>>>>>>>
>>>>>>>>>>>>> It could be for many reasons. I am also not sure which
>>>>>>>>>>>>> scheduler you are using; please share more details, such as
>>>>>>>>>>>>> the RM log.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I can point out a few possible reasons:
>>>>>>>>>>>>>  - Not enough resources in the cluster can cause this.
>>>>>>>>>>>>>  - If you are using the Capacity Scheduler and the queue
>>>>>>>>>>>>> capacity is maxed out, this can happen.
>>>>>>>>>>>>>  - Similarly, if max-am-resource-percent is exceeded at the
>>>>>>>>>>>>> queue level, the AM container may not be launched.
>>>>>>>>>>>>>
>>>>>>>>>>>>> You could check the RM log to find out whether the AM
>>>>>>>>>>>>> container was launched.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>> Sunil
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Aug 19, 2016 at 5:37 AM rammohan ganapavarapu <
>>>>>>>>>>>>> rammohanga...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> When I submit an MR job, I get this in the AM UI, but the job
>>>>>>>>>>>>>> never finishes. What am I missing?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Ram
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>
