[ 
https://issues.apache.org/jira/browse/YARN-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15206979#comment-15206979
 ] 

Naganarasimha G R commented on YARN-4711:
-----------------------------------------

Thanks for sharing your views [~sjlee0], but main reason for going for this 
approach is unnecessarily the entities needs to be places in 2 queues so any 
way timeline client also gives the same benefit hence thought of placing only 
in one queue. I have avoided the NPE in multiple ways but we just need to 
decide here whether its ok to have code level complexity and place async 
directly to timelineclient or have common approach for handling sync and async 
entity publish. IMO i thought it would be better to avoid unnecessary placement 
in the dispatcher queue. 

> NM is going down with NPE's due to single thread processing of events by 
> Timeline client
> ----------------------------------------------------------------------------------------
>
>                 Key: YARN-4711
>                 URL: https://issues.apache.org/jira/browse/YARN-4711
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Naganarasimha G R
>            Assignee: Naganarasimha G R
>            Priority: Critical
>              Labels: yarn-2928-1st-milestone
>         Attachments: 4711Analysis.txt, YARN-4711-YARN-2928.v1.001.patch
>
>
> After YARN-3367, while testing the latest 2928 branch came across few NPEs 
> due to which NM is shutting down.
> {code}
> 2016-02-21 23:19:54,078 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Error in dispatcher thread
> java.lang.NullPointerException
>         at 
> org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher$ContainerEventHandler.handle(NMTimelinePublisher.java:306)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher$ContainerEventHandler.handle(NMTimelinePublisher.java:296)
>         at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
>         at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> {code}
> java.lang.NullPointerException
>         at 
> org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher.putEntity(NMTimelinePublisher.java:213)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher.publishContainerFinishedEvent(NMTimelinePublisher.java:192)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher.access$400(NMTimelinePublisher.java:63)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher$ApplicationEventHandler.handle(NMTimelinePublisher.java:289)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher$ApplicationEventHandler.handle(NMTimelinePublisher.java:280)
>         at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
>         at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> On analysis found that the there was delay in processing of events, as after 
> YARN-3367 all the events were getting processed by a single thread inside the 
> timeline client. 
> Additionally found one scenario where there is possibility of NPE:
> * TimelineEntity.toString() when {{real}} is not null



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to