Tarun Parimi created YARN-10240:
-----------------------------------

             Summary: Prevent Fatal CancelledException in TimelineV2Client when stopping
                 Key: YARN-10240
                 URL: https://issues.apache.org/jira/browse/YARN-10240
             Project: Hadoop YARN
          Issue Type: Bug
          Components: ATSv2
            Reporter: Tarun Parimi


When the timeline client is stopped, it waits for a drain timeout and then 
cancels every sync EntitiesHolder that is still in the queue.

{code:java}
// if some entities were not drained then we need interrupt
// the threads which had put sync EntityHolders to the queue.
EntitiesHolder nextEntityInTheQueue = null;
while ((nextEntityInTheQueue =
    timelineEntityQueue.poll()) != null) {
  nextEntityInTheQueue.cancel(true);
}
{code}

The sync path that waits on the holder handles only ExecutionException and 
InterruptedException:
{code:java}
if (sync) {
  // In sync call we need to wait till its published and if any error then
  // throw it back
  try {
    entitiesHolder.get();
  } catch (ExecutionException e) {
    throw new YarnException("Failed while publishing entity",
        e.getCause());
  } catch (InterruptedException e) {
    Thread.currentThread().interrupt();
    throw new YarnException("Interrupted while publishing entity", e);
  }
}
{code}

However, calling nextEntityInTheQueue.cancel(true) makes 
entitiesHolder.get() throw a CancellationException, which is unchecked and is 
not handled. It propagates to the dispatcher thread and can result in a FATAL 
error in the NM. We need to prevent this.

{code:java}
FATAL event.AsyncDispatcher (AsyncDispatcher.java:dispatch(203)) - Error in dispatcher thread
java.util.concurrent.CancellationException
        at java.util.concurrent.FutureTask.report(FutureTask.java:121)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl$TimelineEntityDispatcher.dispatchEntities(TimelineV2ClientImpl.java:545)
        at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl.putEntities(TimelineV2ClientImpl.java:149)
        at org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher.putEntity(NMTimelinePublisher.java:348)
{code}
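
The behaviour can be reproduced outside Hadoop with a plain FutureTask. The sketch below is only an illustration, not the committed patch: CancelledHolderSketch, PublishException (a stand-in for YarnException) and waitForPublish are hypothetical names, and the extra catch branch is one possible way to turn the cancellation into a checked exception that callers already expect.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

public class CancelledHolderSketch {

  /** Hypothetical stand-in for YarnException so the sketch compiles alone. */
  static class PublishException extends Exception {
    PublishException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  /** Waits on the holder like the sync path, plus a CancellationException branch. */
  static void waitForPublish(FutureTask<Void> holder) throws PublishException {
    try {
      holder.get();
    } catch (ExecutionException e) {
      throw new PublishException("Failed while publishing entity", e.getCause());
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new PublishException("Interrupted while publishing entity", e);
    } catch (CancellationException e) {
      // Unchecked: without this branch it escapes to the calling thread.
      throw new PublishException("Publishing cancelled while stopping", e);
    }
  }

  public static void main(String[] args) {
    FutureTask<Void> holder = new FutureTask<>(() -> null);
    // Mimics what the stop path does to holders that were not drained.
    holder.cancel(true);
    try {
      waitForPublish(holder);
    } catch (PublishException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Removing the last catch branch lets the CancellationException leave waitForPublish unwrapped, which corresponds to the stack trace above.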
