[ https://issues.apache.org/jira/browse/TEZ-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16065257#comment-16065257 ]
Siddharth Seth commented on TEZ-3767:
-------------------------------------

+1. Not sure if the test failure is related.

> Shuffle should not report error to AM during inputContext.killSelf()
> --------------------------------------------------------------------
>
>                 Key: TEZ-3767
>                 URL: https://issues.apache.org/jira/browse/TEZ-3767
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>         Attachments: TEZ-3767.1.patch, TEZ-3767.2.patch, TEZ-3767.2.patch, TEZ-3767.3.patch
>
>
> {{ShuffleScheduler::killSelf}} kills the current attempt when it encounters
> certain errors. As a part of cleanup, it invokes {{close}} which internally
> releases the resources.
> If merge is happening in the middle, it could throw the following exception.
> This is caught in {{RunShuffleCallable}} and reported to AM immediately. This
> causes tasks to fail.
> {noformat}
> » Error: Error while running task ( failure ) : org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: Error while doing final merge
>       at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:320)
>       at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:285)
>       at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:748)
> Caused by: java.util.ConcurrentModificationException
>       at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1211)
>       at java.util.TreeMap$KeyIterator.next(TreeMap.java:1265)
>       at java.util.AbstractCollection.toArray(AbstractCollection.java:141)
>       at java.util.ArrayList.addAll(ArrayList.java:577)
>       at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.close(MergeManager.java:636)
>       at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:316)
>       ... 6 more
> {noformat}
> When {{isShutDown}} is set to true, it would be good to avoid sending error
> messages to AM.
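
For readers following the thread, the behaviour requested in the description can be sketched roughly as below. This is only an illustration and is not taken from any of the attached patches: the class ShutdownAwareShuffle, the ErrorSink interface and the runFinalMerge method are hypothetical stand-ins, and only the isShutDown flag name comes from the issue text. The idea is that an exception raised by the final merge is passed on to the AM only if the shuffle has not already been shut down by a killSelf-style cleanup.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for the shuffle error path; these are not the Tez classes. */
public class ShutdownAwareShuffle {

    /** Hypothetical sink standing in for whatever call reports a failure to the AM. */
    public interface ErrorSink {
        void reportError(Throwable t);
    }

    // Flag name taken from the issue description; everything else is assumed.
    private final AtomicBoolean isShutDown = new AtomicBoolean(false);
    private final ErrorSink sink;

    public ShutdownAwareShuffle(ErrorSink sink) {
        this.sink = sink;
    }

    /** Marks the shuffle as shutting down, as a killSelf-style cleanup would. */
    public void shutDown() {
        isShutDown.set(true);
    }

    /**
     * Runs the final-merge step and returns true if an error was reported.
     * Failures seen after shutdown are treated as a side effect of cleanup
     * racing with the merge and are swallowed instead of reported.
     */
    public boolean runFinalMerge(Runnable merge) {
        try {
            merge.run();
            return false;
        } catch (RuntimeException e) {
            if (isShutDown.get()) {
                // Already shutting down: do not report to the sink.
                return false;
            }
            sink.reportError(new RuntimeException("Error while doing final merge", e));
            return true;
        }
    }

    public static void main(String[] args) {
        List<Throwable> reported = new ArrayList<>();
        ShutdownAwareShuffle shuffle = new ShutdownAwareShuffle(reported::add);

        // Simulates the merge failing the way the stack trace above shows.
        Runnable failingMerge = () -> {
            throw new ConcurrentModificationException();
        };

        shuffle.runFinalMerge(failingMerge); // not shut down yet: reported
        shuffle.shutDown();
        shuffle.runFinalMerge(failingMerge); // shut down: suppressed

        System.out.println("errors reported: " + reported.size()); // prints "errors reported: 1"
    }
}
```

The sketch checks the flag inside the catch block rather than before running the merge, since the shutdown may begin while the merge is already in progress.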