[jira] [Commented] (TEZ-3880) do not count rejected tasks as killed in vertex progress
[ https://issues.apache.org/jira/browse/TEZ-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16314343#comment-16314343 ] TezQA commented on TEZ-3880: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12904899/TEZ-3880.01.patch against master revision d777f45. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.tests.TestExternalTezServices Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2709//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2709//console This message is automatically generated. > do not count rejected tasks as killed in vertex progress > > > Key: TEZ-3880 > URL: https://issues.apache.org/jira/browse/TEZ-3880 > Project: Apache Tez > Issue Type: Task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: TEZ-3880.01.patch, TEZ-3880.patch > > > Tasks rejected from LLAP because the cluster is full are shown as killed > tasks in the commandline query UI (CLI and beeline). This shouldn't really > happen; killed tasks in the container case means something else, and this > scenario doesn't exist because AM doesn't continuously try to queue tasks. 
We > could change LLAP queue to use sort of a pull model (would also allow for > better duplicate scheduling), but for now we should fix the UI -- This message was sent by Atlassian JIRA (v6.4.14#64029)
Failed: TEZ-3880 PreCommit Build #2709
Jira: https://issues.apache.org/jira/browse/TEZ-3880 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2709/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 331.20 KB...] [ERROR] [Help 2] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn -rf :tez-ext-service-tests [INFO] Build failures were ignored. {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12904899/TEZ-3880.01.patch against master revision d777f45. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.tests.TestExternalTezServices Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2709//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2709//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. b5764015be033d03560b175609758b8f39a02f94 logged out == == Finished build. == == Build step 'Execute shell' marked build as failure Archiving artifacts [Fast Archiver] Compressed 3.51 MB of artifacts by 30.3% relative to #2708 [description-setter] Could not determine description. 
Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any ### ## FAILED TESTS (if any) ## 1 tests failed. FAILED: org.apache.tez.tests.TestExternalTezServices.testErrorPropagation Error Message: expected:<1> but was:<0> Stack Trace: java.lang.AssertionError: expected:<1> but was:<0> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.junit.Assert.assertEquals(Assert.java:542) at org.apache.tez.tests.TestExternalTezServices.runExceptionSimulation(TestExternalTezServices.java:203) at org.apache.tez.tests.TestExternalTezServices.testErrorPropagation(TestExternalTezServices.java:187)
[jira] [Updated] (TEZ-3880) do not count rejected tasks as killed in vertex progress
[ https://issues.apache.org/jira/browse/TEZ-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated TEZ-3880: -- Attachment: TEZ-3880.01.patch > do not count rejected tasks as killed in vertex progress > > > Key: TEZ-3880 > URL: https://issues.apache.org/jira/browse/TEZ-3880 > Project: Apache Tez > Issue Type: Task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: TEZ-3880.01.patch, TEZ-3880.patch > > > Tasks rejected from LLAP because the cluster is full are shown as killed > tasks in the commandline query UI (CLI and beeline). This shouldn't really > happen; killed tasks in the container case means something else, and this > scenario doesn't exist because AM doesn't continuously try to queue tasks. We > could change LLAP queue to use sort of a pull model (would also allow for > better duplicate scheduling), but for now we should fix the UI -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (TEZ-3880) do not count rejected tasks as killed in vertex progress
[ https://issues.apache.org/jira/browse/TEZ-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated TEZ-3880: -- Attachment: (was: TEZ-3880.01.patch) > do not count rejected tasks as killed in vertex progress > > > Key: TEZ-3880 > URL: https://issues.apache.org/jira/browse/TEZ-3880 > Project: Apache Tez > Issue Type: Task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: TEZ-3880.01.patch, TEZ-3880.patch > > > Tasks rejected from LLAP because the cluster is full are shown as killed > tasks in the commandline query UI (CLI and beeline). This shouldn't really > happen; killed tasks in the container case means something else, and this > scenario doesn't exist because AM doesn't continuously try to queue tasks. We > could change LLAP queue to use sort of a pull model (would also allow for > better duplicate scheduling), but for now we should fix the UI -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (TEZ-3880) do not count rejected tasks as killed in vertex progress
[ https://issues.apache.org/jira/browse/TEZ-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated TEZ-3880: -- Attachment: TEZ-3880.01.patch Removed the TODOs, and added a test > do not count rejected tasks as killed in vertex progress > > > Key: TEZ-3880 > URL: https://issues.apache.org/jira/browse/TEZ-3880 > Project: Apache Tez > Issue Type: Task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: TEZ-3880.01.patch, TEZ-3880.patch > > > Tasks rejected from LLAP because the cluster is full are shown as killed > tasks in the commandline query UI (CLI and beeline). This shouldn't really > happen; killed tasks in the container case means something else, and this > scenario doesn't exist because AM doesn't continuously try to queue tasks. We > could change LLAP queue to use sort of a pull model (would also allow for > better duplicate scheduling), but for now we should fix the UI -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-3877) Delete unordered spill files once merge is done
[ https://issues.apache.org/jira/browse/TEZ-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16314186#comment-16314186 ] Rohini Palaniswamy commented on TEZ-3877: - +1 > Delete unordered spill files once merge is done > --- > > Key: TEZ-3877 > URL: https://issues.apache.org/jira/browse/TEZ-3877 > Project: Apache Tez > Issue Type: Bug >Reporter: Rohini Palaniswamy >Assignee: Jason Lowe > Attachments: TEZ-3877.001.patch > > > I see that spill files are not deleted right after merge completes. We > should do that as it takes up a lot of space and we can't afford that wastage > when Tez takes up a lot of shuffle space with complex DAGs. [~jlowe] told me > they are only cleaned up after application completes as they are written in > app directory and not container directory. That also has to be done so that > they are cleaned up by node manager during task failures or container crashes. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-3877) Delete unordered spill files once merge is done
[ https://issues.apache.org/jira/browse/TEZ-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16314116#comment-16314116 ] TezQA commented on TEZ-3877: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12904866/TEZ-3877.001.patch against master revision d777f45. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2708//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2708//console This message is automatically generated. > Delete unordered spill files once merge is done > --- > > Key: TEZ-3877 > URL: https://issues.apache.org/jira/browse/TEZ-3877 > Project: Apache Tez > Issue Type: Bug >Reporter: Rohini Palaniswamy >Assignee: Jason Lowe > Attachments: TEZ-3877.001.patch > > > I see that spill files are not deleted right after merge completes. We > should do that as it takes up a lot of space and we can't afford that wastage > when Tez takes up a lot of shuffle space with complex DAGs. [~jlowe] told me > they are only cleaned up after application completes as they are written in > app directory and not container directory. That also has to be done so that > they are cleaned up by node manager during task failures or container crashes. 
Success: TEZ-3877 PreCommit Build #2708
Jira: https://issues.apache.org/jira/browse/TEZ-3877 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2708/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 339.61 KB...] [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 53:16 min [INFO] Finished at: 2018-01-05T23:20:24Z [INFO] Final Memory: 93M/1412M [INFO] {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12904866/TEZ-3877.001.patch against master revision d777f45. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2708//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2708//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. d8520269c451c2f541fff5dfc3fe8ae16e810a9f logged out == == Finished build. == == Archiving artifacts [Fast Archiver] Compressed 3.52 MB of artifacts by 24.0% relative to #2706 [description-setter] Description set: TEZ-3877 Recording test results Email was triggered for: Success Sending email for trigger: Success ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (TEZ-3880) do not count rejected tasks as killed in vertex progress
[ https://issues.apache.org/jira/browse/TEZ-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16314066#comment-16314066 ] Sergey Shelukhin commented on TEZ-3880: --- I don't see it used anywhere in the codebase, so I'm assuming it's unused. I can remove the TODO-s. > do not count rejected tasks as killed in vertex progress > > > Key: TEZ-3880 > URL: https://issues.apache.org/jira/browse/TEZ-3880 > Project: Apache Tez > Issue Type: Task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: TEZ-3880.patch > > > Tasks rejected from LLAP because the cluster is full are shown as killed > tasks in the commandline query UI (CLI and beeline). This shouldn't really > happen; killed tasks in the container case means something else, and this > scenario doesn't exist because AM doesn't continuously try to queue tasks. We > could change LLAP queue to use sort of a pull model (would also allow for > better duplicate scheduling), but for now we should fix the UI -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-3880) do not count rejected tasks as killed in vertex progress
[ https://issues.apache.org/jira/browse/TEZ-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16314059#comment-16314059 ] Gunther Hagleitner commented on TEZ-3880: - There's a comment in TaskAttemptTerminationCause that references LLAP. I think that shouldn't be committed. I also don't know why this patch is calling into question whether INTERRUPTED_BY_SYSTEM is used or not. Can you add a test for the new behavior? > do not count rejected tasks as killed in vertex progress > > > Key: TEZ-3880 > URL: https://issues.apache.org/jira/browse/TEZ-3880 > Project: Apache Tez > Issue Type: Task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: TEZ-3880.patch > > > Tasks rejected from LLAP because the cluster is full are shown as killed > tasks in the commandline query UI (CLI and beeline). This shouldn't really > happen; killed tasks in the container case means something else, and this > scenario doesn't exist because AM doesn't continuously try to queue tasks. We > could change LLAP queue to use sort of a pull model (would also allow for > better duplicate scheduling), but for now we should fix the UI -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (TEZ-3877) Delete unordered spill files once merge is done
[ https://issues.apache.org/jira/browse/TEZ-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated TEZ-3877: Attachment: TEZ-3877.001.patch Attaching a patch that cleans up the intermediate spills in the unordered writer after the merge completes or encounters an error. > Delete unordered spill files once merge is done > --- > > Key: TEZ-3877 > URL: https://issues.apache.org/jira/browse/TEZ-3877 > Project: Apache Tez > Issue Type: Bug >Reporter: Rohini Palaniswamy >Assignee: Jason Lowe > Attachments: TEZ-3877.001.patch > > > I see that spill files are not deleted right after merge completes. We > should do that as it takes up a lot of space and we can't afford that wastage > when Tez takes up a lot of shuffle space with complex DAGs. [~jlowe] told me > they are only cleaned up after application completes as they are written in > app directory and not container directory. That also has to be done so that > they are cleaned up by node manager during task failures or container crashes. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
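The cleanup described in the TEZ-3877 patch note (delete intermediate spills once the merge completes or fails) can be sketched roughly as below. This is a minimal illustration, not the contents of TEZ-3877.001.patch: the class name, the method name, and the use of plain file concatenation as a stand-in for the real merge are all assumptions made for the example.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; this is NOT the code from TEZ-3877.001.patch.
final class SpillCleanupSketch {

    /**
     * Concatenates the spill files into {@code merged}, then deletes the
     * spills whether the merge succeeded or threw.
     */
    static void mergeAndCleanup(List<Path> spills, Path merged) throws IOException {
        try (OutputStream out = Files.newOutputStream(merged)) {
            for (Path spill : spills) {
                Files.copy(spill, out); // stand-in for the real merge logic
            }
        } finally {
            for (Path spill : spills) {
                try {
                    Files.deleteIfExists(spill);
                } catch (IOException e) {
                    // Best effort: a failed delete should not mask a merge error.
                    System.err.println("Could not delete spill " + spill + ": " + e);
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("spills");
        Path s0 = Files.write(dir.resolve("spill_0.out"), "a".getBytes());
        Path s1 = Files.write(dir.resolve("spill_1.out"), "b".getBytes());
        Path merged = dir.resolve("merged.out");
        mergeAndCleanup(Arrays.asList(s0, s1), merged);
        System.out.println("spills left: " + (Files.exists(s0) || Files.exists(s1)));
        System.out.println("merged bytes: " + Files.size(merged));
    }
}
```

Putting the deletes in a `finally` block is what covers the "or encounters an error" case from the patch note: the spills are removed on both the success path and the failure path.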
[jira] [Commented] (TEZ-3884) Hadoop3-beta1 fixes for Tez tests
[ https://issues.apache.org/jira/browse/TEZ-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16313887#comment-16313887 ] Gopal V commented on TEZ-3884: -- This is a place-holder for -Phadoop3, so that the build against Hadoop3 has its own profile instead of using -Phadoop28 > Hadoop3-beta1 fixes for Tez tests > - > > Key: TEZ-3884 > URL: https://issues.apache.org/jira/browse/TEZ-3884 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.9.1 >Reporter: Gopal V > > {code} > [ERROR] > /grid/5/dev/gopalv/llap-autobuild/tez/tez-api/src/test/java/org/apache/tez/client/TestTezClientUtils.java:[48,30] > cannot find symbol > [ERROR] symbol: class DistributedFileSystem > [ERROR] location: package org.apache.hadoop.hdfs > [ERROR] > /grid/5/dev/gopalv/llap-autobuild/tez/tez-api/src/test/java/org/apache/tez/client/TestTezClientUtils.java:[680,50] > cannot find symbol > [ERROR] symbol: class DistributedFileSystem > [ERROR] location: class org.apache.tez.client.TestTezClientUtils > [ERROR] > /grid/5/dev/gopalv/llap-autobuild/tez/tez-api/src/test/java/org/apache/tez/common/TestTezCommonUtils.java:[62,42] > cannot access org.apache.hadoop.hdfs.DistributedFileSystem > [ERROR] class file for org.apache.hadoop.hdfs.DistributedFileSystem not found > [ERROR] -> [Help 1] > [ERROR] > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (TEZ-3884) Hadoop3-beta1 fixes for Tez tests
[ https://issues.apache.org/jira/browse/TEZ-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated TEZ-3884: - Priority: Minor (was: Major) > Hadoop3-beta1 fixes for Tez tests > - > > Key: TEZ-3884 > URL: https://issues.apache.org/jira/browse/TEZ-3884 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.9.1 >Reporter: Gopal V >Priority: Minor > > {code} > [ERROR] > /grid/5/dev/gopalv/llap-autobuild/tez/tez-api/src/test/java/org/apache/tez/client/TestTezClientUtils.java:[48,30] > cannot find symbol > [ERROR] symbol: class DistributedFileSystem > [ERROR] location: package org.apache.hadoop.hdfs > [ERROR] > /grid/5/dev/gopalv/llap-autobuild/tez/tez-api/src/test/java/org/apache/tez/client/TestTezClientUtils.java:[680,50] > cannot find symbol > [ERROR] symbol: class DistributedFileSystem > [ERROR] location: class org.apache.tez.client.TestTezClientUtils > [ERROR] > /grid/5/dev/gopalv/llap-autobuild/tez/tez-api/src/test/java/org/apache/tez/common/TestTezCommonUtils.java:[62,42] > cannot access org.apache.hadoop.hdfs.DistributedFileSystem > [ERROR] class file for org.apache.hadoop.hdfs.DistributedFileSystem not found > [ERROR] -> [Help 1] > [ERROR] > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (TEZ-3884) Hadoop3-beta1 fixes for Tez tests
Gopal V created TEZ-3884: Summary: Hadoop3-beta1 fixes for Tez tests Key: TEZ-3884 URL: https://issues.apache.org/jira/browse/TEZ-3884 Project: Apache Tez Issue Type: Bug Affects Versions: 0.9.1 Reporter: Gopal V {code} [ERROR] /grid/5/dev/gopalv/llap-autobuild/tez/tez-api/src/test/java/org/apache/tez/client/TestTezClientUtils.java:[48,30] cannot find symbol [ERROR] symbol: class DistributedFileSystem [ERROR] location: package org.apache.hadoop.hdfs [ERROR] /grid/5/dev/gopalv/llap-autobuild/tez/tez-api/src/test/java/org/apache/tez/client/TestTezClientUtils.java:[680,50] cannot find symbol [ERROR] symbol: class DistributedFileSystem [ERROR] location: class org.apache.tez.client.TestTezClientUtils [ERROR] /grid/5/dev/gopalv/llap-autobuild/tez/tez-api/src/test/java/org/apache/tez/common/TestTezCommonUtils.java:[62,42] cannot access org.apache.hadoop.hdfs.DistributedFileSystem [ERROR] class file for org.apache.hadoop.hdfs.DistributedFileSystem not found [ERROR] -> [Help 1] [ERROR] {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-3880) do not count rejected tasks as killed in vertex progress
[ https://issues.apache.org/jira/browse/TEZ-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16313832#comment-16313832 ] Eric Wohlstadter commented on TEZ-3880: --- [~sershe] Ok, the important thing is that for non-LLAP tasks, the old behavior is preserved. So if SERVICE_BUSY is an LLAP specific termination reason, then this lgtm. > do not count rejected tasks as killed in vertex progress > > > Key: TEZ-3880 > URL: https://issues.apache.org/jira/browse/TEZ-3880 > Project: Apache Tez > Issue Type: Task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: TEZ-3880.patch > > > Tasks rejected from LLAP because the cluster is full are shown as killed > tasks in the commandline query UI (CLI and beeline). This shouldn't really > happen; killed tasks in the container case means something else, and this > scenario doesn't exist because AM doesn't continuously try to queue tasks. We > could change LLAP queue to use sort of a pull model (would also allow for > better duplicate scheduling), but for now we should fix the UI -- This message was sent by Atlassian JIRA (v6.4.14#64029)
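The behavior discussed in this thread (rejections from a busy service should not show up as killed tasks, while every other termination keeps the old behavior) can be illustrated with the sketch below. It is a hypothetical model only: the `Cause` enum is a local stand-in rather than Tez's TaskAttemptTerminationCause, `OTHER_KILL` is an invented placeholder, and nothing here is taken from the attached patch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model; the enum is a stand-in, not Tez's TaskAttemptTerminationCause.
final class KilledCountSketch {

    // SERVICE_BUSY and INTERRUPTED_BY_SYSTEM are named in the thread;
    // OTHER_KILL is an invented placeholder for any other kill reason.
    enum Cause { SERVICE_BUSY, INTERRUPTED_BY_SYSTEM, OTHER_KILL }

    /** Counts terminated attempts to report as killed, skipping busy-service rejections. */
    static int killedForProgress(List<Cause> terminatedAttempts) {
        int killed = 0;
        for (Cause cause : terminatedAttempts) {
            if (cause != Cause.SERVICE_BUSY) {
                killed++; // all other causes keep the old "killed" accounting
            }
        }
        return killed;
    }

    public static void main(String[] args) {
        List<Cause> attempts = Arrays.asList(
                Cause.SERVICE_BUSY, Cause.OTHER_KILL,
                Cause.SERVICE_BUSY, Cause.INTERRUPTED_BY_SYSTEM);
        System.out.println(killedForProgress(attempts)); // prints 2
    }
}
```

Filtering on a single cause keeps the change narrow, which matches the concern raised above that non-LLAP tasks must see no difference.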
[jira] [Commented] (TEZ-160) Remove 5 second sleep at the end of AM completion.
[ https://issues.apache.org/jira/browse/TEZ-160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16313799#comment-16313799 ] Rohini Palaniswamy commented on TEZ-160: Recently noticed that about 5% of Pig jobs launched from Oozie in a cluster had an application status of KILLED even though the DAG succeeded and the Pig scripts completed successfully. This was because Pig calls TezClient.stop() on shutdown. If it is not killed within 10 seconds, it calls frameworkClient.killApplication(sessionAppId); which kills the AM. Because of the sleep time of 5 seconds after shutdown is issued, an application finishing as SUCCEEDED or KILLED depended on whether the shutdown completed within the next 5 seconds. Can we skip this check if it is a user-initiated shutdown or at least lower it to 1 or 2 seconds? In the case of Pig it is a Tez session and the Pig client is calling shutdown. I think we can skip it in general if it was a Tez session. The only time it will go down automatically is if the session timeout expires. Adding another 5 seconds in that case is also wasteful. > Remove 5 second sleep at the end of AM completion. > -- > > Key: TEZ-160 > URL: https://issues.apache.org/jira/browse/TEZ-160 > Project: Apache Tez > Issue Type: Bug >Reporter: Siddharth Seth > Labels: TEZ-0.2.0 > Attachments: test.timeouts.txt > > > ClientServiceDelegate/DAGClient doesn't seem to be getting job completion > status from the AM after job completion. It, instead, always relies on the RM > for this information. The information returned by the AM should be used while > it's available. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
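The timing race described in the TEZ-160 comment can be modeled with a few lines of arithmetic. The sketch below is a simplified illustration under stated assumptions: the class, method, and parameter names are hypothetical, it is not Tez or Pig code, and it treats the AM's remaining shutdown work and its fixed sleep as simply additive.

```java
// Illustrative timing model only; names are hypothetical, not Tez or Pig code.
final class ShutdownRaceSketch {

    /**
     * @param amCleanupMillis  time the AM needs for its own shutdown work
     * @param amSleepMillis    fixed sleep the AM adds before exiting (5000 in the report)
     * @param clientWaitMillis how long the client waits before killing the app (10000 in the report)
     * @return the application status the cluster would end up recording
     */
    static String finalStatus(long amCleanupMillis, long amSleepMillis, long clientWaitMillis) {
        long amExitMillis = amCleanupMillis + amSleepMillis;
        // If the AM has not exited when the client's wait expires, the client kills it.
        return amExitMillis <= clientWaitMillis ? "SUCCEEDED" : "KILLED";
    }

    public static void main(String[] args) {
        System.out.println(finalStatus(3_000, 5_000, 10_000)); // SUCCEEDED
        System.out.println(finalStatus(6_000, 5_000, 10_000)); // KILLED
        System.out.println(finalStatus(6_000, 1_000, 10_000)); // SUCCEEDED
    }
}
```

In this model, lowering the sleep from 5 seconds to 1 second widens the window for the AM's own shutdown work from 5 seconds to 9 seconds, which is the effect the comment asks for.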
[jira] [Assigned] (TEZ-3877) Delete unordered spill files once merge is done
[ https://issues.apache.org/jira/browse/TEZ-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reassigned TEZ-3877: --- Assignee: Jason Lowe Summary: Delete unordered spill files once merge is done (was: Delete spill files once merge is done) Offline Rohini pointed me to the UnorderedKVWriter, and indeed the intermediate spill files are *not* being deleted after being merged like they are for the ordered case. Updated the JIRA summary accordingly. > Delete unordered spill files once merge is done > --- > > Key: TEZ-3877 > URL: https://issues.apache.org/jira/browse/TEZ-3877 > Project: Apache Tez > Issue Type: Bug >Reporter: Rohini Palaniswamy >Assignee: Jason Lowe > > I see that spill files are not deleted right after merge completes. We > should do that as it takes up a lot of space and we can't afford that wastage > when Tez takes up a lot of shuffle space with complex DAGs. [~jlowe] told me > they are only cleaned up after application completes as they are written in > app directory and not container directory. That also has to be done so that > they are cleaned up by node manager during task failures or container crashes. -- This message was sent by Atlassian JIRA (v6.4.14#64029)