[ https://issues.apache.org/jira/browse/TEZ-4110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
László Bodor updated TEZ-4110:
------------------------------
    Fix Version/s: 0.10.3

> Make Tez fail fast when DFS quota is exceeded
> ---------------------------------------------
>
>                 Key: TEZ-4110
>                 URL: https://issues.apache.org/jira/browse/TEZ-4110
>             Project: Apache Tez
>          Issue Type: Improvement
>    Affects Versions: 0.9.0, 0.8.4, 0.9.2
>         Environment: hadoop 2.9, hive 2.3, tez
>            Reporter: Wang Yan
>            Assignee: Ayush Saxena
>            Priority: Major
>             Fix For: 0.10.3
>
>         Attachments: With-Patch-Output.rtf, Without-Patch-Output.rtf
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> This ticket aims to bring a feature similar to MAPREDUCE-7148 to Tez: make a Tez job fail fast when the DFS quota limit is reached.
> The background: we run Hive jobs with a DFS quota limit of 3 TB per job. If a job hits the DFS quota limit, the task that hit it fails, and there are several task retries before the job itself fails. The retries are not helpful because the job will always fail anyway. In one of the worse cases, a job with a single reduce task wrote more than 3 TB to HDFS over 20 hours; the reduce task exceeded the quota limit and retried 4 times before the job finally failed, consuming a lot of unnecessary resources. This ticket provides a feature that lets a job fail fast when it writes too much data to the DFS and exceeds the DFS quota limit.


--
This message was sent by Atlassian Jira
(v8.20.10#820010)
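A minimal sketch of the fail-fast idea described above (illustration only, not the actual TEZ-4110 patch): walk a task attempt's failure cause chain and treat an HDFS QuotaExceededException as a non-retriable error. The class name QuotaFailureCheck and the way the result would be wired into Tez's failure reporting are assumptions made for this example; only the Hadoop exception types are real.

{code:java}
import org.apache.hadoop.hdfs.protocol.DSQuotaExceededException;
import org.apache.hadoop.hdfs.protocol.QuotaExceededException;

/**
 * Illustration only: a task attempt that fails because an HDFS quota was
 * exceeded cannot succeed on retry, so such a failure should cause the job
 * to fail fast instead of being re-attempted.
 */
public final class QuotaFailureCheck {

  private QuotaFailureCheck() {
  }

  /**
   * Returns true if any throwable in the cause chain is an HDFS
   * QuotaExceededException (covers both DSQuotaExceededException and
   * NSQuotaExceededException, i.e. space and namespace quotas).
   */
  public static boolean isQuotaExceeded(Throwable t) {
    while (t != null) {
      if (t instanceof QuotaExceededException) {
        return true;
      }
      t = t.getCause();
    }
    return false;
  }

  public static void main(String[] args) {
    Throwable failure = new RuntimeException("task attempt failed",
        new DSQuotaExceededException("The DiskSpace quota of /user/hive is exceeded"));
    // When this returns true, a caller would fail the job fast (e.g. report
    // the attempt as a fatal, non-retriable failure) rather than letting the
    // framework retry the task several times against the same quota.
    System.out.println("quota exceeded: " + isQuotaExceeded(failure));
  }
}
{code}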