[jira] [Commented] (TEZ-3833) Tasks should report codec errors during shuffle as fetch failures
[ https://issues.apache.org/jira/browse/TEZ-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16166978#comment-16166978 ]

Kuhu Shukla commented on TEZ-3833:
----------------------------------

Thanks [~jlowe] for the catch. My bad, I noted this but forgot to address it. I will post an updated patch soon.

> Tasks should report codec errors during shuffle as fetch failures
> -----------------------------------------------------------------
>
> Key: TEZ-3833
> URL: https://issues.apache.org/jira/browse/TEZ-3833
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Kuhu Shukla
> Assignee: Kuhu Shukla
> Attachments: TEZ-3833.001.patch
>
>
> Do the equivalent of https://issues.apache.org/jira/browse/MAPREDUCE-6633 so
> that compression errors do not prove fatal for the DAG/tasks.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (TEZ-3833) Tasks should report codec errors during shuffle as fetch failures
[ https://issues.apache.org/jira/browse/TEZ-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16170214#comment-16170214 ]

TezQA commented on TEZ-3833:
----------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12887672/TEZ-3833.002.patch
  against master revision 4c29635.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2635//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2635//console

This message is automatically generated.
[jira] [Commented] (TEZ-3833) Tasks should report codec errors during shuffle as fetch failures
[ https://issues.apache.org/jira/browse/TEZ-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16170632#comment-16170632 ]

Jason Lowe commented on TEZ-3833:
---------------------------------

Thanks for updating the patch!

Rather than replicate the code for a separate InternalError catch clause, it would be simpler to change the shouldRetry signature to take an Exception rather than an IOException so we can reuse the error handling logic. Despite what its Javadoc says, shouldRetry does not throw IOException, nor does it really require its argument to be an instance of IOException to do what it does.

I'm not sure it makes sense to wrap the InternalError in a timeout exception since it really isn't a timeout exception.
[jira] [Commented] (TEZ-3833) Tasks should report codec errors during shuffle as fetch failures
[ https://issues.apache.org/jira/browse/TEZ-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16176580#comment-16176580 ]

Kuhu Shukla commented on TEZ-3833:
----------------------------------

bq. I'm not sure it makes sense to wrap the InternalError in a timeout exception since it really isn't a timeout exception.

I agree. I am thinking of rethrowing the exception as a FetcherReadTimeoutException if it is an IOException (as in the existing code) and rethrowing it unchanged if it is an InternalError. {{copyFromHost}} can then catch both and do the same cleanupConnection and connect with retry. Does that make sense?
[jira] [Commented] (TEZ-3833) Tasks should report codec errors during shuffle as fetch failures
[ https://issues.apache.org/jira/browse/TEZ-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16176883#comment-16176883 ]

Jason Lowe commented on TEZ-3833:
---------------------------------

Ah, I see we're already wrapping all the other I/O errors that aren't really timeouts as timeout exceptions. We probably should treat the InternalError just like how we currently treat IOException since those errors will be coming from the codec and should be treated like read errors with respect to fetching logic. So I'm OK if we want to just reuse the same code paths we're already leveraging for IOExceptions.
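To make the idea in the comment above concrete, here is a minimal Java sketch of sending a codec InternalError down the same path as an IOException. It is an illustration only, not the Tez fetcher code: CodecErrorSketch, SegmentReader, and readOrReportFailure are hypothetical names invented for this example.

```java
import java.io.IOException;

// Hypothetical names throughout; this is not the Tez fetcher code.
final class CodecErrorSketch {

    /** Stand-in for reading and decompressing one map output. */
    interface SegmentReader {
        void read() throws IOException;
    }

    /**
     * Returns true when the segment was read, false when the read should be
     * reported as a fetch failure. InternalError is an Error, not an
     * Exception, so it must be named explicitly to be handled here.
     */
    static boolean readOrReportFailure(SegmentReader reader) {
        try {
            reader.read();
            return true;
        } catch (IOException | InternalError e) {
            // Same path for I/O errors and codec errors: fail the fetch, not the task.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(readOrReportFailure(() -> { }));
        System.out.println(readOrReportFailure(() -> { throw new IOException("truncated stream"); }));
        System.out.println(readOrReportFailure(() -> { throw new InternalError("codec failure"); }));
    }
}
```

The multi-catch keeps a single handler, which matches the stated goal of reusing the existing code path rather than duplicating it in a separate catch clause.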
[jira] [Commented] (TEZ-3833) Tasks should report codec errors during shuffle as fetch failures
[ https://issues.apache.org/jira/browse/TEZ-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16182997#comment-16182997 ]

TezQA commented on TEZ-3833:
----------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12889274/TEZ-3833.003.patch
  against master revision 8f61c51.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2643//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2643//console

This message is automatically generated.
[jira] [Commented] (TEZ-3833) Tasks should report codec errors during shuffle as fetch failures
[ https://issues.apache.org/jira/browse/TEZ-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16183035#comment-16183035 ]

Jason Lowe commented on TEZ-3833:
---------------------------------

Thanks for updating the patch!

InternalError cannot be successfully cast to an Exception, since it is an Error rather than an Exception. shouldRetry should take a Throwable rather than an Exception and then I think we're good.
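The type relationship behind this review comment can be checked with a short, self-contained Java sketch. java.lang.InternalError extends VirtualMachineError, which extends Error, so it shares only Throwable with IOException. RetryParamSketch and its two methods are hypothetical names for illustration; the real shouldRetry logic is not shown in this thread and is not reproduced here.

```java
import java.io.IOException;

// Hypothetical names; only the JDK type hierarchy is being demonstrated.
final class RetryParamSketch {

    /** Attempts the cast the review comment warns about. */
    static boolean castsToException(Throwable t) {
        try {
            Exception e = (Exception) t; // throws ClassCastException for any Error
            return true;
        } catch (ClassCastException cce) {
            return false;
        }
    }

    /** A Throwable parameter accepts both IOException and InternalError. */
    static String describe(Throwable cause) {
        String kind = (cause instanceof Exception) ? "exception" : "error";
        return kind + ": " + cause.getMessage();
    }

    public static void main(String[] args) {
        System.out.println(castsToException(new IOException("read failed")));
        System.out.println(castsToException(new InternalError("bad compressed data")));
        System.out.println(describe(new InternalError("bad compressed data")));
    }
}
```

Declaring the parameter as Throwable avoids the cast entirely, which is why widening the signature is enough to let one method handle both failure types.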
[jira] [Commented] (TEZ-3833) Tasks should report codec errors during shuffle as fetch failures
[ https://issues.apache.org/jira/browse/TEZ-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184473#comment-16184473 ]

TezQA commented on TEZ-3833:
----------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12889517/TEZ-3833.004.patch
  against master revision 8f61c51.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2645//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2645//console

This message is automatically generated.
[jira] [Commented] (TEZ-3833) Tasks should report codec errors during shuffle as fetch failures
[ https://issues.apache.org/jira/browse/TEZ-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184873#comment-16184873 ]

Jason Lowe commented on TEZ-3833:
---------------------------------

I'm still confused why we want to treat InternalErrors like timeouts. It seems we will assign blame incorrectly in some cases if we do. For example, if we are trying to fetch 5 maps from a node and get an InternalError, then we should blame the current map, not all 5 maps, whereas if we are getting a connection timeout, then we do want to associate that failure to connect with all 5 maps.

Therefore I think we simply need to remove the instanceof check for InternalError. That will cause them to be treated like a regular I/O error, which seems more appropriate.
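The blame-scoping rule described in this comment can be sketched in a few lines of Java. This is an assumption-laden illustration, not the Tez implementation: BlameScopeSketch, FailureKind, and mapsToBlame are hypothetical names, and map outputs are represented by plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical names; illustrates the blame rule only.
final class BlameScopeSketch {

    enum FailureKind { CONNECT_TIMEOUT, READ_ERROR }

    /**
     * A connection timeout implicates every map still pending on the host.
     * A read error (including a codec InternalError treated as a regular
     * I/O error) implicates only the map that was being read.
     */
    static List<String> mapsToBlame(FailureKind kind, List<String> pending, String current) {
        if (kind == FailureKind.CONNECT_TIMEOUT) {
            return new ArrayList<>(pending);
        }
        return Collections.singletonList(current);
    }

    public static void main(String[] args) {
        List<String> pending = Arrays.asList("map_1", "map_2", "map_3", "map_4", "map_5");
        System.out.println(mapsToBlame(FailureKind.READ_ERROR, pending, "map_3"));
        System.out.println(mapsToBlame(FailureKind.CONNECT_TIMEOUT, pending, "map_3"));
    }
}
```

Classifying a codec error as a timeout would take the first branch and report all five maps as failed, which is the over-blaming the comment warns against.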
[jira] [Commented] (TEZ-3833) Tasks should report codec errors during shuffle as fetch failures
[ https://issues.apache.org/jira/browse/TEZ-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16186025#comment-16186025 ]

TezQA commented on TEZ-3833:
----------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12889589/TEZ-3833.005.patch
  against master revision a4a3c6d.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2648//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2648//console

This message is automatically generated.
[jira] [Commented] (TEZ-3833) Tasks should report codec errors during shuffle as fetch failures
[ https://issues.apache.org/jira/browse/TEZ-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16186133#comment-16186133 ]

TezQA commented on TEZ-3833:
----------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12889589/TEZ-3833.005.patch
  against master revision a4a3c6d.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2649//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2649//console

This message is automatically generated.
[jira] [Commented] (TEZ-3833) Tasks should report codec errors during shuffle as fetch failures
[ https://issues.apache.org/jira/browse/TEZ-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16186138#comment-16186138 ]

Jason Lowe commented on TEZ-3833:
---------------------------------

+1 lgtm. Committing this.