[ 
https://issues.apache.org/jira/browse/TEZ-3914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated TEZ-3914:
---------------------------------
    Description: 
A large message fails to parse, and the failure is treated as recovery file 
EOF.

{noformat}
2018-04-16 15:33:59,807 WARN  [Thread-2] app.RecoveryParser 
(RecoveryParser.java:parseRecoveryData(771)) - Corrupt data found when trying 
to read next event
com.google.protobuf.InvalidProtocolBufferException: Protocol message was too 
large.  May be malicious.  Use CodedInputStream.setSizeLimit() to increase the 
size limit.
{noformat}

  was:Any failure to parse a recovery event is ignored and treated as EOF. The 
job can hang since some task completions may be missed and shuffle will hang.


> Recovering a large DAG fails to due
> -----------------------------------
>
>                 Key: TEZ-3914
>                 URL: https://issues.apache.org/jira/browse/TEZ-3914
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>            Priority: Major
>         Attachments: TEZ-3914.001.patch, TEZ-3914.002.patch, 
> TEZ-3914.003.patch
>
>
> A large message fails to parse, and the failure is treated as recovery file 
> EOF.
> {noformat}
> 2018-04-16 15:33:59,807 WARN  [Thread-2] app.RecoveryParser 
> (RecoveryParser.java:parseRecoveryData(771)) - Corrupt data found when trying 
> to read next event
> com.google.protobuf.InvalidProtocolBufferException: Protocol message was too 
> large.  May be malicious.  Use CodedInputStream.setSizeLimit() to increase 
> the size limit.
> {noformat}
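
The failure mode described above can be illustrated with a minimal, self-contained sketch. It is not the Tez RecoveryParser code and does not use protobuf: the length-prefixed record format, the class RecoveryReadSketch, and all method names are hypothetical, chosen only to show why treating every parse failure as EOF silently drops the events that follow an oversized record, and how a reader that distinguishes a clean end of stream from a rejected record surfaces the problem instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from the Tez code base.
class RecoveryReadSketch {

    /** Signals a record that is present in the file but was rejected. */
    static class CorruptRecordException extends IOException {
        CorruptRecordException(String message) {
            super(message);
        }
    }

    /** Reads one length-prefixed record; returns null when the stream ends before another header. */
    static byte[] readRecord(DataInputStream in, int sizeLimit) throws IOException {
        int length;
        try {
            length = in.readInt();
        } catch (EOFException e) {
            return null; // no further record header: end of file
        }
        if (length < 0 || length > sizeLimit) {
            throw new CorruptRecordException(
                "record of " + length + " bytes exceeds limit of " + sizeLimit);
        }
        byte[] payload = new byte[length];
        in.readFully(payload);
        return payload;
    }

    /** Lenient loop: any failure ends parsing as if the file had ended. */
    static List<byte[]> readLenient(byte[] file, int sizeLimit) {
        List<byte[]> events = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(file))) {
            byte[] record;
            while ((record = readRecord(in, sizeLimit)) != null) {
                events.add(record);
            }
        } catch (IOException e) {
            // Failure swallowed here: every later event is silently lost.
        }
        return events;
    }

    /** Strict loop: only end of stream stops parsing; rejected records propagate. */
    static List<byte[]> readStrict(byte[] file, int sizeLimit) throws IOException {
        List<byte[]> events = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(file))) {
            byte[] record;
            while ((record = readRecord(in, sizeLimit)) != null) {
                events.add(record);
            }
        }
        return events;
    }

    /** Builds an in-memory file holding one zero-filled record per requested size. */
    static byte[] sampleFile(int... sizes) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        for (int size : sizes) {
            out.writeInt(size);
            out.write(new byte[size]);
        }
        out.flush();
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] file = sampleFile(8, 64, 8); // the middle record is the "large" one
        System.out.println("lenient, limit 16: " + readLenient(file, 16).size() + " events");
        System.out.println("strict, limit 64: " + readStrict(file, 64).size() + " events");
        try {
            readStrict(file, 16);
        } catch (CorruptRecordException e) {
            System.out.println("strict, limit 16: " + e.getMessage());
        }
    }
}
```

With a limit of 16 bytes the lenient reader returns only the first of the three events, while the strict reader either returns all three (limit raised to 64) or fails loudly. Whether the actual fix raises the protobuf size limit, propagates the error, or both is not stated in this message.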



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
