[ 
https://issues.apache.org/jira/browse/YARN-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-5924:
--------------------------
    Assignee: Oleksii Dymytrov

> Resource Manager fails to load state with InvalidProtocolBufferException
> ------------------------------------------------------------------------
>
>                 Key: YARN-5924
>                 URL: https://issues.apache.org/jira/browse/YARN-5924
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Oleksii Dymytrov
>            Assignee: Oleksii Dymytrov
>         Attachments: YARN-5924.002.patch
>
>
> InvalidProtocolBufferException is thrown during recovering of the 
> application's state if application's data has invalid format (or is broken) 
> under FSRMStateRoot/RMAppRoot/application_1477986176766_0134/ directory in 
> HDFS:
> {noformat}
> com.google.protobuf.InvalidProtocolBufferException: Protocol message 
> end-group tag did not match expected tag.
>       at 
> com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:94)
>       at 
> com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124)
>       at 
> com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:143)
>       at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:176)
>       at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:188)
>       at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:193)
>       at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
>       at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$ApplicationStateDataProto.parseFrom(YarnServerResourceManagerRecoveryProtos.java:1028)
>       at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore$RMAppStateFileProcessor.processChildNode(FileSystemRMStateStore.java:966)
>       at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.processDirectoriesOfFiles(FileSystemRMStateStore.java:317)
>       at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadRMAppState(FileSystemRMStateStore.java:281)
>       at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadState(FileSystemRMStateStore.java:232)
> {noformat}
> The solution can be to catch "InvalidProtocolBufferException", show warning 
> and remove application's folder that contains invalid data to prevent RM 
> restart failure. 
> Additionally, I've added catch for other exceptions that can appear during 
> recovering of the specific application, to avoid RM failure even if the only 
> one application's state can't be loaded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

Reply via email to