[ https://issues.apache.org/jira/browse/YARN-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Kanter updated YARN-5594:
--------------------------------
    Attachment: YARN-5594.002.patch

The 002 patch:
- It turns out that the {{readOldFormatFields}} code is the same as the {{readFields}} code in {{AbstractDelegationTokenIdentifier}}, so we can simply call {{super.readFields}} instead of duplicating the code.
- Moved the token reading logic (old and new format handling) to a common place in {{RMStateStoreUtils}} so it can be used by {{LeveldbRMStateStore}} and {{ZKRMStateStore}} (in addition to {{FileSystemRMStateStore}}).
- Improved the existing unit test.
- Added additional unit tests.

I also manually verified that it fixes the problem in a cluster with the {{ZKRMStateStore}}.

> Handle old RMDelegationToken format when recovering RM
> ------------------------------------------------------
>
>                 Key: YARN-5594
>                 URL: https://issues.apache.org/jira/browse/YARN-5594
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.6.0
>            Reporter: Tatyana But
>            Assignee: Robert Kanter
>              Labels: oct16-medium
>         Attachments: YARN-5594.001.patch, YARN-5594.002.patch
>
>
> We got this error after upgrading a cluster from v2.5.1 to 2.7.0.
> {noformat}
> 2016-08-25 17:20:33,293 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Failed to load/recover state
> com.google.protobuf.InvalidProtocolBufferException: Protocol message contained an invalid tag (zero).
>     at com.google.protobuf.InvalidProtocolBufferException.invalidTag(InvalidProtocolBufferException.java:89)
>     at com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:108)
>     at org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto.<init>(YarnServerResourceManagerRecoveryProtos.java:4680)
>     at org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto.<init>(YarnServerResourceManagerRecoveryProtos.java:4644)
>     at org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto$1.parsePartialFrom(YarnServerResourceManagerRecoveryProtos.java:4740)
>     at org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto$1.parsePartialFrom(YarnServerResourceManagerRecoveryProtos.java:4735)
>     at org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto$Builder.mergeFrom(YarnServerResourceManagerRecoveryProtos.java:5075)
>     at org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto$Builder.mergeFrom(YarnServerResourceManagerRecoveryProtos.java:4955)
>     at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:337)
>     at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:267)
>     at com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:210)
>     at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:904)
>     at org.apache.hadoop.yarn.server.resourcemanager.recovery.records.RMDelegationTokenIdentifierData.readFields(RMDelegationTokenIdentifierData.java:43)
>     at org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadRMDTSecretManagerState(FileSystemRMStateStore.java:355)
>     at org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadState(FileSystemRMStateStore.java:199)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:587)
>     at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1007)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1048)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1044)
> {noformat}
> The cause of this problem is that these Hadoop versions use different formats for the files
> /var/mapr/cluster/yarn/rm/system/FSRMStateRoot/RMDTSecretManagerRoot/RMDelegationToken*.
> This fix handles the old data format during RM recovery if an InvalidProtocolBufferException occurs.
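
For readers following the thread, the fallback described in the issue (try the new format first, and re-read the same bytes as the old format when the parser rejects them) could be sketched roughly as below. This is not the code from the patch: {{TokenFormatFallback}}, {{TokenRecord}}, {{InvalidFormatException}}, the tag value and the record layout are all hypothetical stand-ins, chosen so the example runs with only the JDK. It also assumes that old records begin with a version byte of 0, which would be consistent with the "invalid tag (zero)" message in the log.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical sketch of "try new format, fall back to old format".
// None of these names come from Hadoop or from the YARN-5594 patch.
final class TokenFormatFallback {

    /** Illustrative stand-in for a recovered delegation token record. */
    static final class TokenRecord {
        final String owner;
        final long renewDate;
        final boolean oldFormat;

        TokenRecord(String owner, long renewDate, boolean oldFormat) {
            this.owner = owner;
            this.renewDate = renewDate;
            this.oldFormat = oldFormat;
        }
    }

    /** Stand-in for com.google.protobuf.InvalidProtocolBufferException. */
    static final class InvalidFormatException extends IOException {
        InvalidFormatException(String message) {
            super(message);
        }
    }

    // Assumed values: new records start with a non-zero tag byte,
    // old records start with a version byte of 0.
    static final int NEW_FORMAT_TAG = 0x0A;
    static final int OLD_FORMAT_VERSION = 0;

    /** Parses the (made-up) new layout and rejects any other leading byte. */
    static TokenRecord readNewFormat(DataInputStream in) throws IOException {
        int tag = in.readUnsignedByte();
        if (tag != NEW_FORMAT_TAG) {
            throw new InvalidFormatException("invalid tag (" + tag + ")");
        }
        return new TokenRecord(in.readUTF(), in.readLong(), false);
    }

    /** Parses the (made-up) old layout: version byte, owner, renew date. */
    static TokenRecord readOldFormat(DataInputStream in) throws IOException {
        int version = in.readUnsignedByte();
        if (version != OLD_FORMAT_VERSION) {
            throw new IOException("unknown version " + version);
        }
        return new TokenRecord(in.readUTF(), in.readLong(), true);
    }

    /** Tries the new format first; on a format error, re-reads from the start as old format. */
    static TokenRecord readToken(byte[] data) throws IOException {
        try {
            return readNewFormat(open(data));
        } catch (InvalidFormatException e) {
            // The failed attempt consumed bytes, so the retry uses a fresh stream.
            return readOldFormat(open(data));
        }
    }

    private static DataInputStream open(byte[] data) {
        return new DataInputStream(new ByteArrayInputStream(data));
    }

    /** Test helper: writes a record whose first byte selects the layout. */
    static byte[] encode(int firstByte, String owner, long renewDate) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeByte(firstByte);
        out.writeUTF(owner);
        out.writeLong(renewDate);
        out.flush();
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        TokenRecord fromOld = readToken(encode(OLD_FORMAT_VERSION, "alice", 42L));
        TokenRecord fromNew = readToken(encode(NEW_FORMAT_TAG, "bob", 99L));
        System.out.println(fromOld.owner + " read as old format: " + fromOld.oldFormat);
        System.out.println(fromNew.owner + " read as old format: " + fromNew.oldFormat);
    }
}
```

Putting the try/fallback in one shared helper mirrors the idea of moving the logic into a common utility, so each state store would only pass in the raw bytes rather than repeat the format handling.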