[ https://issues.apache.org/jira/browse/YARN-8310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486637#comment-16486637 ]
Miklos Szegedi commented on YARN-8310:
--------------------------------------

I will backport this to branch-2, branch-3.0, and branch-3.1.

> Handle old NMTokenIdentifier, AMRMTokenIdentifier, and ContainerTokenIdentifier formats
> ---------------------------------------------------------------------------------------
>
>                 Key: YARN-8310
>                 URL: https://issues.apache.org/jira/browse/YARN-8310
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>            Priority: Major
>         Attachments: YARN-8310.001.patch, YARN-8310.002.patch,
>                      YARN-8310.003.patch, YARN-8310.branch-2.001.patch,
>                      YARN-8310.branch-2.002.patch, YARN-8310.branch-2.003.patch
>
>
> In some recent upgrade testing, we saw this error, which caused the
> NodeManager to fail to start up afterwards:
> {noformat}
> org.apache.hadoop.service.ServiceStateException: com.google.protobuf.InvalidProtocolBufferException: Protocol message contained an invalid tag (zero).
>     at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:173)
>     at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:441)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:834)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:895)
> Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message contained an invalid tag (zero).
>     at com.google.protobuf.InvalidProtocolBufferException.invalidTag(InvalidProtocolBufferException.java:89)
>     at com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:108)
>     at org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto.<init>(YarnSecurityTokenProtos.java:1860)
>     at org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto.<init>(YarnSecurityTokenProtos.java:1824)
>     at org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto$1.parsePartialFrom(YarnSecurityTokenProtos.java:2016)
>     at org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto$1.parsePartialFrom(YarnSecurityTokenProtos.java:2011)
>     at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200)
>     at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:217)
>     at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:223)
>     at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
>     at org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto.parseFrom(YarnSecurityTokenProtos.java:2686)
>     at org.apache.hadoop.yarn.security.ContainerTokenIdentifier.readFields(ContainerTokenIdentifier.java:254)
>     at org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:177)
>     at org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerTokenIdentifier(BuilderUtils.java:322)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverContainer(ContainerManagerImpl.java:455)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:373)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:316)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>     ... 5 more
> {noformat}
> The NodeManager fails because it is trying to read a
> {{ContainerTokenIdentifier}} stored in the "old" format, from before we
> changed these identifiers to protobufs (YARN-668). This is very similar to
> YARN-5594, where we ran into the same kind of problem with the
> ResourceManager and RM Delegation Tokens.
> To provide a better experience, we should make the code fall back to
> reading the old format if it is unable to read the token using the new
> format. We didn't run into any errors with the other two types of tokens
> that YARN-668 incompatibly changed (NMTokenIdentifier and
> AMRMTokenIdentifier), but we may as well fix those while we're at it.
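
The fallback described in the issue ("read the old format if the new format cannot be read") can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the attached patches: the class FallbackTokenReader, its methods, and both toy byte layouts are hypothetical stand-ins for the protobuf and pre-protobuf encodings, chosen so the example runs with the JDK alone. The idea it shows is to buffer the serialized identifier, try the new parser first, and re-read the same bytes with the old parser only when the first attempt fails.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical sketch: none of these names come from Hadoop or from the patch.
class FallbackTokenReader {
    // Toy "new format": one non-zero tag byte followed by a UTF string.
    static final int NEW_FORMAT_TAG = 0x0A;

    // Result of a read: the decoded owner and which format produced it.
    static final class ParsedToken {
        final String owner;
        final boolean legacyFormat;

        ParsedToken(String owner, boolean legacyFormat) {
            this.owner = owner;
            this.legacyFormat = legacyFormat;
        }
    }

    static byte[] encodeNew(String owner) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeByte(NEW_FORMAT_TAG);
        out.writeUTF(owner);
        return buf.toByteArray();
    }

    // Toy "old format": a 4-byte zero header followed by a UTF string,
    // so its first byte looks like an invalid (zero) tag to the new parser.
    static byte[] encodeOld(String owner) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(0);
        out.writeUTF(owner);
        return buf.toByteArray();
    }

    static String parseNew(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int tag = in.readUnsignedByte();
        if (tag != NEW_FORMAT_TAG) {
            throw new IOException("message contained an invalid tag (" + tag + ")");
        }
        return in.readUTF();
    }

    static String parseOld(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        in.readInt(); // skip the legacy header
        return in.readUTF();
    }

    // Try the new format first; on failure, re-read the SAME buffered bytes
    // in the old format. Buffering matters: a partially consumed stream
    // could not be handed to the second parser.
    static ParsedToken read(byte[] data) throws IOException {
        try {
            return new ParsedToken(parseNew(data), false);
        } catch (IOException newFormatFailure) {
            // If the old parser also fails, its exception propagates.
            return new ParsedToken(parseOld(data), true);
        }
    }
}
```

For example, FallbackTokenReader.read(FallbackTokenReader.encodeOld("alice")) yields a ParsedToken whose owner is "alice" and whose legacyFormat flag is true, while bytes produced by encodeNew are decoded on the first attempt. In a real service, the catch clause would presumably be narrowed to the parser's own exception type (the trace above shows InvalidProtocolBufferException) so that unrelated I/O errors are not mistaken for an old-format token.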