[ https://issues.apache.org/jira/browse/HDFS-15756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17321862#comment-17321862 ]
zhangxiping edited comment on HDFS-15756 at 4/15/21, 2:35 AM:
--------------------------------------------------------------

[~hexiaoqiao] I am very happy to see your comment. We are also using the Router and have encountered this problem when submitting a Spark application. During task submission, the following log appeared:

{code:java}
2021-04-13 01:01:13 CST DFSClient INFO - Created HDFS_DELEGATION_TOKEN token 205440696 for da_music on ha-hdfs:hz-cluster11
2021-04-13 01:01:13 CST SparkContext ERROR - Error initializing SparkContext.
org.apache.hadoop.security.token.SecretManager$InvalidToken: Renewal request for unknown token (token for da_music: HDFS_DELEGATION_TOKEN owner=da_music/d...@hadoop.hz.netease.com, renewer=da_music, realUser=, issueDate=1618246873345, maxDate=1618851673345, sequenceNumber=205440696, masterKeyId=161)
        at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.renewToken(AbstractDelegationTokenSecretManager.java:543)
{code}

{code:scala}
private def getTokenRenewalInterval(
    hadoopConf: Configuration,
    sparkConf: SparkConf,
    filesystems: Set[FileSystem]): Option[Long] = {
  // We cannot use the tokens generated with renewer yarn. Trying to renew
  // those will fail with an access control issue. So create new tokens with
  // the logged in user as renewer.
  sparkConf.get(PRINCIPAL).flatMap { renewer =>
    val creds = new Credentials()
    fetchDelegationTokens(renewer, filesystems, creds)

    val renewIntervals = creds.getAllTokens.asScala.filter {
      _.decodeIdentifier().isInstanceOf[AbstractDelegationTokenIdentifier]
    }.flatMap { token =>
      Try {
        val newExpiration = token.renew(hadoopConf)
        val identifier =
          token.decodeIdentifier().asInstanceOf[AbstractDelegationTokenIdentifier]
        val interval = newExpiration - identifier.getIssueDate
        logInfo(s"Renewal interval is $interval for token ${token.getKind.toString}")
        interval
      }.toOption
    }

    if (renewIntervals.isEmpty) None else Some(renewIntervals.min)
  }
}
{code}

Looking at the Spark 2.4.7 code, we found that the time between creating a token and renewing it is very short. So could we retry renewal requests only for tokens that were created a short time ago, for example within the last minute?
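The retry idea above could be sketched as a small client-side policy. This is purely hypothetical — none of these class or method names are Hadoop APIs: a failed renewal is retried with backoff only when the token was issued recently enough that ZooKeeper followers may simply not have synced it yet.

```java
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical sketch of the proposed retry policy: only tokens issued
 * within the last minute are treated as "possibly not yet replicated"
 * and therefore worth retrying. Illustrative names, not Hadoop code.
 */
public class RecentTokenRetryPolicy {
    private static final long RETRY_WINDOW_MS = TimeUnit.MINUTES.toMillis(1);
    private static final int MAX_RETRIES = 3;
    private static final long BASE_BACKOFF_MS = 500;

    /** Retry only while attempts remain and the token is freshly issued. */
    public boolean shouldRetry(long issueDateMs, long nowMs, int attempt) {
        return attempt < MAX_RETRIES && (nowMs - issueDateMs) <= RETRY_WINDOW_MS;
    }

    /** Exponential backoff between attempts: 500, 1000, 2000 ms. */
    public long backoffMs(int attempt) {
        return BASE_BACKOFF_MS * (1L << attempt);
    }
}
```

A renewer hitting {{InvalidToken}} would consult {{shouldRetry}} with the token's {{issueDate}} before giving up; tokens older than the window fail fast, as today.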
> RBF: Cannot get updated delegation token from zookeeper
> -------------------------------------------------------
>
>                 Key: HDFS-15756
>                 URL: https://issues.apache.org/jira/browse/HDFS-15756
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: rbf
>    Affects Versions: 3.0.0
>            Reporter: hbprotoss
>            Priority: Major
>
> Affected version: all versions with RBF.
>
> When RBF works with Spark 2.4 in client mode, there is a chance that the
> token is missing across different nodes in the RBF cluster. The root cause
> is that Spark renews the token (via the resource manager) immediately after
> getting one; since ZooKeeper does not give a strong consistency guarantee
> after an update in the cluster, a ZooKeeper client may read a stale value
> from a follower not yet synced with the other nodes.
>
> We applied a patch in Spark, but it is still a problem of RBF. Is it
> possible for RBF to replace the delegation token store with some other
> datasource (Redis, for example)?

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
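The stale-read race the description refers to can be illustrated with a toy in-memory model (illustrative only — not RBF or ZooKeeper code): the token write is acknowledged on one node, but a renewal served from a not-yet-synced replica fails with exactly the "unknown token" error shown in the log above.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a leader/follower store with delayed replication. */
public class StaleReadDemo {
    private final Map<String, String> leader = new HashMap<>();
    private final Map<String, String> follower = new HashMap<>();

    /** The write is acknowledged by the leader immediately... */
    public void storeToken(String seq, String token) {
        leader.put(seq, token);
    }

    /** ...but the follower only applies it some time later. */
    public void replicate() {
        follower.putAll(leader);
    }

    /** A renewal routed to a stale follower cannot find the token. */
    public String renewVia(String node, String seq) {
        Map<String, String> view = "leader".equals(node) ? leader : follower;
        if (view.get(seq) == null) {
            return "InvalidToken: renewal request for unknown token " + seq;
        }
        return "renewed " + seq;
    }
}
```

In this model, renewing via the follower before `replicate()` runs reproduces the failure, which is why both proposed mitigations — retrying recent tokens, or a token store with stronger read-after-write guarantees — target the window between write and sync.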