[ https://issues.apache.org/jira/browse/SPARK-33440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jungtaek Lim resolved SPARK-33440.
----------------------------------
    Fix Version/s: 3.0.2
                   3.1.0
       Resolution: Fixed

Issue resolved by pull request 30366
[https://github.com/apache/spark/pull/30366]

> Spark schedules on updating delegation token with 0 interval under some token provider implementation
> ------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-33440
>                 URL: https://issues.apache.org/jira/browse/SPARK-33440
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.0.1, 3.1.0
>            Reporter: Jungtaek Lim
>            Assignee: Jungtaek Lim
>            Priority: Major
>             Fix For: 3.1.0, 3.0.2
>
> We got a report from a customer that under specific circumstances Spark schedules delegation token updates with a 0 interval, ending up flooding log messages & putting massive request load on the token handler side.
> After investigation, the problem was that they have two delegation token identifiers: one of them (IDBS3ATokenIdentifier) reports an "issue date" of 0, whereas the other (DelegationTokenIdentifier) reports a correct value.
> Both provide the expiry time correctly via Token.renew(), and Spark assumes the issue date is "correct", hence it calculates the token renewal interval as (the result of Token.renew() - "issue date").
> {code}
> 20/10/13 06:34:19 INFO security.HadoopFSDelegationTokenProvider: Renewal interval is 1603175657000 for token S3ADelegationToken/IDBroker
> 20/10/13 06:34:19 INFO security.HadoopFSDelegationTokenProvider: Renewal interval is 86400048 for token HDFS_DELEGATION_TOKEN
> {code}
> It's still safe at this point because Spark picks the minimal value. The thing is, to calculate the next renewal timestamp, Spark adds the renewal interval to the issue date for every token and picks the minimum value, hence "86400048" is picked as the next renewal timestamp.
> This is "earlier" than now, hence the interval to schedule becomes negative (since we subtract now from it), and Spark applies a safeguard that picks the greater of 0 and the interval, so 0 is picked and token updates are scheduled endlessly. (Each schedule is one-shot, but the calculation always yields a negative value, so it is effectively an immediate re-schedule every time.)
> We should build a better safeguard, instead of just guarding against the schedule interval going negative.
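>
> A minimal sketch of the arithmetic described above (the names TokenInfo, issueDate and expiryFromRenew are illustrative, not the actual HadoopFSDelegationTokenProvider internals; the numeric values are taken from the log lines above):
> {code}
> // Illustrative Scala sketch of the scheduling arithmetic, assuming one token
> // whose identifier reports issue date = 0 and one with a sane issue date.
> object RenewalSketch {
>   final case class TokenInfo(issueDate: Long, expiryFromRenew: Long)
>
>   def main(args: Array[String]): Unit = {
>     val now = 1602570859000L // roughly 2020-10-13 06:34:19 UTC, the time of the log lines
>
>     // Token whose identifier reports issue date = 0 (IDBS3ATokenIdentifier in the report)
>     val s3aToken = TokenInfo(issueDate = 0L, expiryFromRenew = 1603175657000L)
>     // Token whose identifier reports a correct issue date (DelegationTokenIdentifier)
>     val hdfsToken = TokenInfo(issueDate = now, expiryFromRenew = now + 86400048L)
>     val tokens = Seq(s3aToken, hdfsToken)
>
>     // Renewal interval per token: result of Token.renew() minus "issue date"; the minimum is kept.
>     val renewalInterval = tokens.map(t => t.expiryFromRenew - t.issueDate).min // 86400048
>
>     // Next renewal timestamp: issue date + renewal interval, minimum over all tokens.
>     // For the token with issue date 0 this is 86400048, i.e. a timestamp in January 1970.
>     val nextRenewal = tokens.map(_.issueDate + renewalInterval).min // 86400048
>
>     // Delay until the next renewal: negative, clamped to 0 by the safeguard -> immediate re-schedule.
>     val delay = math.max(0L, nextRenewal - now) // 0
>     println(s"renewalInterval=$renewalInterval nextRenewal=$nextRenewal delay=$delay")
>   }
> }
> {code}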