Github user saturday-shi commented on the issue:

    https://github.com/apache/spark/pull/18230

@jerryshao

> "reload" here meanings retrieving back SparkConf from checkpoint file and using this retrieved SparkConf to create SparkContext when restarting streaming application.

That explanation is wrong, but your understanding of what the `propertiesToReload` list does is right. After restarting from a checkpoint, the properties in `SparkConf` are the same as in the previous application, but values such as `spark.yarn.app.id` are stale and useless in the restarted app. So after retrieving the `SparkConf` from the checkpoint, we want to "reload" fresh values from the system properties instead of using the old ones stored in the checkpoint.

@vanzin

> So if you start the second streaming application without providing principal / keytab, Client.scala will not overwrite the credential file path, but still the AM will start the credential updater, because the file location is in the configuration read from the checkpoint.

That is plausible, but it is not what happens here. I do submit the principal & keytab when restarting, and the AM does renew the tokens using the principal successfully. What I noticed is that the `SparkConf` used by `AMCredentialRenewer` and the one used by `CredentialUpdater` are NOT THE SAME. The credential renewer thread launched by the AM works correctly, but the credential updater in the executor backend, which uses the configuration provided by the driver, gets confused and fails at its job. So fixing only the AM code is not enough.

FYI, the log of `AMCredentialRenewer` looks like this:

```
17/06/07 15:11:14 INFO security.AMCredentialRenewer: Scheduling login from keytab in 96952 millis.
...
17/06/07 15:12:51 INFO security.AMCredentialRenewer: Attempting to login to KDC using principal: xxx@XXX.LOCAL
17/06/07 15:12:51 INFO security.AMCredentialRenewer: Successfully logged into KDC.
...
17/06/07 15:12:53 INFO security.AMCredentialRenewer: Writing out delegation tokens to hdfs://nameservice1/user/xxx/.sparkStaging/application_1496384469444_0036/credentials-044b83ea-b46b-4bd4-8e98-0e38928fd58c-1496816091985-1.tmp
17/06/07 15:12:53 INFO security.AMCredentialRenewer: Delegation Tokens written out successfully. Renaming file to hdfs://nameservice1/user/xxx/.sparkStaging/application_1496384469444_0036/credentials-044b83ea-b46b-4bd4-8e98-0e38928fd58c-1496816091985-1
17/06/07 15:12:53 INFO security.AMCredentialRenewer: Delegation token file rename complete.
17/06/07 15:12:53 INFO security.AMCredentialRenewer: Scheduling login from keytab in 110925 millis.
...
```

It renews the tokens successfully and saves them under application_1496384469444_0036's staging directory. But the `CredentialUpdater` (started by `YarnSparkHadoopUtil`) complains about this:

```
17/06/07 15:11:24 INFO executor.CoarseGrainedExecutorBackend: Will periodically update credentials from: hdfs://nameservice1/user/xxx/.sparkStaging/application_1496384469444_0035/credentials-19a7c11e-8c93-478c-ab0a-cdbfae5b2ae5
...
17/06/07 15:12:24 WARN yarn.YarnSparkHadoopUtil: Error while attempting to list files from application staging dir
java.io.FileNotFoundException: File hdfs://nameservice1/user/xxx/.sparkStaging/application_1496384469444_0035 does not exist.
...
```

...which shows it is still looking under application_1496384469444_0035's staging directory, which no longer exists.
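To make the "reload" semantics above concrete, here is a minimal sketch of the restore-then-refresh step: the configuration is restored from the checkpoint, and a small list of per-application keys is then overwritten with fresh values from the current submission. The key list, helper name, and plain `Map` stand-in for `SparkConf` are all illustrative; the real logic lives in Spark's `Checkpoint.scala`.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReloadSketch {
    // Illustrative list of keys that are only valid for the application
    // that wrote the checkpoint and must be refreshed on restart.
    static final List<String> PROPERTIES_TO_RELOAD =
        List.of("spark.yarn.app.id", "spark.yarn.credentials.file");

    // Restore the checkpointed conf, then overwrite the stale keys with
    // fresh values from the current submission (e.g. system properties).
    static Map<String, String> createConf(Map<String, String> checkpointed,
                                          Map<String, String> fresh) {
        Map<String, String> conf = new HashMap<>(checkpointed);
        for (String key : PROPERTIES_TO_RELOAD) {
            if (fresh.containsKey(key)) {
                conf.put(key, fresh.get(key));
            }
        }
        return conf;
    }

    public static void main(String[] args) {
        Map<String, String> checkpointed = new HashMap<>();
        checkpointed.put("spark.app.name", "my-stream");
        checkpointed.put("spark.yarn.app.id", "application_1496384469444_0035");

        Map<String, String> fresh = new HashMap<>();
        fresh.put("spark.yarn.app.id", "application_1496384469444_0036");

        Map<String, String> merged = createConf(checkpointed, fresh);
        // With the reload applied, the restarted app sees its own YARN
        // app id; without it, components reading this conf would keep
        // polling the old application's staging dir, as in the log above.
        System.out.println(merged.get("spark.yarn.app.id"));
    }
}
```

If a key is not in this reload list (or the two sides read different `SparkConf` instances, as described above), the stale checkpointed value wins, which is exactly the failure mode the `CredentialUpdater` log shows.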