Github user saturday-shi commented on the issue:

    https://github.com/apache/spark/pull/18230
  
    @jerryshao 
    > "reload" here meanings retrieving back SparkConf from checkpoint file and 
using this retrieved SparkConf to create SparkContext when restarting streaming 
application.
    
    That explanation is completely wrong, but your understanding of what the `propertiesToReload` list does is right.
    
    After restarting from a checkpoint, the properties in the `SparkConf` will be the same as in the previous application. But properties like `spark.yarn.app.id` will be stale and useless in the restarted app. So after restoring the `SparkConf` from the checkpoint, we want to "reload" fresh values for them from the system properties instead of using the old ones stored in the checkpoint.
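    
    To make that concrete, here is a minimal sketch of the "reload" step. This is my own hypothetical helper, not the actual `Checkpoint.createSparkConf` code, and the key list is purely illustrative:
    ```scala
    import org.apache.spark.SparkConf
    
    object ReloadSketch {
      // Hypothetical helper: after the SparkConf is restored from the checkpoint,
      // overwrite selected keys with the fresh values that the new submission
      // placed into the JVM system properties.
      def reloadFreshProperties(restored: SparkConf): SparkConf = {
        // Illustrative keys only; the real propertiesToReload list lives in Checkpoint.scala.
        val propertiesToReload = Seq("spark.yarn.app.id", "spark.yarn.credentials.file")
        propertiesToReload.foreach { key =>
          // Prefer the value from the current submission over the checkpointed one.
          sys.props.get(key).foreach(fresh => restored.set(key, fresh))
        }
        restored
      }
    }
    ```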
    
    @vanzin 
    > So if you start the second streaming application without providing 
principal / keytab, Client.scala will not overwrite the credential file path, 
but still the AM will start the credential updater, because the file location 
is in the configuration read from the checkpoint.
    
    That could happen, but it is not the case here: I did submit the principal & keytab when restarting, and the AM does renew the token using that principal successfully.
    
    I noticed that the `SparkConf` used by `AMCredentialRenewer` and the one used by `CredentialUpdater` are NOT THE SAME. The credential renewer thread launched by the AM works correctly, but the credential updater in the executor backend - which uses the configuration provided by the driver - gets confused and fails at its job. So fixing only the AM code doesn't help much.
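    
    A hedged sketch of what I believe the mismatch looks like; the paths are placeholders modeled on the logs below, not values taken from the code:
    ```scala
    import org.apache.spark.SparkConf
    
    object CredentialPathMismatchSketch extends App {
      // Illustration only. The AM builds its conf for the new application, so
      // the renewer writes the tokens under the new app's staging dir ...
      val amConf = new SparkConf(loadDefaults = false)
        .set("spark.yarn.credentials.file",
          "hdfs://nameservice1/user/xxx/.sparkStaging/application_NEW/credentials-NEW")
    
      // ... while the executors get their conf from the driver, which restored
      // it from the checkpoint, so it still points at the old app's staging dir.
      val executorConf = new SparkConf(loadDefaults = false)
        .set("spark.yarn.credentials.file",
          "hdfs://nameservice1/user/xxx/.sparkStaging/application_OLD/credentials-OLD")
    
      // The updater therefore polls a directory that no longer exists.
      assert(amConf.get("spark.yarn.credentials.file") !=
        executorConf.get("spark.yarn.credentials.file"))
    }
    ```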
    
    FYI, the log of `AMCredentialRenewer` looks like this:
    ```
    17/06/07 15:11:14 INFO security.AMCredentialRenewer: Scheduling login from keytab in 96952 millis.
    ...
    17/06/07 15:12:51 INFO security.AMCredentialRenewer: Attempting to login to KDC using principal: xxx@XXX.LOCAL
    17/06/07 15:12:51 INFO security.AMCredentialRenewer: Successfully logged into KDC.
    ...
    17/06/07 15:12:53 INFO security.AMCredentialRenewer: Writing out delegation tokens to hdfs://nameservice1/user/xxx/.sparkStaging/application_1496384469444_0036/credentials-044b83ea-b46b-4bd4-8e98-0e38928fd58c-1496816091985-1.tmp
    17/06/07 15:12:53 INFO security.AMCredentialRenewer: Delegation Tokens written out successfully. Renaming file to hdfs://nameservice1/user/xxx/.sparkStaging/application_1496384469444_0036/credentials-044b83ea-b46b-4bd4-8e98-0e38928fd58c-1496816091985-1
    17/06/07 15:12:53 INFO security.AMCredentialRenewer: Delegation token file rename complete.
    17/06/07 15:12:53 INFO security.AMCredentialRenewer: Scheduling login from keytab in 110925 millis.
    ...
    ```
    It renews the token successfully and saves it under application_1496384469444_0036's dir. But the `CredentialUpdater` (started by `YarnSparkHadoopUtil`) complains about this:
    ```
    17/06/07 15:11:24 INFO executor.CoarseGrainedExecutorBackend: Will periodically update credentials from: hdfs://nameservice1/user/xxx/.sparkStaging/application_1496384469444_0035/credentials-19a7c11e-8c93-478c-ab0a-cdbfae5b2ae5
    ...
    17/06/07 15:12:24 WARN yarn.YarnSparkHadoopUtil: Error while attempting to list files from application staging dir
    java.io.FileNotFoundException: File hdfs://nameservice1/user/xxx/.sparkStaging/application_1496384469444_0035 does not exist.
    ...
    ```
    ... which shows that it is still looking for the credentials file under application_1496384469444_0035's staging dir, which no longer exists.

