Github user parente commented on the issue:

    https://github.com/apache/spark/pull/16482

> please check the yarn container's local cache:

Without this fix, when I specify `--principal user@REALM` and `--keytab /some/path/user.keytab`, I see the following in my app staging directory on HDFS:

```
Found 7 items
-rw-r--r--   3 user supergroup         68 2017-01-06 03:59 user.keytab
-rw-r--r--   3 user supergroup      73502 2017-01-06 03:59 __spark_conf__.zip
-rw-r--r--   3 user supergroup  189767340 2017-01-06 03:59 __spark_libs__4440821503780683972.zip
-rw-r--r--   3 user supergroup      91275 2017-01-06 03:59 py4j-0.10.3-src.zip
-rw-r--r--   3 user supergroup     440385 2017-01-06 03:59 pyspark.zip
```

Notice that the keytab has not been suffixed during the remote copy. It is not clear to me how the file in your example receives the suffix when the call to `copyFileToRemote` in `Client.distribute` does not pass the destination name at all. The other calls to `copyFileToRemote`, which copy the Spark conf and libs, do pass `destName` to rename the files to the underscored versions we see above.

> Also please see the comment in HiveClientImple, it've already mentioned the problem you met.

Thanks for the pointer about this aspect. The linked JIRA issue describes the HiveClientImpl problem, but its first comment also notes that Kerberos ticket renewal is not occurring properly. I can open a separate JIRA issue explicitly about the keytab naming problem and the lack of re-ticketing, so that the two issues are tracked separately.
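
To make the naming question about `copyFileToRemote` concrete, here is a minimal sketch of the behavior described above. It is a hypothetical model, not Spark's code: `remote_file_name` and `suffixed_keytab_name` are invented for illustration, and the suffix format (a random UUID appended to the base name) is an assumption, since the comment does not state what the suffix looks like.

```python
import posixpath
import uuid
from typing import Optional


def remote_file_name(src_path: str, dest_name: Optional[str] = None) -> str:
    """Hypothetical model of the name a file gets in the staging directory.

    If the caller supplies dest_name, the copy is renamed to it; otherwise
    the copy keeps the base name of the source path.
    """
    return dest_name if dest_name is not None else posixpath.basename(src_path)


def suffixed_keytab_name(keytab_path: str) -> str:
    """Hypothetical suffixing scheme (assumption): base name plus a UUID."""
    return f"{posixpath.basename(keytab_path)}-{uuid.uuid4()}"


keytab = "/some/path/user.keytab"

# No destination name passed: the copy keeps the plain base name,
# which matches the unsuffixed entry in the staging directory listing.
print(remote_file_name(keytab))

# Destination name passed: the copy would carry the suffixed name instead.
print(remote_file_name(keytab, suffixed_keytab_name(keytab)))
```

Under this model, a copy made without a destination name can only end up as `user.keytab`, which is why the suffixed name in the other example is surprising.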