[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...
Github user steveloughran closed the pull request at: https://github.com/apache/spark/pull/9466 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/9466#issuecomment-153833441 @steveloughran I merged this, can you close the PR? github only closes PRs submitted against master.
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/9466#issuecomment-153833114 LGTM, merging.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9466#issuecomment-153740806 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45014/
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9466#issuecomment-153740799 Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9466#issuecomment-153740334

[Test build #45014 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45014/console) for PR 9466 at commit [`2b72de9`](https://github.com/apache/spark/commit/2b72de977bec23d840c3895291c7b8886f5822cd).
* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9466#issuecomment-153693389 [Test build #45014 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45014/consoleFull) for PR 9466 at commit [`2b72de9`](https://github.com/apache/spark/commit/2b72de977bec23d840c3895291c7b8886f5822cd).
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9466#issuecomment-153690587 Merged build started.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9466#issuecomment-153690560 Merged build triggered.
GitHub user steveloughran opened a pull request: https://github.com/apache/spark/pull/9466

[SPARK-11265] [YARN] YarnClient can't get tokens to talk to Hive in a secure…

Backport to branch-1.5 of the SPARK-11265 patch. The sole change is in `Client`, where there's no longer a probe to see if the token request is enabled; it always happens. This means that provided hive.jar is on the classpath, this will attempt to get the token, and if that fails for any reason other than CNFE, the client launch will fail.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/steveloughran/spark stevel/patches/SPARK-11265-on-branch-1.5

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/9466.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #9466

commit 2b72de977bec23d840c3895291c7b8886f5822cd
Author: Steve Loughran
Date: 2015-11-03T16:39:24Z
[SPARK-11265] YarnClient can't get tokens to talk to Hive in a secure cluster - backport to branch-1.5
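The behaviour the PR description outlines (swallow only a `ClassNotFoundException`, let any other failure abort the client launch) can be sketched in plain Scala. `fetchHiveToken` here is a hypothetical stand-in for the reflective call into the Hive classes, not Spark's actual method:

```scala
object TokenFetchSketch {
  // Hypothetical stand-in for the reflective Hive delegation-token call.
  def fetchHiveToken(hiveOnClasspath: Boolean): String =
    if (hiveOnClasspath) "hive-delegation-token"
    else throw new ClassNotFoundException("org.apache.hadoop.hive.ql.metadata.Hive")

  // Only CNFE is downgraded to "no token"; any other exception propagates
  // up and would fail the launch.
  def obtainToken(hiveOnClasspath: Boolean): Option[String] =
    try {
      Some(fetchHiveToken(hiveOnClasspath))
    } catch {
      case e: ClassNotFoundException =>
        println(s"Hive class not found $e")
        None
    }
}
```

With hive.jar absent this returns `None` after logging; any non-CNFE failure from the fetch still propagates.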
Github user steveloughran closed the pull request at: https://github.com/apache/spark/pull/9438
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/9438#issuecomment-153688927 that's what comes of trying to code at a conference; will cancel and resubmit
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9438#issuecomment-153467262 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44936/
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9438#issuecomment-153467259 Build finished. Test FAILed.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9438#issuecomment-153467150 **[Test build #44936 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44936/console)** for PR 9438 at commit [`2b72de9`](https://github.com/apache/spark/commit/2b72de977bec23d840c3895291c7b8886f5822cd) after a configured wait of `175m`.
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/9438#issuecomment-153415945 @steveloughran you have to choose the right target branch when submitting the PR. Can you close this one and open a new one with the correct target branch?
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9438#issuecomment-153414575 [Test build #44936 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44936/consoleFull) for PR 9438 at commit [`2b72de9`](https://github.com/apache/spark/commit/2b72de977bec23d840c3895291c7b8886f5822cd).
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9438#issuecomment-153413113 Build triggered.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9438#issuecomment-153413163 Build started.
GitHub user steveloughran opened a pull request: https://github.com/apache/spark/pull/9438

[SPARK-11265] [YARN] YarnClient can't get tokens to talk to Hive in a secure cluster - backport to branch-1.5

This is a backport of the [SPARK-11265] patch to branch-1.5; it won't compile against master.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/steveloughran/spark stevel/patches/SPARK-11265-on-branch-1.5

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/9438.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #9438

commit 76d920f2b814304051dd76f0ca78301e872fc811
Author: Yu ISHIKAWA
Date: 2015-08-25T07:28:51Z
[SPARK-10214] [SPARKR] [DOCS] Improve SparkR Column, DataFrame API docs
cc: shivaram
## Summary
- Add name tags to each method in DataFrame.R and column.R
- Replace `rdname column` with `rdname {each_func}`. i.e. alias method: `rdname column` => `rdname alias`
## Generated PDF File
https://drive.google.com/file/d/0B9biIZIU47lLNHN2aFpnQXlSeGs/view?usp=sharing
## JIRA
[[SPARK-10214] Improve SparkR Column, DataFrame API docs - ASF JIRA](https://issues.apache.org/jira/browse/SPARK-10214)
Author: Yu ISHIKAWA
Closes #8414 from yu-iskw/SPARK-10214.
(cherry picked from commit d4549fe58fa0d781e0e891bceff893420cb1d598)
Signed-off-by: Shivaram Venkataraman

commit 4841ebb1861025067a1108c11f64bb144427a308
Author: Sean Owen
Date: 2015-08-25T07:32:20Z
[SPARK-6196] [BUILD] Remove MapR profiles in favor of hadoop-provided
Follow up to https://github.com/apache/spark/pull/7047 pwendell mentioned that MapR should use `hadoop-provided` now, and indeed the new build script does not produce `mapr3`/`mapr4` artifacts anymore. Hence the action seems to be to remove the profiles, which are now not used. CC trystanleftwich
Author: Sean Owen
Closes #8338 from srowen/SPARK-6196.
(cherry picked from commit 57b960bf3706728513f9e089455a533f0244312e)
Signed-off-by: Sean Owen

commit 2032d66706d165079550f06bf695e0b08be7e143
Author: Tathagata Das
Date: 2015-08-25T07:35:51Z
[SPARK-10210] [STREAMING] Filter out non-existent blocks before creating BlockRDD
When the write ahead log is not enabled, a recovered streaming driver still tries to run jobs using pre-failure block ids, and fails as the blocks do not exist in memory any more (and cannot be recovered as the receiver WAL is not enabled). This occurs because the driver-side WAL of ReceivedBlockTracker recovers that past block information, and ReceiverInputDStream creates BlockRDDs even if those blocks do not exist. The solution in this PR is to filter out block ids that do not exist before creating the BlockRDD. In addition, it adds unit tests to verify other logic in ReceiverInputDStream.
Author: Tathagata Das
Closes #8405 from tdas/SPARK-10210.
(cherry picked from commit 1fc37581a52530bac5d555dbf14927a5780c3b75)
Signed-off-by: Tathagata Das

commit e5cea566a32d254adc9424a2f9e79b92eda3e6e4
Author: Davies Liu
Date: 2015-08-25T08:00:44Z
[SPARK-10177] [SQL] fix reading Timestamp in parquet from Hive
We misunderstood the Julian days and nanoseconds of the day in parquet (as TimestampType) from Hive/Impala; they overlap, so they can't be added together directly. In order to avoid confusing rounding when doing the conversion, we use `2440588` as the Julian Day of the epoch of the unix timestamp (which should be 2440587.5).
Author: Davies Liu
Author: Cheng Lian
Closes #8400 from davies/timestamp_parquet.
(cherry picked from commit 2f493f7e3924b769160a16f73cccbebf21973b91)
Signed-off-by: Cheng Lian

commit a0f22cf295a1d20814c5be6cc727e39e95a81c27
Author: Josh Rosen
Date: 2015-08-25T08:06:36Z
[SPARK-10195] [SQL] Data sources Filter should not expose internal types
Spark SQL's data sources API exposes Catalyst's internal types through its Filter interfaces. This is a problem because types like UTF8String are not stable developer APIs and should not be exposed to third parties. This issue caused incompatibilities when upgrading our `spark-redshift` library to work against Spark 1.5.0. To avoid these issues in the future we should only expose public types through these Filter objects. This patch accomplishes this by using CatalystTypeConverters to add the appropriate conversions.
Author: Josh Rosen
Closes #8403 from JoshRosen/datasources-internal-vs-external-types.
(cherry picked from commit 7bc9a8c6249300ded31ea931c463d0a8f798e193)
Signed-off-by: Reynold Xin

commit 73f1dd1b5acf1c6c37045da25902d7ca5
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43587708

--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,76 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
     val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
     ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be based on this
+   * @return a token, or `None` if there's no need for a token (no metastore URI or principal
+   *         in the config), or if a binding exception was caught and downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = {
+    try {
+      obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName)
+    } catch {
+      case e: ClassNotFoundException =>
+        logInfo(s"Hive class not found $e")
+        logDebug("Hive class not found", e)
--- End diff --

They're not exactly the same.
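For context on the "not exactly the same" reply: the info-level call interpolates only the exception's `toString` into the message, while the debug-level call passes the throwable itself, so the full stack trace is emitted only when debug logging is on. A minimal sketch with hypothetical stand-ins for the `Logging` trait's methods, not Spark's actual implementation:

```scala
object DoubleLogSketch {
  // Hypothetical stand-ins for the Logging trait's methods.
  def logInfo(msg: String): Unit = println(s"INFO $msg")
  def logDebug(msg: String, e: Throwable): Unit = {
    println(s"DEBUG $msg")
    e.printStackTrace() // the stack trace appears only on this path
  }

  def main(args: Array[String]): Unit = {
    val e = new ClassNotFoundException("org.apache.hadoop.hive.ql.metadata.Hive")
    logInfo(s"Hive class not found $e") // one line: message + exception toString
    logDebug("Hive class not found", e) // message plus the full stack trace
  }
}
```

So the two lines serve different audiences: a terse one-liner for normal operation, the full trace for debugging.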
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-152864441 > Same JIRA or a new backport one? You could use the same one; I marked it as resolved but it's ok to reopen it for the backport.
Github user tedyu commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43583071

--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,76 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
     val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
     ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be based on this
+   * @return a token, or `None` if there's no need for a token (no metastore URI or principal
+   *         in the config), or if a binding exception was caught and downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = {
+    try {
+      obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName)
+    } catch {
+      case e: ClassNotFoundException =>
+        logInfo(s"Hive class not found $e")
+        logDebug("Hive class not found", e)
--- End diff --

Why double log the message?
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-152815974 Will do. Same JIRA or a new backport one?
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-152786237 Hi @steveloughran, this patch doesn't merge cleanly to branch-1.5. If you want to apply it there, could you send a new pr for that branch? Thanks.
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/9232
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-152785103 Merging to master.
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43577485

--- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala ---
@@ -245,4 +247,31 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite with Matchers with Logging
     System.clearProperty("SPARK_YARN_MODE")
   }
 }
+
+  test("Obtain tokens For HiveMetastore") {
+    val hadoopConf = new Configuration()
+    hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+    // thrift picks up on port 0 and bails out, without trying to talk to endpoint
+    hadoopConf.set("hive.metastore.uris", "http://localhost:0")
+    val util = new YarnSparkHadoopUtil
--- End diff --

nah, it's ok to leave as is.
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-152781530 Latest patch LGTM.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-152497070 Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-152496951

**[Test build #44676 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44676/consoleFull)** for PR 9232 at commit [`ebb2b5a`](https://github.com/apache/spark/commit/ebb2b5abdf26c9e0a72452a47f8cd23b09e4339c).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `logInfo(s"Hive class not found $e")`
  * `logDebug("Hive class not found", e)`
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-152497071 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44676/
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-152490177 **[Test build #44676 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44676/consoleFull)** for PR 9232 at commit [`ebb2b5a`](https://github.com/apache/spark/commit/ebb2b5abdf26c9e0a72452a47f8cd23b09e4339c).
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-152489542 Merged build triggered.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-152489569 Merged build started.
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43489033
--- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala ---
@@ -245,4 +247,28 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite with Matchers with Logging
     System.clearProperty("SPARK_YARN_MODE")
   }
 }
+
+  test("Obtain tokens For HiveMetastore") {
+    val hadoopConf = new Configuration()
+    hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+    // thrift picks up on port 0 and bails out, without trying to talk to endpoint
+    hadoopConf.set("hive.metastore.uris", "http://localhost:0")
+    val util = new YarnSparkHadoopUtil
+    val e = intercept[InvocationTargetException] {
+      val token = util.obtainTokenForHiveMetastoreInner(hadoopConf, "alice")
+      fail(s"Expected an exception, got the token $token")
+    }
+    val inner = e.getCause
+    if (inner == null) {
+      fail("No inner cause", e)
+    }
+    if (!inner.isInstanceOf[HiveException]) {
+      fail(s"Not a hive exception", inner)
--- End diff --
done
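The test above relies on a detail worth making explicit: an exception thrown inside a reflective `Method.invoke` reaches the caller wrapped in `InvocationTargetException`, and the real failure is only available via `getCause`. A minimal, self-contained sketch of that behaviour (using a hypothetical `Failing` class with `IllegalStateException` standing in for `HiveException`):

```scala
import java.lang.reflect.InvocationTargetException

object WrappedCauseSketch {
  class Failing {
    // stands in for Hive.getDelegationToken failing to reach a metastore
    def boom(): Unit = throw new IllegalStateException("no metastore at this URI")
  }

  def main(args: Array[String]): Unit = {
    val m = classOf[Failing].getMethod("boom")
    try {
      m.invoke(new Failing)
    } catch {
      case e: InvocationTargetException =>
        // the interesting exception is the cause, which is why the test
        // intercepts InvocationTargetException and then inspects getCause
        println(e.getCause.getClass.getSimpleName)
    }
  }
}
```

This is why the test intercepts `InvocationTargetException` rather than `HiveException` directly, then asserts on the inner cause.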
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43488997
--- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala ---
@@ -245,4 +247,31 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite with Matchers with Logging
     System.clearProperty("SPARK_YARN_MODE")
   }
 }
+
+  test("Obtain tokens For HiveMetastore") {
+    val hadoopConf = new Configuration()
+    hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+    // thrift picks up on port 0 and bails out, without trying to talk to endpoint
+    hadoopConf.set("hive.metastore.uris", "http://localhost:0")
+    val util = new YarnSparkHadoopUtil
--- End diff --
all the other tests do the same; if you want a switch it may as well be across the suite
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43488942
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,81 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
     val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
     ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be based on this
+   * @return a token, or `None` if there's no need for a token (no metastore URI or principal
+   *         in the config), or if a binding exception was caught and downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = {
+    try {
+      obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName)
+    } catch {
+      case e: ClassNotFoundException =>
+        logInfo(s"Hive class not found $e")
+        logDebug("Hive class not found", e)
+        None
+      case t: Throwable =>
+        throw t
+    }
+  }
+
+  /**
+   * Inner routine to obtain a token for the Hive metastore; exceptions are raised on any problem.
+   * @param conf hadoop configuration; the Hive configuration will be based on this.
+   * @param username the username of the principal requesting the delegation token.
+   * @return a delegation token
+   */
+  private[yarn] def obtainTokenForHiveMetastoreInner(conf: Configuration,
+      username: String): Option[Token[DelegationTokenIdentifier]] = {
+    val mirror = universe.runtimeMirror(Utils.getContextOrSparkClassLoader)
+
+    // the hive configuration class is a subclass of Hadoop Configuration, so can be cast down
+    // to a Configuration and used without reflection
+    val hiveConfClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf")
+    // using the (Configuration, Class) constructor allows the current configuration to be
+    // included in the hive config.
+    val ctor = hiveConfClass.getDeclaredConstructor(classOf[Configuration],
+      classOf[Object].getClass)
+    val hiveConf = ctor.newInstance(conf, hiveConfClass).asInstanceOf[Configuration]
+    val metastoreUri = hiveConf.getTrimmed("hive.metastore.uris", "")
+
+    // Check for local metastore
+    if (metastoreUri.nonEmpty) {
+      require(username.nonEmpty, "Username undefined")
+      val principalKey = "hive.metastore.kerberos.principal"
+      val principal = hiveConf.getTrimmed(principalKey, "")
+      require(principal.nonEmpty, s"Hive principal $principalKey undefined")
+      logDebug(s"Getting Hive delegation token for $username against $principal at $metastoreUri")
+      val hiveClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.ql.metadata.Hive")
+      val closeCurrent = hiveClass.getMethod("closeCurrent")
+      try {
+        // get all the instance methods before invoking any
+        val getDelegationToken = hiveClass.getMethod("getDelegationToken",
+          classOf[String], classOf[String])
+        val getHive = hiveClass.getMethod("get", hiveConfClass)
+
+        // invoke
+        val hive = getHive.invoke(null, hiveConf)
+        val tokenStr = getDelegationToken.invoke(hive, username, principal)
+          .asInstanceOf[java.lang.String]
+        val hive2Token = new Token[DelegationTokenIdentifier]()
+        hive2Token.decodeFromUrlString(tokenStr)
+        Some(hive2Token)
+      } finally {
+        try {
+          closeCurrent.invoke(null)
+        } catch {
+          case e: Exception => logWarning("In Hive.closeCurrent()", e)
--- End diff --
`Utils.tryLogNonFatalError` looks cleaner; switching
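The diff above uses one pattern throughout: load a class by name, resolve a specific constructor, instantiate, then resolve and invoke methods reflectively, so that Hive classes are only required at runtime when a metastore token is actually needed. A minimal, self-contained sketch of that pattern, with `java.lang.StringBuilder` standing in for the possibly-absent `HiveConf`/`Hive` classes (the helper name `invokeByName` is hypothetical):

```scala
object ReflectionSketch {
  def invokeByName(className: String, methodName: String, ctorArg: String): String = {
    // like mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf")
    val cls = Class.forName(className)
    // pick a specific constructor, like the (Configuration, Class) one above
    val ctor = cls.getDeclaredConstructor(classOf[String])
    val instance = ctor.newInstance(ctorArg).asInstanceOf[AnyRef]
    // resolve the method before invoking it, as the diff does with getDelegationToken
    val m = cls.getMethod(methodName)
    m.invoke(instance).toString
  }

  def main(args: Array[String]): Unit = {
    // StringBuilder stands in for a class that may be missing from the classpath;
    // that possibility is exactly why reflection is used instead of a direct call
    println(invokeByName("java.lang.StringBuilder", "reverse", "abc"))
  }
}
```

If the class is absent, `Class.forName` throws `ClassNotFoundException`, which the outer `obtainTokenForHiveMetastore` downgrades to a log message.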
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43488794
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,81 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
     val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
     ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be based on this
+   * @return a token, or `None` if there's no need for a token (no metastore URI or principal
+   *         in the config), or if a binding exception was caught and downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = {
+    try {
+      obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName)
+    } catch {
+      case e: ClassNotFoundException =>
+        logInfo(s"Hive class not found $e")
+        logDebug("Hive class not found", e)
+        None
+      case t: Throwable =>
+        throw t
+    }
+  }
+
+  /**
+   * Inner routine to obtain a token for the Hive metastore; exceptions are raised on any problem.
+   * @param conf hadoop configuration; the Hive configuration will be based on this.
+   * @param username the username of the principal requesting the delegation token.
+   * @return a delegation token
+   */
+  private[yarn] def obtainTokenForHiveMetastoreInner(conf: Configuration,
+      username: String): Option[Token[DelegationTokenIdentifier]] = {
+    val mirror = universe.runtimeMirror(Utils.getContextOrSparkClassLoader)
+
+    // the hive configuration class is a subclass of Hadoop Configuration, so can be cast down
+    // to a Configuration and used without reflection
+    val hiveConfClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf")
+    // using the (Configuration, Class) constructor allows the current configuration to be
+    // included in the hive config.
+    val ctor = hiveConfClass.getDeclaredConstructor(classOf[Configuration],
+      classOf[Object].getClass)
+    val hiveConf = ctor.newInstance(conf, hiveConfClass).asInstanceOf[Configuration]
+    val metastoreUri = hiveConf.getTrimmed("hive.metastore.uris", "")
+
+    // Check for local metastore
+    if (metastoreUri.nonEmpty) {
+      require(username.nonEmpty, "Username undefined")
+      val principalKey = "hive.metastore.kerberos.principal"
+      val principal = hiveConf.getTrimmed(principalKey, "")
+      require(principal.nonEmpty, s"Hive principal $principalKey undefined")
+      logDebug(s"Getting Hive delegation token for $username against $principal at $metastoreUri")
+      val hiveClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.ql.metadata.Hive")
+      val closeCurrent = hiveClass.getMethod("closeCurrent")
+      try {
+        // get all the instance methods before invoking any
+        val getDelegationToken = hiveClass.getMethod("getDelegationToken",
+          classOf[String], classOf[String])
+        val getHive = hiveClass.getMethod("get", hiveConfClass)
+
+        // invoke
+        val hive = getHive.invoke(null, hiveConf)
+        val tokenStr = getDelegationToken.invoke(hive, username, principal)
+          .asInstanceOf[java.lang.String]
--- End diff --
copied from the original...cut that and joined the lines
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43488706
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -1337,55 +1337,9 @@ object Client extends Logging {
       conf: Configuration,
       credentials: Credentials) {
     if (shouldGetTokens(sparkConf, "hive") && UserGroupInformation.isSecurityEnabled) {
-      val mirror = universe.runtimeMirror(getClass.getClassLoader)
-
-      try {
-        val hiveConfClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf")
-        val hiveConf = hiveConfClass.newInstance()
-
-        val hiveConfGet = (param: String) => Option(hiveConfClass
-          .getMethod("get", classOf[java.lang.String])
-          .invoke(hiveConf, param))
-
-        val metastore_uri = hiveConfGet("hive.metastore.uris")
-
-        // Check for local metastore
-        if (metastore_uri != None && metastore_uri.get.toString.size > 0) {
-          val hiveClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.ql.metadata.Hive")
-          val hive = hiveClass.getMethod("get").invoke(null, hiveConf.asInstanceOf[Object])
-
-          val metastore_kerberos_principal_conf_var = mirror.classLoader
-            .loadClass("org.apache.hadoop.hive.conf.HiveConf$ConfVars")
-            .getField("METASTORE_KERBEROS_PRINCIPAL").get("varname").toString
-
-          val principal = hiveConfGet(metastore_kerberos_principal_conf_var)
-
-          val username = Option(UserGroupInformation.getCurrentUser().getUserName)
-          if (principal != None && username != None) {
-            val tokenStr = hiveClass.getMethod("getDelegationToken",
-              classOf[java.lang.String], classOf[java.lang.String])
-              .invoke(hive, username.get, principal.get).asInstanceOf[java.lang.String]
-
-            val hive2Token = new Token[DelegationTokenIdentifier]()
-            hive2Token.decodeFromUrlString(tokenStr)
-            credentials.addToken(new Text("hive.server2.delegation.token"), hive2Token)
-            logDebug("Added hive.Server2.delegation.token to conf.")
-            hiveClass.getMethod("closeCurrent").invoke(null)
-          } else {
-            logError("Username or principal == NULL")
-            logError(s"""username=${username.getOrElse("(NULL)")}""")
-            logError(s"""principal=${principal.getOrElse("(NULL)")}""")
-            throw new IllegalArgumentException("username and/or principal is equal to null!")
-          }
-        } else {
-          logDebug("HiveMetaStore configured in localmode")
-        }
-      } catch {
-        case e: java.lang.NoSuchMethodException => { logInfo("Hive Method not found " + e); return }
-        case e: java.lang.ClassNotFoundException => { logInfo("Hive Class not found " + e); return }
-        case e: Exception => { logError("Unexpected Exception " + e)
-          throw new RuntimeException("Unexpected exception", e)
-        }
+      val util = new YarnSparkHadoopUtil()
--- End diff --
done
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43488737
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,81 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
     val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
     ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be based on this
+   * @return a token, or `None` if there's no need for a token (no metastore URI or principal
+   *         in the config), or if a binding exception was caught and downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = {
+    try {
+      obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName)
+    } catch {
+      case e: ClassNotFoundException =>
+        logInfo(s"Hive class not found $e")
+        logDebug("Hive class not found", e)
+        None
+      case t: Throwable =>
--- End diff --
OK
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-152453057 Just some minor things left to clean up, otherwise looks ok.
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43478640
--- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala ---
@@ -245,4 +247,28 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite with Matchers with Logging
     System.clearProperty("SPARK_YARN_MODE")
   }
 }
+
+  test("Obtain tokens For HiveMetastore") {
+    val hadoopConf = new Configuration()
+    hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+    // thrift picks up on port 0 and bails out, without trying to talk to endpoint
+    hadoopConf.set("hive.metastore.uris", "http://localhost:0")
+    val util = new YarnSparkHadoopUtil
+    val e = intercept[InvocationTargetException] {
+      val token = util.obtainTokenForHiveMetastoreInner(hadoopConf, "alice")
+      fail(s"Expected an exception, got the token $token")
+    }
+    val inner = e.getCause
+    if (inner == null) {
+      fail("No inner cause", e)
+    }
+    if (!inner.isInstanceOf[HiveException]) {
+      fail(s"Not a hive exception", inner)
--- End diff --
nit: nothing to interpolate, can drop the `s`.
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43478556
--- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala ---
@@ -245,4 +247,31 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite with Matchers with Logging
     System.clearProperty("SPARK_YARN_MODE")
   }
 }
+
+  test("Obtain tokens For HiveMetastore") {
+    val hadoopConf = new Configuration()
+    hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+    // thrift picks up on port 0 and bails out, without trying to talk to endpoint
+    hadoopConf.set("hive.metastore.uris", "http://localhost:0")
+    val util = new YarnSparkHadoopUtil
--- End diff --
`YarnSparkHadoopUtil.get`
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43478531
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,81 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
     val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
     ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be based on this
+   * @return a token, or `None` if there's no need for a token (no metastore URI or principal
+   *         in the config), or if a binding exception was caught and downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = {
+    try {
+      obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName)
+    } catch {
+      case e: ClassNotFoundException =>
+        logInfo(s"Hive class not found $e")
+        logDebug("Hive class not found", e)
+        None
+      case t: Throwable =>
+        throw t
+    }
+  }
+
+  /**
+   * Inner routine to obtain a token for the Hive metastore; exceptions are raised on any problem.
+   * @param conf hadoop configuration; the Hive configuration will be based on this.
+   * @param username the username of the principal requesting the delegation token.
+   * @return a delegation token
+   */
+  private[yarn] def obtainTokenForHiveMetastoreInner(conf: Configuration,
+      username: String): Option[Token[DelegationTokenIdentifier]] = {
+    val mirror = universe.runtimeMirror(Utils.getContextOrSparkClassLoader)
+
+    // the hive configuration class is a subclass of Hadoop Configuration, so can be cast down
+    // to a Configuration and used without reflection
+    val hiveConfClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf")
+    // using the (Configuration, Class) constructor allows the current configuration to be
+    // included in the hive config.
+    val ctor = hiveConfClass.getDeclaredConstructor(classOf[Configuration],
+      classOf[Object].getClass)
+    val hiveConf = ctor.newInstance(conf, hiveConfClass).asInstanceOf[Configuration]
+    val metastoreUri = hiveConf.getTrimmed("hive.metastore.uris", "")
+
+    // Check for local metastore
+    if (metastoreUri.nonEmpty) {
+      require(username.nonEmpty, "Username undefined")
+      val principalKey = "hive.metastore.kerberos.principal"
+      val principal = hiveConf.getTrimmed(principalKey, "")
+      require(principal.nonEmpty, s"Hive principal $principalKey undefined")
+      logDebug(s"Getting Hive delegation token for $username against $principal at $metastoreUri")
+      val hiveClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.ql.metadata.Hive")
+      val closeCurrent = hiveClass.getMethod("closeCurrent")
+      try {
+        // get all the instance methods before invoking any
+        val getDelegationToken = hiveClass.getMethod("getDelegationToken",
+          classOf[String], classOf[String])
+        val getHive = hiveClass.getMethod("get", hiveConfClass)
+
+        // invoke
+        val hive = getHive.invoke(null, hiveConf)
+        val tokenStr = getDelegationToken.invoke(hive, username, principal)
+          .asInstanceOf[java.lang.String]
+        val hive2Token = new Token[DelegationTokenIdentifier]()
+        hive2Token.decodeFromUrlString(tokenStr)
+        Some(hive2Token)
+      } finally {
+        try {
+          closeCurrent.invoke(null)
+        } catch {
+          case e: Exception => logWarning("In Hive.closeCurrent()", e)
--- End diff --
minor: you could use `Utils.tryLogNonFatalError`, although that uses `logError`. Should be fine here, though, since this shouldn't really happen normally.
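The suggestion above replaces an inline try/catch around cleanup with a helper. A minimal sketch of the assumed shape of such a `tryLogNonFatalError`-style helper (the `println` stands in for Spark's logging; the real helper lives in `org.apache.spark.util.Utils`): run a cleanup block and log rather than propagate non-fatal failures, so a failing `Hive.closeCurrent()` cannot mask the token already obtained.

```scala
import scala.util.control.NonFatal

object CleanupSketch {
  // assumed shape: execute the block; swallow and log non-fatal errors only,
  // letting fatal ones (OutOfMemoryError etc.) still propagate
  def tryLogNonFatalError(block: => Unit): Unit = {
    try {
      block
    } catch {
      case NonFatal(e) => println(s"non-fatal error ignored during cleanup: $e")
    }
  }

  def main(args: Array[String]): Unit = {
    tryLogNonFatalError(throw new RuntimeException("close failed"))
    println("continued past cleanup failure")
  }
}
```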
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43478447
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,81 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
     val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
     ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be based on this
+   * @return a token, or `None` if there's no need for a token (no metastore URI or principal
+   *         in the config), or if a binding exception was caught and downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = {
+    try {
+      obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName)
+    } catch {
+      case e: ClassNotFoundException =>
+        logInfo(s"Hive class not found $e")
+        logDebug("Hive class not found", e)
+        None
+      case t: Throwable =>
+        throw t
+    }
+  }
+
+  /**
+   * Inner routine to obtain a token for the Hive metastore; exceptions are raised on any problem.
+   * @param conf hadoop configuration; the Hive configuration will be based on this.
+   * @param username the username of the principal requesting the delegation token.
+   * @return a delegation token
+   */
+  private[yarn] def obtainTokenForHiveMetastoreInner(conf: Configuration,
+      username: String): Option[Token[DelegationTokenIdentifier]] = {
+    val mirror = universe.runtimeMirror(Utils.getContextOrSparkClassLoader)
+
+    // the hive configuration class is a subclass of Hadoop Configuration, so can be cast down
+    // to a Configuration and used without reflection
+    val hiveConfClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf")
+    // using the (Configuration, Class) constructor allows the current configuration to be
+    // included in the hive config.
+    val ctor = hiveConfClass.getDeclaredConstructor(classOf[Configuration],
+      classOf[Object].getClass)
+    val hiveConf = ctor.newInstance(conf, hiveConfClass).asInstanceOf[Configuration]
+    val metastoreUri = hiveConf.getTrimmed("hive.metastore.uris", "")
+
+    // Check for local metastore
+    if (metastoreUri.nonEmpty) {
+      require(username.nonEmpty, "Username undefined")
+      val principalKey = "hive.metastore.kerberos.principal"
+      val principal = hiveConf.getTrimmed(principalKey, "")
+      require(principal.nonEmpty, s"Hive principal $principalKey undefined")
+      logDebug(s"Getting Hive delegation token for $username against $principal at $metastoreUri")
+      val hiveClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.ql.metadata.Hive")
+      val closeCurrent = hiveClass.getMethod("closeCurrent")
+      try {
+        // get all the instance methods before invoking any
+        val getDelegationToken = hiveClass.getMethod("getDelegationToken",
+          classOf[String], classOf[String])
+        val getHive = hiveClass.getMethod("get", hiveConfClass)
+
+        // invoke
+        val hive = getHive.invoke(null, hiveConf)
+        val tokenStr = getDelegationToken.invoke(hive, username, principal)
+          .asInstanceOf[java.lang.String]
--- End diff --
minor: is `java.lang.` necessary here?
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43478397
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,81 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
     val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
     ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be based on this
+   * @return a token, or `None` if there's no need for a token (no metastore URI or principal
+   *         in the config), or if a binding exception was caught and downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = {
+    try {
+      obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName)
+    } catch {
+      case e: ClassNotFoundException =>
+        logInfo(s"Hive class not found $e")
+        logDebug("Hive class not found", e)
+        None
+      case t: Throwable =>
--- End diff --
This is not needed...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43478370

--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -1337,55 +1337,9 @@ object Client extends Logging {
       conf: Configuration,
       credentials: Credentials) {
     if (shouldGetTokens(sparkConf, "hive") && UserGroupInformation.isSecurityEnabled) {
-      val mirror = universe.runtimeMirror(getClass.getClassLoader)
-
-      try {
-        val hiveConfClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf")
-        val hiveConf = hiveConfClass.newInstance()
-
-        val hiveConfGet = (param: String) => Option(hiveConfClass
-          .getMethod("get", classOf[java.lang.String])
-          .invoke(hiveConf, param))
-
-        val metastore_uri = hiveConfGet("hive.metastore.uris")
-
-        // Check for local metastore
-        if (metastore_uri != None && metastore_uri.get.toString.size > 0) {
-          val hiveClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.ql.metadata.Hive")
-          val hive = hiveClass.getMethod("get").invoke(null, hiveConf.asInstanceOf[Object])
-
-          val metastore_kerberos_principal_conf_var = mirror.classLoader
-            .loadClass("org.apache.hadoop.hive.conf.HiveConf$ConfVars")
-            .getField("METASTORE_KERBEROS_PRINCIPAL").get("varname").toString
-
-          val principal = hiveConfGet(metastore_kerberos_principal_conf_var)
-
-          val username = Option(UserGroupInformation.getCurrentUser().getUserName)
-          if (principal != None && username != None) {
-            val tokenStr = hiveClass.getMethod("getDelegationToken",
-              classOf[java.lang.String], classOf[java.lang.String])
-              .invoke(hive, username.get, principal.get).asInstanceOf[java.lang.String]
-
-            val hive2Token = new Token[DelegationTokenIdentifier]()
-            hive2Token.decodeFromUrlString(tokenStr)
-            credentials.addToken(new Text("hive.server2.delegation.token"), hive2Token)
-            logDebug("Added hive.Server2.delegation.token to conf.")
-            hiveClass.getMethod("closeCurrent").invoke(null)
-          } else {
-            logError("Username or principal == NULL")
-            logError(s"""username=${username.getOrElse("(NULL)")}""")
-            logError(s"""principal=${principal.getOrElse("(NULL)")}""")
-            throw new IllegalArgumentException("username and/or principal is equal to null!")
-          }
-        } else {
-          logDebug("HiveMetaStore configured in localmode")
-        }
-      } catch {
-        case e: java.lang.NoSuchMethodException => { logInfo("Hive Method not found " + e); return }
-        case e: java.lang.ClassNotFoundException => { logInfo("Hive Class not found " + e); return }
-        case e: Exception => { logError("Unexpected Exception " + e)
-          throw new RuntimeException("Unexpected exception", e)
-        }
+      val util = new YarnSparkHadoopUtil()
--- End diff --

This should be `YarnSparkHadoopUtil.get`; you could also avoid the `val` altogether.

--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
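The removed block above drives Hive entirely through reflection so that Spark needs no compile-time Hive dependency. A hypothetical, minimal sketch of the same load-then-invoke pattern, using `java.util.Properties` as a stand-in for `org.apache.hadoop.hive.conf.HiveConf` (which is not assumed to be on the classpath here); the metastore URI value is illustrative:

```scala
// Hypothetical sketch of the load-then-invoke pattern used in the diff above.
// java.util.Properties stands in for org.apache.hadoop.hive.conf.HiveConf.
def reflectiveGet(className: String, key: String): Option[String] =
  try {
    val cls = ClassLoader.getSystemClassLoader.loadClass(className)
    val instance = cls.getConstructor().newInstance()
    // seed a value so there is something to read back
    cls.getMethod("setProperty", classOf[String], classOf[String])
      .invoke(instance, key, "thrift://example:9083")
    Option(cls.getMethod("getProperty", classOf[String])
      .invoke(instance, key).asInstanceOf[String])
  } catch {
    // missing class or method means the service isn't present: skip, don't fail
    case _: ClassNotFoundException | _: NoSuchMethodException => None
  }

println(reflectiveGet("java.util.Properties", "hive.metastore.uris"))
```

The catch clause mirrors the removed code's behaviour of downgrading `ClassNotFoundException`/`NoSuchMethodException` to a non-fatal skip.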
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-152334259 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44637/ Test PASSed.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-152334258 Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-152334119 **[Test build #44637 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44637/consoleFull)** for PR 9232 at commit [`dd8dea9`](https://github.com/apache/spark/commit/dd8dea926cec22853fb665c83f13c78a4128c20a).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `logInfo(s"Hive class not found $e")`
  * `logDebug("Hive class not found", e)`
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-152328362 **[Test build #44637 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44637/consoleFull)** for PR 9232 at commit [`dd8dea9`](https://github.com/apache/spark/commit/dd8dea926cec22853fb665c83f13c78a4128c20a).
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-152325087 Merged build started.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-152325046 Merged build triggered.
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43439849

--- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala ---
@@ -245,4 +247,55 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite with Matchers with Logging
     System.clearProperty("SPARK_YARN_MODE")
   }
 }
+
+  test("Obtain tokens For HiveMetastore") {
+    val hadoopConf = new Configuration()
+    hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+    // thrift picks up on port 0 and bails out, without trying to talk to endpoint
+    hadoopConf.set("hive.metastore.uris", "http://localhost:0")
+    val util = new YarnSparkHadoopUtil
+    val e = intercept[InvocationTargetException] {
--- End diff --

done
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43369014

--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -1337,55 +1337,9 @@ object Client extends Logging {
       conf: Configuration,
       credentials: Credentials) {
     if (shouldGetTokens(sparkConf, "hive") && UserGroupInformation.isSecurityEnabled) {
-      val mirror = universe.runtimeMirror(getClass.getClassLoader)
--- End diff --

this should go to `utils.getContextOrSparkClassLoader()`; notable that scalastyle doesn't pick up on this, even though it rejects `Class.forName()` since SPARK-8962
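The `Utils.getContextOrSparkClassLoader` helper referenced here prefers the thread's context class loader and falls back to the loader that loaded Spark itself. A simplified stand-in sketch of that fallback (the method name and body below are assumptions, not Spark's actual implementation):

```scala
// Simplified stand-in for Spark's Utils.getContextOrSparkClassLoader:
// prefer the thread's context class loader, fall back to a default loader.
def getContextOrDefaultClassLoader: ClassLoader =
  Option(Thread.currentThread().getContextClassLoader)
    .getOrElse(ClassLoader.getSystemClassLoader)

// Reflection then goes through this loader rather than a fixed one.
val loaded = getContextOrDefaultClassLoader.loadClass("java.lang.String")
println(loaded.getName)
```

Routing class loading through a single helper like this is also what lets a style checker flag stray `Class.forName()` calls.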
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43368611

--- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala ---
@@ -245,4 +247,55 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite with Matchers with Logging
     System.clearProperty("SPARK_YARN_MODE")
   }
 }
+
+  test("Obtain tokens For HiveMetastore") {
+    val hadoopConf = new Configuration()
+    hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+    // thrift picks up on port 0 and bails out, without trying to talk to endpoint
+    hadoopConf.set("hive.metastore.uris", "http://localhost:0")
+    val util = new YarnSparkHadoopUtil
+    val e = intercept[InvocationTargetException] {
+      util.obtainTokenForHiveMetastoreInner(hadoopConf, "alice")
+    }
+    assertNestedHiveException(e)
+    // expect exception trapping code to unwind this hive-side exception
+    assertNestedHiveException(intercept[InvocationTargetException] {
+      util.obtainTokenForHiveMetastore(hadoopConf)
+    })
+  }
+
+  def assertNestedHiveException(e: InvocationTargetException): Throwable = {
+    val inner = e.getCause
+    if (inner == null) {
+      fail("No inner cause", e)
+    }
+    if (!inner.isInstanceOf[HiveException]) {
+      fail(s"Not a hive exception", inner)
+    }
+    inner
+  }
+
+  test("handleTokenIntrospectionFailure") {
+    val util = new YarnSparkHadoopUtil
+    // downgraded exceptions
+    util.handleTokenIntrospectionFailure("hive", new ClassNotFoundException("cnfe"))
--- End diff --

I'd thought about that purer option; it's easily testable too. Given the policy is so simple now, I'll just pull the catch handler into place and replicate it in the fixed HBase code afterwards; it's simple enough that a review should suffice.
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43336052

--- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala ---
+    util.handleTokenIntrospectionFailure("hive", new ClassNotFoundException("cnfe"))
--- End diff --

Or yet another option is to have the method that handles exceptions take a closure, instead of the current approach of a method that matches on an exception parameter. e.g.:

    def tryToGetTokens(service: String)(fn: () => Option[Token]): Option[Token] = {
      try {
        fn()
      } catch {
        ...
      }
    }

    def obtainTokenForHiveMetastore... = {
      tryToGetTokens("Hive") {
        obtainTokenForHiveMetastoreInner(...)
      }
    }

I mostly dislike that exception handling feels like it's scattered around. You have a catch block in one place, then match on the exception somewhere else; it makes it hard to see what's really being done in one look.
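The closure-based wrapper suggested above can be sketched as a runnable, simplified example. Assumptions: `String` stands in for the token type, the ignorable policy is reduced to `ClassNotFoundException`, and the parameter is by-name rather than `() => ...` so the call-site block compiles as written:

```scala
// Sketch of the closure-based wrapper suggested above (simplified types).
def tryToGetToken(service: String)(fn: => Option[String]): Option[String] =
  try {
    fn
  } catch {
    // class missing => service not on the classpath; log and skip
    case e: ClassNotFoundException =>
      println(s"$service class not found: $e")
      None
  }

// Call sites stay one-liners; the catch policy lives in a single place.
val ok = tryToGetToken("Hive") { Some("token-bytes") }
val skipped = tryToGetToken("Hive") { throw new ClassNotFoundException("HiveConf") }
```

This keeps the try/catch and the match in one method, addressing the "scattered around" objection.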
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43266858

--- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala ---
+    util.handleTokenIntrospectionFailure("hive", new ClassNotFoundException("cnfe"))
--- End diff --

BTW, if you really want to implement a shared policy, I'd recommend adding something like `scala.util.control.NonFatal`. That makes the exception handling cleaner; it would look more like this:

    try {
      // code that can throw
    } catch {
      case IgnorableException(e) => logDebug(...)
    }
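The `IgnorableException(e)` pattern above refers to a custom extractor object in the style of `scala.util.control.NonFatal`. A hedged sketch of such an extractor (the name and the set of ignorable exceptions are illustrative assumptions, not Spark code):

```scala
// Illustrative extractor object, modeled on scala.util.control.NonFatal.
object IgnorableException {
  def unapply(t: Throwable): Option[Throwable] = t match {
    case _: ClassNotFoundException | _: NoSuchMethodException => Some(t)
    case _ => None
  }
}

// The extractor can then be used directly in catch clauses or matches.
def classify(t: Throwable): String = t match {
  case IgnorableException(e) => s"ignored: ${e.getMessage}" // would be a logDebug
  case other => s"fatal: ${other.getMessage}"
}
```

The policy lives in one `unapply`, so every catch clause that uses the extractor stays in sync automatically.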
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43251011

--- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala ---
+    util.handleTokenIntrospectionFailure("hive", new ClassNotFoundException("cnfe"))
--- End diff --

I think that because there's really only one exception that's currently interesting, you need more code to implement this "shared policy" approach than just catching the one interesting exception at each call site. It's true that if you need to modify the policy you'd need to duplicate code (or switch to your current approach), but do you envision needing to do that? What if the policy for each service needs to be different?

Personally I think the current approach is a little confusing for someone reading the code (and inconsistent; for example, the current code catches `Exception` and then feeds it to a method that matches on `Throwable`), and because the policy is so simple, the sharing argument doesn't justify making the code harder to follow.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151818863 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44522/ Test PASSed.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151818859 Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151818616 **[Test build #44522 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44522/consoleFull)** for PR 9232 at commit [`217faba`](https://github.com/apache/spark/commit/217faba0d372ac66c57420372db62244e628da39).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `logInfo(s"$service class not found $e")`
  * `logDebug("$service class not found", e)`
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151809791 **[Test build #44522 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44522/consoleFull)** for PR 9232 at commit [`217faba`](https://github.com/apache/spark/commit/217faba0d372ac66c57420372db62244e628da39).
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151808462 Merged build started.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151808446 Merged build triggered.
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43242989

--- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala ---
+    util.handleTokenIntrospectionFailure("hive", new ClassNotFoundException("cnfe"))
--- End diff --

As soon as this patch is in I'll turn to [SPARK-11317](https://issues.apache.org/jira/browse/SPARK-11317), which is essentially "apply the same catching, filtering and reporting strategy for HBase tokens as for Hive ones". It's not as critical as this one (token retrieval is working), but as nothing gets logged except "InvocationTargetException" with no stack trace, recognising that the issue is a Kerberos auth problem, let alone fixing it, is a weekend's effort rather than 20 minutes' worth.

Because the policy goes in both places, having it separate and re-usable makes it zero-cut-and-paste reuse, with that single test for failures, without having to mock up failures across two separate clauses. And future maintenance costs are kept down if someone ever decides to change the policy again. Would you be happier if I cleaned up the HBase code as part of this same patch? I can, and it would make the benefits of the factored-out behaviour clearer. It's just messy to fix two things in one patch, especially if someone ever needs to play cherry-pick or reverting games.
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43240828

--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,99 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
     val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
     ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be based on this
+   * @return a token, or `None` if there's no need for a token (no metastore URI or principal
+   *         in the config), or if a binding exception was caught and downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = {
+    try {
+      obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName)
+    } catch {
+      case e: Exception => {
+        handleTokenIntrospectionFailure("Hive", e)
+        None
+      }
+    }
+  }
+
+  /**
+   * Handle failures to obtain a token through introspection. Failures to load the class are
+   * not treated as errors: anything else is.
+   * @param service service name for error messages
+   * @param thrown exception caught
+   * @throws Exception if the `thrown` exception isn't one that is to be ignored
+   */
+  private[yarn] def handleTokenIntrospectionFailure(service: String, thrown: Throwable): Unit = {
+    thrown match {
+      case e: ClassNotFoundException =>
+        logInfo(s"$service class not found $e")
+        logDebug("Hive Class not found", e)
+      case t: Throwable => {
--- End diff --

the reason I'd pulled it out was to have an isolated policy which could be both tested without mocking failures, and re-used in the HBase token retrieval, which needs an identical set of clauses. Given the policy has now been simplified so much, the method is now pretty much unused; I can pull it. But the HBase token logic will still need to be 100% in sync.
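The shared policy being discussed, downgrade class-loading failures and rethrow everything else, can be sketched as a small standalone method. This is an illustrative sketch based on the `handleTokenIntrospectionFailure` signature quoted in the diff, with `println` standing in for Spark's logging:

```scala
// Sketch of the shared failure policy described above: a missing class is
// informational (the service jars simply aren't present); anything else surfaces.
def handleTokenIntrospectionFailure(service: String, thrown: Throwable): Unit =
  thrown match {
    case e: ClassNotFoundException =>
      // service jars absent: log and skip token acquisition
      println(s"$service class not found: $e")
    case t: Throwable =>
      throw t // anything else is a real error and must propagate
  }

// A missing class is downgraded to a log line rather than a failure:
handleTokenIntrospectionFailure("hive", new ClassNotFoundException("cnfe"))
```

Because the method only pattern-matches and rethrows, it can be unit-tested directly without mocking any Hive or HBase failure paths, which is the re-use argument made above.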
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43233534

--- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala ---
+    val e = intercept[InvocationTargetException] {
--- End diff --

minor, but you could use the same style as below (where you avoid the temp variable).
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43233453

--- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala ---
@@ -245,4 +247,55 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite with Matchers with Logging
     System.clearProperty("SPARK_YARN_MODE")
   }
 }
+
+  test("Obtain tokens For HiveMetastore") {
+    val hadoopConf = new Configuration()
+    hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+    // thrift picks up on port 0 and bails out, without trying to talk to endpoint
+    hadoopConf.set("hive.metastore.uris", "http://localhost:0")
+    val util = new YarnSparkHadoopUtil
+    val e = intercept[InvocationTargetException] {
+      util.obtainTokenForHiveMetastoreInner(hadoopConf, "alice")
+    }
+    assertNestedHiveException(e)
+    // expect exception trapping code to unwind this hive-side exception
+    assertNestedHiveException(intercept[InvocationTargetException] {
+      util.obtainTokenForHiveMetastore(hadoopConf)
+    })
+  }
+
+  def assertNestedHiveException(e: InvocationTargetException): Throwable = {
+    val inner = e.getCause
+    if (inner == null) {
+      fail("No inner cause", e)
+    }
+    if (!inner.isInstanceOf[HiveException]) {
+      fail(s"Not a hive exception", inner)
+    }
+    inner
+  }
+
+  test("handleTokenIntrospectionFailure") {
+    val util = new YarnSparkHadoopUtil
+    // downgraded exceptions
+    util.handleTokenIntrospectionFailure("hive", new ClassNotFoundException("cnfe"))

--- End diff --

Following my previous comment, you could get rid of this whole test case if you just do the exception handling in the caller method. If you really want to test that CNFE is ignored, you could use Mockito's `spy` to mock `obtainTokenForHiveMetastoreInner` and make it throw a CNFE. The other tests here are not really that interesting anymore.

--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
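The `assertNestedHiveException` helper in the quoted test unwraps an `InvocationTargetException` to find the hive-side cause. That unwrapping is plain JDK reflection behaviour, so it can be sketched stand-alone; in this hedged sketch, `FakeHiveException` and `nestedHiveException` are stand-ins (the real `HiveException` lives in the Hive jars, and the real helper uses ScalaTest's `fail`):

```scala
import java.lang.reflect.InvocationTargetException

// Stand-in for Hive's HiveException, which is only on the Hive classpath.
class FakeHiveException(msg: String) extends RuntimeException(msg)

// Mirrors the helper: fail unless the reflective failure wraps the
// expected hive-side exception, then hand the cause back to the caller.
def nestedHiveException(e: InvocationTargetException): Throwable = {
  val inner = e.getCause
  require(inner != null, "No inner cause")
  require(inner.isInstanceOf[FakeHiveException], s"Not a hive exception: $inner")
  inner
}
```

The point of returning the cause is that a test can then make further assertions on the hive-side exception itself, not just on the reflective wrapper.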
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43231831

--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,99 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
     val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
     ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be based on this
+   * @return a token, or `None` if there's no need for a token (no metastore URI or principal
+   *         in the config), or if a binding exception was caught and downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = {
+    try {
+      obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName)
+    } catch {
+      case e: Exception => {
+        handleTokenIntrospectionFailure("Hive", e)
+        None
+      }
+    }
+  }
+
+  /**
+   * Handle failures to obtain a token through introspection. Failures to load the class are
+   * not treated as errors: anything else is.
+   * @param service service name for error messages
+   * @param thrown exception caught
+   * @throws Exception if the `thrown` exception isn't one that is to be ignored
+   */
+  private[yarn] def handleTokenIntrospectionFailure(service: String, thrown: Throwable): Unit = {
+    thrown match {
+      case e: ClassNotFoundException =>
+        logInfo(s"$service class not found $e")
+        logDebug("Hive Class not found", e)
+      case t: Throwable => {

--- End diff --

Hi @steveloughran, I think you're still not really getting what I'm saying. You can just *delete this whole `case`*. The exception will just propagate up the call stack.
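A minimal, self-contained sketch of the suggestion (the method and names here are stand-ins, not Spark's actual code): catch only the exception you want to downgrade, and anything else propagates up the call stack with no extra code:

```scala
// Stand-in for obtainTokenForHiveMetastore: ClassNotFoundException is
// downgraded to None; every other exception reaches the caller untouched,
// so it is reported exactly once.
def obtainTokenSafely[T](inner: => T): Option[T] =
  try {
    Some(inner)
  } catch {
    case _: ClassNotFoundException =>
      // Hive classes absent from the classpath: no token, but not an error.
      None
    // no `case t: Throwable => throw t` clause is needed; deleting it
    // changes nothing except removing the duplicated reporting.
  }
```

Because Scala's `match` inside `catch` only handles the listed cases, an unmatched exception is rethrown automatically, which is exactly the "let it propagate naturally" behaviour being asked for.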
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43223745

--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
[...same `obtainTokenForHiveMetastore` / `handleTokenIntrospectionFailure` hunk as quoted above...]
+      case t: Throwable => {
+        throw t

--- End diff --

The user can (i) not give Spark a hive configuration, in which case there will be no metastore URIs and this code will be skipped, or (ii) set `spark.yarn.security.tokens.hive.enabled` to false.
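Option (ii) above is an ordinary Spark configuration key, so it can be set at submission time. A hedged sketch (the key name comes from the comment above; the master setting and `app.jar` are placeholders):

```shell
# Skip Hive metastore delegation-token fetching for this application.
spark-submit \
  --master yarn \
  --conf spark.yarn.security.tokens.hive.enabled=false \
  app.jar
```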
Github user zhzhan commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43204781

--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
[...same `obtainTokenForHiveMetastore` / `handleTokenIntrospectionFailure` hunk as quoted above...]
+      case t: Throwable => {
+        throw t

--- End diff --

Here the exception is thrown. I know swallowing the exception is bad, but what happens if the user does not want to access the hive metastore, and wants to use spark even if the token cannot be acquired?
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151677558 Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151677560 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44475/
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151677475 **[Test build #44475 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44475/consoleFull)** for PR 9232 at commit [`00cb5a7`](https://github.com/apache/spark/commit/00cb5a7323a4f91adfa5c4273c8a6bcbc67dc008).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `logInfo(s"$service class not found $e")`
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151674747 **[Test build #44475 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44475/consoleFull)** for PR 9232 at commit [`00cb5a7`](https://github.com/apache/spark/commit/00cb5a7323a4f91adfa5c4273c8a6bcbc67dc008).
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151673386 Merged build triggered.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151673409 Merged build started.
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43201346

--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
[...same `obtainTokenForHiveMetastore` / `handleTokenIntrospectionFailure` hunk as quoted above...]
+      case t: Throwable => {

--- End diff --

Oh, we're at cross purposes. I was looking at line 128, above; you were at 180. That's why I was confused. Yes, I'll cut the lower one.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151661827 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44469/
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151661825 Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151661688 **[Test build #44469 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44469/consoleFull)** for PR 9232 at commit [`fbf0ecb`](https://github.com/apache/spark/commit/fbf0ecbd4ae4303846608193c91d735f536e9015).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `logInfo(s"$service class not found $e")`
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43195162

--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
[...same `obtainTokenForHiveMetastore` / `handleTokenIntrospectionFailure` hunk as quoted above...]
+      case t: Throwable => {

--- End diff --

I don't follow. You're catching an exception, logging it, and re-throwing it, which causes the exception to show up twice in the process output. Instead, you can just delete your code and let the exception propagate naturally. It will show up in the output the same way.
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43194609

--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
[...same `obtainTokenForHiveMetastore` / `handleTokenIntrospectionFailure` hunk as quoted above...]
+      case t: Throwable => {

--- End diff --

Those exceptions aren't being rethrown, though, are they? So it's logging the full stack at debug, and a one-liner for most.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151657516 **[Test build #44469 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44469/consoleFull)** for PR 9232 at commit [`fbf0ecb`](https://github.com/apache/spark/commit/fbf0ecbd4ae4303846608193c91d735f536e9015).
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43193976

--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
[...same `obtainTokenForHiveMetastore` / `handleTokenIntrospectionFailure` hunk as quoted above...]
+      case t: Throwable => {

--- End diff --

Why would you? All you're doing here is printing the stack trace to stderr, which will happen again when you re-throw the exception.
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43193733

--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
[...same `obtainTokenForHiveMetastore` / `handleTokenIntrospectionFailure` hunk as quoted above...]
+      case t: Throwable => {

--- End diff --

Either?
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43193401

--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
[...same `obtainTokenForHiveMetastore` / `handleTokenIntrospectionFailure` hunk as quoted above...]
+      case t: Throwable => {

--- End diff --

Really, you don't need this.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151655354 Merged build triggered.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151655385 Merged build started.
[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43189656 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala --- @@ -142,6 +145,117 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil { val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name()) ConverterUtils.toContainerId(containerIdString) } + + /** + * Obtains token for the Hive metastore, using the current user as the principal. + * Some exceptions are caught and downgraded to a log message. + * @param conf hadoop configuration; the Hive configuration will be based on this + * @return a token, or `None` if there's no need for a token (no metastore URI or principal + * in the config), or if a binding exception was caught and downgraded. + */ + def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = { +try { + obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName) +} catch { + case e: Exception => { +handleTokenIntrospectionFailure("Hive", e) +None + } +} + } + + /** + * Handle failures to obtain a token through introspection. Failures to load the class are + * not treated as errors: anything else is. 
+ * @param service service name for error messages + * @param thrown exception caught + * @throws Exception if the `thrown` exception isn't one that is to be ignored + */ + private[yarn] def handleTokenIntrospectionFailure(service: String, thrown: Throwable): Unit = { +thrown match { + case e: ClassNotFoundException => +logInfo(s"$service class not found $e") +logDebug("Hive Class not found", e) + case e: NoClassDefFoundError => +logDebug(s"$service class not found", e) + case e: InvocationTargetException => +// problem talking to the metastore or other hive-side exception +logInfo(s"$service method invocation failed", e) +throw if (e.getCause != null) e.getCause else e + case e: ReflectiveOperationException => +// any other reflection failure log at error and rethrow +logError(s"$service Class operation failed", e) +throw e; --- End diff -- well, that simplifies the clause, and the test...
[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43189502 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala --- @@ -142,6 +145,117 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil { val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name()) ConverterUtils.toContainerId(containerIdString) } + + /** + * Obtains token for the Hive metastore, using the current user as the principal. + * Some exceptions are caught and downgraded to a log message. + * @param conf hadoop configuration; the Hive configuration will be based on this + * @return a token, or `None` if there's no need for a token (no metastore URI or principal + * in the config), or if a binding exception was caught and downgraded. + */ + def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = { +try { + obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName) +} catch { + case e: Exception => { +handleTokenIntrospectionFailure("Hive", e) +None + } +} + } + + /** + * Handle failures to obtain a token through introspection. Failures to load the class are + * not treated as errors: anything else is. 
+ * @param service service name for error messages + * @param thrown exception caught + * @throws Exception if the `thrown` exception isn't one that is to be ignored + */ + private[yarn] def handleTokenIntrospectionFailure(service: String, thrown: Throwable): Unit = { +thrown match { + case e: ClassNotFoundException => +logInfo(s"$service class not found $e") +logDebug("Hive Class not found", e) + case e: NoClassDefFoundError => +logDebug(s"$service class not found", e) + case e: InvocationTargetException => +// problem talking to the metastore or other hive-side exception +logInfo(s"$service method invocation failed", e) +throw if (e.getCause != null) e.getCause else e + case e: ReflectiveOperationException => +// any other reflection failure log at error and rethrow +logError(s"$service Class operation failed", e) +throw e; + case e: RuntimeException => +// any runtime exception, including Illegal Argument Exception +throw e + case t: Throwable => { +val msg = s"$service: Unexpected Exception " + t +logError(msg, t) +throw new RuntimeException(msg, t) + } +} + } + + /** + * Inner routine to obtains token for the Hive metastore; exceptions are raised on any problem. + * @param conf hadoop configuration; the Hive configuration will be based on this. + * @param username the username of the principal requesting the delegating token. + * @return a delegation token + */ + private[yarn] def obtainTokenForHiveMetastoreInner(conf: Configuration, + username: String): Option[Token[DelegationTokenIdentifier]] = { +val mirror = universe.runtimeMirror(getClass.getClassLoader) + +// the hive configuration class is a subclass of Hadoop Configuration, so can be cast down +// to a Configuration and used without reflection +val hiveConfClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf") +// using the (Configuration, Class) constructor allows the current configuratin to be included +// in the hive config. 
+val ctor = hiveConfClass.getDeclaredConstructor(classOf[Configuration], + classOf[Object].getClass) +val hiveConf = ctor.newInstance(conf, hiveConfClass).asInstanceOf[Configuration] +val metastore_uri = hiveConf.getTrimmed("hive.metastore.uris", "") + +// Check for local metastore +if (metastore_uri.nonEmpty) { + if (username.isEmpty) { +throw new IllegalArgumentException(s"Username undefined") + } + val metastore_kerberos_principal_key = "hive.metastore.kerberos.principal" + val principal = hiveConf.getTrimmed(metastore_kerberos_principal_key, "") + if (principal.isEmpty) { +throw new IllegalArgumentException(s"Hive principal" + +s" $metastore_kerberos_principal_key undefined") + } + logDebug(s"Getting Hive delegation token for user $username against $metastore_uri") + val hiveClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.ql.metadata.Hive") + val closeCurrent = hiveClass.getMethod("closeCurrent") + try { +// get all the instance meth
[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43160341 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala --- @@ -142,6 +145,117 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil { val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name()) ConverterUtils.toContainerId(containerIdString) } + + /** + * Obtains token for the Hive metastore, using the current user as the principal. + * Some exceptions are caught and downgraded to a log message. + * @param conf hadoop configuration; the Hive configuration will be based on this + * @return a token, or `None` if there's no need for a token (no metastore URI or principal + * in the config), or if a binding exception was caught and downgraded. + */ + def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = { +try { + obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName) +} catch { + case e: Exception => { +handleTokenIntrospectionFailure("Hive", e) +None + } +} + } + + /** + * Handle failures to obtain a token through introspection. Failures to load the class are + * not treated as errors: anything else is. 
+ * @param service service name for error messages + * @param thrown exception caught + * @throws Exception if the `thrown` exception isn't one that is to be ignored + */ + private[yarn] def handleTokenIntrospectionFailure(service: String, thrown: Throwable): Unit = { +thrown match { + case e: ClassNotFoundException => +logInfo(s"$service class not found $e") +logDebug("Hive Class not found", e) + case e: NoClassDefFoundError => +logDebug(s"$service class not found", e) + case e: InvocationTargetException => +// problem talking to the metastore or other hive-side exception +logInfo(s"$service method invocation failed", e) +throw if (e.getCause != null) e.getCause else e + case e: ReflectiveOperationException => +// any other reflection failure log at error and rethrow +logError(s"$service Class operation failed", e) +throw e; --- End diff -- I think unwinding just makes the code more confusing. Just handle the exceptions you really mean to handle, and let the others propagate as is. Errors here should be very uncommon, and the extra info in the stack trace won't really make it harder to find the cause. I think it's very unlikely you'll get a `NoClassDefFoundError` if you don't get a `ClassNotFoundException`. If you do, it's probably an actual error - user has added some Hive jars to the path but not others, or something like that.
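The simplification suggested above — downgrade only the "Hive isn't on the classpath" case to a log message and let every other failure propagate and fail the app — might look roughly like this (a minimal sketch, not the actual Spark patch; `obtainTokenSafely` and the `println` stand-in for `logInfo` are illustrative):

```scala
object TokenHelper {
  // Sketch: only a missing service class is downgraded to a log message
  // and an empty result; any other exception propagates unchanged.
  def obtainTokenSafely[T](service: String)(obtain: => Option[T]): Option[T] =
    try {
      obtain
    } catch {
      case e: ClassNotFoundException =>
        // stand-in for logInfo: the service's classes are simply absent
        println(s"$service classes not found, skipping token: $e")
        None
    }
}
```

With a single `case`, a metastore outage or a misconfigured principal still surfaces as a full stack trace, while a Hive-free deployment just logs and moves on.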
[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43128655 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala --- @@ -142,6 +145,117 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil { val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name()) ConverterUtils.toContainerId(containerIdString) } + + /** + * Obtains token for the Hive metastore, using the current user as the principal. + * Some exceptions are caught and downgraded to a log message. + * @param conf hadoop configuration; the Hive configuration will be based on this + * @return a token, or `None` if there's no need for a token (no metastore URI or principal + * in the config), or if a binding exception was caught and downgraded. + */ + def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = { +try { + obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName) +} catch { + case e: Exception => { +handleTokenIntrospectionFailure("Hive", e) +None + } +} + } + + /** + * Handle failures to obtain a token through introspection. Failures to load the class are + * not treated as errors: anything else is. 
+ * @param service service name for error messages + * @param thrown exception caught + * @throws Exception if the `thrown` exception isn't one that is to be ignored + */ + private[yarn] def handleTokenIntrospectionFailure(service: String, thrown: Throwable): Unit = { +thrown match { + case e: ClassNotFoundException => +logInfo(s"$service class not found $e") +logDebug("Hive Class not found", e) + case e: NoClassDefFoundError => +logDebug(s"$service class not found", e) + case e: InvocationTargetException => +// problem talking to the metastore or other hive-side exception +logInfo(s"$service method invocation failed", e) +throw if (e.getCause != null) e.getCause else e + case e: ReflectiveOperationException => +// any other reflection failure log at error and rethrow +logError(s"$service Class operation failed", e) +throw e; --- End diff -- `NoClassDefFound` may need to be covered too; I think it's the transient version of CNFE, though that may be superstition. I'd like to still unwind the `InvocationTargetException`, as it's just a wrapper for what came in before; cutting it will simply reduce one level of needless stack trace.
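The "unwind" idea under discussion is just rethrowing the cause wrapped inside the `InvocationTargetException`, so the stack trace starts at the real Hive-side failure rather than the reflection plumbing. A small sketch of the extraction (the `Unwind.unwrap` helper is illustrative, not part of the patch):

```scala
import java.lang.reflect.InvocationTargetException

object Unwind {
  // Surface the underlying cause of a reflective call failure; if the
  // wrapper carries no cause, fall back to the wrapper itself.
  def unwrap(e: InvocationTargetException): Throwable =
    if (e.getCause != null) e.getCause else e
}
```

Rethrowing `Unwind.unwrap(e)` instead of `e` trims exactly one layer of wrapper from the reported trace.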
[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43014446 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala --- @@ -142,6 +145,117 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil { val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name()) ConverterUtils.toContainerId(containerIdString) } + + /** + * Obtains token for the Hive metastore, using the current user as the principal. + * Some exceptions are caught and downgraded to a log message. + * @param conf hadoop configuration; the Hive configuration will be based on this + * @return a token, or `None` if there's no need for a token (no metastore URI or principal + * in the config), or if a binding exception was caught and downgraded. + */ + def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = { +try { + obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName) +} catch { + case e: Exception => { +handleTokenIntrospectionFailure("Hive", e) +None + } +} + } + + /** + * Handle failures to obtain a token through introspection. Failures to load the class are + * not treated as errors: anything else is. 
+ * @param service service name for error messages + * @param thrown exception caught + * @throws Exception if the `thrown` exception isn't one that is to be ignored + */ + private[yarn] def handleTokenIntrospectionFailure(service: String, thrown: Throwable): Unit = { +thrown match { + case e: ClassNotFoundException => +logInfo(s"$service class not found $e") +logDebug("Hive Class not found", e) + case e: NoClassDefFoundError => +logDebug(s"$service class not found", e) + case e: InvocationTargetException => +// problem talking to the metastore or other hive-side exception +logInfo(s"$service method invocation failed", e) +throw if (e.getCause != null) e.getCause else e + case e: ReflectiveOperationException => +// any other reflection failure log at error and rethrow +logError(s"$service Class operation failed", e) +throw e; + case e: RuntimeException => +// any runtime exception, including Illegal Argument Exception +throw e + case t: Throwable => { +val msg = s"$service: Unexpected Exception " + t +logError(msg, t) +throw new RuntimeException(msg, t) + } +} + } + + /** + * Inner routine to obtains token for the Hive metastore; exceptions are raised on any problem. + * @param conf hadoop configuration; the Hive configuration will be based on this. + * @param username the username of the principal requesting the delegating token. + * @return a delegation token + */ + private[yarn] def obtainTokenForHiveMetastoreInner(conf: Configuration, + username: String): Option[Token[DelegationTokenIdentifier]] = { +val mirror = universe.runtimeMirror(getClass.getClassLoader) + +// the hive configuration class is a subclass of Hadoop Configuration, so can be cast down +// to a Configuration and used without reflection +val hiveConfClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf") +// using the (Configuration, Class) constructor allows the current configuratin to be included +// in the hive config. 
+val ctor = hiveConfClass.getDeclaredConstructor(classOf[Configuration], + classOf[Object].getClass) +val hiveConf = ctor.newInstance(conf, hiveConfClass).asInstanceOf[Configuration] +val metastore_uri = hiveConf.getTrimmed("hive.metastore.uris", "") + +// Check for local metastore +if (metastore_uri.nonEmpty) { + if (username.isEmpty) { +throw new IllegalArgumentException(s"Username undefined") + } + val metastore_kerberos_principal_key = "hive.metastore.kerberos.principal" + val principal = hiveConf.getTrimmed(metastore_kerberos_principal_key, "") + if (principal.isEmpty) { --- End diff -- use `require`.
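Scala's `require(cond, msg)` throws `IllegalArgumentException` with the given message when the condition is false, which collapses each quoted if/throw block into a single line. A sketch of the suggestion (the `checkPrincipal` wrapper is illustrative):

```scala
object Checks {
  // require(cond, msg) throws IllegalArgumentException("requirement failed: " + msg)
  // when cond is false, replacing the explicit if/throw in the quoted diff.
  def checkPrincipal(principal: String): String = {
    require(principal.nonEmpty,
      "Hive principal hive.metastore.kerberos.principal undefined")
    principal
  }
}
```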
[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43014505 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala --- @@ -142,6 +145,117 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil { val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name()) ConverterUtils.toContainerId(containerIdString) } + + /** + * Obtains token for the Hive metastore, using the current user as the principal. + * Some exceptions are caught and downgraded to a log message. + * @param conf hadoop configuration; the Hive configuration will be based on this + * @return a token, or `None` if there's no need for a token (no metastore URI or principal + * in the config), or if a binding exception was caught and downgraded. + */ + def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = { +try { + obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName) +} catch { + case e: Exception => { +handleTokenIntrospectionFailure("Hive", e) +None + } +} + } + + /** + * Handle failures to obtain a token through introspection. Failures to load the class are + * not treated as errors: anything else is. 
+ * @param service service name for error messages + * @param thrown exception caught + * @throws Exception if the `thrown` exception isn't one that is to be ignored + */ + private[yarn] def handleTokenIntrospectionFailure(service: String, thrown: Throwable): Unit = { +thrown match { + case e: ClassNotFoundException => +logInfo(s"$service class not found $e") +logDebug("Hive Class not found", e) + case e: NoClassDefFoundError => +logDebug(s"$service class not found", e) + case e: InvocationTargetException => +// problem talking to the metastore or other hive-side exception +logInfo(s"$service method invocation failed", e) +throw if (e.getCause != null) e.getCause else e + case e: ReflectiveOperationException => +// any other reflection failure log at error and rethrow +logError(s"$service Class operation failed", e) +throw e; + case e: RuntimeException => +// any runtime exception, including Illegal Argument Exception +throw e + case t: Throwable => { +val msg = s"$service: Unexpected Exception " + t +logError(msg, t) +throw new RuntimeException(msg, t) + } +} + } + + /** + * Inner routine to obtains token for the Hive metastore; exceptions are raised on any problem. + * @param conf hadoop configuration; the Hive configuration will be based on this. + * @param username the username of the principal requesting the delegating token. + * @return a delegation token + */ + private[yarn] def obtainTokenForHiveMetastoreInner(conf: Configuration, + username: String): Option[Token[DelegationTokenIdentifier]] = { +val mirror = universe.runtimeMirror(getClass.getClassLoader) + +// the hive configuration class is a subclass of Hadoop Configuration, so can be cast down +// to a Configuration and used without reflection +val hiveConfClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf") +// using the (Configuration, Class) constructor allows the current configuratin to be included +// in the hive config. 
+val ctor = hiveConfClass.getDeclaredConstructor(classOf[Configuration], + classOf[Object].getClass) +val hiveConf = ctor.newInstance(conf, hiveConfClass).asInstanceOf[Configuration] +val metastore_uri = hiveConf.getTrimmed("hive.metastore.uris", "") + +// Check for local metastore +if (metastore_uri.nonEmpty) { + if (username.isEmpty) { +throw new IllegalArgumentException(s"Username undefined") + } + val metastore_kerberos_principal_key = "hive.metastore.kerberos.principal" + val principal = hiveConf.getTrimmed(metastore_kerberos_principal_key, "") + if (principal.isEmpty) { +throw new IllegalArgumentException(s"Hive principal" + +s" $metastore_kerberos_principal_key undefined") + } + logDebug(s"Getting Hive delegation token for user $username against $metastore_uri") + val hiveClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.ql.metadata.Hive") + val closeCurrent = hiveClass.getMethod("closeCurrent") + try { +// get all the instance methods bef
[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43014420 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala --- @@ -142,6 +145,117 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil { val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name()) ConverterUtils.toContainerId(containerIdString) } + + /** + * Obtains token for the Hive metastore, using the current user as the principal. + * Some exceptions are caught and downgraded to a log message. + * @param conf hadoop configuration; the Hive configuration will be based on this + * @return a token, or `None` if there's no need for a token (no metastore URI or principal + * in the config), or if a binding exception was caught and downgraded. + */ + def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = { +try { + obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName) +} catch { + case e: Exception => { +handleTokenIntrospectionFailure("Hive", e) +None + } +} + } + + /** + * Handle failures to obtain a token through introspection. Failures to load the class are + * not treated as errors: anything else is. 
+ * @param service service name for error messages + * @param thrown exception caught + * @throws Exception if the `thrown` exception isn't one that is to be ignored + */ + private[yarn] def handleTokenIntrospectionFailure(service: String, thrown: Throwable): Unit = { +thrown match { + case e: ClassNotFoundException => +logInfo(s"$service class not found $e") +logDebug("Hive Class not found", e) + case e: NoClassDefFoundError => +logDebug(s"$service class not found", e) + case e: InvocationTargetException => +// problem talking to the metastore or other hive-side exception +logInfo(s"$service method invocation failed", e) +throw if (e.getCause != null) e.getCause else e + case e: ReflectiveOperationException => +// any other reflection failure log at error and rethrow +logError(s"$service Class operation failed", e) +throw e; + case e: RuntimeException => +// any runtime exception, including Illegal Argument Exception +throw e + case t: Throwable => { +val msg = s"$service: Unexpected Exception " + t +logError(msg, t) +throw new RuntimeException(msg, t) + } +} + } + + /** + * Inner routine to obtains token for the Hive metastore; exceptions are raised on any problem. + * @param conf hadoop configuration; the Hive configuration will be based on this. + * @param username the username of the principal requesting the delegating token. + * @return a delegation token + */ + private[yarn] def obtainTokenForHiveMetastoreInner(conf: Configuration, + username: String): Option[Token[DelegationTokenIdentifier]] = { +val mirror = universe.runtimeMirror(getClass.getClassLoader) + +// the hive configuration class is a subclass of Hadoop Configuration, so can be cast down +// to a Configuration and used without reflection +val hiveConfClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf") +// using the (Configuration, Class) constructor allows the current configuratin to be included +// in the hive config. 
+val ctor = hiveConfClass.getDeclaredConstructor(classOf[Configuration], + classOf[Object].getClass) +val hiveConf = ctor.newInstance(conf, hiveConfClass).asInstanceOf[Configuration] +val metastore_uri = hiveConf.getTrimmed("hive.metastore.uris", "") + +// Check for local metastore +if (metastore_uri.nonEmpty) { + if (username.isEmpty) { --- End diff -- `require(username.nonEmpty)`
[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43014259 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala --- @@ -142,6 +145,117 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil { val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name()) ConverterUtils.toContainerId(containerIdString) } + + /** + * Obtains token for the Hive metastore, using the current user as the principal. + * Some exceptions are caught and downgraded to a log message. + * @param conf hadoop configuration; the Hive configuration will be based on this + * @return a token, or `None` if there's no need for a token (no metastore URI or principal + * in the config), or if a binding exception was caught and downgraded. + */ + def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = { +try { + obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName) +} catch { + case e: Exception => { +handleTokenIntrospectionFailure("Hive", e) +None + } +} + } + + /** + * Handle failures to obtain a token through introspection. Failures to load the class are + * not treated as errors: anything else is. 
+ * @param service service name for error messages + * @param thrown exception caught + * @throws Exception if the `thrown` exception isn't one that is to be ignored + */ + private[yarn] def handleTokenIntrospectionFailure(service: String, thrown: Throwable): Unit = { +thrown match { + case e: ClassNotFoundException => +logInfo(s"$service class not found $e") +logDebug("Hive Class not found", e) + case e: NoClassDefFoundError => +logDebug(s"$service class not found", e) + case e: InvocationTargetException => +// problem talking to the metastore or other hive-side exception +logInfo(s"$service method invocation failed", e) +throw if (e.getCause != null) e.getCause else e + case e: ReflectiveOperationException => +// any other reflection failure log at error and rethrow +logError(s"$service Class operation failed", e) +throw e; + case e: RuntimeException => +// any runtime exception, including Illegal Argument Exception +throw e + case t: Throwable => { +val msg = s"$service: Unexpected Exception " + t +logError(msg, t) +throw new RuntimeException(msg, t) + } +} + } + + /** + * Inner routine to obtains token for the Hive metastore; exceptions are raised on any problem. + * @param conf hadoop configuration; the Hive configuration will be based on this. + * @param username the username of the principal requesting the delegating token. + * @return a delegation token + */ + private[yarn] def obtainTokenForHiveMetastoreInner(conf: Configuration, + username: String): Option[Token[DelegationTokenIdentifier]] = { +val mirror = universe.runtimeMirror(getClass.getClassLoader) + +// the hive configuration class is a subclass of Hadoop Configuration, so can be cast down +// to a Configuration and used without reflection +val hiveConfClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf") +// using the (Configuration, Class) constructor allows the current configuratin to be included +// in the hive config. 
+val ctor = hiveConfClass.getDeclaredConstructor(classOf[Configuration], + classOf[Object].getClass) +val hiveConf = ctor.newInstance(conf, hiveConfClass).asInstanceOf[Configuration] +val metastore_uri = hiveConf.getTrimmed("hive.metastore.uris", "") --- End diff -- `metastoreUri`.
[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43014139 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala --- @@ -142,6 +145,117 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil { val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name()) ConverterUtils.toContainerId(containerIdString) } + + /** + * Obtains token for the Hive metastore, using the current user as the principal. + * Some exceptions are caught and downgraded to a log message. + * @param conf hadoop configuration; the Hive configuration will be based on this + * @return a token, or `None` if there's no need for a token (no metastore URI or principal + * in the config), or if a binding exception was caught and downgraded. + */ + def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = { +try { + obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName) +} catch { + case e: Exception => { +handleTokenIntrospectionFailure("Hive", e) +None + } +} + } + + /** + * Handle failures to obtain a token through introspection. Failures to load the class are + * not treated as errors: anything else is. 
+ * @param service service name for error messages + * @param thrown exception caught + * @throws Exception if the `thrown` exception isn't one that is to be ignored + */ + private[yarn] def handleTokenIntrospectionFailure(service: String, thrown: Throwable): Unit = { +thrown match { + case e: ClassNotFoundException => +logInfo(s"$service class not found $e") +logDebug("Hive Class not found", e) + case e: NoClassDefFoundError => +logDebug(s"$service class not found", e) + case e: InvocationTargetException => +// problem talking to the metastore or other hive-side exception +logInfo(s"$service method invocation failed", e) +throw if (e.getCause != null) e.getCause else e + case e: ReflectiveOperationException => +// any other reflection failure log at error and rethrow +logError(s"$service Class operation failed", e) +throw e; --- End diff -- nuke the semi-colon. Also, I don't think you need to catch / log / re-throw any of these. Just let them bubble up and fail the app. The user will see the error. As I suggested before, the only `case` you need here is for `ClassNotFoundException`.