[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151145666 Thomas: been through the hive versions and they didn't pull anything, simply added another overloaded method before the existing three. Probably a latent bug in the reflection code, which wasn't explicitly asking for the method whose parameters matched those which it needed. In fact, i think this patch here would work with 0.13 too. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151132336 can you update jira and this pr to include which version of hive changed the interface? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151140385 Thomas, I'm not sure that they actually pulled something, but instead overloaded the get() operation, which confused the reflection logic. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43004016 --- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala --- @@ -245,4 +247,28 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite with Matchers with Logging System.clearProperty("SPARK_YARN_MODE") } } + + test("Obtain tokens For HiveMetastore") { +val hadoopConf = new Configuration() +hadoopConf.set("hive.metastore.kerberos.principal", "bob") +// thrift picks up on port 0 and bails out, without trying to talk to endpoint +hadoopConf.set("hive.metastore.uris", "http://localhost:0;) +val util = new YarnSparkHadoopUtil +val e = intercept[InvocationTargetException] { + val token = util.obtainTokenForHiveMetastoreInner(hadoopConf, "alice") + fail(s"Expected an exception, got the token $token") --- End diff -- I'm a firm believer in not losing information that may help debug failures. If you think token is worthless, I'll pull it, but in the other checks I will try and preserve stack traces, rather than raise new asserts --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r43003603 --- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala --- @@ -245,4 +247,28 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite with Matchers with Logging System.clearProperty("SPARK_YARN_MODE") } } + + test("Obtain tokens For HiveMetastore") { +val hadoopConf = new Configuration() +hadoopConf.set("hive.metastore.kerberos.principal", "bob") +// thrift picks up on port 0 and bails out, without trying to talk to endpoint +hadoopConf.set("hive.metastore.uris", "http://localhost:0;) +val util = new YarnSparkHadoopUtil +val e = intercept[InvocationTargetException] { + val token = util.obtainTokenForHiveMetastoreInner(hadoopConf, "alice") --- End diff -- It's used in the error text raised in the `fail()` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151173854 -Updated patch pulls the `val token=` and `fail` code from the tests. It also pulls out the logic for deciding what to do with exceptions into its own class, which has two benefits. 1. The logic can be tested directly, going through all the possible clauses of the match statement, and any conditions inside them. 1. The hbase introspection code can be reworked to use the same method. This would guarantee one single implementation of logging & rethrow policy across the two (and any future) token acquisition clauses. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151174376 Merged build triggered. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151176637 **[Test build #44358 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44358/consoleFull)** for PR 9232 at commit [`0356c4c`](https://github.com/apache/spark/commit/0356c4c6f9356ecd4234dbe6a6f1b2f7b38b5e14). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151182699 **[Test build #44358 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44358/consoleFull)** for PR 9232 at commit [`0356c4c`](https://github.com/apache/spark/commit/0356c4c6f9356ecd4234dbe6a6f1b2f7b38b5e14). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_:\n * ` logInfo(s\"$service class not found $e\")`\n * ` logDebug(s\"$service class not found\", e)`\n --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9232#issuecomment-151174413 Merged build started. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r42945007 --- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala --- @@ -245,4 +247,28 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite with Matchers with Logging System.clearProperty("SPARK_YARN_MODE") } } + + test("Obtain tokens For HiveMetastore") { +val hadoopConf = new Configuration() +hadoopConf.set("hive.metastore.kerberos.principal", "bob") +// thrift picks up on port 0 and bails out, without trying to talk to endpoint +hadoopConf.set("hive.metastore.uris", "http://localhost:0;) +val util = new YarnSparkHadoopUtil +val e = intercept[InvocationTargetException] { + val token = util.obtainTokenForHiveMetastoreInner(hadoopConf, "alice") + fail(s"Expected an exception, got the token $token") +} +val inner = e.getCause +if (inner == null) { + fail("No inner cause", e) +} +if (!inner.isInstanceOf[HiveException]) { + fail(s"Not a hive exception", inner) --- End diff -- I want to include the inner exception if it's of the wrong type, so that the Junit/jenkins report can diagnose the failure. an `assert(inner.isInstanceOf[HiveException])` will say when the exception was of the wrong type, but not include the details, including the stack trace. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r42944993 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala --- @@ -142,6 +145,104 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil { val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name()) ConverterUtils.toContainerId(containerIdString) } + + /** + * Obtains token for the Hive metastore, using the current user as the principal. + * Some exceptions are caught and downgraded to a log message. + * @param conf hadoop configuration; the Hive configuration will be based on this + * @return a token, or `None` if there's no need for a token (no metastore URI or principal + * in the config), or if a binding exception was caught and downgraded. + */ + def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = { +try { + obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName) +} catch { + case e: NoSuchMethodException => +logInfo("Hive Method not found", e) +None + case e: ClassNotFoundException => --- End diff -- +1. I'd left it in there as it may have had a valid reason for being there, but i do things it's correct. Detecting config problems, that is something to throw up. Note that `Client.obtainTokenForHBase()` has similar behaviour; this patch doesn't address it. When someone sits down to do it, the policy about how to react to failures could be converted into a wrapper around a closure which executes the token retrieval (here `obtainTokenForHiveMetastoreInner`), so there'd be no divergence. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r42944947 --- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala --- @@ -245,4 +247,28 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite with Matchers with Logging System.clearProperty("SPARK_YARN_MODE") } } + + test("Obtain tokens For HiveMetastore") { +val hadoopConf = new Configuration() +hadoopConf.set("hive.metastore.kerberos.principal", "bob") +// thrift picks up on port 0 and bails out, without trying to talk to endpoint +hadoopConf.set("hive.metastore.uris", "http://localhost:0;) +val util = new YarnSparkHadoopUtil +val e = intercept[InvocationTargetException] { + val token = util.obtainTokenForHiveMetastoreInner(hadoopConf, "alice") + fail(s"Expected an exception, got the token $token") --- End diff -- wanted to include any token returned in the assert failure, on the basis that if something came back, it would be useful to find out what went wrong. `intercept`, just like JUnit's `@Test(expected=)` feature, picks up on the failure to raise the specific exception, but doesn't appear to say much else. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r42944952 --- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala --- @@ -245,4 +247,28 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite with Matchers with Logging System.clearProperty("SPARK_YARN_MODE") } } + + test("Obtain tokens For HiveMetastore") { +val hadoopConf = new Configuration() +hadoopConf.set("hive.metastore.kerberos.principal", "bob") +// thrift picks up on port 0 and bails out, without trying to talk to endpoint +hadoopConf.set("hive.metastore.uris", "http://localhost:0;) +val util = new YarnSparkHadoopUtil +val e = intercept[InvocationTargetException] { + val token = util.obtainTokenForHiveMetastoreInner(hadoopConf, "alice") + fail(s"Expected an exception, got the token $token") +} +val inner = e.getCause +if (inner == null) { + fail("No inner cause", e) --- End diff -- good point --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r42952937 --- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala --- @@ -245,4 +247,28 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite with Matchers with Logging System.clearProperty("SPARK_YARN_MODE") } } + + test("Obtain tokens For HiveMetastore") { +val hadoopConf = new Configuration() +hadoopConf.set("hive.metastore.kerberos.principal", "bob") +// thrift picks up on port 0 and bails out, without trying to talk to endpoint +hadoopConf.set("hive.metastore.uris", "http://localhost:0;) +val util = new YarnSparkHadoopUtil +val e = intercept[InvocationTargetException] { + val token = util.obtainTokenForHiveMetastoreInner(hadoopConf, "alice") + fail(s"Expected an exception, got the token $token") +} +val inner = e.getCause +if (inner == null) { + fail("No inner cause", e) +} +if (!inner.isInstanceOf[HiveException]) { + fail(s"Not a hive exception", inner) +} +// expect exception trapping code to unwind this hive-side exception +intercept[HiveException] { + val token = util.obtainTokenForHiveMetastore(hadoopConf) + fail(s"Expected an exception, got the token $token") --- End diff -- I fail to see how the contents of `token` are useful in diagnosing anything. The fact that you didn't get the expected exception is the only thing being tested get. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9232#discussion_r42945056 --- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala --- @@ -245,4 +247,28 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite with Matchers with Logging System.clearProperty("SPARK_YARN_MODE") } } + + test("Obtain tokens For HiveMetastore") { +val hadoopConf = new Configuration() +hadoopConf.set("hive.metastore.kerberos.principal", "bob") +// thrift picks up on port 0 and bails out, without trying to talk to endpoint +hadoopConf.set("hive.metastore.uris", "http://localhost:0;) +val util = new YarnSparkHadoopUtil +val e = intercept[InvocationTargetException] { + val token = util.obtainTokenForHiveMetastoreInner(hadoopConf, "alice") + fail(s"Expected an exception, got the token $token") +} +val inner = e.getCause +if (inner == null) { + fail("No inner cause", e) +} +if (!inner.isInstanceOf[HiveException]) { + fail(s"Not a hive exception", inner) +} +// expect exception trapping code to unwind this hive-side exception +intercept[HiveException] { + val token = util.obtainTokenForHiveMetastore(hadoopConf) + fail(s"Expected an exception, got the token $token") --- End diff -- again, better diagnostics. Now, if you look at `Assertions.intercept()`, it's clearly discarding the result of the inner closure. If it was redone to take an error string on a mismatch, and prepend that to the `.toString()` value of anything returned which wasn't of type `Unit`, then we'd have something really good at reporting failures to raise exceptions. As it is, `intercept` is pretty lossy. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org