[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...

2015-10-26 Thread steveloughran
Github user steveloughran commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151145666
  
Thomas: been through the hive versions and they didn't pull anything, 
simply added another overloaded method before the existing three. Probably a 
latent bug in the reflection code, which wasn't explicitly asking for the 
method whose parameters matched those which it needed. In fact, i think this 
patch here would work with 0.13 too.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...

2015-10-26 Thread tgravescs
Github user tgravescs commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151132336
  
can you update jira and this pr to include which version of hive changed 
the interface?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...

2015-10-26 Thread steveloughran
Github user steveloughran commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151140385
  
Thomas, I'm not sure that they actually pulled something, but instead 
overloaded the get() operation, which confused the reflection logic.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...

2015-10-26 Thread steveloughran
Github user steveloughran commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43004016
  
--- Diff: 
yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala 
---
@@ -245,4 +247,28 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite 
with Matchers with Logging
   System.clearProperty("SPARK_YARN_MODE")
 }
   }
+
+  test("Obtain tokens For HiveMetastore") {
+val hadoopConf = new Configuration()
+hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+// thrift picks up on port 0 and bails out, without trying to talk to 
endpoint
+hadoopConf.set("hive.metastore.uris", "http://localhost:0;)
+val util = new YarnSparkHadoopUtil
+val e = intercept[InvocationTargetException] {
+  val token = util.obtainTokenForHiveMetastoreInner(hadoopConf, 
"alice")
+  fail(s"Expected an exception, got the token $token")
--- End diff --

I'm a firm believer in not losing information that may help debug failures. 
If you think token is worthless, I'll pull it, but in the other checks I will 
try and preserve stack traces, rather than raise new asserts


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...

2015-10-26 Thread steveloughran
Github user steveloughran commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43003603
  
--- Diff: 
yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala 
---
@@ -245,4 +247,28 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite 
with Matchers with Logging
   System.clearProperty("SPARK_YARN_MODE")
 }
   }
+
+  test("Obtain tokens For HiveMetastore") {
+val hadoopConf = new Configuration()
+hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+// thrift picks up on port 0 and bails out, without trying to talk to 
endpoint
+hadoopConf.set("hive.metastore.uris", "http://localhost:0;)
+val util = new YarnSparkHadoopUtil
+val e = intercept[InvocationTargetException] {
+  val token = util.obtainTokenForHiveMetastoreInner(hadoopConf, 
"alice")
--- End diff --

It's used in the error text raised in the `fail()`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...

2015-10-26 Thread steveloughran
Github user steveloughran commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151173854
  
-Updated patch pulls the `val token=` and `fail` code from the tests. It 
also pulls out the logic for deciding what to do with exceptions into its own 
class, which has two benefits.

1. The logic can be tested directly, going through all the possible clauses 
of the match statement, and any conditions inside them.
1. The hbase introspection code can be reworked to use the same method. 
This would guarantee one single implementation of logging & rethrow policy 
across the two (and any future) token acquisition clauses.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...

2015-10-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151174376
  
 Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...

2015-10-26 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151176637
  
**[Test build #44358 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44358/consoleFull)**
 for PR 9232 at commit 
[`0356c4c`](https://github.com/apache/spark/commit/0356c4c6f9356ecd4234dbe6a6f1b2f7b38b5e14).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...

2015-10-26 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151182699
  
**[Test build #44358 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44358/consoleFull)**
 for PR 9232 at commit 
[`0356c4c`](https://github.com/apache/spark/commit/0356c4c6f9356ecd4234dbe6a6f1b2f7b38b5e14).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:\n  * `
logInfo(s\"$service class not found $e\")`\n  * `
logDebug(s\"$service class not found\", e)`\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...

2015-10-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151174413
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...

2015-10-25 Thread steveloughran
Github user steveloughran commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r42945007
  
--- Diff: 
yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala 
---
@@ -245,4 +247,28 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite 
with Matchers with Logging
   System.clearProperty("SPARK_YARN_MODE")
 }
   }
+
+  test("Obtain tokens For HiveMetastore") {
+val hadoopConf = new Configuration()
+hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+// thrift picks up on port 0 and bails out, without trying to talk to 
endpoint
+hadoopConf.set("hive.metastore.uris", "http://localhost:0;)
+val util = new YarnSparkHadoopUtil
+val e = intercept[InvocationTargetException] {
+  val token = util.obtainTokenForHiveMetastoreInner(hadoopConf, 
"alice")
+  fail(s"Expected an exception, got the token $token")
+}
+val inner = e.getCause
+if (inner == null) {
+  fail("No inner cause", e)
+}
+if (!inner.isInstanceOf[HiveException]) {
+  fail(s"Not a hive exception", inner)
--- End diff --

I want to include the inner exception if it's of the wrong type, so that 
the Junit/jenkins report can diagnose the failure. an 
`assert(inner.isInstanceOf[HiveException])` will say when the exception was of 
the wrong type, but not include the details, including the stack trace.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...

2015-10-25 Thread steveloughran
Github user steveloughran commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r42944993
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,104 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
 val containerIdString = 
System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
 ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the 
principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be 
based on this
+   * @return a token, or `None` if there's no need for a token (no 
metastore URI or principal
+   * in the config), or if a binding exception was caught and 
downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): 
Option[Token[DelegationTokenIdentifier]] = {
+try {
+  obtainTokenForHiveMetastoreInner(conf, 
UserGroupInformation.getCurrentUser().getUserName)
+} catch {
+  case e: NoSuchMethodException =>
+logInfo("Hive Method not found", e)
+None
+  case e: ClassNotFoundException =>
--- End diff --

+1. I'd left it in there as it may have had a valid reason for being there, 
but i do things it's correct. Detecting config problems, that is something to 
throw up.

Note that `Client.obtainTokenForHBase()` has similar behaviour; this patch 
doesn't address it. When someone sits down to do it, the policy about how to 
react to failures could be converted into a wrapper around a closure which 
executes the token retrieval (here `obtainTokenForHiveMetastoreInner`), so 
there'd be no divergence.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...

2015-10-25 Thread steveloughran
Github user steveloughran commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r42944947
  
--- Diff: 
yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala 
---
@@ -245,4 +247,28 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite 
with Matchers with Logging
   System.clearProperty("SPARK_YARN_MODE")
 }
   }
+
+  test("Obtain tokens For HiveMetastore") {
+val hadoopConf = new Configuration()
+hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+// thrift picks up on port 0 and bails out, without trying to talk to 
endpoint
+hadoopConf.set("hive.metastore.uris", "http://localhost:0;)
+val util = new YarnSparkHadoopUtil
+val e = intercept[InvocationTargetException] {
+  val token = util.obtainTokenForHiveMetastoreInner(hadoopConf, 
"alice")
+  fail(s"Expected an exception, got the token $token")
--- End diff --

wanted to include any token returned in the assert failure, on the basis 
that if something came back, it would be useful to find out what went wrong. 
`intercept`, just like JUnit's `@Test(expected=)` feature, picks up on the 
failure to raise the specific exception, but doesn't appear to say much else.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...

2015-10-25 Thread steveloughran
Github user steveloughran commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r42944952
  
--- Diff: 
yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala 
---
@@ -245,4 +247,28 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite 
with Matchers with Logging
   System.clearProperty("SPARK_YARN_MODE")
 }
   }
+
+  test("Obtain tokens For HiveMetastore") {
+val hadoopConf = new Configuration()
+hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+// thrift picks up on port 0 and bails out, without trying to talk to 
endpoint
+hadoopConf.set("hive.metastore.uris", "http://localhost:0;)
+val util = new YarnSparkHadoopUtil
+val e = intercept[InvocationTargetException] {
+  val token = util.obtainTokenForHiveMetastoreInner(hadoopConf, 
"alice")
+  fail(s"Expected an exception, got the token $token")
+}
+val inner = e.getCause
+if (inner == null) {
+  fail("No inner cause", e)
--- End diff --

good point


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...

2015-10-25 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r42952937
  
--- Diff: 
yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala 
---
@@ -245,4 +247,28 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite 
with Matchers with Logging
   System.clearProperty("SPARK_YARN_MODE")
 }
   }
+
+  test("Obtain tokens For HiveMetastore") {
+val hadoopConf = new Configuration()
+hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+// thrift picks up on port 0 and bails out, without trying to talk to 
endpoint
+hadoopConf.set("hive.metastore.uris", "http://localhost:0;)
+val util = new YarnSparkHadoopUtil
+val e = intercept[InvocationTargetException] {
+  val token = util.obtainTokenForHiveMetastoreInner(hadoopConf, 
"alice")
+  fail(s"Expected an exception, got the token $token")
+}
+val inner = e.getCause
+if (inner == null) {
+  fail("No inner cause", e)
+}
+if (!inner.isInstanceOf[HiveException]) {
+  fail(s"Not a hive exception", inner)
+}
+// expect exception trapping code to unwind this hive-side exception
+intercept[HiveException] {
+  val token = util.obtainTokenForHiveMetastore(hadoopConf)
+  fail(s"Expected an exception, got the token $token")
--- End diff --

I fail to see how the contents of `token` are useful in diagnosing 
anything. The fact that you didn't get the expected exception is the only thing 
being tested get.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11265] [YARN] [WIP] YarnClient can't ge...

2015-10-25 Thread steveloughran
Github user steveloughran commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r42945056
  
--- Diff: 
yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala 
---
@@ -245,4 +247,28 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite 
with Matchers with Logging
   System.clearProperty("SPARK_YARN_MODE")
 }
   }
+
+  test("Obtain tokens For HiveMetastore") {
+val hadoopConf = new Configuration()
+hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+// thrift picks up on port 0 and bails out, without trying to talk to 
endpoint
+hadoopConf.set("hive.metastore.uris", "http://localhost:0;)
+val util = new YarnSparkHadoopUtil
+val e = intercept[InvocationTargetException] {
+  val token = util.obtainTokenForHiveMetastoreInner(hadoopConf, 
"alice")
+  fail(s"Expected an exception, got the token $token")
+}
+val inner = e.getCause
+if (inner == null) {
+  fail("No inner cause", e)
+}
+if (!inner.isInstanceOf[HiveException]) {
+  fail(s"Not a hive exception", inner)
+}
+// expect exception trapping code to unwind this hive-side exception
+intercept[HiveException] {
+  val token = util.obtainTokenForHiveMetastore(hadoopConf)
+  fail(s"Expected an exception, got the token $token")
--- End diff --

again, better diagnostics. Now, if you look at `Assertions.intercept()`, 
it's clearly discarding the result of the inner closure. If it  was redone to 
take an error string on a mismatch, and prepend that to the `.toString()` value 
of anything returned which wasn't of type `Unit`, then we'd have something 
really good at reporting failures to raise exceptions. As it is, `intercept` is 
pretty lossy.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org