[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-11-04 Thread steveloughran
Github user steveloughran closed the pull request at:

https://github.com/apache/spark/pull/9466


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-11-04 Thread vanzin
Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/9466#issuecomment-153833441
  
@steveloughran I merged this; can you close the PR? GitHub only auto-closes
PRs submitted against master.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-11-04 Thread vanzin
Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/9466#issuecomment-153833114
  
LGTM, merging.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-11-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9466#issuecomment-153740806
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45014/
Test PASSed.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-11-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9466#issuecomment-153740799
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-11-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9466#issuecomment-153740334
  
  [Test build #45014 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45014/console) for PR 9466 at commit [`2b72de9`](https://github.com/apache/spark/commit/2b72de977bec23d840c3895291c7b8886f5822cd).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-11-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9466#issuecomment-153693389
  
  [Test build #45014 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45014/consoleFull) for PR 9466 at commit [`2b72de9`](https://github.com/apache/spark/commit/2b72de977bec23d840c3895291c7b8886f5822cd).





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-11-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9466#issuecomment-153690587
  
Merged build started.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-11-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9466#issuecomment-153690560
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-11-04 Thread steveloughran
GitHub user steveloughran opened a pull request:

https://github.com/apache/spark/pull/9466

[SPARK-11265] [YARN]  YarnClient can't get tokens to talk to Hive in a 
secure…

Backport to branch-1.5 of SPARK-11265 patch.

The sole change is in `Client`, which no longer probes whether the token
request is enabled; the request always happens. Provided hive.jar is on the
classpath, the client will attempt to get the token, and if that fails for
any reason other than a ClassNotFoundException (CNFE), the client launch
will fail.
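
The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in `Client`: `FakeToken`, `TokenProbeSketch`, and `obtainToken` are hypothetical names, and the fetch function stands in for whatever actually requests the Hive metastore token.

```scala
object TokenProbeSketch {
  // Hypothetical stand-in for a delegation token; not a Hadoop or Spark type.
  final case class FakeToken(service: String)

  /**
   * Runs `fetch` and wraps its result. Only a ClassNotFoundException
   * (Hive classes missing from the classpath) is downgraded to None;
   * every other exception propagates to the caller, which in the
   * description above means the client launch fails.
   */
  def obtainToken(fetch: () => FakeToken): Option[FakeToken] =
    try {
      Some(fetch())
    } catch {
      case e: ClassNotFoundException =>
        println(s"Hive class not found $e")
        None
    }
}
```

With this shape, a cluster without Hive on the classpath still launches, while a misconfigured metastore surfaces as a launch failure instead of a silently missing token.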

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/steveloughran/spark stevel/patches/SPARK-11265-on-branch-1.5

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/9466.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #9466


commit 2b72de977bec23d840c3895291c7b8886f5822cd
Author: Steve Loughran 
Date:   2015-11-03T16:39:24Z

[SPARK-11265] YarnClient can't get tokens to talk to Hive in a secure 
cluster - backport to branch-1.5







[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-11-04 Thread steveloughran
Github user steveloughran closed the pull request at:

https://github.com/apache/spark/pull/9438





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-11-04 Thread steveloughran
Github user steveloughran commented on the pull request:

https://github.com/apache/spark/pull/9438#issuecomment-153688927
  
that's what comes of trying to code at a conference; will cancel and 
resubmit





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9438#issuecomment-153467262
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44936/
Test FAILed.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9438#issuecomment-153467259
  
Build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9438#issuecomment-153467150
  
**[Test build #44936 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44936/console)** for PR 9438 at commit [`2b72de9`](https://github.com/apache/spark/commit/2b72de977bec23d840c3895291c7b8886f5822cd) after a configured wait of `175m`.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-11-03 Thread vanzin
Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/9438#issuecomment-153415945
  
@steveloughran you have to choose the right target branch when submitting 
the PR. Can you close this one and open a new one with the correct target 
branch?





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9438#issuecomment-153414575
  
  [Test build #44936 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44936/consoleFull) for PR 9438 at commit [`2b72de9`](https://github.com/apache/spark/commit/2b72de977bec23d840c3895291c7b8886f5822cd).





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9438#issuecomment-153413113
  
 Build triggered.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9438#issuecomment-153413163
  
Build started.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-11-03 Thread steveloughran
GitHub user steveloughran opened a pull request:

https://github.com/apache/spark/pull/9438

[SPARK-11265] [YARN] YarnClient can't get tokens to talk to Hive in a 
secure cluster - backport to branch-1.5

This is a backport of the [SPARK-11265] patch to branch-1.5; it won't compile
against master.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/steveloughran/spark stevel/patches/SPARK-11265-on-branch-1.5

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/9438.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #9438


commit 76d920f2b814304051dd76f0ca78301e872fc811
Author: Yu ISHIKAWA 
Date:   2015-08-25T07:28:51Z

[SPARK-10214] [SPARKR] [DOCS] Improve SparkR Column, DataFrame API docs

cc: shivaram

## Summary

- Add name tags to each method in DataFrame.R and column.R
- Replace `rdname column` with `rdname {each_func}`, e.g. for the alias method: `rdname column` => `rdname alias`

## Generated PDF File

https://drive.google.com/file/d/0B9biIZIU47lLNHN2aFpnQXlSeGs/view?usp=sharing

## JIRA
[[SPARK-10214] Improve SparkR Column, DataFrame API docs - ASF JIRA](https://issues.apache.org/jira/browse/SPARK-10214)

Author: Yu ISHIKAWA 

Closes #8414 from yu-iskw/SPARK-10214.

(cherry picked from commit d4549fe58fa0d781e0e891bceff893420cb1d598)
Signed-off-by: Shivaram Venkataraman 

commit 4841ebb1861025067a1108c11f64bb144427a308
Author: Sean Owen 
Date:   2015-08-25T07:32:20Z

[SPARK-6196] [BUILD] Remove MapR profiles in favor of hadoop-provided

Follow up to https://github.com/apache/spark/pull/7047

pwendell mentioned that MapR should use `hadoop-provided` now, and indeed 
the new build script does not produce `mapr3`/`mapr4` artifacts anymore. Hence 
the action seems to be to remove the profiles, which are now not used.

CC trystanleftwich

Author: Sean Owen 

Closes #8338 from srowen/SPARK-6196.

(cherry picked from commit 57b960bf3706728513f9e089455a533f0244312e)
Signed-off-by: Sean Owen 

commit 2032d66706d165079550f06bf695e0b08be7e143
Author: Tathagata Das 
Date:   2015-08-25T07:35:51Z

[SPARK-10210] [STREAMING] Filter out non-existent blocks before creating 
BlockRDD

When the write ahead log is not enabled, a recovered streaming driver still
tries to run jobs using pre-failure block ids, and fails as the blocks no
longer exist in memory (and cannot be recovered as the receiver WAL is not
enabled).

This occurs because the driver-side WAL of ReceivedBlockTracker recovers
that past block information, and ReceiverInputDStream creates BlockRDDs even
if those blocks do not exist.

The solution in this PR is to filter out block ids that do not exist before 
creating the BlockRDD. In addition, it adds unit tests to verify other logic in 
ReceiverInputDStream.

Author: Tathagata Das 

Closes #8405 from tdas/SPARK-10210.

(cherry picked from commit 1fc37581a52530bac5d555dbf14927a5780c3b75)
Signed-off-by: Tathagata Das 

commit e5cea566a32d254adc9424a2f9e79b92eda3e6e4
Author: Davies Liu 
Date:   2015-08-25T08:00:44Z

[SPARK-10177] [SQL] fix reading Timestamp in parquet from Hive

We misunderstood the Julian days and nanoseconds of the day in parquet (as
TimestampType) from Hive/Impala; they overlap, so they can't be added
together directly.

To avoid confusing rounding during the conversion, we use `2440588` as the
Julian Day of the Unix epoch (which strictly should be 2440587.5).
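
As a rough sketch of the arithmetic this message describes, assuming the parquet value arrives as an integer Julian day plus the nanoseconds elapsed within that day: the names `JulianDaySketch`, `JulianDayOfEpoch`, and `fromJulianDay` below are illustrative and are not claimed to match the patch.

```scala
object JulianDaySketch {
  // Whole-number Julian Day used for the Unix epoch, as stated in the
  // commit message (the astronomical value is 2440587.5).
  val JulianDayOfEpoch: Int = 2440588

  // Microseconds in one day: 24 h * 60 min * 60 s * 1,000,000 us.
  val MicrosPerDay: Long = 24L * 60 * 60 * 1000 * 1000

  /**
   * Converts (Julian day, nanoseconds within that day) to microseconds
   * since the Unix epoch. Days are shifted by a whole number, so no
   * half-day rounding is involved; nanoseconds are truncated to micros.
   */
  def fromJulianDay(day: Int, nanosOfDay: Long): Long =
    (day - JulianDayOfEpoch).toLong * MicrosPerDay + nanosOfDay / 1000L
}
```

For example, under these assumptions day `2440588` with zero nanoseconds maps to the epoch itself, and each additional day adds 86,400,000,000 microseconds.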

Author: Davies Liu 
Author: Cheng Lian 

Closes #8400 from davies/timestamp_parquet.

(cherry picked from commit 2f493f7e3924b769160a16f73cccbebf21973b91)
Signed-off-by: Cheng Lian 

commit a0f22cf295a1d20814c5be6cc727e39e95a81c27
Author: Josh Rosen 
Date:   2015-08-25T08:06:36Z

[SPARK-10195] [SQL] Data sources Filter should not expose internal types

Spark SQL's data sources API exposes Catalyst's internal types through its 
Filter interfaces. This is a problem because types like UTF8String are not 
stable developer APIs and should not be exposed to third-parties.

This issue caused incompatibilities when upgrading our `spark-redshift` 
library to work against Spark 1.5.0.  To avoid these issues in the future we 
should only expose public types through these Filter objects. This patch 
accomplishes this by using CatalystTypeConverters to add the appropriate 
conversions.

Author: Josh Rosen 

Closes #8403 from JoshRosen/datasources-internal-vs-external-types.

(cherry picked from commit 7bc9a8c6249300ded31ea931c463d0a8f798e193)
Signed-off-by: Reynold Xin 

commit 73f1dd1b5acf1c6c37045da25902d7ca5

[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-11-01 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43587708
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,76 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
     val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
     ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be based on this
+   * @return a token, or `None` if there's no need for a token (no metastore URI or principal
+   *         in the config), or if a binding exception was caught and downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = {
+    try {
+      obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName)
+    } catch {
+      case e: ClassNotFoundException =>
+        logInfo(s"Hive class not found $e")
+        logDebug("Hive class not found", e)
--- End diff --

They're not exactly the same.
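
One plausible reading of the difference, sketched without Spark's `Logging` trait: the info-level call interpolates only the exception's `toString` into a single line, while the debug-level call receives the throwable itself and so can emit the full stack trace. `DoubleLogSketch`, `infoLine`, and `debugLines` are hypothetical helpers that only mimic what each call would render.

```scala
object DoubleLogSketch {
  // Mimics logInfo(s"Hive class not found $e"): one line, no stack trace.
  def infoLine(e: Throwable): String = s"Hive class not found $e"

  // Mimics logDebug("Hive class not found", e): message plus stack trace.
  def debugLines(e: Throwable): String = {
    val sw = new java.io.StringWriter()
    e.printStackTrace(new java.io.PrintWriter(sw, true))
    s"Hive class not found\n$sw"
  }
}
```

Under that reading, users at the default log level see a short notice, and the stack trace only appears when debug logging is switched on.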





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-11-01 Thread vanzin
Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-152864441
  
> Same JIRA or a new backport one?

You could use the same one; I marked it as resolved but it's ok to reopen 
it for the backport.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-11-01 Thread tedyu
Github user tedyu commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43583071
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,76 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
     val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
     ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be based on this
+   * @return a token, or `None` if there's no need for a token (no metastore URI or principal
+   *         in the config), or if a binding exception was caught and downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = {
+    try {
+      obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName)
+    } catch {
+      case e: ClassNotFoundException =>
+        logInfo(s"Hive class not found $e")
+        logDebug("Hive class not found", e)
--- End diff --

Why double-log the message?





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-11-01 Thread steveloughran
Github user steveloughran commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-152815974
  
Will do. Same JIRA or a new backport one?





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-31 Thread vanzin
Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-152786237
  
Hi @steveloughran, this patch doesn't merge cleanly to branch-1.5. If you 
want to apply it there, could you send a new pr for that branch? Thanks.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-31 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/9232





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-31 Thread vanzin
Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-152785103
  
Merging to master.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-31 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43577485
  
--- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala ---
@@ -245,4 +247,31 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite with Matchers with Logging
       System.clearProperty("SPARK_YARN_MODE")
     }
   }
+
+  test("Obtain tokens For HiveMetastore") {
+    val hadoopConf = new Configuration()
+    hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+    // thrift picks up on port 0 and bails out, without trying to talk to endpoint
+    hadoopConf.set("hive.metastore.uris", "http://localhost:0")
+    val util = new YarnSparkHadoopUtil
--- End diff --

nah, it's ok to leave as is.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-31 Thread vanzin
Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-152781530
  
Latest patch LGTM.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-152497070
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-152496951
  
**[Test build #44676 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44676/consoleFull)** for PR 9232 at commit [`ebb2b5a`](https://github.com/apache/spark/commit/ebb2b5abdf26c9e0a72452a47f8cd23b09e4339c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `logInfo(s"Hive class not found $e")`
   * `logDebug("Hive class not found", e)`





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-152497071
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44676/
Test PASSed.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-152490177
  
**[Test build #44676 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44676/consoleFull)** for PR 9232 at commit [`ebb2b5a`](https://github.com/apache/spark/commit/ebb2b5abdf26c9e0a72452a47f8cd23b09e4339c).





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-152489542
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-152489569
  
Merged build started.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-30 Thread steveloughran
Github user steveloughran commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43489033
  
--- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala ---
@@ -245,4 +247,28 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite with Matchers with Logging
       System.clearProperty("SPARK_YARN_MODE")
     }
   }
+
+  test("Obtain tokens For HiveMetastore") {
+    val hadoopConf = new Configuration()
+    hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+    // thrift picks up on port 0 and bails out, without trying to talk to endpoint
+    hadoopConf.set("hive.metastore.uris", "http://localhost:0")
+    val util = new YarnSparkHadoopUtil
+    val e = intercept[InvocationTargetException] {
+      val token = util.obtainTokenForHiveMetastoreInner(hadoopConf, "alice")
+      fail(s"Expected an exception, got the token $token")
+    }
+    val inner = e.getCause
+    if (inner == null) {
+      fail("No inner cause", e)
+    }
+    if (!inner.isInstanceOf[HiveException]) {
+      fail(s"Not a hive exception", inner)
--- End diff --

done





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-30 Thread steveloughran
Github user steveloughran commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43488997
  
--- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala ---
@@ -245,4 +247,31 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite with Matchers with Logging
       System.clearProperty("SPARK_YARN_MODE")
     }
   }
+
+  test("Obtain tokens For HiveMetastore") {
+    val hadoopConf = new Configuration()
+    hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+    // thrift picks up on port 0 and bails out, without trying to talk to endpoint
+    hadoopConf.set("hive.metastore.uris", "http://localhost:0")
+    val util = new YarnSparkHadoopUtil
--- End diff --

all the other tests do the same; if you want a switch it may as well be 
across the suite





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-30 Thread steveloughran
Github user steveloughran commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43488942
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,81 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
     val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
     ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be based on this
+   * @return a token, or `None` if there's no need for a token (no metastore URI or principal
+   * in the config), or if a binding exception was caught and downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = {
+    try {
+      obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName)
+    } catch {
+      case e: ClassNotFoundException =>
+        logInfo(s"Hive class not found $e")
+        logDebug("Hive class not found", e)
+        None
+      case t: Throwable =>
+        throw t
+    }
+  }
+
+  /**
+   * Inner routine to obtains token for the Hive metastore; exceptions are raised on any problem.
+   * @param conf hadoop configuration; the Hive configuration will be based on this.
+   * @param username the username of the principal requesting the delegating token.
+   * @return a delegation token
+   */
+  private[yarn] def obtainTokenForHiveMetastoreInner(conf: Configuration,
+      username: String): Option[Token[DelegationTokenIdentifier]] = {
+    val mirror = universe.runtimeMirror(Utils.getContextOrSparkClassLoader)
+
+    // the hive configuration class is a subclass of Hadoop Configuration, so can be cast down
+    // to a Configuration and used without reflection
+    val hiveConfClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf")
+    // using the (Configuration, Class) constructor allows the current configuratin to be included
+    // in the hive config.
+    val ctor = hiveConfClass.getDeclaredConstructor(classOf[Configuration],
+      classOf[Object].getClass)
+    val hiveConf = ctor.newInstance(conf, hiveConfClass).asInstanceOf[Configuration]
+    val metastoreUri = hiveConf.getTrimmed("hive.metastore.uris", "")
+
+    // Check for local metastore
+    if (metastoreUri.nonEmpty) {
+      require(username.nonEmpty, "Username undefined")
+      val principalKey = "hive.metastore.kerberos.principal"
+      val principal = hiveConf.getTrimmed(principalKey, "")
+      require(principal.nonEmpty, "Hive principal $principalKey undefined")
+      logDebug(s"Getting Hive delegation token for $username against $principal at $metastoreUri")
+      val hiveClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.ql.metadata.Hive")
+      val closeCurrent = hiveClass.getMethod("closeCurrent")
+      try {
+        // get all the instance methods before invoking any
+        val getDelegationToken = hiveClass.getMethod("getDelegationToken",
+          classOf[String], classOf[String])
+        val getHive = hiveClass.getMethod("get", hiveConfClass)
+
+        // invoke
+        val hive = getHive.invoke(null, hiveConf)
+        val tokenStr = getDelegationToken.invoke(hive, username, principal)
+          .asInstanceOf[java.lang.String]
+        val hive2Token = new Token[DelegationTokenIdentifier]()
+        hive2Token.decodeFromUrlString(tokenStr)
+        Some(hive2Token)
+      } finally {
+        try {
+          closeCurrent.invoke(null)
+        } catch {
+          case e: Exception => logWarning("In Hive.closeCurrent()", e)
--- End diff --

`Utils.tryLogNonFatalError` looks cleaner; switching





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-30 Thread steveloughran
Github user steveloughran commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43488794
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,81 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
     val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
     ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be based on this
+   * @return a token, or `None` if there's no need for a token (no metastore URI or principal
+   * in the config), or if a binding exception was caught and downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = {
+    try {
+      obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName)
+    } catch {
+      case e: ClassNotFoundException =>
+        logInfo(s"Hive class not found $e")
+        logDebug("Hive class not found", e)
+        None
+      case t: Throwable =>
+        throw t
+    }
+  }
+
+  /**
+   * Inner routine to obtains token for the Hive metastore; exceptions are raised on any problem.
+   * @param conf hadoop configuration; the Hive configuration will be based on this.
+   * @param username the username of the principal requesting the delegating token.
+   * @return a delegation token
+   */
+  private[yarn] def obtainTokenForHiveMetastoreInner(conf: Configuration,
+      username: String): Option[Token[DelegationTokenIdentifier]] = {
+    val mirror = universe.runtimeMirror(Utils.getContextOrSparkClassLoader)
+
+    // the hive configuration class is a subclass of Hadoop Configuration, so can be cast down
+    // to a Configuration and used without reflection
+    val hiveConfClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf")
+    // using the (Configuration, Class) constructor allows the current configuratin to be included
+    // in the hive config.
+    val ctor = hiveConfClass.getDeclaredConstructor(classOf[Configuration],
+      classOf[Object].getClass)
+    val hiveConf = ctor.newInstance(conf, hiveConfClass).asInstanceOf[Configuration]
+    val metastoreUri = hiveConf.getTrimmed("hive.metastore.uris", "")
+
+    // Check for local metastore
+    if (metastoreUri.nonEmpty) {
+      require(username.nonEmpty, "Username undefined")
+      val principalKey = "hive.metastore.kerberos.principal"
+      val principal = hiveConf.getTrimmed(principalKey, "")
+      require(principal.nonEmpty, "Hive principal $principalKey undefined")
+      logDebug(s"Getting Hive delegation token for $username against $principal at $metastoreUri")
+      val hiveClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.ql.metadata.Hive")
+      val closeCurrent = hiveClass.getMethod("closeCurrent")
+      try {
+        // get all the instance methods before invoking any
+        val getDelegationToken = hiveClass.getMethod("getDelegationToken",
+          classOf[String], classOf[String])
+        val getHive = hiveClass.getMethod("get", hiveConfClass)
+
+        // invoke
+        val hive = getHive.invoke(null, hiveConf)
+        val tokenStr = getDelegationToken.invoke(hive, username, principal)
+          .asInstanceOf[java.lang.String]
--- End diff --

copied from the original...cut that and joined the lines





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-30 Thread steveloughran
Github user steveloughran commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43488706
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -1337,55 +1337,9 @@ object Client extends Logging {
       conf: Configuration,
       credentials: Credentials) {
     if (shouldGetTokens(sparkConf, "hive") && UserGroupInformation.isSecurityEnabled) {
-      val mirror = universe.runtimeMirror(getClass.getClassLoader)
-
-      try {
-        val hiveConfClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf")
-        val hiveConf = hiveConfClass.newInstance()
-
-        val hiveConfGet = (param: String) => Option(hiveConfClass
-          .getMethod("get", classOf[java.lang.String])
-          .invoke(hiveConf, param))
-
-        val metastore_uri = hiveConfGet("hive.metastore.uris")
-
-        // Check for local metastore
-        if (metastore_uri != None && metastore_uri.get.toString.size > 0) {
-          val hiveClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.ql.metadata.Hive")
-          val hive = hiveClass.getMethod("get").invoke(null, hiveConf.asInstanceOf[Object])
-
-          val metastore_kerberos_principal_conf_var = mirror.classLoader
-            .loadClass("org.apache.hadoop.hive.conf.HiveConf$ConfVars")
-            .getField("METASTORE_KERBEROS_PRINCIPAL").get("varname").toString
-
-          val principal = hiveConfGet(metastore_kerberos_principal_conf_var)
-
-          val username = Option(UserGroupInformation.getCurrentUser().getUserName)
-          if (principal != None && username != None) {
-            val tokenStr = hiveClass.getMethod("getDelegationToken",
-              classOf[java.lang.String], classOf[java.lang.String])
-              .invoke(hive, username.get, principal.get).asInstanceOf[java.lang.String]
-
-            val hive2Token = new Token[DelegationTokenIdentifier]()
-            hive2Token.decodeFromUrlString(tokenStr)
-            credentials.addToken(new Text("hive.server2.delegation.token"), hive2Token)
-            logDebug("Added hive.Server2.delegation.token to conf.")
-            hiveClass.getMethod("closeCurrent").invoke(null)
-          } else {
-            logError("Username or principal == NULL")
-            logError(s"""username=${username.getOrElse("(NULL)")}""")
-            logError(s"""principal=${principal.getOrElse("(NULL)")}""")
-            throw new IllegalArgumentException("username and/or principal is equal to null!")
-          }
-        } else {
-          logDebug("HiveMetaStore configured in localmode")
-        }
-      } catch {
-        case e: java.lang.NoSuchMethodException => { logInfo("Hive Method not found " + e); return }
-        case e: java.lang.ClassNotFoundException => { logInfo("Hive Class not found " + e); return }
-        case e: Exception => { logError("Unexpected Exception " + e)
-          throw new RuntimeException("Unexpected exception", e)
-        }
+      val util = new YarnSparkHadoopUtil()
--- End diff --

done





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-30 Thread steveloughran
Github user steveloughran commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43488737
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,81 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
     val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
     ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be based on this
+   * @return a token, or `None` if there's no need for a token (no metastore URI or principal
+   * in the config), or if a binding exception was caught and downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = {
+    try {
+      obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName)
+    } catch {
+      case e: ClassNotFoundException =>
+        logInfo(s"Hive class not found $e")
+        logDebug("Hive class not found", e)
+        None
+      case t: Throwable =>
--- End diff --

OK





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-30 Thread vanzin
Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-152453057
  
Just some minor things left to clean up, otherwise looks ok.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-30 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43478640
  
--- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala ---
@@ -245,4 +247,28 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite with Matchers with Logging
       System.clearProperty("SPARK_YARN_MODE")
     }
   }
+
+  test("Obtain tokens For HiveMetastore") {
+    val hadoopConf = new Configuration()
+    hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+    // thrift picks up on port 0 and bails out, without trying to talk to endpoint
+    hadoopConf.set("hive.metastore.uris", "http://localhost:0")
+    val util = new YarnSparkHadoopUtil
+    val e = intercept[InvocationTargetException] {
+      val token = util.obtainTokenForHiveMetastoreInner(hadoopConf, "alice")
+      fail(s"Expected an exception, got the token $token")
+    }
+    val inner = e.getCause
+    if (inner == null) {
+      fail("No inner cause", e)
+    }
+    if (!inner.isInstanceOf[HiveException]) {
+      fail(s"Not a hive exception", inner)
--- End diff --

nit: nothing to interpolate, can drop the `s`.
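
To make the nit concrete, here is a minimal sketch that is not part of the patch (`InterpolatorSketch` is an illustrative name only). It shows that the `s` prefix changes nothing when the literal has no `$` reference, and that the prefix matters as soon as one is present:

```scala
object InterpolatorSketch {
  val principalKey = "hive.metastore.kerberos.principal"

  // No `$` reference: with or without the prefix the result is the same string.
  val plain = "Not a hive exception"
  val prefixed = s"Not a hive exception"

  // With a `$` reference, the prefix decides whether substitution happens.
  val substituted = s"Hive principal $principalKey undefined"
  val literal = "Hive principal $principalKey undefined"

  def main(args: Array[String]): Unit = {
    println(plain == prefixed)
    println(substituted)
    println(literal)
  }
}
```

The last value keeps the text `$principalKey` verbatim, which is worth checking for wherever a message is meant to include a variable.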





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-30 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43478556
  
--- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala ---
@@ -245,4 +247,31 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite with Matchers with Logging
       System.clearProperty("SPARK_YARN_MODE")
     }
   }
+
+  test("Obtain tokens For HiveMetastore") {
+    val hadoopConf = new Configuration()
+    hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+    // thrift picks up on port 0 and bails out, without trying to talk to endpoint
+    hadoopConf.set("hive.metastore.uris", "http://localhost:0")
+    val util = new YarnSparkHadoopUtil
--- End diff --

`YarnSparkHadoopUtil.get`





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-30 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43478531
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,81 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
     val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
     ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be based on this
+   * @return a token, or `None` if there's no need for a token (no metastore URI or principal
+   * in the config), or if a binding exception was caught and downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = {
+    try {
+      obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName)
+    } catch {
+      case e: ClassNotFoundException =>
+        logInfo(s"Hive class not found $e")
+        logDebug("Hive class not found", e)
+        None
+      case t: Throwable =>
+        throw t
+    }
+  }
+
+  /**
+   * Inner routine to obtains token for the Hive metastore; exceptions are raised on any problem.
+   * @param conf hadoop configuration; the Hive configuration will be based on this.
+   * @param username the username of the principal requesting the delegating token.
+   * @return a delegation token
+   */
+  private[yarn] def obtainTokenForHiveMetastoreInner(conf: Configuration,
+      username: String): Option[Token[DelegationTokenIdentifier]] = {
+    val mirror = universe.runtimeMirror(Utils.getContextOrSparkClassLoader)
+
+    // the hive configuration class is a subclass of Hadoop Configuration, so can be cast down
+    // to a Configuration and used without reflection
+    val hiveConfClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf")
+    // using the (Configuration, Class) constructor allows the current configuratin to be included
+    // in the hive config.
+    val ctor = hiveConfClass.getDeclaredConstructor(classOf[Configuration],
+      classOf[Object].getClass)
+    val hiveConf = ctor.newInstance(conf, hiveConfClass).asInstanceOf[Configuration]
+    val metastoreUri = hiveConf.getTrimmed("hive.metastore.uris", "")
+
+    // Check for local metastore
+    if (metastoreUri.nonEmpty) {
+      require(username.nonEmpty, "Username undefined")
+      val principalKey = "hive.metastore.kerberos.principal"
+      val principal = hiveConf.getTrimmed(principalKey, "")
+      require(principal.nonEmpty, "Hive principal $principalKey undefined")
+      logDebug(s"Getting Hive delegation token for $username against $principal at $metastoreUri")
+      val hiveClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.ql.metadata.Hive")
+      val closeCurrent = hiveClass.getMethod("closeCurrent")
+      try {
+        // get all the instance methods before invoking any
+        val getDelegationToken = hiveClass.getMethod("getDelegationToken",
+          classOf[String], classOf[String])
+        val getHive = hiveClass.getMethod("get", hiveConfClass)
+
+        // invoke
+        val hive = getHive.invoke(null, hiveConf)
+        val tokenStr = getDelegationToken.invoke(hive, username, principal)
+          .asInstanceOf[java.lang.String]
+        val hive2Token = new Token[DelegationTokenIdentifier]()
+        hive2Token.decodeFromUrlString(tokenStr)
+        Some(hive2Token)
+      } finally {
+        try {
+          closeCurrent.invoke(null)
+        } catch {
+          case e: Exception => logWarning("In Hive.closeCurrent()", e)
--- End diff --

minor: you could use `Utils.tryLogNonFatalError`, although that uses 
`logError`. Should be fine here, though, since this shouldn't really happen 
normally.
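
A rough sketch of the pattern being suggested, for readers without the Spark source to hand. It assumes nothing about Spark's actual implementation beyond what is said above: `tryLogNonFatal` and the `logged` buffer are hypothetical stand-ins for the real helper and logger.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.control.NonFatal

object CleanupSketch {
  // Stand-in for a logger: collects messages so the behaviour can be inspected.
  val logged = ArrayBuffer.empty[String]

  // Hypothetical helper: run the block, log any non-fatal failure, carry on.
  // Fatal errors (for example OutOfMemoryError) are not matched by NonFatal
  // and still propagate.
  def tryLogNonFatal(block: => Unit): Unit = {
    try {
      block
    } catch {
      case NonFatal(e) => logged += s"Uncaught exception: ${e.getMessage}"
    }
  }

  def main(args: Array[String]): Unit = {
    tryLogNonFatal { throw new IllegalStateException("close failed") }
    tryLogNonFatal { () }
    println(logged.mkString("\n"))
  }
}
```

With a helper of this shape, the nested `try`/`catch` inside `finally` collapses to a single call wrapping `closeCurrent.invoke(null)`.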





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-30 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43478447
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,81 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
     val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
     ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be based on this
+   * @return a token, or `None` if there's no need for a token (no metastore URI or principal
+   * in the config), or if a binding exception was caught and downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = {
+    try {
+      obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName)
+    } catch {
+      case e: ClassNotFoundException =>
+        logInfo(s"Hive class not found $e")
+        logDebug("Hive class not found", e)
+        None
+      case t: Throwable =>
+        throw t
+    }
+  }
+
+  /**
+   * Inner routine to obtains token for the Hive metastore; exceptions are raised on any problem.
+   * @param conf hadoop configuration; the Hive configuration will be based on this.
+   * @param username the username of the principal requesting the delegating token.
+   * @return a delegation token
+   */
+  private[yarn] def obtainTokenForHiveMetastoreInner(conf: Configuration,
+      username: String): Option[Token[DelegationTokenIdentifier]] = {
+    val mirror = universe.runtimeMirror(Utils.getContextOrSparkClassLoader)
+
+    // the hive configuration class is a subclass of Hadoop Configuration, so can be cast down
+    // to a Configuration and used without reflection
+    val hiveConfClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf")
+    // using the (Configuration, Class) constructor allows the current configuratin to be included
+    // in the hive config.
+    val ctor = hiveConfClass.getDeclaredConstructor(classOf[Configuration],
+      classOf[Object].getClass)
+    val hiveConf = ctor.newInstance(conf, hiveConfClass).asInstanceOf[Configuration]
+    val metastoreUri = hiveConf.getTrimmed("hive.metastore.uris", "")
+
+    // Check for local metastore
+    if (metastoreUri.nonEmpty) {
+      require(username.nonEmpty, "Username undefined")
+      val principalKey = "hive.metastore.kerberos.principal"
+      val principal = hiveConf.getTrimmed(principalKey, "")
+      require(principal.nonEmpty, "Hive principal $principalKey undefined")
+      logDebug(s"Getting Hive delegation token for $username against $principal at $metastoreUri")
+      val hiveClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.ql.metadata.Hive")
+      val closeCurrent = hiveClass.getMethod("closeCurrent")
+      try {
+        // get all the instance methods before invoking any
+        val getDelegationToken = hiveClass.getMethod("getDelegationToken",
+          classOf[String], classOf[String])
+        val getHive = hiveClass.getMethod("get", hiveConfClass)
+
+        // invoke
+        val hive = getHive.invoke(null, hiveConf)
+        val tokenStr = getDelegationToken.invoke(hive, username, principal)
+          .asInstanceOf[java.lang.String]
--- End diff --

minor: is `java.lang.` necessary here?
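
For context, a small sketch of why the qualifier is optional (the object name `StringAliasSketch` is illustrative and not from the patch). `scala.Predef` declares `type String = java.lang.String`, so both spellings name the same class and the cast behaves identically:

```scala
object StringAliasSketch {
  // Both spellings resolve to the same runtime class.
  val sameClass: Boolean = classOf[String] == classOf[java.lang.String]

  // A value typed as Any, standing in for the Object returned by Method.invoke.
  val raw: Any = "token"
  val viaShort: String = raw.asInstanceOf[String]
  val viaLong: String = raw.asInstanceOf[java.lang.String]

  def main(args: Array[String]): Unit = {
    println(sameClass)
    println(viaShort eq viaLong)
  }
}
```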


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-30 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43478397
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,81 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
     val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
     ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be based on this
+   * @return a token, or `None` if there's no need for a token (no metastore URI or principal
+   * in the config), or if a binding exception was caught and downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = {
+    try {
+      obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName)
+    } catch {
+      case e: ClassNotFoundException =>
+        logInfo(s"Hive class not found $e")
+        logDebug("Hive class not found", e)
+        None
+      case t: Throwable =>
--- End diff --

This is not needed...
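
Assuming the `case t: Throwable` clause only rethrows (the quoted diff is cut off before its body), the reasoning can be sketched as follows: a Scala `catch` block is a partial function, so any exception it does not match keeps propagating on its own. `CatchPropagation` and `loadOptional` are hypothetical names used only for illustration:

```scala
object CatchPropagation {
  // Only ClassNotFoundException is downgraded to None. There is no
  // catch-all clause: anything else propagates to the caller unchanged.
  def loadOptional(action: () => String): Option[String] =
    try {
      Some(action())
    } catch {
      case _: ClassNotFoundException => None
    }
}
```

Calling `loadOptional` with an action that throws `IllegalStateException` surfaces that exception at the call site, exactly as an explicit rethrow clause would.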





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-30 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43478370
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -1337,55 +1337,9 @@ object Client extends Logging {
   conf: Configuration,
   credentials: Credentials) {
 if (shouldGetTokens(sparkConf, "hive") && 
UserGroupInformation.isSecurityEnabled) {
-  val mirror = universe.runtimeMirror(getClass.getClassLoader)
-
-  try {
-val hiveConfClass = 
mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf")
-val hiveConf = hiveConfClass.newInstance()
-
-val hiveConfGet = (param: String) => Option(hiveConfClass
-  .getMethod("get", classOf[java.lang.String])
-  .invoke(hiveConf, param))
-
-val metastore_uri = hiveConfGet("hive.metastore.uris")
-
-// Check for local metastore
-if (metastore_uri != None && metastore_uri.get.toString.size > 0) {
-  val hiveClass = 
mirror.classLoader.loadClass("org.apache.hadoop.hive.ql.metadata.Hive")
-  val hive = hiveClass.getMethod("get").invoke(null, 
hiveConf.asInstanceOf[Object])
-
-  val metastore_kerberos_principal_conf_var = mirror.classLoader
-.loadClass("org.apache.hadoop.hive.conf.HiveConf$ConfVars")
-
.getField("METASTORE_KERBEROS_PRINCIPAL").get("varname").toString
-
-  val principal = 
hiveConfGet(metastore_kerberos_principal_conf_var)
-
-  val username = 
Option(UserGroupInformation.getCurrentUser().getUserName)
-  if (principal != None && username != None) {
-val tokenStr = hiveClass.getMethod("getDelegationToken",
-  classOf[java.lang.String], classOf[java.lang.String])
-  .invoke(hive, username.get, 
principal.get).asInstanceOf[java.lang.String]
-
-val hive2Token = new Token[DelegationTokenIdentifier]()
-hive2Token.decodeFromUrlString(tokenStr)
-credentials.addToken(new 
Text("hive.server2.delegation.token"), hive2Token)
-logDebug("Added hive.Server2.delegation.token to conf.")
-hiveClass.getMethod("closeCurrent").invoke(null)
-  } else {
-logError("Username or principal == NULL")
-logError(s"""username=${username.getOrElse("(NULL)")}""")
-logError(s"""principal=${principal.getOrElse("(NULL)")}""")
-throw new IllegalArgumentException("username and/or principal 
is equal to null!")
-  }
-} else {
-  logDebug("HiveMetaStore configured in localmode")
-}
-  } catch {
-case e: java.lang.NoSuchMethodException => { logInfo("Hive Method 
not found " + e); return }
-case e: java.lang.ClassNotFoundException => { logInfo("Hive Class 
not found " + e); return }
-case e: Exception => { logError("Unexpected Exception " + e)
-  throw new RuntimeException("Unexpected exception", e)
-}
+  val util = new YarnSparkHadoopUtil()
--- End diff --

This should be `YarnSparkHadoopUtil.get`; you could also avoid the `val` 
altogether.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-152334259
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44637/
Test PASSed.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-152334258
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-29 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-152334119
  
**[Test build #44637 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44637/consoleFull)** for PR 9232 at commit [`dd8dea9`](https://github.com/apache/spark/commit/dd8dea926cec22853fb665c83f13c78a4128c20a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `logInfo(s"Hive class not found $e")`
   * `logDebug("Hive class not found", e)`





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-29 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-152328362
  
**[Test build #44637 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44637/consoleFull)** for PR 9232 at commit [`dd8dea9`](https://github.com/apache/spark/commit/dd8dea926cec22853fb665c83f13c78a4128c20a).





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-152325087
  
Merged build started.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-152325046
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-29 Thread steveloughran
Github user steveloughran commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43439849
  
--- Diff: 
yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala 
---
@@ -245,4 +247,55 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite 
with Matchers with Logging
   System.clearProperty("SPARK_YARN_MODE")
 }
   }
+
+  test("Obtain tokens For HiveMetastore") {
+val hadoopConf = new Configuration()
+hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+// thrift picks up on port 0 and bails out, without trying to talk to endpoint
+hadoopConf.set("hive.metastore.uris", "http://localhost:0")
+val util = new YarnSparkHadoopUtil
+val e = intercept[InvocationTargetException] {
--- End diff --

done





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-29 Thread steveloughran
Github user steveloughran commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43369014
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -1337,55 +1337,9 @@ object Client extends Logging {
   conf: Configuration,
   credentials: Credentials) {
 if (shouldGetTokens(sparkConf, "hive") && 
UserGroupInformation.isSecurityEnabled) {
-  val mirror = universe.runtimeMirror(getClass.getClassLoader)
--- End diff --

This should go to `Utils.getContextOrSparkClassLoader()`. Notably, scalastyle 
doesn't pick up on this, even though it has rejected `Class.forName()` 
since SPARK-8962.
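
The idea behind that suggestion can be sketched without Spark on the classpath: prefer the thread's context class loader and fall back to the loader of the current class. `ClassLoaderSketch`, `contextOrFallback` and `tryLoad` are hypothetical names for illustration; this is not Spark's implementation:

```scala
object ClassLoaderSketch {
  // Prefer the thread's context class loader; fall back to the loader that
  // loaded this object when no context loader has been set.
  def contextOrFallback: ClassLoader =
    Option(Thread.currentThread().getContextClassLoader)
      .getOrElse(getClass.getClassLoader)

  // Look a class up by name through that loader, returning None if absent.
  def tryLoad(name: String): Option[Class[_]] =
    try {
      Some(contextOrFallback.loadClass(name))
    } catch {
      case _: ClassNotFoundException => None
    }
}
```

Going through the context loader matters when optional dependencies such as Hive are added to the application's class path rather than to the loader that loaded the calling class.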





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-29 Thread steveloughran
Github user steveloughran commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43368611
  
--- Diff: 
yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala 
---
@@ -245,4 +247,55 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite 
with Matchers with Logging
   System.clearProperty("SPARK_YARN_MODE")
 }
   }
+
+  test("Obtain tokens For HiveMetastore") {
+val hadoopConf = new Configuration()
+hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+// thrift picks up on port 0 and bails out, without trying to talk to endpoint
+hadoopConf.set("hive.metastore.uris", "http://localhost:0")
+val util = new YarnSparkHadoopUtil
+val e = intercept[InvocationTargetException] {
+  util.obtainTokenForHiveMetastoreInner(hadoopConf, "alice")
+}
+assertNestedHiveException(e)
+// expect exception trapping code to unwind this hive-side exception
+assertNestedHiveException(intercept[InvocationTargetException] {
+  util.obtainTokenForHiveMetastore(hadoopConf)
+})
+  }
+
+  def assertNestedHiveException(e: InvocationTargetException): Throwable = 
{
+val inner = e.getCause
+if (inner == null) {
+  fail("No inner cause", e)
+}
+if (!inner.isInstanceOf[HiveException]) {
+  fail(s"Not a hive exception", inner)
+}
+inner
+  }
+
+  test("handleTokenIntrospectionFailure") {
+val util = new YarnSparkHadoopUtil
+// downgraded exceptions
+util.handleTokenIntrospectionFailure("hive", new 
ClassNotFoundException("cnfe"))
--- End diff --

I'd thought about that purer option; it's easily testable too.

Given the policy is so simple now, I'll just pull the catch handler into 
place, and replicate it in the fixed HBase code afterwards; it's simple enough 
that a review should suffice.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-28 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43336052
  
--- Diff: 
yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala 
---
@@ -245,4 +247,55 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite 
with Matchers with Logging
   System.clearProperty("SPARK_YARN_MODE")
 }
   }
+
+  test("Obtain tokens For HiveMetastore") {
+val hadoopConf = new Configuration()
+hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+// thrift picks up on port 0 and bails out, without trying to talk to endpoint
+hadoopConf.set("hive.metastore.uris", "http://localhost:0")
+val util = new YarnSparkHadoopUtil
+val e = intercept[InvocationTargetException] {
+  util.obtainTokenForHiveMetastoreInner(hadoopConf, "alice")
+}
+assertNestedHiveException(e)
+// expect exception trapping code to unwind this hive-side exception
+assertNestedHiveException(intercept[InvocationTargetException] {
+  util.obtainTokenForHiveMetastore(hadoopConf)
+})
+  }
+
+  def assertNestedHiveException(e: InvocationTargetException): Throwable = 
{
+val inner = e.getCause
+if (inner == null) {
+  fail("No inner cause", e)
+}
+if (!inner.isInstanceOf[HiveException]) {
+  fail(s"Not a hive exception", inner)
+}
+inner
+  }
+
+  test("handleTokenIntrospectionFailure") {
+val util = new YarnSparkHadoopUtil
+// downgraded exceptions
+util.handleTokenIntrospectionFailure("hive", new 
ClassNotFoundException("cnfe"))
--- End diff --

Or yet another option is to have the method that handles exception take a 
closure, instead of the current approach of a method that matches on an 
exception parameter. e.g.:

def tryToGetTokens(service: String)(fn: () => Option[Token]): Option[Token] = {
  try {
    fn()
  } catch {
    ...
  }
}

def obtainTokenForHiveMetastore... = {
  tryToGetTokens("Hive") { obtainTokenForHiveMetastoreInner(...) }
}

I mostly dislike that exception handling feels like it's scattered around. 
You have a catch block in one place, then match on the exception somewhere 
else; that makes it hard to see at a glance what's really being done.
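
A self-contained variant of the closure-based wrapper might look like the sketch below. It swaps the `() =>` parameter for a by-name parameter so that a plain block can be passed at the call site. `Token` is a hypothetical stand-in for Hadoop's token type, and `obtainToken` with its `hiveOnClasspath` flag exists only to simulate the missing-class case:

```scala
object TokenFetchSketch {
  // Hypothetical stand-in for Hadoop's Token, to keep the sketch self-contained.
  final case class Token(service: String)

  // Runs `fn` and keeps the whole exception policy in one place: a missing
  // class is downgraded to None, everything else propagates to the caller.
  def tryToGetTokens(service: String)(fn: => Option[Token]): Option[Token] =
    try {
      fn
    } catch {
      case e: ClassNotFoundException =>
        println(s"$service class not found $e")
        None
    }

  // Simulated call site: throws as reflection would if Hive were absent.
  def obtainToken(hiveOnClasspath: Boolean): Option[Token] =
    tryToGetTokens("Hive") {
      if (!hiveOnClasspath) throw new ClassNotFoundException("HiveConf")
      Some(Token("hive"))
    }
}
```

With this shape the `try`, the `catch` and the matching clauses sit in one method, which addresses the "scattered" concern while still letting each service share the policy.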





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-28 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43266858
  
--- Diff: 
yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala 
---
@@ -245,4 +247,55 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite 
with Matchers with Logging
   System.clearProperty("SPARK_YARN_MODE")
 }
   }
+
+  test("Obtain tokens For HiveMetastore") {
+val hadoopConf = new Configuration()
+hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+// thrift picks up on port 0 and bails out, without trying to talk to endpoint
+hadoopConf.set("hive.metastore.uris", "http://localhost:0")
+val util = new YarnSparkHadoopUtil
+val e = intercept[InvocationTargetException] {
+  util.obtainTokenForHiveMetastoreInner(hadoopConf, "alice")
+}
+assertNestedHiveException(e)
+// expect exception trapping code to unwind this hive-side exception
+assertNestedHiveException(intercept[InvocationTargetException] {
+  util.obtainTokenForHiveMetastore(hadoopConf)
+})
+  }
+
+  def assertNestedHiveException(e: InvocationTargetException): Throwable = 
{
+val inner = e.getCause
+if (inner == null) {
+  fail("No inner cause", e)
+}
+if (!inner.isInstanceOf[HiveException]) {
+  fail(s"Not a hive exception", inner)
+}
+inner
+  }
+
+  test("handleTokenIntrospectionFailure") {
+val util = new YarnSparkHadoopUtil
+// downgraded exceptions
+util.handleTokenIntrospectionFailure("hive", new 
ClassNotFoundException("cnfe"))
--- End diff --

BTW, if you really want to implement a shared policy, I'd recommend adding 
something like `scala.util.control.NonFatal`. That makes the exception handling 
cleaner; it would look more like this:

try {
  // code that can throw
} catch {
  case IgnorableException(e) => logDebug(...)
}
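
A minimal sketch of such an extractor, modelled on the shape of `scala.util.control.NonFatal`. The set of ignorable exceptions here (only `ClassNotFoundException`) is an assumption taken from the policy discussed in this thread, and `IgnorableDemo` is a hypothetical helper that shows the call site:

```scala
object IgnorableException {
  // Matches only the exceptions the policy treats as harmless; here that is
  // just a missing class. Everything else fails to match and propagates.
  def unapply(t: Throwable): Option[Throwable] = t match {
    case _: ClassNotFoundException => Some(t)
    case _ => None
  }
}

object IgnorableDemo {
  // Runs `body`, reporting ignorable failures as a string instead of throwing.
  def run(body: => Unit): String =
    try {
      body
      "ok"
    } catch {
      case IgnorableException(e) => s"ignored: ${e.getMessage}"
    }
}
```

Because the extractor is an ordinary object, the policy can be unit-tested directly by calling `unapply`, with no need to provoke real failures.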






[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-28 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43251011
  
--- Diff: 
yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala 
---
@@ -245,4 +247,55 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite 
with Matchers with Logging
   System.clearProperty("SPARK_YARN_MODE")
 }
   }
+
+  test("Obtain tokens For HiveMetastore") {
+val hadoopConf = new Configuration()
+hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+// thrift picks up on port 0 and bails out, without trying to talk to endpoint
+hadoopConf.set("hive.metastore.uris", "http://localhost:0")
+val util = new YarnSparkHadoopUtil
+val e = intercept[InvocationTargetException] {
+  util.obtainTokenForHiveMetastoreInner(hadoopConf, "alice")
+}
+assertNestedHiveException(e)
+// expect exception trapping code to unwind this hive-side exception
+assertNestedHiveException(intercept[InvocationTargetException] {
+  util.obtainTokenForHiveMetastore(hadoopConf)
+})
+  }
+
+  def assertNestedHiveException(e: InvocationTargetException): Throwable = 
{
+val inner = e.getCause
+if (inner == null) {
+  fail("No inner cause", e)
+}
+if (!inner.isInstanceOf[HiveException]) {
+  fail(s"Not a hive exception", inner)
+}
+inner
+  }
+
+  test("handleTokenIntrospectionFailure") {
+val util = new YarnSparkHadoopUtil
+// downgraded exceptions
+util.handleTokenIntrospectionFailure("hive", new 
ClassNotFoundException("cnfe"))
--- End diff --

I think that because there's really only one exception that's currently 
interesting, you need more code to implement this "shared policy" approach than 
just catching the one interesting exception at each call site. It's true that 
if you need to modify the policy you'd need to duplicate code (or 
switch to your current approach), but do you envision needing to do that? 
What if the policy for each service needs to be different?

Personally I think that the current approach is a little confusing for 
someone reading the code (and inconsistent; for example, the current code 
catches `Exception` and then feeds it to a method that matches on `Throwable`), 
and because the policy is so simple, the sharing argument doesn't justify 
making the code harder to follow.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151818863
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44522/
Test PASSed.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151818859
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151818616
  
**[Test build #44522 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44522/consoleFull)** for PR 9232 at commit [`217faba`](https://github.com/apache/spark/commit/217faba0d372ac66c57420372db62244e628da39).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `logInfo(s"$service class not found $e")`
   * `logDebug("$service class not found", e)`





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151809791
  
**[Test build #44522 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44522/consoleFull)** for PR 9232 at commit [`217faba`](https://github.com/apache/spark/commit/217faba0d372ac66c57420372db62244e628da39).





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151808462
  
Merged build started.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151808446
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-28 Thread steveloughran
Github user steveloughran commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43242989
  
--- Diff: 
yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala 
---
@@ -245,4 +247,55 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite 
with Matchers with Logging
   System.clearProperty("SPARK_YARN_MODE")
 }
   }
+
+  test("Obtain tokens For HiveMetastore") {
+val hadoopConf = new Configuration()
+hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+// thrift picks up on port 0 and bails out, without trying to talk to endpoint
+hadoopConf.set("hive.metastore.uris", "http://localhost:0")
+val util = new YarnSparkHadoopUtil
+val e = intercept[InvocationTargetException] {
+  util.obtainTokenForHiveMetastoreInner(hadoopConf, "alice")
+}
+assertNestedHiveException(e)
+// expect exception trapping code to unwind this hive-side exception
+assertNestedHiveException(intercept[InvocationTargetException] {
+  util.obtainTokenForHiveMetastore(hadoopConf)
+})
+  }
+
+  def assertNestedHiveException(e: InvocationTargetException): Throwable = 
{
+val inner = e.getCause
+if (inner == null) {
+  fail("No inner cause", e)
+}
+if (!inner.isInstanceOf[HiveException]) {
+  fail(s"Not a hive exception", inner)
+}
+inner
+  }
+
+  test("handleTokenIntrospectionFailure") {
+val util = new YarnSparkHadoopUtil
+// downgraded exceptions
+util.handleTokenIntrospectionFailure("hive", new 
ClassNotFoundException("cnfe"))
--- End diff --

As soon as this patch is in I'll turn to 
[SPARK-11317](https://issues.apache.org/jira/browse/SPARK-11317), which is 
essentially "apply the same catching, filtering and reporting strategy for 
HBase tokens as for Hive ones". It's not as critical as this one (token 
retrieval is working), but as nothing gets logged except 
"InvocationTargetException" with no stack trace, recognising that the issue 
is a Kerberos auth problem, let alone fixing it, is a weekend's effort 
rather than 20 minutes' worth.

Because the policy goes in both places, having it separate and reusable 
gives zero-cut-and-paste reuse, with a single test for failures and no need 
to mock up failures across two separate clauses. Future maintenance 
costs are also kept down if someone ever decides to change the policy again.

Would you be happier if I cleaned up the HBase code as part of this same 
patch? I can, and it will make the benefits of the factored-out 
behaviour clearer. It's just messy to fix two things in one patch, especially 
if someone ever needs to cherry-pick or revert.
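
To illustrate why a bare "InvocationTargetException" is so unhelpful: reflection wraps whatever the invoked method threw, and the real failure is only visible through `getCause`. The sketch below is an illustration only, not what this patch does (the tests quoted above still expect the wrapper). `ReflectionUnwrap` and `invokeNoArgs` are hypothetical names:

```scala
import java.lang.reflect.InvocationTargetException

object ReflectionUnwrap {
  // Invokes a public no-argument method reflectively and rethrows the real
  // cause instead of the InvocationTargetException wrapper that hides it.
  def invokeNoArgs(target: AnyRef, methodName: String): AnyRef =
    try {
      target.getClass.getMethod(methodName).invoke(target)
    } catch {
      case e: InvocationTargetException if e.getCause != null =>
        throw e.getCause
    }
}
```

Logging the cause together with its stack trace, rather than only the wrapper's `toString`, is what makes a failure such as a Kerberos authentication error recognisable.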





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-28 Thread steveloughran
Github user steveloughran commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43240828
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,99 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
 val containerIdString = 
System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
 ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the 
principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be 
based on this
+   * @return a token, or `None` if there's no need for a token (no 
metastore URI or principal
+   * in the config), or if a binding exception was caught and 
downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): 
Option[Token[DelegationTokenIdentifier]] = {
+try {
+  obtainTokenForHiveMetastoreInner(conf, 
UserGroupInformation.getCurrentUser().getUserName)
+} catch {
+  case e: Exception => {
+handleTokenIntrospectionFailure("Hive", e)
+None
+  }
+}
+  }
+
+  /**
+   * Handle failures to obtain a token through introspection. Failures to 
load the class are
+   * not treated as errors: anything else is.
+   * @param service service name for error messages
+   * @param thrown exception caught
+   * @throws Exception if the `thrown` exception isn't one that is to be 
ignored
+   */
+  private[yarn] def handleTokenIntrospectionFailure(service: String, 
thrown: Throwable): Unit = {
+thrown match {
+  case e: ClassNotFoundException =>
+logInfo(s"$service class not found $e")
+logDebug("Hive Class not found", e)
+  case t: Throwable => {
--- End diff --

The reason I'd pulled it out was to have an isolated policy which could be 
both tested without mocking failures and re-used in the HBase token 
retrieval, which needs an identical set of clauses.

Given the policy has now been simplified so much, the method is now pretty 
much unused; I can pull it. But the HBase token logic will still need to be 
100% in sync.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-28 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43233534
  
--- Diff: 
yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala 
---
@@ -245,4 +247,55 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite 
with Matchers with Logging
   System.clearProperty("SPARK_YARN_MODE")
 }
   }
+
+  test("Obtain tokens For HiveMetastore") {
+val hadoopConf = new Configuration()
+hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+// thrift picks up on port 0 and bails out, without trying to talk to 
endpoint
+hadoopConf.set("hive.metastore.uris", "http://localhost:0")
+val util = new YarnSparkHadoopUtil
+val e = intercept[InvocationTargetException] {
--- End diff --

minor, but you could use the same style as below (where you avoid the temp 
variable).





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-28 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43233453
  
--- Diff: 
yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala 
---
@@ -245,4 +247,55 @@ class YarnSparkHadoopUtilSuite extends SparkFunSuite 
with Matchers with Logging
   System.clearProperty("SPARK_YARN_MODE")
 }
   }
+
+  test("Obtain tokens For HiveMetastore") {
+val hadoopConf = new Configuration()
+hadoopConf.set("hive.metastore.kerberos.principal", "bob")
+// thrift picks up on port 0 and bails out, without trying to talk to 
endpoint
+hadoopConf.set("hive.metastore.uris", "http://localhost:0")
+val util = new YarnSparkHadoopUtil
+val e = intercept[InvocationTargetException] {
+  util.obtainTokenForHiveMetastoreInner(hadoopConf, "alice")
+}
+assertNestedHiveException(e)
+// expect exception trapping code to unwind this hive-side exception
+assertNestedHiveException(intercept[InvocationTargetException] {
+  util.obtainTokenForHiveMetastore(hadoopConf)
+})
+  }
+
+  def assertNestedHiveException(e: InvocationTargetException): Throwable = 
{
+val inner = e.getCause
+if (inner == null) {
+  fail("No inner cause", e)
+}
+if (!inner.isInstanceOf[HiveException]) {
+  fail(s"Not a hive exception", inner)
+}
+inner
+  }
+
+  test("handleTokenIntrospectionFailure") {
+val util = new YarnSparkHadoopUtil
+// downgraded exceptions
+util.handleTokenIntrospectionFailure("hive", new 
ClassNotFoundException("cnfe"))
--- End diff --

Following my previous comment, you could get rid of this whole test case if 
you just do exception handling in the caller method. If you really want to test 
that CNFE is ignored, you could use Mockito's `spy` to mock 
`obtainTokenForHiveMetastoreInner` and make it throw a CNFE. The other tests 
here are not really that interesting anymore.
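
As a rough illustration of that idea (a sketch, not code from the PR): the 
snippet below uses a plain subclass override in place of Mockito's `spy`, so it 
runs without any test library. `TokenUtil`, `obtainTokenInner` and `obtainToken` 
are hypothetical stand-ins for `YarnSparkHadoopUtil` and its two methods, and it 
assumes the caller catches only `ClassNotFoundException`.

```scala
// Hypothetical stand-in for YarnSparkHadoopUtil; the method names echo the
// diff above, but the bodies are illustrative only.
class TokenUtil {
  // Stand-in for the reflective Hive call.
  def obtainTokenInner(user: String): Option[String] =
    Some(s"token-for-$user")

  // Exception handling lives in the caller: a missing class is downgraded
  // to None, anything else propagates untouched.
  def obtainToken(user: String): Option[String] =
    try {
      obtainTokenInner(user)
    } catch {
      case _: ClassNotFoundException => None
    }
}

object SpyStyleCheck {
  def main(args: Array[String]): Unit = {
    // Override the inner method so that it throws a CNFE, which is what a
    // spy would be configured to do.
    val failing = new TokenUtil {
      override def obtainTokenInner(user: String): Option[String] =
        throw new ClassNotFoundException("cnfe")
    }
    assert(failing.obtainToken("alice").isEmpty)
    // The unmodified class still returns its token.
    assert(new TokenUtil().obtainToken("alice").contains("token-for-alice"))
  }
}
```

With a real spy, the stubbing would replace the anonymous subclass; the 
assertion on the outer method would stay the same.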





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-28 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43231831
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,99 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
 val containerIdString = 
System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
 ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the 
principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be 
based on this
+   * @return a token, or `None` if there's no need for a token (no 
metastore URI or principal
+   * in the config), or if a binding exception was caught and 
downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): 
Option[Token[DelegationTokenIdentifier]] = {
+try {
+  obtainTokenForHiveMetastoreInner(conf, 
UserGroupInformation.getCurrentUser().getUserName)
+} catch {
+  case e: Exception => {
+handleTokenIntrospectionFailure("Hive", e)
+None
+  }
+}
+  }
+
+  /**
+   * Handle failures to obtain a token through introspection. Failures to 
load the class are
+   * not treated as errors: anything else is.
+   * @param service service name for error messages
+   * @param thrown exception caught
+   * @throws Exception if the `thrown` exception isn't one that is to be 
ignored
+   */
+  private[yarn] def handleTokenIntrospectionFailure(service: String, 
thrown: Throwable): Unit = {
+thrown match {
+  case e: ClassNotFoundException =>
+logInfo(s"$service class not found $e")
+logDebug("Hive Class not found", e)
+  case t: Throwable => {
--- End diff --

Hi @steveloughran ,

I think you're still not really getting what I'm saying. You can just 
*delete this whole `case`*. The exception will just propagate up the call stack.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-28 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43223745
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,97 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
 val containerIdString = 
System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
 ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the 
principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be 
based on this
+   * @return a token, or `None` if there's no need for a token (no 
metastore URI or principal
+   * in the config), or if a binding exception was caught and 
downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): 
Option[Token[DelegationTokenIdentifier]] = {
+try {
+  obtainTokenForHiveMetastoreInner(conf, 
UserGroupInformation.getCurrentUser().getUserName)
+} catch {
+  case e: Exception => {
+handleTokenIntrospectionFailure("Hive", e)
+None
+  }
+}
+  }
+
+  /**
+   * Handle failures to obtain a token through introspection. Failures to 
load the class are
+   * not treated as errors: anything else is.
+   * @param service service name for error messages
+   * @param thrown exception caught
+   * @throws Exception if the `thrown` exception isn't one that is to be 
ignored
+   */
+  private[yarn] def handleTokenIntrospectionFailure(service: String, 
thrown: Throwable): Unit = {
+thrown match {
+  case e: ClassNotFoundException =>
+logInfo(s"$service class not found $e")
+logDebug("Hive Class not found", e)
+  case t: Throwable => {
+throw t
--- End diff --

The user can (i) not give Spark a hive configuration, in which case there 
will be no metastore URIs, and this code will be skipped, or (ii) set 
`spark.yarn.security.tokens.hive.enabled` to false.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-27 Thread zhzhan
Github user zhzhan commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43204781
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,97 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
 val containerIdString = 
System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
 ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the 
principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be 
based on this
+   * @return a token, or `None` if there's no need for a token (no 
metastore URI or principal
+   * in the config), or if a binding exception was caught and 
downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): 
Option[Token[DelegationTokenIdentifier]] = {
+try {
+  obtainTokenForHiveMetastoreInner(conf, 
UserGroupInformation.getCurrentUser().getUserName)
+} catch {
+  case e: Exception => {
+handleTokenIntrospectionFailure("Hive", e)
+None
+  }
+}
+  }
+
+  /**
+   * Handle failures to obtain a token through introspection. Failures to 
load the class are
+   * not treated as errors: anything else is.
+   * @param service service name for error messages
+   * @param thrown exception caught
+   * @throws Exception if the `thrown` exception isn't one that is to be 
ignored
+   */
+  private[yarn] def handleTokenIntrospectionFailure(service: String, 
thrown: Throwable): Unit = {
+thrown match {
+  case e: ClassNotFoundException =>
+logInfo(s"$service class not found $e")
+logDebug("Hive Class not found", e)
+  case t: Throwable => {
+throw t
--- End diff --

Here the exception is thrown. I know swallowing the exception is bad, but what 
happens if the user does not want to access the Hive metastore but wants to use 
Spark even if the token cannot be acquired?





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151677558
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151677560
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44475/
Test PASSed.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151677475
  
**[Test build #44475 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44475/consoleFull)**
 for PR 9232 at commit 
[`00cb5a7`](https://github.com/apache/spark/commit/00cb5a7323a4f91adfa5c4273c8a6bcbc67dc008).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `logInfo(s"$service class not found $e")`





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151674747
  
**[Test build #44475 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44475/consoleFull)**
 for PR 9232 at commit 
[`00cb5a7`](https://github.com/apache/spark/commit/00cb5a7323a4f91adfa5c4273c8a6bcbc67dc008).





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151673386
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151673409
  
Merged build started.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-27 Thread steveloughran
Github user steveloughran commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43201346
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,99 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
 val containerIdString = 
System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
 ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the 
principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be 
based on this
+   * @return a token, or `None` if there's no need for a token (no 
metastore URI or principal
+   * in the config), or if a binding exception was caught and 
downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): 
Option[Token[DelegationTokenIdentifier]] = {
+try {
+  obtainTokenForHiveMetastoreInner(conf, 
UserGroupInformation.getCurrentUser().getUserName)
+} catch {
+  case e: Exception => {
+handleTokenIntrospectionFailure("Hive", e)
+None
+  }
+}
+  }
+
+  /**
+   * Handle failures to obtain a token through introspection. Failures to 
load the class are
+   * not treated as errors: anything else is.
+   * @param service service name for error messages
+   * @param thrown exception caught
+   * @throws Exception if the `thrown` exception isn't one that is to be 
ignored
+   */
+  private[yarn] def handleTokenIntrospectionFailure(service: String, 
thrown: Throwable): Unit = {
+thrown match {
+  case e: ClassNotFoundException =>
+logInfo(s"$service class not found $e")
+logDebug("Hive Class not found", e)
+  case t: Throwable => {
--- End diff --

Oh, we're at cross purposes. I was looking at line 128, above; you were at 
180. That's why I was confused. Yes, I'll cut the lower one.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151661827
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44469/
Test PASSed.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151661825
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151661688
  
**[Test build #44469 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44469/consoleFull)**
 for PR 9232 at commit 
[`fbf0ecb`](https://github.com/apache/spark/commit/fbf0ecbd4ae4303846608193c91d735f536e9015).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `logInfo(s"$service class not found $e")`





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-27 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43195162
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,99 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
 val containerIdString = 
System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
 ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the 
principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be 
based on this
+   * @return a token, or `None` if there's no need for a token (no 
metastore URI or principal
+   * in the config), or if a binding exception was caught and 
downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): 
Option[Token[DelegationTokenIdentifier]] = {
+try {
+  obtainTokenForHiveMetastoreInner(conf, 
UserGroupInformation.getCurrentUser().getUserName)
+} catch {
+  case e: Exception => {
+handleTokenIntrospectionFailure("Hive", e)
+None
+  }
+}
+  }
+
+  /**
+   * Handle failures to obtain a token through introspection. Failures to 
load the class are
+   * not treated as errors: anything else is.
+   * @param service service name for error messages
+   * @param thrown exception caught
+   * @throws Exception if the `thrown` exception isn't one that is to be 
ignored
+   */
+  private[yarn] def handleTokenIntrospectionFailure(service: String, 
thrown: Throwable): Unit = {
+thrown match {
+  case e: ClassNotFoundException =>
+logInfo(s"$service class not found $e")
+logDebug("Hive Class not found", e)
+  case t: Throwable => {
--- End diff --

I don't follow. You're catching an exception, logging it, and re-throwing 
it, which causes the exception to show up twice in the process output.

Instead, you can just delete your code, and let the exception propagate 
naturally. It will show up in the output the same way.
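
A minimal sketch of what that looks like, assuming the only case worth 
handling is the missing-class one. `PropagationSketch`, `fetchToken` and 
`obtainToken` are hypothetical names, not Spark APIs; `fetchToken` simply fails 
so that the propagation path is exercised.

```scala
import scala.util.Try

object PropagationSketch {
  // Hypothetical stand-in for the reflective Hive call; it always fails here.
  def fetchToken(): Option[String] =
    throw new IllegalStateException("metastore unreachable")

  // Only the "class not on the classpath" case is handled. There is no
  // catch-all case, so any other exception reaches the caller unchanged
  // and is reported once, wherever the caller reports it.
  def obtainToken(): Option[String] =
    try {
      fetchToken()
    } catch {
      case e: ClassNotFoundException =>
        println(s"Hive class not found $e")
        None
    }

  def main(args: Array[String]): Unit = {
    val outcome = Try(obtainToken())
    // The original exception type survives; nothing wrapped or re-logged it.
    assert(outcome.isFailure)
    assert(outcome.failed.get.isInstanceOf[IllegalStateException])
  }
}
```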





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-27 Thread steveloughran
Github user steveloughran commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43194609
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,99 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
 val containerIdString = 
System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
 ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the 
principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be 
based on this
+   * @return a token, or `None` if there's no need for a token (no 
metastore URI or principal
+   * in the config), or if a binding exception was caught and 
downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): 
Option[Token[DelegationTokenIdentifier]] = {
+try {
+  obtainTokenForHiveMetastoreInner(conf, 
UserGroupInformation.getCurrentUser().getUserName)
+} catch {
+  case e: Exception => {
+handleTokenIntrospectionFailure("Hive", e)
+None
+  }
+}
+  }
+
+  /**
+   * Handle failures to obtain a token through introspection. Failures to 
load the class are
+   * not treated as errors: anything else is.
+   * @param service service name for error messages
+   * @param thrown exception caught
+   * @throws Exception if the `thrown` exception isn't one that is to be 
ignored
+   */
+  private[yarn] def handleTokenIntrospectionFailure(service: String, 
thrown: Throwable): Unit = {
+thrown match {
+  case e: ClassNotFoundException =>
+logInfo(s"$service class not found $e")
+logDebug("Hive Class not found", e)
+  case t: Throwable => {
--- End diff --

Those exceptions aren't being rethrown though, are they? So it's logging the 
full stack at debug, and a one-liner for most.






[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151657516
  
**[Test build #44469 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44469/consoleFull)**
 for PR 9232 at commit 
[`fbf0ecb`](https://github.com/apache/spark/commit/fbf0ecbd4ae4303846608193c91d735f536e9015).





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-27 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43193976
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,99 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
 val containerIdString = 
System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
 ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the 
principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be 
based on this
+   * @return a token, or `None` if there's no need for a token (no 
metastore URI or principal
+   * in the config), or if a binding exception was caught and 
downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): 
Option[Token[DelegationTokenIdentifier]] = {
+try {
+  obtainTokenForHiveMetastoreInner(conf, 
UserGroupInformation.getCurrentUser().getUserName)
+} catch {
+  case e: Exception => {
+handleTokenIntrospectionFailure("Hive", e)
+None
+  }
+}
+  }
+
+  /**
+   * Handle failures to obtain a token through introspection. Failures to 
load the class are
+   * not treated as errors: anything else is.
+   * @param service service name for error messages
+   * @param thrown exception caught
+   * @throws Exception if the `thrown` exception isn't one that is to be 
ignored
+   */
+  private[yarn] def handleTokenIntrospectionFailure(service: String, 
thrown: Throwable): Unit = {
+thrown match {
+  case e: ClassNotFoundException =>
+logInfo(s"$service class not found $e")
+logDebug("Hive Class not found", e)
+  case t: Throwable => {
--- End diff --

Why would you? All you're doing here is printing the stack trace to stderr, 
which will happen again when you re-throw the exception.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-27 Thread steveloughran
Github user steveloughran commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43193733
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,99 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
 val containerIdString = 
System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
 ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the 
principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be 
based on this
+   * @return a token, or `None` if there's no need for a token (no 
metastore URI or principal
+   * in the config), or if a binding exception was caught and 
downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): 
Option[Token[DelegationTokenIdentifier]] = {
+try {
+  obtainTokenForHiveMetastoreInner(conf, 
UserGroupInformation.getCurrentUser().getUserName)
+} catch {
+  case e: Exception => {
+handleTokenIntrospectionFailure("Hive", e)
+None
+  }
+}
+  }
+
+  /**
+   * Handle failures to obtain a token through introspection. Failures to 
load the class are
+   * not treated as errors: anything else is.
+   * @param service service name for error messages
+   * @param thrown exception caught
+   * @throws Exception if the `thrown` exception isn't one that is to be 
ignored
+   */
+  private[yarn] def handleTokenIntrospectionFailure(service: String, 
thrown: Throwable): Unit = {
+thrown match {
+  case e: ClassNotFoundException =>
+logInfo(s"$service class not found $e")
+logDebug("Hive Class not found", e)
+  case t: Throwable => {
--- End diff --

Either?





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-27 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43193401
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,99 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
 val containerIdString = 
System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
 ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the 
principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be 
based on this
+   * @return a token, or `None` if there's no need for a token (no 
metastore URI or principal
+   * in the config), or if a binding exception was caught and 
downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): 
Option[Token[DelegationTokenIdentifier]] = {
+try {
+  obtainTokenForHiveMetastoreInner(conf, 
UserGroupInformation.getCurrentUser().getUserName)
+} catch {
+  case e: Exception => {
+handleTokenIntrospectionFailure("Hive", e)
+None
+  }
+}
+  }
+
+  /**
+   * Handle failures to obtain a token through introspection. Failures to 
load the class are
+   * not treated as errors: anything else is.
+   * @param service service name for error messages
+   * @param thrown exception caught
+   * @throws Exception if the `thrown` exception isn't one that is to be 
ignored
+   */
+  private[yarn] def handleTokenIntrospectionFailure(service: String, 
thrown: Throwable): Unit = {
+thrown match {
+  case e: ClassNotFoundException =>
+logInfo(s"$service class not found $e")
+logDebug("Hive Class not found", e)
+  case t: Throwable => {
--- End diff --

really, you don't need this.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151655354
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9232#issuecomment-151655385
  
Merged build started.





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-27 Thread steveloughran
Github user steveloughran commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43189656
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,117 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
 val containerIdString = 
System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
 ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the 
principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be 
based on this
+   * @return a token, or `None` if there's no need for a token (no 
metastore URI or principal
+   * in the config), or if a binding exception was caught and 
downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): 
Option[Token[DelegationTokenIdentifier]] = {
+try {
+  obtainTokenForHiveMetastoreInner(conf, 
UserGroupInformation.getCurrentUser().getUserName)
+} catch {
+  case e: Exception => {
+handleTokenIntrospectionFailure("Hive", e)
+None
+  }
+}
+  }
+
+  /**
+   * Handle failures to obtain a token through introspection. Failures to 
load the class are
+   * not treated as errors: anything else is.
+   * @param service service name for error messages
+   * @param thrown exception caught
+   * @throws Exception if the `thrown` exception isn't one that is to be 
ignored
+   */
+  private[yarn] def handleTokenIntrospectionFailure(service: String, 
thrown: Throwable): Unit = {
+thrown match {
+  case e: ClassNotFoundException =>
+logInfo(s"$service class not found $e")
+logDebug("Hive Class not found", e)
+  case e: NoClassDefFoundError =>
+logDebug(s"$service class not found", e)
+  case e: InvocationTargetException =>
+// problem talking to the metastore or other hive-side exception
+logInfo(s"$service method invocation failed", e)
+throw if (e.getCause != null) e.getCause else e
+  case e: ReflectiveOperationException =>
+// any other reflection failure log at error and rethrow
+logError(s"$service Class operation failed", e)
+throw e;
--- End diff --

well, that simplifies the clause, and the test...





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-27 Thread steveloughran
Github user steveloughran commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43189502
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,117 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
 val containerIdString = 
System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
 ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the 
principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be 
based on this
+   * @return a token, or `None` if there's no need for a token (no 
metastore URI or principal
+   * in the config), or if a binding exception was caught and 
downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): 
Option[Token[DelegationTokenIdentifier]] = {
+try {
+  obtainTokenForHiveMetastoreInner(conf, 
UserGroupInformation.getCurrentUser().getUserName)
+} catch {
+  case e: Exception => {
+handleTokenIntrospectionFailure("Hive", e)
+None
+  }
+}
+  }
+
+  /**
+   * Handle failures to obtain a token through introspection. Failures to 
load the class are
+   * not treated as errors: anything else is.
+   * @param service service name for error messages
+   * @param thrown exception caught
+   * @throws Exception if the `thrown` exception isn't one that is to be 
ignored
+   */
+  private[yarn] def handleTokenIntrospectionFailure(service: String, 
thrown: Throwable): Unit = {
+thrown match {
+  case e: ClassNotFoundException =>
+logInfo(s"$service class not found $e")
+logDebug("Hive Class not found", e)
+  case e: NoClassDefFoundError =>
+logDebug(s"$service class not found", e)
+  case e: InvocationTargetException =>
+// problem talking to the metastore or other hive-side exception
+logInfo(s"$service method invocation failed", e)
+throw if (e.getCause != null) e.getCause else e
+  case e: ReflectiveOperationException =>
+// any other reflection failure log at error and rethrow
+logError(s"$service Class operation failed", e)
+throw e;
+  case e: RuntimeException =>
+// any runtime exception, including Illegal Argument Exception
+throw e
+  case t: Throwable => {
+val msg = s"$service: Unexpected Exception " + t
+logError(msg, t)
+throw new RuntimeException(msg, t)
+  }
+}
+  }
+
+  /**
+   * Inner routine to obtains token for the Hive metastore; exceptions are 
raised on any problem.
+   * @param conf hadoop configuration; the Hive configuration will be 
based on this.
+   * @param username the username of the principal requesting the 
delegating token.
+   * @return a delegation token
+   */
+  private[yarn] def obtainTokenForHiveMetastoreInner(conf: Configuration,
+  username: String): Option[Token[DelegationTokenIdentifier]] = {
+val mirror = universe.runtimeMirror(getClass.getClassLoader)
+
+// the hive configuration class is a subclass of Hadoop Configuration, 
so can be cast down
+// to a Configuration and used without reflection
+val hiveConfClass = 
mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf")
+// using the (Configuration, Class) constructor allows the current 
configuratin to be included
+// in the hive config.
+val ctor = hiveConfClass.getDeclaredConstructor(classOf[Configuration],
+  classOf[Object].getClass)
+val hiveConf = ctor.newInstance(conf, 
hiveConfClass).asInstanceOf[Configuration]
+val metastore_uri = hiveConf.getTrimmed("hive.metastore.uris", "")
+
+// Check for local metastore
+if (metastore_uri.nonEmpty) {
+  if (username.isEmpty) {
+throw new IllegalArgumentException(s"Username undefined")
+  }
+  val metastore_kerberos_principal_key = 
"hive.metastore.kerberos.principal"
+  val principal = 
hiveConf.getTrimmed(metastore_kerberos_principal_key, "")
+  if (principal.isEmpty) {
+throw new IllegalArgumentException(s"Hive principal" +
+s" $metastore_kerberos_principal_key undefined")
+  }
+  logDebug(s"Getting Hive delegation token for user $username against 
$metastore_uri")
+  val hiveClass = 
mirror.classLoader.loadClass("org.apache.hadoop.hive.ql.metadata.Hive")
+  val closeCurrent = hiveClass.getMethod("closeCurrent")
+  try {
+// get all the instance meth

[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-27 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43160341
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,117 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
 val containerIdString = 
System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
 ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the 
principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be 
based on this
+   * @return a token, or `None` if there's no need for a token (no 
metastore URI or principal
+   * in the config), or if a binding exception was caught and 
downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): 
Option[Token[DelegationTokenIdentifier]] = {
+try {
+  obtainTokenForHiveMetastoreInner(conf, 
UserGroupInformation.getCurrentUser().getUserName)
+} catch {
+  case e: Exception => {
+handleTokenIntrospectionFailure("Hive", e)
+None
+  }
+}
+  }
+
+  /**
+   * Handle failures to obtain a token through introspection. Failures to 
load the class are
+   * not treated as errors: anything else is.
+   * @param service service name for error messages
+   * @param thrown exception caught
+   * @throws Exception if the `thrown` exception isn't one that is to be 
ignored
+   */
+  private[yarn] def handleTokenIntrospectionFailure(service: String, 
thrown: Throwable): Unit = {
+thrown match {
+  case e: ClassNotFoundException =>
+logInfo(s"$service class not found $e")
+logDebug("Hive Class not found", e)
+  case e: NoClassDefFoundError =>
+logDebug(s"$service class not found", e)
+  case e: InvocationTargetException =>
+// problem talking to the metastore or other hive-side exception
+logInfo(s"$service method invocation failed", e)
+throw if (e.getCause != null) e.getCause else e
+  case e: ReflectiveOperationException =>
+// any other reflection failure log at error and rethrow
+logError(s"$service Class operation failed", e)
+throw e;
--- End diff --

I think unwinding just makes the code more confusing. Just handle the 
exceptions you really mean to handle, and let the others propagate as is. 
Errors here should be very uncommon, and the extra info in the stack trace 
won't really make it harder to find the cause.

I think it's very unlikely you'll get a `NoClassDefFoundError` if you don't 
get a `ClassNotFoundException`. If you do, it's probably an actual error: the 
user has added some Hive jars to the path but not others, or something like 
that.
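
A minimal sketch of what "handle only the exceptions you mean to handle" could look like. The object name `MinimalHandlerSketch` and the helper `tokenOrNone` are hypothetical and not part of the PR, and `println` stands in for Spark's `logInfo`:

```scala
object MinimalHandlerSketch {
  /**
   * Run `body`; treat a missing class as "no token available" and
   * let every other exception propagate unchanged to the caller.
   */
  def tokenOrNone[T](service: String)(body: => Option[T]): Option[T] = {
    try {
      body
    } catch {
      case e: ClassNotFoundException =>
        // stand-in for logInfo; the real code would use Spark's logging
        println(s"$service class not found: $e")
        None
    }
  }
}
```

With this shape, a `ClassNotFoundException` yields `None`, while an `IllegalArgumentException` or a `NoClassDefFoundError` reaches the caller with its original stack trace.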





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-27 Thread steveloughran
Github user steveloughran commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43128655
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,117 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
 val containerIdString = 
System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
 ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the 
principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be 
based on this
+   * @return a token, or `None` if there's no need for a token (no 
metastore URI or principal
+   * in the config), or if a binding exception was caught and 
downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): 
Option[Token[DelegationTokenIdentifier]] = {
+try {
+  obtainTokenForHiveMetastoreInner(conf, 
UserGroupInformation.getCurrentUser().getUserName)
+} catch {
+  case e: Exception => {
+handleTokenIntrospectionFailure("Hive", e)
+None
+  }
+}
+  }
+
+  /**
+   * Handle failures to obtain a token through introspection. Failures to 
load the class are
+   * not treated as errors: anything else is.
+   * @param service service name for error messages
+   * @param thrown exception caught
+   * @throws Exception if the `thrown` exception isn't one that is to be 
ignored
+   */
+  private[yarn] def handleTokenIntrospectionFailure(service: String, 
thrown: Throwable): Unit = {
+thrown match {
+  case e: ClassNotFoundException =>
+logInfo(s"$service class not found $e")
+logDebug("Hive Class not found", e)
+  case e: NoClassDefFoundError =>
+logDebug(s"$service class not found", e)
+  case e: InvocationTargetException =>
+// problem talking to the metastore or other hive-side exception
+logInfo(s"$service method invocation failed", e)
+throw if (e.getCause != null) e.getCause else e
+  case e: ReflectiveOperationException =>
+// any other reflection failure log at error and rethrow
+logError(s"$service Class operation failed", e)
+throw e;
--- End diff --

`NoClassDefFoundError` may need to be covered too; I think it's the transient 
version of CNFE, though that may be superstition.

I'd still like to unwind the `InvocationTargetException`, as it's just a 
wrapper for what came in before; cutting it will simply remove one level of 
needless stack trace.
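
To illustrate the unwinding being discussed, here is a small self-contained sketch. `UnwrapSketch`, `FakeMetastoreClient` and `connectViaReflection` are hypothetical names made up for the example; only the `getCause` fallback mirrors the expression in the diff:

```scala
import java.lang.reflect.InvocationTargetException

object UnwrapSketch {
  // Hypothetical stand-in for a Hive-side class reached through reflection.
  class FakeMetastoreClient {
    def connect(): String = throw new IllegalStateException("metastore unreachable")
  }

  /** Invoke `connect` reflectively and rethrow the underlying cause, not the wrapper. */
  def connectViaReflection(target: AnyRef): AnyRef = {
    val method = target.getClass.getMethod("connect")
    try {
      method.invoke(target)
    } catch {
      case e: InvocationTargetException =>
        // Method.invoke wraps whatever the target method threw;
        // fall back to the wrapper itself if there is no cause
        throw (if (e.getCause != null) e.getCause else e)
    }
  }
}
```

A caller of `connectViaReflection` then sees the `IllegalStateException` directly, which is the "one level less" of stack trace the comment refers to.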





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-26 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43014446
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,117 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
 val containerIdString = 
System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
 ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the 
principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be 
based on this
+   * @return a token, or `None` if there's no need for a token (no 
metastore URI or principal
+   * in the config), or if a binding exception was caught and 
downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): 
Option[Token[DelegationTokenIdentifier]] = {
+try {
+  obtainTokenForHiveMetastoreInner(conf, 
UserGroupInformation.getCurrentUser().getUserName)
+} catch {
+  case e: Exception => {
+handleTokenIntrospectionFailure("Hive", e)
+None
+  }
+}
+  }
+
+  /**
+   * Handle failures to obtain a token through introspection. Failures to 
load the class are
+   * not treated as errors: anything else is.
+   * @param service service name for error messages
+   * @param thrown exception caught
+   * @throws Exception if the `thrown` exception isn't one that is to be 
ignored
+   */
+  private[yarn] def handleTokenIntrospectionFailure(service: String, 
thrown: Throwable): Unit = {
+thrown match {
+  case e: ClassNotFoundException =>
+logInfo(s"$service class not found $e")
+logDebug("Hive Class not found", e)
+  case e: NoClassDefFoundError =>
+logDebug(s"$service class not found", e)
+  case e: InvocationTargetException =>
+// problem talking to the metastore or other hive-side exception
+logInfo(s"$service method invocation failed", e)
+throw if (e.getCause != null) e.getCause else e
+  case e: ReflectiveOperationException =>
+// any other reflection failure log at error and rethrow
+logError(s"$service Class operation failed", e)
+throw e;
+  case e: RuntimeException =>
+// any runtime exception, including Illegal Argument Exception
+throw e
+  case t: Throwable => {
+val msg = s"$service: Unexpected Exception " + t
+logError(msg, t)
+throw new RuntimeException(msg, t)
+  }
+}
+  }
+
+  /**
+   * Inner routine to obtains token for the Hive metastore; exceptions are 
raised on any problem.
+   * @param conf hadoop configuration; the Hive configuration will be 
based on this.
+   * @param username the username of the principal requesting the 
delegating token.
+   * @return a delegation token
+   */
+  private[yarn] def obtainTokenForHiveMetastoreInner(conf: Configuration,
+  username: String): Option[Token[DelegationTokenIdentifier]] = {
+val mirror = universe.runtimeMirror(getClass.getClassLoader)
+
+// the hive configuration class is a subclass of Hadoop Configuration, 
so can be cast down
+// to a Configuration and used without reflection
+val hiveConfClass = 
mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf")
+// using the (Configuration, Class) constructor allows the current 
configuratin to be included
+// in the hive config.
+val ctor = hiveConfClass.getDeclaredConstructor(classOf[Configuration],
+  classOf[Object].getClass)
+val hiveConf = ctor.newInstance(conf, 
hiveConfClass).asInstanceOf[Configuration]
+val metastore_uri = hiveConf.getTrimmed("hive.metastore.uris", "")
+
+// Check for local metastore
+if (metastore_uri.nonEmpty) {
+  if (username.isEmpty) {
+throw new IllegalArgumentException(s"Username undefined")
+  }
+  val metastore_kerberos_principal_key = 
"hive.metastore.kerberos.principal"
+  val principal = 
hiveConf.getTrimmed(metastore_kerberos_principal_key, "")
+  if (principal.isEmpty) {
--- End diff --

use `require`.
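
As a rough sketch of the suggestion, the explicit `if (...) throw new IllegalArgumentException(...)` checks in the diff could be written with Scala's `require`, which throws an `IllegalArgumentException` whose message is prefixed with `requirement failed: `. The object `RequireSketch` and the method `checkInputs` are hypothetical; the messages are taken from the diff:

```scala
object RequireSketch {
  /** Validate the username and the Hive principal with `require`. */
  def checkInputs(username: String, principal: String): Unit = {
    val principalKey = "hive.metastore.kerberos.principal"
    // each call throws IllegalArgumentException when its condition is false
    require(username.nonEmpty, "Username undefined")
    require(principal.nonEmpty, s"Hive principal $principalKey undefined")
  }
}
```

The exception type stays the same as in the original checks, so callers that catch `IllegalArgumentException` would not need to change; only the message gains the `requirement failed: ` prefix.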



[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-26 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43014505
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,117 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
 val containerIdString = 
System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
 ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the 
principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be 
based on this
+   * @return a token, or `None` if there's no need for a token (no 
metastore URI or principal
+   * in the config), or if a binding exception was caught and 
downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): 
Option[Token[DelegationTokenIdentifier]] = {
+try {
+  obtainTokenForHiveMetastoreInner(conf, 
UserGroupInformation.getCurrentUser().getUserName)
+} catch {
+  case e: Exception => {
+handleTokenIntrospectionFailure("Hive", e)
+None
+  }
+}
+  }
+
+  /**
+   * Handle failures to obtain a token through introspection. Failures to 
load the class are
+   * not treated as errors: anything else is.
+   * @param service service name for error messages
+   * @param thrown exception caught
+   * @throws Exception if the `thrown` exception isn't one that is to be 
ignored
+   */
+  private[yarn] def handleTokenIntrospectionFailure(service: String, 
thrown: Throwable): Unit = {
+thrown match {
+  case e: ClassNotFoundException =>
+logInfo(s"$service class not found $e")
+logDebug("Hive Class not found", e)
+  case e: NoClassDefFoundError =>
+logDebug(s"$service class not found", e)
+  case e: InvocationTargetException =>
+// problem talking to the metastore or other hive-side exception
+logInfo(s"$service method invocation failed", e)
+throw if (e.getCause != null) e.getCause else e
+  case e: ReflectiveOperationException =>
+// any other reflection failure log at error and rethrow
+logError(s"$service Class operation failed", e)
+throw e;
+  case e: RuntimeException =>
+// any runtime exception, including Illegal Argument Exception
+throw e
+  case t: Throwable => {
+val msg = s"$service: Unexpected Exception " + t
+logError(msg, t)
+throw new RuntimeException(msg, t)
+  }
+}
+  }
+
+  /**
+   * Inner routine to obtains token for the Hive metastore; exceptions are 
raised on any problem.
+   * @param conf hadoop configuration; the Hive configuration will be 
based on this.
+   * @param username the username of the principal requesting the 
delegating token.
+   * @return a delegation token
+   */
+  private[yarn] def obtainTokenForHiveMetastoreInner(conf: Configuration,
+  username: String): Option[Token[DelegationTokenIdentifier]] = {
+val mirror = universe.runtimeMirror(getClass.getClassLoader)
+
+// the hive configuration class is a subclass of Hadoop Configuration, 
so can be cast down
+// to a Configuration and used without reflection
+val hiveConfClass = 
mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf")
+// using the (Configuration, Class) constructor allows the current 
configuratin to be included
+// in the hive config.
+val ctor = hiveConfClass.getDeclaredConstructor(classOf[Configuration],
+  classOf[Object].getClass)
+val hiveConf = ctor.newInstance(conf, 
hiveConfClass).asInstanceOf[Configuration]
+val metastore_uri = hiveConf.getTrimmed("hive.metastore.uris", "")
+
+// Check for local metastore
+if (metastore_uri.nonEmpty) {
+  if (username.isEmpty) {
+throw new IllegalArgumentException(s"Username undefined")
+  }
+  val metastore_kerberos_principal_key = 
"hive.metastore.kerberos.principal"
+  val principal = 
hiveConf.getTrimmed(metastore_kerberos_principal_key, "")
+  if (principal.isEmpty) {
+throw new IllegalArgumentException(s"Hive principal" +
+s" $metastore_kerberos_principal_key undefined")
+  }
+  logDebug(s"Getting Hive delegation token for user $username against 
$metastore_uri")
+  val hiveClass = 
mirror.classLoader.loadClass("org.apache.hadoop.hive.ql.metadata.Hive")
+  val closeCurrent = hiveClass.getMethod("closeCurrent")
+  try {
+// get all the instance methods bef

[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-26 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43014420
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,117 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
 val containerIdString = 
System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
 ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the 
principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be 
based on this
+   * @return a token, or `None` if there's no need for a token (no 
metastore URI or principal
+   * in the config), or if a binding exception was caught and 
downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): 
Option[Token[DelegationTokenIdentifier]] = {
+try {
+  obtainTokenForHiveMetastoreInner(conf, 
UserGroupInformation.getCurrentUser().getUserName)
+} catch {
+  case e: Exception => {
+handleTokenIntrospectionFailure("Hive", e)
+None
+  }
+}
+  }
+
+  /**
+   * Handle failures to obtain a token through introspection. Failures to 
load the class are
+   * not treated as errors: anything else is.
+   * @param service service name for error messages
+   * @param thrown exception caught
+   * @throws Exception if the `thrown` exception isn't one that is to be 
ignored
+   */
+  private[yarn] def handleTokenIntrospectionFailure(service: String, 
thrown: Throwable): Unit = {
+thrown match {
+  case e: ClassNotFoundException =>
+logInfo(s"$service class not found $e")
+logDebug("Hive Class not found", e)
+  case e: NoClassDefFoundError =>
+logDebug(s"$service class not found", e)
+  case e: InvocationTargetException =>
+// problem talking to the metastore or other hive-side exception
+logInfo(s"$service method invocation failed", e)
+throw if (e.getCause != null) e.getCause else e
+  case e: ReflectiveOperationException =>
+// any other reflection failure log at error and rethrow
+logError(s"$service Class operation failed", e)
+throw e;
+  case e: RuntimeException =>
+// any runtime exception, including Illegal Argument Exception
+throw e
+  case t: Throwable => {
+val msg = s"$service: Unexpected Exception " + t
+logError(msg, t)
+throw new RuntimeException(msg, t)
+  }
+}
+  }
+
+  /**
+   * Inner routine to obtains token for the Hive metastore; exceptions are 
raised on any problem.
+   * @param conf hadoop configuration; the Hive configuration will be 
based on this.
+   * @param username the username of the principal requesting the 
delegating token.
+   * @return a delegation token
+   */
+  private[yarn] def obtainTokenForHiveMetastoreInner(conf: Configuration,
+  username: String): Option[Token[DelegationTokenIdentifier]] = {
+val mirror = universe.runtimeMirror(getClass.getClassLoader)
+
+// the hive configuration class is a subclass of Hadoop Configuration, 
so can be cast down
+// to a Configuration and used without reflection
+val hiveConfClass = 
mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf")
+// using the (Configuration, Class) constructor allows the current 
configuratin to be included
+// in the hive config.
+val ctor = hiveConfClass.getDeclaredConstructor(classOf[Configuration],
+  classOf[Object].getClass)
+val hiveConf = ctor.newInstance(conf, 
hiveConfClass).asInstanceOf[Configuration]
+val metastore_uri = hiveConf.getTrimmed("hive.metastore.uris", "")
+
+// Check for local metastore
+if (metastore_uri.nonEmpty) {
+  if (username.isEmpty) {
--- End diff --

`require(username.nonEmpty)`





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-26 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43014259
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,117 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
 val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
 ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be based on this
+   * @return a token, or `None` if there's no need for a token (no metastore URI or principal
+   * in the config), or if a binding exception was caught and downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = {
+try {
+  obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName)
+} catch {
+  case e: Exception => {
+handleTokenIntrospectionFailure("Hive", e)
+None
+  }
+}
+  }
+
+  /**
+   * Handle failures to obtain a token through introspection. Failures to load the class are
+   * not treated as errors: anything else is.
+   * @param service service name for error messages
+   * @param thrown exception caught
+   * @throws Exception if the `thrown` exception isn't one that is to be ignored
+   */
+  private[yarn] def handleTokenIntrospectionFailure(service: String, thrown: Throwable): Unit = {
+thrown match {
+  case e: ClassNotFoundException =>
+logInfo(s"$service class not found $e")
+logDebug("Hive Class not found", e)
+  case e: NoClassDefFoundError =>
+logDebug(s"$service class not found", e)
+  case e: InvocationTargetException =>
+// problem talking to the metastore or other hive-side exception
+logInfo(s"$service method invocation failed", e)
+throw if (e.getCause != null) e.getCause else e
+  case e: ReflectiveOperationException =>
+// any other reflection failure log at error and rethrow
+logError(s"$service Class operation failed", e)
+throw e;
+  case e: RuntimeException =>
+// any runtime exception, including Illegal Argument Exception
+throw e
+  case t: Throwable => {
+val msg = s"$service: Unexpected Exception " + t
+logError(msg, t)
+throw new RuntimeException(msg, t)
+  }
+}
+  }
+
+  /**
+   * Inner routine to obtains token for the Hive metastore; exceptions are raised on any problem.
+   * @param conf hadoop configuration; the Hive configuration will be based on this.
+   * @param username the username of the principal requesting the delegating token.
+   * @return a delegation token
+   */
+  private[yarn] def obtainTokenForHiveMetastoreInner(conf: Configuration,
+  username: String): Option[Token[DelegationTokenIdentifier]] = {
+val mirror = universe.runtimeMirror(getClass.getClassLoader)
+
+// the hive configuration class is a subclass of Hadoop Configuration, so can be cast down
+// to a Configuration and used without reflection
+val hiveConfClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf")
+// using the (Configuration, Class) constructor allows the current configuratin to be included
+// in the hive config.
+val ctor = hiveConfClass.getDeclaredConstructor(classOf[Configuration],
+  classOf[Object].getClass)
+val hiveConf = ctor.newInstance(conf, hiveConfClass).asInstanceOf[Configuration]
+val metastore_uri = hiveConf.getTrimmed("hive.metastore.uris", "")
--- End diff --

Rename this to `metastoreUri` (camelCase, per Scala naming conventions).





[GitHub] spark pull request: [SPARK-11265] [YARN] YarnClient can't get toke...

2015-10-26 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9232#discussion_r43014139
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -142,6 +145,117 @@ class YarnSparkHadoopUtil extends SparkHadoopUtil {
 val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
 ConverterUtils.toContainerId(containerIdString)
   }
+
+  /**
+   * Obtains token for the Hive metastore, using the current user as the principal.
+   * Some exceptions are caught and downgraded to a log message.
+   * @param conf hadoop configuration; the Hive configuration will be based on this
+   * @return a token, or `None` if there's no need for a token (no metastore URI or principal
+   * in the config), or if a binding exception was caught and downgraded.
+   */
+  def obtainTokenForHiveMetastore(conf: Configuration): Option[Token[DelegationTokenIdentifier]] = {
+try {
+  obtainTokenForHiveMetastoreInner(conf, UserGroupInformation.getCurrentUser().getUserName)
+} catch {
+  case e: Exception => {
+handleTokenIntrospectionFailure("Hive", e)
+None
+  }
+}
+  }
+
+  /**
+   * Handle failures to obtain a token through introspection. Failures to load the class are
+   * not treated as errors: anything else is.
+   * @param service service name for error messages
+   * @param thrown exception caught
+   * @throws Exception if the `thrown` exception isn't one that is to be ignored
+   */
+  private[yarn] def handleTokenIntrospectionFailure(service: String, thrown: Throwable): Unit = {
+thrown match {
+  case e: ClassNotFoundException =>
+logInfo(s"$service class not found $e")
+logDebug("Hive Class not found", e)
+  case e: NoClassDefFoundError =>
+logDebug(s"$service class not found", e)
+  case e: InvocationTargetException =>
+// problem talking to the metastore or other hive-side exception
+logInfo(s"$service method invocation failed", e)
+throw if (e.getCause != null) e.getCause else e
+  case e: ReflectiveOperationException =>
+// any other reflection failure log at error and rethrow
+logError(s"$service Class operation failed", e)
+throw e;
--- End diff --

Nuke the semicolon. Also, I don't think you need to catch / log / re-throw
any of these. Just let them bubble up and fail the app; the user will see the
error. As I suggested before, the only `case` you need here is for
`ClassNotFoundException`.
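
A rough sketch of that shape, assuming a hypothetical helper (`tokenOrNone` and the `println` logging are illustrative stand-ins, not Spark's API or `logInfo`): only a missing class is downgraded to `None`, and every other exception propagates untouched.

```scala
object TokenSketch {
  // Hypothetical helper, not part of the patch: evaluates `body` and treats
  // only a missing class as "no token needed". Any other exception is left
  // to bubble up to the caller.
  def tokenOrNone[T](service: String)(body: => Option[T]): Option[T] = {
    try {
      body
    } catch {
      case e: ClassNotFoundException =>
        // Stand-in for logInfo in the real class.
        println(s"$service class not found: $e")
        None
    }
  }

  def main(args: Array[String]): Unit = {
    // Loading a class that is not on the classpath yields None.
    val missing = tokenOrNone[String]("Hive") {
      Class.forName("no.such.Clazz")
      Some("token")
    }
    println(missing)
  }
}
```

With this structure the cases for `InvocationTargetException`, `ReflectiveOperationException`, and `RuntimeException` disappear, because not catching them has the same effect as catching and rethrowing.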




