[GitHub] spark pull request #17335: [SPARK-19995][Hive][Yarn] Using real user to init...

2017-03-19 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/17335#discussion_r106831705
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala ---
@@ -353,6 +354,25 @@ class SparkHadoopUtil extends Logging {
     }
     buffer.toString
   }
+
+  /**
+   * Run some code as the real logged in user (which may differ from the current user, for
+   * example, when using proxying).
+   */
+  private[spark] def doAsRealUser[T](fn: => T): T = {
+    val currentUser = UserGroupInformation.getCurrentUser()
--- End diff --

@vanzin I tested with the two configurations above. Although I hit some
class-not-found issues in our HDP environment, the metastore connection is
established correctly, without the GSSAPI "TGT not found" issue. I tried
`spark.sql.hive.metastore.jars=maven` with `spark.sql.hive.metastore.version`
set to 1.2.1 and to 2.0.1.

```
17/03/20 03:35:48 INFO metastore: Trying to connect to metastore with URI thrift://c6402.ambari.apache.org:9083
17/03/20 03:35:48 INFO metastore: Opened a connection to metastore, current connections: 1
17/03/20 03:35:48 INFO metastore: Connected to metastore.
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17335: [SPARK-19995][Hive][Yarn] Using real user to init...

2017-03-19 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/17335#discussion_r106828435
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala ---
@@ -353,6 +354,25 @@ class SparkHadoopUtil extends Logging {
     }
     buffer.toString
   }
+
+  /**
+   * Run some code as the real logged in user (which may differ from the current user, for
+   * example, when using proxying).
+   */
+  private[spark] def doAsRealUser[T](fn: => T): T = {
+    val currentUser = UserGroupInformation.getCurrentUser()
--- End diff --

I see, let me give it a try. My guess, though, is that this is the only place
where the issue can be handled from the Spark side.





[GitHub] spark pull request #17335: [SPARK-19995][Hive][Yarn] Using real user to init...

2017-03-17 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/17335#discussion_r106700053
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala ---
@@ -353,6 +354,25 @@ class SparkHadoopUtil extends Logging {
     }
     buffer.toString
   }
+
+  /**
+   * Run some code as the real logged in user (which may differ from the current user, for
+   * example, when using proxying).
+   */
+  private[spark] def doAsRealUser[T](fn: => T): T = {
+    val currentUser = UserGroupInformation.getCurrentUser()
--- End diff --

Hmmm... I'm not so sure this will work in all cases. Can you test this with
both `spark.sql.hive.metastore.jars` and `spark.sql.hive.metastore.version` set?

The problem is that this class is loaded by Spark's main class loader,
while `HiveClientImpl` comes from a different class loader. So
`UserGroupInformation` might be a different class in certain cases. It's the
same reason why `HiveClientImpl` does its own `loginUserFromKeytab`
around L110.
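
To illustrate the class loader concern in isolation, here is a minimal sketch that uses only the JDK. `Marker`, `IsolatingLoader` and `ClassLoaderIdentityDemo` are hypothetical names made up for this example, with `Marker` standing in for a class such as `UserGroupInformation`. This is not how Spark's isolated Hive client loader is implemented. It only shows that the same class bytes defined by two different loaders yield two distinct classes. It assumes the file is compiled as a regular source file, so that the class bytes are visible as a resource.

```scala
import java.io.{ByteArrayOutputStream, InputStream}

// Hypothetical stand-in for a class such as UserGroupInformation.
class Marker

// Defines `targetName` itself from the bytes visible to `source` instead of
// delegating to the parent; every other class is delegated as usual.
class IsolatingLoader(source: ClassLoader, targetName: String)
    extends ClassLoader(source) {

  override def loadClass(name: String, resolve: Boolean): Class[_] =
    if (name != targetName) super.loadClass(name, resolve)
    else getClassLoadingLock(name).synchronized {
      Option(findLoadedClass(name)).getOrElse {
        val bytes = readClassBytes(name)
        defineClass(name, bytes, 0, bytes.length)
      }
    }

  // Reads the compiled class file through the source loader's resources.
  private def readClassBytes(name: String): Array[Byte] = {
    val in: InputStream =
      source.getResourceAsStream(name.replace('.', '/') + ".class")
    require(in != null, s"no class file found for $name")
    try {
      val out = new ByteArrayOutputStream()
      val chunk = new Array[Byte](4096)
      var n = in.read(chunk)
      while (n != -1) { out.write(chunk, 0, n); n = in.read(chunk) }
      out.toByteArray
    } finally in.close()
  }
}

object ClassLoaderIdentityDemo {
  // Loads a second copy of `cls` through a fresh IsolatingLoader.
  def loadIsolated(cls: Class[_]): Class[_] =
    new IsolatingLoader(cls.getClassLoader, cls.getName).loadClass(cls.getName)

  def main(args: Array[String]): Unit = {
    val original = classOf[Marker]
    val isolated = loadIsolated(original)
    val instance = isolated.getDeclaredConstructor().newInstance()
    println(s"same name:  ${original.getName == isolated.getName}")
    println(s"same class: ${original eq isolated}")
    println(s"instance is a Marker: ${original.isInstance(instance)}")
  }
}
```

Both classes report the same name, but they are different `Class` objects, and an instance of one is not an instance of the other. Static state, such as a cached login user, would likewise be held separately by each copy.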





[GitHub] spark pull request #17335: [SPARK-19995][Hive][Yarn] Using real user to init...

2017-03-17 Thread jerryshao
GitHub user jerryshao opened a pull request:

https://github.com/apache/spark/pull/17335

[SPARK-19995][Hive][Yarn] Using real user to initialize hive SessionState

## What changes were proposed in this pull request?

Using the current user to connect to the metastore in `HiveClientImpl` causes
a "TGT not found" error if the current user has not run `kinit`.

This can happen when using `--proxy-user`, because only the real user has run
`kinit`. So we should connect to the metastore as the real user instead of the
current user to avoid this issue.
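
As a rough illustration of the idea, the following sketch runs a block as the real user when the current user is a proxy user. It is an assumption about how such a helper could look, not the code in this patch. The object name `RealUserSketch` is hypothetical, and the snippet needs `hadoop-common` on the classpath. `getCurrentUser`, `getRealUser` and `doAs` are existing `UserGroupInformation` methods.

```scala
import java.lang.reflect.UndeclaredThrowableException
import java.security.PrivilegedExceptionAction

import org.apache.hadoop.security.UserGroupInformation

// Hypothetical holder object, for illustration only.
object RealUserSketch {

  /** Runs `fn` as the real (login) user if the current user is a proxy user. */
  def doAsRealUser[T](fn: => T): T = {
    val currentUser = UserGroupInformation.getCurrentUser()
    // getRealUser() returns null unless the current user is a proxy user,
    // so fall back to the current user in that case.
    val realUser = Option(currentUser.getRealUser()).getOrElse(currentUser)
    try {
      realUser.doAs(new PrivilegedExceptionAction[T] {
        override def run(): T = fn
      })
    } catch {
      // doAs wraps unexpected checked exceptions; surface the original cause.
      case e: UndeclaredThrowableException =>
        throw Option(e.getCause).getOrElse(e)
    }
  }
}
```

A caller would wrap only the code that needs the real user's Kerberos credentials, for example `RealUserSketch.doAsRealUser { /* open the metastore connection */ }`.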

## How was this patch tested?

Verified locally on a secure cluster.

@vanzin @tgravescs @dongjoon-hyun please help to review, thanks a lot.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jerryshao/apache-spark SPARK-19995

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17335.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17335


commit d31dcb340eba66d6adf6c027675fb9ebb5b18ce2
Author: jerryshao 
Date:   2017-03-17T09:14:56Z

Using real user to initialize hive SessionState

Change-Id: If423f3fdc709ed3284cafc01efd1fe389f635560



