[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-09-14 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/21756
  
Only you can close your own PR.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-09-13 Thread shrutig
Github user shrutig commented on the issue:

https://github.com/apache/spark/pull/21756
  
Thank you for your feedback. I have created a separate JIRA 
https://issues.apache.org/jira/browse/SPARK-25428 to track Spark support for 
plain Kerberos authentication. Hence, this PR can be closed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-09-07 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/21756
  
I really dislike exposing this a something that others can override. We 
should be going the other way, and making that code as generic as possible.

If there's something that you can fix in the existing code to enable your 
use case, that would be better. Otherwise, I suggest just patching the needed 
code in your deployment; sounds from your e-mails like you're already modifying 
Spark, so...


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-09-04 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/21756
  
I think the use case here is quite specific, I'm not sure if it is a good 
idea to make `SparkHadoopUtil` ServiceLoader-able to support your requirement. 
Typically I don't think user has a such requirement to build their own 
`SparkHadoopUtil`.

I'm wondering do we have other solutions or workarounds to support your use 
case?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-09-04 Thread dbtsai
Github user dbtsai commented on the issue:

https://github.com/apache/spark/pull/21756
  
add @jerryshao for more feedback. Thanks.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-09-04 Thread shrutig
Github user shrutig commented on the issue:

https://github.com/apache/spark/pull/21756
  
@dbtsai @vanzin 
What we are trying to achieve is to make Spark work with plain Kerberos 
authentication.

We `login user from keytab` at the startup of driver and executors and then 
use this UserGroupInformation. We do not have Hadoop token authentication 
method, but use plain Kerberos auth.

Spark's runAsSparkUser creates a new UGI based on the current static UGI 
and then transfers credentials from this current static UGI to the new UGI. 
This works well with other auth methods, except Kerberos. For Kerberos, the UGI 
has to be created from loginUserFromKeytab method only and not simply by doing 
a transfer credentials from the previous UGI to the new UGI.  This is because  
isKeytab and isKrbTkt variables in UGI are needed for the UGI to work properly.

The implementation which has been working is below: 

```
val currentUgi = UserGroupInformation.getCurrentUser
val ugi = if (currentUgi.getAuthenticationMethod == 
AuthenticationMethod.KERBEROS) {
  currentUgi  // This worked for us for KERBEROS auth method
} else {
  val user = Utils.getCurrentUserName()
  val newUgi = UserGroupInformation.createRemoteUser(user)
  transferCredentials(UserGroupInformation.getCurrentUser, newUgi)
  newUgi
}
logDebug("running as user: " + ugi.getUserName)
ugi.doAs(new PrivilegedExceptionAction[Unit] {
  def run: Unit = func()
})
  }
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-08-22 Thread dbtsai
Github user dbtsai commented on the issue:

https://github.com/apache/spark/pull/21756
  
From what I talked to @shrutig offline, the use-case her team has is when 
`UserGroupInformation.getCurrentUser.getAuthenticationMethod == 
AuthenticationMethod.KERBEROS`,  they want to set the remote ugi with 
`AuthMethod.KERBEROS` authentication method as the default remote ugi is using 
`AuthMethod.SIMPLE`.

I'll suggest to add 
```scala
ugi.setAuthenticationMethod(
  
UserGroupInformation.getCurrentUser.getAuthenticationMethod.getAuthMethod)
```
into `def createSparkUser(): UserGroupInformation` instead of customizing 
`SparkHadoopUtil`.



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21756
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21756
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95049/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21756
  
**[Test build #95049 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95049/testReport)**
 for PR 21756 at commit 
[`5234c98`](https://github.com/apache/spark/commit/5234c989cb32abd9a2e7ea44ca921ff2a36ea74f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-08-21 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/21756
  
I don't understand the purpose of this change. Can you explain exactly what 
part of `SparkHadoopUtil` do you need to customize, and why that change can't 
just be made to the shared code?

The bug you filed even contains incorrect information. There is a single 
implementation of `SparkHadoopUtil` used by all cluster managers - if you look 
at "YarnSparkHadoopUtil.scala", there is only an object there, with some 
YARN-specific methods, but the YARN code re-uses the shared `SparkHadoopUtil`. 
I actually made that change; and the only reason `SparkHadoopUtil` didn't just 
become an object is because it's a semi-public API.

Your code also breaks if you have more than one custom implementation in 
the classpath. If your goal is to allow cluster managers to override it, that's 
a pretty big hole, since you can't have the built-in cluster managers 
overriding them since more often than not they're all in the classpath at the 
same time.

The only reason this class was overridden in YARN in the past is because 
Spark used to support both hadoop1 and hadoop2, and code in `core/` could not 
call hadoop2 APIs (but code in the yarn module could). That is not the case 
anymore.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21756
  
**[Test build #95049 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95049/testReport)**
 for PR 21756 at commit 
[`5234c98`](https://github.com/apache/spark/commit/5234c989cb32abd9a2e7ea44ca921ff2a36ea74f).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21756
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93856/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21756
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-07-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21756
  
**[Test build #93856 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93856/testReport)**
 for PR 21756 at commit 
[`6b9edca`](https://github.com/apache/spark/commit/6b9edca76579cd1adfb42eb4085b604b050b552c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-07-31 Thread shrutig
Github user shrutig commented on the issue:

https://github.com/apache/spark/pull/21756
  
`SparkHadoopUtil` is accessed in multiple points in Apache Spark , and is 
not limited to usage by yarn.  External cluster managers should also be able to 
change any functions in `SparkHadoopUtil`, currently there is no way to do so. 
Internally, we have leveraged the external cluster manager and we would like to 
control the `runAsSparkUser` method for our authentication method. Just as 
external cluster manager is pluggable, `SparkHadoopUtil` could also be 
pluggable.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-07-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21756
  
**[Test build #93856 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93856/testReport)**
 for PR 21756 at commit 
[`6b9edca`](https://github.com/apache/spark/commit/6b9edca76579cd1adfb42eb4085b604b050b552c).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-07-16 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/21756
  
Would you please explain the scenarios of such usage? This 
`SparkHadoopUtil` is highly hadoop/yarn dependent, I'm not sure how other 
customized cluster manager use it?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-07-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21756
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92944/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-07-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21756
  
**[Test build #92944 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92944/testReport)**
 for PR 21756 at commit 
[`606beae`](https://github.com/apache/spark/commit/606beae16b632933b311ba7267a8ea4a6339c2bb).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `sys.error(s\"Incompatible class found for SparkHadoopUtil 
implementation: $clazz\")`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-07-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21756
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-07-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21756
  
**[Test build #92944 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92944/testReport)**
 for PR 21756 at commit 
[`606beae`](https://github.com/apache/spark/commit/606beae16b632933b311ba7267a8ea4a6339c2bb).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-07-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21756
  
**[Test build #92940 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92940/testReport)**
 for PR 21756 at commit 
[`dd5d924`](https://github.com/apache/spark/commit/dd5d92459ea2ce869057492dd6f802cb0479b9cb).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `sys.error(s\"Incompatible class found for SparkHadoopUtil 
implementation: $clazz\")`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-07-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21756
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92940/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-07-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21756
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-07-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21756
  
**[Test build #92940 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92940/testReport)**
 for PR 21756 at commit 
[`dd5d924`](https://github.com/apache/spark/commit/dd5d92459ea2ce869057492dd6f802cb0479b9cb).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-07-12 Thread dbtsai
Github user dbtsai commented on the issue:

https://github.com/apache/spark/pull/21756
  
Jenkins, ok to test


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-07-12 Thread dbtsai
Github user dbtsai commented on the issue:

https://github.com/apache/spark/pull/21756
  
add to whitelist


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-07-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21756
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-07-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21756
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-07-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21756
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org