Lars Francke created HBASE-20797:
------------------------------------

             Summary: hbase-spark 
                 Key: HBASE-20797
                 URL: https://issues.apache.org/jira/browse/HBASE-20797
             Project: HBase
          Issue Type: Bug
          Components: spark
    Affects Versions: 3.0.0
            Reporter: Lars Francke


We're running into an issue with the Spark integration when using Hadoop 
2.7.2. The problem is this line of code from {{HBaseContext.scala}}:

{code:java}
ugi.setAuthenticationMethod(AuthenticationMethod.PROXY)
{code}

I'm not an expert, but I think this code is wrong. If we wanted to create a 
proxy user, we'd need to use {{UserGroupInformation.createProxyUser(...)}}, 
which would also set the realUser etc. Also, I don't think it makes sense to 
create a proxy user on the client side. Chances are good that the user we're 
authenticating as doesn't even have proxy privileges, as those are usually only 
granted to servers.

We've tried to trace where this line of code came from in Git, but the history 
ends at a code drop from Ted's original repo.

The error we're seeing actually occurs when (in a Spark job) we access HDFS 
because KMSClientProvider has code like this:

{code:java}
actualUgi =
    (UserGroupInformation.getCurrentUser().getAuthenticationMethod() ==
    UserGroupInformation.AuthenticationMethod.PROXY) ? UserGroupInformation
        .getCurrentUser().getRealUser() : UserGroupInformation
        .getCurrentUser();
{code}

But we never set up the realUser, so actualUgi is null, which later leads to a 
NullPointerException.
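
To make the failure mode concrete, here is a minimal, self-contained sketch of 
what I believe is going on. It does not use Hadoop at all: {{ModelUgi}}, 
{{AuthMethod}} and {{selectActualUgi}} are hypothetical stand-ins I made up to 
mirror the relevant behaviour of {{UserGroupInformation}} and the 
KMSClientProvider snippet above, under the assumption that only 
{{createProxyUser}} ever records a real user.

```java
import java.util.Objects;

/** Simplified model of the behaviour described above; NOT Hadoop code. */
class ProxyUgiSketch {

    enum AuthMethod { KERBEROS, PROXY }

    /** Hypothetical stand-in for UserGroupInformation. */
    static final class ModelUgi {
        private final String userName;
        private final ModelUgi realUser; // only set for genuine proxy users
        private AuthMethod authMethod;

        private ModelUgi(String userName, ModelUgi realUser, AuthMethod authMethod) {
            this.userName = userName;
            this.realUser = realUser;
            this.authMethod = authMethod;
        }

        /** Models a normal client login: no real user behind it. */
        static ModelUgi login(String userName) {
            return new ModelUgi(userName, null, AuthMethod.KERBEROS);
        }

        /** Models createProxyUser: sets PROXY *and* records the real user. */
        static ModelUgi createProxyUser(String userName, ModelUgi realUser) {
            return new ModelUgi(userName, Objects.requireNonNull(realUser), AuthMethod.PROXY);
        }

        /** Only flips the flag; the real user stays whatever it was. */
        void setAuthenticationMethod(AuthMethod authMethod) {
            this.authMethod = authMethod;
        }

        AuthMethod getAuthenticationMethod() { return authMethod; }
        ModelUgi getRealUser() { return realUser; }
        String getUserName() { return userName; }
    }

    /** Mirrors the selection logic quoted from KMSClientProvider. */
    static ModelUgi selectActualUgi(ModelUgi current) {
        return current.getAuthenticationMethod() == AuthMethod.PROXY
                ? current.getRealUser()
                : current;
    }

    public static void main(String[] args) {
        // What the HBaseContext line effectively does: flag only, no real user.
        ModelUgi client = ModelUgi.login("alice");
        client.setAuthenticationMethod(AuthMethod.PROXY);
        System.out.println(selectActualUgi(client)); // prints "null"

        // A properly created proxy user resolves to its real user instead.
        ModelUgi server = ModelUgi.login("hbase");
        ModelUgi proxy = ModelUgi.createProxyUser("alice", server);
        System.out.println(selectActualUgi(proxy).getUserName()); // prints "hbase"
    }
}
```

In this model, any later dereference of the first result would throw a 
NullPointerException, which matches what we see. Without the 
{{setAuthenticationMethod}} call the selection would simply return the current 
user.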

I _think_ the proper fix is to just remove that line as I have no idea what its 
intention is. I can provide a patch but I'd like to get input first. Maybe I'm 
mistaken?




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
