Thanks for your prompt responses! @Steve
I actually put my keytabs on all the nodes already, and I used them to kinit on each server. But how can I make Spark use my keytab and principal when I start the cluster or submit the job? Or is there a way to let Spark use the ticket cache on each node? I tried --keytab and --principal when I submitted the job, and still got the same error. I guess those options are for YARN only.

On Fri, Jun 23, 2017 at 18:50 Steve Loughran <ste...@hortonworks.com> wrote:

> On 23 Jun 2017, at 10:22, Saisai Shao <sai.sai.s...@gmail.com> wrote:
>
> Spark running with the standalone cluster manager currently doesn't
> support accessing secure Hadoop. Basically, the problem is that
> standalone-mode Spark doesn't have the facility to distribute delegation
> tokens.
>
> Currently only Spark on YARN or local mode supports secure Hadoop.
>
> Thanks
> Jerry
>
>
> There's possibly an ugly workaround where you ssh in to every node and log
> in directly to your KDC using a keytab you pushed out; that would eliminate
> the need for anything related to Hadoop tokens. After all, that's
> essentially what Spark-on-YARN does when you give it a keytab.
>
>
> See also:
> https://www.gitbook.com/book/steveloughran/kerberos_and_hadoop/details
>
> On Fri, Jun 23, 2017 at 5:10 PM, Mu Kong <kong.mu....@gmail.com> wrote:
>
>> Hi, all!
>>
>> I was trying to read from a Kerberized Hadoop cluster from a standalone
>> Spark cluster.
>> Right now, I'm encountering some authentication issues with Kerberos:
>>
>>
>> java.io.IOException: Failed on local exception: java.io.IOException:
>> org.apache.hadoop.security.AccessControlException: Client cannot
>> authenticate via:[TOKEN, KERBEROS]; Host Details : local host is:
>> "XXXXXXXXXXXX"; destination host is: XXXXXXXXXXXXXXX;
>>
>>
>>
>> I checked with klist, and the principal/realm is correct.
>> I also used the hdfs command line to poke HDFS from all the nodes, and it
>> worked.
>> And if I submit the job using local (client) mode, the job works fine.
>>
>> I tried to put everything from hadoop/conf and hive/conf into spark/conf.
>> I also tried editing spark/conf/spark-env.sh to add
>> SPARK_SUBMIT_OPTS/SPARK_MASTER_OPTS/SPARK_SLAVE_OPTS/HADOOP_CONF_DIR/HIVE_CONF_DIR,
>> and tried to export them in .bashrc as well.
>>
>> However, I'm still experiencing the same exception.
>>
>> Then I read some concerning posts about problems with
>> Kerberized Hadoop, like the following one:
>> http://blog.stratio.com/spark-kerberos-safe-story/
>> which indicates that we cannot access Kerberized HDFS from a
>> standalone Spark cluster.
>>
>> I'm using Spark 2.1.1. Is it still the case that we can't access
>> Kerberized HDFS with 2.1.1?
>>
>> Thanks!
>>
>>
>> Best regards,
>> Mu
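The per-node login workaround Steve describes could be sketched roughly as follows. This is a hedged sketch, not a tested recipe: the hostnames, keytab path, and principal are all placeholders, and it assumes passwordless ssh to each worker and that the keytab has already been pushed out to the same path on every node.

```shell
#!/usr/bin/env bash
# Sketch: obtain a Kerberos ticket on every worker node from a
# pre-distributed keytab, so executors find a valid ticket cache
# when they talk to HDFS. All names below are hypothetical.
set -euo pipefail

PRINCIPAL="spark@EXAMPLE.COM"                  # placeholder principal
KEYTAB="/etc/security/keytabs/spark.keytab"    # placeholder keytab path

for node in worker1 worker2 worker3; do        # placeholder hostnames
  # kinit on the remote host; the ticket lands in that host's
  # default credential cache for the ssh user
  ssh "$node" "kinit -kt $KEYTAB $PRINCIPAL"
done
```

Note that Kerberos tickets expire, so something like this would have to be re-run (e.g. from cron) before the ticket lifetime ends; that renewal is exactly the work Spark-on-YARN does for you when given a keytab.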
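For contrast, the --keytab/--principal options mentioned at the top of the thread are honored when YARN is the cluster manager: YARN-mode Spark ships the keytab to the application master and renews delegation tokens itself. A hedged invocation sketch, where the principal, keytab path, application class, and jar name are all placeholders:

```shell
# Works on YARN (not standalone): Spark distributes the keytab
# and handles delegation-token renewal for long-running jobs.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --principal spark@EXAMPLE.COM \
  --keytab /etc/security/keytabs/spark.keytab \
  --class com.example.MyApp \
  myapp.jar
```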