Thank you for the answer. It doesn't seem to work either (I haven't logged into the machine as the spark user, but I ran kinit inside the spark-env script), and I also tried it inside the job.
I've noticed that when I run pyspark the Kerberos token is used for something, but the same behavior does not appear when I start a worker, so maybe workers aren't designed to use Kerberos...

On Tue, Jun 16, 2015 at 12:10 PM, Steve Loughran <ste...@hortonworks.com> wrote:

>
> On 15 Jun 2015, at 15:43, Borja Garrido Bear <kazebo...@gmail.com> wrote:
>
> I tried running the job in a standalone cluster and I'm getting this:
>
> java.io.IOException: Failed on local exception: java.io.IOException:
> org.apache.hadoop.security.AccessControlException: Client cannot
> authenticate via:[TOKEN, KERBEROS]; Host Details : local host is:
> "worker-node/0.0.0.0"; destination host is: "hdfs":9000;
>
>
> Both nodes can access the HDFS running spark locally, and have valid kerberos
> credentials, I know for the moment keytab is not supported for standalone
> mode, but as long as the tokens I had when initiating the workers and masters
> are valid this should work, shouldn't it?
>
>
> I don't know anything about tokens on standalone. In YARN what we have to
> do is something called "delegation tokens", the client asks (something) for
> tokens granting access to HDFS, and attaches that to the YARN container
> creation request, which is then handed off to the app master, which then
> gets to deal with (a) passing them down to launched workers and (b) dealing
> with token refresh (which is where keytabs come in to play)
>
> Why not try sshing in to the worker-node as the spark user and run kinit
> there to see if the problem goes away once you've logged in with Kerberos.
> If that works, you're going to have to automate that process across the
> cluster
>
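
For reference, the "run kinit on the worker and automate it" step suggested above could be sketched roughly like this. It is only a minimal sketch, assuming MIT Kerberos command-line tools are on the PATH (`klist -s` to test for a valid ticket cache, `kinit -kt` to log in from a keytab); the principal and keytab path are hypothetical placeholders, and it does not answer whether a standalone worker actually picks the ticket up.

```python
import subprocess


def has_valid_ticket(klist="klist"):
    """Return True if `klist -s` exits 0, i.e. the credential cache holds a valid ticket."""
    try:
        return subprocess.run([klist, "-s"]).returncode == 0
    except FileNotFoundError:
        # Kerberos client tools are not installed on this node.
        return False


def kinit_command(principal, keytab):
    """Build the kinit invocation that logs in non-interactively from a keytab."""
    return ["kinit", "-kt", keytab, principal]


def ensure_ticket(principal, keytab, run=subprocess.run, check=has_valid_ticket):
    """Run kinit only when no valid ticket is present.

    Returns True if kinit was invoked, False if a ticket already existed.
    `run` and `check` are injectable so the logic can be exercised without Kerberos.
    """
    if check():
        return False
    run(kinit_command(principal, keytab), check=True)
    return True


if __name__ == "__main__":
    # Hypothetical principal and keytab path; replace with the real ones.
    ensure_ticket("spark@EXAMPLE.COM", "/etc/security/keytabs/spark.keytab")
```

Something like this would have to run as the spark user on every worker node before the worker starts, and again before the ticket expires.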