Re: Spark standalone - reading kerberos hdfs

2021-01-24 Thread jelmer
The only way I ever got it to work with Spark standalone is via WebHDFS. See https://issues.apache.org/jira/browse/SPARK-5158?focusedCommentId=16516856&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16516856
On Fri, 8 Jan 2021 at 18:49, Sudhir Babu Pothineni wro

Re: Spark standalone - reading kerberos hdfs

2021-01-23 Thread Gábor Rőczei
Hi Sudhir,
> On 21 Jan 2021, at 16:24, Sudhir Babu Pothineni wrote:
>
> Any other insights into this issue? I tried multiple ways to supply the keytab to the
> executor.
>
> Does Spark standalone not support Kerberos?
Spark standalone mode does not support Kerberos authentication. Related source

Re: Spark standalone - reading kerberos hdfs

2021-01-21 Thread Sudhir Babu Pothineni
Any other insights into this issue? I tried multiple ways to supply the keytab to the executor. Does Spark standalone not support Kerberos?
> On Jan 8, 2021, at 1:53 PM, Sudhir Babu Pothineni wrote:
>
> In case of Spark on YARN, the Application Master shares the token.
>
> I think in case of spa

Re: Spark standalone - reading kerberos hdfs

2021-01-08 Thread Sudhir Babu Pothineni
In case of Spark on YARN, the Application Master shares the token. I think in case of Spark standalone the token is not shared with the executor. Is there any example of how to get the HDFS token for the executor?
On Fri, Jan 8, 2021 at 12:13 PM Gabor Somogyi wrote:
> TGT is not enough, you need HDFS token which can be o

Re: Spark standalone - reading kerberos hdfs

2021-01-08 Thread Gabor Somogyi
A TGT is not enough; you need an HDFS token, which can be obtained by Spark. Please check the logs...
On Fri, 8 Jan 2021, 18:51 Sudhir Babu Pothineni, wrote:
> I spun up a Spark standalone cluster (spark.authenticate=false) and submitted
> a job which reads remote kerberized HDFS,
>
> val spark = SparkSes

Spark standalone - reading kerberos hdfs

2021-01-08 Thread Sudhir Babu Pothineni
I spun up a Spark standalone cluster (spark.authenticate=false) and submitted a job which reads remote kerberized HDFS,
val spark = SparkSession.builder()
  .master("spark://spark-standalone:7077")
  .getOrCreate()
UserGroupInformation.loginUserFromKeytab(principal, ke