The only way I ever got it to work with Spark standalone is via WebHDFS. See https://issues.apache.org/jira/browse/SPARK-5158?focusedCommentId=16516856&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16516856
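A minimal sketch of that approach, for reference. It assumes the same principal/keytab login as in your snippet and only swaps the URI scheme to webhdfs://; the HTTP port shown (9870, the Hadoop 3.x namenode HTTP default) and the host name are illustrative, not taken from your setup:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.hadoop.security.UserGroupInformation

val spark = SparkSession.builder()
  .master("spark://spark-standalone:7077")
  .getOrCreate()

// Log in from the keytab as before. WebHDFS authenticates via
// SPNEGO over HTTP, which sidesteps the RPC delegation-token path
// that standalone mode cannot negotiate.
UserGroupInformation.loginUserFromKeytab(principal, keytab)

// Read through the webhdfs:// scheme instead of hdfs://.
// 9870 is the default namenode HTTP port on Hadoop 3.x
// (50070 on 2.x) -- adjust to your cluster.
val df = spark.read.parquet("webhdfs://namenode:9870/test/parquet/")
```

You may also need the keytab available on every executor (e.g. via --files), since each executor authenticates independently in standalone mode.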
On Fri, 8 Jan 2021 at 18:49, Sudhir Babu Pothineni <sbpothin...@gmail.com> wrote:
> I spin up a Spark standalone cluster (spark.authenticate=false) and submitted
> a job which reads from a remote kerberized HDFS:
>
>     val spark = SparkSession.builder()
>       .master("spark://spark-standalone:7077")
>       .getOrCreate()
>
>     UserGroupInformation.loginUserFromKeytab(principal, keytab)
>     val df = spark.read.parquet("hdfs://namenode:8020/test/parquet/")
>
> It ran into the following exception:
>
>     Caused by: java.io.IOException: java.io.IOException: Failed on local exception:
>     java.io.IOException: org.apache.hadoop.security.AccessControlException:
>     Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host
>     is: "..."; destination host is: "...":10346;
>
> Any suggestions?
>
> Thanks
> Sudhir