You could have posted just the error; I've copied it at the end of
this message.

Why are you trying to use WebHDFS? I'm not sure how Kerberos
authentication works with it. Applications generally go through the
native HDFS client (which uses a different URI scheme, hdfs:// rather
than webhdfs://), and Spark works fine with that.
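
For illustration, a minimal sketch of what I mean (the namenode
host/port and the path are placeholders, not from your setup, and sc
is an existing SparkContext):

    import org.apache.hadoop.io.Text

    // hdfs:// goes through the native HDFS client, which knows how to
    // use Kerberos credentials and delegation tokens on a secure cluster
    val rdd = sc.sequenceFile("hdfs://namenode.example.com:8020/user/me/data",
      classOf[Text], classOf[Text])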


Error:
Authentication required
org.apache.hadoop.security.AccessControlException: Authentication required
    at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:457)
    at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:113)
    at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:738)
    at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:582)
    at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:612)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
    at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:608)
    at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1507)
    at org.apache.hadoop.fs.FileSystem.collectDelegationTokens(FileSystem.java:545)
    at org.apache.hadoop.fs.FileSystem.addDelegationTokens(FileSystem.java:523)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:140)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80)
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:206)
    at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:45)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:199)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:242)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:240)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:240)
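
The failure above happens while fetching an HDFS delegation token
through WebHDFS. On a kerberized cluster, Spark on YARN normally picks
up valid Kerberos credentials and fetches tokens itself; one way is to
pass a principal and keytab to spark-submit (placeholder values
below). Note that spark.authenticate, discussed further down the
thread, is unrelated to this:

    spark-submit --master yarn \
      --principal myuser@EXAMPLE.COM \
      --keytab /path/to/myuser.keytab \
      ...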


On Thu, Dec 8, 2016 at 12:29 PM, Gerard Casey <gerardhughca...@gmail.com> wrote:
> Sure - I wanted to check with the admin before sharing. I’ve attached
> it now; does this help?
>
> Many thanks again,
>
> G
>
>
>
>> On 8 Dec 2016, at 20:18, Marcelo Vanzin <van...@cloudera.com> wrote:
>>
>> Then you probably have a configuration error somewhere. Since you
>> haven't actually posted the error you're seeing, it's kinda hard to
>> help any further.
>>
>> On Thu, Dec 8, 2016 at 11:17 AM, Gerard Casey <gerardhughca...@gmail.com> 
>> wrote:
>>> Right. I’m confident that is set up correctly.
>>>
>>> I can run the SparkPi test script. The main difference between it and my 
>>> application is that it doesn’t access HDFS.
>>>
>>>> On 8 Dec 2016, at 18:43, Marcelo Vanzin <van...@cloudera.com> wrote:
>>>>
>>>> On Wed, Dec 7, 2016 at 11:54 PM, Gerard Casey <gerardhughca...@gmail.com> 
>>>> wrote:
>>>>> To be specific, where exactly should spark.authenticate be set to true?
>>>>
>>>> spark.authenticate has nothing to do with Kerberos. It's for
>>>> authentication between different Spark processes belonging to the same
>>>> app.
>>>>
>>>> --
>>>> Marcelo
>>>>
>>>
>>
>>
>>
>> --
>> Marcelo
>>
>
>



-- 
Marcelo

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
