Is there any way to get data from HDFS (e.g. with sc.textFile) using two
separate usernames in the same Spark job? For instance, if I have a file on
hdfs-server-1.com that the alice user has permission to view, and a file on
hdfs-server-2.com that the bob user has permission to view, is there a way to
read both files within a single job?
Hmm, I seem to be able to get around this by setting
hadoopConf1.setBoolean("fs.s3n.impl.disable.cache", true) in my code. Can
anybody more familiar with Hadoop confirm that the FileSystem cache would
cause this issue?
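A minimal sketch of that idea, assuming simple (non-Kerberos) authentication; the
readAs helper, the example paths, and the fs.hdfs.impl.disable.cache key (the HDFS
counterpart of the S3N key above) are illustrative assumptions, not code from the
original message:

import java.security.PrivilegedExceptionAction
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
import org.apache.hadoop.security.UserGroupInformation
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Hypothetical helper: read a text file from HDFS as a given user.
def readAs(sc: SparkContext, user: String, path: String): RDD[String] = {
  // Fresh Configuration per read; disabling the FileSystem cache keeps the
  // two reads from sharing a cached FileSystem instance tied to the other user.
  val conf = new Configuration(sc.hadoopConfiguration)
  conf.setBoolean("fs.hdfs.impl.disable.cache", true)

  // Run the driver-side file access as the given (simple-auth) user.
  val ugi = UserGroupInformation.createRemoteUser(user)
  ugi.doAs(new PrivilegedExceptionAction[RDD[String]] {
    override def run(): RDD[String] =
      sc.newAPIHadoopFile(path, classOf[TextInputFormat],
          classOf[LongWritable], classOf[Text], conf)
        .map(_._2.toString)
  })
}

val aliceLines = readAs(sc, "alice", "hdfs://hdfs-server-1.com/path/one")
val bobLines   = readAs(sc, "bob",   "hdfs://hdfs-server-2.com/path/two")

Note that the doAs wrapper here only covers the driver-side access; whether that
is enough depends on how reads from the executors are authenticated.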
>
> Something like this:
>
> myRdd.map(x => try { /* something */ } catch { case e: Exception =>
>   log.error("Whoops!! :" + e) })
>
>
>
>
> Thanks
> Best Regards
>
> On Tue, Sep 1, 2015 at 1:22 AM, Wayne Song <wayne.e.s...@gmail.com> wrote:
>
>
We've been running into a situation where exceptions in rdd.map() calls will
not get recorded and shown on the web UI properly. We've discovered that
this seems to occur because we're creating our own threads in
foreachPartition() calls. If I have code like this:
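Roughly, the pattern is something like the following sketch (myRdd and process
are placeholders for the real dataset and per-record work):

myRdd.foreachPartition { partition =>
  // Spawn our own worker thread inside the partition handler.
  val worker = new Thread(new Runnable {
    override def run(): Unit = {
      partition.foreach { record =>
        // An exception thrown here is raised on the worker thread, not the
        // task thread, so the Spark task itself still finishes normally.
        process(record) // placeholder for the per-record work
      }
    }
  })
  worker.start()
  worker.join()
}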
The tasks on the executors appear to complete successfully, and the exception
never shows up in the web UI.
Hello,
I am trying to start a Spark master for a standalone cluster on an EC2 node.
The CLI command I'm using looks like this:
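Something along these lines (the exact flags may differ):

./sbin/start-master.sh --host 54.xx.xx.xx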
Note that I'm specifying the --host argument; I want my Spark master to be
listening on a specific IP address. The host that I'm specifying (i.e.
54.xx.xx.xx) is the public IP address of the EC2 instance.