Running a simple word count job in standalone mode as a non-root user from
spark-shell. The Spark master and worker services are running as the root user.

The problem is that the _temporary directory under /user/krajah/output2/_temporary/0
is being created with root ownership even when the job is run as a non-root
user (krajah in this case). The higher-level directories are getting
created with the right permissions, though. A similar question was posted
a long time back, but there is no answer:
http://mail-archives.apache.org/mod_mbox/mesos-user/201408.mbox/%3CCAAeYHL2M9J9xEotf_0zXmZXy2_x-oBHa=xxl2naft203o6u...@mail.gmail.com%3E
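
For what it's worth, the ownership appears to follow the OS user of the JVM
process that creates the path, not the user who submitted the job; the
_temporary dirs are written by executor JVMs forked from the root-owned
worker. A minimal sketch using plain JDK calls (pasteable into spark-shell;
the temp-file name is arbitrary) that shows a file picking up the creating
process's owner:

```scala
import java.nio.file.Files

// A file created by a JVM is owned by the OS user that JVM runs as,
// regardless of who "logically" submitted the work. Run inside the
// driver this prints krajah; run inside a task on a root-started
// worker it would report root -- which matches the root-owned
// _temporary directory above.
val tmp = Files.createTempFile("owner-demo", ".txt")
println("process user: " + System.getProperty("user.name"))
println("file owner:   " + Files.getOwner(tmp).getName)
Files.delete(tmp)
```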


*Wrong permission for child directory*
drwxr-xr-x   - root   root            0 2015-04-01 11:20
/user/krajah/output2/_temporary/0/_temporary


*Right permission for parent directories*
hadoop fs -ls -R /user/krajah/my_output
drwxr-xr-x   - krajah krajah          1 2015-04-01 11:46
/user/krajah/my_output/_temporary
drwxr-xr-x   - krajah krajah          3 2015-04-01 11:46
/user/krajah/my_output/_temporary/0

*Job and Stacktrace*

scala> val file = sc.textFile("/user/krajah/junk.txt")
scala> val counts = file.flatMap(line => line.split(" ")).
     |   map(word => (word, 1)).
     |   reduceByKey(_ + _)

scala> counts.saveAsTextFile("/user/krajah/count2")
java.io.IOException: Error: Permission denied
    at com.mapr.fs.MapRFileSystem.rename(MapRFileSystem.java:926)
    at
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:345)
    at
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:362)
    at
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitJob(FileOutputCommitter.java:310)
    at
org.apache.hadoop.mapred.FileOutputCommitter.commitJob(FileOutputCommitter.java:136)
    at
org.apache.spark.SparkHadoopWriter.commitJob(SparkHadoopWriter.scala:127)
    at
org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:1079)
    at
org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:944)
    at
org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:853)
    at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1199)
    at $iwC$$iwC$$iwC$$iwC.<init>(<console>:17)
    at $iwC$$iwC$$iwC.<init>(<console>:22)
    at $iwC$$iwC.<init>(<console>:24)
    at $iwC.<init>(<console>:26)
    at <init>(<console>:28)
    at .<init>(<console>:32)
    at .<clinit>(<console>)
    at .<init>(<console>:7)
    at .<clinit>(<console>)
    at $print(<console>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)


--
Kannan
