Re: Executing spark jobs with predefined Hadoop user

2014-04-12 Thread Asaf Lahav
Thank you all very much for your responses. We are going to test these recommendations. Adnan, regarding the HDFS URI, this is actually how we are already accessing the file system. It was simply removed from the post. Thank you, Asaf On Thu, Apr 10, 2014 at 5:33 PM, Sha

RE: Executing spark jobs with predefined Hadoop user

2014-04-10 Thread Shao, Saisai
Hi Asaf, The user who runs SparkContext is determined by the code below in SparkContext. Normally this user.name is the user who started the JVM. You can start your application with -Duser.name=xxx to specify the username you want, and that username will be the one used to communicate with HDFS. val
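
The effect of -Duser.name can be checked outside Spark with a few lines of plain Scala. This is only a minimal sketch, not the SparkContext code referred to above (that snippet is cut off in this archive). The object name, the resolveUser helper, and the "<unknown>" fallback are hypothetical and exist only for illustration.

```scala
object UserNameSketch {
  // Hypothetical helper: looks up "user.name" through the supplied lookup
  // function and falls back to a placeholder when the property is absent.
  def resolveUser(lookup: String => Option[String]): String =
    lookup("user.name").getOrElse("<unknown>")

  def main(args: Array[String]): Unit = {
    // sys.props reflects the JVM system properties, including any value
    // passed on the command line as -Duser.name=...
    val user = resolveUser(key => sys.props.get(key))
    println(s"Effective user.name: $user")
  }
}
```

If the JVM is launched with -Duser.name=hdfs, the lookup should return hdfs; without the flag it returns the operating-system user that started the JVM.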

Re: Executing spark jobs with predefined Hadoop user

2014-04-10 Thread Adnan
Then the problem is not on the Spark side. You have three options; choose any one of them: 1. Change the permissions on the /tmp/Iris folder from a shell on the NameNode with the "hdfs dfs -chmod" command. 2. Run your Hadoop service as the hdfs user. 3. Disable dfs.permissions in conf/hdfs-site.xml. Regards, Adnan avito w

Re: Executing spark jobs with predefined Hadoop user

2014-04-10 Thread Adnan
You need to use a proper HDFS URI with saveAsTextFile. For example: rdd.saveAsTextFile("hdfs://NameNode:Port/tmp/Iris/output.tmp") Regards, Adnan Asaf Lahav wrote > Hi, > > We are using Spark with data files on HDFS. The files are stored as files > for predefined hadoop user ("hdfs"). > > The
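
A fully qualified HDFS URI of the kind suggested for saveAsTextFile can be assembled and sanity-checked with java.net.URI before it is handed to Spark. This is a rough sketch under stated assumptions: the object and helper names are hypothetical, and the host namenode.example.com and port 8020 are placeholders, not values from this thread.

```scala
import java.net.URI

object HdfsUriSketch {
  // Hypothetical helper: builds hdfs://host:port/path from its parts.
  // The path must be absolute (start with "/"), otherwise URI throws.
  def hdfsUri(host: String, port: Int, path: String): String =
    new URI("hdfs", null, host, port, path, null, null).toString

  // Hypothetical check: true only when the location carries an explicit
  // hdfs scheme and a host, i.e. it does not rely on a default file system.
  def isQualified(location: String): Boolean = {
    val uri = new URI(location)
    uri.getScheme == "hdfs" && uri.getHost != null
  }

  def main(args: Array[String]): Unit = {
    // Placeholder host and port, assumed for illustration only.
    val target = hdfsUri("namenode.example.com", 8020, "/tmp/Iris/output.tmp")
    println(target)
    println(isQualified(target))
    println(isQualified("/tmp/Iris/output.tmp"))
  }
}
```

The resulting string could then be passed to rdd.saveAsTextFile in place of a bare path.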