Hello All,
I am new to Spark and have a very basic question: how do I write the output
of an action on an RDD to HDFS?
Thanks in advance for the help.
Cheers,
Ravi
Hi Chris,
Thanks for the quick reply and the welcome. I am trying to read a file from
HDFS and then write just the first line back to HDFS.
I am calling first() on the RDD to get the first line.
On Jun 22, 2015, at 7:42 PM, Chris Gore cdg...@cdgore.com wrote:
Hi Ravi,
Welcome, you probably want RDD.saveAsTextFile("hdfs:///my_file")
Chris
On Jun 22, 2015, at 5:28 PM, ravi tella ddpis...@gmail.com wrote:
Hello All,
I am new to Spark and have a very basic question: how do I write the output of
an action on an RDD to HDFS?
Thanks in advance
Hi Ravi,
For this case, you could simply do
    sc.parallelize([rdd.first()]).saveAsTextFile("hdfs:///my_file")
using PySpark, or
    sc.parallelize(Array(rdd.first())).saveAsTextFile("hdfs:///my_file")
using Scala.
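
To spell out why the extra parallelize step is needed: first() is an action, so it returns a plain value to the driver rather than an RDD, and saveAsTextFile is only available on an RDD. Below is a minimal PySpark sketch of the one-liner above. The helper name save_first_line and the input path hdfs:///my_input are made up for illustration; sc is assumed to be an existing SparkContext.

```python
def save_first_line(sc, rdd, path):
    """Write only the first element of `rdd` to `path` as text.

    `sc` is assumed to be a live SparkContext; `path` is any URI Spark can write to.
    """
    # first() is an action: it returns a plain Python value, not an RDD.
    first_line = rdd.first()
    # Wrap the value in a one-element, single-partition RDD so that
    # saveAsTextFile can be called on it.
    sc.parallelize([first_line], 1).saveAsTextFile(path)
    return first_line


# Usage with a live SparkContext (the input path is a placeholder):
#   from pyspark import SparkContext
#   sc = SparkContext(appName="first-line")
#   save_first_line(sc, sc.textFile("hdfs:///my_input"), "hdfs:///my_file")
```

Note that saveAsTextFile writes a directory at the given path containing part files rather than a single flat file, and it fails if that path already exists.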
Chris
On Jun 22, 2015, at 5:53 PM, ddpis...@gmail.com wrote:
Hi Chris,
Thanks for