Hi Devies.
Thank you for the quick answer.
I have code like this:
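import sys
from pyspark import SparkContext

sc = SparkContext(appName=TAD)
lines = sc.textFile(sys.argv[1], 1)
# doSplit and traffic_process_model are defined elsewhere in the job
result = lines.map(doSplit).groupByKey().map(lambda kv: traffic_process_model(kv[0], kv[1]))
result.saveAsTextFile(sys.argv[2])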
Can you please give short
Hi,
I am running a PySpark job.
I need to serialize the final result to *HDFS as binary files* and be able
to give a *name to the output files*.
I found this post:
http://stackoverflow.com/questions/25293962/specifying-the-output-file-name-in-apache-spark
but it only explains how to do it in Scala.
One option may be to call the HDFS tools or a client to rename the files after saveAsXXXFile().
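A minimal sketch of that rename step, assuming the `hdfs` command-line tool is on the driver's PATH and that the job wrote a single part file. The helper names, the part file name `part-00000`, and the target name below are only illustrative, not something taken from your job:

```python
import subprocess


def build_rename_command(output_dir, part_name, new_name):
    """Build an `hdfs dfs -mv` command that renames one file inside output_dir."""
    base = output_dir.rstrip("/")
    return ["hdfs", "dfs", "-mv", base + "/" + part_name, base + "/" + new_name]


def rename_part_file(output_dir, part_name, new_name, run=subprocess.check_call):
    """Rename a part file once the save call has finished.

    `run` is injectable so the command can be inspected without a cluster.
    """
    run(build_rename_command(output_dir, part_name, new_name))


# Hypothetical usage after the save call:
#   result.saveAsTextFile(sys.argv[2])
#   rename_part_file(sys.argv[2], "part-00000", "traffic_result.txt")
```

With more than one partition there will be several part files, so you would either rename each of them or reduce the RDD to one partition before saving.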
On Thu, Nov 13, 2014 at 9:39 PM, Oleg Ruchovets oruchov...@gmail.com wrote:
Hi,
I am running a PySpark job.
I need to serialize the final result to HDFS as binary files and be able to
give a name for output