Why not just create a partition for the key you want to group by and save it
in there? Appending to a file already written to HDFS isn't the best idea
IMO.
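A minimal sketch of this suggestion, using the DataFrame writer's `partitionBy` so each distinct key value gets its own output directory instead of appending to one file (column names and the output path here are illustrative, not from the thread):

```scala
import org.apache.spark.sql.SparkSession

object PartitionedWrite {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("partitioned-write")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(("a", 1), ("a", 2), ("b", 3)).toDF("key", "value")

    // Each distinct key value becomes its own subdirectory,
    // e.g. .../output/key=a/ and .../output/key=b/
    df.write
      .partitionBy("key")
      .mode("overwrite")
      .parquet("hdfs:///tmp/output")

    spark.stop()
  }
}
```

Rerunning a job then rewrites a key's directory rather than appending to an existing HDFS file.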
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Writing-all-values-for-same-key-to-one-file-tp27455p27486.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
pend to it.
Thanks,
Ritesh
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Writing-all-values-for-same-key-to-one-file-tp27455p27485.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
For RDDs, you can use `saveAsHadoopFile` with a custom `MultipleOutputFormat`.
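A sketch of that approach, assuming string keys and values (the class name and output path are illustrative): a `MultipleTextOutputFormat` subclass routes each record to a file named after its key, and `generateActualKey` drops the key from the file contents.

```scala
import org.apache.hadoop.io.NullWritable
import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat
import org.apache.spark.SparkContext

class KeyBasedOutputFormat extends MultipleTextOutputFormat[Any, Any] {
  // One output file per key: records with key "a" go to a file named "a".
  override def generateFileNameForKeyValue(key: Any, value: Any, name: String): String =
    key.asInstanceOf[String]

  // Write only the value into the file; suppress the key.
  override def generateActualKey(key: Any, value: Any): Any =
    NullWritable.get()
}

object WritePerKey {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local[*]", "write-per-key")
    val pairs = sc.parallelize(Seq(("a", "1"), ("a", "2"), ("b", "3")))

    pairs.saveAsHadoopFile(
      "hdfs:///tmp/per-key-output",
      classOf[String],
      classOf[String],
      classOf[KeyBasedOutputFormat]
    )
    sc.stop()
  }
}
```

Make sure all values for a key land in the same partition first (e.g. via a partitioner or `groupByKey`), or each task will try to write the same per-key file.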
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Writing-all-values-for-same-key-to-one-file-tp27455p27483.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Partition your data using the key:

rdd.partitionBy(new HashPartitioner(numPartitions))
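A hedged sketch of this idea: hash-partition a pair RDD by key so all values for a given key land in the same output partition file. Note that multiple keys can still share one partition unless the partitioner guarantees otherwise; the data and path below are illustrative.

```scala
import org.apache.spark.{HashPartitioner, SparkContext}

object PartitionByKey {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local[*]", "partition-by-key")
    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))

    // Records with the same key hash to the same partition, so
    // saveAsTextFile writes them into the same part-NNNNN file.
    pairs
      .partitionBy(new HashPartitioner(4))
      .map { case (k, v) => s"$k,$v" }
      .saveAsTextFile("hdfs:///tmp/partitioned-output")

    sc.stop()
  }
}
```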
On Fri, Aug 5, 2016 at 10:10 AM, rtijoriwala wrote:
> Any recommendations? comments?
Any recommendations? comments?
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Writing-all-values-for-same-key-to-one-file-tp27455p27480.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.