Thanks, I got the example below working, though it writes both the keys and the values to the output file.
Is there any way to write just the values?

--
Nick

    String[] strings = { "Abcd", "Azlksd", "whhd", "wasc", "aDxa" };

    sc.parallelize(Arrays.asList(strings))
        .mapToPair(pairFunction)
        .saveAsHadoopFile("s3://...", String.class, String.class, RDDMultipleTextOutputFormat.class);

________________________________
From: Nicholas Chammas <nicholas.cham...@gmail.com>
Sent: Wednesday, May 4, 2016 4:21:12 PM
To: Afshartous, Nick; user@spark.apache.org
Subject: Re: Writing output of key-value Pair RDD

You're looking for this discussion: http://stackoverflow.com/q/23995040/877069

Also, a simpler alternative with DataFrames: https://github.com/apache/spark/pull/8375#issuecomment-202458325

On Wed, May 4, 2016 at 4:09 PM Afshartous, Nick <nafshart...@turbine.com> wrote:

Hi,

Is there any way to write out to S3 the values of a key-value Pair RDD? I'd like each value of a pair to be written to its own file, where the file name corresponds to the key name.

Thanks,
--
Nick