saving rdd to multiple files named by the key

2015-01-26 Thread Sharon Rapoport
Hi, I have an rdd of [k,v] pairs. I want to save each [v] to a file named [k]. I got them by combining many [k,v] by [k]. I could then save to file by partitions, but that still doesn't allow me to choose the name, and leaves me stuck with foo/part-... Any tips? Thanks, Sharon

Problem getting Spark running on a Yarn cluster

2015-01-06 Thread Sharon Rapoport
Hello, We have hadoop 2.6.0 and Yarn set up on ec2. Trying to get spark 1.1.1 running on the Yarn cluster. I have of course googled around and found that this problem is solved for most after removing the line including 127.0.1.1 from /etc/hosts. This hasn’t seemed to solve this for me.