Saving RDD to multiple files named by the key

2015-01-26 Thread Sharon Rapoport
Hi,

I have an RDD of [k,v] pairs, which I got by combining many [k,v] pairs by
[k]. I want to save each [v] to a file named [k]. I could save to files by
partition, but that still doesn't let me choose the file names, and leaves
me stuck with foo/part-...

Any tips?

Thanks,
Sharon
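
One possible direction, sketched here rather than taken from the thread: since
the pairs are already combined by key, each key appears once, so every pair can
be written to its own file from inside the partition. The function name
write_partition and the directory name "out" are hypothetical, and the sketch
assumes that str(k) is a safe file name and that writing to the local disk of
each worker is acceptable (it does not write to HDFS).

```python
import os


def write_partition(pairs, out_dir="out"):
    """Write each (key, value) pair to its own file at out_dir/<key>."""
    # Create the target directory on whichever machine runs this partition.
    os.makedirs(out_dir, exist_ok=True)
    for key, value in pairs:
        # Assumption: str(key) contains no path separators.
        path = os.path.join(out_dir, str(key))
        with open(path, "w") as f:
            f.write(str(value))
```

With PySpark this could be applied as rdd.foreachPartition(write_partition),
which hands the function an iterator over the pairs of one partition. The files
then end up on whichever worker processed that partition.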


Problem getting Spark running on a YARN cluster

2015-01-06 Thread Sharon Rapoport
Hello, 

We have Hadoop 2.6.0 and YARN set up on EC2, and are trying to get Spark 1.1.1
running on the YARN cluster.
I have of course googled around and found that, for most people, the error
below goes away after removing the line containing 127.0.1.1 from /etc/hosts.
That hasn’t solved it for me. Does anyone have an idea where else 127.0.1.1
might be hiding in some conf? I have looked everywhere… Or is there a
completely different problem?

Thanks,
Sharon

I am getting this error:

WARN network.SendingConnection: Error finishing connection to /127.0.1.1:47020
java.net.ConnectException: Connection refused
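
A small diagnostic sketch, not part of the original report: it checks whether
a node's own hostname still resolves to a loopback address such as 127.0.1.1,
which can happen even when the line seems to be gone from /etc/hosts. The
function names is_loopback and check_hostname are hypothetical, and the sketch
assumes the check is run on each node of the cluster.

```python
import ipaddress
import socket


def is_loopback(addr):
    """Return True if addr lies in a loopback range (e.g. 127.0.0.0/8)."""
    return ipaddress.ip_address(addr).is_loopback


def check_hostname():
    """Resolve this machine's hostname and report whether it is loopback."""
    host = socket.gethostname()
    addr = socket.gethostbyname(host)  # may raise if the name does not resolve
    return host, addr, is_loopback(addr)
```

Calling check_hostname() on a node returns the hostname, the address it
resolves to, and a flag. If the flag is True, the hostname mapping is a likely
place to look. Setting the SPARK_LOCAL_IP environment variable to the node's
routable address is one thing that may be worth trying.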