I used groupBy to create the keys for both RDDs, then did the join. I think, though, that it would be useful if a future version of Spark allowed us to specify the fields on which to join, even when the keys are different. Scalding supports this.
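
For anyone reading this later, here is a rough sketch of the workaround: key each dataset by the field you want to join on, then join on those keys. It is plain Python over lists, not Spark code, and the names key_by, join, users, and orders are made up for illustration. On real RDDs the equivalent steps would presumably be keyBy (or groupBy, as I did) followed by join.

```python
from collections import defaultdict

def key_by(records, field):
    # Pair each record with the value of the chosen field (hypothetical helper).
    return [(rec[field], rec) for rec in records]

def join(left, right):
    # Inner join of two lists of (key, value) pairs on equal keys.
    index = defaultdict(list)
    for key, value in right:
        index[key].append(value)
    return [(key, (lv, rv)) for key, lv in left for rv in index.get(key, [])]

# Two datasets whose join fields have different names: user_id vs. buyer.
users = [{"user_id": 1, "name": "ann"}, {"user_id": 2, "name": "bob"}]
orders = [
    {"buyer": 1, "item": "book"},
    {"buyer": 1, "item": "pen"},
    {"buyer": 3, "item": "cup"},
]

joined = join(key_by(users, "user_id"), key_by(orders, "buyer"))
for key, (user, order) in joined:
    print(key, user["name"], order["item"])
```

Only keys present on both sides survive, so user 2 and buyer 3 are dropped, as in an inner join.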
-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/specifying-fields-for-join-tp7528p7591.html Sent from the Apache Spark User List mailing list archive at Nabble.com.