Hi. Couldn't you filter down to just the ABC records, map each record into a (show, view count) pair keyed by show, and then do a reduceByKey to sum up the views?
Something like this in Scala:

    // filter for the channel, then make a new pair (show, view count)
    val myAnswer = joined_dataset
      .filter( _._2._1 == "ABC" )
      .map( x => (x._1, x._2._2) )
      .reduceByKey( (a, b) => a + b )

This should give you an RDD with one record per show and the summed view count, but only for shows on ABC, right?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-work-with-a-joined-rdd-in-pyspark-tp25510p25514.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
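Since the original thread is about PySpark, here is the same filter -> map -> reduceByKey logic sketched in plain Python over an in-memory list, just to show what each step produces. The records and show names are made up for illustration; the shape (show, (channel, views)) is assumed to match the joined RDD.

    from collections import defaultdict

    # hypothetical joined records: (show, (channel, views))
    joined = [
        ("Show1", ("ABC", 100)),
        ("Show2", ("CBS", 50)),
        ("Show1", ("ABC", 25)),
        ("Show3", ("ABC", 10)),
    ]

    # filter: keep only the ABC records
    abc_only = [rec for rec in joined if rec[1][0] == "ABC"]

    # map: (show, (channel, views)) -> (show, views)
    pairs = [(show, views) for show, (_, views) in abc_only]

    # reduceByKey: sum the views per show
    totals = defaultdict(int)
    for show, views in pairs:
        totals[show] += views

    print(dict(totals))  # {'Show1': 125, 'Show3': 10}

On a real RDD the same steps would be the filter/map/reduceByKey calls themselves; this is just the per-record logic written out.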