Hi,
The problem I am looking at is as follows:

- I read in a log file of multiple users as an RDD.
- I'd like to group that RDD into *multiple RDDs* by userId (the key).
- My processEachUser() function then takes the RDD for each individual user and calls RDD.map or DataFrame operations on it. (I already have the function coded, so I am reluctant to work with the ResultIterable object coming out of rdd.groupByKey() ...)

I've searched the mailing list and googled "RDD of RDDs", and it seems it isn't a thing at all.

The few choices left seem to be:

1) groupByKey() and then work with the ResultIterable object;
2) groupByKey() and then write each group into a file, and read them back as individual RDDs to process.

Has anyone got a better idea, or had a similar problem before?

Thanks!
Ping

--
Ping Yan
Ph.D. in Management
Dept. of Management Information Systems
University of Arizona
Tucson, AZ 85721
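
For reference, choice 1) above might look roughly like the following in PySpark. This is only a sketch under stated assumptions: the tab-separated "userId<TAB>event" log format, the parse_line() helper, and the body of process_each_user() (a stand-in for processEachUser() that just counts events) are all hypothetical. The point is that the per-user function receives a plain Python iterable (the ResultIterable) rather than an RDD, so any RDD.map calls inside it would need to become ordinary Python iteration.

```python
from typing import Iterable, List, Tuple


def parse_line(line: str) -> Tuple[str, str]:
    """Split a hypothetical 'userId<TAB>event' log line into (userId, event)."""
    user_id, event = line.split("\t", 1)
    return user_id, event


def process_each_user(events: Iterable[str]) -> int:
    """Hypothetical stand-in for processEachUser(): counts a user's events.

    `events` is whatever groupByKey() hands over for one key (a
    ResultIterable in PySpark), so only plain Python iteration is used.
    """
    return sum(1 for _ in events)


def run_with_spark(path: str) -> List[Tuple[str, int]]:
    """Group the log by userId and apply process_each_user to each group."""
    # Imported here so the helpers above can be used without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext.getOrCreate()
    return (
        sc.textFile(path)                  # one record per log line
        .map(parse_line)                   # (userId, event) pairs
        .groupByKey()                      # (userId, ResultIterable of events)
        .mapValues(process_each_user)      # (userId, per-user result)
        .collect()
    )
```

Calling run_with_spark() with the path to the log file would return a list of (userId, result) pairs on the driver. Note that groupByKey() brings all values for one key together on a single executor, so this assumes each user's events fit comfortably in memory there.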