It seems to me that makeLinkRDDs works only because partition IDs happen to coincide with the block IDs we get from userPartitioner: the HashPartitioner in *val grouped = ratingsByUserBlock.partitionBy(new HashPartitioner(numUserBlocks))* is effectively preserving the keys assigned by the ALSPartitioner. That is, the blockId in *val links = grouped.mapPartitionsWithIndex((blockId, elements) => { ...* happens to be the same as the keys in *grouped*.

I feel this is rather fragile. If we were not using HashPartitioner, the blockId would no longer be the same as the one from the ALSPartitioner, which could lead to annoying problems. Correct me if I missed something.
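
To make the concern concrete, here is a minimal stand-alone sketch (no Spark dependency) of why the two IDs line up. It assumes the keys in *grouped* are Int block IDs in the range 0 until numUserBlocks, and hashPartition is my own re-statement of the non-negative hashCode modulus that HashPartitioner uses, not Spark's actual code. reversedPartition is a purely hypothetical alternative partitioner, included only to show that the coincidence is not guaranteed in general.

```scala
// Stand-alone sketch; all names below are illustrative, not Spark or MLlib APIs.
object PartitionIdSketch {
  // Re-statement of the HashPartitioner rule: a non-negative
  // modulus of the key's hashCode by the number of partitions.
  def hashPartition(key: Any, numPartitions: Int): Int = {
    val rawMod = key.hashCode % numPartitions
    rawMod + (if (rawMod < 0) numPartitions else 0)
  }

  // Hypothetical partitioner that is just as valid, but places
  // block IDs in reverse order across the partitions.
  def reversedPartition(key: Int, numPartitions: Int): Int =
    numPartitions - 1 - hashPartition(key, numPartitions)

  def main(args: Array[String]): Unit = {
    val numUserBlocks = 4
    val blockIds = 0 until numUserBlocks

    // An Int's hashCode is the Int itself, so for keys in
    // [0, numUserBlocks) the partition index equals the key.
    println(blockIds.map(b => b -> hashPartition(b, numUserBlocks)))

    // With the hypothetical partitioner the partition index no longer
    // equals the key, so an index-based blockId would be wrong.
    println(blockIds.map(b => b -> reversedPartition(b, numUserBlocks)))
  }
}
```

The first line printed pairs every block ID with itself; the second does not. If that reading is right, deriving the block ID from the key of each element, rather than from the partition index, would not depend on which partitioner is used.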