RDD has methods to zip with another RDD (zip) or with an index (zipWithIndex), but I don't see an equivalent for DataFrames. Does anyone know a good way to do this?
I thought I could just convert to RDD, do the zip, and then convert back, but:

1. I don't see a way (outside the developer API) to convert RDD[Row] directly back to a DataFrame. Is there really no way to do this?
2. I don't see any way to modify Row objects or create new rows with additional columns. In other words, there is no way to convert RDD[(Row, Row)] to RDD[Row].

It seems the only way to get what I want is to extract the data into a case class and then convert back to a DataFrame. Did I miss something?
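For reference, here is a minimal sketch of the RDD round trip described above. It is not a confirmed recommendation. It assumes a Spark version that provides SparkSession, Row.fromSeq, and createDataFrame(RDD[Row], StructType). I believe that last overload is the one marked as developer API, so this may be exactly the route the question hopes to avoid. The helper name zipDataFrames is hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.StructType

// Hypothetical helper: zip two DataFrames row by row via their RDDs.
// RDD.zip requires both sides to have the same number of partitions
// and the same number of elements in each partition.
def zipDataFrames(spark: SparkSession, left: DataFrame, right: DataFrame): DataFrame = {
  // Build one combined Row from each (Row, Row) pair.
  val rows = left.rdd.zip(right.rdd).map { case (l, r) =>
    Row.fromSeq(l.toSeq ++ r.toSeq)
  }
  // Concatenate the two schemas in the same order as the values.
  val schema = StructType(left.schema.fields ++ right.schema.fields)
  // Assumption: this overload is available (it is annotated as developer API).
  spark.createDataFrame(rows, schema)
}
```

If both inputs share a column name, the result will contain duplicate names, so renaming columns before zipping is probably advisable.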