Filed a JIRA for this: SPARK-3489 <https://issues.apache.org/jira/browse/SPARK-3489>

On Thu, Sep 4, 2014 at 9:36 AM, Mohit Jaggi <mohitja...@gmail.com> wrote:

> Folks,
> I sent an email announcing
> https://github.com/AyasdiOpenSource/df
>
> This dataframe is basically a map of RDDs of columns (along with DSL
> sugar), as column based operations seem to be most common. But row
> operations are not uncommon. To get rows out of columns right now I zip the
> column RDDs together. I use RDD.zip then flatten the tuples I get. I
> realize that RDD.zipPartitions might be faster. However, I believe an even
> better approach should be possible. Surely we can have a zip method that
> can combine a large variable number of RDDs? Can that be added to
> Spark-core? Or is there an alternative equally good or better approach?
>
> Cheers,
> Mohit.
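
To make the request in the quoted message concrete, here is a rough sketch of what an n-ary, partition-wise zip would do. It is plain Python and does not use Spark. Each column is modeled as a list of partitions, and the names `zip_partitions_n` and `rows` are hypothetical, not part of Spark or the df project. It assumes the same precondition RDD.zip has: every column has the same number of partitions and the same number of elements in each partition.

```python
from typing import Iterator, List, Sequence, Tuple

# Assumption for illustration: a "column" is a list of partitions,
# and each partition is a list of values. This only mimics an RDD's layout.
Column = List[List[object]]


def zip_partitions_n(columns: Sequence[Column]) -> Iterator[List[Tuple[object, ...]]]:
    """Yield one partition of row tuples per input partition, in a single pass."""
    if not columns:
        return
    num_partitions = len(columns[0])
    if any(len(col) != num_partitions for col in columns):
        raise ValueError("all columns must have the same number of partitions")
    # Walk the i-th partition of every column together.
    for parts in zip(*columns):
        size = len(parts[0])
        if any(len(p) != size for p in parts):
            raise ValueError("co-located partitions must have the same number of elements")
        # Build each row directly as one flat tuple, so no nested
        # ((a, b), c) pairs are created and no flatten step is needed.
        yield list(zip(*parts))


def rows(columns: Sequence[Column]) -> List[Tuple[object, ...]]:
    """Collect all rows across partitions, preserving partition order."""
    return [row for part in zip_partitions_n(columns) for row in part]


# Three columns, each split into two partitions of sizes 2 and 1.
ids = [[1, 2], [3]]
names = [["a", "b"], ["c"]]
scores = [[0.5, 0.25], [1.0]]

all_rows = rows([ids, names, scores])
```

The point of the sketch is that rows come out as flat tuples in one pass over the co-located partitions, whereas chaining pairwise zips produces nested pairs that have to be flattened afterwards.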