Filed JIRA SPARK-3489 <https://issues.apache.org/jira/browse/SPARK-3489>

On Thu, Sep 4, 2014 at 9:36 AM, Mohit Jaggi <mohitja...@gmail.com> wrote:

> Folks,
> I sent an email announcing
> https://github.com/AyasdiOpenSource/df
>
> This dataframe is basically a map of RDDs of columns (along with DSL
> sugar), as column based operations seem to be most common. But row
> operations are not uncommon. To get rows out of columns right now I zip the
> column RDDs together. I use RDD.zip then flatten the tuples I get. I
> realize that RDD.zipPartitions might be faster. However, I believe an even
> better approach should be possible. Surely we can have a zip method that
> can combine a large variable number of RDDs? Can that be added to
> Spark-core? Or is there an alternative equally good or better approach?
>
> Cheers,
> Mohit.
>
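For anyone following the JIRA, here is a rough plain-Python analogy of the two approaches discussed above. It is only a sketch and does not use Spark: ordinary lists stand in for column RDDs, nested lists stand in for partitions, and the function names (zip_pairwise_then_flatten, zip_all_at_once, zip_partitions_all) are hypothetical, not existing or proposed Spark API. It assumes all columns have the same length and the same partitioning, which is also what RDD.zip requires.

```python
def zip_pairwise_then_flatten(columns):
    """Current approach: fold the columns in one at a time.

    Each step zips the rows built so far with the next column and
    flattens the resulting (row, value) pair into a wider row.
    """
    if not columns:
        return []
    # Start with 1-tuples so every later step just appends one value.
    rows = [(value,) for value in columns[0]]
    for column in columns[1:]:
        if len(column) != len(rows):
            raise ValueError("all columns must have the same length")
        rows = [row + (value,) for row, value in zip(rows, column)]
    return rows


def zip_all_at_once(columns):
    """Requested approach: a single zip over any number of columns."""
    if len({len(column) for column in columns}) > 1:
        raise ValueError("all columns must have the same length")
    return list(zip(*columns))


def zip_partitions_all(partitioned_columns):
    """Partition-wise variant: each column is a list of partitions.

    Partition i of every column is zipped together in one pass, so no
    intermediate nested tuples are created.
    """
    if len({len(column) for column in partitioned_columns}) > 1:
        raise ValueError("all columns must have the same number of partitions")
    return [zip_all_at_once(list(parts)) for parts in zip(*partitioned_columns)]


if __name__ == "__main__":
    cols = [[1, 2, 3], ["a", "b", "c"], [1.0, 2.0, 3.0]]
    print(zip_pairwise_then_flatten(cols))
    print(zip_all_at_once(cols))
```

Both functions return the same rows. The difference is that the pairwise version builds and discards an intermediate row for every column it adds, whereas the variadic version walks all columns in lockstep once.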
