[jira] [Commented] (SPARK-9999) RDD-like API on top of Catalyst/DataFrame

2015-10-16 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14961513#comment-14961513 ] Sandy Ryza commented on SPARK-: So ClassTags would work for case classes and Avro specific records,

[jira] [Commented] (SPARK-9999) RDD-like API on top of Catalyst/DataFrame

2015-10-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14961518#comment-14961518 ] Michael Armbrust commented on SPARK-: Yeah, I think tuples are a pretty important use case.

[jira] [Commented] (SPARK-9999) RDD-like API on top of Catalyst/DataFrame

2015-10-16 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14961567#comment-14961567 ] Matei Zaharia commented on SPARK-: Beyond tuples, you'll also want encoders for other generic

[jira] [Commented] (SPARK-9999) RDD-like API on top of Catalyst/DataFrame

2015-10-15 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959347#comment-14959347 ] Michael Armbrust commented on SPARK-: Yeah, that Scala code should work. Regarding the Java

[jira] [Commented] (SPARK-9999) RDD-like API on top of Catalyst/DataFrame

2015-10-14 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14957144#comment-14957144 ] Sandy Ryza commented on SPARK-: Maybe you all have thought through this as well, but I had some

[jira] [Commented] (SPARK-9999) RDD-like API on top of Catalyst/DataFrame

2015-10-14 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14956341#comment-14956341 ] Sandy Ryza commented on SPARK-: Thanks for the explanation [~rxin] and [~marmbrus]. I understand

[jira] [Commented] (SPARK-9999) RDD-like API on top of Catalyst/DataFrame

2015-10-14 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14957926#comment-14957926 ] Michael Armbrust commented on SPARK-: [~sandyr] did you look at the test cases [in

[jira] [Commented] (SPARK-9999) RDD-like API on top of Catalyst/DataFrame

2015-10-14 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14958022#comment-14958022 ] Sandy Ryza commented on SPARK-: bq. The problem with doing this using a registry (like kryo in RDDs

[jira] [Commented] (SPARK-9999) RDD-like API on top of Catalyst/DataFrame

2015-10-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14955308#comment-14955308 ] Reynold Xin commented on SPARK-: [~sandyr] I thought a lot about doing this on top of the existing

[jira] [Commented] (SPARK-9999) RDD-like API on top of Catalyst/DataFrame

2015-10-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14955304#comment-14955304 ] Sean Owen commented on SPARK-: I had a similar question about how much more this is than the current

[jira] [Commented] (SPARK-9999) RDD-like API on top of Catalyst/DataFrame

2015-10-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14955335#comment-14955335 ] Reynold Xin commented on SPARK-: The big ones are: 1. encoders (which breaks almost every function

[jira] [Commented] (SPARK-9999) RDD-like API on top of Catalyst/DataFrame

2015-10-13 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14955286#comment-14955286 ] Sandy Ryza commented on SPARK-: To ask the obvious question: what are the reasons that the RDD API

[jira] [Commented] (SPARK-9999) RDD-like API on top of Catalyst/DataFrame

2015-10-13 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14955320#comment-14955320 ] Sandy Ryza commented on SPARK-: [~rxin] where are the places where the API would need to break? >

[jira] [Commented] (SPARK-9999) RDD-like API on top of Catalyst/DataFrame

2015-10-13 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14955368#comment-14955368 ] Michael Armbrust commented on SPARK-: Other compatibility breaking things include: getting

[jira] [Commented] (SPARK-9999) RDD-like API on top of Catalyst/DataFrame

2015-10-13 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14955840#comment-14955840 ] Sandy Ryza commented on SPARK-: If I understand correctly, it seems like there are ways to work

[jira] [Commented] (SPARK-9999) RDD-like API on top of Catalyst/DataFrame

2015-10-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14955868#comment-14955868 ] Reynold Xin commented on SPARK-: [~sandyr] Your concern is absolutely valid, but I don't think

[jira] [Commented] (SPARK-9999) RDD-like API on top of Catalyst/DataFrame

2015-10-13 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14955878#comment-14955878 ] Michael Armbrust commented on SPARK-: I think improving Java compatibility and getting rid of

[jira] [Commented] (SPARK-9999) RDD-like API on top of Catalyst/DataFrame

2015-10-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14955877#comment-14955877 ] Reynold Xin commented on SPARK-: BTW another possible approach that we haven't discussed is that

[jira] [Commented] (SPARK-9999) RDD-like API on top of Catalyst/DataFrame

2015-09-29 Thread Sen Fang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936268#comment-14936268 ] Sen Fang commented on SPARK-: Another idea is do something similar to F# TypeProvider approach:

[jira] [Commented] (SPARK-9999) RDD-like API on top of Catalyst/DataFrame

2015-08-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706244#comment-14706244 ] Reynold Xin commented on SPARK-: This needs to be designed first. I'm not sure if

[jira] [Commented] (SPARK-9999) RDD-like API on top of Catalyst/DataFrame

2015-08-17 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699512#comment-14699512 ] Herman van Hovell commented on SPARK-: This sounds interesting. In order to