Re: TableAPI - Join on two keys

2015-04-17 Thread Stephan Ewen
I also agree with Aljoscha. It is widely considered a big mistake in SQL that cross products are implicit rather than explicit (because they are so expensive). Let's not make the same mistake here by putting theoretical algebra above practical user experience and program safety.
On Fri, Apr 17

Re: TableAPI - Join on two keys

2015-04-17 Thread Fabian Hueske
I agree with Aljoscha. Let's give a good error message and offer a cross operator.
2015-04-17 4:52 GMT-05:00 Aljoscha Krettek:
> Yes, that is the idea, but I think in this case the user must be
> protected from an operation that can get ridiculously expensive.
>
> On Fri, Apr 17, 2015 at 10:20 A

Re: TableAPI - Join on two keys

2015-04-17 Thread Aljoscha Krettek
Yes, that is the idea, but I think in this case the user must be protected from an operation that can get ridiculously expensive.
On Fri, Apr 17, 2015 at 10:20 AM, Felix Neutatz wrote:
> I am also against the manual cross method. Isn't it the idea of the table
> API to hide the actual implementat

Re: TableAPI - Join on two keys

2015-04-17 Thread Felix Neutatz
I am also against the manual cross method. Isn't it the idea of the Table API to hide the actual implementation from the user?
Best regards, Felix
On 17.04.2015 at 10:09 AM, "Till Rohrmann" wrote:
> Why not do two separate joins, union the results and do a distinct
> operation on the comb

Re: TableAPI - Join on two keys

2015-04-17 Thread Till Rohrmann
Why not do two separate joins, union the results, and do a distinct operation on the combined key?
On Fri, Apr 17, 2015 at 9:42 AM, Aljoscha Krettek wrote:
> So, the first thing is a "feature" of the Java API that removes
> duplicate fields in keys, so an equi-join on (0,0) with (0,1) would

Re: TableAPI - Join on two keys

2015-04-17 Thread Aljoscha Krettek
So, the first thing is a "feature" of the Java API that removes duplicate fields in keys, so an equi-join on (0,0) with (0,1) would throw an error because one 0 is removed from the first key. The second thing is a feature of the Table API where the error message is hinting at the problem: Could no

TableAPI - Join on two keys

2015-04-16 Thread Felix Neutatz
Hi, I want to join two tables in the following way:
case class WeightedEdge(src: Int, target: Int, weight: Double)
case class Community(communityID: Int, nodeID: Int)
case class CommunitySumTotal(communityID: Int, sumTotal: Double)
val communities: DataSet[Community]
val weightedEdges: DataSet[