[GitHub] flink pull request: [FLINK-1622][java-api][scala-api] add a partia...

2015-03-09 Thread mxm
GitHub user mxm opened a pull request: https://github.com/apache/flink/pull/466 [FLINK-1622][java-api][scala-api] add a partial GroupReduce operator The partial GroupReduce operator acts like a regular GroupReduce operator but does not perform a full group reduce. Instead, it per

[GitHub] flink pull request: [FLINK-1622][java-api][scala-api] add a partia...

2015-03-09 Thread hsaputra
Github user hsaputra commented on a diff in the pull request: https://github.com/apache/flink/pull/466#discussion_r26064533 --- Diff: flink-core/src/main/java/org/apache/flink/api/common/operators/base/GroupReducePartialOperatorBase.java --- @@ -0,0 +1,150 @@ +/* + * Licen

[GitHub] flink pull request: [FLINK-1622][java-api][scala-api] add a partia...

2015-03-10 Thread aljoscha
Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/466#discussion_r26105008 --- Diff: flink-compiler/src/main/java/org/apache/flink/compiler/operators/GroupReducePartialProperties.java --- @@ -0,0 +1,112 @@ +/* + * Licensed t

[GitHub] flink pull request: [FLINK-1622][java-api][scala-api] add a partia...

2015-03-10 Thread aljoscha
Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/466#discussion_r26105136 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/operators/SortedGrouping.java --- @@ -156,6 +156,23 @@ public SortedGrouping(DataSet set, Keys k

[GitHub] flink pull request: [FLINK-1622][java-api][scala-api] add a partia...

2015-03-10 Thread aljoscha
Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/466#discussion_r26105155 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/operators/UnsortedGrouping.java --- @@ -159,7 +159,23 @@ public UnsortedGrouping(DataSet set, Ke

[GitHub] flink pull request: [FLINK-1622][java-api][scala-api] add a partia...

2015-03-10 Thread aljoscha
Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/466#discussion_r26105226 --- Diff: flink-scala/src/main/scala/org/apache/flink/api/scala/GroupedDataSet.scala --- @@ -355,6 +355,63 @@ class GroupedDataSet[T: ClassTag]( }

[GitHub] flink pull request: [FLINK-1622][java-api][scala-api] add a partia...

2015-03-10 Thread aljoscha
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/466#issuecomment-78016698 I like the implementation, except for my comments on groupReducePartial() on grouped DataSets. Also, the tests seem a bit shady because of all the grouping and regular re

[GitHub] flink pull request: [FLINK-1622][java-api][scala-api] add a partia...

2015-03-10 Thread mxm
Github user mxm commented on a diff in the pull request: https://github.com/apache/flink/pull/466#discussion_r26120661 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/operators/SortedGrouping.java --- @@ -156,6 +156,23 @@ public SortedGrouping(DataSet set, Keys keys,

[GitHub] flink pull request: [FLINK-1622][java-api][scala-api] add a partia...

2015-03-10 Thread mxm
Github user mxm commented on a diff in the pull request: https://github.com/apache/flink/pull/466#discussion_r26130403 --- Diff: flink-scala/src/main/scala/org/apache/flink/api/scala/GroupedDataSet.scala --- @@ -355,6 +355,63 @@ class GroupedDataSet[T: ClassTag]( }

[GitHub] flink pull request: [FLINK-1622][java-api][scala-api] add a partia...

2015-03-10 Thread mxm
Github user mxm commented on a diff in the pull request: https://github.com/apache/flink/pull/466#discussion_r26130234 --- Diff: flink-core/src/main/java/org/apache/flink/api/common/operators/base/GroupReducePartialOperatorBase.java --- @@ -0,0 +1,150 @@ +/* + * Licensed t

[GitHub] flink pull request: [FLINK-1622][java-api][scala-api] add a partia...

2015-03-10 Thread mxm
Github user mxm commented on a diff in the pull request: https://github.com/apache/flink/pull/466#discussion_r26130398 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/operators/UnsortedGrouping.java --- @@ -159,7 +159,23 @@ public UnsortedGrouping(DataSet set, Keys ke

[GitHub] flink pull request: [FLINK-1622][java-api][scala-api] add a partia...

2015-03-10 Thread mxm
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/466#issuecomment-78076027 @aljoscha Thanks for the comments. I agree, the tests are a bit shady because they test the operator by first performing a partial, then a full reduce. Using a custom partitio

[GitHub] flink pull request: [FLINK-1622][java-api][scala-api] add a partia...

2015-03-10 Thread hsaputra
Github user hsaputra commented on the pull request: https://github.com/apache/flink/pull/466#issuecomment-78070448 I assume this will be new operator but I do not see updates on the documentation files. --- If your project is set up for it, you can reply to this email and have your r

[GitHub] flink pull request: [FLINK-1622][java-api][scala-api] add a partia...

2015-03-10 Thread fhueske
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/466#issuecomment-78084575 Yes, I've got a couple of comments as well. First of all, as @mxm said, I would propose to call this operator ``combine`` because it is a generalized combiner (out

[GitHub] flink pull request: [FLINK-1622][java-api][scala-api] add a partia...

2015-03-10 Thread mxm
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/466#issuecomment-78093403 @aljoscha @fhueske For a general combine, the operator can be used without grouping. When we want to combine elements before performing a proper groupReduce with a groupBy, we

[GitHub] flink pull request: [FLINK-1622][java-api][scala-api] add a partia...

2015-03-10 Thread mxm
Github user mxm commented on a diff in the pull request: https://github.com/apache/flink/pull/466#discussion_r26139203 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/operators/GroupReducePartialOperator.java --- @@ -0,0 +1,229 @@ +/* + * Licensed to the Apach

[GitHub] flink pull request: [FLINK-1622][java-api][scala-api] add a partia...

2015-03-12 Thread fhueske
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/466#discussion_r26135427 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/operators/GroupReducePartialOperator.java --- @@ -0,0 +1,229 @@ +/* + * Licensed to the A

[GitHub] flink pull request: [FLINK-1622][java-api][scala-api] add a partia...

2015-03-12 Thread fhueske
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/466#issuecomment-78091524 I have to correct myself. A combiner should of course be called on groups of records. Therefore, calling it on a Grouping makes absolute sense. However, the semantics of t

[GitHub] flink pull request: [FLINK-1622][java-api][scala-api] add a partia...

2015-03-12 Thread aljoscha
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/466#issuecomment-78101926 Sorry, I completely blanked, of course, You still need the grouping, only the shuffle step you don't need. So, I suggest only better tests, using a combination of

[GitHub] flink pull request: [FLINK-1622][java-api][scala-api] add a partia...

2015-03-13 Thread mxm
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/466#issuecomment-79096322 @aljoscha @fhueske @hsaputra Thanks for the feedback. Some people suggested that the name is confusing and that my pull request involved too much code duplication. I propose t