[ https://issues.apache.org/jira/browse/FLINK-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14349652#comment-14349652 ]
ASF GitHub Bot commented on FLINK-1628:
---------------------------------------

GitHub user fhueske opened a pull request:

    https://github.com/apache/flink/pull/458

    [FLINK-1628] Fix partitioning properties for Joins and CoGroups.

    Fix partitioning properties for Joins and CoGroups and some smaller bugs on the way.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/fhueske/flink joinCompilerBug

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/458.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #458

----
commit 89c3bd0b76c1b4ace58e93571b361cdc0af2cbd6
Author: Fabian Hueske <fhue...@apache.org>
Date:   2015-03-04T17:49:22Z

    [FLINK-1628] Fix partitioning properties for Joins and CoGroups.

----

> Strange behavior of "where" function during a join
> --------------------------------------------------
>
>                 Key: FLINK-1628
>                 URL: https://issues.apache.org/jira/browse/FLINK-1628
>             Project: Flink
>          Issue Type: Bug
>          Components: Optimizer
>   Affects Versions: 0.9
>            Reporter: Daniel Bali
>            Assignee: Fabian Hueske
>            Priority: Critical
>              Labels: batch
>
> Hello!
> If I use the `where` function with a field list during a join, it exhibits
> strange behavior.
> Here is the sample code that triggers the error:
> https://gist.github.com/balidani/d9789b713e559d867d5c
> This example joins a DataSet with itself, then counts the number of rows. If
> I use `.where(0, 1)` the result is (22), which is not correct. If I use
> `EdgeKeySelector`, I get the correct result (101).
> When I pass a field list to the `equalTo` function (but not `where`),
> everything works again.
> If I don't include the `groupBy` and `reduceGroup` parts, everything works.
> Also, when working with large DataSets, passing a field list to `where` makes
> it incredibly slow, even though I don't see any exceptions in the log (in
> DEBUG mode).
> Does anybody know what might cause this problem?
> Thanks!

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)