GitHub user hvanhovell opened a pull request:
https://github.com/apache/spark/pull/13629
[SPARK-15370][SQL] Fix count bug
# What changes were proposed in this pull request?
This pull request fixes the COUNT bug in the
`RewriteCorrelatedScalarSubquery` rule.
After this change, the rule tests the expression at the root of the
correlated subquery to determine whether the expression returns `NULL` on empty
input. If the expression does not return `NULL`, the rule generates additional
logic in the `Project` operator above the rewritten subquery. This additional
logic intercepts `NULL` values coming from the outer join and replaces them
with the value that the subquery's expression would return on empty input.
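To make the behavior described above concrete, here is a minimal sketch in plain Python that models the problem without any Spark APIs. The tables `outer` and `inner`, and every function name, are hypothetical and chosen only for illustration; the actual rule operates on Catalyst logical plans. The sketch assumes a correlated `count(*)` subquery that is rewritten into an aggregate plus a left outer join.

```python
# Plain-Python model of the COUNT bug; hypothetical data, no Spark APIs.
outer = [1, 2, 3]   # outer table, column a
inner = [1, 1, 3]   # inner table, column b


def correlated_count(a):
    """Reference semantics: SELECT count(*) FROM inner WHERE b = a."""
    return sum(1 for b in inner if b == a)


def aggregate_inner():
    """Decorrelated subquery: SELECT b, count(*) FROM inner GROUP BY b."""
    counts = {}
    for b in inner:
        counts[b] = counts.get(b, 0) + 1
    return counts


def naive_rewrite():
    """Left outer join on a = b; unmatched outer rows get None (NULL)."""
    counts = aggregate_inner()
    return [(a, counts.get(a)) for a in outer]


def fixed_rewrite(value_on_empty=0):
    """Same join, plus a projection that replaces the join's NULL with the
    value the aggregate returns on empty input (0 for count)."""
    counts = aggregate_inner()
    return [(a, counts[a] if a in counts else value_on_empty) for a in outer]


if __name__ == "__main__":
    print(naive_rewrite())
    print(fixed_rewrite())
```

In this model the row with `a = 2` has no match in `inner`, so the naive rewrite yields NULL where the original query would yield 0. The extra projection restores the original result.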
This PR takes over https://github.com/apache/spark/pull/13155; it only fixes
an issue with `Literal` construction and some style problems. All credit
should go to @frreiss.
# How was this patch tested?
Added regression tests to cover all branches of the updated rule (see
changes to `SubquerySuite`).
Ran all existing automated regression tests after merging with latest trunk.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/hvanhovell/spark SPARK-15370-cleanup
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/13629.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #13629
commit 3b1649105869c72ccb16f86732e04829aaae0e93
Author: frreiss
Date: 2016-05-16T17:58:00Z
Commit before merge.
commit 58df60d5468e53c4b6fc41a1d7c896abfb01cdd1
Author: frreiss
Date: 2016-05-16T17:58:21Z
Merge branch 'master' of https://github.com/apache/spark
commit 910cbf54e2300a57640e017610c204da2d462964
Author: frreiss
Date: 2016-05-16T20:46:55Z
Merge branch 'master' of https://github.com/apache/spark
commit 76d9f4528b8536d1e5680279ab76b9e26dd3a873
Author: frreiss
Date: 2016-05-17T14:52:46Z
Merge branch 'master' of https://github.com/apache/spark
commit 1615d560310a59b08a4c03677dd53eb3b9b49e06
Author: frreiss
Date: 2016-05-20T02:01:33Z
Second version of the updated rewrite
commit 1b4ba5ed629d9b1e72d919d89b3592f7b29f3f3c
Author: frreiss
Date: 2016-05-20T14:57:24Z
Merge branch 'master' of https://github.com/apache/spark
commit fb7cb4304ba02815a79278d1d5d6d194fe8db25c
Author: frreiss
Date: 2016-05-24T18:11:54Z
Merge branch 'master' of https://github.com/apache/spark
commit 8cd2877179dded4557c8da92e5b16011637289b0
Author: frreiss
Date: 2016-06-10T05:02:47Z
Addressing additional corner cases and review comments.
commit e5c592032b5604a8f8f10326ecd10ade22b5dc43
Author: Herman van Hovell
Date: 2016-06-12T23:43:30Z
Style fixes
commit 39f7e043c0abbe27823499699877e986f6fa2eb7
Author: Herman van Hovell
Date: 2016-06-12T23:43:32Z
Merge remote-tracking branch 'apache-github/master' into SPARK-15370-cleanup
commit 30dd0bd7d560151085e53667fcc4f6a8895844ed
Author: Herman van Hovell
Date: 2016-06-12T23:57:18Z
Some simplification