[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user viirya closed the pull request at: https://github.com/apache/spark/pull/11235 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-194049145 @davies yea, looks like it is doing the same. Let me close this now. Thanks for reviewing this anyway! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/11235#discussion_r55457081 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -146,6 +147,56 @@ object EliminateSerialization extends Rule[LogicalPlan] { } /** + * Add Filter to left and right of a EquiJoin to filter out rows with null keys. + * So we may not need to check nullability of keys while joining. Besides, by filtering + * out keys with null, we can also reduce data size in Join. + */ +object AddNullFilterForEquiJoin extends Rule[LogicalPlan] with PredicateHelper { + def apply(plan: LogicalPlan): LogicalPlan = plan transform { +case ExtractEquiJoinKeys(Inner, leftKeys, rightKeys, condition, left, right) => + val leftConditions = leftKeys.distinct.map { l => +IsNotNull(l) + }.reduceLeft(And) + + val rightConditions = rightKeys.distinct.map { r => +IsNotNull(r) + }.reduceLeft(And) + + val keysConditions = ExpressionSet(leftKeys.zip(rightKeys).map { lr => +EqualTo(lr._2, lr._1) + }).reduceLeft(And) + + val finalConditions = if (condition.isDefined) { +And(keysConditions, condition.get) + } else { +keysConditions + } + + val leftHasAllNotNullPredicate = left.constraints.nonEmpty && +left.constraints.filter(_.isInstanceOf[IsNotNull]) + .forall(expr => leftConditions.references.intersect(expr.references).nonEmpty) + + val rightHasAllNotNullPredicate = right.constraints.nonEmpty && +right.constraints.filter(_.isInstanceOf[IsNotNull]) + .forall(expr => rightConditions.references.intersect(expr.references).nonEmpty) --- End diff -- Because we want to eliminate reading nulls in advance of join operator. If we simply add IsNotNull to the left/right to the conditions, the nulls are still read during performing join. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-194041946 I did not realized that we had merged https://github.com/apache/spark/pull/11372/files, do we still need this? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11235#discussion_r55403683 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -146,6 +147,56 @@ object EliminateSerialization extends Rule[LogicalPlan] { } /** + * Add Filter to left and right of a EquiJoin to filter out rows with null keys. + * So we may not need to check nullability of keys while joining. Besides, by filtering + * out keys with null, we can also reduce data size in Join. + */ +object AddNullFilterForEquiJoin extends Rule[LogicalPlan] with PredicateHelper { + def apply(plan: LogicalPlan): LogicalPlan = plan transform { +case ExtractEquiJoinKeys(Inner, leftKeys, rightKeys, condition, left, right) => + val leftConditions = leftKeys.distinct.map { l => +IsNotNull(l) + }.reduceLeft(And) + + val rightConditions = rightKeys.distinct.map { r => +IsNotNull(r) + }.reduceLeft(And) + + val keysConditions = ExpressionSet(leftKeys.zip(rightKeys).map { lr => +EqualTo(lr._2, lr._1) + }).reduceLeft(And) + + val finalConditions = if (condition.isDefined) { +And(keysConditions, condition.get) + } else { +keysConditions + } + + val leftHasAllNotNullPredicate = left.constraints.nonEmpty && +left.constraints.filter(_.isInstanceOf[IsNotNull]) + .forall(expr => leftConditions.references.intersect(expr.references).nonEmpty) + + val rightHasAllNotNullPredicate = right.constraints.nonEmpty && +right.constraints.filter(_.isInstanceOf[IsNotNull]) + .forall(expr => rightConditions.references.intersect(expr.references).nonEmpty) --- End diff -- This seems complicated and definitely needs more comments. Why are we not just adding IsNotNull to the left/right when it doesn't already exist in the conditions? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11235#discussion_r55398759 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -146,6 +147,56 @@ object EliminateSerialization extends Rule[LogicalPlan] { } /** + * Add Filter to left and right of a EquiJoin to filter out rows with null keys. + * So we may not need to check nullability of keys while joining. Besides, by filtering + * out keys with null, we can also reduce data size in Join. + */ +object AddNullFilterForEquiJoin extends Rule[LogicalPlan] with PredicateHelper { + def apply(plan: LogicalPlan): LogicalPlan = plan transform { +case ExtractEquiJoinKeys(Inner, leftKeys, rightKeys, condition, left, right) => + val leftConditions = leftKeys.distinct.map { l => +IsNotNull(l) + }.reduceLeft(And) + + val rightConditions = rightKeys.distinct.map { r => +IsNotNull(r) + }.reduceLeft(And) + + val keysConditions = ExpressionSet(leftKeys.zip(rightKeys).map { lr => +EqualTo(lr._2, lr._1) + }).reduceLeft(And) + + val finalConditions = if (condition.isDefined) { +And(keysConditions, condition.get) + } else { +keysConditions + } + + val leftHasAllNotNullPredicate = left.constraints.nonEmpty && +left.constraints.filter(_.isInstanceOf[IsNotNull]) + .forall(expr => leftConditions.references.intersect(expr.references).nonEmpty) + + val rightHasAllNotNullPredicate = right.constraints.nonEmpty && +right.constraints.filter(_.isInstanceOf[IsNotNull]) + .forall(expr => rightConditions.references.intersect(expr.references).nonEmpty) + + val newLeft = if (leftHasAllNotNullPredicate) { +left + } else { +Filter(leftConditions, left) + } + + val newRight = if (rightHasAllNotNullPredicate) { +right + } else { +Filter(rightConditions, right) + } + + Join(newLeft, newRight, Inner, Some(finalConditions)) --- End diff -- If no filter added, we should return the original join. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11235#discussion_r55398641 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -146,6 +147,56 @@ object EliminateSerialization extends Rule[LogicalPlan] { } /** + * Add Filter to left and right of a EquiJoin to filter out rows with null keys. + * So we may not need to check nullability of keys while joining. Besides, by filtering + * out keys with null, we can also reduce data size in Join. + */ +object AddNullFilterForEquiJoin extends Rule[LogicalPlan] with PredicateHelper { + def apply(plan: LogicalPlan): LogicalPlan = plan transform { +case ExtractEquiJoinKeys(Inner, leftKeys, rightKeys, condition, left, right) => + val leftConditions = leftKeys.distinct.map { l => +IsNotNull(l) + }.reduceLeft(And) + + val rightConditions = rightKeys.distinct.map { r => +IsNotNull(r) + }.reduceLeft(And) + + val keysConditions = ExpressionSet(leftKeys.zip(rightKeys).map { lr => +EqualTo(lr._2, lr._1) + }).reduceLeft(And) + + val finalConditions = if (condition.isDefined) { --- End diff -- It's better to use the original condition --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11235#discussion_r55398500 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -146,6 +147,56 @@ object EliminateSerialization extends Rule[LogicalPlan] { } /** + * Add Filter to left and right of a EquiJoin to filter out rows with null keys. + * So we may not need to check nullability of keys while joining. Besides, by filtering + * out keys with null, we can also reduce data size in Join. + */ +object AddNullFilterForEquiJoin extends Rule[LogicalPlan] with PredicateHelper { + def apply(plan: LogicalPlan): LogicalPlan = plan transform { +case ExtractEquiJoinKeys(Inner, leftKeys, rightKeys, condition, left, right) => + val leftConditions = leftKeys.distinct.map { l => --- End diff -- We should only add predicate if the key is nullable and there is no IsNotNull constraints on the key. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-193720918 Comments addressed, please check if the change is good to merge now. Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-193579525 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-193579527 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52617/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-193578999 **[Test build #52617 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52617/consoleFull)** for PR 11235 at commit [`312cb32`](https://github.com/apache/spark/commit/312cb326922624e95528b7f2dc92129c59b3b524). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/11235#discussion_r55309414 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -144,6 +146,56 @@ object EliminateSerialization extends Rule[LogicalPlan] { } /** + * Add Filter to left and right of an inner Join to filter out rows with null keys. + * So we may not need to check nullability of keys while joining. Besides, by filtering + * out keys with null, we can also reduce data size in Join. + */ +object AddFilterOfNullForInnerJoin extends Rule[LogicalPlan] with PredicateHelper { --- End diff -- Renamed. Will do part of semi join and outer join in separate PR once this getting merged. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-193541806 **[Test build #52617 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52617/consoleFull)** for PR 11235 at commit [`312cb32`](https://github.com/apache/spark/commit/312cb326922624e95528b7f2dc92129c59b3b524). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/11235#discussion_r55247463 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -144,6 +146,56 @@ object EliminateSerialization extends Rule[LogicalPlan] { } /** + * Add Filter to left and right of an inner Join to filter out rows with null keys. + * So we may not need to check nullability of keys while joining. Besides, by filtering + * out keys with null, we can also reduce data size in Join. + */ +object AddFilterOfNullForInnerJoin extends Rule[LogicalPlan] with PredicateHelper { --- End diff -- Not commenting at the name at all (haven't looked at the rest of the pr), but AddFilterOfNullForEquiJoin can be shortened slightly to AddNullFilterForEquiJoin --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11235#discussion_r55245770 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -144,6 +146,56 @@ object EliminateSerialization extends Rule[LogicalPlan] { } /** + * Add Filter to left and right of an inner Join to filter out rows with null keys. + * So we may not need to check nullability of keys while joining. Besides, by filtering + * out keys with null, we can also reduce data size in Join. + */ +object AddFilterOfNullForInnerJoin extends Rule[LogicalPlan] with PredicateHelper { --- End diff -- We could also do this for semi join and outer join, should we call it AddFilterOfNullForEquiJoin ? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11235#discussion_r55245694 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -57,6 +57,8 @@ abstract class Optimizer extends RuleExecutor[LogicalPlan] { ReplaceDistinctWithAggregate) :: Batch("Aggregate", FixedPoint(100), RemoveLiteralFromGroupExpressions) :: +Batch("Join", FixedPoint(100), --- End diff -- If we can make this rule idempotent, we don't need to put this as separate group. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11235#discussion_r55244765 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -57,6 +57,8 @@ abstract class Optimizer extends RuleExecutor[LogicalPlan] { ReplaceDistinctWithAggregate) :: Batch("Aggregate", FixedPoint(100), RemoveLiteralFromGroupExpressions) :: +Batch("Join", FixedPoint(100), --- End diff -- We may have more InnerJoin from OuterJoinElimination, should we move this rule after that? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-193064804 ping @marmbrus @rxin @davies @liancheng Is this ready to go? Or you have other comments? Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-192639157 @liancheng Can you review this too? I think I've addressed previous comments. Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-191033735 @marmbrus @davies @rxin any comments for this? Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-190200841 ping @marmbrus @davies @rxin --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-189309998 @marmbrus I've addressed the comments. Please see if this change is appropriate. Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-189170441 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52028/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-189170436 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-189169285 **[Test build #52028 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52028/consoleFull)** for PR 11235 at commit [`bf4777c`](https://github.com/apache/spark/commit/bf4777c00fad6cf143f351e1e5b4d8cbb2b5e230). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-189140564 **[Test build #52028 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52028/consoleFull)** for PR 11235 at commit [`bf4777c`](https://github.com/apache/spark/commit/bf4777c00fad6cf143f351e1e5b4d8cbb2b5e230). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-189122781 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-189122783 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52021/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-189122519 **[Test build #52021 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52021/consoleFull)** for PR 11235 at commit [`8291831`](https://github.com/apache/spark/commit/8291831ffd934f06c91d6388c92e8ede66c2fdc7). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/11235#discussion_r54203802 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala --- @@ -123,6 +123,9 @@ object ExtractEquiJoinKeys extends Logging with PredicateHelper { case EqualTo(l, r) => canEvaluate(l, left) && canEvaluate(r, right) || canEvaluate(l, right) && canEvaluate(r, left) +case EqualNullSafe(l, r) => + canEvaluate(l, left) && canEvaluate(r, right) || +canEvaluate(l, right) && canEvaluate(r, left) --- End diff -- Since `EqualNullSafe` is added into left/right keys in `ExtractEquiJoinKeys`, we should also exclude it from the returned `condition`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-189103359 **[Test build #52021 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52021/consoleFull)** for PR 11235 at commit [`8291831`](https://github.com/apache/spark/commit/8291831ffd934f06c91d6388c92e8ede66c2fdc7). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-188752803 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51960/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-188752798 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-188751471 **[Test build #51960 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51960/consoleFull)** for PR 11235 at commit [`88f5020`](https://github.com/apache/spark/commit/88f5020a95ab7c8895c90fb947bddc0ff71fa2cc). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-188750192 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-188750194 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51959/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-188749094 **[Test build #51959 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51959/consoleFull)** for PR 11235 at commit [`687e948`](https://github.com/apache/spark/commit/687e94851c17e8473880bc1568d512535c155f12). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-188702764 **[Test build #51960 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51960/consoleFull)** for PR 11235 at commit [`88f5020`](https://github.com/apache/spark/commit/88f5020a95ab7c8895c90fb947bddc0ff71fa2cc). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/11235#discussion_r54071987 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoin.scala --- @@ -274,7 +274,7 @@ case class BroadcastHashJoin( |// generate join key for stream side |${keyEv.code} |// find matches from HashRelation - |$bufferType $matches = $anyNull ? null : ($bufferType)$relationTerm.get(${keyEv.value}); + |$bufferType $matches = ($bufferType)$relationTerm.get(${keyEv.value}); --- End diff -- Will revert this and address that in another PR too. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/11235#discussion_r54071649 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashJoin.scala --- @@ -126,11 +126,11 @@ trait HashJoin { while (currentHashMatches == null && streamIter.hasNext) { currentStreamedRow = streamIter.next() val key = joinKeys(currentStreamedRow) -if (!key.anyNull) { - currentHashMatches = hashedRelation.get(key) - if (currentHashMatches != null) { -currentMatchPosition = 0 - } +// We do filtering null keys by inserting a Filter before inner join, --- End diff -- OK. I will revert this part and try to address this in another PR. Thanks. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-188698959 **[Test build #51959 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51959/consoleFull)** for PR 11235 at commit [`687e948`](https://github.com/apache/spark/commit/687e94851c17e8473880bc1568d512535c155f12). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11235#discussion_r53670406 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashJoin.scala --- @@ -126,11 +126,11 @@ trait HashJoin { while (currentHashMatches == null && streamIter.hasNext) { currentStreamedRow = streamIter.next() val key = joinKeys(currentStreamedRow) -if (!key.anyNull) { - currentHashMatches = hashedRelation.get(key) - if (currentHashMatches != null) { -currentMatchPosition = 0 - } +// We do filtering null keys by inserting a Filter before inner join, --- End diff -- Even if we were going to make changes like this, they should be done in a different PR and come with some benchmarks to show that you are actually speeding things up. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11235#discussion_r53670314 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashJoin.scala --- @@ -126,11 +126,11 @@ trait HashJoin { while (currentHashMatches == null && streamIter.hasNext) { currentStreamedRow = streamIter.next() val key = joinKeys(currentStreamedRow) -if (!key.anyNull) { - currentHashMatches = hashedRelation.get(key) - if (currentHashMatches != null) { -currentMatchPosition = 0 - } +// We do filtering null keys by inserting a Filter before inner join, --- End diff -- The execution engines correctness should not be predicated on specific optimizations. That is *really* tight coupling. If you want to optimize code generation for null handling you should do it based on signals like `nullable` and not on undocumented, implicit contracts between components. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11235#discussion_r53670042 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoin.scala --- @@ -274,7 +274,7 @@ case class BroadcastHashJoin( |// generate join key for stream side |${keyEv.code} |// find matches from HashRelation - |$bufferType $matches = $anyNull ? null : ($bufferType)$relationTerm.get(${keyEv.value}); + |$bufferType $matches = ($bufferType)$relationTerm.get(${keyEv.value}); --- End diff -- Explain these changes please. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11235#discussion_r53669872 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -57,6 +57,8 @@ abstract class Optimizer extends RuleExecutor[LogicalPlan] { ReplaceDistinctWithAggregate) :: Batch("Aggregate", FixedPoint(100), RemoveLiteralFromGroupExpressions) :: +Batch("Join", Once, + AddFilterOfNullForInnerJoin) :: --- End diff -- We should make this rule idempotent instead of hacking it and making it run only once. You are loosing the benefits of emergent optimizations with this implementation. I would directly construct the filter in the left/right child, but only when its not already present in the `constraints` of the child. This is the whole reason we added the ability to reason about what `constraints` are already present on a subtree. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-186975972 cc @davies @marmbrus @liancheng @rxin --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-186835014 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-186835016 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51631/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-186834946 **[Test build #51631 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51631/consoleFull)** for PR 11235 at commit [`aada320`](https://github.com/apache/spark/commit/aada32032b6b5a71c735ddb65c4852b40f6e31c6). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `abstract class SubqueryExpression extends LeafExpression ` * `case class ScalarSubquery(` * `case class Subquery(name: String, child: SparkPlan) extends UnaryNode ` * `case class ScalarSubquery(` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-186823215 **[Test build #51631 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51631/consoleFull)** for PR 11235 at commit [`aada320`](https://github.com/apache/spark/commit/aada32032b6b5a71c735ddb65c4852b40f6e31c6). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/11235#discussion_r53442868 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashJoin.scala --- @@ -126,11 +126,11 @@ trait HashJoin { while (currentHashMatches == null && streamIter.hasNext) { currentStreamedRow = streamIter.next() val key = joinKeys(currentStreamedRow) -if (!key.anyNull) { - currentHashMatches = hashedRelation.get(key) - if (currentHashMatches != null) { -currentMatchPosition = 0 - } +// We do filtering null keys by inserting a Filter before inner join, --- End diff -- Understood. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-186142858 ping @davies --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-186116416 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51531/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-186116414 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-186116106 **[Test build #51531 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51531/consoleFull)** for PR 11235 at commit [`4f15778`](https://github.com/apache/spark/commit/4f15778f2cc26baf7b5b07313e6f50e458a400a3). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-186095529 **[Test build #51531 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51531/consoleFull)** for PR 11235 at commit [`4f15778`](https://github.com/apache/spark/commit/4f15778f2cc26baf7b5b07313e6f50e458a400a3). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-186092279 @maropu yeah, I think so. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-185558200 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51468/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-185558199 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-185558006 **[Test build #51468 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51468/consoleFull)** for PR 11235 at commit [`0c14be5`](https://github.com/apache/spark/commit/0c14be50ad16f445b8cbaf0cd1df0eaf32a14dcd). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-185538567 **[Test build #51468 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51468/consoleFull)** for PR 11235 at commit [`0c14be5`](https://github.com/apache/spark/commit/0c14be50ad16f445b8cbaf0cd1df0eaf32a14dcd). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-185535356 retest this please. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-185531092 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51464/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-185531090 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/11235#discussion_r53267709 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashJoin.scala --- @@ -126,11 +126,11 @@ trait HashJoin { while (currentHashMatches == null && streamIter.hasNext) { currentStreamedRow = streamIter.next() val key = joinKeys(currentStreamedRow) -if (!key.anyNull) { - currentHashMatches = hashedRelation.get(key) - if (currentHashMatches != null) { -currentMatchPosition = 0 - } +// We do filtering null keys by inserting a Filter before inner join, --- End diff -- As null are filters by added filter, I think we should be able to skip it. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-185161812 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51426/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-185161806 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-185161522 **[Test build #51426 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51426/consoleFull)** for PR 11235 at commit [`7766d17`](https://github.com/apache/spark/commit/7766d17d432285d8ca8cb4eb63e8e1d5717ac00d). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class AddFilterOfNullForInnerJoinSuite extends PlanTest ` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-185154496 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51425/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-185154491 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-185153934 **[Test build #51425 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51425/consoleFull)** for PR 11235 at commit [`f8505de`](https://github.com/apache/spark/commit/f8505de0bf89f008becb05eaa2e744ae7290c1ba). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-185149697 ISTM we can also add `NULL` filters in the one side of left/right-outer joins. Is it wrong? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-185134057 **[Test build #51426 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51426/consoleFull)** for PR 11235 at commit [`7766d17`](https://github.com/apache/spark/commit/7766d17d432285d8ca8cb4eb63e8e1d5717ac00d). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/11235#discussion_r53143407 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashJoin.scala --- @@ -126,11 +126,11 @@ trait HashJoin { while (currentHashMatches == null && streamIter.hasNext) { currentStreamedRow = streamIter.next() val key = joinKeys(currentStreamedRow) -if (!key.anyNull) { - currentHashMatches = hashedRelation.get(key) - if (currentHashMatches != null) { -currentMatchPosition = 0 - } +// We do filtering null keys by inserting a Filter before inner join, --- End diff -- We need assertion: !key.anyNull? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/11235#discussion_r53143261 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -23,7 +23,7 @@ import org.apache.spark.sql.catalyst.analysis.{CleanupAliases, EliminateSubQueri import org.apache.spark.sql.catalyst.expressions._ import org.apache.spark.sql.catalyst.expressions.Literal.{FalseLiteral, TrueLiteral} import org.apache.spark.sql.catalyst.expressions.aggregate._ -import org.apache.spark.sql.catalyst.planning.{ExtractFiltersAndInnerJoins, Unions} +import org.apache.spark.sql.catalyst.planning._ --- End diff -- Nit: Is it better, `import org.apache.spark.sql.catalyst.planning.{ExtractEquiJoinKeys, ExtractFiltersAndInnerJoins, Unions}`? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-185123129 **[Test build #51425 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51425/consoleFull)** for PR 11235 at commit [`f8505de`](https://github.com/apache/spark/commit/f8505de0bf89f008becb05eaa2e744ae7290c1ba). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-185082131 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-185082137 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51414/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-185081565 **[Test build #51414 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51414/consoleFull)** for PR 11235 at commit [`132890b`](https://github.com/apache/spark/commit/132890bbcd013dd284dfb9d9fd48e8440343ddd5). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-185053702 **[Test build #51414 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51414/consoleFull)** for PR 11235 at commit [`132890b`](https://github.com/apache/spark/commit/132890bbcd013dd284dfb9d9fd48e8440343ddd5). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/11235 [SPARK-13249][SQL] Add Filter checking nullability of keys for inner join JIRA: https://issues.apache.org/jira/browse/SPARK-13249 For inner join, the join key with null in it will not match each other, so we could insert a Filter before inner join (could be pushed down), then we don't need to check nullability of keys while joining. You can merge this pull request into a Git repository by running: $ git pull https://github.com/viirya/spark-1 add-filter-for-innerjoin Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/11235.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #11235 commit 216305fef918f46d50af7107c7ea3182ad91afcf Author: Liang-Chi HsiehDate: 2016-02-17T06:31:03Z Add Filter checking nullability of keys for inner join. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org