spark git commit: [SPARK-4693] [SQL] PruningPredicates may be wrong if predicates contains an empty AttributeSet() references

marmbrus Thu, 18 Dec 2014 20:14:55 -0800

Repository: spark
Updated Branches:
  refs/heads/master 22ddb6e03 -> e7de7e5f4



[SPARK-4693] [SQL] PruningPredicates may be wrong if predicates contains an 
empty AttributeSet() references

The sql "select * from spark_test::for_test where abs(20141202) is not null" 
has predicates=List(IS NOT NULL 
HiveSimpleUdf#org.apache.hadoop.hive.ql.udf.UDFAbs(20141202)) and
partitionKeyIds=AttributeSet(). PruningPredicates is List(IS NOT NULL 
HiveSimpleUdf#org.apache.hadoop.hive.ql.udf.UDFAbs(20141202)). Then the 
exception "java.lang.IllegalArgumentException: requirement failed: Partition 
pruning predicates only supported for partitioned tables." is thrown.
The sql "select * from spark_test::for_test_partitioned_table where 
abs(20141202) is not null and type_id=11 and platform = 3" with partitioned key 
insert_date has predicates=List(IS NOT NULL 
HiveSimpleUdf#org.apache.hadoop.hive.ql.udf.UDFAbs(20141202), (type_id#12 = 
11), (platform#8 = 3)) and partitionKeyIds=AttributeSet(insert_date#24). 
PruningPredicates is List(IS NOT NULL 
HiveSimpleUdf#org.apache.hadoop.hive.ql.udf.UDFAbs(20141202)).

Author: YanTangZhai <hakeemz...@tencent.com>
Author: yantangzhai <tyz0...@163.com>

Closes #3556 from YanTangZhai/SPARK-4693 and squashes the following commits:

620ebe3 [yantangzhai] [SPARK-4693] [SQL] PruningPredicates may be wrong if 
predicates contains an empty AttributeSet() references
37cfdf5 [yantangzhai] [SPARK-4693] [SQL] PruningPredicates may be wrong if 
predicates contains an empty AttributeSet() references
70a3544 [yantangzhai] [SPARK-4693] [SQL] PruningPredicates may be wrong if 
predicates contains an empty AttributeSet() references
efa9b03 [YanTangZhai] Update HiveQuerySuite.scala
72accf1 [YanTangZhai] Update HiveQuerySuite.scala
e572b9a [YanTangZhai] Update HiveStrategies.scala
6e643f8 [YanTangZhai] Merge pull request #11 from apache/master
e249846 [YanTangZhai] Merge pull request #10 from apache/master
d26d982 [YanTangZhai] Merge pull request #9 from apache/master
76d4027 [YanTangZhai] Merge pull request #8 from apache/master
03b62b0 [YanTangZhai] Merge pull request #7 from apache/master
8a00106 [YanTangZhai] Merge pull request #6 from apache/master
cbcba66 [YanTangZhai] Merge pull request #3 from apache/master
cdef539 [YanTangZhai] Merge pull request #1 from apache/master


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e7de7e5f
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e7de7e5f
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e7de7e5f

Branch: refs/heads/master
Commit: e7de7e5f46821e1ba3b070b21d6bcf6d5ec8a796
Parents: 22ddb6e
Author: YanTangZhai <hakeemz...@tencent.com>
Authored: Thu Dec 18 20:13:46 2014 -0800
Committer: Michael Armbrust <mich...@databricks.com>
Committed: Thu Dec 18 20:13:46 2014 -0800

----------------------------------------------------------------------
 .../spark/sql/catalyst/expressions/AttributeSet.scala       | 2 ++
 .../scala/org/apache/spark/sql/hive/HiveStrategies.scala    | 5 +++--
 .../apache/spark/sql/hive/execution/HiveQuerySuite.scala    | 9 +++++++++
 3 files changed, 14 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/e7de7e5f/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/AttributeSet.scala
----------------------------------------------------------------------
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/AttributeSet.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/AttributeSet.scala
index 2b4969b..171845a 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/AttributeSet.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/AttributeSet.scala
@@ -112,4 +112,6 @@ class AttributeSet private (val baseSet: 
Set[AttributeEquals])
   override def toSeq: Seq[Attribute] = baseSet.map(_.a).toArray.toSeq
 
   override def toString = "{" + baseSet.map(_.a).mkString(", ") + "}"
+
+  override def isEmpty: Boolean = baseSet.isEmpty
 }

http://git-wip-us.apache.org/repos/asf/spark/blob/e7de7e5f/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala
----------------------------------------------------------------------
diff --git 
a/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala 
b/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala
index 5f02e95..4ebd59d 100644
--- a/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala
+++ b/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala
@@ -210,8 +210,9 @@ private[hive] trait HiveStrategies {
         // Filter out all predicates that only deal with partition keys, these 
are given to the
         // hive table scan operator to be used for partition pruning.
         val partitionKeyIds = AttributeSet(relation.partitionKeys)
-        val (pruningPredicates, otherPredicates) = predicates.partition {
-          _.references.subsetOf(partitionKeyIds)
+        val (pruningPredicates, otherPredicates) = predicates.partition { 
predicate =>
+          !predicate.references.isEmpty &&
+          predicate.references.subsetOf(partitionKeyIds)
         }
 
         pruneFilterProject(

http://git-wip-us.apache.org/repos/asf/spark/blob/e7de7e5f/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala
----------------------------------------------------------------------
diff --git 
a/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala
 
b/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala
index 63eb07c..4d81acc 100644
--- 
a/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala
+++ 
b/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala
@@ -411,6 +411,15 @@ class HiveQuerySuite extends HiveComparisonTest with 
BeforeAndAfter {
   createQueryTest("select null from table",
     "SELECT null FROM src LIMIT 1")
 
+  test("predicates contains an empty AttributeSet() references") {
+    sql(
+      """
+        |SELECT a FROM (
+        |  SELECT 1 AS a FROM src LIMIT 1 ) table
+        |WHERE abs(20141202) is not null
+      """.stripMargin).collect()
+  }
+
   test("implement identity function using case statement") {
     val actual = sql("SELECT (CASE key WHEN key THEN key END) FROM src")
       .map { case Row(i: Int) => i }


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org

spark git commit: [SPARK-4693] [SQL] PruningPredicates may be wrong if predicates contains an empty AttributeSet() references

Reply via email to