Hi, I've encountered a weird problem in Spark SQL.
I use sbt/sbt hive/console to start the shell.

I'm testing filter push-down with Catalyst.

scala> val queryPlan = sql("select value from (select key,value from src)a where a.key=86")
scala> queryPlan.baseLogicalPlan
res0: org.apache.spark.sql.catalyst.plans.logical.LogicalPlan = 
Project ['value]
 Filter ('a.key = 86)
  Subquery a
   Project ['key,'value]
    UnresolvedRelation None, src, None

I want to achieve filter push-down.

So I run:
scala> var newQuery = queryPlan.baseLogicalPlan transform {
     |     case f @ Filter(_, p @ Project(_,grandChild)) 
     |     if (f.references subsetOf grandChild.output) => 
     |     p.copy(child = f.copy(child = grandChild))
     | }
<console>:42: error: type mismatch;
 found   : Seq[org.apache.spark.sql.catalyst.expressions.Attribute]
 required:
scala.collection.GenSet[org.apache.spark.sql.catalyst.expressions.Attribute]
           if (f.references subsetOf grandChild.output) => 
                                                ^
The compiler reports the type mismatch above, and I don't know what's wrong.
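If I read the error right, the mismatch is unrelated to push-down itself: subsetOf is defined on Set and expects a GenSet argument, but grandChild.output is a Seq[Attribute]. Converting the Seq with .toSet (my guess, not something I've confirmed against the Catalyst sources) makes an analogous snippet with plain collections compile:

```scala
// Stand-ins for f.references (a Set, per the error message) and
// grandChild.output (a Seq[Attribute] in Spark 1.0.1).
val references = Set("key")
val output     = Seq("key", "value")

// references subsetOf output        // type mismatch: Seq is not a GenSet
val contained = references subsetOf output.toSet  // compiles
// contained is true here, since "key" appears in output
```

So the guard would presumably become `if (f.references subsetOf grandChild.output.toSet)` -- assuming f.references really is a Set, as the error message suggests.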

If I instead use a guard that always succeeds:
scala> var newQuery = queryPlan.baseLogicalPlan transform {
     |     case f @ Filter(_, p @ Project(_,grandChild)) 
     |     if true => 
     |     p.copy(child = f.copy(child = grandChild))
     | }
newQuery: org.apache.spark.sql.catalyst.plans.logical.LogicalPlan = 
Project ['value]
 Filter ('a.key = 86)
  Subquery a
   Project ['key,'value]
    UnresolvedRelation None, src, None

It seems the Filter is still in the same position; the order was not switched.
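Staring at the plan again, I suspect the rule simply never matches: in this plan the Filter's direct child is the Subquery, not a Project, so the pattern Filter(_, Project(_, grandChild)) cannot fire at all -- which would explain why even the "if true" version leaves the tree unchanged. Here is a tiny self-contained model (toy case classes of my own, NOT Spark's real ones) that seems to show the same behavior, plus a variant of the rule that also looks through the Subquery:

```scala
// Toy plan nodes (my own simplification, not Spark's classes).
sealed trait Plan {
  def mapChildren(f: Plan => Plan): Plan
  // Rough analogue of TreeNode.transform: try the rule at this node,
  // then recurse into the children of whatever the rule produced.
  def transform(rule: PartialFunction[Plan, Plan]): Plan = {
    val applied = if (rule.isDefinedAt(this)) rule(this) else this
    applied.mapChildren(_.transform(rule))
  }
}
case class Relation(name: String) extends Plan {
  def mapChildren(f: Plan => Plan): Plan = this
}
case class Project(cols: Seq[String], child: Plan) extends Plan {
  def mapChildren(f: Plan => Plan): Plan = copy(child = f(child))
}
case class Filter(cond: String, child: Plan) extends Plan {
  def mapChildren(f: Plan => Plan): Plan = copy(child = f(child))
}
case class Subquery(alias: String, child: Plan) extends Plan {
  def mapChildren(f: Plan => Plan): Plan = copy(child = f(child))
}

// The shape of the plan printed by the REPL above.
val plan: Plan =
  Project(Seq("value"),
    Filter("'a.key = 86",
      Subquery("a",
        Project(Seq("key", "value"), Relation("src")))))

// My original rule: requires Filter directly over Project.
// Here the Filter sits over a Subquery, so nothing matches.
val unchanged = plan.transform {
  case f @ Filter(_, p @ Project(_, grandChild)) =>
    p.copy(child = f.copy(child = grandChild))
}
// unchanged is identical to plan

// A variant that also matches through the Subquery node.
val pushed = plan.transform {
  case Filter(cond, s @ Subquery(_, p @ Project(_, grandChild))) =>
    s.copy(child = p.copy(child = Filter(cond, grandChild)))
}
// pushed: Project -> Subquery -> Project -> Filter -> Relation
```

If this diagnosis is right, the real rule would also need the reference check from before, and I'm not sure pushing a predicate that still references the alias 'a below the Subquery is semantically safe -- corrections welcome.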
Can anyone guide me about it?




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/spark1-0-1-catalyst-transform-filter-not-push-down-tp9599.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.