[ https://issues.apache.org/jira/browse/SPARK-23985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16625422#comment-16625422 ]

Ohad Raviv commented on SPARK-23985:
------------------------------------

You're right, that is very strange; it looks like something got lost in translation.
When I run your example (which is actually mine...) I do indeed get the right plan. However, with my original code I still get the un-optimized plan (on Spark 2.3):
{code:scala}
    import org.apache.spark.sql.expressions.Window
    import org.apache.spark.sql.functions._

    spark.range(10).selectExpr(
      "cast(id as string) a",
      "id as b").write.saveAsTable("t1")

    val windowSpec = Window.partitionBy(concat(col("a"), lit("lit"))).orderBy("b")
    spark.table("t1").withColumn("d", row_number() over windowSpec)
      .where("a>'1'")
      .explain
{code}
{code}
== Physical Plan ==
*(3) Project [a#8, b#9L, d#13]
+- *(3) Filter (isnotnull(a#8) && (a#8 > 1))
   +- Window [row_number() windowspecdefinition(_w0#14, b#9L ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS d#13], [_w0#14], [b#9L ASC NULLS FIRST]
      +- *(2) Sort [_w0#14 ASC NULLS FIRST, b#9L ASC NULLS FIRST], false, 0
         +- Exchange hashpartitioning(_w0#14, 2)
            +- *(1) Project [a#8, b#9L, concat(a#8, lit) AS _w0#14]
               +- *(1) FileScan parquet unitest.t1[a#8,b#9L] Batched: true, Format: Parquet, Location: InMemoryFileIndex[../t1], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<a:string,b:bigint>
{code}
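For comparison, here is a minimal sketch of the case that does work, against the same t1 table, with a plain column in the partition spec (I'm going by the behaviour described in the issue rather than re-pasting that plan):
{code:scala}
    import org.apache.spark.sql.expressions.Window
    import org.apache.spark.sql.functions._

    // Same table as above, but the window is partitioned by the bare column "a".
    val simpleSpec = Window.partitionBy(col("a")).orderBy("b")
    spark.table("t1").withColumn("d", row_number() over simpleSpec)
      .where("a>'1'")
      .explain
    // Expected (per the issue description): the Filter sits below the
    // Exchange/Sort, so the predicate is applied before the window is computed
    // and can be pushed further toward the scan.
{code}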
Can you see what explains the difference?
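If it helps, my current guess is the Window case in PushDownPredicate (catalyst's Optimizer.scala). Roughly, paraphrased from memory rather than quoted verbatim, it looks like this:
{code:scala}
// Rough paraphrase (not verbatim) of the Window case in PushDownPredicate.
import org.apache.spark.sql.catalyst.expressions.{And, AttributeReference, AttributeSet, PredicateHelper}
import org.apache.spark.sql.catalyst.plans.logical.{Filter, LogicalPlan, Window}

object WindowPushdownSketch extends PredicateHelper {
  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    // Only fires when every partition expression is already a bare attribute.
    case filter @ Filter(condition, w: Window)
        if w.partitionSpec.forall(_.isInstanceOf[AttributeReference]) =>
      val partitionAttrs = AttributeSet(w.partitionSpec.flatMap(_.references))
      val (candidates, nonDeterministic) =
        splitConjunctivePredicates(condition).partition(_.deterministic)
      // A conjunct is pushable only if it references nothing but partition columns.
      val (pushDown, stayUp) =
        candidates.partition(_.references.subsetOf(partitionAttrs))
      val rest = stayUp ++ nonDeterministic
      if (pushDown.nonEmpty) {
        val newWindow = w.copy(child = Filter(pushDown.reduce(And), w.child))
        if (rest.isEmpty) newWindow else Filter(rest.reduce(And), newWindow)
      } else {
        filter
      }
  }
}
{code}
In the plan above the partition spec is the extracted _w0#14, so a filter on "a" never passes the subsetOf(partitionAttrs) check (and a compound expression that isn't extracted would already fail the forall guard). If that's really the cause, the fix probably has to look through the alias in the child Project that defines _w0.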

 

> predicate push down doesn't work with simple compound partition spec
> --------------------------------------------------------------------
>
>                 Key: SPARK-23985
>                 URL: https://issues.apache.org/jira/browse/SPARK-23985
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Ohad Raviv
>            Priority: Minor
>
> while predicate push down works with this query: 
> {code:sql}
> select *, row_number() over (partition by a order by b) from t1 where a>1
> {code}
> it doesn't work with:
> {code:sql}
> select *, row_number() over (partition by concat(a,'lit') order by b) from t1 where a>1
> {code}
>  
> I added a test to FilterPushdownSuite which I think recreates the problem:
> {code:scala}
>   test("Window: predicate push down -- ohad") {
>     val winExpr = windowExpr(count('b),
>       windowSpec(Concat('a :: Nil) :: Nil, 'b.asc :: Nil, UnspecifiedFrame))
>     val originalQuery = testRelation.select('a, 'b, 'c, winExpr.as('window)).where('a > 1)
>     val correctAnswer = testRelation
>       .where('a > 1).select('a, 'b, 'c)
>       .window(winExpr.as('window) :: Nil, 'a :: Nil, 'b.asc :: Nil)
>       .select('a, 'b, 'c, 'window).analyze
>     comparePlans(Optimize.execute(originalQuery.analyze), correctAnswer)
>   }
> {code}
> I will try to create a PR with a fix.


