[ https://issues.apache.org/jira/browse/SPARK-23985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16625436#comment-16625436 ]

Ohad Raviv commented on SPARK-23985:
------------------------------------

The same behavior reproduces on Spark 2.4:
{code}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, concat, lit, row_number}

sparkSession.range(10)
  .selectExpr("cast(id as string) as a", "id as b", "id")
  .write.saveAsTable("t1")

val w = sparkSession.sql(
  "select *, row_number() over (partition by concat(a,'lit') order by b) from t1 where a>'1'")
w.explain

val windowSpec = Window.partitionBy(concat(col("a"), lit("lit"))).orderBy("b")
sparkSession.table("t1").withColumn("d", row_number() over windowSpec)
  .where("a>'1'")
  .explain
{code}
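As a side note, filtering before adding the window column is a workaround for the DSL form, since the Filter then starts out below the Window and ordinary scan pushdown applies. A minimal sketch reusing the windowSpec above (equivalent here only because a>'1' keeps or drops whole concat(a,'lit') partitions):
{code}
// Workaround sketch, not a fix: filter first, then compute the window
// column, so the predicate is already below the Window and reaches the scan.
sparkSession.table("t1")
  .where("a>'1'")
  .withColumn("d", row_number() over windowSpec)
  .explain
{code}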
The resulting physical plans for the two original queries, in order:
{code}
== Physical Plan ==
*(3) Project [a#11, b#12L, id#13L, row_number() OVER (PARTITION BY concat(a, lit) ORDER BY b ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)#22]
+- Window [row_number() windowspecdefinition(_w0#23, b#12L ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS row_number() OVER (PARTITION BY concat(a, lit) ORDER BY b ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)#22], [_w0#23], [b#12L ASC NULLS FIRST]
   +- *(2) Sort [_w0#23 ASC NULLS FIRST, b#12L ASC NULLS FIRST], false, 0
      +- Exchange hashpartitioning(_w0#23, 1)
         +- *(1) Project [a#11, b#12L, id#13L, concat(a#11, lit) AS _w0#23]
            +- *(1) Filter (isnotnull(a#11) && (a#11 > 1))
               +- *(1) FileScan parquet default.t1[a#11,b#12L,id#13L] Batched: true, Format: Parquet, Location: InMemoryFileIndex[file:../catalyst/spark-warehouse/t1], PartitionFilters: [], PushedFilters: [IsNotNull(a), GreaterThan(a,1)], ReadSchema: struct<a:string,b:bigint,id:bigint>


== Physical Plan ==
*(3) Project [a#11, b#12L, id#13L, d#28]
+- *(3) Filter (isnotnull(a#11) && (a#11 > 1))
   +- Window [row_number() windowspecdefinition(_w0#29, b#12L ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS d#28], [_w0#29], [b#12L ASC NULLS FIRST]
      +- *(2) Sort [_w0#29 ASC NULLS FIRST, b#12L ASC NULLS FIRST], false, 0
         +- Exchange hashpartitioning(_w0#29, 1)
            +- *(1) Project [a#11, b#12L, id#13L, concat(a#11, lit) AS _w0#29]
               +- *(1) FileScan parquet default.t1[a#11,b#12L,id#13L] Batched: true, Format: Parquet, Location: InMemoryFileIndex[file:../catalyst/spark-warehouse/t1], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<a:string,b:bigint,id:bigint>
{code}
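As far as I can tell, the reason is visible in how the plan looks by the time PushDownPredicate runs: the analyzer's ExtractWindowExpressions has already aliased concat(a,'lit') as _w0, so the Window's partition spec is just the attribute _w0#29, and the rule only pushes predicates whose references are a subset of those partition attributes. (a#11 > 1) references a, not _w0, so it stays above the Window. A paraphrased sketch of the relevant case in Catalyst's Optimizer.scala (illustrative, not a verbatim copy):
{code}
// Paraphrased from PushDownPredicate in Catalyst's Optimizer.scala; treat
// this as an illustrative sketch of the logic, not the exact source.
case filter @ Filter(condition, w: Window)
    // Guard: every partition expression must be a bare column reference.
    // That holds here, because the partition spec is already just _w0.
    if w.partitionSpec.forall(_.isInstanceOf[AttributeReference]) =>
  val partitionAttrs = AttributeSet(w.partitionSpec.flatMap(_.references))
  val (candidates, nonDeterministic) =
    splitConjunctivePredicates(condition).partition(_.deterministic)
  // Only predicates confined to the partition columns may move below the
  // Window: (a#11 > 1) references a, not _w0, so it fails this test.
  val (pushDown, rest) = candidates.partition { cond =>
    cond.references.subsetOf(partitionAttrs)
  }
  // ... pushDown (if non-empty) is placed below the Window, rest stays above.
{code}
So a fix has to look through the _w0 alias back to concat(a,'lit'). Note that pushing a predicate on the alias's inputs is only sound when it keeps or drops whole partitions: that is true here, since equal concat(a,'lit') values imply equal a, but not in general (e.g. partition by a % 10 with a > 5 would split partitions and change row_number results).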

> predicate push down doesn't work with simple compound partition spec
> --------------------------------------------------------------------
>
>                 Key: SPARK-23985
>                 URL: https://issues.apache.org/jira/browse/SPARK-23985
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Ohad Raviv
>            Priority: Minor
>
> While predicate push down works with this query:
> {code:sql}
> select *, row_number() over (partition by a order by b) from t1 where a>1
> {code}
> it doesn't work with:
> {code:sql}
> select *, row_number() over (partition by concat(a,'lit') order by b) from t1 where a>1
> {code}
>  
> I added a test to FilterPushdownSuite which I think recreates the problem:
> {code:scala}
>   test("Window: predicate push down -- ohad") {
>     val winExpr = windowExpr(count('b),
>       windowSpec(Concat('a :: Nil) :: Nil, 'b.asc :: Nil, UnspecifiedFrame))
>     val originalQuery = testRelation.select('a, 'b, 'c, winExpr.as('window)).where('a > 1)
>     val correctAnswer = testRelation
>       .where('a > 1).select('a, 'b, 'c)
>       .window(winExpr.as('window) :: Nil, 'a :: Nil, 'b.asc :: Nil)
>       .select('a, 'b, 'c, 'window).analyze
>     comparePlans(Optimize.execute(originalQuery.analyze), correctAnswer)
>   }
> {code}
> I will try to create a PR with a fix.


