[ https://issues.apache.org/jira/browse/SPARK-36027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17375778#comment-17375778 ]
Apache Spark commented on SPARK-36027:
--------------------------------------

User 'SaurabhChawla100' has created a pull request for this issue:
https://github.com/apache/spark/pull/33232

> Push down Filter in case of filter having child as TypedFilter
> ---------------------------------------------------------------
>
>                 Key: SPARK-36027
>                 URL: https://issues.apache.org/jira/browse/SPARK-36027
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.2
>            Reporter: Saurabh Chawla
>            Priority: Major
>
> When a Filter has a TypedFilter as its child, filter pushdown does not take place.
>
> {code:java}
> scala> def testUdfFunction(r: String): Boolean = {
>      |   r.equals("hello")
>      | }
> testUdfFunction: (r: String)Boolean
>
> scala> val df = spark.read.parquet("/testDir/testParquetSize/Parquetgzip/")
> df: org.apache.spark.sql.DataFrame = [_1: string, _2: string ... 1 more field]
> {code}
>
> df.filter(x => testUdfFunction(x.getAs("_1"))).filter("_2<='id103855'").queryExecution.executedPlan
>
> {code:java}
> Filter (isnotnull(_2#1) AND (_2#1 <= id103855))
> +- *(1) Filter $line20.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$Lambda$3184/1455948476@5ce4af92.apply
>    +- *(1) ColumnarToRow
>       +- FileScan parquet [_1#0,_2#1,_3#2] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex[file:/Users/saurabh.chawla/testDir/testParquetSize/Parquetgzip], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<_1:string,_2:string,_3:string>
> {code}
>
> df.filter(x => testUdfFunction(x.getAs("_1"))).filter("_2<='id103855'").queryExecution.optimizedPlan
>
> {code:java}
> Filter (isnotnull(_2#1) AND (_2#1 <= id103855))
> +- TypedFilter $line22.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$Lambda$3191/320569017@37a2806c, interface org.apache.spark.sql.Row, [StructField(_1,StringType,true), StructField(_2,StringType,true), StructField(_3,StringType,true)], createexternalrow(_1#0.toString, _2#1.toString, _3#2.toString, StructField(_1,StringType,true), StructField(_2,StringType,true), StructField(_3,StringType,true))
>    +- Relation[_1#0,_2#1,_3#2] parquet
> {code}
>
> A change is needed to push down the filter in this scenario.
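To illustrate the kind of rewrite this improvement asks for, below is a minimal sketch of an optimizer rule that pushes a deterministic Filter condition through a TypedFilter so the predicate can reach the scan and show up in PushedFilters. This is only an illustration built on the Catalyst logical operators Filter and TypedFilter; the rule name PushFilterThroughTypedFilter is hypothetical and this is not the actual change made in the linked pull request.

{code:scala}
import org.apache.spark.sql.catalyst.plans.logical.{Filter, LogicalPlan, TypedFilter}
import org.apache.spark.sql.catalyst.rules.Rule

// Hypothetical rule (illustrative name, not part of Spark): TypedFilter only
// filters rows and preserves its child's output attributes, so a deterministic
// Filter sitting on top of it can instead be evaluated below it, bringing the
// predicate closer to the relation where the data source pushdown rules can
// pick it up.
object PushFilterThroughTypedFilter extends Rule[LogicalPlan] {
  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    case Filter(condition, typedFilter: TypedFilter) if condition.deterministic =>
      // Filter(cond, TypedFilter(f, child)) => TypedFilter(f, Filter(cond, child))
      typedFilter.copy(child = Filter(condition, typedFilter.child))
  }
}
{code}

As a workaround until such a pushdown exists, applying the column predicate before the typed filter, e.g. df.filter("_2<='id103855'").filter(x => testUdfFunction(x.getAs("_1"))), should let the predicate be pushed into the Parquet scan, since the Filter then sits directly above the relation.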