[ 
https://issues.apache.org/jira/browse/SPARK-16461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Rosen updated SPARK-16461:
-------------------------------
    Assignee: Hyukjin Kwon

> Support partition batch pruning with `<=>` (EqualNullSafe) predicate in 
> InMemoryTableScanExec
> ---------------------------------------------------------------------------------------------
>
>                 Key: SPARK-16461
>                 URL: https://issues.apache.org/jira/browse/SPARK-16461
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Hyukjin Kwon
>            Assignee: Hyukjin Kwon
>             Fix For: 2.1.0
>
>
> It seems the `EqualNullSafe` filter was missed when pruning partition batches in 
> cached tables.
> Supporting it improves performance by roughly 75% in the benchmark below (the gain will vary).
> Running the code below:
> {code}
> test("Null-safe equal comparison") {
>   val N = 20000000
>   val df = spark.range(N).repartition(20)
>   val benchmark = new Benchmark("Null-safe equal comparison", N)
>   df.createOrReplaceTempView("t")
>   spark.catalog.cacheTable("t")
>   // Run the query once so the cache is populated before timing starts.
>   sql("select id from t where id <=> 1").collect()
>   // Time the cached query over 10 iterations.
>   benchmark.addCase("Null-safe equal comparison", 10) { _ =>
>     sql("select id from t where id <=> 1").collect()
>   }
>   benchmark.run()
> }
> {code}
> produces the results below:
> Before:
> {code}
> Running benchmark: Null-safe equal comparison
>   Running case: Null-safe equal comparison
>   Stopped after 10 iterations, 2098 ms
> Java HotSpot(TM) 64-Bit Server VM 1.8.0_45-b14 on Mac OS X 10.11.5
> Intel(R) Core(TM) i7-4850HQ CPU @ 2.30GHz
> Null-safe equal comparison:              Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
> ------------------------------------------------------------------------------------------------
> Null-safe equal comparison                     204 /  210         98.1          10.2       1.0X
> {code}
> After:
> {code}
> Running benchmark: Null-safe equal comparison
>   Running case: Null-safe equal comparison
>   Stopped after 10 iterations, 478 ms
> Java HotSpot(TM) 64-Bit Server VM 1.8.0_45-b14 on Mac OS X 10.11.5
> Intel(R) Core(TM) i7-4850HQ CPU @ 2.30GHz
> Null-safe equal comparison:              Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
> ------------------------------------------------------------------------------------------------
> Null-safe equal comparison                      42 /   48        474.1           2.1       1.0X
> {code}
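
The pruning idea in the description can be illustrated with a small standalone sketch. It is not Spark's implementation: `BatchStats`, `statsOf`, `mightMatch`, and `scan` are hypothetical names, and treating a null literal by checking a per-batch null count is an assumption of this sketch. It only shows how per-batch min/max statistics let a scan skip batches that cannot satisfy `col <=> literal`.

```scala
// Hypothetical, simplified per-batch statistics; not Spark's internal classes.
final case class BatchStats(min: Long, max: Long, nullCount: Int)

object NullSafePruning {
  // Summarize one batch of nullable values. A real cache would compute this
  // once when the batch is built; the sketch recomputes it for brevity.
  def statsOf(batch: Seq[Option[Long]]): BatchStats = {
    val present = batch.flatten
    BatchStats(
      // An all-null batch gets an empty range (min > max), so no value matches it.
      min = if (present.isEmpty) Long.MaxValue else present.min,
      max = if (present.isEmpty) Long.MinValue else present.max,
      nullCount = batch.size - present.size)
  }

  // True when the batch might contain a row satisfying `col <=> literal`.
  def mightMatch(stats: BatchStats, literal: Option[Long]): Boolean =
    literal match {
      case Some(v) => stats.min <= v && v <= stats.max // same range test as plain equality
      case None    => stats.nullCount > 0              // assumption: NULL <=> NULL needs a null in the batch
    }

  // Scan only the batches the statistics cannot rule out.
  // Returns the matching rows and the number of batches actually scanned.
  def scan(
      batches: Seq[Seq[Option[Long]]],
      literal: Option[Long]): (Seq[Option[Long]], Int) = {
    val kept = batches.filter(b => mightMatch(statsOf(b), literal))
    // Option equality is null-safe here: None == None is true.
    (kept.flatten.filter(_ == literal), kept.size)
  }
}
```

With three batches covering the ranges 0 to 2, 10 to 12 (including one null), and 20 to 21, `NullSafePruning.scan(batches, Some(1L))` would read only the first batch, and `NullSafePruning.scan(batches, None)` would read only the second.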



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

