[ https://issues.apache.org/jira/browse/SPARK-37989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17480963#comment-17480963 ]

Yuming Wang commented on SPARK-37989:
-------------------------------------

Benchmark 5
{code:java}
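import org.apache.spark.benchmark.Benchmark
import org.apache.spark.sql.catalyst.optimizer.LimitPushDown
import org.apache.spark.sql.internal.SQLConf

// withSQLConf is assumed to be in scope, e.g. via Spark's SQLHelper trait.
val numRows = 1024 * 1024 * 1000

// Build a three-column Parquet table in which every column repeats with period numRows / 3.
spark.sql(
  s"""CREATE TABLE t1 USING parquet AS
     |SELECT id % (${numRows} / 3) AS a, id % (${numRows} / 3) AS b, id % (${numRows} / 3) AS c
     |FROM range(1, ${numRows}L, 1, 5)""".stripMargin)

val benchmark = new Benchmark(
  "Push down limit through Aggregate if it is group only", numRows, minNumIters = 1)

// First case excludes the LimitPushDown rule; second case runs with no rule excluded.
Seq(LimitPushDown.ruleName, "").foreach { excludedRules =>
  val state = if (excludedRules.nonEmpty) "disabled" else "enabled"
  benchmark.addCase(s"Push down $state") { _ =>
    withSQLConf(SQLConf.OPTIMIZER_EXCLUDED_RULES.key -> excludedRules) {
      // The noop sink executes the query without writing any output.
      spark.sql("SELECT DISTINCT * FROM t1 LIMIT 1000")
        .write.format("noop").mode("Overwrite").save()
    }
  }
}

benchmark.run()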
{code}

{noformat}
Java HotSpot(TM) 64-Bit Server VM 1.8.0_281-b09 on Mac OS X 10.15.7
Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Push down limit through Aggregate if it is group only:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
-------------------------------------------------------------------------------------------------------------------------------------
Push down disabled                                            752585         752585           0          1.4         717.7       1.0X
Push down enabled                                             266496         266496           0          3.9         254.1       2.8X
{noformat}
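
As a sanity check on the table above, the derived columns can be recomputed from the best times and the row count. This is only a minimal sketch: it assumes that Rate and Per Row are derived from Best Time and numRows, and that Relative is the ratio of the two best times. The names bestMs, perRowNs and rateMPerSec are illustrative and not part of the Benchmark API.

```scala
// Values copied from the benchmark output above.
val numRows = 1024L * 1024 * 1000
val bestMs = Map("disabled" -> 752585.0, "enabled" -> 266496.0)

// Assumed definitions of the derived columns (not taken from the Benchmark source).
def perRowNs(ms: Double): Double = ms * 1e6 / numRows            // nanoseconds per row
def rateMPerSec(ms: Double): Double = numRows / (ms / 1e3) / 1e6 // million rows per second

// Relative speedup of the "enabled" case over the "disabled" baseline.
val speedup = bestMs("disabled") / bestMs("enabled")

bestMs.foreach { case (name, ms) =>
  println(f"$name%-8s perRow=${perRowNs(ms)}%.1f ns  rate=${rateMPerSec(ms)}%.1f M/s")
}
println(f"speedup=$speedup%.1fX")
```

Under these assumptions the recomputed values agree with the reported Rate, Per Row and Relative columns to the printed precision.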



> Push down limit through Aggregate if it is group only
> -----------------------------------------------------
>
>                 Key: SPARK-37989
>                 URL: https://issues.apache.org/jira/browse/SPARK-37989
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: Yuming Wang
>            Priority: Major
>



