[ https://issues.apache.org/jira/browse/DRILL-5715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Boaz Ben-Zvi resolved DRILL-5715.
---------------------------------
    Resolution: Fixed
      Reviewer: Paul Rogers

The commit for DRILL-5694 (PR #938) also solves this performance bug (basically, it removed the calls to Setup before every hash computation, plus a few small changes such as replacing setSafe with set).

> Performance of refactored HashAgg operator regressed
> ----------------------------------------------------
>
>                 Key: DRILL-5715
>                 URL: https://issues.apache.org/jira/browse/DRILL-5715
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Codegen
>    Affects Versions: 1.11.0
>         Environment: 10-node RHEL 6.4 (32 Core, 256GB RAM)
>            Reporter: Kunal Khatua
>            Assignee: Boaz Ben-Zvi
>              Labels: performance, regression
>             Fix For: 1.12.0
>
>         Attachments: 26736242-d084-6604-aac9-927e729da755.sys.drill,
> 26736615-9e86-dac9-ad77-b022fd791f67.sys.drill,
> 2675cc73-9481-16e0-7d21-5f1338611e5f.sys.drill,
> 2675de42-3789-47b8-29e8-c5077af136db.sys.drill, drill-1.10.0_callTree.png,
> drill-1.10.0_hotspot.png, drill-1.11.0_callTree.png, drill-1.11.0_hotspot.png
>
>
> When running the following simple HashAgg-based query on a TPCH table
> (Lineitem with 6 billion rows) on a 10-node setup (with a single partition to
> disable any possible spilling to disk)
> {code:sql}
> select count(*)
> from (
>   select l_quantity
>        , count(l_orderkey)
>   from lineitem
>   group by l_quantity
> )
> {code}
> the runtime increased from {{7.378 sec}} to {{11.323 sec}} [reported by the
> JDBC client].
> To disable spill-to-disk in Drill-1.11.0, the {{drill-override.conf}} was
> modified to
> {code}drill.exec.hashagg.num_partitions : 1{code}
> Attached are two profiles
> Drill 1.10.0 : [^2675cc73-9481-16e0-7d21-5f1338611e5f.sys.drill]
> Drill 1.11.0 : [^2675de42-3789-47b8-29e8-c5077af136db.sys.drill]
> A separate run was done for both scenarios with the
> {{planner.width.max_per_node=10}} and profiled with YourKit.
> Image snippets are attached, indicating the hotspots in both builds:
> *Drill 1.10.0* :
> Profile: [^26736242-d084-6604-aac9-927e729da755.sys.drill]
> CallTree: [^drill-1.10.0_callTree.png]
> HotSpot: [^drill-1.10.0_hotspot.png]
> !drill-1.10.0_hotspot.png|drill-1.10.0_hotspot!
> *Drill 1.11.0* :
> Profile: [^26736615-9e86-dac9-ad77-b022fd791f67.sys.drill]
> CallTree: [^drill-1.11.0_callTree.png]
> HotSpot: [^drill-1.11.0_hotspot.png]
> !drill-1.11.0_hotspot.png|drill-1.11.0_hotspot!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
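
The resolution note attributes the regression mainly to a setup call being made before every hash computation. The following is only a minimal Java sketch of that general pattern, not Drill's generated code: the names `HashSetupSketch`, `KeyHasher`, `setup`, `hash`, `hashPerCallSetup` and `hashHoistedSetup` are hypothetical, and the hash function is an arbitrary stand-in. It shows that hoisting the setup out of the per-row loop yields the same hash values with far fewer setup calls.

```java
import java.util.Arrays;

public class HashSetupSketch {

    /** Hypothetical stand-in for a hash helper that needs a setup step. */
    static final class KeyHasher {
        private long seed;
        int setupCalls = 0; // counts how often setup() ran, for illustration

        void setup() {
            setupCalls++;
            seed = 0x9E3779B97F4A7C15L; // arbitrary fixed seed (assumption)
        }

        int hash(long key) {
            long h = (key ^ seed) * 0xFF51AFD7ED558CCDL;
            return (int) (h ^ (h >>> 32));
        }
    }

    /** Pattern described as the regression: setup() before every hash. */
    static int[] hashPerCallSetup(KeyHasher hasher, long[] keys) {
        int[] out = new int[keys.length];
        for (int i = 0; i < keys.length; i++) {
            hasher.setup();
            out[i] = hasher.hash(keys[i]);
        }
        return out;
    }

    /** Pattern described as the fix: setup() once, then hash every key. */
    static int[] hashHoistedSetup(KeyHasher hasher, long[] keys) {
        int[] out = new int[keys.length];
        hasher.setup();
        for (int i = 0; i < keys.length; i++) {
            out[i] = hasher.hash(keys[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        long[] keys = {1L, 2L, 3L, 50L};
        KeyHasher perCall = new KeyHasher();
        KeyHasher hoisted = new KeyHasher();

        int[] a = hashPerCallSetup(perCall, keys);
        int[] b = hashHoistedSetup(hoisted, keys);

        System.out.println(Arrays.equals(a, b));                       // true
        System.out.println(perCall.setupCalls + " vs " + hoisted.setupCalls); // 4 vs 1
    }
}
```

In this toy version the setup is cheap, so the sketch only counts calls rather than measuring time; with 6 billion input rows, as in the report above, even a small per-call cost would be multiplied accordingly.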