Norbert Luksa has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/14347 )

Change subject: IMPALA-6501: Optimize count(star) for Kudu scans
......................................................................

IMPALA-6501: Optimize count(star) for Kudu scans

IMPALA-5036 added an optimization for count(star) in Parquet scans
that avoids materializing dummy rows. This change provides a similar
optimization for Kudu tables.

Instead of materializing empty rows when computing count(star), we use
the NumRows field from the Kudu API. The Kudu scanner tuple is
modified to have a single slot, into which we write the NumRows
statistic. The aggregate function is changed from count to a special
sum function that is initialized to 0.
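
The rewrite described above can be illustrated with a minimal sketch.
This is not the Impala code: ScanBatch, count_star_materializing and
count_star_optimized are hypothetical names used only for illustration,
and the batches stand in for whatever the Kudu scanner returns. The sum
presumably starts at 0 so that an empty scan still yields 0, as count
would, rather than NULL, as a plain SQL sum over no rows would.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ScanBatch:
    """Hypothetical stand-in for one batch returned by a Kudu scanner."""
    num_rows: int  # row count reported by the scanner for this batch


def count_star_materializing(batches: Iterable[ScanBatch]) -> int:
    """Old approach: produce one empty row per scanned row and count them."""
    count = 0
    for batch in batches:
        for _ in range(batch.num_rows):  # one dummy row per scanned row
            count += 1
    return count


def count_star_optimized(batches: Iterable[ScanBatch]) -> int:
    """New approach: one slot per batch holds the row count; sum the slots."""
    total = 0  # the sum is initialized to 0, not NULL
    for batch in batches:
        slot_value = batch.num_rows  # scanner writes the statistic into the slot
        total += slot_value
    return total
```

Both functions return the same result, but the second does work
proportional to the number of batches rather than the number of rows.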

Tests:
 * Added end-to-end tests
 * Added planner tests
 * Ran performance tests on the tpch.lineitem Kudu table at scale
   factor 25, on 1 node with mt_dop set to 1, to measure only the
   speedup gained when scanning. Counting the rows took around 400ms
   before the optimization and around 170ms after.

Change-Id: Ic99e0f954d0ca65779bd531ca79ace1fcb066fb9
---
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/kudu-scan-node-base.cc
M be/src/exec/kudu-scan-node-base.h
M be/src/exec/kudu-scanner.cc
M be/src/exec/kudu-scanner.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
A testdata/workloads/functional-planner/queries/PlannerTest/kudu-stats-agg.test
M testdata/workloads/functional-planner/queries/PlannerTest/parquet-stats-agg.test
M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
A testdata/workloads/functional-query/queries/QueryTest/kudu-stats-agg.test
M tests/query_test/test_aggregation.py
18 files changed, 580 insertions(+), 91 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/14347/10
--
To view, visit http://gerrit.cloudera.org:8080/14347
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic99e0f954d0ca65779bd531ca79ace1fcb066fb9
Gerrit-Change-Number: 14347
Gerrit-PatchSet: 10
Gerrit-Owner: Norbert Luksa <norbert.lu...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Norbert Luksa <norbert.lu...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
