Wenzhe Zhou has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/20653 )
Change subject: IMPALA-12377: Improve count(*) performance for jdbc external table
......................................................................

IMPALA-12377: Improve count(*) performance for jdbc external table

The backend function DataSourceScanNode::GetNext() handles count
queries inefficiently. Even when no column data is returned from the
external data source, it still tries to materialize rows and adds them
to the RowBatch one by one, up to the row count. It also calls
GetNextInputBatch() multiple times (count / batch_size), and each
GetNextInputBatch() call invokes a JNI function in the external data
source.

This patch improves DataSourceScanNode::GetNext() and
JdbcDataSource.getNext() to avoid unnecessary function calls.

Testing:
 - Ran query_test/test_ext_data_sources.py, which includes count
   queries for jdbc external tables.
 - Passed core-tests.

Change-Id: I9953dca949eb773022f1d6dcf48d8877857635d6
---
M be/src/exec/data-source-scan-node.cc
M java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java
2 files changed, 32 insertions(+), 24 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/20653/5
-- 
To view, visit http://gerrit.cloudera.org:8080/20653
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9953dca949eb773022f1d6dcf48d8877857635d6
Gerrit-Change-Number: 20653
Gerrit-PatchSet: 5
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ara...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <gsi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Yifan Zhang <chinazhangyi...@163.com>
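
For illustration only, the cost difference described in the commit message can be sketched with a toy model. This is not the Impala code: FakeSource, NextBatch(), NextAll(), CountPerRow() and CountBulk() are hypothetical names, and the `calls` counter merely stands in for JNI round trips. It assumes the source can report the whole remaining count in one call.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for an external data source that returns no
// column data, only a number of rows (as in a count(*) query).
struct FakeSource {
  int64_t remaining;   // rows the source still has to report
  int64_t batch_size;  // max rows reported per call on the per-batch path
  int calls;           // simulated JNI round trips made so far

  // Per-batch path: reports at most batch_size rows per call.
  int64_t NextBatch() {
    ++calls;
    int64_t n = std::min(remaining, batch_size);
    remaining -= n;
    return n;
  }

  // Bulk path: reports every remaining row in a single call.
  int64_t NextAll() {
    ++calls;
    int64_t n = remaining;
    remaining = 0;
    return n;
  }
};

// Inefficient shape: one call per batch, then one loop step per row.
int64_t CountPerRow(FakeSource& src) {
  int64_t total = 0;
  while (src.remaining > 0) {
    int64_t n = src.NextBatch();
    for (int64_t i = 0; i < n; ++i) ++total;  // "adds" rows one by one
  }
  return total;
}

// Improved shape: take the reported count and add it in one step.
int64_t CountBulk(FakeSource& src) {
  int64_t total = 0;
  while (src.remaining > 0) total += src.NextAll();
  return total;
}
```

With 10000 rows and a batch size of 1024, the per-row version in this toy model makes 10 source calls and 10000 loop steps, while the bulk version makes a single call.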