[ 
https://issues.apache.org/jira/browse/IMPALA-12377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenzhe Zhou updated IMPALA-12377:
---------------------------------
    Summary: Improve 'select count(*)' performance for external data source  
(was: Improve 'select count(*)' for external data source)

> Improve 'select count(*)' performance for external data source
> --------------------------------------------------------------
>
>                 Key: IMPALA-12377
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12377
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend, Frontend
>            Reporter: Wenzhe Zhou
>            Assignee: Wenzhe Zhou
>            Priority: Major
>
> The code that handles 'select count(*)' in the backend function 
> DataSourceScanNode::GetNext() is not efficient. Even when no column data 
> is returned from the external data source, it still tries to materialize 
> rows and adds them to the RowBatch one by one, up to the row count. It also 
> calls GetNextInputBatch() multiple times (count / batch_size), and each 
> GetNextInputBatch() call invokes a JNI function.
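
To make the cost described above concrete, here is a rough Python simulation.
It is not Impala code: FakeDataSource, get_next_input_batch(), get_total_count(),
count_row_by_row() and count_in_bulk() are hypothetical names invented for this
sketch, and the bulk path is only one assumed way the work could be reduced,
not necessarily what the actual change does.

```python
class FakeDataSource:
    """Hypothetical stand-in for an external data source reached over JNI."""

    def __init__(self, total_rows, batch_size):
        self.remaining = total_rows
        self.batch_size = batch_size
        self.calls = 0  # simulated JNI round trips

    def get_next_input_batch(self):
        """Return the number of rows in the next batch (no column data)."""
        self.calls += 1
        n = min(self.batch_size, self.remaining)
        self.remaining -= n
        return n

    def get_total_count(self):
        """Assumed alternative: report all remaining rows in one round trip."""
        self.calls += 1
        n, self.remaining = self.remaining, 0
        return n


def count_row_by_row(source):
    """Pattern described in the issue: one row append per counted row."""
    row_appends = 0
    while source.remaining > 0:
        rows_in_batch = source.get_next_input_batch()
        for _ in range(rows_in_batch):
            row_appends += 1  # stands in for materializing and adding a row
    return row_appends, row_appends, source.calls


def count_in_bulk(source):
    """Assumed alternative: take the count directly, append no rows."""
    count = source.get_total_count()
    return count, 0, source.calls


if __name__ == "__main__":
    # Each result is (count, row appends, simulated JNI calls).
    print(count_row_by_row(FakeDataSource(10000, 1024)))
    print(count_in_bulk(FakeDataSource(10000, 1024)))
```

In this toy model the row-by-row path does work proportional to the row count
plus ceil(count / batch_size) round trips, while the bulk path does a constant
amount of work regardless of the count.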



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

