[jira] [Resolved] (IMPALA-1580) Optimize conversion of row batch to query result set

2019-09-16 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-1580.
---
Resolution: Duplicate

> Optimize conversion of row batch to query result set
> 
>
> Key: IMPALA-1580
> URL: https://issues.apache.org/jira/browse/IMPALA-1580
> Project: IMPALA
> Issue Type: Improvement
> Components: Perf Investigation
> Affects Versions: Impala 2.0.1
> Reporter: casey
> Priority: Minor
> Labels: performance, ramp-up
> Attachments: select_lineitem.profile
>
>
> For simple queries that produce a large result set, such as "select * from 
> tpch.lineitem", the server execution time is limited by the time required to 
> convert row batches (results in the internal structure) to query results (the 
> structure to be sent to the client). The data conversion is the limiting 
> factor in this case because the query plan execution happens in parallel.
> Here are some data points from the profile of "select * from tpch.lineitem" 
> using HS2 (this was taken using --exchg_node_buffer_size_bytes=2048576000 so 
> the exchange node would never block because of a full buffer). Beeswax takes 
> even longer to convert the rows.
> * Query Timeline: 1m9s
> * Execution Profile -- Total: 1s295ms
> * ClientFetchWaitTimer: 52s553ms
> * RowMaterializationTimer: 15s216ms
> * Coordinator Fragment F01: (Total: 1s092ms)
> * Averaged Fragment F00: (Total: 5s608ms)
> So the "RowMaterializationTimer", which is actually conversion time, adds ~9 
> seconds or ~2x the plan execution time to the overall time.
> Ideally the conversion time would be codegen'd but even without that there 
> should be a lot of room for improvement by reducing function calls.
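
The "~9 seconds or ~2x" estimate above can be checked with a short calculation. This is only a sketch under an assumption: it reads the estimate as RowMaterializationTimer minus the Averaged Fragment F00 total, which the report does not state explicitly. The helper name parse_duration is hypothetical and is not part of Impala.

```python
import re


def parse_duration(text):
    """Convert a profile-style duration such as '1m9s' or '15s216ms' to seconds."""
    units = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}
    # 'ms' is listed first so that '216ms' is not read as minutes.
    parts = re.findall(r"(\d+)(ms|h|m|s)", text)
    return sum(int(value) * units[unit] for value, unit in parts)


# Values copied from the profile data points in the report.
materialization = parse_duration("15s216ms")  # RowMaterializationTimer
scan_fragment = parse_duration("5s608ms")     # Averaged Fragment F00 total

# Assumption: the "extra" time is conversion time beyond plan execution time.
extra = materialization - scan_fragment
ratio = extra / scan_fragment
print(f"extra: {extra:.3f}s, ratio: {ratio:.2f}x")
```

Under that reading, the extra time is a little over 9.5 seconds, and the ratio to the fragment time lands between 1.5x and 2x, which is consistent with the rounded figures in the report.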



--
This message was sent by Atlassian Jira
(v8.3.2#803003)



