[jira] [Resolved] (IMPALA-1580) Optimize conversion of row batch to query result set
[ https://issues.apache.org/jira/browse/IMPALA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong resolved IMPALA-1580.
---
Resolution: Duplicate

> Optimize conversion of row batch to query result set
>
>                 Key: IMPALA-1580
>                 URL: https://issues.apache.org/jira/browse/IMPALA-1580
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Perf Investigation
>   Affects Versions: Impala 2.0.1
>            Reporter: casey
>            Priority: Minor
>              Labels: performance, ramp-up
>         Attachments: select_lineitem.profile
>
>
> For simple queries that produce a large result set such as "select * from
> tpch.lineitem" the server execution time is limited by the time required to
> convert row batches (results in the internal structure) to query results (the
> structure to be sent to the client). The data conversion is the limiting
> factor in this case because the query plan execution happens in parallel.
> Here are some data points from the profile of "select * from tpch.lineitem"
> using HS2 (this was taken using --exchg_node_buffer_size_bytes=2048576000 so
> the exchange node would never block because of a full buffer.). Beeswax takes
> even longer to convert the rows.
> * Query Timeline: 1m9s
> * Execution Profile -- Total: 1s295ms
> * ClientFetchWaitTimer: 52s553ms
> * RowMaterializationTimer: 15s216ms
> * Coordinator Fragment F01:(Total: 1s092ms
> * Averaged Fragment F00:(Total: 5s608ms
> So the "RowMaterializationTimer", which is actually conversion time, adds ~9
> seconds or ~2x the plan execution time to the overall time.
> Ideally the conversion time would be codegen'd but even without that there
> should be a lot of room for improvement by reducing function calls.

--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org