[ 
https://issues.apache.org/jira/browse/IMPALA-13509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895391#comment-17895391
 ] 

Csaba Ringhofer commented on IMPALA-13509:
------------------------------------------

https://gerrit.cloudera.org/#/c/21932/

> Avoid duplicate deepcopy during hash partitioning in KrpcDataStreamSender
> -------------------------------------------------------------------------
>
>                 Key: IMPALA-13509
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13509
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>            Reporter: Csaba Ringhofer
>            Assignee: Csaba Ringhofer
>            Priority: Critical
>              Labels: performance
>
> Currently all rows are deep copied twice:
> 1. to the RowBatch of the given channel
> 2. to an OutboundRowBatch when the collector RowBatch is at capacity
> Copying directly to an OutboundRowBatch could avoid some CPU work.
> This would also allow easier implementation of the following improvements:
> - deduplicate tuples similarly to broadcast/unpartitioned exchange 
> (IMPALA-13225).
> - keep outbound row batch size below data_stream_sender_buffer_size even for 
> var len data 
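
As an illustration of the description above, here is a minimal toy sketch of the two copy paths. It is not Impala code: the Channel class, its fields, and the crc32-based partitioning are hypothetical stand-ins for the per-channel RowBatch, the OutboundRowBatch, and the sender's hash function. It only counts how many bytes each strategy copies.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Channel:
    """Toy model of one hash-partition destination (hypothetical, not Impala's class)."""
    capacity: int                                            # rows per batch before sending
    collector: list = field(default_factory=list)            # stands in for the channel RowBatch
    outbound: bytearray = field(default_factory=bytearray)   # stands in for OutboundRowBatch
    outbound_rows: int = 0
    sent: list = field(default_factory=list)                 # payloads handed to the "network"
    bytes_copied: int = 0                                    # bytes deep copied so far

    def _append_outbound(self, row):
        # Copy the row's bytes into the outbound buffer and count them.
        self.outbound += row
        self.outbound_rows += 1
        self.bytes_copied += len(row)

    def _send(self):
        # Hand the outbound buffer over (not counted as a deep copy here).
        if self.outbound_rows:
            self.sent.append(bytes(self.outbound))
            self.outbound.clear()
            self.outbound_rows = 0

    def add_row_two_copies(self, row):
        # Copy 1: into the collector batch of this channel.
        self.collector.append(bytes(row))
        self.bytes_copied += len(row)
        if len(self.collector) == self.capacity:
            self.flush()

    def add_row_direct(self, row):
        # Single copy: straight into the outbound buffer.
        self._append_outbound(row)
        if self.outbound_rows == self.capacity:
            self._send()

    def flush(self):
        # Copy 2 (two-copy path only): collector rows into the outbound buffer.
        for r in self.collector:
            self._append_outbound(r)
        self.collector.clear()
        self._send()


def partition(rows, num_channels, capacity, direct):
    """Hash-partition rows across channels using one of the two strategies."""
    channels = [Channel(capacity) for _ in range(num_channels)]
    for row in rows:
        ch = channels[zlib.crc32(row) % num_channels]
        if direct:
            ch.add_row_direct(row)
        else:
            ch.add_row_two_copies(row)
    for ch in channels:
        ch.flush()
    return channels
```

In this toy model both strategies send identical payloads, but the two-copy path touches every byte twice. Because the direct path appends to the outbound buffer row by row, it could also check the buffer's byte size before each append, which is where a size limit for variable-length data would naturally fit.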



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
