[ 
https://issues.apache.org/jira/browse/TEZ-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saikat updated TEZ-2575:
------------------------
    Attachment: TEZ-2575.3.patch

Changes:
1. In sort(), sort and spill only non-empty spans.
2. Combined the single-record spill and pipeline shuffle events.
3. Check the recursion depth or span.length() before calling spillSingleRecord().
4. Added a generic test case method to handle variable-size KV pairs.

> Handle KeyValue pairs size which do not fit in a single block
> -------------------------------------------------------------
>
>                 Key: TEZ-2575
>                 URL: https://issues.apache.org/jira/browse/TEZ-2575
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Saikat
>            Assignee: Saikat
>         Attachments: TEZ-2575.1.patch, TEZ-2575.2.patch, TEZ-2575.3.patch, 
> TEZ-2575.patch
>
>
> In the present implementation, the available buffer is divided into blocks 
> (specified in the constructor for pipeline sort), and a linked list of these 
> block byte buffers is maintained. 
> A span is created out of the buffers. 
> The present logic does not handle the scenario where a single key-value pair 
> does not fit into any of the blocks.
> For example, if 1 MB of total memory is divided into 4 blocks (256 KB each) 
> and a single KV pair is larger than the block size (ignoring metadata size), 
> then it fails with buffer exceptions.
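
The failure mode described above can be reproduced in isolation with a plain
java.nio.ByteBuffer. The sketch below is not Tez code: the class
BlockBufferSketch and its helpers blockSize() and tryWrite() are hypothetical
names used only for illustration, and it assumes the blocks are equal-sized
heap byte buffers. It shows only the arithmetic from the example (1 MB split
into 4 blocks of 256 KB) and the overflow that occurs when one record is
larger than a block.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

public class BlockBufferSketch {

    // Hypothetical helper: size of each block when the total buffer
    // is split evenly into numBlocks blocks.
    static int blockSize(int totalBytes, int numBlocks) {
        return totalBytes / numBlocks;
    }

    // Tries to write one record into a fresh, empty block of blockBytes.
    // Returns false if the record does not fit (ByteBuffer.put throws
    // BufferOverflowException when remaining space is insufficient).
    static boolean tryWrite(byte[] record, int blockBytes) {
        ByteBuffer block = ByteBuffer.allocate(blockBytes);
        try {
            block.put(record);
            return true;
        } catch (BufferOverflowException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int block = blockSize(1 << 20, 4);    // 1 MB / 4 = 262144 bytes (256 KB)
        byte[] small = new byte[100 * 1024];  // fits in one block
        byte[] large = new byte[300 * 1024];  // larger than any single block

        System.out.println("block size = " + block);
        System.out.println("100 KB record written: " + tryWrite(small, block));
        System.out.println("300 KB record written: " + tryWrite(large, block));
    }
}
```

In this sketch the 300 KB record is rejected even though the total buffer
(1 MB) is large enough to hold it, which is the case the issue asks the
sorter to handle.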



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
