[jira] [Commented] (TEZ-4075) Tez: Reimplement tez.runtime.transfer.data-via-events.enabled

Siddharth Seth (Jira) Fri, 04 Oct 2019 19:20:59 -0700


    [ 
https://issues.apache.org/jira/browse/TEZ-4075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16944957#comment-16944957
 ]


Siddharth Seth commented on TEZ-4075:
-------------------------------------

bq. Instead of using compressed stream, tried using uncompressed stream (as the 
data checks are done in uncompressed manner anyways). But this also leads to a 
corner case issue, as FSD*Stream is backed by buffered stream of 4096 array. It 
hits the limit during buffer close (i.e., at the time of out.close, it tries to 
flush its contents and causes overflow). Along with this, we need to fix file 
close markers and comp/decom bytes counters during corner case scenarios.
OK. That is weird. I would've thought the checks that are being done on the 
uncompressed size would have been adequate to make sure there is no overflow. 
Which stream ended up with a buffer overflow?
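
As a rough illustration of the corner case quoted above (not the Tez code path itself), the sketch below wraps a size-limited sink in a java.io.BufferedOutputStream with a 4096-byte buffer. BoundedSink and overflowsOnClose are hypothetical names made up for this example. The write is smaller than the buffer, so nothing reaches the sink until close() flushes it, and only then does the limit get hit.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class BufferedCloseOverflow {

  /** Hypothetical sink that rejects any byte beyond a fixed capacity. */
  static final class BoundedSink extends OutputStream {
    private final byte[] data;
    private int count;

    BoundedSink(int capacity) {
      data = new byte[capacity];
    }

    @Override
    public void write(int b) throws IOException {
      if (count >= data.length) {
        throw new IOException("sink overflow after " + count + " bytes");
      }
      data[count++] = (byte) b;
    }

    int size() {
      return count;
    }
  }

  /**
   * Returns true only if write() succeeded without touching the sink and the
   * overflow surfaced later, during close().
   */
  static boolean overflowsOnClose(int sinkCapacity, int bytesToWrite) {
    BoundedSink sink = new BoundedSink(sinkCapacity);
    OutputStream out = new BufferedOutputStream(sink, 4096);
    try {
      // Anything smaller than 4096 bytes stays in the buffer for now.
      out.write(new byte[bytesToWrite]);
    } catch (IOException e) {
      return false; // failed early, at write time
    }
    if (sink.size() != 0) {
      return false; // data already reached the sink before close
    }
    try {
      out.close(); // close() flushes the buffered bytes into the sink
      return false;
    } catch (IOException e) {
      return true; // the limit was only hit while flushing on close
    }
  }

  public static void main(String[] args) {
    System.out.println("600 bytes into a 512-byte sink: " + overflowsOnClose(512, 600));
    System.out.println("100 bytes into a 512-byte sink: " + overflowsOnClose(512, 100));
  }
}
```

A size check made before or during write() would not see this failure, since the bytes are still sitting in the intermediate buffer at that point.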

+1 for splitting into 2 patches, and +1 for the patch (with a minor comment)
It looks like Precommit isn't running properly on GitHub. Will have to upload the 
patch here as well to get precommit tests to run.

> Tez: Reimplement tez.runtime.transfer.data-via-events.enabled
> -------------------------------------------------------------
>
>                 Key: TEZ-4075
>                 URL: https://issues.apache.org/jira/browse/TEZ-4075
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Gopal Vijayaraghavan
>            Assignee: Richard Zhang
>            Priority: Major
>         Attachments: TEZ-4075.10.patch, TEZ-4075.15.patch, TEZ-4075.16.patch, 
> Tez-4075.5.patch, Tez-4075.8.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> This was factored out by TEZ-2196, which does skip buffers for 1-partition 
> data exchanges (therefore goes to disk directly).
> {code}
>     if (shufflePayload.hasData()) {
>       DataProto dataProto = shufflePayload.getData();
>       FetchedInput fetchedInput = inputAllocator.allocate(dataProto.getRawLength(),
>           dataProto.getCompressedLength(), srcAttemptIdentifier);
>       moveDataToFetchedInput(dataProto, fetchedInput, hostIdentifier);
>       shuffleManager.addCompletedInputWithData(srcAttemptIdentifier, fetchedInput);
>     } else {
>       shuffleManager.addKnownInput(shufflePayload.getHost(),
>           shufflePayload.getPort(), srcAttemptIdentifier, srcIndex);
>     }
> {code}
> got removed in 
> https://github.com/apache/tez/commit/1ba1f927c16a1d7c273b6cd1a8553e5269d1541a
> It would be better to buffer up to the 512-byte limit for the event size before 
> writing to disk, since creating a new file always incurs disk traffic, even 
> if the file is eventually being served out of the buffer cache.
> The total overhead of receiving an event, then firing an HTTP call to fetch 
> the data, etc. adds approximately 100-150ms to a query. Transferring the data 
> through the event skips the disk entirely for this and also removes the extra 
> IOPS incurred.
> This channel is not suitable for large-scale event transport, but 
> specifically the workload here deals with 1-row control tables which consume 
> more bandwidth with HTTP headers and hostnames than the 93 byte payload.
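
The producer-side decision described above can be sketched roughly as follows. This is a minimal illustration, not the Tez implementation: ShuffleEvent, buildEvent, consume and MAX_INLINE_BYTES are hypothetical names, and treating the 512-byte limit as inclusive is an assumption. Only the limit itself and the hasData() branch come from the description.

```java
import java.util.Arrays;

public class InlineEventPayloadSketch {

  // Event size limit mentioned in the description; inclusive bound is an assumption.
  static final int MAX_INLINE_BYTES = 512;

  /** Hypothetical stand-in for the shuffle payload carried by an event. */
  static final class ShuffleEvent {
    final String host;
    final int port;
    private final byte[] inlineData; // null when the consumer must fetch over HTTP

    ShuffleEvent(String host, int port, byte[] inlineData) {
      this.host = host;
      this.port = port;
      this.inlineData = inlineData;
    }

    boolean hasData() {
      return inlineData != null;
    }

    byte[] getData() {
      return inlineData == null ? null : Arrays.copyOf(inlineData, inlineData.length);
    }
  }

  /** Producer side: attach small outputs to the event, otherwise only advertise host and port. */
  static ShuffleEvent buildEvent(String host, int port, byte[] output) {
    if (output.length <= MAX_INLINE_BYTES) {
      return new ShuffleEvent(host, port, Arrays.copyOf(output, output.length));
    }
    return new ShuffleEvent(host, port, null);
  }

  /** Consumer side: mirrors the hasData() branch in the snippet above. */
  static String consume(ShuffleEvent event) {
    if (event.hasData()) {
      return "inline:" + event.getData().length; // no disk write, no HTTP fetch
    }
    return "fetch:" + event.host + ":" + event.port; // fall back to the normal fetch path
  }

  public static void main(String[] args) {
    System.out.println(consume(buildEvent("node1", 13562, new byte[93])));
    System.out.println(consume(buildEvent("node1", 13562, new byte[4096])));
  }
}
```

With a payload like the 93-byte control-table row mentioned above, the consumer takes the inline branch; anything over the limit falls back to the host/port fetch.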



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
