[ 
https://issues.apache.org/jira/browse/TEZ-4075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16938265#comment-16938265
 ] 

Rajesh Balamohan commented on TEZ-4075:
---------------------------------------

Changes:

1. Extended IFile with a file-backed in-memory writer. Payload size 
calculation is based on the raw data in the IFile, before flipping to the 
disk-based writer. This can be enabled or disabled as needed. 
2. Optimized the local dir allocation path. If the payload is small, it 
never hits any disk code path.
3. Added test cases for the in-memory writer.

Note that the default behaviour is to use IFile, which saves fetch calls for 
small payloads. If needed, the in-memory writer can be enabled with 
`tez.runtime.transfer.data-via-events.support.in-mem.file`.
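
To illustrate the "in-memory first, flip to disk later" idea from items 1 and 2, 
here is a minimal sketch. It is not the patch's IFile writer: the class name 
`SpillableWriter`, its methods, and the threshold parameter are all hypothetical, 
and it only assumes that the raw payload size is tracked as data is written.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration; this is not the Tez IFile writer.
final class SpillableWriter implements AutoCloseable {
    private final int maxInMemBytes;   // assumed threshold, e.g. 512
    private ByteArrayOutputStream memory = new ByteArrayOutputStream();
    private OutputStream disk;         // stays null while data fits in memory
    private Path spillFile;            // stays null while data fits in memory
    private long rawBytes;             // raw bytes written so far

    SpillableWriter(int maxInMemBytes) {
        this.maxInMemBytes = maxInMemBytes;
    }

    void write(byte[] data) {
        try {
            if (disk == null && rawBytes + data.length > maxInMemBytes) {
                // Threshold exceeded: only now allocate a file and move the
                // buffered bytes over, so small payloads never touch disk.
                spillFile = Files.createTempFile("spill-sketch", ".bin");
                disk = Files.newOutputStream(spillFile);
                memory.writeTo(disk);
                memory = null;
            }
            if (disk != null) {
                disk.write(data);
            } else {
                memory.write(data, 0, data.length);
            }
            rawBytes += data.length;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    boolean isInMemory() {
        return disk == null;
    }

    byte[] inMemoryPayload() {
        if (disk != null) {
            throw new IllegalStateException("payload was spilled to " + spillFile);
        }
        return memory.toByteArray();
    }

    Path spillFile() {
        return spillFile;
    }

    @Override
    public void close() {
        try {
            if (disk != null) {
                disk.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a 512-byte threshold, a 93-byte payload stays in the buffer and can be 
handed out directly, while a larger one ends up in a temp file that is created 
only at the moment the threshold is crossed.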

> Tez: Reimplement tez.runtime.transfer.data-via-events.enabled
> -------------------------------------------------------------
>
>                 Key: TEZ-4075
>                 URL: https://issues.apache.org/jira/browse/TEZ-4075
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Gopal Vijayaraghavan
>            Assignee: Richard Zhang
>            Priority: Major
>         Attachments: TEZ-4075.10.patch, TEZ-4075.15.patch, Tez-4075.5.patch, 
> Tez-4075.8.patch
>
>
> This was factored out by TEZ-2196, which skips buffers for 1-partition 
> data exchanges (and therefore goes to disk directly).
> {code}
>     if (shufflePayload.hasData()) {
>       DataProto dataProto = shufflePayload.getData();
>       FetchedInput fetchedInput = inputAllocator.allocate(dataProto.getRawLength(),
>           dataProto.getCompressedLength(), srcAttemptIdentifier);
>       moveDataToFetchedInput(dataProto, fetchedInput, hostIdentifier);
>       shuffleManager.addCompletedInputWithData(srcAttemptIdentifier, fetchedInput);
>     } else {
>       shuffleManager.addKnownInput(shufflePayload.getHost(),
>           shufflePayload.getPort(), srcAttemptIdentifier, srcIndex);
>     }
> {code}
> got removed in 
> https://github.com/apache/tez/commit/1ba1f927c16a1d7c273b6cd1a8553e5269d1541a
> It would be better to buffer up to the 512-byte limit for the event size 
> before writing to disk, since creating a new file always incurs disk 
> traffic, even if the file is eventually served out of the buffer cache.
> The total overhead of receiving an event, then firing an HTTP call to fetch 
> the data, etc. adds approx. 100-150 ms to a query. Transferring the data 
> through the event skips the disk entirely and also removes the extra IOPS 
> incurred.
> This channel is not suitable for large-scale event transport, but the 
> workload here specifically deals with 1-row control tables, which consume 
> more bandwidth with HTTP headers and hostnames than with the 93-byte payload.
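
The producer-side decision described in the issue (ship tiny payloads inside 
the event, otherwise only advertise host and port for an HTTP fetch) could be 
sketched roughly as follows. All names here are hypothetical stand-ins; in Tez 
the event payload is a protobuf message, and the 512-byte limit is taken from 
the issue text rather than from any configuration default.

```java
import java.util.Arrays;

// Hypothetical names throughout; not the Tez event classes.
final class ShuffleEventSketch {
    // Limit quoted in the issue description; treated here as an assumption.
    static final int MAX_INLINE_BYTES = 512;

    final String host;
    final int port;
    private final byte[] data;  // null when the consumer has to fetch

    private ShuffleEventSketch(String host, int port, byte[] data) {
        this.host = host;
        this.port = port;
        this.data = data;
    }

    // Producer side: attach the payload only if it fits under the limit.
    static ShuffleEventSketch forOutput(String host, int port, byte[] payload) {
        byte[] inline = payload.length <= MAX_INLINE_BYTES
                ? Arrays.copyOf(payload, payload.length)
                : null;
        return new ShuffleEventSketch(host, port, inline);
    }

    boolean hasData() {
        return data != null;
    }

    // Consumer side: mirrors the hasData() branch in the quoted snippet.
    String describeDelivery() {
        return hasData()
                ? "inline:" + data.length
                : "fetch:http://" + host + ":" + port;
    }
}
```

For a 93-byte control-table row the consumer takes the inline branch and never 
issues the HTTP call; anything above the limit falls back to the regular fetch.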



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
