[ 
https://issues.apache.org/jira/browse/TEZ-4075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941155#comment-16941155
 ] 

Siddharth Seth commented on TEZ-4075:
-------------------------------------

A few more comments. Please evaluate on value of the changes.
 * Rename the counter to DATA_VIA_EVENT_BYTES (to clarify the unit) ?
 * Ignore the previous comment about MEMORY vs DISK
 * UnorderedPartitionedKVWriter.cachedStream - Let this be handled by 
IFile.FileBackedInMemIFileWriter itself? Doesn't need to be exposed in 
UPKVWriter
 * IFile.FileBackedInMemFileWriter - this.totalSize = HEADER.length + 1; - Can 
this be picked up from the stream itself - currentPos (in case logic changes in 
IFile)
 * writeKVPair -> 'totalSize =' or 'totalSize +='
 * writeKVPair -> this uses keyData.length, valueData.length. IFile.Writer uses 
keyLength, valueLenght. Not sure how easy it is to move these size computations 
to a separate method so that they can be re-used here (again, in case logic 
changes in IFile.Writer)
 * cachedStream could use DataOutputBuffer - to avoid the array copy in 
java.io.ByteArrayOutputStream (Probably not a big deal though since this is 
expected to be a small allocation)
 * On resetToFileBasedWriter
 ** If I'm not mistaken, the compression stream can end up buffering data, so 
trying to copy the raw stream without closing the compression stream can lead 
to errors.
 ** The format seems to be - "Header to Raw Stream, Compressed Data, Checksum 
to Raw Stream". Closing the compressed stream, and then copying data, and then 
removing the checksum may help here. Assuming the force close on the 
CompressionStream, followed by additional compressed data will not cause errors 
(just potential inefficiencies)
** Alternately - The InMemStream could always disable compression (the size 
calculation is based on uncompressed size anyway). Compression gets enabled on 
the FileBasedStream - which allows for regular compression buffers etc to kick 
in.
** May be worthwhile to run the tests with and without compression enabled 
(maybe a couple of codecs).

> Tez: Reimplement tez.runtime.transfer.data-via-events.enabled
> -------------------------------------------------------------
>
>                 Key: TEZ-4075
>                 URL: https://issues.apache.org/jira/browse/TEZ-4075
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Gopal Vijayaraghavan
>            Assignee: Richard Zhang
>            Priority: Major
>         Attachments: TEZ-4075.10.patch, TEZ-4075.15.patch, TEZ-4075.16.patch, 
> Tez-4075.5.patch, Tez-4075.8.patch
>
>
> This was factored out by TEZ-2196, which does skip buffers for 1-partition 
> data exchanges (therefore goes to disk directly).
> {code}
>     if (shufflePayload.hasData()) {       
> shuffleManager.addKnownInput(shufflePayload.getHost(),
>       DataProto dataProto = shufflePayload.getData();         
> shufflePayload.getPort(), srcAttemptIdentifier, srcIndex);
>       FetchedInput fetchedInput = 
> inputAllocator.allocate(dataProto.getRawLength(),   
>           dataProto.getCompressedLength(), srcAttemptIdentifier);     
>       moveDataToFetchedInput(dataProto, fetchedInput, hostIdentifier);        
>       shuffleManager.addCompletedInputWithData(srcAttemptIdentifier, 
> fetchedInput);   
>     } else {  
>       shuffleManager.addKnownInput(shufflePayload.getHost(),  
>           shufflePayload.getPort(), srcAttemptIdentifier, srcIndex);  
>     } 
> {code}
> got removed in 
> https://github.com/apache/tez/commit/1ba1f927c16a1d7c273b6cd1a8553e5269d1541a
> It would be better to buffer up the 512Byte limit for the event size before 
> writing to disk, since creating a new file always incurs disk traffic, even 
> if the file is eventually being served out of the buffer cache.
> The total overhead of receiving an event, then firing an HTTP call to fetch 
> the data etc adds approx 100-150ms to a query - the data xfer through the 
> event will skip the disk entirely for this & also remove the extra IOPS 
> incurred.
> This channel is not suitable for large-scale event transport, but 
> specifically the workload here deals with 1-row control tables which consume 
> more bandwidth with HTTP headers and hostnames than the 93 byte payload.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to