[ 
https://issues.apache.org/jira/browse/HDFS-14820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17050517#comment-17050517
 ] 

Wei-Chiu Chuang commented on HDFS-14820:
----------------------------------------

In the current implementation, the DFS client sends a request (which is short) 
to the DataNode asking for a block, using an output stream. After that, the 
client receives the block data from the DataNode (which can be several MBs 
long) using an input stream.

This patch changes the buffer size of the former, the output stream. There is 
absolutely no reason to use an 8 KB buffer for this stream. For the input 
stream, yes, what [~eyang] says makes sense.

{code}
OpReadBlockProto proto = OpReadBlockProto.newBuilder()
        .setHeader(DataTransferProtoUtil.buildClientHeader(blk, clientName,
            blockToken))
        .setOffset(blockOffset)
        .setLen(length)
        .setSendChecksums(sendChecksum)
        .setCachingStrategy(getCachingStrategy(cachingStrategy))
        .build();
{code}

Also note that the stream objects are not recycled: each block gets its own 
output/input stream pair.
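For illustration, the effect can be sketched with plain java.io streams. The 
opcode, version, and payload bytes below are placeholders, not the real wire 
format, and the 512-byte size is only an example, not the value from the patch; 
the point is that a read-block request is tens of bytes, so the 8192-byte 
BufferedOutputStream default is wasted per-stream allocation when one stream 
serves exactly one request:

{code:java}
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class RequestBufferSketch {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        // Explicit small buffer instead of the 8192-byte default.
        DataOutputStream out = new DataOutputStream(
            new BufferedOutputStream(wire, 512));
        out.writeShort(28);       // placeholder: protocol version
        out.writeByte(81);        // placeholder: opcode
        out.write(new byte[64]);  // placeholder: serialized request proto
        out.flush();
        // The whole request fits in the small buffer with room to spare.
        System.out.println(wire.size()); // prints 67 (2 + 1 + 64 bytes)
    }
}
{code}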

>  The default 8KB buffer of 
> BlockReaderRemote#newBlockReader#BufferedOutputStream is too big
> -------------------------------------------------------------------------------------------
>
>                 Key: HDFS-14820
>                 URL: https://issues.apache.org/jira/browse/HDFS-14820
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Lisheng Sun
>            Assignee: Lisheng Sun
>            Priority: Major
>         Attachments: HDFS-14820.001.patch, HDFS-14820.002.patch, 
> HDFS-14820.003.patch
>
>
> This issue is similar to HDFS-14535.
> {code:java}
> public static BlockReader newBlockReader(String file,
>     ExtendedBlock block,
>     Token<BlockTokenIdentifier> blockToken,
>     long startOffset, long len,
>     boolean verifyChecksum,
>     String clientName,
>     Peer peer, DatanodeID datanodeID,
>     PeerCache peerCache,
>     CachingStrategy cachingStrategy,
>     int networkDistance) throws IOException {
>   // in and out will be closed when sock is closed (by the caller)
>   final DataOutputStream out = new DataOutputStream(new BufferedOutputStream(
>       peer.getOutputStream()));
>   new Sender(out).readBlock(block, blockToken, clientName, startOffset, len,
>       verifyChecksum, cachingStrategy);
> }
> public BufferedOutputStream(OutputStream out) {
>     this(out, 8192);
> }
> {code}
> The Sender#readBlock parameters (block, blockToken, clientName, startOffset, 
> len, verifyChecksum, cachingStrategy) do not need such a big buffer, so I 
> think the BufferedOutputStream buffer size should be reduced.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
