[ https://issues.apache.org/jira/browse/HDFS-17293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17808923#comment-17808923 ]

ASF GitHub Bot commented on HDFS-17293:
---------------------------------------

hfutatzhanghb commented on PR #6368:
URL: https://github.com/apache/hadoop/pull/6368#issuecomment-1902139525

   > This PR has corrected the size of the first packet in a new block, which
   > is great. However, because of a pre-existing logic problem in
   > `adjustChunkBoundary`, the calculation of the size of the last packet in
   > a block is still wrong, and I think we need a new PR to solve it.
   >
   > https://github.com/apache/hadoop/blob/27ecc23ae7c5cafba6a5ea58d4a68d25bd7507dd/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java#L531-L543
   >
   > At line 540, when we pass `blockSize - getStreamer().getBytesCurBlock()`
   > to `computePacketChunkSize` as the first parameter,
   > `computePacketChunkSize` is likely to split data that could have been
   > sent in one packet into two packets.
   
   Sir, very nice catch. I think the code below may resolve the problem you
   found. Please take a look~ I will submit another PR to fix it and add a UT.
   
   ```java
       if (!getStreamer().getAppendChunk()) {
         int psize = 0;
         if (blockSize == getStreamer().getBytesCurBlock()) {
           // Block is exactly full: the next packet starts a fresh block,
           // so use the configured packet size.
           psize = writePacketSize;
         } else if (blockSize - getStreamer().getBytesCurBlock()
             + PacketHeader.PKT_MAX_HEADER_LEN < writePacketSize) {
           // Near the end of the block: add the header length back, since
           // computePacketChunkSize subtracts it again, so the packet body
           // can hold all remaining bytes instead of splitting them.
           psize = (int) (blockSize - getStreamer().getBytesCurBlock())
               + PacketHeader.PKT_MAX_HEADER_LEN;
         } else {
           psize = (int) Math.min(
               blockSize - getStreamer().getBytesCurBlock(), writePacketSize);
         }
         computePacketChunkSize(psize, bytesPerChecksum);
       }
   ```
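
   For intuition, here is a minimal, self-contained sketch of the arithmetic
   inside `computePacketChunkSize` (paraphrased, not the exact source; the
   33-byte `PKT_MAX_HEADER_LEN` and 4-byte checksum size are assumptions),
   showing why passing the raw remaining bytes can undershoot by one chunk:

   ```java
   // Sketch only: paraphrases the packet arithmetic in DFSOutputStream.
   public class PacketSizeSketch {
     static final int PKT_MAX_HEADER_LEN = 33; // assumed max packet header size
     static final int CHECKSUM_SIZE = 4;       // assumed CRC32C checksum size

     // Mirrors computePacketChunkSize: psize is the whole packet budget, the
     // header is subtracted, and full (data + checksum) chunks fill the rest.
     static int chunksPerPacket(int psize, int bytesPerChecksum) {
       int bodySize = psize - PKT_MAX_HEADER_LEN;
       int chunkSize = bytesPerChecksum + CHECKSUM_SIZE; // 512 + 4 = 516
       return Math.max(bodySize / chunkSize, 1);
     }

     public static void main(String[] args) {
       // Suppose exactly two 516-byte chunks (1032 bytes) remain to be sent.
       int remaining = 1032;
       // Passing the remaining bytes directly subtracts the header a second
       // time, so only one chunk fits and the data is split into two packets.
       System.out.println(chunksPerPacket(remaining, 512));                      // 1
       // Adding PKT_MAX_HEADER_LEN back restores the full body: one packet.
       System.out.println(chunksPerPacket(remaining + PKT_MAX_HEADER_LEN, 512)); // 2
     }
   }
   ```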




> First packet data + checksum size will be set to 516 bytes when writing to a 
> new block.
> ---------------------------------------------------------------------------------------
>
>                 Key: HDFS-17293
>                 URL: https://issues.apache.org/jira/browse/HDFS-17293
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.3.6
>            Reporter: farmmamba
>            Assignee: farmmamba
>            Priority: Major
>              Labels: pull-request-available
>
> First packet size will be set to 516 bytes when writing to a new block.
> In the method computePacketChunkSize, the parameters psize and csize would be
> (0, 512) when writing to a new block. It would be better to use
> writePacketSize.
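
A quick walk through that (0, 512) case, under the same assumed constants as
the sketch above (33-byte max packet header, 4-byte checksum), shows where the
516 bytes in the title come from; this is a sketch of the arithmetic, not the
exact source:

```java
// Sketch: computePacketChunkSize(0, 512) at the start of a new block.
// bodySize        = 0 - 33            = -33
// chunkSize       = 512 + 4           = 516
// chunksPerPacket = max(-33 / 516, 1) = 1     // clamped to one chunk
// packetSize      = 516 * 1           = 516   // the 516-byte first packet
```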



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
