[ https://issues.apache.org/jira/browse/HDFS-16970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17791991#comment-17791991 ]
chan commented on HDFS-16970:
-----------------------------

Can you provide your Hadoop version? In the newest version, [StripeReader|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/StripeReader.java] clears the read buffer, so it will not keep the data read before the timeout.

> EC: client copy wrong buffer from decode output during pread
> ------------------------------------------------------------
>
>                 Key: HDFS-16970
>                 URL: https://issues.apache.org/jira/browse/HDFS-16970
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: dfsclient, ec, erasure-coding
>    Affects Versions: 3.3.4
>            Reporter: MingHui Luo
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 3.4.0, 3.3.5, 3.2.5
>
>
> When DFSStripedInputStream performs a pread from a striped block group and the read of an internal block times out, it reads the parity blocks, decodes, and fills the original chunk buffer with the decoded data.
> At this point the client tries to fill the original chunk buffer with the decoded data, but gets wrong data. The reason is that:
> 1. The original chunk buffer had already read some bytes from the blockReader before the timeout.
> 2. The chunk ByteBuffer's slice is always filled starting from position 0 of the decodeByteBuffer.
> The sliced ByteBuffer is therefore filled from the wrong decodeByteBuffer position, so pread returns wrong data.
> {code:java}
> 23/03/21 06:31:11 WARN [StripedRead-24] DFSClient: Exception while reading from BP-xxx:blk_-9xxx_xxx of file_xxx from DatanodeInfoWithStorage[10.xxx.xx.xx:50010,DS-xxx,DISK]
> java.net.SocketTimeoutException: 10000 millis timeout while waiting for channel to be ready for read.
> ch : java.nio.channels.SocketChannel[connected local=/10.xxx.xx.xx:51426 remote=/10.xxx.xx.xx:50010]
>     at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
>     at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.readChannelFully(PacketReceiver.java:256)
>     at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:207)
>     at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
>     at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:102)
>     at org.apache.hadoop.hdfs.client.impl.BlockReaderRemote.readNextPacket(BlockReaderRemote.java:221)
>     at org.apache.hadoop.hdfs.client.impl.BlockReaderRemote.read(BlockReaderRemote.java:201)
>     at org.apache.hadoop.hdfs.ByteBufferStrategy.readFromBlock(ReaderStrategy.java:180)
>     at org.apache.hadoop.hdfs.ByteBufferStrategy.readFromBlock(ReaderStrategy.java:172)
>     at org.apache.hadoop.hdfs.StripeReader.readToBuffer(StripeReader.java:240)
>     at org.apache.hadoop.hdfs.StripeReader.lambda$readCells$0(StripeReader.java:286)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748) {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
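The buffer-position mismatch described in the issue can be illustrated with a minimal standalone sketch. This is not the actual Hadoop code: `buggyFill`, `fixedFill`, and the 8-byte cell are hypothetical stand-ins for the chunk buffer and decodeByteBuffer interaction. A chunk buffer that already received the first 3 bytes before the timeout must be topped up from the *same offset* in the decode output; copying from position 0 duplicates the prefix instead.

{code:java}
import java.nio.ByteBuffer;
import java.util.Arrays;

public class DecodeCopyDemo {

    // Buggy variant: always copies from position 0 of the decode output,
    // so the bytes the chunk already holds get written again.
    static byte[] buggyFill(ByteBuffer chunk, ByteBuffer decoded) {
        ByteBuffer src = decoded.duplicate();
        src.position(0);
        src.limit(chunk.remaining()); // first N bytes, not the missing tail
        chunk.put(src);
        return drain(chunk);
    }

    // Fixed variant: start copying at the chunk buffer's current position,
    // i.e. skip the bytes that were read from the blockReader before the timeout.
    static byte[] fixedFill(ByteBuffer chunk, ByteBuffer decoded) {
        ByteBuffer src = decoded.duplicate();
        src.position(chunk.position()); // skip what the chunk already holds
        src.limit(decoded.capacity());
        chunk.put(src);
        return drain(chunk);
    }

    // Read the whole chunk buffer out as a byte array.
    static byte[] drain(ByteBuffer chunk) {
        byte[] out = new byte[chunk.capacity()];
        chunk.flip();
        chunk.get(out);
        return out;
    }

    static ByteBuffer partialChunk() {
        // The first 3 bytes of the cell arrived before the read timed out.
        ByteBuffer chunk = ByteBuffer.allocate(8);
        chunk.put(new byte[]{0, 1, 2});
        return chunk;
    }

    public static void main(String[] args) {
        // Decoded cell: the full 8-byte payload reconstructed from parity.
        byte[] cell = {0, 1, 2, 3, 4, 5, 6, 7};

        System.out.println(Arrays.toString(
                buggyFill(partialChunk(), ByteBuffer.wrap(cell))));
        // prints [0, 1, 2, 0, 1, 2, 3, 4] -- prefix duplicated, tail lost

        System.out.println(Arrays.toString(
                fixedFill(partialChunk(), ByteBuffer.wrap(cell))));
        // prints [0, 1, 2, 3, 4, 5, 6, 7] -- the complete cell
    }
}
{code}

The fix merged for this issue addresses the same class of mistake: the copy offset into the decode output must account for the bytes already present in the chunk buffer rather than always starting at position 0.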