[ https://issues.apache.org/jira/browse/HDFS-16544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Takanobu Asanuma resolved HDFS-16544.
-------------------------------------
    Fix Version/s: 3.4.0
                   3.2.4
                   3.3.4
         Assignee: qinyuren
       Resolution: Fixed

> EC decoding failed due to invalid buffer
> ----------------------------------------
>
>                 Key: HDFS-16544
>                 URL: https://issues.apache.org/jira/browse/HDFS-16544
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: erasure-coding
>            Reporter: qinyuren
>            Assignee: qinyuren
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0, 3.2.4, 3.3.4
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> In [HDFS-16538|https://issues.apache.org/jira/browse/HDFS-16538], we found an EC file decoding bug that occurs when more than one data block read fails. We have now found another bug, triggered by #StatefulStripeReader.decode.
> If we read an EC file whose {*}length is more than one stripe{*}, and this file has *one data block* and *the first parity block* corrupted, the following error occurs:
> {code:java}
> org.apache.hadoop.HadoopIllegalArgumentException: Invalid buffer found, not allowing null
>     at org.apache.hadoop.io.erasurecode.rawcoder.ByteBufferDecodingState.checkOutputBuffers(ByteBufferDecodingState.java:132)
>     at org.apache.hadoop.io.erasurecode.rawcoder.ByteBufferDecodingState.<init>(ByteBufferDecodingState.java:48)
>     at org.apache.hadoop.io.erasurecode.rawcoder.RawErasureDecoder.decode(RawErasureDecoder.java:86)
>     at org.apache.hadoop.io.erasurecode.rawcoder.RawErasureDecoder.decode(RawErasureDecoder.java:170)
>     at org.apache.hadoop.hdfs.StripeReader.decodeAndFillBuffer(StripeReader.java:435)
>     at org.apache.hadoop.hdfs.StatefulStripeReader.decode(StatefulStripeReader.java:94)
>     at org.apache.hadoop.hdfs.StripeReader.readStripe(StripeReader.java:392)
>     at org.apache.hadoop.hdfs.DFSStripedInputStream.readOneStripe(DFSStripedInputStream.java:315)
>     at org.apache.hadoop.hdfs.DFSStripedInputStream.readWithStrategy(DFSStripedInputStream.java:408)
>     at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:918)
> {code}
>
> Let's say we use EC(6+3) and data block[0] and the first parity block[6] are corrupted.
> # The readers for block[0] and block[6] are closed after reading the first stripe of the EC file;
> # When the client reads the second stripe of the EC file, it triggers #prepareParityChunk for block[6];
> # decodeInputs[6] is not constructed, because the reader for block[6] was already closed.
>
> {code:java}
> boolean prepareParityChunk(int index) {
>   Preconditions.checkState(index >= dataBlkNum
>       && alignedStripe.chunks[index] == null);
>   if (readerInfos[index] != null && readerInfos[index].shouldSkip) {
>     alignedStripe.chunks[index] = new StripingChunk(StripingChunk.MISSING);
>     // we have failed the block reader before
>     return false;
>   }
>   final int parityIndex = index - dataBlkNum;
>   ByteBuffer buf = dfsStripedInputStream.getParityBuffer().duplicate();
>   buf.position(cellSize * parityIndex);
>   buf.limit(cellSize * parityIndex + (int) alignedStripe.range.spanInBlock);
>   decodeInputs[index] =
>       new ECChunk(buf.slice(), 0, (int) alignedStripe.range.spanInBlock);
>   alignedStripe.chunks[index] =
>       new StripingChunk(decodeInputs[index].getBuffer());
>   return true;
> }
> {code}

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
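The failure sequence described in the issue can be illustrated with a minimal, self-contained sketch. This is not Hadoop code: the class `ParityChunkSketch`, its method names, and the plain `IllegalArgumentException` stand in for the real `StatefulStripeReader` / `ByteBufferDecodingState` machinery, and the early `continue` mirrors `prepareParityChunk` returning false for a previously failed reader, which leaves a null slot that the decoder's buffer check then rejects.

```java
import java.nio.ByteBuffer;

// Hypothetical simplified model of HDFS-16544: with RS(6,3), after the
// first stripe the reader for parity block[6] has been closed, so its
// decode input is never allocated and stays null; the decoder's
// null-buffer validation then fails, like checkOutputBuffers in the
// stack trace above.
public class ParityChunkSketch {
    static final int DATA_BLK_NUM = 6;
    static final int PARITY_BLK_NUM = 3;
    static final int CELL_SIZE = 1024;

    // Builds parity decode inputs by slicing a shared parity buffer,
    // skipping any block whose reader was already failed/closed
    // (mirrors the shouldSkip early-return in prepareParityChunk).
    static ByteBuffer[] buildDecodeInputs(boolean[] readerClosed, int spanInBlock) {
        ByteBuffer parityBuf = ByteBuffer.allocate(PARITY_BLK_NUM * CELL_SIZE);
        ByteBuffer[] decodeInputs = new ByteBuffer[DATA_BLK_NUM + PARITY_BLK_NUM];
        for (int index = DATA_BLK_NUM; index < decodeInputs.length; index++) {
            if (readerClosed[index]) {
                continue; // "we have failed the block reader before" -> no buffer built
            }
            int parityIndex = index - DATA_BLK_NUM;
            ByteBuffer buf = parityBuf.duplicate();
            buf.position(CELL_SIZE * parityIndex);
            buf.limit(CELL_SIZE * parityIndex + spanInBlock);
            decodeInputs[index] = buf.slice();
        }
        return decodeInputs;
    }

    // Stand-in for the decoder's null check that produces
    // "Invalid buffer found, not allowing null" in the report.
    static void checkBuffers(ByteBuffer[] inputs, int[] indexesToCheck) {
        for (int i : indexesToCheck) {
            if (inputs[i] == null) {
                throw new IllegalArgumentException(
                    "Invalid buffer found, not allowing null");
            }
        }
    }

    public static void main(String[] args) {
        boolean[] closed = new boolean[DATA_BLK_NUM + PARITY_BLK_NUM];
        closed[6] = true; // first parity block's reader closed after stripe 1
        ByteBuffer[] inputs = buildDecodeInputs(closed, CELL_SIZE);
        try {
            checkBuffers(inputs, new int[]{6, 7, 8});
            System.out.println("decode inputs valid");
        } catch (IllegalArgumentException e) {
            System.out.println("decode failed: " + e.getMessage());
        }
    }
}
```

Running the sketch prints "decode failed: Invalid buffer found, not allowing null", reproducing the shape of the reported exception: the null slot is created by skipping the closed reader, not by the check itself.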