[ https://issues.apache.org/jira/browse/HADOOP-11938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537713#comment-14537713 ]
Yi Liu commented on HADOOP-11938:
---------------------------------

Some comments:

In *AbstractRawErasureCoder.java*
{code}
+  protected ByteBuffer resetOutputBuffer(ByteBuffer buffer) {
+    int pos = buffer.position();
+    buffer.put(zeroChunkBytes);
+    buffer.position(pos);
+
+    return buffer;
+  }
{code}
The length of {{zeroChunkBytes}} could be larger than {{buffer.remaining()}}; just use a *for* loop to put 0s into the buffer (a minimal sketch is given at the end of this message).

{code}
protected ByteBuffer resetInputBuffer(ByteBuffer buffer) {
{code}
What is the reason we need to reset the input buffer? Input buffers are given by the caller.

In *AbstractRawErasureDecoder.java*
{code}
boolean usingDirectBuffer = ! inputs[0].hasArray();
{code}
Use {{inputs\[0\].isDirect();}}

{code}
@Override
public void decode(ByteBuffer[] inputs, int[] erasedIndexes, ByteBuffer[] outputs)

@Override
public void decode(byte[][] inputs, int[] erasedIndexes, byte[][] outputs) {
{code}
We should do the following checks (see the second sketch at the end of this message):
1. All inputs have the same length (apart from the inputs that are null), and the outputs have enough space.
2. {{erasedIndexes}} matches the {{null}} positions of the inputs.

We should also enhance the description of RawErasureDecoder#decode to say more about the decode/reconstruct behavior.

{code}
+    for (int i = 0; i < outputs.length; ++i) {
+      buffer = outputs[i];
+      // to be ready for reading dataLen bytes
+      buffer.flip();
+      buffer.position(outputOffsets[i]);
+      buffer.limit(outputOffsets[i] + dataLen);
+    }
{code}
This is unnecessary; remove it.

In *AbstractRawErasureEncoder.java*
{code}
boolean usingDirectBuffer = ! inputs[0].hasArray();
{code}
Use {{inputs\[0\].isDirect();}}
All other comments are the same as for *AbstractRawErasureDecoder*.

In *RSRawDecoder.java*
{code}
assert (getNumDataUnits() + getNumParityUnits() < RSUtil.GF.getFieldSize());

this.errSignature = new int[getNumParityUnits()];
this.primitivePower = RSUtil.getPrimitivePower(getNumDataUnits(),
    getNumParityUnits());
{code}
Why not use {{numDataUnits}} and {{numParityUnits}} directly?

In *RSRawEncoder.java*
In {{initialize}}, use {{numDataUnits}} and {{numParityUnits}} directly.

> Fix ByteBuffer version encode/decode API of raw erasure coder
> -------------------------------------------------------------
>
>                 Key: HADOOP-11938
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11938
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: io
>            Reporter: Kai Zheng
>            Assignee: Kai Zheng
>         Attachments: HADOOP-11938-HDFS-7285-v1.patch,
>                      HADOOP-11938-HDFS-7285-workaround.patch
>
> While investigating a test failure in {{TestRecoverStripedFile}}, we found
> one issue in the raw erasure coder, caused by an optimization in the code
> below. It assumes that the heap buffer backed by the bytes array available
> for reading or writing always starts at position zero and takes up the
> whole space.
> {code}
>   protected static byte[][] toArrays(ByteBuffer[] buffers) {
>     byte[][] bytesArr = new byte[buffers.length][];
>
>     ByteBuffer buffer;
>     for (int i = 0; i < buffers.length; i++) {
>       buffer = buffers[i];
>       if (buffer == null) {
>         bytesArr[i] = null;
>         continue;
>       }
>
>       if (buffer.hasArray()) {
>         bytesArr[i] = buffer.array();
>       } else {
>         throw new IllegalArgumentException("Invalid ByteBuffer passed, " +
>             "expecting heap buffer");
>       }
>     }
>
>     return bytesArr;
>   }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
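For illustration, a minimal sketch of the *for*-loop zero-fill suggested above for {{resetOutputBuffer}}; this is an assumed shape, not code from the patch:
{code}
// Sketch only: zero the buffer's remaining space without over-running
// its limit. Absolute put(int, byte) does not move the position, so no
// position save/restore is needed.
protected ByteBuffer resetOutputBuffer(ByteBuffer buffer) {
  for (int i = buffer.position(); i < buffer.limit(); ++i) {
    buffer.put(i, (byte) 0);
  }
  return buffer;
}
{code}
Compared with {{buffer.put(zeroChunkBytes)}}, the loop never writes past {{buffer.limit()}}, so it also works when {{zeroChunkBytes}} is longer than {{buffer.remaining()}}.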
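And a sketch of the two parameter checks suggested for {{decode}}; the helper name {{checkParameters}} and the exact exception messages are hypothetical:
{code}
// Sketch only: validate decode() arguments as described above.
protected void checkParameters(ByteBuffer[] inputs, int[] erasedIndexes,
    ByteBuffer[] outputs) {
  // 1. All non-null inputs must have the same length, and every
  //    output must have enough space for that length.
  int dataLen = -1;
  for (ByteBuffer input : inputs) {
    if (input == null) {
      continue;
    }
    if (dataLen == -1) {
      dataLen = input.remaining();
    } else if (input.remaining() != dataLen) {
      throw new IllegalArgumentException(
          "Invalid inputs, buffers have differing lengths");
    }
  }
  for (ByteBuffer output : outputs) {
    if (output.remaining() < dataLen) {
      throw new IllegalArgumentException(
          "Invalid outputs, not enough space for decoded data");
    }
  }
  // 2. Every erased index must point at a null (missing) input.
  for (int erasedIndex : erasedIndexes) {
    if (erasedIndex < 0 || erasedIndex >= inputs.length
        || inputs[erasedIndex] != null) {
      throw new IllegalArgumentException(
          "Invalid erasedIndexes, " + erasedIndex + " is not an erased unit");
    }
  }
}
{code}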