[ https://issues.apache.org/jira/browse/HADOOP-11938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537713#comment-14537713 ]
Yi Liu commented on HADOOP-11938:
---------------------------------

Some comments:

In *AbstractRawErasureCoder.java*
{code}
+  protected ByteBuffer resetOutputBuffer(ByteBuffer buffer) {
+    int pos = buffer.position();
+    buffer.put(zeroChunkBytes);
+    buffer.position(pos);
+
+    return buffer;
+  }
{code}
The length of {{zeroChunkBytes}} could be larger than {{buffer.remaining()}}; just use a *for* loop to put 0s into the buffer (a minimal sketch is given at the end of this message).

{code}
protected ByteBuffer resetInputBuffer(ByteBuffer buffer) {
{code}
What is the reason we need to reset the input buffer? Input buffers are given by the caller.

In *AbstractRawErasureDecoder.java*
{code}
boolean usingDirectBuffer = ! inputs[0].hasArray();
{code}
Use {{inputs\[0\].isDirect();}}

{code}
@Override
public void decode(ByteBuffer[] inputs, int[] erasedIndexes, ByteBuffer[] outputs)

@Override
public void decode(byte[][] inputs, int[] erasedIndexes, byte[][] outputs) {
{code}
We should do the following checks (see the second sketch at the end of this message):
1. All inputs have the same length (apart from the inputs that are null), and the outputs have enough space.
2. {{erasedIndexes}} matches the {{null}} positions of the inputs.

We should also enhance the description of RawErasureDecoder#decode to say more about the decode/reconstruct behavior.

{code}
+    for (int i = 0; i < outputs.length; ++i) {
+      buffer = outputs[i];
+      // to be ready for reading dataLen bytes
+      buffer.flip();
+      buffer.position(outputOffsets[i]);
+      buffer.limit(outputOffsets[i] + dataLen);
+    }
{code}
This is unnecessary; remove it.

In *AbstractRawErasureEncoder.java*
{code}
boolean usingDirectBuffer = ! inputs[0].hasArray();
{code}
Use {{inputs\[0\].isDirect();}}
All other comments are the same as for *AbstractRawErasureDecoder*.

In *RSRawDecoder.java*
{code}
assert (getNumDataUnits() + getNumParityUnits() < RSUtil.GF.getFieldSize());

this.errSignature = new int[getNumParityUnits()];
this.primitivePower = RSUtil.getPrimitivePower(getNumDataUnits(),
    getNumParityUnits());
{code}
Why not use {{numDataUnits}} and {{numParityUnits}} directly?

In *RSRawEncoder.java*
In {{initialize}}, use {{numDataUnits}} and {{numParityUnits}} directly.

> Fix ByteBuffer version encode/decode API of raw erasure coder
> -------------------------------------------------------------
>
>                 Key: HADOOP-11938
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11938
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: io
>            Reporter: Kai Zheng
>            Assignee: Kai Zheng
>         Attachments: HADOOP-11938-HDFS-7285-v1.patch,
>                      HADOOP-11938-HDFS-7285-workaround.patch
>
> While investigating a test failure in {{TestRecoverStripedFile}}, we found
> one issue in the raw erasure coder, caused by an optimization in the code
> below. It assumes that the heap buffer backed by the bytes array available
> for reading or writing always starts at position zero and takes up the
> whole space.
> {code}
>   protected static byte[][] toArrays(ByteBuffer[] buffers) {
>     byte[][] bytesArr = new byte[buffers.length][];
>
>     ByteBuffer buffer;
>     for (int i = 0; i < buffers.length; i++) {
>       buffer = buffers[i];
>       if (buffer == null) {
>         bytesArr[i] = null;
>         continue;
>       }
>
>       if (buffer.hasArray()) {
>         bytesArr[i] = buffer.array();
>       } else {
>         throw new IllegalArgumentException("Invalid ByteBuffer passed, " +
>             "expecting heap buffer");
>       }
>     }
>
>     return bytesArr;
>   }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
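For illustration, a minimal sketch of the *for*-loop zero-fill suggested above for {{resetOutputBuffer}}; this is an assumed shape, not code from the patch:
{code}
// Sketch only: zero the buffer's remaining space without over-running
// its limit. Absolute put(int, byte) does not move the position, so no
// position save/restore is needed.
protected ByteBuffer resetOutputBuffer(ByteBuffer buffer) {
  for (int i = buffer.position(); i < buffer.limit(); ++i) {
    buffer.put(i, (byte) 0);
  }
  return buffer;
}
{code}
Compared with {{buffer.put(zeroChunkBytes)}}, the loop never writes past {{buffer.limit()}}, so it also works when {{zeroChunkBytes}} is longer than {{buffer.remaining()}}.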
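And a sketch of the two parameter checks suggested for {{decode}}; the helper name {{checkParameters}} and the exact exception messages are hypothetical:
{code}
// Sketch only: validate decode() arguments as described above.
protected void checkParameters(ByteBuffer[] inputs, int[] erasedIndexes,
    ByteBuffer[] outputs) {
  // 1. All non-null inputs must have the same length, and every
  //    output must have enough space for that length.
  int dataLen = -1;
  for (ByteBuffer input : inputs) {
    if (input == null) {
      continue;
    }
    if (dataLen == -1) {
      dataLen = input.remaining();
    } else if (input.remaining() != dataLen) {
      throw new IllegalArgumentException(
          "Invalid inputs, buffers have differing lengths");
    }
  }
  for (ByteBuffer output : outputs) {
    if (output.remaining() < dataLen) {
      throw new IllegalArgumentException(
          "Invalid outputs, not enough space for decoded data");
    }
  }
  // 2. Every erased index must point at a null (missing) input.
  for (int erasedIndex : erasedIndexes) {
    if (erasedIndex < 0 || erasedIndex >= inputs.length
        || inputs[erasedIndex] != null) {
      throw new IllegalArgumentException(
          "Invalid erasedIndexes, " + erasedIndex + " is not an erased unit");
    }
  }
}
{code}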