Siyao Meng created HDDS-15341:
---------------------------------

             Summary: EC client write can fail with 
ArrayIndexOutOfBoundsException due to CoderUtil emptyChunk resize race
                 Key: HDDS-15341
                 URL: https://issues.apache.org/jira/browse/HDDS-15341
             Project: Apache Ozone
          Issue Type: Bug
          Components: EC Client
            Reporter: Siyao Meng
            Assignee: Siyao Meng


Problem: CoderUtil.resetBuffer(byte[] buffer, int offset, int len) gets a 
shared zero-filled buffer from getEmptyChunk(len) and then calls:

{code}
System.arraycopy(empty, 0, buffer, offset, len);
{code}

The old getEmptyChunk() implementation checked emptyChunk.length outside the 
synchronized block, unconditionally replaced the shared static buffer inside 
the lock without re-checking its length, and re-read the shared static field 
for its return value after leaving the lock. A concurrent caller asking for a 
smaller length could therefore shrink the shared cached buffer after a larger 
caller had grown it.

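{code:title=Current logic w/o the fix}
  static byte[] getEmptyChunk(int leastLength) {
    if (emptyChunk.length >= leastLength) {
      return emptyChunk; // the common case
    }

    synchronized (CoderUtil.class) {
      // Unconditional replace: the length is not re-checked under the lock.
      emptyChunk = new byte[leastLength];
    }

    // Re-reads the shared static field after leaving the lock.
    return emptyChunk;
  }
{code}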

Repro:

1. emptyChunk starts as byte[4096].
2. Thread A calls getEmptyChunk(4097), fails the length check, and is paused 
before entering the synchronized block.
3. Thread B calls getEmptyChunk(8194), enters the synchronized block, sets 
emptyChunk = byte[8194], and leaves the block.
4. Thread A resumes, enters the synchronized block, and unconditionally sets 
emptyChunk = byte[4097].
5. Thread B reads the shared static emptyChunk for its return value, which is 
now byte[4097].
6. Thread B's System.arraycopy(..., len=8194) throws 
ArrayIndexOutOfBoundsException.
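
The interleaving above can be replayed deterministically in a single thread, 
which may help when reasoning about it without a real scheduler. The following 
is only an illustrative sketch: the class EmptyChunkRaceReplay and the method 
replayThrows() are hypothetical names, and the static field merely stands in 
for the emptyChunk cache in CoderUtil.

```java
// Hypothetical stand-alone replay of the six steps; not the Ozone code.
final class EmptyChunkRaceReplay {
  // Stand-in for the shared static cache in CoderUtil.
  static byte[] emptyChunk;

  /** Replays steps 1-6 in order and reports whether arraycopy threw. */
  static boolean replayThrows() {
    // Step 1: the cache starts at 4096 bytes.
    emptyChunk = new byte[4096];

    // Step 2: thread A (needs 4097) fails the check, then is paused.
    boolean aMustGrow = emptyChunk.length < 4097;

    // Step 3: thread B (needs 8194) fails the check and replaces the buffer.
    boolean bMustGrow = emptyChunk.length < 8194;
    if (bMustGrow) {
      emptyChunk = new byte[8194];
    }

    // Step 4: thread A resumes and acts on its stale check result.
    if (aMustGrow) {
      emptyChunk = new byte[4097];
    }

    // Step 5: thread B re-reads the shared field for its return value.
    byte[] seenByB = emptyChunk;

    // Step 6: thread B copies 8194 zero bytes out of a 4097-byte array.
    byte[] target = new byte[8194];
    try {
      System.arraycopy(seenByB, 0, target, 0, 8194);
      return false;
    } catch (IndexOutOfBoundsException e) {
      // ArrayIndexOutOfBoundsException is a subclass of this type.
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("arraycopy threw: " + replayThrows());
  }
}
```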

This is a TOCTOU-style race on the shared emptyChunk cache.
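
One possible way to close this kind of race, shown here only as a sketch and 
not necessarily the change made for this issue, is to re-check the length 
under the lock, never replace the buffer with a smaller one, and return a 
local reference instead of re-reading the shared field. The class name 
SafeEmptyChunk and the use of volatile are assumptions made for illustration.

```java
// Hypothetical sketch of a grow-only cache; not the actual Ozone patch.
final class SafeEmptyChunk {
  // volatile is an assumption here, so the lock-free fast path sees a
  // safely published array.
  private static volatile byte[] emptyChunk = new byte[4096];

  static byte[] getEmptyChunk(int leastLength) {
    // Read the shared field once; all decisions use this local reference.
    byte[] local = emptyChunk;
    if (local.length >= leastLength) {
      return local;
    }

    synchronized (SafeEmptyChunk.class) {
      // Re-check under the lock: another thread may already have grown it.
      local = emptyChunk;
      if (local.length < leastLength) {
        local = new byte[leastLength];
        emptyChunk = local;
      }
      // Return the local reference, which is known to be large enough.
      return local;
    }
  }

  public static void main(String[] args) {
    byte[] big = getEmptyChunk(8194);
    byte[] small = getEmptyChunk(4097);
    // The cache only grows, so the smaller request reuses the larger array.
    System.out.println(big == small);
  }
}
```

With this shape, a caller always receives an array at least as long as it 
requested, regardless of what other threads do to the shared field afterwards.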




--
This message was sent by Atlassian Jira
(v8.20.10#820010)