Copilot commented on code in PR #7835: URL: https://github.com/apache/hadoop/pull/7835#discussion_r2253585143
########## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsInputStream.java: ########## @@ -511,8 +512,12 @@ private int readInternal(final long position, final byte[] b, final int offset, } int receivedBytes; - // queue read-aheads - int numReadAheads = this.readAheadQueueDepth; + /* + * Number of prefetches to queue for each request is configurable. + * For the first read of this input stream, we don't read ahead but keep + * the current read data in cache as pattern might not be sequential. + */ + int numReadAheads = firstRead? 1 : this.readAheadQueueDepth; Review Comment: [nitpick] The ternary operator lacks spaces around the `?` operator, which deviates from Java coding conventions. It should be `firstRead ? 1 : this.readAheadQueueDepth`. ```suggestion int numReadAheads = firstRead ? 1 : this.readAheadQueueDepth; ``` ########## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsInputStream.java: ########## @@ -348,14 +348,15 @@ private int readOneBlock(final byte[] b, final int off, final int len) throws IO if (alwaysReadBufferSize) { bytesRead = readInternal(fCursor, buffer, 0, bufferSize, false); } else { - // Enable readAhead when reading sequentially + // Switch between enabling and disabling read ahead based on the workload read pattern. if (-1 == fCursorAfterLastRead || fCursorAfterLastRead == fCursor || b.length >= bufferSize) { - LOG.debug("Sequential read with read ahead size of {}", bufferSize); + // Sequential read pattern detected. Enable read ahead. + LOG.debug("Sequential read with read size of {} and read ahead enabled", bufferSize); bytesRead = readInternal(fCursor, buffer, 0, bufferSize, false); } else { - // Enabling read ahead for random reads as well to reduce number of remote calls. + // Random read pattern detected. Disable read ahead. int lengthWithReadAhead = Math.min(b.length + readAheadRange, bufferSize); - LOG.debug("Random read with read ahead size of {}", lengthWithReadAhead); + LOG.debug("Random read with read size of {} and read ahead disabled", lengthWithReadAhead); Review Comment: The log message mentions 'read ahead disabled' but the variable logged is `lengthWithReadAhead`, which suggests read-ahead is still being used. This is confusing and inconsistent with the message. ```suggestion LOG.debug("Random read with requested size {} (read ahead range: {}), actual read size: {}, read ahead disabled", b.length, readAheadRange, lengthWithReadAhead); ``` ########## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsInputStream.java: ########## @@ -348,14 +348,15 @@ private int readOneBlock(final byte[] b, final int off, final int len) throws IO if (alwaysReadBufferSize) { bytesRead = readInternal(fCursor, buffer, 0, bufferSize, false); } else { - // Enable readAhead when reading sequentially + // Switch between enabling and disabling read ahead based on the workload read pattern. if (-1 == fCursorAfterLastRead || fCursorAfterLastRead == fCursor || b.length >= bufferSize) { - LOG.debug("Sequential read with read ahead size of {}", bufferSize); + // Sequential read pattern detected. Enable read ahead. + LOG.debug("Sequential read with read size of {} and read ahead enabled", bufferSize); Review Comment: The log message mentions 'read size of {}' but logs `bufferSize` instead of the actual read size (`b.length`). This could be misleading when debugging read operations. ```suggestion LOG.debug("Sequential read with read size of {} and read ahead enabled", b.length); ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org