Re: [PR] Hadoop 19641: [ABFS][ReadAheadV2] First Read should bypass ReadBufferManager [hadoop]

via GitHub Tue, 05 Aug 2025 03:04:10 -0700


Copilot commented on code in PR #7835:
URL: https://github.com/apache/hadoop/pull/7835#discussion_r2253585143



##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsInputStream.java:
##########
@@ -511,8 +512,12 @@ private int readInternal(final long position, final byte[] 
b, final int offset,
       }
       int receivedBytes;
 
-      // queue read-aheads
-      int numReadAheads = this.readAheadQueueDepth;
+      /*
+       * Number of prefetches to queue for each request is configurable.
+       * For the first read of this input stream, we don't read ahead but keep
+       * the current read data in cache as pattern might not be sequential.
+       */
+      int numReadAheads = firstRead? 1 : this.readAheadQueueDepth;

Review Comment:
   [nitpick] The ternary operator lacks spaces around the `?` operator, which 
deviates from Java coding conventions. It should be `firstRead ? 1 : 
this.readAheadQueueDepth`.
   ```suggestion
         int numReadAheads = firstRead ? 1 : this.readAheadQueueDepth;
   ```



##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsInputStream.java:
##########
@@ -348,14 +348,15 @@ private int readOneBlock(final byte[] b, final int off, 
final int len) throws IO
       if (alwaysReadBufferSize) {
         bytesRead = readInternal(fCursor, buffer, 0, bufferSize, false);
       } else {
-        // Enable readAhead when reading sequentially
+        // Switch between enabling and disabling read ahead based on the 
workload read pattern.
         if (-1 == fCursorAfterLastRead || fCursorAfterLastRead == fCursor || 
b.length >= bufferSize) {
-          LOG.debug("Sequential read with read ahead size of {}", bufferSize);
+          // Sequential read pattern detected. Enable read ahead.
+          LOG.debug("Sequential read with read size of {} and read ahead 
enabled", bufferSize);
           bytesRead = readInternal(fCursor, buffer, 0, bufferSize, false);
         } else {
-          // Enabling read ahead for random reads as well to reduce number of 
remote calls.
+          // Random read pattern detected. Disable read ahead.
           int lengthWithReadAhead = Math.min(b.length + readAheadRange, 
bufferSize);
-          LOG.debug("Random read with read ahead size of {}", 
lengthWithReadAhead);
+          LOG.debug("Random read with read size of {} and read ahead 
disabled", lengthWithReadAhead);

Review Comment:
   The log message mentions 'read ahead disabled' but the variable logged is 
`lengthWithReadAhead`, which suggests read-ahead is still being used. This is 
confusing and inconsistent with the message.
   ```suggestion
             LOG.debug("Random read with requested size {} (read ahead range: 
{}), actual read size: {}, read ahead disabled", b.length, readAheadRange, 
lengthWithReadAhead);
   ```



##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsInputStream.java:
##########
@@ -348,14 +348,15 @@ private int readOneBlock(final byte[] b, final int off, 
final int len) throws IO
       if (alwaysReadBufferSize) {
         bytesRead = readInternal(fCursor, buffer, 0, bufferSize, false);
       } else {
-        // Enable readAhead when reading sequentially
+        // Switch between enabling and disabling read ahead based on the 
workload read pattern.
         if (-1 == fCursorAfterLastRead || fCursorAfterLastRead == fCursor || 
b.length >= bufferSize) {
-          LOG.debug("Sequential read with read ahead size of {}", bufferSize);
+          // Sequential read pattern detected. Enable read ahead.
+          LOG.debug("Sequential read with read size of {} and read ahead 
enabled", bufferSize);

Review Comment:
   The log message mentions 'read size of {}' but logs `bufferSize` instead of 
the actual read size (`b.length`). This could be misleading when debugging read 
operations.
   ```suggestion
             LOG.debug("Sequential read with read size of {} and read ahead 
enabled", b.length);
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

Re: [PR] Hadoop 19641: [ABFS][ReadAheadV2] First Read should bypass ReadBufferManager [hadoop]

Reply via email to