sodonnel commented on code in PR #10415:
URL: https://github.com/apache/ozone/pull/10415#discussion_r3382288985


##########
hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/StreamBlockInputStream.java:
##########
@@ -208,12 +214,40 @@ public synchronized void seek(long pos) throws 
IOException {
     if (pos == position) {
       return;
     }
-    LOG.debug("{}: seek {} -> {}", this, position, pos);
-    closeStream();
+    LOG.debug("{}: seek {} -> {}", getName(streamingReader), position, pos);
+    readBuffer = reuseReadBuffer(readBuffer, pos);

Review Comment:
   I've attached a diff that has a unit test to validate that we don't re-read 
data already on the queue when seeking forward. I tried to create a similar 
test for a backwards seek but its more difficult as we want to test that the 
queue is drained before requesting more data, which isn't really possible with 
the current structure.
   
   But I think we need to change the reading behavior so it reads through the 
queue and only reads more data from the datanode if the queue has been drained.
   
   In a dysfunctional case, we could have read 32MB onto the queue with 
pre-read. Seek from 0.1MB to 1.1MB. This would trigger another 32MB read onto 
the end of the queue, but the data we want is there are the head. Now there is 
31 + 32MB on the queue. Seek now to 3.1MB, the same happens, as we have the 
next chunk we need available.
   
   Even with the backwards seek, we could have 32MB on the queue, and then read 
32MB more before throwing away the first 32MB. We should throw away the first 
32MB first.
   
   
[forward-seek-test.patch](https://github.com/user-attachments/files/28763628/forward-seek-test.patch)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to