Tak-Lon (Stephen) Wu created HDDS-15422:
-------------------------------------------

             Summary: Stream read with `seek` close stream incorrectly
                 Key: HDDS-15422
                 URL: https://issues.apache.org/jira/browse/HDDS-15422
             Project: Apache Ozone
          Issue Type: Bug
          Components: Ozone Client
    Affects Versions: 2.2.0
            Reporter: Tak-Lon (Stephen) Wu


While testing HBase read only with Ozone by using YCSB workload C (L1 
blockcache / no bucketcache /), We are experiencing degraded read throughput 
and hanging handler threads on our HBase RegionServers. Despite a healthy 
BlockCache hit rate (~87.5%), active read threads are becoming blocked, 
creating a bottleneck.

Profiling reveals that handler threads are hanging IdLock.getLockEntry while 
waiting for a single thread to fetch an HFile block from Apache Ozone.

and while tuning on TRACE/DEBUG log , we found the input stream are closing 
immediately after seeking from end/middle of the file back to 0. the next seek 
that tried to seek from 0 to middle hangs and wait till timeout.  this 
indicates that read amplification issue triggered by Reverse Scans over the 
Ozone gRPC client cannot reuse the stream.

{code}
2026-05-29 12:50:06,171 DEBUG 
org.apache.hadoop.hdds.scm.storage.StreamBlockInputStream: 
[RpcServer.default.FPBQ.Fifo.handler=42,queue=6,port=22101]: 
stream119-reader3831: return response positon 179405298, length 4095502 (block 
offset 179306496, length 4194304)
2026-05-29 12:50:06,171 DEBUG 
org.apache.hadoop.hdds.scm.storage.StreamBlockInputStream: 
[RpcServer.default.FPBQ.Fifo.handler=42,queue=6,port=22101]: stream119: seek 
179471034 -> 0
2026-05-29 12:50:06,172 DEBUG 
org.apache.hadoop.hdds.scm.storage.StreamBlockInputStream: 
[RpcServer.default.FPBQ.Fifo.handler=42,queue=6,port=22101]: Closing 
stream119-reader3831
2026-05-29 12:50:06,172 DEBUG 
org.apache.hadoop.hdds.scm.storage.StreamBlockInputStream: 
[RpcServer.default.FPBQ.Fifo.handler=42,queue=6,port=22101]: 
stream119-reader3831 setCompleted success
2026-05-29 12:50:06,172 DEBUG 
org.apache.hadoop.hdds.scm.storage.StreamBlockInputStream: 
[RpcServer.default.FPBQ.Fifo.handler=42,queue=6,port=22101]: stream120: seek 0 
-> 133955559
2026-05-29 12:50:06,172 DEBUG 
org.apache.hadoop.hdds.scm.storage.StreamBlockInputStream: 
[grpc-default-executor-11]: stream119-reader3831 Failed to setCompleted: 
Already completed
2026-05-29 12:50:06,985 DEBUG 
org.apache.hadoop.hdds.scm.storage.StreamBlockInputStream: 
[RpcServer.default.FPBQ.Fifo.handler=42,queue=6,port=22101]: stream493: seek 
217274129 -> 0
2026-05-29 12:50:06,985 DEBUG 
org.apache.hadoop.hdds.scm.storage.StreamBlockInputStream: 
[RpcServer.default.FPBQ.Fifo.handler=42,queue=6,port=22101]: stream492: seek 0 
-> 102738300
2026-05-29 12:50:06,985 DEBUG 
org.apache.hadoop.hdds.scm.storage.StreamBlockInputStream: 
[RpcServer.default.FPBQ.Fifo.handler=42,queue=6,port=22101]: position 
102738300, length 65686, requested 102738300, diff 65686, readLength 65686, 
preReadSize=0
2026-05-29 12:50:07,008 DEBUG 
org.apache.hadoop.hdds.scm.storage.StreamBlockInputStream: 
[grpc-default-executor-24]: stream492-reader3832: enqueue response offset 
101711872, length 4194304, numChecksums 4, bytesPerChecksum=1048576
2026-05-29 12:50:07,009 DEBUG 
org.apache.hadoop.hdds.scm.storage.StreamBlockInputStream: 
[RpcServer.default.FPBQ.Fifo.handler=42,queue=6,port=22101]: 
stream492-reader3832: return response positon 102738300, length 3167876 (block 
offset 101711872, length 4194304)
2026-05-29 12:50:07,009 DEBUG 
org.apache.hadoop.hdds.scm.storage.StreamBlockInputStream: 
[RpcServer.default.FPBQ.Fifo.handler=42,queue=6,port=22101]: stream492: seek 
102803986 -> 0
2026-05-29 12:50:07,009 DEBUG 
org.apache.hadoop.hdds.scm.storage.StreamBlockInputStream: 
[RpcServer.default.FPBQ.Fifo.handler=42,queue=6,port=22101]: Closing 
stream492-reader3832
2026-05-29 12:50:07,009 DEBUG 
org.apache.hadoop.hdds.scm.storage.StreamBlockInputStream: 
[RpcServer.default.FPBQ.Fifo.handler=42,queue=6,port=22101]: 
stream492-reader3832 setCompleted success
2026-05-29 12:50:07,009 DEBUG 
org.apache.hadoop.hdds.scm.storage.StreamBlockInputStream: 
[RpcServer.default.FPBQ.Fifo.handler=42,queue=6,port=22101]: stream493: seek 0 
-> 217274129
2026-05-29 12:50:07,009 DEBUG 
org.apache.hadoop.hdds.scm.storage.StreamBlockInputStream: 
[grpc-default-executor-24]: stream492-reader3832 Failed to setCompleted: 
Already completed

{code}

please check [this snippet of code 
|https://github.com/apache/ozone/blob/master/hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/StreamBlockInputStream.java#L212]
 in StreamBlockInputStream.java. 

one thought is to reduce the read chunk size from e.g. 4MB or 1MB to 64KB to 
match the hbase block, but this size reduction does not help the fact of 
Reverse Scan for HFile reading cases.


Setup configuration

{code}
 ozone.fs.datastream.enabled = true
 ozone.client.stream.readblock.enable = true
 hdds.container.ratis.datastream.enabled = true
 ozone.client.verify.checksum = false (was true and turned to false when 
checksum failed)
 hdds.grpc.tls.enabled = false (final)
 ozone.client.stream.read.timeout = 120s
 scm.container.client.idle.threshold = 120s
{code}






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to