clintropolis opened a new pull request, #19396:
URL: https://github.com/apache/druid/pull/19396

   ### Description
   This PR switches the `SegmentLocalCacheManager` on-demand load executor
(used in virtual-storage mode) from a fixed platform-thread pool to one virtual
thread per task, with a `Semaphore` for backpressure. It also converts
`SegmentCacheEntry` from `synchronized` to `ReentrantLock` so that virtual
threads park rather than pin their carrier during the long-running mount path.
   
   The on-demand load work is overwhelmingly socket wait against deep storage,
with a small CPU portion at the end (factorize/mmap setup). Virtual threads let
us fan out hundreds of in-flight loads cheaply, with the semaphore providing
backpressure similar to what the pool size used to give implicitly. This
becomes especially relevant once partial loading lands and the pool starts
handling many smaller per-internal-segment-file range read calls instead of one
big mount per segment.
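
The fan-out described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `BoundedVirtualLoader`, `maxInFlight`, and `peakInFlight()` are hypothetical names, and acquiring the permit inside the virtual thread (rather than before submitting) is an assumption. It requires Java 21+.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper (not Druid's actual class): one virtual thread per task,
// with a Semaphore capping how many loads run at the same time.
final class BoundedVirtualLoader implements AutoCloseable {
  // Java 21+: every submitted task gets its own virtual thread.
  private final ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor();
  private final Semaphore permits;
  private final AtomicInteger running = new AtomicInteger();
  private final AtomicInteger peak = new AtomicInteger();

  BoundedVirtualLoader(int maxInFlight) {
    this.permits = new Semaphore(maxInFlight);
  }

  <T> Future<T> submit(Callable<T> load) {
    return exec.submit(() -> {
      // Assumption: the permit is taken inside the virtual thread, so waiting
      // tasks park cheaply instead of blocking the submitter.
      permits.acquire();
      try {
        // Track the highest number of loads seen running together.
        peak.accumulateAndGet(running.incrementAndGet(), Math::max);
        return load.call();
      } finally {
        running.decrementAndGet();
        permits.release();
      }
    });
  }

  // Highest observed number of concurrently running loads.
  int peakInFlight() {
    return peak.get();
  }

  @Override
  public void close() {
    exec.close(); // waits for already-submitted tasks to finish
  }
}
```

With `maxInFlight` set to 4, submitting 50 short tasks creates 50 virtual threads, but at most 4 load bodies run at once; the rest stay parked on the semaphore.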
   
   The `ReentrantLock` conversion in `SegmentCacheEntry` is what makes the
switch to virtual threads actually pay off on Java 21. Without it, the entire
`mount()` body runs inside `synchronized (this)`, pinning the carrier thread
and effectively capping concurrent mounts at the carrier-pool size regardless
of how many virtual threads are spawned. `ReentrantLock` parks the virtual
thread properly. Once Druid's minimum is Java 24+,
[JEP 491](https://openjdk.org/jeps/491) makes this conversion redundant, but
it's a mechanical, low-risk change in the meantime.
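
To illustrate the shape of that conversion, here is a minimal sketch under stated assumptions. `MountableEntry`, `slowMount`, and the mount-once logic are hypothetical stand-ins; the real `SegmentCacheEntry` is not reproduced here and likely differs.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a cache entry whose mount is slow and I/O-bound.
final class MountableEntry {
  private final ReentrantLock lock = new ReentrantLock();
  private final Runnable slowMount; // e.g. fetch from deep storage, then mmap setup
  private boolean mounted;          // guarded by lock

  MountableEntry(Runnable slowMount) {
    this.slowMount = slowMount;
  }

  // Before the conversion, the same body would sit inside `synchronized (this)`.
  // On Java 21 a virtual thread that blocks inside a synchronized block stays
  // pinned to its carrier; with ReentrantLock it can unmount while it waits.
  void mount() {
    lock.lock();
    try {
      if (mounted) {
        return; // already mounted by an earlier caller
      }
      slowMount.run();
      mounted = true;
    } finally {
      lock.unlock(); // always release, even if the mount throws
    }
  }

  boolean isMounted() {
    lock.lock();
    try {
      return mounted;
    } finally {
      lock.unlock();
    }
  }
}
```

The `lock()` / `try` / `finally` / `unlock()` pattern keeps the same mutual exclusion as the `synchronized` block, which is why the change can be treated as mechanical.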
   
   changes:
   * switch the default `SegmentLocalCacheManager.virtualStorageLoadOnDemandExec`
to virtual threads with a `Semaphore` for backpressure
   * add a `SegmentLoaderConfig.virtualStorageUseVirtualThreads`
(`druid.segmentCache.virtualStorageUseVirtualThreads`) config that defaults to
`true` but allows opt-out by setting it to `false`
   * raise the `SegmentLoaderConfig.virtualStorageLoadThreads` default to
`Math.max(32, 4 * cores)`, sized as ~4x lookahead per processing thread
   * convert `SegmentCacheEntry` from `synchronized` to `ReentrantLock` so
virtual threads park instead of pinning the carrier during mount
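
As a worked example of the new `virtualStorageLoadThreads` default, the sizing rule can be written out as below. `LoadThreadDefaults` is a hypothetical helper, and reading `cores` from `Runtime.availableProcessors()` is an assumption; the PR may obtain the core count differently.

```java
// Hypothetical helper illustrating the sizing rule Math.max(32, 4 * cores).
final class LoadThreadDefaults {
  private LoadThreadDefaults() {}

  // Floor of 32, otherwise four load slots per core.
  static int defaultLoadThreads(int cores) {
    return Math.max(32, 4 * cores);
  }

  // Assumption: "cores" is taken from the JVM's view of available processors.
  static int defaultLoadThreads() {
    return defaultLoadThreads(Runtime.getRuntime().availableProcessors());
  }
}
```

Under this rule the floor of 32 applies up to 8 cores, and the value scales as 4 per core beyond that (for example, 64 on a 16-core host).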
   



