[PR] fix(csharp/databricks): Release CloudFetch memory immediately after download completes [arrow-adbc]

via GitHub Tue, 25 Nov 2025 02:22:14 -0800


eric-wang-1990 opened a new pull request, #3756:
URL: https://github.com/apache/arrow-adbc/pull/3756


   Fixes a deadlock that occurs with fast parallel downloads when the memory 
buffer is limited.
   
   ## Problem
   The CloudFetch downloader was acquiring memory when downloads started but 
only releasing it when results were disposed (after the reader consumed all 
batches). This caused a deadlock in the following scenario:
   
   1. Downloads happen in parallel (e.g., 3 concurrent threads)
   2. Downloads complete very fast (e.g., on high-performance VMs)
   3. Reader must consume results sequentially by offset
   4. Fast downloads complete out of order and accumulate in memory
   5. Memory buffer fills up with completed-but-not-yet-read results
   6. New downloads cannot start (no memory available)
   7. Reader is blocked waiting for the next sequential download to start
   8. **Deadlock**: Reader waiting for download, downloads waiting for memory
   
   Example with 100MB memory, 20MB files, 3 parallel downloads:
   - Downloads 1, 2, 3 start (60MB used)
   - Download 3 finishes first, then 2, then 1 (all 60MB still held)
   - Downloads 4, 5 start and finish quickly (100MB used)
   - Reader is still processing download 1
   - Download 6 cannot start (no memory)
   - Reader waiting for download 6, but it will never start → deadlock
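
The accounting in this example can be traced with a small model. This is a Python sketch, not the driver's C# code: the function name and parameters are hypothetical, and it assumes memory is reserved when a download starts and the reader has not yet disposed any result.

```python
def trace_hold_until_dispose(budget_mb, file_mb, total_files):
    """Model the old policy: memory is reserved at download start and
    returned only when the reader disposes the result.

    Assumption: the reader has not disposed anything yet, so nothing
    is ever returned to the budget during this trace.
    Returns (downloads_started, memory_in_use_mb).
    """
    in_use_mb = 0
    started = 0
    for _ in range(total_files):
        if in_use_mb + file_mb > budget_mb:
            break  # the next download would block waiting for memory
        in_use_mb += file_mb  # reserved at start, still held after completion
        started += 1
    return started, in_use_mb


started, in_use_mb = trace_hold_until_dispose(budget_mb=100, file_mb=20, total_files=10)
print(started, in_use_mb)  # prints: 5 100
```

With the numbers above, five downloads start and the sixth finds the budget exhausted, matching the list.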
   
   ## Solution
   Release memory immediately when a download completes (in `SetCompleted`), 
not when the result is disposed. This ensures memory is available for new 
downloads even if the reader hasn't consumed the result yet.
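
A minimal sketch of the release-on-completion policy, again in Python rather than the driver's C#. `MemoryBudget`, `DownloadSlot`, and `run_fast_downloads` are hypothetical names for illustration; only the idea of releasing in the completion step and not in dispose comes from the description above.

```python
import threading


class MemoryBudget:
    """Hypothetical byte budget shared by all downloads (sizes in MB)."""

    def __init__(self, limit_mb):
        self._limit_mb = limit_mb
        self._in_use_mb = 0
        self._lock = threading.Lock()

    def try_acquire(self, size_mb):
        # Reserve memory if it fits; report failure instead of blocking.
        with self._lock:
            if self._in_use_mb + size_mb > self._limit_mb:
                return False
            self._in_use_mb += size_mb
            return True

    def release(self, size_mb):
        with self._lock:
            self._in_use_mb -= size_mb

    @property
    def in_use_mb(self):
        with self._lock:
            return self._in_use_mb


class DownloadSlot:
    """Hypothetical stand-in for a download result."""

    def __init__(self, budget, size_mb):
        self._budget = budget
        self._size_mb = size_mb
        self.completed = False

    def set_completed(self):
        # New policy: return the reservation as soon as the download finishes.
        self.completed = True
        self._budget.release(self._size_mb)

    def dispose(self):
        # Dispose no longer touches the budget.
        pass


def run_fast_downloads(limit_mb, file_mb, total_files):
    """Start downloads that complete instantly while the reader consumes nothing."""
    budget = MemoryBudget(limit_mb)
    unread = []
    for _ in range(total_files):
        if not budget.try_acquire(file_mb):
            break  # would stall here under the old policy
        slot = DownloadSlot(budget, file_mb)
        slot.set_completed()
        unread.append(slot)
    return len(unread), budget.in_use_mb


print(run_fast_downloads(100, 20, 10))  # prints: (10, 0)
```

In this model all ten downloads start even though the reader has consumed nothing, because each reservation is returned at completion. One consequence of the model is that the budget then limits in-flight downloads rather than completed-but-unread results.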
   
   ## Changes
   - **CloudFetchDownloader.cs**: Added memory release immediately after 
`SetCompleted()` call
   - **DownloadResult.cs**: Removed memory release from `Dispose()` method
   
   ## Testing
   - Manually verified on a fast VM with Power BI Desktop (previously 
deadlocked, now works)
   - Existing CloudFetchDownloader tests still pass
   - Log files confirm memory is released immediately after download completion
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)
   
   Co-Authored-By: Claude <[email protected]>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
