eric-wang-1990 opened a new pull request, #3756: URL: https://github.com/apache/arrow-adbc/pull/3756
Fixes a deadlock that occurs with fast parallel downloads when the memory buffer is limited.

## Problem

The CloudFetch downloader was acquiring memory when downloads started but only releasing it when results were disposed (after the reader consumed all batches). This caused a deadlock in the following scenario:

1. Downloads happen in parallel (e.g., 3 concurrent threads)
2. Downloads complete very fast (e.g., on high-performance VMs)
3. Reader must consume results sequentially by offset
4. Fast downloads complete out of order and accumulate in memory
5. Memory buffer fills up with completed-but-not-yet-read results
6. New downloads cannot start (no memory available)
7. Reader is blocked waiting for the next sequential download to start
8. **Deadlock**: Reader waiting for download, downloads waiting for memory

Example with 100MB memory, 20MB files, 3 parallel downloads:

- Downloads 1, 2, 3 start (60MB used)
- Download 3 finishes first, then 2, then 1 (all 60MB still held)
- Downloads 4, 5 start and finish quickly (100MB used)
- Reader is still processing download 1
- Download 6 cannot start (no memory)
- Reader is waiting for download 6, but it will never start → deadlock

## Solution

Release memory immediately when a download completes (in `SetCompleted`), not when the result is disposed. This ensures memory is available for new downloads even if the reader hasn't consumed the result yet.

## Changes

- **CloudFetchDownloader.cs**: Added memory release immediately after the `SetCompleted()` call
- **DownloadResult.cs**: Removed memory release from the `Dispose()` method

## Testing

- Manually verified on a fast VM with Power BI Desktop (previously deadlocked, now works)
- Existing CloudFetchDownloader tests still pass
- Log files confirm memory is released immediately after download completion

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
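For readers who want to check the arithmetic in the Problem section, the 100MB / 20MB example can be replayed with a small accounting model. This is a minimal sketch in Python, not the driver's C# code: `MemoryBudget`, `replay`, and `EVENTS` are hypothetical names introduced only for illustration, and the model tracks nothing but the memory counter under the two release policies (release on dispose versus release on completion).

```python
class MemoryBudget:
    """Hypothetical stand-in for the downloader's memory accounting."""

    def __init__(self, limit_mb):
        self.limit_mb = limit_mb
        self.used_mb = 0

    def try_acquire(self, size_mb):
        # Refuse the request if it would exceed the configured limit.
        if self.used_mb + size_mb > self.limit_mb:
            return False
        self.used_mb += size_mb
        return True

    def release(self, size_mb):
        self.used_mb -= size_mb


# Event order taken from the example: 1, 2, 3 start; 3, 2, 1 finish;
# 4 and 5 start and finish. The reader has not disposed any result yet.
EVENTS = [
    ("start", 1), ("start", 2), ("start", 3),
    ("done", 3), ("done", 2), ("done", 1),
    ("start", 4), ("start", 5),
    ("done", 4), ("done", 5),
]


def replay(release_on_complete, limit_mb=100, file_mb=20):
    """Return (MB held after download 5 completes, whether download 6 can start)."""
    budget = MemoryBudget(limit_mb)
    for kind, download_id in EVENTS:
        if kind == "start":
            # Memory is acquired when a download starts (both policies).
            if not budget.try_acquire(file_mb):
                raise RuntimeError(f"download {download_id} could not start")
        elif release_on_complete:
            # Policy described in the Solution: release when the download completes.
            budget.release(file_mb)
        # Otherwise (previous policy) memory stays held until the reader
        # disposes the result, which has not happened in this replay.
    held_mb = budget.used_mb
    can_start_6 = budget.try_acquire(file_mb)
    return held_mb, can_start_6


if __name__ == "__main__":
    print(replay(release_on_complete=False))  # (100, False)
    print(replay(release_on_complete=True))   # (0, True)
```

With release on dispose, all 100MB is still held after download 5 completes and download 6 cannot acquire memory; with release on completion, the counter is back at zero and download 6 can start.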
