jadewang-db opened a new pull request, #2855:
URL: https://github.com/apache/arrow-adbc/pull/2855

   ### Problem
   The Databricks driver's CloudFetch functionality did not handle expired 
presigned cloud file URLs, so downloads could fail with errors during query 
execution. The driver needed a way to track, cache, and refresh presigned 
URLs before they expire.
   
   ### Solution
   - Implemented a new `CloudFetchUrlManager` class that:
     - Manages a cache of cloud file URLs with their expiration times
     - Proactively refreshes URLs that are about to expire
     - Efficiently fetches and caches URLs in batches
     - Provides thread-safe access to URL information
   - Added an `IClock` interface and implementations to facilitate testing with 
controlled time
   - Extended the `IDownloadResult` interface to support URL refreshing and 
expiration checking
   - Updated namespace from 
`Apache.Arrow.Adbc.Drivers.Apache.Databricks.CloudFetch` to 
`Apache.Arrow.Adbc.Drivers.Databricks.CloudFetch` for better organization
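
The caching and refresh behavior listed above can be illustrated with a minimal sketch. It is not the code from this PR. `IClock` is named in the PR, but its `UtcNow` member is an assumption here, and `ManualClock`, `CachedUrl`, `ExpiringUrlCache`, the `fetchBatch` callback, and the refresh buffer parameter are all hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of a clock abstraction; the PR's IClock may expose different members.
public interface IClock
{
    DateTime UtcNow { get; }
}

// Production clock backed by the system time.
public sealed class SystemClock : IClock
{
    public DateTime UtcNow => DateTime.UtcNow;
}

// Hypothetical test clock: time only moves when Advance is called.
public sealed class ManualClock : IClock
{
    public ManualClock(DateTime start) { UtcNow = start; }
    public DateTime UtcNow { get; private set; }
    public void Advance(TimeSpan delta) { UtcNow += delta; }
}

// Hypothetical cache entry: a presigned URL and the time at which it expires.
public sealed class CachedUrl
{
    public CachedUrl(string url, DateTime expiresAtUtc)
    {
        Url = url;
        ExpiresAtUtc = expiresAtUtc;
    }

    public string Url { get; }
    public DateTime ExpiresAtUtc { get; }
}

// Hypothetical cache keyed by row offset. An entry that expires within
// refreshBuffer is treated as stale and replaced by a freshly fetched batch.
public sealed class ExpiringUrlCache
{
    private readonly object _lock = new object();
    private readonly Dictionary<long, CachedUrl> _byOffset = new Dictionary<long, CachedUrl>();
    private readonly IClock _clock;
    private readonly TimeSpan _refreshBuffer;
    private readonly Func<long, IReadOnlyDictionary<long, CachedUrl>> _fetchBatch;

    public ExpiringUrlCache(
        IClock clock,
        TimeSpan refreshBuffer,
        Func<long, IReadOnlyDictionary<long, CachedUrl>> fetchBatch)
    {
        _clock = clock;
        _refreshBuffer = refreshBuffer;
        _fetchBatch = fetchBatch;
    }

    public CachedUrl GetUrl(long offset)
    {
        // One coarse lock keeps the sketch simple; it also serializes fetches.
        lock (_lock)
        {
            if (_byOffset.TryGetValue(offset, out var cached) && !IsExpiringSoon(cached))
            {
                return cached;
            }

            // Cache miss or near expiry: fetch a batch and overwrite stale entries.
            foreach (KeyValuePair<long, CachedUrl> entry in _fetchBatch(offset))
            {
                _byOffset[entry.Key] = entry.Value;
            }

            if (!_byOffset.TryGetValue(offset, out var fresh))
            {
                throw new InvalidOperationException("No URL returned for offset " + offset);
            }
            return fresh;
        }
    }

    private bool IsExpiringSoon(CachedUrl entry)
    {
        return _clock.UtcNow + _refreshBuffer >= entry.ExpiresAtUtc;
    }
}
```

With a `ManualClock`, a test can request a URL, advance the clock to within the buffer of the expiry time, and check that the next request triggers a second fetch, all without sleeping. Holding one lock across the fetch is the simplest way to get thread safety in a sketch; a real implementation may prefer finer-grained or asynchronous coordination.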
   
   ### Testing
   - Created comprehensive unit tests in `CloudFetchUrlManagerTest.cs` that 
verify:
     - URL caching behavior
     - Proper handling of URL expiration
     - Batch fetching of URLs
     - Refreshing of expired URLs
     - Thread safety of the implementation
   
   This change improves the reliability of the CloudFetch functionality by 
ensuring that cloud file URLs are refreshed before they expire, preventing 
download failures during query execution.

