carloea2 opened a new pull request, #4181:
URL: https://github.com/apache/texera/pull/4181

   ### What changes were proposed in this PR?
   
   This PR adds an end-to-end **recoverable multipart upload** experience for datasets (backend + frontend) by tightening concurrency rules, making interrupted uploads recoverable, and supporting a clean restart path when the upload configuration changes.
   
   Backend changes:
   - Added a new multipart operation type: **`type=list`**
     - Lists active multipart upload file paths (within the physical address 
expiration window) so clients can discover resumable uploads.
   - Updated **`type=init`** to deny concurrent uploads for the same file
     - Uses DB row locking (`FOR UPDATE NOWAIT`) to fail fast with **409 
CONFLICT** if another client is currently uploading/initializing parts for the 
same file.
   - Updated the `init` response to return **all missing parts**, plus `completedPartsCount`
     - Enables clients to recover by uploading only the missing part numbers, without trial and error.
   - Added a **delete-then-restart flow** when the requested `fileSizeBytes` / 
`partSizeBytes` changes from an existing session
     - If an existing session is found but its config mismatches the incoming 
request:
       - Delete the DB upload session (and part rows via cascade),
       - Abort the previous lakeFS multipart upload,
       - Start a fresh session with the new parameters.
   - Added/updated tests covering:
     - `type=list` behavior
     - init concurrency denial (conflict when rows are locked)
     - restart behavior when `fileSizeBytes` or `partSizeBytes` changes
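
   The init flow above (concurrency denial, config-mismatch restart, missing-parts reporting) can be sketched roughly as follows. This is a minimal, illustrative model: the in-memory session store and `locked` flag are hypothetical stand-ins for the DB rows and the `FOR UPDATE NOWAIT` row lock, and `ConflictError` stands in for the 409 response; none of these names are Texera's actual API.

   ```python
   from dataclasses import dataclass, field


   class ConflictError(Exception):
       """Maps to HTTP 409 CONFLICT: another client holds the row lock."""


   @dataclass
   class UploadSession:
       file_size_bytes: int
       part_size_bytes: int
       completed_parts: set = field(default_factory=set)
       locked: bool = False  # stands in for a DB row lock (FOR UPDATE NOWAIT)


   def missing_parts(session: UploadSession, total_parts: int) -> list:
       """All part numbers (1-based) not yet uploaded, sorted ascending."""
       return sorted(set(range(1, total_parts + 1)) - session.completed_parts)


   def init_upload(sessions: dict, path: str, file_size: int, part_size: int) -> dict:
       session = sessions.get(path)
       if session is None:
           # No existing session: start fresh.
           session = sessions[path] = UploadSession(file_size, part_size)
       elif session.locked:
           # Another client is uploading/initializing: fail fast with 409.
           raise ConflictError(path)
       elif (session.file_size_bytes, session.part_size_bytes) != (file_size, part_size):
           # Config mismatch: delete the session (parts cascade), abort the
           # previous lakeFS multipart upload (elided here), and restart.
           del sessions[path]
           session = sessions[path] = UploadSession(file_size, part_size)
       total_parts = -(-file_size // part_size)  # ceiling division
       return {
           "missingParts": missing_parts(session, total_parts),
           "completedPartsCount": len(session.completed_parts),
       }
   ```

   With this model, a re-`init` after a partial upload reports only the still-missing part numbers, a changed `partSizeBytes` resets the session, and a locked row raises the 409-equivalent error.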
   
   Frontend changes:
   - Added a **Recover confirmation dialog** for multipart uploads
     - When resumable uploads are detected, the dialog lets the user choose, per item, whether to:
       - **Recover** (continue by uploading only the missing parts, falling back to an automatic restart when recovery is impossible), or
       - **Skip** (ignore a detected recoverable upload and do not resume it).
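
   From the client's perspective, the recover path reduces to "upload only what `init` reports as missing." A hedged sketch of that loop, where `upload_part` is a hypothetical per-part transfer callback and the response shape follows the `init` fields described above, not the actual frontend code:

   ```python
   def recover_upload(init_response: dict, upload_part) -> int:
       """Resume an upload by sending only the part numbers that the
       type=init response reports as missing.

       upload_part: hypothetical callback that transfers one part by its
       1-based part number. Returns how many parts were (re)uploaded.
       """
       for part_number in init_response["missingParts"]:
           upload_part(part_number)
       return len(init_response["missingParts"])
   ```

   A "Skip" choice simply never calls this loop, leaving the server-side session untouched.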
   
   This results in a safer and clearer resumable upload UX:
   - No silent concurrent uploads for the same file.
   - Full visibility into which parts are missing.
   - A deterministic restart path when upload parameters change.
   - Explicit user choice in the UI about what to recover vs skip.
   
   ### Any related issues, documentation, discussions?
   
   
   ### How was this PR tested?
   
   Backend:
   - Added automated tests covering:
     - `type=list`
     - `type=init` concurrency denial (409 on concurrent lock)
     - missing parts reporting (returns all missing parts, sorted)
     - delete-then-restart behavior when `fileSizeBytes` or `partSizeBytes` 
changes
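
   The concurrency-denial case can be illustrated with a toy per-file lock table, approximating `FOR UPDATE NOWAIT` semantics with a non-blocking try-acquire (all names here are hypothetical, not the real test code):

   ```python
   import threading


   class RowLockTable:
       """Approximates per-file DB row locks taken with FOR UPDATE NOWAIT:
       the acquire never blocks, and a failed acquire maps to a 409."""

       def __init__(self):
           self._locks = {}
           self._guard = threading.Lock()  # protects the lock dictionary

       def try_lock(self, path: str) -> bool:
           with self._guard:
               lock = self._locks.setdefault(path, threading.Lock())
           return lock.acquire(blocking=False)  # NOWAIT: fail fast

       def unlock(self, path: str) -> None:
           self._locks[path].release()


   def init_or_409(table: RowLockTable, path: str) -> int:
       """HTTP-style status: 200 if the row lock was taken, 409 otherwise."""
       return 200 if table.try_lock(path) else 409
   ```

   The test then asserts that a second `init` against the same file returns 409 while the first holds the lock, and succeeds again after release.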
   
   Frontend:
   - Manually verified resume dialog behavior:
     - Detected recoverable uploads render in the dialog
     - “Recover” continues uploading only the missing parts and triggers the restart path when parameters differ
     - “Skip” leaves the upload untouched and proceeds without recovering
   
   ### Was this PR authored or co-authored using generative AI tooling?
   
   Yes, this PR was co-authored with ChatGPT.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
