carloea2 opened a new pull request, #4181:
URL: https://github.com/apache/texera/pull/4181
### What changes were proposed in this PR?
This PR adds an end-to-end (backend + frontend) **recoverable multipart upload**
experience for datasets by tightening concurrency rules, adding recoverability,
and supporting a clean restart path when the upload configuration changes.
Backend changes:
- Added a new multipart operation type: **`type=list`**
  - Lists active multipart upload file paths (within the physical address
    expiration window) so clients can discover resumable uploads.
- Updated **`type=init`** to deny concurrent uploads for the same file
  - Uses DB row locking (`FOR UPDATE NOWAIT`) to fail fast with **409
    CONFLICT** if another client is currently uploading/initializing parts for
    the same file.
- Updated `init` response to return **all missing parts**, plus
  `completedPartsCount`
  - Enables clients to recover by uploading only the missing part numbers,
    without trial and error.
- Added a **delete-then-restart flow** for when the requested `fileSizeBytes` /
  `partSizeBytes` differs from an existing session
  - If an existing session is found but its config does not match the incoming
    request:
    - Delete the DB upload session (and part rows via cascade),
    - Abort the previous lakeFS multipart upload,
    - Start a fresh session with the new parameters.
- Added/updated tests covering:
  - `type=list` behavior
  - init concurrency denial (conflict when rows are locked)
  - restart behavior when `fileSizeBytes` or `partSizeBytes` changes
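
The `init` rules listed above can be modelled with a small in-memory sketch. This is not the code from this PR: the `UploadSession` class, the `locked` flag (standing in for a row lock held by another client), the `missingParts` and `restarted` response keys, and 1-based part numbering are all assumptions made for illustration. Only `fileSizeBytes`, `partSizeBytes`, `completedPartsCount`, and the 409 status come from the description.

```python
from dataclasses import dataclass, field


@dataclass
class UploadSession:
    """Hypothetical stand-in for one DB upload-session row."""
    file_size_bytes: int
    part_size_bytes: int
    completed_parts: set = field(default_factory=set)
    locked: bool = False  # models a row lock held by another client


def total_parts(file_size_bytes: int, part_size_bytes: int) -> int:
    # Integer ceiling division; the last part may be shorter than the rest.
    return (file_size_bytes + part_size_bytes - 1) // part_size_bytes


def init_upload(sessions: dict, path: str,
                file_size_bytes: int, part_size_bytes: int) -> dict:
    existing = sessions.get(path)

    # 1. Fail fast if another client holds the session (models NOWAIT -> 409).
    if existing is not None and existing.locked:
        return {"status": 409, "error": "upload already in progress"}

    # 2. Config mismatch: drop the old session and start over. In the real
    #    flow this is also where the old lakeFS multipart upload is aborted.
    restarted = False
    if existing is not None and (
        (existing.file_size_bytes, existing.part_size_bytes)
        != (file_size_bytes, part_size_bytes)
    ):
        del sessions[path]
        existing = None
        restarted = True

    # 3. Create a fresh session if none remains.
    if existing is None:
        existing = UploadSession(file_size_bytes, part_size_bytes)
        sessions[path] = existing

    # 4. Report every missing part number, sorted ascending.
    n = total_parts(file_size_bytes, part_size_bytes)
    missing = [p for p in range(1, n + 1) if p not in existing.completed_parts]
    return {
        "status": 200,
        "missingParts": missing,
        "completedPartsCount": len(existing.completed_parts),
        "restarted": restarted,
    }
```

For a 10-byte file with 4-byte parts where parts 1 and 3 are already uploaded, this sketch reports part 2 as the only missing part; repeating the call with a 5-byte part size discards the old session and reports both parts of the new layout as missing.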
Frontend changes:
- Added a **Recover confirmation dialog** for multipart uploads
  - When resumable uploads are detected, the dialog lets the user choose
    which items to:
    - **Recover** (continue by uploading the missing parts; if that is not
      possible, restart automatically),
    - **Skip** (ignore a detected recoverable upload and do not resume it).
This results in a safer and clearer resumable upload UX:
- No silent concurrent uploads for the same file.
- Full visibility into which parts are missing.
- A deterministic restart path when upload parameters change.
- Explicit user choice in the UI about what to recover vs skip.
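
To make "uploading only missing parts" concrete, here is a minimal sketch of how a client could turn the missing part numbers from `init` into byte ranges to read from the local file. The function name `missing_part_ranges`, the 1-based part numbering, and the half-open `(start, end)` convention are assumptions for illustration; the actual frontend is not shown in this description and may slice files differently.

```python
def missing_part_ranges(file_size_bytes: int, part_size_bytes: int,
                        missing_parts: list) -> list:
    """Return (part_number, start, end) tuples for each missing part.

    `end` is exclusive, and the final part is clipped to the file size.
    """
    ranges = []
    for part in sorted(missing_parts):
        start = (part - 1) * part_size_bytes
        if part < 1 or start >= file_size_bytes:
            raise ValueError(f"part {part} is outside the file")
        end = min(start + part_size_bytes, file_size_bytes)
        ranges.append((part, start, end))
    return ranges
```

A recovering client would read and upload only these ranges, leaving already completed parts untouched.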
### Any related issues, documentation, discussions?
### How was this PR tested?
Backend:
- Added automated tests covering:
  - `type=list`
  - `type=init` concurrency denial (409 on concurrent lock)
  - missing parts reporting (returns all missing parts, sorted)
  - delete-then-restart behavior when `fileSizeBytes` or `partSizeBytes`
    changes
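
As an illustration of the `type=list` behavior being tested (not the tests added in this PR), the expiration-window filter can be sketched as below. It assumes each session carries a creation timestamp, that the window is measured from that timestamp, and that paths are returned sorted; `list_resumable_paths` and its parameters are hypothetical names.

```python
from datetime import datetime, timedelta, timezone


def list_resumable_paths(sessions, now: datetime, window: timedelta) -> list:
    """Return file paths of sessions still inside the expiration window.

    `sessions` is an iterable of (file_path, created_at) pairs.
    """
    cutoff = now - window
    return sorted(path for path, created_at in sessions if created_at > cutoff)


if __name__ == "__main__":
    now = datetime(2025, 1, 2, tzinfo=timezone.utc)
    sessions = [
        ("b.csv", now - timedelta(hours=1)),
        ("a.csv", now - timedelta(hours=2)),
        ("old.csv", now - timedelta(hours=7)),  # outside a 6-hour window
    ]
    print(list_resumable_paths(sessions, now, timedelta(hours=6)))
```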
Frontend:
- Manually verified resume dialog behavior:
  - Detected recoverable uploads render in the dialog
  - “Recover” continues uploading only missing parts and triggers the
    restart path when parameters differ
  - “Skip” leaves the upload untouched and proceeds without recovering
### Was this PR authored or co-authored using generative AI tooling?
ChatGPT co-authored