carloea2 opened a new pull request, #5929: URL: https://github.com/apache/texera/pull/5929
### What changes were proposed in this PR?

<img width="784" height="471" alt="image" src="https://github.com/user-attachments/assets/bb2a5b9c-a178-43a9-abf4-7c746bc9e0ba" />

This PR improves dataset upload retry behavior by distinguishing between two cases that currently look similar to users:

| Case | Behavior |
| --- | --- |
| Active multipart upload session exists | Ask whether to resume or restart the upload. |
| Matching file already exists in the dataset | Ask whether to upload again or skip it. |

The backend adds a dataset endpoint that checks candidate upload files against committed and staged dataset files by path and size. The frontend uses that endpoint after checking active multipart sessions, so failed uploads can still be resumed while completed matching files can be skipped from the retry batch.

The user-facing copy intentionally says a file with the same path and size exists, instead of implying byte-for-byte certainty.

### Any related issues, documentation, discussions?

Related to discussion #5744: Improve resumable upload: track completion at the batch/session level.

### How was this PR tested?

Added backend and frontend tests covering:

| Test | Coverage |
| --- | --- |
| `DatasetResourceSpec` | Finds committed and staged files only when path and size match. |
| `DatasetService` spec | Forces one multipart part upload to fail, then verifies retry uploads only the missing part. |
| `FilesUploaderComponent` spec | Verifies a mixed retry batch asks Resume/Restart for a failed multipart file and Upload/Skip for a completed matching file. |

Commands run:

```powershell
sbt "FileService/testOnly org.apache.texera.service.resource.DatasetResourceSpec -- -z findExistingUploadFiles"
yarn ng run gui:test --include=src/app/dashboard/service/user/dataset/dataset.service.spec.ts
yarn ng run gui:test --include=src/app/dashboard/component/user/files-uploader/files-uploader.component.spec.ts
yarn eslint src/app/dashboard/component/user/files-uploader/files-uploader.component.ts src/app/dashboard/component/user/files-uploader/files-uploader.component.spec.ts src/app/dashboard/component/user/files-uploader/conflicting-file-modal-content/conflicting-file-modal-content.component.ts src/app/dashboard/service/user/dataset/dataset.service.ts src/app/dashboard/service/user/dataset/dataset.service.spec.ts
```
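
For readers who want a concrete picture of the retry classification described in this PR, here is a minimal TypeScript sketch. It is not the code in this PR: the `CandidateFile` and `RetryPrompt` types and the `fileKey`, `matchByPathAndSize`, and `classifyRetry` functions are hypothetical names chosen for illustration, and the sketch assumes active multipart sessions can be identified by file path alone. It only shows the ordering the description states: check for an active multipart session first, then fall back to a path-and-size match against existing files.

```typescript
// Hypothetical types for illustration; not taken from the Texera codebase.
interface CandidateFile {
  path: string;
  size: number; // size in bytes
}

type RetryPrompt = "resume-or-restart" | "upload-or-skip" | "none";

// Build a lookup key from size and path. Equal keys mean "same path and
// size", which does not guarantee the bytes are identical.
function fileKey(file: CandidateFile): string {
  return `${file.size}:${file.path}`;
}

// Return the candidates whose path AND size both match an existing
// (committed or staged) file.
function matchByPathAndSize(
  candidates: CandidateFile[],
  existing: CandidateFile[]
): CandidateFile[] {
  const known = new Set(existing.map(fileKey));
  return candidates.filter(candidate => known.has(fileKey(candidate)));
}

// Decide which prompt a retried file should trigger. Active multipart
// sessions are checked first so that a failed upload can still be resumed.
// Assumption: sessions are keyed by path only.
function classifyRetry(
  file: CandidateFile,
  activeSessionPaths: Set<string>,
  existing: CandidateFile[]
): RetryPrompt {
  if (activeSessionPaths.has(file.path)) {
    return "resume-or-restart";
  }
  if (matchByPathAndSize([file], existing).length > 0) {
    return "upload-or-skip";
  }
  return "none";
}
```

In this sketch, a file whose path matches an existing file but whose size differs is classified as `"none"` and uploads without a prompt. That mirrors the `DatasetResourceSpec` coverage, which finds files only when both path and size match.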
