eugenegujing opened a new issue, #5144:
URL: https://github.com/apache/texera/issues/5144
### What happened?
After raising `dataset.single_file_upload_max_size_mib` to 204800 (200 GiB)
via the Admin Settings page so that the frontend size check accepts large
files, uploading a 100+ GB `.h5` file to a dataset fails.
The dataset page shows only the generic toast: `Upload failed. Please retry.`
The browser console shows that the multipart-upload requests for many
individual parts time out at the TCP level (`net::ERR_CONNECTION_TIMED_OUT`),
e.g.:
:4200/api/dataset/multipart-upload?...&partNumber=6
net::ERR_CONNECTION_TIMED_OUT
:4200/api/dataset/multipart-upload?...&partNumber=90
net::ERR_CONNECTION_TIMED_OUT
:4200/api/dataset/multipart-upload?...&partNumber=115
net::ERR_CONNECTION_TIMED_OUT
:4200/api/dataset/multipart-upload?...&partNumber=160
net::ERR_CONNECTION_TIMED_OUT
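To make it easier to see which parts failed, the console excerpt above can be reduced to a list of part numbers. This is only a rough triage sketch: `failed_parts` is a hypothetical helper, not part of Texera, and it assumes each request line is immediately followed by its error line, as in the paste above. The elided query string (`...`) is kept exactly as it appears in the console.

```python
import re

# Console excerpt copied from this report; the query string is elided ("...")
# exactly as in the original paste.
CONSOLE_LOG = """\
:4200/api/dataset/multipart-upload?...&partNumber=6
net::ERR_CONNECTION_TIMED_OUT
:4200/api/dataset/multipart-upload?...&partNumber=90
net::ERR_CONNECTION_TIMED_OUT
:4200/api/dataset/multipart-upload?...&partNumber=115
net::ERR_CONNECTION_TIMED_OUT
:4200/api/dataset/multipart-upload?...&partNumber=160
net::ERR_CONNECTION_TIMED_OUT
"""


def failed_parts(log: str, error: str = "net::ERR_CONNECTION_TIMED_OUT") -> list[int]:
    """Return part numbers whose request line is directly followed by `error`.

    Hypothetical helper: assumes the request/error pairing seen in the paste.
    """
    lines = [line.strip() for line in log.splitlines() if line.strip()]
    parts = []
    # Walk over consecutive (request, status) line pairs.
    for request, status in zip(lines, lines[1:]):
        match = re.search(r"[?&]partNumber=(\d+)", request)
        if match and status == error:
            parts.append(int(match.group(1)))
    return parts


print(failed_parts(CONSOLE_LOG))  # [6, 90, 115, 160]
```

In this excerpt the failed part numbers are spread across the upload rather than consecutive.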
### How to reproduce?
1. Run Texera locally (frontend on :4200, file-service backend).
2. As an admin, open Admin → Settings, and set:
   - Max File Size (MiB): 204800 (i.e. 200 GiB)
   - Part Size (MiB): 50 (default)

   Save.
3. Create a new dataset, open the dataset page, and click "Browse & Upload
   Files" (or drag and drop).
4. Select a single ≥100 GB file (in my case, a 100+ GB `.h5` file).
5. Observe:
   - Browser DevTools → Network shows many requests to
     `/api/dataset/multipart-upload?...&partNumber=N` failing with
     `net::ERR_CONNECTION_TIMED_OUT`.
   - The dataset page shows only the toast: `Upload failed. Please retry.`
   - No new version is created.
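For a sense of scale, the settings in step 2 imply the following part counts. This is a back-of-the-envelope sketch, not Texera code: `part_count` is a hypothetical helper, the exact file size is not stated in this report (100 GiB and 100 GB are used purely for illustration), and it assumes every part except possibly the last is exactly the configured part size.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def part_count(file_size_bytes: int, part_size_mib: int = 50) -> int:
    """Number of parts needed to cover the file (ceiling division).

    Hypothetical helper; assumes fixed-size parts with a shorter final part.
    """
    part_size_bytes = part_size_mib * MIB
    return -(-file_size_bytes // part_size_bytes)


# The configured limit of 204800 MiB is exactly 200 GiB.
assert 204800 * MIB == 200 * GIB

print(part_count(100 * GIB))      # 2048 parts for an (assumed) 100 GiB file
print(part_count(100 * 10 ** 9))  # 1908 parts for an (assumed) 100 GB file
print(part_count(200 * GIB))      # 4096 parts at the configured maximum
```

Under these assumptions, a file of this size is split into roughly two thousand parts, and the failing part numbers shown in the console (6, 90, 115, 160) all fall within that range.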
### Branch
main
### Commit Hash (Optional)
_No response_
### What browsers are you seeing the problem on?
_No response_
### Relevant log output
```shell
```