anoopj opened a new issue, #16062:
URL: https://github.com/apache/iceberg/issues/16062
### Apache Iceberg version
1.10.1 (latest release)
### Query engine
None
### Please describe the bug 🐞
### Description
#16055 identified and fixed an EOF handling bug in `GCSInputStream`. The same
bug appears to exist in the other cloud storage `InputStream` implementations
listed below.
**Impact:** The single-byte `read()` bug can cause an infinite loop for
callers that read until EOF. The buffered read bug corrupts position tracking
and metrics. In practice, Iceberg typically reads files using range reads at
known offsets rather than sequential reads to EOF, so this is unlikely to be
hit in the hot path.
#16055 also calls out a metrics bug in the buffered reads.
Affected implementations:
- [`S3InputStream#read()`](https://github.com/apache/iceberg/blob/f984c28b215c56846c632ccc5a368f4f4afe0b5d/aws/src/main/java/org/apache/iceberg/aws/s3/S3InputStream.java#L124-L125)
- [`ADLSInputStream#read()`](https://github.com/apache/iceberg/blob/f984c28b215c56846c632ccc5a368f4f4afe0b5d/azure/src/main/java/org/apache/iceberg/azure/adlsv2/ADLSInputStream.java#L117-L118)
- `OSSInputStream` (aliyun)
- `EcsSeekableInputStream` (dell)
GCS fix from @vladislav-sidorovich:
https://github.com/apache/iceberg/pull/16055
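For illustration, here is a minimal sketch of the kind of EOF guard the description implies. `PositionTrackingStream` is a hypothetical wrapper, not an actual Iceberg class, and the field names are made up. It assumes the fix amounts to advancing the position and counters only when the underlying stream returns data, and passing `-1` through unchanged; the actual change in #16055 may differ in detail.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper (not an Iceberg class): tracks a position and a
// bytes-read counter the way a cloud storage input stream might.
class PositionTrackingStream extends InputStream {
  private final InputStream delegate;
  private long pos = 0;        // logical position in the object
  private long bytesRead = 0;  // stand-in for a metrics counter

  PositionTrackingStream(InputStream delegate) {
    this.delegate = delegate;
  }

  @Override
  public int read() throws IOException {
    int value = delegate.read();
    if (value != -1) {
      // Advance only when a byte was actually returned.
      pos += 1;
      bytesRead += 1;
    }
    // -1 is passed through so callers that loop until EOF terminate.
    return value;
  }

  @Override
  public int read(byte[] b, int off, int len) throws IOException {
    int n = delegate.read(b, off, len);
    if (n > 0) {
      // Skip the update on -1 (EOF) so pos and the counter never move backwards.
      pos += n;
      bytesRead += n;
    }
    return n;
  }

  long getPos() {
    return pos;
  }

  long getBytesRead() {
    return bytesRead;
  }
}
```

With a three-byte `ByteArrayInputStream` as the delegate, reading past the end leaves both `getPos()` and `getBytesRead()` at 3, and every further `read()` call returns `-1`.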
### Willingness to contribute
- [ ] I can contribute a fix for this bug independently
- [x] I would be willing to contribute a fix for this bug with guidance from
the Iceberg community
- [ ] I cannot contribute a fix for this bug at this time
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]