[
https://issues.apache.org/jira/browse/FLINK-39533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gabor Somogyi updated FLINK-39533:
----------------------------------
Fix Version/s: 2.3.0
> Use abort() instead of drain on close/seek when remaining bytes exceed
> threshold in NativeS3InputStream
> -------------------------------------------------------------------------------------------------------
>
> Key: FLINK-39533
> URL: https://issues.apache.org/jira/browse/FLINK-39533
> Project: Flink
> Issue Type: Bug
> Components: Connectors / FileSystem
> Affects Versions: 2.3.0
> Reporter: Samrat Deb
> Assignee: Samrat Deb
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.3.0, 2.4.0
>
>
> NativeS3InputStream currently calls close() on the underlying
> ResponseInputStream during seek(), skip(), and close() operations. Apache
> HttpClient's close() implementation drains all remaining bytes from the
> response body to enable HTTP connection reuse.
> For large S3 objects of which only a small portion has been read (for
> example, a large state file, or a seek within a columnar format), this
> drain loop reads and discards potentially gigabytes of data over the
> network, causing severe latency during seek/close operations.
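
A minimal sketch of the abort-versus-drain decision named in the issue title, not the actual patch. FakeResponseStream is a hypothetical stand-in for the SDK's ResponseInputStream (which does expose abort()), and DRAIN_THRESHOLD_BYTES, release(), and the 1024-byte value are assumptions made for illustration; the issue does not state the real threshold or where the check lives.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;

class AbortOrDrainSketch {

    /** Hypothetical stand-in for the SDK response stream; only close() and abort() matter here. */
    static class FakeResponseStream extends InputStream {
        private final ByteArrayInputStream body;
        boolean aborted = false;
        long drainedOnClose = 0;

        FakeResponseStream(byte[] data) {
            this.body = new ByteArrayInputStream(data);
        }

        @Override
        public int read() {
            return body.read();
        }

        /** Mimics a draining close: consumes the rest of the body so the connection could be reused. */
        @Override
        public void close() {
            while (body.read() != -1) {
                drainedOnClose++;
            }
        }

        /** Mimics abort: gives up the connection without reading the remaining bytes. */
        void abort() {
            aborted = true;
        }
    }

    /** Assumed value for illustration only; the issue does not give the real threshold. */
    static final long DRAIN_THRESHOLD_BYTES = 1024;

    /** Releases the stream: abort when many bytes remain, otherwise close (and drain). */
    static void release(FakeResponseStream stream, long contentLength, long position) {
        long remaining = contentLength - position;
        if (remaining > DRAIN_THRESHOLD_BYTES) {
            stream.abort();   // cheaper to drop the connection than to download the rest
        } else {
            stream.close();   // few bytes left: draining keeps the connection reusable
        }
    }

    public static void main(String[] args) {
        FakeResponseStream big = new FakeResponseStream(new byte[8192]);
        big.read();
        release(big, 8192, 1);
        System.out.println("large remainder: aborted=" + big.aborted + ", drained=" + big.drainedOnClose);

        FakeResponseStream small = new FakeResponseStream(new byte[100]);
        small.read();
        release(small, 100, 1);
        System.out.println("small remainder: aborted=" + small.aborted + ", drained=" + small.drainedOnClose);
    }
}
```

The trade-off is that an aborted connection cannot be returned to the pool, so a threshold lets short remainders still be drained for reuse while long ones are dropped.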