[ 
https://issues.apache.org/jira/browse/FLINK-39484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

macdoor615 closed FLINK-39484.
------------------------------
    Resolution: Won't Fix

> NativeS3InputStream: abort unfinished GetObject responses on seek/reopen to 
> avoid ConnectionClosedException on S3-compatible storage
> ------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-39484
>                 URL: https://issues.apache.org/jira/browse/FLINK-39484
>             Project: Flink
>          Issue Type: Bug
>          Components: FileSystems
>    Affects Versions: 2.3.0
>         Environment: * Apache Flink: 2.3.0 / paimon 1.3.1
> * Storage: MinIO 
> * Workload: Paimon catalog reading large Parquet files; high parallelism 
> (many concurrent readers)
> * JVM: Java 21 
> * Relevant Flink S3 settings (examples): s3.connection.timeout, 
> s3.socket.timeout, s3.read.buffer.size, s3.retry.max-num-retries
>            Reporter: macdoor615
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 2.3.0
>
>         Attachments: flink-dict-taskexecutor-0-cmtt-dict-57.log
>
>
> h2. Problem
> When reading large Parquet files (e.g. via Paimon) from S3-compatible 
> endpoints such as MinIO using the {{flink-s3-fs-native}} plugin, tasks can 
> fail with:
>  
> {code:java}
> Caused by: 
> org.apache.flink.fs.s3native.shaded.org.apache.http.ConnectionClosedException:
>  Premature end of Content-Length delimited message body (expected: 
> 41,228,206; received: 15,528,875)
> {code}
> The stack trace points to closing the buffered stream inside\{{ 
> NativeS3InputStream.close()}} while Apache HttpClient tries to drain the 
> remainder of the HTTP response body.
>  
> Parquet access patterns use\{{ seek()}} on {{{}FSDataInputStream{}}}. 
> {{NativeS3InputStream}} issues ranged GETs and replaces the underlying 
> {{GetObject}} stream. The previous implementation closed the old stream 
> without consuming the full response body. For AWS SDK v2 + Apache HTTP 
> client, {{close()}} attempts to read the remaining bytes to reuse the 
> connection. If the server or network has already terminated the connection, 
> draining fails with the exception above.
> h2. Expected behavior
> Abandoning a partially read {{GetObject}} body (after seek/reopen or early 
> close) should use {{ResponseInputStream.abort()}} per AWS SDK guidance, 
> instead of relying on {{close()}} to drain the body.
> h2. Suggested fix
>  * On reopen {{(openStreamAtCurrentPosition) }}and on{{ close()}} when 
> {{{}position < contentLength{}}}, call {{abort()}} on the  
> {{ResponseInputStream}}  before discarding wrappers.
>  * Optionally document that {{abort()}} may reduce connection reuse compared 
> to fully draining after a complete sequential read.
> h1. Notes
> This is independent of tuning {{{}s3.socket.timeout / 
> s3.connection.timeout{}}}; the failure occurs during response teardown, not 
> only slow reads.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to