[ 
https://issues.apache.org/jira/browse/FLINK-35150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

qyw updated FLINK-35150:
------------------------
    Description: 
I write to S3 in CSV format using the Flink S3 Hadoop filesystem, with the patch 
from FLINK-28513 applied. I don't understand why the "sync" method of 
S3RecoverableFsDataOutputStream performs a "completeMultipartUpload" operation. 
If the multipart upload is completed there, a later call to close that uploads 
the rest of the stream will inevitably fail, because the parts belonging to that 
upload ID have already been merged.

Therefore, when the CSV data is larger than "S3_MULTIPART_MIN_PART_SIZE", an 
uploadPart is started when switching to a new file. When BulkPartWriter then 
performs closeForCommit, the "sync" method of S3RecoverableFsDataOutputStream 
has already called completeMultipartUpload, so the "closeForCommit" method of 
S3RecoverableFsDataOutputStream fails when it attempts its own uploadPart.
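
The call order described above can be reproduced with a small self-contained 
model. This is only a sketch, not Flink or AWS SDK code: ToyMultipartUpload, 
ToyStream and the syncCompletesUpload flag are made-up names for illustration, 
and the model assumes that an upload ID accepts no further parts once it has 
been completed. The second run is there only for contrast, not as a proposed fix.

```java
/** Toy model of the reported call order; not Flink or AWS SDK code. */
final class MultipartSyncSketch {

    /** Hypothetical in-memory stand-in for a single S3 multipart upload. */
    static final class ToyMultipartUpload {
        private final StringBuilder merged = new StringBuilder();
        private boolean completed;

        /** Accepts a part, unless the upload has already been completed. */
        void uploadPart(String data) {
            if (completed) {
                // Assumption: a completed upload ID no longer accepts parts.
                throw new IllegalStateException(
                        "The specified upload does not exist. The upload ID may be invalid");
            }
            merged.append(data);
        }

        /** Merges the parts; later uploadPart calls on this upload will fail. */
        String complete() {
            completed = true;
            return merged.toString();
        }
    }

    /** Hypothetical stream that buffers writes and uploads them as parts. */
    static final class ToyStream {
        private final ToyMultipartUpload upload = new ToyMultipartUpload();
        private final StringBuilder buffer = new StringBuilder();
        private final boolean syncCompletesUpload;

        ToyStream(boolean syncCompletesUpload) {
            this.syncCompletesUpload = syncCompletesUpload;
        }

        void write(String data) {
            buffer.append(data);
        }

        /** Uploads buffered data; optionally completes the upload as reported. */
        void sync() {
            uploadBufferedPart();
            if (syncCompletesUpload) {
                upload.complete();
            }
        }

        /** Uploads what is left in the buffer, then completes the upload. */
        String closeForCommit() {
            uploadBufferedPart();
            return upload.complete();
        }

        private void uploadBufferedPart() {
            upload.uploadPart(buffer.toString());
            buffer.setLength(0);
        }
    }

    /** Runs write, sync, closeForCommit and reports the outcome as text. */
    static String run(boolean syncCompletesUpload) {
        ToyStream out = new ToyStream(syncCompletesUpload);
        out.write("a,b\n");
        out.sync();
        try {
            return "committed: " + out.closeForCommit().length() + " chars";
        } catch (IllegalStateException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(true));
        System.out.println(run(false));
    }
}
```

In this model, run(true) ends with "failed: The specified upload does not 
exist. The upload ID may be invalid", while run(false) ends with 
"committed: 4 chars".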

 

BulkPartWriter:

!image-2024-04-18-11-03-08-998.png!

CsvBulkWriter:

!image-2024-04-18-11-20-25-583.png!

S3RecoverableFsDataOutputStream:

!image-2024-04-18-10-51-05-071.png!

  was:
I write to S3 in CSV format using the Flink S3 Hadoop filesystem, with the patch 
from FLINK-28513 applied. I don't understand why the "sync" method of 
S3RecoverableFsDataOutputStream performs a "completeMultipartUpload" operation. 
If the multipart upload is completed there, a later call to close that uploads 
the rest of the stream will inevitably fail, because the parts belonging to that 
upload ID have already been merged.

Therefore, when the CSV data is larger than "S3_MULTIPART_MIN_PART_SIZE", an 
uploadPart is started when switching to a new file. When BulkPartWriter then 
performs closeForCommit, the "sync" method of S3RecoverableFsDataOutputStream 
has already called completeMultipartUpload, so the "closeForCommit" method of 
S3RecoverableFsDataOutputStream fails when it attempts its own uploadPart.

 

BulkPartWriter:

!image-2024-04-18-11-03-08-998.png!

CsvBulkWriter:

!image-2024-04-18-11-20-43-126.png!

S3RecoverableFsDataOutputStream:

!image-2024-04-18-10-51-05-071.png!

> The specified upload does not exist. The upload ID may be invalid
> -----------------------------------------------------------------
>
>                 Key: FLINK-35150
>                 URL: https://issues.apache.org/jira/browse/FLINK-35150
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / FileSystem
>    Affects Versions: 1.15.0
>            Reporter: qyw
>            Priority: Major
>         Attachments: image-2024-04-18-10-51-05-071.png, 
> image-2024-04-18-11-03-08-998.png, image-2024-04-18-11-20-25-583.png
>
>
> I write to S3 in CSV format using the Flink S3 Hadoop filesystem, with the 
> patch from FLINK-28513 applied. I don't understand why the "sync" method of 
> S3RecoverableFsDataOutputStream performs a "completeMultipartUpload" 
> operation. If the multipart upload is completed there, a later call to close 
> that uploads the rest of the stream will inevitably fail, because the parts 
> belonging to that upload ID have already been merged.
>
> Therefore, when the CSV data is larger than "S3_MULTIPART_MIN_PART_SIZE", an 
> uploadPart is started when switching to a new file. When BulkPartWriter then 
> performs closeForCommit, the "sync" method of S3RecoverableFsDataOutputStream 
> has already called completeMultipartUpload, so the "closeForCommit" method of 
> S3RecoverableFsDataOutputStream fails when it attempts its own uploadPart.
>  
> BulkPartWriter:
> !image-2024-04-18-11-03-08-998.png!
> CsvBulkWriter:
> !image-2024-04-18-11-20-25-583.png!
> S3RecoverableFsDataOutputStream:
> !image-2024-04-18-10-51-05-071.png!
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
