[ 
https://issues.apache.org/jira/browse/FLINK-8794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16414272#comment-16414272
 ] 

Steve Loughran commented on FLINK-8794:
---------------------------------------

{quote} writing to local disks would decrease performance, since you would need 
to write the same data twice (first locally, then copy remotely)
{quote}
 I don't know which FS connector you are using, but these days S3A defaults to 
buffering blocks to local disk before initiating the upload, either in close() 
or once the block size threshold is reached. You won't see a performance hit if 
you are writing files smaller than fs.s3a.blocksize. For bigger files you will, 
I'm afraid, but it may be worth it. The staging S3A committers coming in Hadoop 
3.1 postpone all uploads until task commit, but in exchange they gain better 
failure semantics, and job commit is fast and trivial.
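
To make the buffering behaviour above concrete, here is a toy Python model of when uploads would start for a file of a given size. It is only a sketch under simplifying assumptions, not S3A code: the function name `upload_events`, the trigger labels and the byte sizes are all made up for illustration, and real S3A block handling has more cases than this.

```python
from typing import List, Tuple


def upload_events(file_size: int, block_size: int) -> List[Tuple[str, int]]:
    """Return (trigger, bytes) pairs in the order uploads would start.

    Hypothetical model: data is buffered locally, a block is uploaded as
    soon as it fills up, and whatever is left is uploaded in close().
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")

    full_blocks, remainder = divmod(file_size, block_size)

    # Every completely filled block is uploaded while the stream is still open.
    events = [("block-full", block_size) for _ in range(full_blocks)]

    # Leftover bytes (or an empty file) are only uploaded when close() is called.
    if remainder or not full_blocks:
        events.append(("close", remainder))
    return events


if __name__ == "__main__":
    # A file smaller than the block size: a single upload, started in close().
    print(upload_events(10, 64))
    # A larger file: uploads start while writing, the tail goes out in close().
    print(upload_events(150, 64))
```

In this model a file below the block size is written locally once and uploaded once at close(), which is why no extra cost is expected there; only larger files trigger uploads while data is still being written.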

> When using BucketingSink, it happens that one of the files is always in the 
> [.in-progress] state
> ------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-8794
>                 URL: https://issues.apache.org/jira/browse/FLINK-8794
>             Project: Flink
>          Issue Type: Improvement
>          Components: filesystem-connector
>    Affects Versions: 1.4.0, 1.4.1
>            Reporter: yanxiaobin
>            Priority: Major
>
> When using BucketingSink, one of the files always remains in the 
> [.in-progress] state, and this state never changes afterwards. S3 is used as 
> the underlying storage.
>  
> 2018-02-28 11:58:42  147341619 {color:#d04437}_part-28-0.in-progress{color}
> 2018-02-28 12:06:27  147315059 part-0-0
> 2018-02-28 12:06:27  147462359 part-1-0
> 2018-02-28 12:06:27  147316006 part-10-0
> 2018-02-28 12:06:28  147349854 part-100-0
> 2018-02-28 12:06:27  147421625 part-101-0
> 2018-02-28 12:06:27  147443830 part-102-0
> 2018-02-28 12:06:27  147372801 part-103-0
> 2018-02-28 12:06:27  147343670 part-104-0
> ......



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
