Re: S3 StreamingFileSink issues

2020-10-07 Thread Dan Diephouse
FYI - I discovered that if I specify the Hadoop compression codec explicitly, it works fine, e.g.: CompressWriters.forExtractor(new DefaultExtractor()).withHadoopCompression("GzipCodec"). I haven't dug into exactly why yet.
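
For anyone hitting the same problem, here is a minimal sketch of the configuration Dan describes, wired into a bulk-format StreamingFileSink. The bucket path and the DataStream<String> named "events" are placeholders for illustration, not details from the thread:

    import org.apache.flink.core.fs.Path;
    import org.apache.flink.formats.compress.CompressWriters;
    import org.apache.flink.formats.compress.extractor.DefaultExtractor;
    import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

    // Bulk-format sink that compresses each part file via the Hadoop GzipCodec.
    // "s3://my-bucket/output" and `events` are illustrative placeholders.
    StreamingFileSink<String> sink = StreamingFileSink
            .forBulkFormat(
                    new Path("s3://my-bucket/output"),
                    CompressWriters.forExtractor(new DefaultExtractor<String>())
                            .withHadoopCompression("GzipCodec"))
            .build();

    events.addSink(sink);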

Re: S3 StreamingFileSink issues

2020-10-07 Thread David Anderson
Looping in @Kostas Kloudas, who should be able to clarify things. David

Re: S3 StreamingFileSink issues

2020-10-07 Thread Dan Diephouse
Thanks! Completely missed that in the docs. It's working now; however, it's not working with the compression writers. Someone else noted this issue here: https://stackoverflow.com/questions/62138635/flink-streaming-compression-not-working-using-amazon-aws-s3-connector-streaming Looking at the code,

Re: S3 StreamingFileSink issues

2020-10-07 Thread David Anderson
Dan, The first point you've raised is a known issue: when a job is stopped, the unfinished part files are not transitioned to the finished state. This is mentioned in the docs as Important Note 2 [1], and fixing it is waiting on FLIP-46 [2]. That section of the docs also includes some
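
As background for the note David references: the StreamingFileSink only moves part files from the in-progress/pending state to the finished state on checkpoints, so checkpointing has to be enabled on the execution environment. A minimal sketch; the 60-second interval is just an example value:

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    // Part files are only committed (rolled to the finished state) on checkpoints,
    // so the StreamingFileSink requires checkpointing to be enabled.
    env.enableCheckpointing(60_000); // interval in ms; the value here is illustrative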

S3 StreamingFileSink issues

2020-10-06 Thread Dan Diephouse
First, let me say, Flink is super cool - thanks everyone for making my life easier in a lot of ways! Wish I had this 10 years ago. Onto the fun stuff: I am attempting to use the StreamingFileSink with S3. Note that Flink is embedded in my app, not running as a standalone cluster. I am having
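
For context, a minimal row-format StreamingFileSink writing to S3 looks roughly like the sketch below. The bucket path and the DataStream<String> named "events" are assumptions for illustration; note also that the s3:// scheme needs the flink-s3-fs-hadoop filesystem available as a plugin.

    import org.apache.flink.api.common.serialization.SimpleStringEncoder;
    import org.apache.flink.core.fs.Path;
    import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

    // Row-format sink that writes each record as a UTF-8 line to S3.
    // "s3://my-bucket/output" and `events` are illustrative placeholders.
    StreamingFileSink<String> sink = StreamingFileSink
            .forRowFormat(
                    new Path("s3://my-bucket/output"),
                    new SimpleStringEncoder<String>("UTF-8"))
            .build();

    events.addSink(sink);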