Re: Flink Kafka to BucketingSink to S3 - S3Exception

2018-11-07 Thread Flink Developer
Thank you Addison and Ravi for the detailed info. Hi Addison, it sounds like StreamingFileSink is promising and will be available in Flink 1.7. From the mailing list, it looks like Flink 1.7 RC is now available for use. Some questions for you... in your use case, is your source Kafka and is the

Re: Flink Kafka to BucketingSink to S3 - S3Exception

2018-11-05 Thread Ravi Bhushan Ratnakar
Hi there, some questions: 1. Is this using Flink 1.6.2 with dependencies (flink-s3-fs-hadoop, flink-statebackend-rocksdb, hadoop-common, hadoop-aws, hadoop-hdfs, hadoop-common) ? If so, could you please share your dependency versioning? [Ravi]- I am using Aws Emr 5.18 which supports Fli

Re: Flink Kafka to BucketingSink to S3 - S3Exception

2018-11-05 Thread Addison Higham
Hi there, This is going to be a bit of a long post, but I think there has been a lot of confusion around S3, so I am going to go over everything I know in hopes that helps. As mentioned by Rafi, The BucketingSink does not work for file systems like S3, as the bucketing sink makes some assumptions

Re: Flink Kafka to BucketingSink to S3 - S3Exception

2018-11-04 Thread Flink Developer
Hi Ravi, some questions: - Is this using Flink 1.6.2 with dependencies (flink-s3-fs-hadoop, flink-statebackend-rocksdb, hadoop-common, hadoop-aws, hadoop-hdfs, hadoop-common) ? If so, could you please share your dependency versioning? - Does this use a kafka source with high flink parallelism (~

Re: Flink Kafka to BucketingSink to S3 - S3Exception

2018-11-03 Thread Ravi Bhushan Ratnakar
I have done little changes in BucketingSink and implemented as new CustomBucketingSink to use in my project which works fine with s3 and s3a protocol. This implementation doesn't require xml file configuration, rather than it uses configuration provided using flink configuration object by calling

Re: Flink Kafka to BucketingSink to S3 - S3Exception

2018-11-03 Thread Flink Developer
It seems the issue also appears when using Flink version 1.6.2 . ‐‐‐ Original Message ‐‐‐ On Tuesday, October 30, 2018 10:26 PM, Flink Developer wrote: > Hi, thanks for the info Rafi, that seems to be related. I hope Flink version > 1.6.2 fixes this. Has anyone encountered this before?

Re: Flink Kafka to BucketingSink to S3 - S3Exception

2018-10-30 Thread Flink Developer
Hi, thanks for the info Rafi, that seems to be related. I hope Flink version 1.6.2 fixes this. Has anyone encountered this before? I would also like to note that my jar includes a core-site.xml file that uses *s3a*. Is this the recommended configuration to use with BucketingSink? Should the

Re: Flink Kafka to BucketingSink to S3 - S3Exception

2018-10-28 Thread Rafi Aroch
Hi, I'm also experiencing this with Flink 1.5.2. This is probably related to BucketingSink not working properly with S3 as filesystem because of the eventual-consistency of S3. I see that https://issues.apache.org/jira/browse/FLINK-9752 will be part of 1.6.2 release. It might help, if you use the

Flink Kafka to BucketingSink to S3 - S3Exception

2018-10-27 Thread Flink Developer
Hi, I'm running a scala flink app in an AWS EMR cluster (emr 5.17, hadoop 2.8.4) with flink parallelization set to 400. The source is a Kafka topic and sinks to S3 in the format of: s3:/. There's potentially 400 files writing simultaneously. Configuration: - Flink v1.5.2 - Checkpointing en