Re: Location of flink-s3-fs-hadoop plugin (Flink 1.9.0 on EMR)

2020-01-24 Thread Senthil Kumar
Thanks, here’s the debug output. It looks like we need to set up an hdfs-config file in the Flink config. Could you advise us further? -- 2020-01-23 22:07:44,014 DEBUG org.apache.flink.core.fs.FileSystem - Loading extension file systems via services 2020-01-23 22:07:44,0

Re: FileStreamingSink is using the same counter for different files

2020-01-24 Thread Pawel Bartoszek
I have looked into the source code, and it looks like the same counter value being used in two buckets is correct. Each Bucket class https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/sink/filesystem/Bucket.java is pa

Re: Location of flink-s3-fs-hadoop plugin (Flink 1.9.0 on EMR)

2020-01-24 Thread Aaron Langford
This seems to confirm that the S3 file system implementation is not being loaded when you start your job. Can you share the details of how you are getting the flink-s3-fs-hadoop artifact onto your cluster? Are you simply ssh-ing to the master node and doing this manually? Are you doing this via a

Is there anything strictly special about sink functions?

2020-01-24 Thread Andrew Roberts
Hello, I’m trying to push some behavior that we’ve currently got in a large, stateful SinkFunction implementation into Flink’s windowing system. The task at hand is similar to what StreamingFileSink provides, but more flexible. I don’t want to re-implement that sink, because it uses the Stream

1.9.2 Release Date?

2020-01-24 Thread Hailu, Andreas
Hi, Do we have any thoughts on a release date for 1.9.2? I've been eyeing FLINK-13184 particularly to help alleviate stress on our RM + Name Node and reduce noise/delays due to sporadic Task Manager timeouts. We submit thousands of jobs per hou

Re: FileStreamingSink is using the same counter for different files

2020-01-24 Thread Kostas Kloudas
Hi Pawel, You are correct that counters are unique within the same bucket but NOT across buckets. Across buckets, you may see the same counter being used. The max counter is used only upon restoring from a failure, resuming from a savepoint or rescaling and this is done to guarantee that n valid d
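
The counter behaviour described above can be illustrated with a toy model. This is only a sketch, not Flink's implementation: the class `BucketCounters`, its methods, the `restore` helper and the `part-<subtask>-<counter>` naming used here are hypothetical and chosen for illustration. It assumes each bucket numbers its own part files, and that after a restore every bucket starts from the maximum counter seen before the snapshot.

```python
class BucketCounters:
    """Toy model of per-bucket part counters (hypothetical, not Flink code)."""

    def __init__(self, initial_counter=0):
        # Counter value that a bucket starts from the first time it is used.
        self._initial = initial_counter
        # Next counter to hand out, tracked separately for each bucket.
        self._next = {}

    def next_part_name(self, bucket_id, subtask=0):
        # Counters are unique within a bucket, but two buckets may
        # hand out the same value.
        counter = self._next.get(bucket_id, self._initial)
        self._next[bucket_id] = counter + 1
        return f"{bucket_id}/part-{subtask}-{counter}"

    def max_counter(self):
        # Largest "next" counter across all buckets; this is the value
        # the sketch stores in a snapshot.
        return max(self._next.values(), default=self._initial)


def restore(snapshot_max_counter):
    # Assumption: after a restore, all buckets resume from the max counter.
    return BucketCounters(initial_counter=snapshot_max_counter)


counters = BucketCounters()
first_a = counters.next_part_name("bucket-a")
first_b = counters.next_part_name("bucket-b")
second_a = counters.next_part_name("bucket-a")

restored = restore(counters.max_counter())
after_restore_b = restored.next_part_name("bucket-b")
```

In this sketch `first_a` and `first_b` share counter 0 because they belong to different buckets, while `after_restore_b` starts from the snapshot's max counter rather than from the last value `bucket-b` used.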

Re: 1.9.2 Release Date?

2020-01-24 Thread Arvid Heise
Hi Andreas, voting for 1.9.2-rc1 started 9h before you wrote your email. [1] If no one finds a bug or raises other concerns, 1.9.2 should be available next week. We are always happy about feedback. So if you have the option to test that rc1, please do. [1] http://apache-flink-mailing-list-archive.

Re: batch job OOM

2020-01-24 Thread Bowen Li
Hi Fanbin, You can install your own Flink build in AWS EMR, and it frees you from EMR’s release cycles. On Thu, Jan 23, 2020 at 03:36 Jingsong Li wrote: > Fanbin, > > I have no idea now, can you create a JIRA to track it? You can describe > complete SQL and some data information. > > Best, > J