Re: Timestamp/watermark support in Kinesis consumer

2018-02-12 Thread Eron Wright
It is valuable to consider the behavior of a consumer in both a real-time processing context, which consists mostly of tail reads, and a historical processing context, where there's an abundance of backlogged data. In the historical processing context, system internals (e.g. shard selection

Re: Timestamp/watermark support in Kinesis consumer

2018-02-12 Thread Thomas Weise
I don't think there is a generic solution to the problem you are describing; we don't know how long it will take for resharding to take effect and those changes to become visible to the connector. Depending on how latency sensitive the pipeline is, possibly a configurable watermark hold period

Re: Timestamp/watermark support in Kinesis consumer

2018-02-12 Thread Eron Wright
I'd like to know how you envision dealing with resharding in relation to the watermark state. Imagine that a given shard S1 has a watermark of T1, and is then split into two shards S2 and S3. The new shards are assigned to subtasks according to a hash function. The current watermarks of those

Re: Timestamp/watermark support in Kinesis consumer

2018-02-12 Thread Thomas Weise
Based on my draft implementation, the changes that are needed in the Flink connector are as follows: I need to be able to override the following to track last record timestamp and idle time per shard. protected final void emitRecordAndUpdateState(T record, long recordTimestamp, int
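
For readers following the thread, here is a minimal sketch of what tracking the last record timestamp and idle time per shard could look like. It is not the draft implementation mentioned above and does not use the connector's API: the class ShardWatermarkTracker, its methods, the integer shard index used as a key, and the idle timeout are all hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the Flink Kinesis connector. */
public class ShardWatermarkTracker {
    // Highest record timestamp seen so far, keyed by an assumed integer shard index.
    private final Map<Integer, Long> lastRecordTimestamp = new HashMap<>();
    // Wall-clock time at which each shard last produced a record.
    private final Map<Integer, Long> lastActivityTime = new HashMap<>();
    // Assumed setting: shards silent for longer than this are treated as idle.
    private final long idleTimeoutMillis;

    public ShardWatermarkTracker(long idleTimeoutMillis) {
        this.idleTimeoutMillis = idleTimeoutMillis;
    }

    /** Would be called from an overridden emit hook for every record. */
    public void onRecord(int shardIndex, long recordTimestamp, long nowMillis) {
        lastRecordTimestamp.merge(shardIndex, recordTimestamp, Math::max);
        lastActivityTime.put(shardIndex, nowMillis);
    }

    /**
     * Returns the minimum last-record timestamp across shards that are not idle,
     * or Long.MIN_VALUE when no active shard has produced a record yet.
     */
    public long currentWatermark(long nowMillis) {
        long min = Long.MAX_VALUE;
        for (Map.Entry<Integer, Long> entry : lastRecordTimestamp.entrySet()) {
            long idleFor = nowMillis - lastActivityTime.get(entry.getKey());
            if (idleFor > idleTimeoutMillis) {
                continue; // an idle shard does not hold back the watermark
            }
            min = Math.min(min, entry.getValue());
        }
        return min == Long.MAX_VALUE ? Long.MIN_VALUE : min;
    }
}
```

Taking the minimum over active shards keeps the watermark from running ahead of a slow shard, while the idle timeout prevents an empty shard from stalling it indefinitely. How resharding should affect this state is the open question raised in the replies.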

[jira] [Created] (FLINK-8641) Move BootstrapTools#getTaskManagerShellCommand to flink-yarn

2018-02-12 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-8641: --- Summary: Move BootstrapTools#getTaskManagerShellCommand to flink-yarn Key: FLINK-8641 URL: https://issues.apache.org/jira/browse/FLINK-8641 Project: Flink

Re: [DISCUSS] Releasing Flink 1.5.0

2018-02-12 Thread Stephan Ewen
I agree with the basic idea. I think there is no need to call it "soft feature freeze" though - it is a feature freeze (no new features get merged) ;-) What you are suggesting is to delay forking of the release-1.5 branch to avoid applying the bug fixes to too many branches. That makes sense. In

Re: Batch job getting stuck

2018-02-12 Thread Timo Walther
Hi Amit, how is the memory consumption when the jobs get stuck? Is the Java GC active? Are you using off-heap memory? Regards, Timo On 2/12/18 at 10:10 AM, Amit Jain wrote: Hi, We have created a batch job where we are trying to merge a set of S3 directories in TextFormat with the old snapshot

[jira] [Created] (FLINK-8640) Disable japicmp on java 9

2018-02-12 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-8640: --- Summary: Disable japicmp on java 9 Key: FLINK-8640 URL: https://issues.apache.org/jira/browse/FLINK-8640 Project: Flink Issue Type: Sub-task

[jira] [Created] (FLINK-8639) Fix always need to seek multiple times when iterator RocksDBMapState

2018-02-12 Thread Sihua Zhou (JIRA)
Sihua Zhou created FLINK-8639: - Summary: Fix always need to seek multiple times when iterator RocksDBMapState Key: FLINK-8639 URL: https://issues.apache.org/jira/browse/FLINK-8639 Project: Flink

Re: [DISCUSS] Releasing Flink 1.5.0

2018-02-12 Thread Kostas Kloudas
For me as well +1. Cheers, Kostas > On Feb 12, 2018, at 2:59 PM, Timo Walther wrote: > > Sounds good to me. +1 from my side. > > Regards, > Timo > > > On 2/12/18 at 2:56 PM, Aljoscha Krettek wrote: >> I agree with Chesnay: we should do a soft "feature freeze" first,

Re: [DISCUSS] Releasing Flink 1.5.0

2018-02-12 Thread Timo Walther
Sounds good to me. +1 from my side. Regards, Timo On 2/12/18 at 2:56 PM, Aljoscha Krettek wrote: I agree with Chesnay: we should do a soft "feature freeze" first, where we agree to not merge new features to master after that and then to the actual hard cutting of the release branch a while

Re: [DISCUSS] Releasing Flink 1.5.0

2018-02-12 Thread Aljoscha Krettek
I agree with Chesnay: we should do a soft "feature freeze" first, where we agree to not merge new features to master after that and then to the actual hard cutting of the release branch a while later. For actual dates, I'm proposing end of this week (16.02.2018) as soft feature freeze and end

Batch job getting stuck

2018-02-12 Thread Amit Jain
Hi, We have created a batch job where we are trying to merge a set of S3 directories in TextFormat with the old snapshot in Parquet format. We are running 50 such jobs daily and found that a few random jobs get stuck partway through. We have gone through the logs of the JobManager and TaskManager and could

Re: Permission For New Contributor

2018-02-12 Thread Fabian Hueske
Hi Tuo Wang, I've given you contributor permissions. Looking forward to your contributions. Best, Fabian 2018-02-12 7:25 GMT+01:00 T. Wang : > Hi all, > I would like to start to contribute code in Flink community. > Could someone give me contributor permission? > My jira