Re: High back-pressure after recovering from a save point

2017-07-13 Thread Kien Truong
Hi Fabian, This happens to me even when the restore is immediate, so there's not much data in Kafka to catch up (5 minutes max) Regards Kien On Jul 13, 2017, 23:40, at 23:40, Fabian Hueske wrote: >I would guess that this is quite usual because the job has to >"catch-up"

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-13 Thread prashantnayak
To add one more data point... it seems like the recovery directory is the bottleneck somehow.. so if we delete the recovery directory and restart the job manager - it comes back and is responsive. Of course, we lose all jobs, since none can be recovered... and that is of course not ideal. So

S3 recovery and checkpoint directories exhibit explosive growth

2017-07-13 Thread Prashant Nayak
We’re using Flink 1.3.1 on Mesos, with HA/recovery stored in S3 using RocksDB with incremental checkpointing. We have enabled external checkpoints (every 30s), retaining the two latest external checkpoints. We are trying to track down something we see happening where the recovery, checkpoint and

Re: How to send local files to a flink job on YARN

2017-07-13 Thread Ted Yu
I went back to commit 6e38eb8: [FLINK-1436] [docs] update command line documentation A search in the repo for "yarnship" ended up with no hit in the code (same with commit bf6b9aaab89e2e04678784525a42a19f099aa7f5 which is at top of git repo). Wondering whether it is supported. On Thu, Jul 13,

RE: How to send local files to a flink job on YARN

2017-07-13 Thread Guy Harmach
Hi, Just to clarify my need, I want to send the file from local file system to the job entry point, read it in the main method, and according its content to build my sources, operations and sinks. I assumed by the cli usage description for the yarnship flag that it is the equivalent to Spark’s

Re: [POLL] Who still uses Java 7 with Flink ?

2017-07-13 Thread nragon
+1 dropping java 7 -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/POLL-Who-still-uses-Java-7-with-Flink-tp12216p14266.html Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.

Re: How to send local files to a flink job on YARN

2017-07-13 Thread Jörn Franke
That does not sound like a good idea to put a configuration file on every node. What about Zookeeper? > On 13. Jul 2017, at 17:10, Guy Harmach wrote: > > Hi, > > I’m running a flink job on YARN. I’d like to pass yaml configuration files to > the job. > I tried to use the

Re: global window trigger

2017-07-13 Thread Aljoscha Krettek
Window contents are only purged from state if the Trigger says so or if the watermark passes the garbage collection horizon for a given window. With GlobalWindows, the GC horizon is never reached, that leaves Triggers. You can create a Trigger that purges every time it fires by wrapping it in a

How to send local files to a flink job on YARN

2017-07-13 Thread Guy Harmach
Hi, I'm running a flink job on YARN. I'd like to pass yaml configuration files to the job. I tried to use the flink cli -yarnship flag to point to a directory containing the file, but wasn't able to get it in the job. Can someone give an example of how to send local files and how to read them

Re: Why would a kafka source checkpoint take so long?

2017-07-13 Thread Vinay Patil
Hi Stephan, Sure will do that next time when I observe it. Regards, Vinay Patil On Thu, Jul 13, 2017 at 8:09 PM, Stephan Ewen wrote: > Is there any way you can pull a thread dump from the TMs at the point when > that happens? > > On Wed, Jul 12, 2017 at 8:50 PM, vinay patil

Re: Why would a kafka source checkpoint take so long?

2017-07-13 Thread Stephan Ewen
Is there any way you can pull a thread dump from the TMs at the point when that happens? On Wed, Jul 12, 2017 at 8:50 PM, vinay patil wrote: > Hi Gyula, > > I have observed similar issue with FlinkConsumer09 and 010 and posted it > to the mailing list as well . This

Re: [POLL] Who still uses Java 7 with Flink ?

2017-07-13 Thread Seth Wiesman
+1 for dropping java 7 On 7/13/17, 4:59 AM, "Konstantin Knauf" wrote: +1 for dropping Java 7 On 13.07.2017 10:11, Niels Basjes wrote: > +1 For dropping java 1.7 > > On 13 Jul 2017 04:11, "Jark Wu" wrote: > >> +1

High back-pressure after recovering from a save point

2017-07-13 Thread Kien Truong
Hi all, I have one job where back-pressure is significantly higher after resuming from a save point. Because that job makes heavy use of stateful functions with RocksDBStateBackend , I'm suspecting that this is the cause of performance degradation. Does anyone encounter simillar issues

Re: Reading static data

2017-07-13 Thread Timo Walther
Hi Mohit, do you plan to implement a batch or streaming job? If it is a streaming job: You can use a connected stream (see [1], Slide 34). The static data is one side of the stream that could be updated from time to time and will always propagated (using a broadcast()) to all workers that do

Re: Flink Elasticsearch Connector: Lucene Error message

2017-07-13 Thread Fabian Wollert
Hi Timo, Hi Gordon, thx for the reply! I checked the connection from both clusters to each other, and i can telnet to the 9300 port of flink, so i think the connection is not an issue here. We are currently using in our live env a custom elasticsearch connector, which used some extra lib's

Re: Read configuration instead of hard code

2017-07-13 Thread Timo Walther
Hi Desheng, Flink programs are defined in a regular Java main() method. They are executed on the Flink Client (usually the JobManeger) when submitted, you can add arbirary additional logic (like reading a file from an NFS) to the code. After retrieving the Kafka Info you can pass it to the

Re: Flink Elasticsearch Connector: Lucene Error message

2017-07-13 Thread Timo Walther
Hi Fabian, I loop in Gordon. Maybe he knows whats happening here. Regards, Timo Am 13.07.17 um 13:26 schrieb Fabian Wollert: Hi everyone, I'm trying to make use of the new Elasticsearch Connector

Read configuration instead of hard code

2017-07-13 Thread ZalaCheung
Hi all, I use Kafka as the source of my data stream. Instead of specifying the host and topic of Kafka directly in code, is that possible to read the configuration including Kakfa info from somewhere else like file or database? Desheng Zhang E-mail: gzzhangdesh...@corp.netease.com;

Re: How to maintain variable for each map operator

2017-07-13 Thread ZalaCheung
HI Kurt, Thanks for you reply! I’ve already solved the problem! Desheng Zhang E-mail: gzzhangdesh...@corp.netease.com; > On Jul 13, 2017, at 17:10, Kurt Young wrote: > > Hi, > > Regarding 1. State is some kind of value bound with your current key of > KeyedStream.

Re: [POLL] Who still uses Java 7 with Flink ?

2017-07-13 Thread Stephan Ewen
Thanks a lot for the feedback. Let's keep this thread open for feedback till end of the week. If it stays like it is (all or a vast majority in favor of dropping Java 7 support), we should start dropping Java 7 from next week on. On Thu, Jul 13, 2017 at 10:59 AM, Konstantin Knauf <

Re: System properties when submitting flink job to YARN Session

2017-07-13 Thread Aljoscha Krettek
Hi Jins, Do these settings have to be in the Jar File? Since you’re using Beam, you could also use PipelineOptions to make the options accessible to functions at runtime. Best, Aljoscha > On 12. Jul 2017, at 20:21, Jins George wrote: > > Hi Aljoscha, > > I am still

Re: How to maintain variable for each map operator

2017-07-13 Thread Kurt Young
Hi, Regarding 1. State is some kind of value bound with your current key of KeyedStream. ListState is list like state, it can be used as a List, you can add value to it, and get a iterator from it. If you have multiple ArrayList to maintain, you can declare multiple states, each with different

Re: [POLL] Who still uses Java 7 with Flink ?

2017-07-13 Thread Konstantin Knauf
+1 for dropping Java 7 On 13.07.2017 10:11, Niels Basjes wrote: > +1 For dropping java 1.7 > > On 13 Jul 2017 04:11, "Jark Wu" wrote: > >> +1 for dropping Java 7 >> >> 2017-07-13 9:34 GMT+08:00 ☼ R Nair (रविशंकर नायर) < >> ravishankar.n...@gmail.com>: >> >>> +1 for dropping

Re: Fink: KafkaProducer Data Loss

2017-07-13 Thread Piotr Nowojski
Ops, sorry, I forgot that this issue was relevant to FlinkKafkaProducer010 only. Piotrek > On Jul 13, 2017, at 9:33 AM, Tzu-Li (Gordon) Tai wrote: > > Hi Ninad & Piotr, > > AFAIK, when this issue was reported, Ninad was using 09. > FLINK-6996 only affects Flink Kafka

Re: Fink: KafkaProducer Data Loss

2017-07-13 Thread Tzu-Li (Gordon) Tai
Hi Ninad & Piotr, AFAIK, when this issue was reported, Ninad was using 09. FLINK-6996 only affects Flink Kafka Producer 010, so I don’t think that’s the cause here. @Ninad Code to reproduce this would definitely be helpful here, thanks. If you prefer to provide that privately, that would also

Re: Fink: KafkaProducer Data Loss

2017-07-13 Thread Piotr Nowojski
Hi, I’m not sure how relevant is this, but recently I have found and fixed a bug, that in certain conditions was causing data losses for all of the FlinkKafkaProducers in Flink: https://issues.apache.org/jira/browse/FLINK-6996 Namely on

Re: How to maintain variable for each map operator

2017-07-13 Thread ZalaCheung
Hi Kurt, Thanks! Your link helps me a lot. I still have some problems after I glance on the document. As you can see from my first email, I tried to implement a mapfunction class in flink. I actually have 3 arraylists to be maintain at this map operator. I think the Using managed keyed state