Finding things not seen in the last window

2017-09-29 Thread Ron Crocker
Hi - I have a colleague who is trying to write a Flink job that will determine deltas from period to period. Let’s say the periods are 1 minute. What he would like to do is report in minute 2 those things that are new since minute 1, then in minute 3 report those things that are new also
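A minimal sketch of the per-window delta logic (not Flink API — inside a Flink job this state would typically live in keyed state of a `ProcessWindowFunction`; here it is a plain field for illustration, and all names are hypothetical):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch: remember the previous window's contents and emit only what is new.
class WindowDelta {
    private Set<String> previousWindow = new HashSet<>();

    /** Returns the items in the current window that were absent from the last one. */
    public Set<String> newSinceLastWindow(List<String> currentWindow) {
        Set<String> current = new HashSet<>(currentWindow);
        Set<String> delta = new HashSet<>(current);
        delta.removeAll(previousWindow);   // keep only items not seen last window
        previousWindow = current;          // current window becomes the next baseline
        return delta;
    }
}
```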

Windowing isn't applied per key

2017-09-29 Thread Marcus Clendenin
I have a job that is performing an aggregation over a time window. This windowing is supposed to be happening by key, but the output I am seeing is creating an overall window on everything coming in. Is this happening because I am doing a map of the data before I am running the keyBy command? This
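To illustrate what keyed windowing computes (a `map()` before `keyBy()` does not merge keys — each key still gets its own window), here is a pure-Java sketch of per-key tumbling-window counting; in Flink the equivalent pipeline would be `stream.map(...).keyBy(...).window(...).aggregate(...)`, and all names here are illustrative:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: per-key tumbling windows, modeled as a (key, windowStart) -> count map.
// Flink keeps the analogous structure as keyed window state.
class KeyedWindowCount {
    private final long windowSizeMs;
    private final Map<String, Long> counts = new HashMap<>();

    KeyedWindowCount(long windowSizeMs) { this.windowSizeMs = windowSizeMs; }

    public void add(String key, long timestampMs) {
        long windowStart = timestampMs - (timestampMs % windowSizeMs);
        counts.merge(key + "@" + windowStart, 1L, Long::sum);  // count within this key's window
    }

    public long count(String key, long windowStart) {
        return counts.getOrDefault(key + "@" + windowStart, 0L);
    }
}
```

If an overall window appears instead, the usual culprit is a missing or constant key selector, not the position of `map()`.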

Enriching data from external source with cache

2017-09-29 Thread Derek VerLee
My basic problem will sound familiar, I think: I need to enrich incoming data using a REST call to an external system for slowly evolving metadata, and some cache-based lag is acceptable, so to reduce load on the external system and to process more efficiently, I would lik
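One common shape for this is a small TTL cache wrapping the external lookup, held inside a `RichMapFunction` (created in `open()`, consulted in `map()`). A minimal sketch, assuming a generic loader function standing in for the real REST call (all names hypothetical):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Sketch: cache enrichment results for ttlMs; reload from the loader
// (e.g. the REST client) only when an entry is missing or stale.
class TtlCache<K, V> {
    private static final class Entry<V> {
        final V value; final long loadedAt;
        Entry(V v, long t) { value = v; loadedAt = t; }
    }

    private final Map<K, Entry<V>> cache = new HashMap<>();
    private final long ttlMs;
    private final Function<K, V> loader;   // stand-in for the external REST call

    TtlCache(long ttlMs, Function<K, V> loader) {
        this.ttlMs = ttlMs;
        this.loader = loader;
    }

    public V get(K key, long nowMs) {
        Entry<V> e = cache.get(key);
        if (e == null || nowMs - e.loadedAt > ttlMs) {   // miss or stale: reload
            e = new Entry<>(loader.apply(key), nowMs);
            cache.put(key, e);
        }
        return e.value;
    }
}
```

The TTL bounds the staleness ("cache-based lag") while capping request rate to the external system at roughly one call per key per TTL.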

Re: Repeated exceptions during system metrics registration

2017-09-29 Thread Reinier Kip
Why of course... Thank you for your time. I'll figure out where to go with Beam. From: Chesnay Schepler Sent: 29 September 2017 16:41:23 To: user@flink.apache.org Subject: Re: Repeated exceptions during system metrics registration You probably have multiple opera

Sink buffering

2017-09-29 Thread nragon
Hi, Just like mentioned at Berlin FF17 in the Pravega talk, can we somehow simulate sink buffering (Pravega transactions) and coordinate it with checkpoints? My intention is to buffer records before sending them to HBase. Any opinions or tips? Thanks -- Sent from: http://apache-flink-user-mailing
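A sketch of the buffering pattern being asked about (not the Pravega or HBase API — in a real Flink sink the hooks would be `CheckpointedFunction.snapshotState()` plus a checkpoint-complete notification; names here are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: records are held in memory and only written out when a
// checkpoint completes, so the external write aligns with Flink's
// consistency boundary.
class BufferingSink {
    private final List<String> buffer = new ArrayList<>();
    private final List<String> written = new ArrayList<>();  // stands in for HBase

    public void invoke(String record) { buffer.add(record); }

    /** Called when a checkpoint completes: flush the buffer as one batch. */
    public void onCheckpointComplete() {
        written.addAll(buffer);   // in a real sink: one batched HBase mutation
        buffer.clear();
    }

    public List<String> writtenRecords() { return written; }
}
```

Note that for exactly-once the buffer itself must also be part of checkpointed state, so it survives a restore.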

Re: starting query server when running flink embedded

2017-09-29 Thread Piotr Nowojski
Hi, You can take a look at how it is done in the exercises here. There are example solutions that run on a local environment. I hope that helps :) Piotrek > On Sep 28, 2017, at 11:22 PM, Henri Heiskanen > wrote: > > Hi, > > I wou

Re: state of parallel jobs when one task fails

2017-09-29 Thread r. r.
Thanks a lot - wasn't aware of FailoverStrategy Best regards Robert > Original message >From: Piotr Nowojski pi...@data-artisans.com >Subject: Re: state of parallel jobs when one task fails >To: "r. r." >Sent: 29.09.2017 18:21 > > > > >

Re: how many 'run -c' commands to start?

2017-09-29 Thread r. r.
Sure, here is the cmdline output: flink-1.3.2/bin/flink run -c com.corp.flink.KafkaJob quickstart.jar --topic InputQ --bootstrap.servers localhost:9092 --zookeeper.connect localhost:2181 --group.id Consumers -p 5 --detached Cluster configuration: Standalone cluster with JobManager at localhost/

Re: state of parallel jobs when one task fails

2017-09-29 Thread Piotr Nowojski
Hi, Yes, by default Flink will restart all of the tasks. I think that since Flink 1.3, you can configure a FailoverStrategy to change this behavior. Tha
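The FailoverStrategy mentioned above is selected via the JobManager configuration. A sketch of the setting, assuming the config key and values from that era of Flink — check the documentation for your exact version:

```yaml
# flink-conf.yaml (assumed key/values; verify against your Flink version's docs)
# "full" (default) restarts all tasks on a failure; "individual" restarts only
# the failed task, which is only safe when tasks exchange no data.
jobmanager.execution.failover-strategy: individual
```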

Re: How does Flink monitor that a source stream task (Time Trigger) is running?

2017-09-29 Thread Piotr Nowojski
I am still not sure what you mean by “thread crash without throw”. If the SourceFunction.run method returns without an exception, Flink assumes that it has cleanly shut down and that there were simply no more elements to collect/create by this task. If it continues working, without throwing an exc

state of parallel jobs when one task fails

2017-09-29 Thread r. r.
Hello I have a simple job with a single map() processing step which I want to run with many documents in parallel in Flink. What will happen if one of the 'instances' of the job fails? This statement in Flink docs confuses me: "In case of failures, a job switches first to failing where it cancels all

Re: how many 'run -c' commands to start?

2017-09-29 Thread Chesnay Schepler
The only nodes that matter are those on which the Flink processes, i.e. Job- and TaskManagers, are run. To prevent a JobManager node failure from causing the job to fail

Re: Repeated exceptions during system metrics registration

2017-09-29 Thread Chesnay Schepler
You probably have multiple operators that are called "Map", which causes the metric identifier to not be unique. As a result only 1 of these metrics is reported (whichever was registered first). Giving each operator a unique name will resolve this issue, but I don't know exactly how to do that

Repeated exceptions during system metrics registration

2017-09-29 Thread Reinier Kip
Hi all, I'm running a Beam pipeline on Flink and sending metrics via the Graphite reporter. I get repeated exceptions on the slaves, which try to register the same metric multiple times. Jobmanager and taskmanager data is fine: I can see JVM stuff, but only one datapoint here and there for task

Re: ArrayIndexOutOfBoundExceptions while processing valve output watermark and while applying ReduceFunction in reducing state

2017-09-29 Thread Federico D'Ambrosio
Hi Gordon, I'm currently using Flink 1.3.2 in local mode. If it's any help I realized from the log that the complete task which is failing is: 2017-09-29 14:17:20,354 INFO org.apache.flink.runtime.taskmanager.Task - latest_time -> (map_active_stream, map_history_stream) (1/1)

Re: ArrayIndexOutOfBoundExceptions while processing valve output watermark and while applying ReduceFunction in reducing state

2017-09-29 Thread Tzu-Li (Gordon) Tai
Hi, I’m looking into this. Could you let us know the Flink version in which the exceptions occurred? Cheers, Gordon On 29 September 2017 at 3:11:30 PM, Federico D'Ambrosio (federico.dambro...@smartlab.ws) wrote: Hi, I'm coming across these Exceptions while running a pretty simple flink job. F

ArrayIndexOutOfBoundExceptions while processing valve output watermark and while applying ReduceFunction in reducing state

2017-09-29 Thread Federico D'Ambrosio
Hi, I'm coming across these Exceptions while running a pretty simple flink job. First one: java.lang.RuntimeException: Exception occurred while processing valve output watermark: at org.apache.flink.streaming.runtime.io.StreamInputProcessor$ForwardingValveOutputHandler.handleWatermark(Str

Re: How does Flink monitor that a source stream task (Time Trigger) is running?

2017-09-29 Thread yunfan123
So my question is: if this thread crashes without throwing any Exception, it seems Flink can't handle this state. -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: How does Flink monitor that a source stream task (Time Trigger) is running?

2017-09-29 Thread Piotr Nowojski
Any exception thrown by your SourceFunction will be caught by Flink and that will mark a task (that was executing this SourceFunction) as failed. If you started some custom threads in your SourceFunction, you have to manually propagate their exceptions to the SourceFunction. Piotrek > On Sep 29
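A sketch of the propagation pattern described above: the custom worker thread records its failure, and the `SourceFunction.run()` loop rethrows it so Flink marks the task as failed. Class and method names are illustrative, not Flink internals:

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch: hand a worker thread's Throwable back to the source's main loop.
class ThreadErrorPropagation {
    private final AtomicReference<Throwable> workerError = new AtomicReference<>();
    private volatile boolean running = true;

    /** Called from the custom thread's catch block. */
    public void reportError(Throwable t) {
        workerError.compareAndSet(null, t);  // keep the first failure
        running = false;
    }

    /** SourceFunction.run() would poll this on each loop iteration. */
    public void checkAndRethrow() throws Exception {
        Throwable t = workerError.get();
        if (t != null) {
            throw new Exception("Worker thread failed", t);  // fails the task
        }
    }

    public boolean isRunning() { return running; }
}
```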

Re: EASY Friday afternoon question: order of chained sink operator execution in a streaming task

2017-09-29 Thread Chesnay Schepler
Yes, I believe that is correct. On 29.09.2017 14:01, Martin Eden wrote: Hi all, Just a quick one. I have a task that looks like this (as printed in the logs): 17-09-29 0703510695 INFO TaskManager.info: Received task Co-Flat Map -> Process -> (Sink: sink1, Sink: sink2, Sink: sink3) (2/2) Af

Re: How does Flink monitor that a source stream task (Time Trigger) is running?

2017-09-29 Thread yunfan123
My source stream means a function implementing org.apache.flink.streaming.api.functions.source.SourceFunction. My question is: how does Flink know all worker threads are alive? If one worker thread that executes the SourceFunction crashes, how does Flink know this happened? -- Sent from: http://apache-fl

EASY Friday afternoon question: order of chained sink operator execution in a streaming task

2017-09-29 Thread Martin Eden
Hi all, Just a quick one. I have a task that looks like this (as printed in the logs): 17-09-29 0703510695 INFO TaskManager.info: Received task Co-Flat Map -> Process -> (Sink: sink1, Sink: sink2, Sink: sink3) (2/2) After looking a bit at the code of the streaming task I suppose the sink operat
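To illustrate the ordering in question: within a chain, each record is pushed to every chained sink in declaration order before the next record is processed — records are not buffered per sink. A minimal sketch of that fan-out behavior (illustrative names, not Flink internals):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Sketch: fan one record out to all chained sinks, in order, per record.
class ChainedFanOut {
    private final List<Consumer<String>> sinks = new ArrayList<>();

    public void addSink(Consumer<String> sink) { sinks.add(sink); }

    public void process(String record) {
        for (Consumer<String> sink : sinks) {  // same record visits sinks in order
            sink.accept(record);
        }
    }
}
```

So the interleaving is record-wise (sink1:a, sink2:a, sink1:b, sink2:b), not sink-wise.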

Re: Clean GlobalWindow state

2017-09-29 Thread Fabian Hueske
Thanks for creating the JIRA issue! Best, Fabian 2017-09-20 12:26 GMT+02:00 gerardg : > I have prepared a repo that reproduces the issue: > https://github.com/GerardGarcia/flink-global-window-growing-state > > Maybe this way it is easier to spot the error or we can determine if it is > a > bug.

Re: How does Flink monitor that a source stream task (Time Trigger) is running?

2017-09-29 Thread Piotr Nowojski
We use Akka's DeathWatch mechanism to detect dead components. A TaskManager failure shouldn’t prevent recovering from state (as long as there are enough task slots). I’m not sure if I understand what you mean by "source stream thread" crash. If it was some error during performing a checkpoint so

Re: Job Manager minimum memory hard coded to 768

2017-09-29 Thread Till Rohrmann
Hi Dan, I think Aljoscha is right and the 768 MB minimum JM memory is more of a legacy artifact which was never properly refactored. If I remember correctly, then we had problems when starting Flink in a container with a lower memory limit. Therefore this limit was introduced. But I'm actually not