Re: Access Sliding window

2017-08-03 Thread Raj Kumar
Thanks Fabian. Your suggestion helped. But, I am stuck at 3rd step 1. I didn't completely understand the step 3. What the process function should look like ? Why does it needs to be stateful. Can you please provide more details on this. 2. In the stateful, function, we need to have a value state

Cannot restore from savepoint after adding a sink operator

2017-08-03 Thread Sam Huang
Hi all! I added a S3 bucketing sink operator to my flink job and tried to start it from a savepoint using --allowNonRestoreState option, and it's showing me this error: I found on Flink

Re: Customer inputformat

2017-08-03 Thread Ted Yu
Did you use StreamExecutionEnvironment.createFileInput() ? What did the modification times of the 2 files look like (were they the newest) ? Cheers On Mon, Jul 31, 2017 at 12:42 PM, Mohit Anchlia wrote: > Thanks! When I give path to a directory flink is only reading 2

Re: CEP condition expression and its event consuming strategy

2017-08-03 Thread Chao Wang
Thank you, Dawid. FYI, I've implemented the discarding logic by CoFlatMapFunction, for the special case where there are only two input streams: I maintain a logical state (no match, input1 matched, or input2 matched) and use private variables to store the matched event so far, which waits to

Re: json mapper

2017-08-03 Thread Eron Wright
I think your snippet looks good. The Jackson ObjectMapper is designed to be reused by numerous threads, and routinely stored as a static field. It is somewhat expensive to create. Hope this helps, -Eron On Thu, Aug 3, 2017 at 7:46 AM, Nico Kruber wrote: > Hi Peter, >

Re: Akka Quarantine & Old YARN Versions

2017-08-03 Thread Konstantin Knauf
Hi Nico, thanks for the quick response! No, this was note enabled :( Since we are in the process of upgrading to 1.3.1: I did not find this option in 1.3, only 1.2. Is this the default behaviour in 1.3 or is this configuration just not documented? Cheers, Konstantin On 03.08.2017 17:11, Nico

RE: Event-time and first watermark

2017-08-03 Thread Gwenhael Pasquiers
We're not using a Window but a more basic ProcessFunction to handle sessions. We made this choice because we have to handle (millions of) sessions that can last from 10 seconds to 24 hours so we wanted to handle things manually using the State class. We're using the watermark as an event-time

WaterMark & Eventwindow not fired correctly

2017-08-03 Thread aitozi
Hi, i have encounted a problem, i apply generate and assign watermark at the datastream, and then keyBy, and EventTimewindow and apply window Function. in the log, i can see that watermark and the eventtime with the message are correct , and i think the situation bellow will trigger the

Re: Akka Quarantine & Old YARN Versions

2017-08-03 Thread Nico Kruber
Hi Konstantin, I digged through the linked pull requests (of https://issues.apache.org/jira/ browse/FLINK-3347) a bit just to notice that the fix-version tag was wrong (should have been 1.2.1, not 1.2.0) but you have that already. In there, it was also mentioned that the quarantine monitor is

Re: json mapper

2017-08-03 Thread Nico Kruber
Hi Peter, I'm no Scala developer but I may be able to help with some concepts: * a static reference used inside a [Map]Function will certainly cause problems when executed in parallel in the same JVM, e.g. a TaskManager with multiple slots, depending on whether this static object is stateful

State Backend

2017-08-03 Thread Vijay Srinivasaraghavan
Hello, I would like to know if we have any latency requirements for choosing appropriate state backend?  For example, if an HCFS implementation is used as Flink state backend (instead of stock HDFS), are there any implications that one needs to know with respect to the performance? - Frequency

Re: Event-time and first watermark

2017-08-03 Thread Nico Kruber
Hi Gwenhael, "A Watermark(t) declares that event time has reached time t in that stream, meaning that there should be no more elements from the stream with a timestamp t’ <= t (i.e. events with timestamps older or equal to the watermark)." [1] Therefore, they should be behind the actual event

Re: state inside functions

2017-08-03 Thread Nico Kruber
Hi Peter, there's no need to worry about transient members as the operator itself is not serialized - only the state itself, depending on the state back-end. If you want your state to be recovered by checkpoints you should implement the open() method and initialise your state there as in your

Re: Getting JobManager address and port within a running job

2017-08-03 Thread Biplob Biswas
Hi nico, This behaviour was on my cluster and not on the local mode as I wanted to check whether it's an issue of my job or the behaviour with jobmanager is consistent everywhere. When I run my job on the yarn-cluster mode, it's not honouring the IP and port I specified and its randomly

Re: Can't find correct JobManager address, job fails with Queryable state

2017-08-03 Thread Biplob Biswas
Hi Nico, I had actually tried doing that but I still get the same error as before with the actor not found. I then ran on my mock cluster and I was getting the same error although I could observe the jobmanager on the yarn cluster mode with a defined port. The addres and port combination was

Re: Getting JobManager address and port within a running job

2017-08-03 Thread Biplob Biswas
Also, is it possible to get the JobID from a running flink instance for a streaming job? I know I can get for a batch job with ExecutionEnvironment.getExecutionEnvironment().getId() but apparently, it generates a new execution environment and returns the job id of that environment for a batch

Re: Getting JobManager address and port within a running job

2017-08-03 Thread Nico Kruber
Assuming, from your previous email, that you fire up a LocalFlinkMiniCluster: this, afaik, does not process flink-conf.yaml but only the configuration given to it. If you start a "real" flink cluster, e.g. by bin/start-cluster.sh, it will show the behaviour you desired. Nico On Thursday, 3

Re: Can't find correct JobManager address, job fails with Queryable state

2017-08-03 Thread Nico Kruber
Hi Biplob, by starting a local environment the way you described, i.e. by using LocalStreamEnvironment.createLocalEnvironmentWithWebUI(conf); you are firing up a LocalFlinkMiniCluster which, by default, has the queryable state server disabled. You can enable it via:

Re: replacement for KeyedStream.fold(..) ?

2017-08-03 Thread Nico Kruber
Hi Peter, although unfortunately not documented yet in [1] (rumor has it that that is going to change soon) and without a proper replacement note in the deprecation javadoc, two things come to mind for replacing fold(): * AggregateFunction and

Getting JobManager address and port within a running job

2017-08-03 Thread Biplob Biswas
Hi, Is there a way to fetch the jobmanager address and port from a running flink job, I was expecting the address and port to be constant but it changes everytime I am running a job. ANd somehow its not honoring the jobmanager.rpc.address and jobmanager.rpc.port set in the flink-conf.yaml file.

json mapper

2017-08-03 Thread Peter Ertl
Hi flink users, I just wanted to ask if this kind of scala map function is correct? object JsonMapper { private val mapper: ObjectMapper = new ObjectMapper() } class JsonMapper extends MapFunction[String, ObjectNode] { override def map(value: String): ObjectNode =

state inside functions

2017-08-03 Thread Peter Ertl
Hi, can someone elaborate on when I should set properties transient / non-transient within operators (e.g. map / flatMap / reduce) ? I see these two possibilies: (1) initialize a non-transient property from the constructor (2) initialize a transient property inside a Rich???Function when

Re: Can't find correct JobManager address, job fails with Queryable state

2017-08-03 Thread Biplob Biswas
I managed to get the Web UI up and running but I am still getting the error with "Actor not found" Before the job failed I got the output for the Flink config from the WebUI and it seems okay to me, this corresponds to the config I have already set.

Akka Quarantine & Old YARN Versions

2017-08-03 Thread Konstantin Knauf
Hi everyone, we are running Flink 1.2.1 on YARN 2.4 (I know, way to old :(). Correlated with the last Flink Upgrade from 1.1.3 -> 1.2.1 we are experiencing regular TaskManager failures due to [Taskmanager Logs] 2017-07-10 15:25:26,448 ERROR Remoting - Association to

Event-time and first watermark

2017-08-03 Thread Gwenhael Pasquiers
Hi, >From my tests it seems that the initial watermark value is Long.MIN_VALUE even >though my first data passed through the timestamp extractor before arriving >into my ProcessFunction. It looks like the watermark "lags" behind the data by >one message. Is there a way to have a watermark