Re: When a jobmanager fails, it doesn't restart because it tries to restart non existing tasks

2018-07-19 Thread vino yang
Hi Till, You are right, we also saw the problem you said. Curator removes the specific job graph path asynchronously. But it's the only gist when recovering, right? Is there any plan to enhance this point? Thanks, vino. 2018-07-19 21:58 GMT+08:00 Till Rohrmann : > Hi Gerard, > > the logging

Re: org.apache.flink.streaming.runtime.streamrecord.LatencyMarker cannot be cast to org.apache.flink.streaming.runtime.streamrecord.StreamRecord

2018-07-19 Thread vino yang
Hi Gregory, This exception seems a bug, you can create a issues in the JIRA. Thanks, vino. 2018-07-20 10:28 GMT+08:00 Philip Doctor : > Oh you were asking about the cast exception, I haven't seen that before, > sorry to be off topic. > > > > > -- > *From:* Philip

Re: org.apache.flink.streaming.runtime.streamrecord.LatencyMarker cannot be cast to org.apache.flink.streaming.runtime.streamrecord.StreamRecord

2018-07-19 Thread Philip Doctor
Oh you were asking about the cast exception, I haven't seen that before, sorry to be off topic. From: Philip Doctor Sent: Thursday, July 19, 2018 9:27:15 PM To: Gregory Fee; user Subject: Re: org.apache.flink.streaming.runtime.streamrecord.LatencyMarker

Re: org.apache.flink.streaming.runtime.streamrecord.LatencyMarker cannot be cast to org.apache.flink.streaming.runtime.streamrecord.StreamRecord

2018-07-19 Thread Philip Doctor
I'm just a flink user, not an expert. I've seen that exception before. I have never seen it be the actual error, I usually see it when some other operator is throwing an uncaught exception and busy dying. It seems to me that the prior operator throws this error "Can't forward to the next

org.apache.flink.streaming.runtime.streamrecord.LatencyMarker cannot be cast to org.apache.flink.streaming.runtime.streamrecord.StreamRecord

2018-07-19 Thread Gregory Fee
Hello, I have a job running and I've gotten this error a few times. The job recovers from a checkpoint and seems to continue forward fine. Then the error will happen again sometime later, perhaps 1 hour. This looks like a Flink bug to me but I could use an expert opinion. Thanks!

Re: Is KeyedProcessFunction available in Flink 1.4?

2018-07-19 Thread Bowen Li
Hi Anna, KeyedProcessFunction is only available starting from Flink 1.5. The doc is here . It extends ProcessFunction and shares the same functionalities except giving more

Is KeyedProcessFunction available in Flink 1.4?

2018-07-19 Thread anna stax
Hello all, I am using Flink 1.4 because thats the version provided by the latest AWS EMR. Is KeyedProcessFunction available in Flink 1.4? Also please share any links to good examples on using KeyedProcessFunction . Thanks

Flink 1.4.2 -> 1.5.0 upgrade Queryable State debugging

2018-07-19 Thread Philip Doctor
Dear Flink Users, I'm trying to upgrade to flink 1.5.0, so far everything works except for the Queryable state client. Now here's where it gets weird. I have the client sitting behind a web API so the rest of our non-java ecosystem can consume it. I've got 2 tests, one calls my route

Re: Keeping only latest row by key?

2018-07-19 Thread Fabian Hueske
HI James, Yes, that should also do the trick. Best, Fabian 2018-07-19 16:06 GMT+02:00 Porritt, James : > It looks like the following gives me the result I’m interested in: > > > > batchEnv > > .createInput(dataset) > > .groupBy("id") > >

RE: Keeping only latest row by key?

2018-07-19 Thread Porritt, James
It looks like the following gives me the result I’m interested in: batchEnv .createInput(dataset) .groupBy("id") .sortGroup("timestamp", Order.DESCENDING) .first(1); Is there anything I’ve misunderstood with this? From: Porritt,

Re: Flink on Mesos: containers question

2018-07-19 Thread Till Rohrmann
Hi Alexei, I actually never used Mesos with container images. I always used it in a way where the Mesos task directly starts the Java process. Cheers, Till On Thu, Jul 19, 2018 at 2:44 PM NEKRASSOV, ALEXEI wrote: > Till, > > > > Any insight into how Flink components are containerized in

Re: When a jobmanager fails, it doesn't restart because it tries to restart non existing tasks

2018-07-19 Thread Till Rohrmann
Hi Gerard, the logging statement `Removed job graph ... from ZooKeeper` is actually not 100% accurate. The actual deletion is executed as an asynchronous background task and the log statement is not printed in the callback (which it should). Therefore, the deletion could still have failed. In

Re: Why data didn't enter the time window in EventTime mode

2018-07-19 Thread Hequn Cheng
Hi Soheil, You can monitor the watermarks in the web dashboard as Fabian said. There are some documents here[1]. [1] https://ci.apache.org/projects/flink/flink-docs-master/monitoring/debugging_event_time.html#monitoring-current-event-time On Thu, Jul 19, 2018 at 3:53 PM, Fabian Hueske wrote: >

RE: Flink on Mesos: containers question

2018-07-19 Thread NEKRASSOV, ALEXEI
Till, Any insight into how Flink components are containerized in Mesos? Thanks! Alex From: Fabian Hueske [mailto:fhue...@gmail.com] Sent: Monday, July 16, 2018 7:57 AM To: NEKRASSOV, ALEXEI Cc: user@flink.apache.org; Till Rohrmann Subject: Re: Flink on Mesos: containers question Hi Alexei,

Re: data enrichment via endpoint, serializable issue

2018-07-19 Thread Steffen Wohlers
Hi Xingcan, option two RichMapFunction works , thanks a lot! Thanks, Steffen > On 19. Jul 2018, at 13:59, Xingcan Cui wrote: > > Hi Steffen, > > You could make the class `TextAPIClient` serializable, or use > `RichMapFunction` [1] and instantiate all the required objects in its > `open()`

Re: Bootstrapping the state

2018-07-19 Thread vino yang
Hi Henkka, The behavior of the text file source meets expectation. Flink will not keep your source task thread when it exit from it's invoke method. That means you should keep your source task alive. So to implement this, you should customize a text file source (implement SourceFunction

Re: data enrichment via endpoint, serializable issue

2018-07-19 Thread Xingcan Cui
Hi Steffen, You could make the class `TextAPIClient` serializable, or use `RichMapFunction` [1] and instantiate all the required objects in its `open()` method. [1] https://ci.apache.org/projects/flink/flink-docs-master/dev/api_concepts.html#rich-functions

Bootstrapping the state

2018-07-19 Thread Henri Heiskanen
Hi, I've been looking into how to initialise large state and especially checked this presentation by Lyft referenced in this group as well: https://www.youtube.com/watch?v=WdMcyN5QZZQ In our use case we would like to load roughly 4 billion entries into this state and I believe loading this data

Re: When a jobmanager fails, it doesn't restart because it tries to restart non existing tasks

2018-07-19 Thread Gerard Garcia
Thanks Andrey, That is the log from the jobmanager just after it has finished cancelling the task: 11:29:18.716 [flink-akka.actor.default-dispatcher-15695] INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Stopping checkpoint coordinator for job e403893e5208ca47ace886a77e405291.

Re: Object reuse in DataStreams

2018-07-19 Thread vino yang
Hi Urs, I think Flink does not encourage to use "object reuse" feature, because in the documentation, it warn the user it may course bug when the user-code function of an operation is not aware of this behavior[1]. The "object reuse" is runtime behavior and it's configuration item belongs

data enrichment via endpoint, serializable issue

2018-07-19 Thread Steffen Wohlers
Hi all, I’m new to Apache Flink and I have the following issue: I would like to enrich data via map function. For that I call a method which calls an endpoint but I get following error message „The implementation of the MapFunction is not serializable. The object probably contains or

Re: Global latency metrics

2018-07-19 Thread vino yang
Hi shimin, For some scenario, your requirement is necessary. And sometimes, we want to know the total throughput, latency and the event processing rate end-to-end. But currently, Flink can not support the global metrics. To Chesnay, I think it's a good feature the community can consider.

Re: Parallel stream partitions

2018-07-19 Thread Fabian Hueske
Hi Nick, What Ken said is correct, but let me add two more things. 1) State Usually, you only need to partition (keyBy()) the data if you want to process tuples with the same same key together. Therefore, it is necessary to hold some tuples or intermediate results (like partial or running

Re: Description of Flink event time processing

2018-07-19 Thread Fabian Hueske
Hi Elias, Thanks for the update! I'll try to have another look soon. Best, Fabian 2018-07-11 1:30 GMT+02:00 Elias Levy : > Thanks for all the comments. I've updated the document to account for the > feedback. Please take a look. > > On Fri, Jul 6, 2018 at 2:33 PM Elias Levy > wrote: > >>

Re: flink 1.4.2 Ambari

2018-07-19 Thread Jeff Bean
Antonio, Have you seen: https://github.com/abajwa-hw/ambari-flink-service Jeff On Fri, Jul 13, 2018 at 7:45 PM, antonio saldivar wrote: > Hello > > I am trying to find the way to add Flink 1.4.2 service to ambari because > is not listed in the Stack. does anyone has the steps to add this

Re: Cannot configure akka.ask.timeout

2018-07-19 Thread Gary Yao
Hi Lukas, It seems that when using MiniCluster, the config key akka.ask.timeout is not respected. Instead, a hardcoded timeout of 10s is used [1]. Since all communication is locally, it would be interesting to see in detail what your job looks like that it exceeds the timeout. The key

Re: Flink resource manager unable to connect to mesos after restart

2018-07-19 Thread Renjie Liu
Hi, Gary: It can be reproduced stablely, just need to kill job manager and restart it. Attached is jobmanager's log, but I don't find anyting valuable since it just keep reporting unable to connect to mesos master. On Thu, Jul 19, 2018 at 4:55 AM Gary Yao wrote: > Hi, > > If you are able to

RE: Keeping only latest row by key?

2018-07-19 Thread Porritt, James
Hi Timo, Thanks for this. I’ve been looking into creating this in Java by looking at MaxAggFunction.scala as a basis. Is it correct that I’d be creating a version for each type I want to use it with (albeit using Generic s) and registering the functions separately for use with

Re: Production readiness of Flink Job Stop Service

2018-07-19 Thread Fabian Hueske
Hi Chirag, Stop with savepoint is not mentioned in the 1.5.0 release notes [1]. Since its a frequently requested feature, I'm pretty sure that it would have been mentioned if it was added. Best, Fabian [1] http://flink.apache.org/news/2018/05/25/release-1.5.0.html 2018-07-19 8:39 GMT+02:00

Re: Why data didn't enter the time window in EventTime mode

2018-07-19 Thread Fabian Hueske
Hi Soheil, Hequn is right. This might be an issue with advancing event-time. You can monitor that by checking the watermarks in the web dashboard or print-debug it with a ProcessFunction which can lookup the current watermark. Best, Fabian 2018-07-19 3:30 GMT+02:00 Hequn Cheng : > Hi Soheil, >

Re: Race between window assignment and same window timeout

2018-07-19 Thread Fabian Hueske
Hi Shay, This sounds very much like the off-by-one bug described by FLINK-9857 [1]. The problem was identified in another recent user ml thread and fixed for Flink 1.5.2 and 1.6.0. Best, Fabian [1] https://issues.apache.org/jira/browse/FLINK-9857 2018-07-18 19:00 GMT+02:00 Andrey Zagrebin : >

Re: Production readiness of Flink Job Stop Service

2018-07-19 Thread vino yang
Hi Chirag, Did you read the latest stable Flink documentation about Savepoint[1] and Cancel with savepoint[2] and Upgrade application[3]? [1]: https://ci.apache.org/projects/flink/flink-docs-release-1.5/ops/state/savepoints.html#resuming-from-savepoints [2]:

Production readiness of Flink Job Stop Service

2018-07-19 Thread Chirag Dewan
Hi, I am planning to use the Stop Service for stopping/resuming/pausing my Flink Job. My intention is to stop sources before we take the savepoint i.e. stop with savepoint.  I know that since Flink 1.4.2, Stop is not stable/not production ready.  With Flink 1.5 can it be used for stopping jobs?