Re: Question about Timestamp in Flink SQL

2017-11-28 Thread wangsan
Hi Xincan, Thanks for your reply. The system default timezone is just what I expected (sun.util.calendar.ZoneInfo[id="Asia/Shanghai",offset=28800000,dstSavings=0,useDaylight=false,transitions=19,lastRule=null]). I looked into the generated code, and I found the following code snippet:

Re: user driven stream processing

2017-11-28 Thread Tony Wei
Hi KZ, https://data-artisans.com/blog/real-time-fraud-detection-ing-bank-apache-flink This article seems to be a good example to trigger a new calculation on a running job. Maybe you can get some help from it. Best Regards, Tony Wei 2017-11-29 4:53 GMT+08:00 zanqing zhang

Re: How to perform efficient DataSet reuse between iterations

2017-11-28 Thread Miguel Coimbra
Hello, You're right, I was overlooking that. With your suggestion, I now just define a different sink in each iteration of the loop. Then they all output to disk when executing a single bigger plan. I have one more question: I know I can retrieve the total time this single job takes to execute,
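A minimal Java sketch of that setup (the dataset, the per-iteration transformation, and the output paths are made up for illustration): each loop iteration only adds a sink, a single execute() then runs the combined plan, and the returned JobExecutionResult answers the timing question via getNetRuntime().

```java
import org.apache.flink.api.common.JobExecutionResult;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

public class MultiSinkPlan {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Long> current = env.fromElements(1L, 2L, 3L);

        for (int i = 0; i < 3; i++) {
            current = current.map(new MapFunction<Long, Long>() {
                @Override
                public Long map(Long value) {
                    return value + 1; // stand-in for the real per-iteration computation
                }
            });
            // One sink per iteration; nothing is executed yet.
            current.writeAsText("/tmp/iteration-" + i);
        }

        // A single execute() runs the whole combined plan and reports its runtime.
        JobExecutionResult result = env.execute("multi-sink plan");
        System.out.println("Total runtime (ms): " + result.getNetRuntime());
    }
}
```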

Re: S3 Access in eu-central-1

2017-11-28 Thread Dominik Bruhn
Hey Stephan, Hey Steve, that was the right hint, adding that option to the Java options fixed the problem. Maybe we should add this somehow to our Flink Wiki? Thanks! Dominik On 28/11/17 11:55, Stephan Ewen wrote: Got a pointer from Steve that this is answered on Stack Overflow here:

user driven stream processing

2017-11-28 Thread zanqing zhang
Hi All, Has anyone done any stream processing driven by a user request? What's the recommended way of doing this? Or is this completely the wrong direction to go for applications running on top of Flink? Basically, we need to tweak the stream processing based on parameters provided by a user, e.g.
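One common pattern for this (a sketch only, not necessarily what the thread settled on): feed the user-supplied parameters in as a second "control" stream and connect it to the data stream, so a CoFlatMapFunction can pick up new parameters while the job is running. The sources and element types below are made up.

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
import org.apache.flink.util.Collector;

public class UserDrivenProcessing {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Hypothetical sources: the actual events and the user-supplied parameters.
        DataStream<String> events = env.socketTextStream("localhost", 9000);
        DataStream<String> userParams = env.socketTextStream("localhost", 9001);

        events.connect(userParams.broadcast())
              .flatMap(new CoFlatMapFunction<String, String, String>() {
                  private String parameter = "default";

                  @Override
                  public void flatMap1(String event, Collector<String> out) {
                      // Process events with the most recently received parameter.
                      out.collect(event + " (processed with " + parameter + ")");
                  }

                  @Override
                  public void flatMap2(String newParameter, Collector<String> out) {
                      // A user request updates the processing parameter at runtime.
                      parameter = newParameter;
                  }
              })
              .print();

        env.execute("user-driven processing");
    }
}
```

Broadcasting the control stream makes every parallel task see the parameter updates; the plain field here is not fault tolerant, so a real job would keep the parameter in Flink state.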

Re: Status of Kafka011JsonTableSink for 1.4.0 release?

2017-11-28 Thread Georgios Kaklamanos
Hi Fabian, Thanks for the answer. I had seen the Kafka Producer but, from a quick look, I couldn't find something like a JSON SerializationSchema, which I need since the next app in my pipeline expects to read the data in JSON. So, hoping for a TableJSONSink, I didn't look into it further.
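For what it's worth, rolling your own JSON SerializationSchema for the 0.11 producer is fairly small. The sketch below assumes Jackson is on the classpath; the topic and event class in the usage note are made up, and the exact import path of SerializationSchema differs slightly between Flink versions.

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.flink.api.common.serialization.SerializationSchema;

// Serializes any Jackson-compatible object to JSON bytes.
public class JsonSerializationSchema<T> implements SerializationSchema<T> {
    private transient ObjectMapper mapper;

    @Override
    public byte[] serialize(T element) {
        if (mapper == null) {
            mapper = new ObjectMapper();
        }
        try {
            return mapper.writeValueAsBytes(element);
        } catch (Exception e) {
            throw new RuntimeException("Could not serialize record to JSON", e);
        }
    }
}
```

Wired into the sink it would look roughly like `stream.addSink(new FlinkKafkaProducer011<>("output-topic", new JsonSerializationSchema<MyEvent>(), kafkaProps))`.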

Re: How to deal with java.lang.IllegalStateException: Could not initialize keyed state backend after a code update

2017-11-28 Thread Federico D'Ambrosio
Hi Gordon, explicitly specifying the serialVersionUID did the job, thank you! The failing task was latest_time -> (cassandra-map -> Sink: cassandra-active-sink, map_active_stream, map_history_stream) like the following: val events = keyedstream.window(Time.seconds(20))

Re: How to deal with java.lang.IllegalStateException: Could not initialize keyed state backend after a code update

2017-11-28 Thread Tzu-Li (Gordon) Tai
Hi Federico, It seems like the state cannot be restored because the class of the state type (i.e., Event) had been modified since the savepoint, and therefore has a conflicting serialVersionUID with whatever it is in the savepoint. This can happen if Java serialization is used for some part of
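For reference, a sketch of the usual safeguard, shown with an illustrative class rather than the actual lab.vardata.events.Event: pinning serialVersionUID explicitly means a recompile after a compatible change no longer produces a conflicting generated UID.

```java
import java.io.Serializable;

// Illustrative event type; the explicit serialVersionUID keeps Java
// serialization compatible across recompilations, as long as the change
// to the class is itself compatible.
public class Event implements Serializable {
    private static final long serialVersionUID = 1L;

    private String id;
    private long timestamp;

    // getters, setters and constructors omitted for brevity
}
```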

How to deal with java.lang.IllegalStateException: Could not initialize keyed state backend after a code update

2017-11-28 Thread Federico D'Ambrosio
Hi, I recently had to do a code update of a long running Flink Stream job (1.3.2) and on the restart from the savepoint I had to deal with: java.lang.IllegalStateException: Could not initialize keyed state backend. Caused by: java.io.InvalidClassException: lab.vardata.events.Event; local class

Re: Flink 1.2.0->1.3.2 TaskManager reporting to JobManager

2017-11-28 Thread Nico Kruber
Hi Regina, can you explain a bit more about what you are trying to do and how this is set up? I quickly tried to reproduce locally by starting a cluster and could not see this behaviour. Also, can you try to increase the log level to INFO and see whether you see anything suspicious in the logs?

Re: S3 Access in eu-central-1

2017-11-28 Thread Stephan Ewen
Got a pointer from Steve that this is answered on Stack Overflow here: https://stackoverflow.com/questions/36154484/aws-java-sdk-manually-set-signature-version Flink 1.4 contains a specially bundled "fs-s3-hadoop" with a smaller dependency footprint, compatible across Hadoop versions, and based on a later
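If the linked answer is the one about forcing AWS Signature Version 4 (worth verifying against the actual thread), the option boils down to a single AWS SDK system property, sketched here; with Flink it would typically be passed as a JVM option (e.g. -Dcom.amazonaws.services.s3.enableV4=true via env.java.opts) rather than set in code.

```java
// eu-central-1 only accepts Signature Version 4; this AWS SDK switch forces it.
public class EnableSigV4 {
    public static void main(String[] args) {
        System.setProperty("com.amazonaws.services.s3.enableV4", "true");
    }
}
```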

Re: Missing checkpoint when restarting failed job

2017-11-28 Thread Gerard Garcia
I've been monitoring the task and checkpoint 1 never gets deleted. Right now we have: chk-1 chk-1222 chk-326 chk-329 chk-357 chk-358 chk-8945 chk-8999 chk-9525 chk-9788 chk-9789 chk-9790 chk-9791 I made the task fail and it recovered without problems, so for now I would say that the

Re: How to perform efficient DataSet reuse between iterations

2017-11-28 Thread Fabian Hueske
Hi, by calling result.count(), you compute the complete plan from the beginning and not just the operations you added since the last execution. Looking at the output you posted, each step takes about 15 seconds (with about 5 secs of initialization). So the 20 seconds of the first step include

Re: Non-intrusive way to detect which type is using kryo ?

2017-11-28 Thread Timo Walther
Hi Kien, at the moment I'm working on some improvements to the type system that will make it easier to tell if a type is a POJO or not. I have some utility in mind like `ensurePojo(MyType.class)` that would throw an exception with a reason why this type must be treated as a generic type.
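Until such a utility exists, a rough check is already possible with the current type-extraction classes: a type that resolves to a GenericTypeInfo is one that Flink hands to Kryo. A minimal sketch:

```java
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.typeutils.GenericTypeInfo;
import org.apache.flink.api.java.typeutils.TypeExtractor;

public class KryoCheck {
    /** Returns true if Flink treats this class as a generic type, i.e. serializes it with Kryo. */
    public static boolean usesKryo(Class<?> clazz) {
        TypeInformation<?> typeInfo = TypeExtractor.createTypeInfo(clazz);
        return typeInfo instanceof GenericTypeInfo;
    }

    public static void main(String[] args) {
        System.out.println(usesKryo(String.class)); // false: covered by a built-in serializer
        System.out.println(usesKryo(Thread.class)); // true: falls back to the generic (Kryo) case
    }
}
```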

Re: Non-intrusive way to detect which type is using kryo ?

2017-11-28 Thread Antoine Philippot
Hi Kien, The only way I found is to add this line at the beginning of the application to detect Kryo serialization: `com.esotericsoftware.minlog.Log.set(Log.LEVEL_DEBUG)` Antoine On Tue, Nov 28, 2017 at 02:41, Kien Truong wrote: > Hi, > > Is there any way to only

Re: How to perform efficient DataSet reuse between iterations

2017-11-28 Thread Miguel Coimbra
Hello Fabian, Thank you for the reply. I was hoping the situation had in fact changed. As far as I know, I am not calling execute() directly even once - it is being called implicitly by simple DataSink elements added to the plan through count(): System.out.println(String.format("%d-th graph