Re: Slow flink checkpoint

2018-03-16 Thread Fabian Hueske
Hi, AFAIK, that's not possible. The only "solution" is to reduce the number of timers. Whether that's possible or not, depends on the application. For example, if you use timers to clean up state, you can work with an upper and lower bound and only register one timer for each (upper - lower) inter
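The snippet above is cut off, so the following is only a minimal sketch of the general idea it points at: instead of registering one cleanup timer per event, round each requested cleanup time up to a fixed bucket boundary so that many requests collapse into a single timer. The class `TimerCoalescing`, the method `coalesce`, and the 1000 ms interval are hypothetical names and values chosen for illustration; they are not taken from the thread or from Flink.

```java
import java.util.TreeSet;

// Hypothetical helper, not part of Flink: buckets cleanup timestamps.
final class TimerCoalescing {
    private TimerCoalescing() {}

    // Rounds timestampMs up to the next multiple of intervalMs, so every
    // timestamp inside the same interval maps to the same timer time.
    static long coalesce(long timestampMs, long intervalMs) {
        if (intervalMs <= 0) {
            throw new IllegalArgumentException("intervalMs must be positive");
        }
        return Math.floorDiv(timestampMs, intervalMs) * intervalMs + intervalMs;
    }

    public static void main(String[] args) {
        long intervalMs = 1_000L; // assumed bucket size
        long[] requested = {1_001L, 1_500L, 1_999L, 2_000L};

        // A set stands in for "timers registered for one key":
        // equal timestamps collapse into a single entry.
        TreeSet<Long> timers = new TreeSet<>();
        for (long t : requested) {
            timers.add(coalesce(t, intervalMs));
        }
        System.out.println(timers); // prints [2000, 3000]
    }
}
```

In a Flink job the coalesced value would be what gets passed to `ctx.timerService().registerEventTimeTimer(...)`. The trade-off is that state may be cleaned up to one interval later than the exact time requested.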

Re: [ANNOUNCE] Apache Flink 1.3.3 released

2018-03-16 Thread Fabian Hueske
Thanks for managing this release Gordon! Cheers, Fabian 2018-03-15 21:24 GMT+01:00 Tzu-Li (Gordon) Tai : > The Apache Flink community is very happy to announce the release of Apache > Flink 1.3.3, which is the third bugfix release for the Apache Flink 1.3 > series. > > Apache Flink® is an open-s

Re: Deserializing the InputFormat (org.apache.flink.api.java.hadoop.mapreduce.HadoopInputFormat@50bb10fd) failed: unread block data after upgrade to 1.4.2

2018-03-16 Thread Fabian Hueske
One thing that changed in Flink 1.4 with respect to Hadoop is that Hadoop is now an optional dependency. Since Hadoop dependencies are now dynamically loaded, you might use different versions on the client and the cluster? Also the order in which classes are loaded changed. You could try to enable

Re: Deserializing the InputFormat (org.apache.flink.api.java.hadoop.mapreduce.HadoopInputFormat@50bb10fd) failed: unread block data after upgrade to 1.4.2

2018-03-16 Thread eSKa
Thanks a lot. It seems to work. What is the default classloader order now? To keep it working in the new version, how should I inject the Hadoop dependencies so that they are read properly? The class that is missing (HadoopInputFormat) is from the hadoop-compatibility library. I have upgraded it to version 1

Re: Move files read by flink

2018-03-16 Thread flinkuser101
So, if you want to have your files parsed, try to stay away from the Flink file parser (v1.4). Use NiFi to parse the files, and then you could use Kafka or Flink. My data pipeline looks like: ftp <-> nifi <-> kafka <-> flink

Re: Slow flink checkpoint

2018-03-16 Thread Stefan Richter
Hi, yes, that is correct: the timer service is currently only available in main memory and only with synchronous snapshots. This topic is on our TODO list for after the Flink 1.5 release. Best, Stefan > On 16.03.2018 at 09:03, Fabian Hueske wrote: > > Hi, > > AFAIK, that's not possible. >

Re: [ANNOUNCE] Apache Flink 1.3.3 released

2018-03-16 Thread Till Rohrmann
Thanks for managing the release Gordon and also thanks to the community! Cheers, Till On Fri, Mar 16, 2018 at 9:05 AM, Fabian Hueske wrote: > Thanks for managing this release Gordon! > > Cheers, Fabian > > 2018-03-15 21:24 GMT+01:00 Tzu-Li (Gordon) Tai : > >> The Apache Flink community is very

Adding a new field in java class -> restore fails with "KryoException: Unable to find class"

2018-03-16 Thread Juho Autio
Is it possible to add new fields to the object type of a stream, and then restore from savepoint? I tried to add a new field "private String" to my java class. It previously had "private String" and a "private final Map". When trying to restore an old savepoint after this code change, it failed wi

Basic question about flink programms

2018-03-16 Thread Pan Glust
Hello everyone, coming from the Spring/CDI world, I'm new to Flink and to stream processing in general, and I apologize for the very basic questions. I wrote a simple Flink job with all functions inlined in the main method. The main method has some static instance variables like HTTP client and g

Re: Slow flink checkpoint

2018-03-16 Thread 林德强
Hi Fabian, reducing the number of timers is a good idea. But in my application the timer is different from the key registered following the keyBy. Maybe it can't work with an upper and lower bound. I tried modifying the Flink source and starting a thread to clean the expired keyed state, but it d

Re: [ANNOUNCE] Apache Flink 1.3.3 released

2018-03-16 Thread Stephan Ewen
This release fixed a critical bug that could lead to loss of checkpoint state: https://issues.apache.org/jira/browse/FLINK-7783 We recommend that all users on Flink 1.3.2 upgrade to 1.3.3. On Fri, Mar 16, 2018 at 10:31 AM, Till Rohrmann wrote: > Thanks for managing the release Gordon and al

Re: Adding a new field in java class -> restore fails with "KryoException: Unable to find class"

2018-03-16 Thread Stephan Ewen
Hi! Schema evolution is a bit tricky at the moment. There is a short term and long term answer to this: - Long term: We store serializer configuration in the snapshots, and want to use this in the future to offer a path that converts old format to new format (read with old serializer, pass thro

Re: Basic question about flink programms

2018-03-16 Thread dyana . rose
In general I'd expect that every class with state that you use in Flink will be serialised, and therefore you should mark your classes as Serializable and set a serialVersionUID. I have what sounds like a very similar problem to yours. I need to use a non-serializable component in my strea

Re: Implement a sort inside the WindowFunction

2018-03-16 Thread Felipe Gutierrez
thanks Fabian, I am building an example and generating my own fake source to process in Flink. I am going to implement more stuff with keys and event time processing t

[ANNOUNCE] Flink 1.5 release testing effort

2018-03-16 Thread Till Rohrmann
Dear community, as it probably has gone a little bit unnoticed, I wanted to bring up that the Flink community is currently preparing the Flink 1.5 release. The release branch has already been created [1]. Currently, the remaining bugs are being fixed and the release is being tested. I hope that we

Intergrations Test in Scala

2018-03-16 Thread zavalit
Hi Flinkers, I've already posted on Stack Overflow, but there is still not much feedback about the problem: https://stackoverflow.com/questions/49155762/failing-to-trigger-streamingmultipleprogramstestbase-test-in-scala/49258997#49258997 Problem: I cannot start any test extending *StreamingMul

Strange behavior on filter, group and reduce DataSets

2018-03-16 Thread simone
Hi all, I am using Flink 1.3.1 and I have found strange behavior when running the following logic: 1. Read data from a file and store it in a DataSet 2. Split the dataset in two, by checking if "field1" of the POJOs is empty or not, so that the first dataset contains only elements with non-empty "fiel

Re: SQL Table API: Naming operations done in query

2018-03-16 Thread Juho Autio
Hi, have there been any changes to state handling with Flink SQL? Anything planned? I didn't find it at https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql.html. Recently I ran into problems when trying to restore the state after changes that I thought wouldn't change the executi

Re: Strange behavior on filter, group and reduce DataSets

2018-03-16 Thread Kien Truong
Hi, just a guess, but string comparison in Java should use the equals method, not the == operator. Regards, Kien On 3/16/2018 9:47 PM, simone wrote: subject.getField("field1") == "";
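To illustrate the general Java point raised in this reply (not the poster's actual code, which is not shown in full here), a minimal sketch of how == and equals can disagree on an empty string. The class and method names are hypothetical.

```java
// Hypothetical demo class; not taken from the thread.
final class StringCompareDemo {
    private StringCompareDemo() {}

    // Content comparison; also returns false (rather than throwing) for null.
    static boolean isEmptyByEquals(String value) {
        return "".equals(value);
    }

    // Reference comparison: true only if value is the very same object
    // as the interned "" literal.
    static boolean isEmptyByIdentity(String value) {
        return value == "";
    }

    public static void main(String[] args) {
        // new String("") is guaranteed to be a distinct object from the literal,
        // similar to strings produced at runtime by parsing or deserialization.
        String runtimeEmpty = new String("");

        System.out.println(isEmptyByIdentity(runtimeEmpty)); // prints false
        System.out.println(isEmptyByEquals(runtimeEmpty));   // prints true
    }
}
```

With a compile-time literal both checks return true, which is why a == comparison can appear to work in small tests and then fail on data read at runtime.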

Re: Strange behavior on filter, group and reduce DataSets

2018-03-16 Thread simone
Sorry, I translated the code into pseudocode too fast. That is indeed an equals. On 16/03/2018 15:58, Kien Truong wrote: Hi, just a guess, but string comparison in Java should use the equals method, not the == operator. Regards, Kien On 3/16/2018 9:47 PM, simone wrote: /subject.getField("fi

Re: SQL Table API: Naming operations done in query

2018-03-16 Thread Fabian Hueske
Hmmm, that's a strange behavior that is unexpected (to me). Flink optimizes the Table API / SQL queries when a Table is converted into a DataStream (or DataSet) or emitted to a TableSink. So, given that you convert the result tables in addSink() into a DataStream and write them to a sink function,

How to correct use timeWindow() with DataStream?

2018-03-16 Thread Felipe Gutierrez
Hi all, I am building an example with DataStream using Flink that has a fake source generator of LogLine(Date d, String line). I want to work with Watermarks on it so I created a class that implements AssignerWithPeriodicWatermarks. If I don't use the monad ".timeWindow(Time.seconds(2))" on the da

Queryable State

2018-03-16 Thread Vishal Santoshi
We are making a few decisions on use cases where queryable state is a natural fit: https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/state/queryable_state.html Is queryable state production-ready? We will go to Flink 1.5 if that helps to make the case for the usage.

CsvSink

2018-03-16 Thread karim amer
Hi there, I am trying to write a CSV sink to disk, but nothing is getting written. I think the file is getting overwritten or truncated once the stream process finishes. Does anyone know why the file is getting overwritten or truncated, and how I can fix this? tableEnv.registerDataStream("table", w

Cassandra counter datatype support through POJO

2018-03-16 Thread Rohan Thimmappa
Hi all, I have a table containing usage, which is of the counter data type. Every time I get usage for an id, I would like to use the counter data type to increment it. https://docs.datastax.com/en/cql/3.1/cql/cql_using/use_counter_t.html Does the POJO approach of the Cassandra sink support it, or do I have to use the SQL appro