Accept Avro object in ListCheckpointed interface

2017-07-21 Thread Kien Truong
Hi, ListCheckpointed only accepts Serializable objects at the moment, which makes it cumbersome to checkpoint Avro objects (they have to be converted to byte arrays first). Are there any plans to support Avro objects directly? Best regards, Kien

Re: Accept Avro object in ListCheckpointed interface

2017-07-21 Thread Stefan Richter
Hi, ListCheckpointed is just a simplified shortcut version of the more powerful CheckpointedFunction interface. If Serializable does not cover your use case, I suggest you go with CheckpointedFunction and create your state in initializeState(…) via the initializationContext.getOperatorStateStor

write into hdfs using avro

2017-07-21 Thread Rinat
Hi folks! I’ve got a quick question: I’m trying to save a stream of events from Kafka into HDFS using org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink with Avro serialization. If I understood correctly, I should use some implementation of org.apache.flink.streaming.connectors.f

Re: Problem with Flink restoring from checkpoints

2017-07-21 Thread Francisco Blaya
Hi Aljoscha, I've tried both. When we restart from a manually created savepoint, we see "restored from savepoint" in the Flink Dashboard. If we restart from an externalized checkpoint, we see "restored from checkpoint". In both scenarios we lose data in S3. Cheers, Fran On 20 July 2017 at 17:54, Alj

RE: Flink ML with DataStream

2017-07-21 Thread Branham, Jeremy [IT]
Thanks Fabian – I’m interested in the early development of ML on streams. Harshith and I plan on doing some prototyping for NRT anomaly detection leveraging the stream API. It would be great if we could produce something reusable for the community. From: Fabian Hueske [mailto:fhue...@gmail.com]

Re: Flink ML with DataStream

2017-07-21 Thread Fabian Hueske
Hi Jeremy, here are a few links about the recent efforts for ML on streams with Flink: - Discussion on the dev mailing list [1] - Announcement of a Slack channel [2] - GDocs Design Doc [3] IMO, anomaly detection is a great use case for ML on streams. Cheers, Fabian [1] https://lists.apache.org

Re: FlinkKafkaConsumer010 - Memory Issue

2017-07-21 Thread Kien Truong
Hi, from the log, it doesn't seem that the task manager uses a lot of memory. Can you post the output of top? Regards, Kien On 7/20/2017 1:23 AM, PedroMrChaves wrote: Hello, Whenever I submit a job to Flink that retrieves data from Kafka the memory consumption continuously increases. I've c