Re: Possible use case: Simulating iterative batch processing by rewinding source

2016-04-11 Thread Robert Metzger
Flink's DataStream API also allows reading files from disk (local, HDFS, etc.), so you don't have to set up Kafka to make this work (if you have it already, you can of course use it). On Mon, Apr 11, 2016 at 11:08 AM, Ufuk Celebi wrote: > On Mon, Apr 11, 2016 at 10:26 AM, Raul

Re: Possible use case: Simulating iterative batch processing by rewinding source

2016-04-11 Thread Ufuk Celebi
On Mon, Apr 11, 2016 at 10:26 AM, Raul Kripalani wrote: > Would appreciate the feedback of the community. Even if it's to inform that > currently this iterative, batch, windowed approach is not possible, that's > ok! Hey Raul! What you describe should work with Flink. This is

Re: Possible use case: Simulating iterative batch processing by rewinding source

2016-04-11 Thread Raul Kripalani
Hello, Perhaps the description of the use case wasn't clear enough? Please let me know. Would appreciate the feedback of the community. Even if it's to inform that currently this iterative, batch, windowed approach is not possible, that's ok! Cheers, *Raúl Kripalani* PMC & Committer @ Apache

Re: Possible use case: Simulating iterative batch processing by rewinding source

2016-04-06 Thread Christophe Salperwyck
Hi, I am interested too. For my part, I was thinking of using HBase as a backend so that my data is stored sorted, which is handy for generating time series in the right order. Cheers, Christophe 2016-04-06 21:22 GMT+02:00 Raul Kripalani: > Hello, > > I'm getting started with Flink

Possible use case: Simulating iterative batch processing by rewinding source

2016-04-06 Thread Raul Kripalani
Hello, I'm getting started with Flink for a use case that could leverage the window processing abilities of Flink that Spark does not offer. Basically I have dumps of timeseries data (10 years of ticks) for which I need to calculate many metrics in an exploratory manner based on event time. NOTE: I don't