> processing engine such as Spark? I do
> like the choices that adopting the unified programming model outlined in
> the Apache Beam/Google Cloud Dataflow SDK would give, and it purports to
> have runners for both Flink and Spark.
>
> Regards,
>
> Leith
*From: *Till Rohrmann
*Date: *Wednesday, 20 July 2016 at 5:05 PM
*To: *
*Subject: *Re: Using Kafka and Flink for batch processing of a batch data source
At the moment there is also no batch source for Kafka. I'm also not so sure
how you would define a batch given a Kafka stream. Only reading till a
certain offset? Or maybe until one has read n messages?
I think it's best to write the batch data to HDFS or another batch data
store.
Cheers,
Till
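The two batch definitions Till mentions ("read till a certain offset" or "read n messages") can be sketched generically. This is a conceptual sketch, not a Flink or Kafka API: `read_batch` and the simulated stream are invented here, standing in for a real consumer poll loop over one partition.

```python
def read_batch(messages, end_offset):
    """Consume (offset, value) pairs until end_offset (exclusive).

    messages: any iterable of (offset, value) tuples, e.g. what a
    Kafka consumer loop would yield for a single partition.
    end_offset: the partition's end offset captured when the batch
    job starts; everything before it is "the batch".
    """
    batch = []
    for offset, value in messages:
        if offset >= end_offset:
            break  # offsets are monotonically increasing per partition
        batch.append(value)
    return batch

# Simulated stream standing in for a Kafka partition.
stream = ((i, f"msg-{i}") for i in range(100))
print(read_batch(stream, 3))  # ['msg-0', 'msg-1', 'msg-2']
```

The "first n messages" variant is just `itertools.islice(stream, n)` over the same iterable; the awkward part in practice is that neither boundary is visible to the producer, which is why writing the batch to HDFS first is the cleaner option.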
It likely does not make sense to publish a file ("batch data") into Kafka
unless the file is very small.
An improvised pub-sub mechanism for Kafka could be to (a) write the file
into a persistent store outside of Kafka, and (b) publish a message into
Kafka about that write so as to enable processing.
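That improvised pattern might look like the following sketch. The message schema, field names, and path are all invented for illustration; actually publishing the pointer message would use a real Kafka client (e.g. a `KafkaProducer`), which is omitted here.

```python
import hashlib
import json

def make_notification(path, payload):
    # Hypothetical "pointer" message: the file itself lives in an
    # external store (e.g. HDFS); only this small JSON record would be
    # published to Kafka so consumers know the file is ready to process.
    return json.dumps({
        "path": path,                                   # where the file was written
        "size_bytes": len(payload),                     # sanity check for readers
        "sha256": hashlib.sha256(payload).hexdigest(),  # integrity check
    })

data = b"a,b\n1,2\n"
note = make_notification("hdfs:///data/input/part-001.csv", data)
print(json.loads(note)["path"])  # hdfs:///data/input/part-001.csv
```

Consumers then treat the Kafka message as a trigger and read the file directly from the store, which keeps large payloads out of the broker.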
I am currently working on an architecture for a big data streaming and batch
processing platform. I am planning on using Apache Kafka as the distributed
messaging system to handle data from streaming data sources and then pass it
on to Apache Flink for stream processing. I would also like to use Flink for
batch processing of batch data sources.