Hello All,

I would like to hear the community's thoughts on the following.

We are using the Kafka 0.9 input operator to read from a few topics, and
we use this stream to generate Parquet files. This approach works well for
a basic use case. At a later point in time, we would like to "merge" all
of the previously generated Parquet files. To do that, I would like to
reprocess data starting from a specific offset in each partition. Each
partition will have its own starting and ending offsets that bound the
range I need to process.
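
To make the requirement concrete, the per-partition bookkeeping could look
roughly like the sketch below. It is plain Java with no Kafka dependency.
The class OffsetRangeTracker and its methods are hypothetical names used
only for illustration; they are not part of the Kafka client or of the
operator. The sketch assumes the ending offset is exclusive. In a real
consumer, the starting offset would presumably be handed to
KafkaConsumer.seek() after KafkaConsumer.assign(), both of which exist in
the 0.9 client.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: tracks a [start, end) offset range per topic partition. */
class OffsetRangeTracker {

    /** Inclusive start and exclusive end offset for one partition (assumption). */
    static final class Range {
        final long start;
        final long end;

        Range(long start, long end) {
            if (start < 0 || end < start) {
                throw new IllegalArgumentException("invalid range: " + start + ".." + end);
            }
            this.start = start;
            this.end = end;
        }
    }

    private final Map<String, Range> ranges = new HashMap<>();
    private final Map<String, Long> nextOffset = new HashMap<>();

    private static String key(String topic, int partition) {
        return topic + ":" + partition;
    }

    /** Registers the range to reprocess for one partition. */
    void addRange(String topic, int partition, long start, long end) {
        String k = key(topic, partition);
        ranges.put(k, new Range(start, end));
        nextOffset.put(k, start);
    }

    /** Offset to seek to before reading; the partition must have been registered. */
    long startOffset(String topic, int partition) {
        return ranges.get(key(topic, partition)).start;
    }

    /** Returns true if the record lies inside the registered range, and records progress. */
    boolean accept(String topic, int partition, long offset) {
        String k = key(topic, partition);
        Range r = ranges.get(k);
        if (r == null || offset < r.start || offset >= r.end) {
            return false;
        }
        nextOffset.put(k, offset + 1);
        return true;
    }

    /** True once the last offset before the end of the range has been accepted. */
    boolean isPartitionDone(String topic, int partition) {
        String k = key(topic, partition);
        Range r = ranges.get(k);
        return r == null || nextOffset.get(k) >= r.end;
    }

    /** True once every registered partition has reached its ending offset. */
    boolean isDone() {
        for (Map.Entry<String, Range> e : ranges.entrySet()) {
            if (nextOffset.get(e.getKey()) < e.getValue().end) {
                return false;
            }
        }
        return true;
    }
}
```

The idea is that the reader would seek each partition to startOffset(),
drop any record for which accept() returns false, and stop once isDone()
returns true.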

Is there an easy way to extend the Kafka 0.9 operator to support this
(perhaps along the lines of the offset manager in the 0.8 versions of the
Kafka operator)? Any thoughts would be appreciated.

Regards,
Ananth
