Hello all, I would like to hear the community's thoughts on the following.
We are using the Kafka 0.9 input operator to read from a few topics, and we use the resulting stream to generate Parquet files. This approach works well for our initial use case.

At a later point in time, we would like to "merge" all of the Parquet files generated so far. To do this, I would like to reprocess the data starting from a specific offset inside each partition. Each partition will have its own starting and ending offsets that I need to process.

Is there an easy way to extend the Kafka 0.9 operator to do this (perhaps along the lines of the offset manager in the 0.8 versions of the Kafka operator)?

Thoughts, please?

Regards,
Ananth
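
P.S. To make the requirement concrete, here is a rough sketch of the per-partition bookkeeping I have in mind. It is plain Java and not tied to the operator: `OffsetRangeTracker` and all of its methods are hypothetical names of my own, not an existing API. I am assuming start offsets are inclusive and end offsets are exclusive. With the plain 0.9 consumer, I believe the start offset would be passed to `KafkaConsumer.seek` after `assign`, but how this would plug into the operator is exactly what I am unsure about.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: tracks one bounded offset range per partition. */
final class OffsetRangeTracker {
    // partition id -> {start offset (inclusive), end offset (exclusive)}
    private final Map<Integer, long[]> ranges = new HashMap<>();
    // partition id -> next offset still expected for that partition
    private final Map<Integer, Long> next = new HashMap<>();

    void addRange(int partition, long startInclusive, long endExclusive) {
        if (startInclusive < 0 || endExclusive < startInclusive) {
            throw new IllegalArgumentException("invalid range for partition " + partition);
        }
        ranges.put(partition, new long[] {startInclusive, endExclusive});
        next.put(partition, startInclusive);
    }

    /** Offset to seek to before reading the given partition. */
    long startOffset(int partition) {
        return ranges.get(partition)[0];
    }

    /** Returns true if the record lies inside the requested range and should be emitted. */
    boolean accept(int partition, long offset) {
        long[] range = ranges.get(partition);
        if (range == null) {
            return false; // partition was never requested
        }
        if (offset >= range[1]) {
            // Past the end: mark the partition finished even if offsets had gaps.
            next.put(partition, range[1]);
            return false;
        }
        if (offset < range[0]) {
            return false; // before the start of the range
        }
        next.put(partition, Math.max(next.get(partition), offset + 1));
        return true;
    }

    /** True once the partition's range has been fully consumed. */
    boolean isPartitionDone(int partition) {
        long[] range = ranges.get(partition);
        return range == null || next.get(partition) >= range[1];
    }

    /** True once every requested partition has been fully consumed. */
    boolean isDone() {
        for (int partition : ranges.keySet()) {
            if (!isPartitionDone(partition)) {
                return false;
            }
        }
        return true;
    }
}
```

The idea would be to call `accept` for every record read and to stop reading (or pause a partition) once `isPartitionDone` or `isDone` returns true.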
