Hi Ananth,
Unlike files, Kafka is usually used for streaming cases. Correct me if I'm
wrong, but your use case sounds like batch processing. We didn't consider an
end offset in our Kafka input operator design, though it could be a useful
feature. Unfortunately, as far as I know, there is no easy way to extend the
existing operator to achieve that.

OffsetManager is not designed for end offsets. It's only a customizable
callback to update the committed offsets, and the start offsets it loads are
intended for stateful application restart.
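Just to sketch what such an extension would need (purely illustrative; none of these names exist in the operator today): the operator would have to carry a target end offset per partition, drop any record at or beyond that offset, and report completion once every partition's consumer position reaches its target. A minimal version of that bookkeeping, independent of the KafkaConsumer itself:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper sketching the stop condition an end-offset-aware
// operator would need. Partition ids map to exclusive end offsets.
class EndOffsetTracker {
    private final Map<Integer, Long> endOffsets = new HashMap<>();

    EndOffsetTracker(Map<Integer, Long> endOffsets) {
        this.endOffsets.putAll(endOffsets);
    }

    // A record should be emitted only if its offset lies before the
    // partition's configured end offset (or the partition is unbounded).
    boolean shouldEmit(int partition, long offset) {
        Long end = endOffsets.get(partition);
        return end == null || offset < end;
    }

    // Given the consumer's current position in each partition, returns
    // true once every bounded partition has reached its end offset.
    boolean allDone(Map<Integer, Long> positions) {
        for (Map.Entry<Integer, Long> e : endOffsets.entrySet()) {
            Long pos = positions.get(e.getKey());
            if (pos == null || pos < e.getValue()) {
                return false;
            }
        }
        return true;
    }
}
```

In a real extension this would wrap the poll loop, with the 0.9 consumer's assign() and seek() used to start each partition at its given start offset.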

Can you create a ticket and elaborate on your use case there? Thanks!

Regards,
Siyuan

On Friday, June 10, 2016, Ananth Gundabattula <agundabatt...@gmail.com>
wrote:

> Hello All,
>
> I was wondering what would be the community's thoughts on the following ?
>
> We are using the Kafka 0.9 input operator to read from a few topics. We are
> using this stream to generate a Parquet file. This approach is all good
> for a beginner's use case. At a later point in time, we would like to
> "merge" all of the Parquet files previously generated, and for this I would
> like to reprocess data exactly from a particular offset inside each of the
> partitions. Each of the partitions will have its own starting and ending
> offsets that I need to process.
>
> I was wondering if there is an easy way to extend the Kafka 0.9 operator
> (perhaps along the lines of the offset manager in the 0.8 versions of the
> Kafka operator). Thoughts please?
>
> Regards,
> Ananth
>
