Hey Alexander,

We have a checkpoint offset tool (./samza-shell/src/main/bash/checkpoint-tool.sh), which allows you to read and write offsets for all input partitions. This tool will allow you to arbitrarily set offsets before a job starts.
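
As a rough sketch of what that looks like: the tool is pointed at the job's config with --config-path, and given a properties file of offsets to write with --new-offsets (without --new-offsets it just prints the current offsets). The file below is only an illustration. The system name "kafka", the stream name "my-topic", and the offset values are hypothetical, and the key layout is an assumption that may differ between Samza releases, so check it against the tool's usage output for your version.

```properties
# Hypothetical offsets file passed to checkpoint-tool.sh via --new-offsets.
# Assumed key layout: systems.<system>.streams.<stream>.partitions.<partition>=<offset>
# "kafka" and "my-topic" are placeholder names; the offsets are made-up values.
systems.kafka.streams.my-topic.partitions.0=12345
systems.kafka.streams.my-topic.partitions.1=67890
```

Each partition gets its own entry, so a job that reads several partitions needs one line per partition it should reposition.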

We also support the samza.offset.default and samza.reset.offset configurations:

http://samza.incubator.apache.org/learn/documentation/0.7.0/jobs/configuration-table.html#streams

These allow you to specify whether a job should read from the head or tail of an input stream when the job first starts.

We don't currently support a way to change offsets once a job has already started. If you can get more specific about your use case,

Cheers,
Chris

On 11/10/14 6:53 AM, "Alexander Taggart" <[email protected]> wrote:

>We're investigating using Samza, and one aspect of our usage would require
>being able to start a job such that it begins reading from a specified
>Kafka offset. If I understand correctly, each job being bound to a
>specific partition would need to be provided with a specific offset. Is
>there any facility for providing such values, either via config or via
>API? If not, what might be a good approach to implementing it (e.g., a
>custom kafka consumer)?
