Hey Cody, I'm convinced that I'm not going to get the functionality I want
without using the Direct Stream API.

I'm now looking through
https://github.com/koeninger/kafka-exactly-once/blob/master/blogpost.md#exactly-once-using-transactional-writes
where you say "For the very first time the job is run, the table can be
pre-loaded with appropriate starting offsets."

Could you provide some guidance on how to determine valid starting
offsets for that very first run? In my case I have 10+ topics across
multiple deployment environments, with an unknown and potentially
dynamic number of partitions per topic per environment.

I'd be happy if I could initialize all consumers with *auto.offset.reset
= "largest"*, record the partitions and offsets as they flow through
Spark, and then use those discovered offsets from there on out.
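
Roughly what I have in mind (a minimal sketch against the 1.x direct
stream API; brokers, topics, ssc, and saveOffsets are placeholders for
my own config and persistence code, not real APIs):

    import kafka.serializer.StringDecoder
    import org.apache.spark.streaming.kafka.{HasOffsetRanges, KafkaUtils}

    // placeholder config; "largest" covers the first run, when nothing
    // has been recorded yet
    val kafkaParams = Map(
      "metadata.broker.list" -> brokers,
      "auto.offset.reset" -> "largest")

    val stream = KafkaUtils.createDirectStream[
      String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topics)

    stream.foreachRDD { rdd =>
      // cast the stream's RDD directly, before any transformations
      rdd.asInstanceOf[HasOffsetRanges].offsetRanges.foreach { o =>
        // hypothetical helper that persists topic/partition/untilOffset
        saveOffsets(o.topic, o.partition, o.untilOffset)
      }
    }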

I'm thinking I can just use some if/else logic: the basic
createDirectStream when no offsets are stored yet, and the more
advanced createDirectStream(...fromOffsets...) when offsets for my
topics exist in the database. Any reason that wouldn't work? And if I
don't include an offset range for a particular partition, will that
partition be ignored?
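
For concreteness, the if/else I'm picturing (sketch only; loadOffsets
is a hypothetical helper that reads my offsets table, and ssc,
kafkaParams, and topics are the same placeholders as above):

    import kafka.common.TopicAndPartition
    import kafka.message.MessageAndMetadata
    import kafka.serializer.StringDecoder
    import org.apache.spark.streaming.kafka.KafkaUtils

    // hypothetical: read stored (topic, partition) -> offset rows;
    // empty on the very first run
    val stored: Map[TopicAndPartition, Long] = loadOffsets(topics)

    val stream = if (stored.isEmpty) {
      // first run: nothing stored, fall back to auto.offset.reset
      KafkaUtils.createDirectStream[
        String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topics)
    } else {
      // later runs: resume from exactly the offsets recorded last time
      val handler = (mmd: MessageAndMetadata[String, String]) => (mmd.key, mmd.message)
      KafkaUtils.createDirectStream[
        String, String, StringDecoder, StringDecoder, (String, String)](
        ssc, kafkaParams, stored, handler)
    }

Both branches come out as a stream of (key, value) pairs, so the code
downstream wouldn't have to care which path was taken.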




On Wed, Dec 2, 2015 at 3:17 PM Cody Koeninger <c...@koeninger.org> wrote:

> Use the direct stream.  You can put multiple topics in a single stream,
> and differentiate them on a per-partition basis using the offset range.
>
> On Wed, Dec 2, 2015 at 2:13 PM, dutrow <dan.dut...@gmail.com> wrote:
>
>> I found the JIRA ticket: https://issues.apache.org/jira/browse/SPARK-2388
>>
>> It was marked as invalid.
>>
Dan ✆
