* Is there a tentative date for 0.10.0 release?
    I think it's coming out soon. @Yi Pan should know more about that.


* I checked the checkpoint topic for Samza job and it seems the checkpoint 
topic is created with 1 partition by default. Given that each Samza task will 
need to read from the checkpoint topic, it is similar to what I need (each 
Samza task reading from the same partition of a topic). I am wondering how 
that is achieved?
    In the current implementation, only the AM reads the checkpoint stream and 
distributes the information to all the nodes through its HTTP server. Not all 
the nodes consume the checkpoint stream. Correct me if I am wrong.
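
    As a rough illustration of the bootstrap and broadcast settings mentioned 
earlier in this thread, the job config might look like the sketch below. The 
system name "kafka" and the partition range are assumptions for illustration; 
only the topic names topicD and topicR come from the thread.

```properties
# Assumption: the Kafka system is registered under the name "kafka".
task.inputs=kafka.topicD,kafka.topicR

# Bootstrap: read topicR up to its current head before processing topicD.
systems.kafka.streams.topicR.samza.bootstrap=true
# Ignore any checkpointed offset for topicR on restart ...
systems.kafka.streams.topicR.samza.reset.offset=true
# ... and start from the oldest available message instead.
systems.kafka.streams.topicR.samza.offset.default=oldest

# 0.10.0 alternative (broadcast stream): every task consumes the listed
# partitions of topicR. The range [0-2] is a placeholder, and topicR would
# presumably be listed here rather than in task.inputs.
# task.broadcast.inputs=kafka.topicR#[0-2]
```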


Thanks,
Yan






At 2015-10-28 02:49:23, "Chen Song" <chen.song...@gmail.com> wrote:
>Thanks Yan.
>
>* Is there a tentative date for 0.10.0 release?
>* I checked the checkpoint topic for Samza job and it seems the checkpoint
>topic is created with 1 partition by default. Given that each Samza task
>will need to read from the checkpoint topic, it is similar to what I need
>(each Samza task reading from the same partition of a topic). I am
>wondering how that is achieved?
>
>Chen
>
>On Sat, Oct 24, 2015 at 5:52 AM, Yan Fang <yanfangw...@163.com> wrote:
>
>> Hi Chen Song,
>>
>>
>> Sorry for the late reply. What you describe is a typical bootstrap use
>> case. See the bootstrap configuration in
>> http://samza.apache.org/learn/documentation/0.9/container/streams.html .
>> With it, Samza always reads *topicR* from the beginning when it
>> restarts, and treats *topicR* as a normal topic once it has caught up on
>> the existing msgs in *topicR*.
>>
>>
>> == can we configure each individual Samza task to read data from all
>> partitions from a topic?
>> This works in 0.10.0 by using a broadcast stream. In 0.9.0, you
>> have to "create topicR with the same number of partitions as *topicD*, and
>> replicate data to all partitions".
>>
>>
>> Hope this still helps.
>>
>>
>> Thanks,
>> Yan
>>
>>
>> At 2015-10-22 04:44:41, "Chen Song" <chen.song...@gmail.com> wrote:
>> >In our samza app, we need to read data from MySQL (reference table) with a
>> >stream. So the requirements are
>> >
>> >* Read data into each Samza task before processing any message.
>> >* The Samza task should be able to listen to updates happening in MySQL.
>> >
>> >I did some research, scanning through relevant conversations and
>> >JIRAs in the community, but did not find a solution yet. Nor did I find a
>> >recommended way to do this.
>> >
>> >If my data stream comes from a topic called *topicD*, the options in my
>> >mind are:
>> >
>> >   - Use Kafka
>> >      1. Use one of the CDC-based solutions to replicate data in MySQL
>> >      to a Kafka topic. https://github.com/wushujames/mysql-cdc-projects/wiki.
>> >      Say the topic is called *topicR*.
>> >      2. In my Samza app, read the reference table from *topicR* and
>> >      persist it in a cache in each Samza task's local storage.
>> >         - If the data in *topicR* is NOT partitioned in the same way as
>> >         *topicD*, can we configure each individual Samza task to read
>> >         data from all partitions of a topic?
>> >         - If the answer to the above question is no, do I need to
>> >         create *topicR* with the same number of partitions as *topicD*,
>> >         and replicate data to all partitions?
>> >         - On start, how do I make a Samza task hold off processing the
>> >         first message from *topicD* until it has read all data from
>> >         *topicR*?
>> >      3. Any new updates/deletes to *topicR* will be consumed to update
>> >      the local cache of each Samza task.
>> >      4. On failure or restarts, each Samza task will read *topicR* from
>> >      the beginning.
>> >   - Not Use Kafka
>> >      - Each Samza task reads a snapshot of the database and builds its
>> >      local cache, and then needs to re-read periodically to update its
>> >      local cache. I have read a few blogs about this, and it doesn't
>> >      sound like a solid approach in the long term.
>> >
>> >Any thoughts?
>> >
>> >Chen
>> >
>> >
>> >--
>> >Chen Song
>>
>
>
>
>-- 
>Chen Song
