Hey Roger,

> Do I need to manually create the KV store changelog topic?

Yes, unfortunately you do need to create it manually at the moment.

> I saw this ticket (https://issues.apache.org/jira/browse/SAMZA-226) but
>it looks like it's still open.

Yep, that's the ticket to fix the above issue. :) It is indeed still open.

> Do checkpoint topics get created?

Yes.

> Are a job's tasks assigned to partitions of a shared checkpoint topic or do
>they each get their own checkpoint topic?

In 0.7.0, each task got its own partition. In 0.8.0 (post-SAMZA-123), the
checkpoint topic has a single partition, and all tasks in one job share
this partition. Note that each job still has its own checkpoint topic.

The SAMZA-123 JIRA has a design doc that Jakob wrote, which describes how
the checkpoint topic works. For the legacy 0.7.0 checkpoint topics, you can
find docs here:

http://samza.incubator.apache.org/learn/documentation/0.7.0/container/checkpointing.html

> Should I proceed with this version or would it make life easier to use
>trunk or something closer to 0.8.0?

I would recommend using 0.8.0 (master). We've not yet released it, but
that's mostly because we're waiting on SAMZA-236. We've been running 0.8.0
at LinkedIn for several large jobs (600k-800k msgs/sec), and it's been
pretty solid. It also has a ton of performance improvements, a new UI, etc.

> Anything else I need to watch out for?

If you're already running with 0.7.0, you'll either need to abandon your
checkpoints, or wait for SAMZA-354. The 0.8.0 checkpoint topic changes were
backwards incompatible, so we are adding an auto-migration feature, which
hasn't been written yet (though it's being worked on right now).

Cheers,
Chris

On 10/14/14 1:04 PM, "Roger Hoover" <[email protected]> wrote:

>Hi all,
>
>I want to deploy a Samza job in a pre-production environment and need to
>figure out how to handle configuration of the various topics. In
>particular, I want to make sure topics like the KV store changelog are
>configured to be compacted so that data isn't lost over time.
>
>Do I need to manually create the KV store changelog topic? I saw this
>ticket (https://issues.apache.org/jira/browse/SAMZA-226) but it looks like
>it's still open.
>
>Do checkpoint topics get created? If not, what does the
>"task.checkpoint.replication.factor" configuration do? Are a job's tasks
>assigned to partitions of a shared checkpoint topic or do they each get
>their own checkpoint topic?
>
>So far I've developed my proof of concept job with 0.7.0. Should I proceed
>with this version or would it make life easier to use trunk or something
>closer to 0.8.0?
>
>Anything else I need to watch out for?
>
>Thanks,
>
>Roger
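
Since the changelog topic has to be created by hand for now, here is a minimal sketch of what that step could look like. It only builds the command line for Kafka's kafka-topics.sh and does not run it. The flags are assumed from the Kafka 0.8.1-era tool (newer Kafka releases take --bootstrap-server instead of --zookeeper). The topic name, ZooKeeper address, and counts are hypothetical placeholders, and sizing the changelog with one partition per task is an assumption to check against your own job.

```python
def changelog_topic_command(topic, partitions, replication_factor,
                            zookeeper="localhost:2181"):
    """Build the argv for creating a log-compacted Kafka topic.

    Assumes the Kafka 0.8.1-era kafka-topics.sh flags. Nothing is executed;
    the caller decides whether to run or just print the command.
    """
    if partitions < 1 or replication_factor < 1:
        raise ValueError("partitions and replication_factor must be >= 1")
    return [
        "kafka-topics.sh", "--create",
        "--zookeeper", zookeeper,
        "--topic", topic,
        "--partitions", str(partitions),
        "--replication-factor", str(replication_factor),
        # Compaction keeps the latest value per key instead of expiring
        # old segments, so KV store state is not lost over time.
        "--config", "cleanup.policy=compact",
    ]


if __name__ == "__main__":
    # Hypothetical values: a store changelog for a job with 8 tasks.
    cmd = changelog_topic_command("my-store-changelog", 8, 3)
    print(" ".join(cmd))
```

Printing the command rather than running it keeps the sketch safe to try anywhere; you can paste the output into a shell on a box that has the Kafka tools installed.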
