Hello,

I am very new to Samza and am doing a PoC now. So far, I really like Samza, especially its tight integration with Kafka and its checkpointing.
1. Dynamically changing the number of containers

I saw the following in the documentation: "The number of tasks is determined automatically from the number of partitions in the input and is fixed, but the number of containers (and the CPU and memory resources associated with them) is specified by the user at run time and can be changed at any time." Is this true? If so, how can I change the number of containers on the fly?

2. How to shut down the job

I will need to shut down the job and restart it. How can I shut down the job? Can I use kill-yarn-job.sh?

3. Timeout for close() of ClosableTask

I guess that if the task implements ClosableTask, the job can be shut down gracefully. How long does Samza wait for the close() method to return? Is there a timeout configuration, or will it block forever?

4. Auto-created topics

We're not running our Kafka clusters with auto.create.topics.enable=true, to prevent any abuse of the broker. We have seen auto-created topics fall into an unstable state before, though we're not sure whether that was a bug in the Kafka broker or just a coincidence. Are there any guidelines for running Samza jobs without auto-created topics?

Thank you.

Best,
Jae
