We are finding YARN and AWS EC2 too costly. We keep having to scale the 
cluster to support more jobs, and we plan to write more jobs. We are scaling 
because the cluster doesn’t have enough VCores to support all the containers, 
not enough RAM for the jobs, etc.

Has anyone had luck running Samza jobs on an alternative scheduler, say Nomad, 
Kubernetes, or something else?

Similarly, has anyone had any luck with Samza on something like Kafka Streams, 
where I don’t have the overhead of YARN and a scheduler at all?

Also, for a small-scale shop, what is the minimum number of partitions I can 
get away with? Any advice on determining the appropriate number of partitions? 
Kafka, Zookeeper, and Secor are also costs we could potentially reduce by 
lowering the partition count.


Thanks for any input.



Jeremiah Adams
Software Engineer
www.helixeducation.com<http://www.helixeducation.com/>
Blog<http://www.helixeducation.com/blog/> | 
Twitter<https://twitter.com/HelixEducation> | 
Facebook<https://www.facebook.com/HelixEducation> | 
LinkedIn<http://www.linkedin.com/company/3609946>
