Hi, Susan,
Welcome to Samza!
First I will try to answer your question about partition assignment in
Samza. The assignment from stream partition to Samza tasks is determined by
the SystemStreamPartitionGrouper. The default implementation includes two
assignment methods: 1 task per system stream
Hey Susan-
That volume of topics (or partitions) would be a significant burden
on both the Kafka cluster and underlying YARN cluster (for the Samza
job). A 'large number of partitions' even at places with huge Kafka
clusters is on the order of 512 or so. It sounds like you're trying
to use
Hey Susan,
As far as I know, there are very minimal differences
between the Partition and Topic strategies in terms of performance. In terms of
how they are allocated in memory, they should be very similar, but I'll
get some Kafka experts to comment on that.
From Samza's perspective,
Hi there, I'm new to Samza/Kafka and we're evaluating Samza to see whether
it would be a good fit for our application. I just had a few questions
about how partitioning works.
I understand there is a limitation on the number of topics we can create
[1], and I was wondering, if we need more than,