Re: partition.assignment.strategy for MM2?

2020-09-20 Thread nitin agarwal
If I understand correctly, only one consumer object is created per task,
so how would setting partition.assignment.strategy help here?
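
To illustrate the point about a single consumer, here is a minimal Python
sketch. It is a simplified model of range-style and round-robin-style
assignment, not Kafka's actual assignor code; the function names
`range_assign` and `round_robin_assign` and the member and topic names are
hypothetical. With only one member, both strategies hand that member every
partition, so the choice of strategy cannot change the distribution.

```python
def range_assign(members, topics):
    """Simplified range-style model: per topic, give each member a contiguous chunk."""
    members = sorted(members)
    out = {m: [] for m in members}
    for topic, n_parts in sorted(topics.items()):
        per_member, extra = divmod(n_parts, len(members))
        start = 0
        for i, member in enumerate(members):
            # The first `extra` members each take one additional partition.
            count = per_member + (1 if i < extra else 0)
            out[member].extend((topic, p) for p in range(start, start + count))
            start += count
    return out


def round_robin_assign(members, topics):
    """Simplified round-robin model: deal all partitions out one at a time."""
    members = sorted(members)
    out = {m: [] for m in members}
    all_parts = [(t, p) for t, n in sorted(topics.items()) for p in range(n)]
    for i, tp in enumerate(all_parts):
        out[members[i % len(members)]].append(tp)
    return out


topics = {"a": 3, "b": 3}  # hypothetical topics with 3 partitions each

# One member: the two strategies produce the same assignment.
print(range_assign(["c1"], topics) == round_robin_assign(["c1"], topics))

# Two members: the strategies differ, which is where the setting matters.
print(range_assign(["c1", "c2"], topics))
print(round_robin_assign(["c1", "c2"], topics))
```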

Thanks,
Nitin

On Sun, Sep 20, 2020 at 10:37 PM Ning Zhang wrote:

> Here are my initial thoughts:
>
> (1) MM2 assigns partitions to tasks round-robin at the MM2 application
> level:
>
> https://github.com/apache/kafka/commit/4fe324b5a696d47ae8414cf030ae36b83a2cf163#diff-bf03ad7a93c070fe11fcf88f897e0973
>
> (2) If you want to try out `RoundRobinAssignor` at the consumer level, the
> right config format should be:
> .consumer.partition.assignment.strategy
>
> (3) If the workload is static, increasing the number of MM2 instances will
> definitely decrease the throughput per machine. There are multiple layers
> to optimize: (a) whether the k8s containers running MM2 are
> under-provisioned in terms of network bandwidth, memory, etc., (b) JVM and
> GC, and (c) MM2 consumer and producer configs. I published a basic MM2
> config for a k8s deployment (for my experiments):
> https://github.com/ning2008wisc/minikube-mm2-demo/blob/high_scale/kafka-mm/values.yaml
> and it could be a starting point for tuning.
>
> On 2020/09/16 08:01:19, Iftach Ben-Yosef wrote:
> > Hello,
> >
> > Following our experiments with MM2, we seem to have a hard time
> > recovering from lag created by increased traffic.
> >
> > Since we're running MM2 on Kubernetes, we can easily add more
> > resources/instances to our MM2 cluster, which should in theory increase
> > the cluster's performance and allow us to eliminate accumulated lag
> > faster.
> >
> > In practice, we see that increasing the number of MM2 instances we have
> > does not drastically change the throughput per machine - newly created
> > instances have much lower throughput than instances which existed before
> > the increase.
> >
> > This made me think about the partition assignment strategy used by MM2.
> > If I understand correctly, the default seems to be
> > partition.assignment.strategy = [class
> > org.apache.kafka.clients.consumer.RangeAssignor]
> >
> > I believe changing this to round robin might help. I tried setting it
> > in my /config/connect-mirror-maker.properties file using each of the
> > following (assuming source / dest are A / B respectively):
> >
> > A.consumer.partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor
> > B.consumer.partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor
> > A.partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor
> > B.partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor
> > partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor
> > consumer.partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor
> > None of these seem to work. Am I doing it wrong? Is this setting even
> > applicable to MM2?
> >
> > Thanks a lot,
> > Iftach
> >
> > --
> > The above terms reflect a potential business arrangement, are provided
> > solely as a basis for further discussion, and are not intended to be and
> > do not constitute a legally binding obligation. No legally binding
> > obligations will be created, implied, or inferred until an agreement in
> > final form is executed in writing by all parties involved.
> >
> >
> > This email and any attachments hereto may be confidential or privileged.
> > If you received this communication by mistake, please don't forward it to
> > anyone else, please erase all copies and attachments, and please let me
> > know that it has gone to the wrong person. Thanks.
> >
>


Re: partition.assignment.strategy for MM2?

2020-09-20 Thread Ning Zhang
Here are my initial thoughts:

(1) MM2 assigns partitions to tasks round-robin at the MM2 application level:

https://github.com/apache/kafka/commit/4fe324b5a696d47ae8414cf030ae36b83a2cf163#diff-bf03ad7a93c070fe11fcf88f897e0973

(2) If you want to try out `RoundRobinAssignor` at the consumer level, the right
config format should be: .consumer.partition.assignment.strategy

(3) If the workload is static, increasing the number of MM2 instances will
definitely decrease the throughput per machine. There are multiple layers to
optimize: (a) whether the k8s containers running MM2 are under-provisioned in
terms of network bandwidth, memory, etc., (b) JVM and GC, and (c) MM2 consumer
and producer configs. I published a basic MM2 config for a k8s deployment (for
my experiments):
https://github.com/ning2008wisc/minikube-mm2-demo/blob/high_scale/kafka-mm/values.yaml
and it could be a starting point for tuning.
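
Points (1) and (3) can be sketched together. The following Python is a
minimal illustration, assuming partitions are simply dealt out to tasks in
order and that tasks are spread evenly over instances; the function name
`distribute_round_robin` and the topic name are hypothetical, and this is
not MM2's code. It shows that with a fixed set of partitions, adding tasks
leaves fewer partitions (and so less traffic) per task.

```python
def distribute_round_robin(partitions, num_tasks):
    """Deal partitions out to tasks one at a time, like dealing cards."""
    tasks = [[] for _ in range(num_tasks)]
    for i, partition in enumerate(partitions):
        tasks[i % num_tasks].append(partition)
    return tasks


# Hypothetical static workload: one topic with 12 partitions.
partitions = [("orders", p) for p in range(12)]

for num_tasks in (3, 6):
    sizes = [len(t) for t in distribute_round_robin(partitions, num_tasks)]
    # Doubling the task count halves the partitions handled by each task.
    print(num_tasks, "tasks -> partitions per task:", sizes)
```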

On 2020/09/16 08:01:19, Iftach Ben-Yosef wrote:
> Hello,
> 
> Following our experiments with MM2, we seem to have a hard time
> recovering from lag created by increased traffic.
>
> Since we're running MM2 on Kubernetes, we can easily add more
> resources/instances to our MM2 cluster, which should in theory increase the
> cluster's performance and allow us to eliminate accumulated lag faster.
> 
> In practice, we see that increasing the number of MM2 instances we have
> does not drastically change the throughput per machine - newly created
> instances have much lower throughput than instances which existed before
> the increase.
> 
> This made me think about the partition assignment strategy used by MM2. If
> I understand correctly, the default seems to be
> partition.assignment.strategy = [class
> org.apache.kafka.clients.consumer.RangeAssignor]
> 
> I believe changing this to round robin might help. I tried setting it
> in my /config/connect-mirror-maker.properties file using each of the
> following (assuming source / dest are A / B respectively):
> 
> A.consumer.partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor
> B.consumer.partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor
> A.partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor
> B.partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor
> partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor
> consumer.partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor
> None of these seem to work. Am I doing it wrong? Is this setting even
> applicable to MM2?
> 
> Thanks a lot,
> Iftach
> 


partition.assignment.strategy for MM2?

2020-09-16 Thread Iftach Ben-Yosef
Hello,

Following our experiments with MM2, we seem to have a hard time
recovering from lag created by increased traffic.

Since we're running MM2 on Kubernetes, we can easily add more
resources/instances to our MM2 cluster, which should in theory increase the
cluster's performance and allow us to eliminate accumulated lag faster.
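
As a rough way to reason about how quickly extra capacity clears a backlog,
here is a small Python sketch. It assumes constant rates, and the function
name `drain_time_seconds` and all numbers are hypothetical, not measurements
from this cluster. The backlog shrinks only by the difference between the
replication rate and the incoming rate.

```python
def drain_time_seconds(lag_records, replicate_rate, incoming_rate):
    """Seconds until the lag reaches zero, or None if it never shrinks.

    Rates are in records per second and are assumed to stay constant.
    """
    surplus = replicate_rate - incoming_rate
    if surplus <= 0:
        return None  # replication cannot keep up, so the lag never drains
    return lag_records / surplus


# Hypothetical numbers: a backlog of 1,000,000 records, 10,000 records/s arriving.
print(drain_time_seconds(1_000_000, 15_000, 10_000))
print(drain_time_seconds(1_000_000, 20_000, 10_000))
```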

In practice, we see that increasing the number of MM2 instances we have
does not drastically change the throughput per machine - newly created
instances have much lower throughput than instances which existed before
the increase.

This made me think about the partition assignment strategy used by MM2. If
I understand correctly, the default seems to be
partition.assignment.strategy = [class
org.apache.kafka.clients.consumer.RangeAssignor]

I believe changing this to round robin might help. I tried setting it
in my /config/connect-mirror-maker.properties file using each of the
following (assuming source / dest are A / B respectively):

A.consumer.partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor
B.consumer.partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor
A.partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor
B.partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor
partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor
consumer.partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor
None of these seem to work. Am I doing it wrong? Is this setting even
applicable to MM2?

Thanks a lot,
Iftach
