Is this a brand-new job, with the cluster having all 18 TMs at the time of submission? (Or did you add more TMs while the job was running?)

On 3/12/2021 5:47 PM, Aeden Jameson wrote:
Hi Matthias,

Yes, all the task managers have the same hardware/memory configuration.

Aeden

On Fri, Mar 12, 2021 at 3:25 AM Matthias Pohl <matth...@ververica.com> wrote:
Hi Aeden,
just to be sure: all task managers have the same hardware/memory configuration, 
don't they? I'm not 100% sure whether this affects the slot selection in the 
end, but it looks like this parameter also influences the slot matching 
strategy, making it prefer slots on task executors with lower resource 
utilization [1].

I'll add Chesnay to the thread; he might have more insights here. @Chesnay, 
are there any other factors that might affect the slot selection when actually 
trying to evenly spread out the slots?

Matthias

[1] 
https://github.com/apache/flink/blob/c6997c97c575d334679915c328792b8a3067cfb5/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/SlotManagerConfiguration.java#L141
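
For context, the selection that [1] points at looks roughly like this (a 
paraphrased sketch of the Flink 1.12 sources, not verbatim):

    // Sketch: how cluster.evenly-spread-out-slots picks the slot matching
    // strategy inside SlotManagerConfiguration (paraphrased).
    final boolean evenlySpreadOutSlots =
            configuration.getBoolean(ClusterOptions.EVENLY_SPREAD_OUT_SLOTS_STRATEGY);

    final SlotMatchingStrategy slotMatchingStrategy = evenlySpreadOutSlots
            ? LeastUtilizationSlotMatchingStrategy.INSTANCE // prefer TMs with lower slot utilization
            : AnyMatchingSlotMatchingStrategy.INSTANCE;     // take the first fitting slot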

On Fri, Mar 12, 2021 at 12:58 AM Aeden Jameson <aeden.jame...@gmail.com> wrote:
Hi Arvid,

   Thanks for responding. I did check the configuration tab of the job
manager, and the setting cluster.evenly-spread-out-slots: true is
there. However, I'm still observing unevenness in the distribution of
source subtasks. Perhaps this additional information can shed some light.

Version: 1.12.1
Deployment Mode: Application
Deployment Type: Standalone, Docker on Kubernetes using the Lyft
Flink operator (https://github.com/lyft/flinkk8soperator)

I did place the setting under the flinkConfig section,

apiVersion: flink.k8s.io/v1beta1
...
spec:
  flinkConfig:
    cluster.evenly-spread-out-slots: true
    high-availability: zookeeper
    ...
    state.backend: filesystem
    ...
  jobManagerConfig:
    envConfig:
      ...

Would you explain how the setting ends up evenly distributing the active
Kafka consumers? Is it a result of just assigning tasks to TM1, TM2,
TM3, ..., TM18 in order and starting over? In my case I have 36
partitions and 18 nodes, so after the second pass of assignment I would
end up with 2 subtasks of the consumer group on each TM, and
subsequent passes would result in inactive consumers.
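
For reference, my understanding of the partition-to-subtask mapping in the
Kafka connector is roughly the following (a paraphrase of
KafkaTopicPartitionAssigner, so treat it as a sketch; the start index is
derived from the topic name's hash):

    // Paraphrased partition -> source-subtask mapping in the Kafka connector.
    // Note this only decides which subtask reads which partition; which TM a
    // subtask runs on is decided separately by slot placement.
    public static int assign(KafkaTopicPartition partition, int numParallelSubtasks) {
        int startIndex =
                ((partition.getTopic().hashCode() * 31) & 0x7FFFFFFF) % numParallelSubtasks;
        // round-robin from startIndex: consecutive partitions go to consecutive subtasks
        return (startIndex + partition.getPartition()) % numParallelSubtasks;
    }

So with 36 partitions and a source parallelism of 36, each subtask would read
exactly one partition, and any unevenness would have to come from where the
slots, not the partitions, are placed.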


Thank you,
Aeden

On Thu, Mar 11, 2021 at 5:26 AM Arvid Heise <ar...@apache.org> wrote:
Hi Aeden,

the option you mentioned should actually have produced your desired 
behavior. Can you double-check that it's set for the job? (You can look at the 
config in the Flink UI to be 100% sure.)

Another option is to simply give all task managers 2 slots. That way, the 
scheduler has no choice but to distribute the slots evenly.
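
E.g. something like this in the task manager configuration (a sketch; with the 
Lyft operator, set it wherever the TM config is exposed):

    # With 18 TMs x 2 slots = 36 slots and a source parallelism of 36,
    # each slot hosts one source subtask, so every TM ends up with exactly 2.
    taskmanager.numberOfTaskSlots: 2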

On Wed, Mar 10, 2021 at 7:21 PM Aeden Jameson <aeden.jame...@gmail.com> wrote:
     I have a cluster of 18 task managers with 4 task slots each, running a
job whose source/sink(s) are declared with Flink SQL using the Kafka
connector. The topic being read has 36 partitions. The problem I'm
observing is that the source subtasks are not evenly distributed. For
example, one task manager will have 4 active source subtasks while
other TMs have none. Is there a way to force each task manager to have
2 active source subtasks? I tried the setting
cluster.evenly-spread-out-slots: true, but it didn't have the
desired effect.

--
Thank you,
Aeden

