Hi,

Do I understand correctly that:
1. The workload varies across the jobs but stays the same for the same job
2. With a small number of slots per TM you are concerned about uneven
resource utilization when running low- and high-intensive jobs on the
same cluster simultaneously?

If so, wouldn't reducing the parallelism of the low-intensive jobs help?
Other options to consider are putting subtasks of the high-intensive jobs
into different slot-sharing groups, or breaking operator chains
explicitly [1].
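
To make the trade-off concrete, here is a rough sketch of the slot
arithmetic in plain Java (no Flink dependency). It relies on the rule that a
job needs as many slots as the highest parallelism within each slot-sharing
group, summed over its groups, and that slots are not shared between jobs.
The operator names, parallelism values, the "heavy" group name and the
one-TaskManager-per-VM layout are all made up for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SlotDemand {
    // Hypothetical description of one operator in a job.
    static final class Op {
        final String name;
        final int parallelism;
        final String slotSharingGroup;

        Op(String name, int parallelism, String slotSharingGroup) {
            this.name = name;
            this.parallelism = parallelism;
            this.slotSharingGroup = slotSharingGroup;
        }
    }

    // Slots one job needs: highest parallelism per slot-sharing group,
    // summed over all groups used by the job.
    static int slotsForJob(List<Op> ops) {
        Map<String, Integer> maxPerGroup = new HashMap<>();
        for (Op op : ops) {
            maxPerGroup.merge(op.slotSharingGroup, op.parallelism, Math::max);
        }
        int total = 0;
        for (int p : maxPerGroup.values()) {
            total += p;
        }
        return total;
    }

    // Slots each TaskManager must offer so that the cluster covers the demand
    // (ceiling division).
    static int slotsPerTaskManager(int totalSlots, int taskManagers) {
        return (totalSlots + taskManagers - 1) / taskManagers;
    }

    public static void main(String[] args) {
        // Low-traffic job, everything in the default group.
        List<Op> lowJob = List.of(
                new Op("source", 2, "default"),
                new Op("map", 2, "default"),
                new Op("sink", 2, "default"));

        // High-traffic job with its expensive operator isolated in its own group.
        List<Op> heavyJob = List.of(
                new Op("source", 4, "default"),
                new Op("window", 8, "heavy"),
                new Op("sink", 4, "default"));

        int total = slotsForJob(lowJob) + slotsForJob(heavyJob);
        // Assumption: one TaskManager per VM, 7 VMs as in the original question.
        System.out.println("low job slots:   " + slotsForJob(lowJob));
        System.out.println("heavy job slots: " + slotsForJob(heavyJob));
        System.out.println("slots per TM:    " + slotsPerTaskManager(total, 7));
    }
}
```

Moving the expensive operator into its own group raises the slot demand of
that job, but its subtasks then no longer share a slot with the rest of the
pipeline. Lowering the parallelism of the low-traffic job shrinks the total
in the other direction.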

There are also a number of improvements coming in the 1.13 release: [2][3][4].

I'm pulling in Till and Robert, who know this area better.

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/#task-chaining-and-resource-groups
[2] https://issues.apache.org/jira/browse/FLINK-21267
[3] https://issues.apache.org/jira/browse/FLINK-10404
[4] https://issues.apache.org/jira/browse/FLINK-14187

Regards,
Roman

On Fri, Mar 12, 2021 at 5:03 AM Sush Bankapura
<sushrutha.bankap...@man-es.com> wrote:
>
> Hi,
>
> We have multiple jobs that need to be deployed to a Flink cluster.
> Parallelism varies between jobs and depends on the type of work being done,
> and so do the memory requirements. All jobs currently use the same state
> backend. Since the workload handled by each job is different, the scaling
> pattern also varies. We run all our jobs in a single Flink cluster (7 VMs
> with the same instance configuration).
>
> Most of what I have read in the Flink documentation suggests one of the
> following for setting the number of task slots:
>
> 1. As a rule of thumb, a good default number of task slots will be the number
> of CPU cores. With hyper-threading, each slot then takes 2 or more hardware
> thread contexts. If you are doing any blocking I/O operations in a Flink job,
> it is suggested to have more slots than cores.
>
> 2. A Flink cluster needs exactly as many task slots as the highest 
> parallelism used in the job. No need to calculate how many tasks (with 
> varying parallelism) a program contains in total.
>
> I did not find documentation on the task slot setting for the scenario I
> have described. Setting a lower value for the task slots seems to work
> better for jobs that process high amounts of traffic than for jobs that
> process lower amounts, but it will be inefficient if the slots are assigned
> to jobs that work on lower volumes of traffic.
>
> Depending on the workload handled by each Flink job, it seems that we would
> need to set up as many clusters.
>
> 1. Is this the only option available?
> 2. Are there any guidelines on deciding on the number of task slots in such 
> an environment?
>
> Thanks,
> Sushruth
