Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-09-03 Thread Yang Wang
Hi Alexis, Thanks for sharing more thoughts about resource configuration. Your suggestions make a lot of sense to me. I believe it could also help others, especially those who are more familiar with K8s and tend to use pod templates as far as possible. I have created a ticket for this

RE: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-09-03 Thread Alexis Sarda-Espinosa
Hi Yang, I understand the issue, and yes, if Flink memory must be specified in the configuration anyway, it’s probably better to leave memory configuration in the templates empty. For the CPU case I still think the template’s requests/limits should have priority if they are specified. The

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-09-03 Thread Yang Wang
Hi Alexis, Thanks for your valuable input. First, I want to share why Flink has to overwrite the resources which are defined in the pod template. You can find the fields that will be overwritten by Flink here[1]. I think the major reason is that Flink needs to ensure the consistency between Flink

RE: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-09-02 Thread Alexis Sarda-Espinosa
Just to provide my opinion, I find the idea of factors unintuitive for this specific case. When I’m working with Kubernetes resources and sizing, I have to think in absolute terms for all pods and define requests and limits with concrete values. Using factors for Flink means that I have to

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-09-02 Thread spoon_lz
Hi Yang, I agree with you, but I think the limit-factor should be greater than or equal to 1, and defaulting to 1 is a better choice. If the default value is 1.5, the memory limit will exceed the actual physical memory of a node, which may result in OOM, machine downtime, or random pod death if
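The overcommit concern raised here can be made concrete with a little arithmetic. The sketch below is only an illustration, not Flink or Kubernetes code: it assumes the scheduler packs pods onto a node by their requests, and the node size, the pod requests, and the function name `total_limit_mb` are all hypothetical.

```python
def total_limit_mb(requests_mb, limit_factor):
    """Sum of memory limits when each limit is request * limit_factor."""
    if limit_factor < 1:
        raise ValueError("limit factor must be >= 1")
    return sum(request * limit_factor for request in requests_mb)


# Hypothetical node with 16 GiB of memory, fully packed by requests.
node_memory_mb = 16384
requests_mb = [4096, 4096, 4096, 4096]

for factor in (1.0, 1.5):
    total = total_limit_mb(requests_mb, factor)
    overcommitted = total > node_memory_mb
    print(f"factor={factor}: limits sum to {total} MB, overcommitted={overcommitted}")
```

With a factor of 1.0 the limits add up to exactly the node's memory; with 1.5 they add up to 24576 MB, so the pods could together claim more memory than the node physically has.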

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-09-01 Thread Yang Wang
Given that the limit-factor should be greater than 1, then using the limit-factor could also work for memory. > Why do we need a larger memory resource limit than request? A typical use case I could imagine is the page cache. Having more page cache might improve the performance. And they could be

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-09-01 Thread spoon_lz
Yes, shrinking the requested memory will result in OOM. We do this because the user-created job provides an initial value (for example, 2 cpus and 4096MB of memory for TaskManager). In most cases, the user will not change this value unless the task fails or there is an exception such as data

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-09-01 Thread Yang Wang
Hi Lz, Thanks for sharing your ideas. I have to admit that I prefer the limit factor to set the resource limit, not the percentage to set the resource request. Because usually the resource request is configured or calculated by Flink, and it indicates user required resources. It has the same
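The two proposals in this exchange differ in which value stays fixed. A minimal sketch of both, with hypothetical function names and example numbers that are not taken from Flink:

```python
def with_limit_factor(configured, factor):
    """Request stays at the configured value; the limit is scaled up."""
    if factor < 1:
        raise ValueError("limit factor must be >= 1")
    return {"request": configured, "limit": configured * factor}


def with_request_fraction(configured, fraction):
    """Limit stays at the configured value; the request is scaled down."""
    if not 0 < fraction <= 1:
        raise ValueError("request fraction must be in (0, 1]")
    return {"request": configured * fraction, "limit": configured}


# Example: a TaskManager configured with 2 CPUs.
print(with_limit_factor(2.0, 1.5))      # request 2.0, limit 3.0
print(with_request_fraction(2.0, 0.5))  # request 1.0, limit 2.0
```

In the first variant the request still reflects what the user asked for, which is the property argued for in this message; in the second, the scheduler reserves less than the configured amount.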

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-09-01 Thread spoon_lz
Hi everyone, I have some other ideas for Kubernetes resource settings, as described by WangYang in [FLINK-15648], which increase the CPU limit by a certain percentage to provide more computational performance for jobs. Should we consider the alternative of shrinking the request to start more

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-09-01 Thread 971066723
Hi everyone, I have some other ideas for Kubernetes resource settings, as described by WangYang in [FLINK-15648], which increase the CPU limit by a certain percentage to provide more computational performance for jobs. Should we

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-09-01 Thread Denis Cosmin NUTIU
Hi Yang, I have limited Flink internals knowledge, but I can try to implement FLINK-15648 and open up a PR on GitHub or send the patch via email. How does that sound? I'll sign the ICLA and switch to my personal address. Sincerely, Denis On Wed, 2021-09-01 at 13:48 +0800, Yang Wang wrote:

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-08-31 Thread Yang Wang
Great. If no one wants to work on this ticket FLINK-15648, I will try to get this done in the next major release cycle (1.15). Best, Yang Denis Cosmin NUTIU wrote on Tue, Aug 31, 2021 at 4:59 PM: > Hi everyone, > > Thanks for getting back to me! > > > I think it would be nice if the task manager pods get

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-08-31 Thread Denis Cosmin NUTIU
Hi everyone, Thanks for getting back to me! > I think it would be nice if the task manager pods get their values from the > configuration file only if the pod templates don’t specify any resources. > That was the goal of supporting pod templates, right? Allowing more custom > scenarios

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-08-30 Thread Yang Wang
Hi all, I think it is a good improvement to support different resource requests and limits. It is especially useful for the CPU resource, since it heavily depends on the upstream workloads. Actually, we (Alibaba) have introduced some internal config options to support this feature. WDYT?

RE: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-08-26 Thread Alexis Sarda-Espinosa
I think it would be nice if the task manager pods get their values from the configuration file only if the pod templates don’t specify any resources. That was the goal of supporting pod templates, right? Allowing more custom scenarios without letting the configuration options get bloated.
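The precedence suggested here (template values win when present, configuration fills the gaps) can be sketched as a simple per-field merge. This illustrates the proposal only; it is not how Flink behaved at the time of this thread, and the function name and dictionary layout are assumptions made for the example.

```python
def resolve_resources(template, from_config):
    """Per field, prefer the pod template value; fall back to the config value."""
    resolved = {}
    for section in ("requests", "limits"):
        merged = dict(from_config.get(section, {}))  # start from config values
        merged.update(template.get(section, {}))     # template overrides them
        resolved[section] = merged
    return resolved


# Hypothetical values: the template sets only CPU, memory comes from the config.
template = {"requests": {"cpu": "500m"}, "limits": {"cpu": "2"}}
from_config = {
    "requests": {"cpu": "1", "memory": "4096Mi"},
    "limits": {"cpu": "1", "memory": "4096Mi"},
}

print(resolve_resources(template, from_config))
```

Here the CPU request and limit come from the template, while memory, which the template leaves unset, is taken from the configuration.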

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-08-26 Thread Denis Cosmin NUTIU
Hi Matthias, Thanks for getting back to me and for your time! We have some Flink jobs deployed on Kubernetes and running kubectl top pod gives the following result: NAME CPU(cores) MEMORY(bytes) aa-78c8cb77d4-zlmpg

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-08-26 Thread Matthias Pohl
Hi Denis, I did a bit of digging: It looks like there is no way to specify them independently. You can find documentation about pod templates for TaskManager and JobManager [1]. But even there it states that for cpu and memory, the resource specs are overwritten by the Flink configuration. The