Kamil Wasilewski created BEAM-9673:
--------------------------------------

             Summary: Migrate the memory configuration of Flink cluster from 
Flink <= 1.9 to 1.10
                 Key: BEAM-9673
                 URL: https://issues.apache.org/jira/browse/BEAM-9673
             Project: Beam
          Issue Type: Bug
          Components: testing
            Reporter: Kamil Wasilewski


Our Google Cloud Dataproc setup [1], which runs a Flink cluster for testing 
purposes, needs to be reviewed and updated before it can be reused with the 
latest version of Flink (1.10).

There is an official migration guide [2] which can help to update the 
configuration.

Here is also a list of ideas I came up with during the initial investigation:

1) JVM Metaspace Size must be increased to prevent OOM errors. This can be done 
by setting _taskmanager.memory.jvm-metaspace.size_ to "512 mb" (default value 
is 256 mb).

2) It appears that the size of _Managed Memory_ is too low for tests involving 
GBK and coGBK operations [3] [4]. I managed to run those tests successfully by 
increasing _taskmanager.memory.managed.fraction_ to 0.8 (the Flink 1.10 default 
is 0.4) and changing the type of Dataproc workers to n1-highmem-4. There may be 
other options, though.
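
As a rough sketch, the two settings above could be written in flink-conf.yaml 
as shown below. This assumes the values are applied through the cluster's 
flink-conf.yaml; how the Dataproc init scripts in [1] would pass them is not 
shown here, and the values are only the ones that worked in my initial tests, 
not tuned recommendations.

```yaml
# Sketch only: candidate Flink 1.10 TaskManager memory settings.
# Assumption: these are placed in the cluster's flink-conf.yaml.

# Idea 1: raise the JVM Metaspace size to avoid Metaspace OOM errors.
taskmanager.memory.jvm-metaspace.size: 512m

# Idea 2: give Managed Memory a larger share of Total Flink Memory
# for the GBK and coGBK load tests (Flink 1.10 default: 0.4).
taskmanager.memory.managed.fraction: 0.8
```

Note that a larger managed fraction leaves less memory for the JVM heap, 
which is one reason the worker type was also changed to n1-highmem-4.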

[1] https://github.com/apache/beam/tree/master/.test-infra/dataproc

[2] https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/memory/mem_migration.html

[3] https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_LoadTests_GBK_Flink_Python.groovy

[4] https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_LoadTests_coGBK_Flink_Python.groovy

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
