Kamil Wasilewski created BEAM-9673:
--------------------------------------

Summary: Migrate the memory configuration of Flink cluster from Flink <= 1.9 to 1.10
Key: BEAM-9673
URL: https://issues.apache.org/jira/browse/BEAM-9673
Project: Beam
Issue Type: Bug
Components: testing
Reporter: Kamil Wasilewski

Our Google Cloud Dataproc setup [1], which runs a Flink cluster for testing purposes, needs to be reviewed and updated before it can be reused with the latest version of Flink (1.10). There is an official migration guide [2] that can help with updating the configuration.

Here is also a list of ideas I came up with during the initial investigation:

1) The JVM Metaspace size must be increased to prevent OOM errors. This can be done by setting _taskmanager.memory.jvm-metaspace.size_ to "512 mb" (the default value is 256 mb).

2) It appears that the size of _Managed Memory_ is too low for tests involving GBK and coGBK operations [3] [4]. I managed to run those tests successfully by increasing _taskmanager.memory.managed.fraction_ to 0.8 and changing the type of the Dataproc workers to n1-highmem-4. There might be other options, though.

[1] [https://github.com/apache/beam/tree/master/.test-infra/dataproc]
[2] [https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/memory/mem_migration.html]
[3] [https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_LoadTests_GBK_Flink_Python.groovy]
[4] [https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_LoadTests_coGBK_Flink_Python.groovy]
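
For illustration, the two overrides from points 1) and 2) could be written out as flink-conf.yaml entries along the lines of the sketch below. This is only a rough sketch: the helper names (render_flink_conf, managed_memory_mb) are hypothetical and not part of the Dataproc scripts in [1], and the managed memory calculation is simplified to "fraction of total Flink memory", ignoring any explicit size or min/max settings.

```python
# Hypothetical helpers, not taken from the Beam repository.
# They only illustrate the two settings proposed in this issue.

PROPOSED_OVERRIDES = {
    # Point 1): larger JVM Metaspace to avoid OOM errors.
    "taskmanager.memory.jvm-metaspace.size": "512 mb",
    # Point 2): larger share of total Flink memory for Managed Memory.
    "taskmanager.memory.managed.fraction": "0.8",
}


def render_flink_conf(overrides):
    """Return flink-conf.yaml text with one 'key: value' entry per line."""
    return "".join(f"{key}: {value}\n" for key, value in overrides.items())


def managed_memory_mb(total_flink_memory_mb, fraction):
    """Simplified estimate: managed memory = fraction * total Flink memory."""
    return round(total_flink_memory_mb * fraction)


if __name__ == "__main__":
    print(render_flink_conf(PROPOSED_OVERRIDES), end="")
```

The rendered lines would still have to be appended to the cluster's flink-conf.yaml by whatever mechanism the setup in [1] uses. The managed_memory_mb helper is only there to make it easier to reason about how much memory a given fraction yields on a particular worker type.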