Go to the web UI and verify that all 17 TaskManagers, with 136 task slots in total, are visible from the machine you are submitting the job from. I have encountered cases where not all TaskManagers start, or where the 17 nodes are not configured as a single cluster and instead come up as 17 separate clusters of 8 slots each.
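If you would rather script the check than read it off the web UI, here is a minimal sketch. It assumes the JobManager's web endpoint serves a JSON summary at `/overview` with `taskmanagers` and `slots-total` fields, which is my understanding of the monitoring REST API; please verify against your Flink version. The function names and the host below are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_overview(base_url):
    """Fetch the cluster summary JSON from the JobManager web endpoint.

    Assumption: the summary is served at <base_url>/overview.
    """
    with urlopen(base_url.rstrip("/") + "/overview", timeout=10) as resp:
        return json.load(resp)


def check_cluster(overview, expected_tms=17, slots_per_tm=8):
    """Compare reported counts with the expected cluster size.

    Returns a list of problem descriptions; an empty list means the
    TaskManager count and total slot count both match.
    """
    expected_slots = expected_tms * slots_per_tm
    problems = []
    # Assumed field names: "taskmanagers" and "slots-total".
    tms = overview.get("taskmanagers")
    slots = overview.get("slots-total")
    if tms != expected_tms:
        problems.append(f"expected {expected_tms} TaskManagers, found {tms}")
    if slots != expected_slots:
        problems.append(f"expected {expected_slots} slots, found {slots}")
    return problems
```

Usage would look like `check_cluster(fetch_overview("http://jobmanager-host:8081"))`, with the host and port replaced by your own. If each node came up as its own cluster, the call would report 1 TaskManager and 8 slots, which would explain why `-p 136` is rejected while `-p 8` works.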
Michael

> On Apr 26, 2018, at 10:48 AM, m@xi <makisnt...@gmail.com> wrote:
>
> Hello Flinkers,
>
> I have deployed Flink in a cluster of 17 nodes, each having 8 CPUs. Thus, in
> total there are 136 CPUs available.
>
> I have set the parameter taskmanager.numberOfTaskSlots = 8 in all machines,
> since they have 8 CPUs.
>
> And when I am going to run ./flink run -c classpath jarFile -p 136 and I get
> error.
>
> I can only put it maximum 8 which is reasonable from one point. But here [1]
> it says the following :
>
> parallelism.default: The default parallelism to use for programs that have
> no parallelism specified. (DEFAULT: 1). For setups that have no concurrent
> jobs running, setting this value to NumTaskManagers * NumSlotsPerTaskManager
> will cause the system to use all available execution resources for the
> program’s execution. Note: The default parallelism can be overwritten for an
> entire job by calling setParallelism(int parallelism) on the
> ExecutionEnvironment or by passing -p <parallelism> to the Flink
> Command-line frontend. It can be overwritten for single transformations by
> calling setParallelism(int parallelism) on an operator. See Parallel
> Execution for more information about parallelism.
>
> So...specially the part : setting this value to NumTaskManagers *
> NumSlotsPerTaskManager will cause the system to use all available execution
> resources for the program’s execution.
>
> So, for me NumTaskManagers * NumSlotsPerTaskManager = 17 * 8 = 136. Right?
> Any idea why this does not work?
>
> Best,
> Max
>
> [1] --
> https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/config.html
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/