Re: One TaskManager per node or multiple TaskManager per node

Jamie Grier Mon, 14 Jan 2019 11:39:58 -0800

There are a lot of different ways to deploy Flink.  It would be easier to
answer your question with a little more context about your use case but in
general I would advocate the following:

1) Don't run a "permanent" Flink cluster and then submit jobs to it.
Instead, run an "ephemeral" cluster per job if possible.  This keeps jobs
completely isolated from each other, which helps a lot with understanding
performance, debugging, looking at logs, etc.
2) Given that you can do #1 and you are running on bare metal (as opposed
to in containers) then run one TM per physical machine.

There are many ways to accomplish the above depending on your deployment
infrastructure (YARN, K8S, bare metal, VMs, etc.), so it's hard to give
detailed input, but in general you'll have the best luck if you don't run
multiple jobs in the same TM/JVM.

In terms of the TM memory usage, you can set that up by configuring it in
the flink-conf.yaml file.  The config key you are looking for is
taskmanager.heap.size:
https://ci.apache.org/projects/flink/flink-docs-release-1.7/ops/config.html#taskmanager-heap-size
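
As a rough sketch of what that could look like in flink-conf.yaml for a
180 GB node running a single TM: the heap size and slot count below are
illustrative assumptions, not recommendations.  You would want to leave
headroom for network buffers, any off-heap usage, and the OS.

```yaml
# flink-conf.yaml (Flink 1.7-style keys)

# Hypothetical value: 140 GiB (140 * 1024 MiB) of JVM heap for the one TM
# on a 180 GB machine. Tune this for your own workload.
taskmanager.heap.size: 143360m

# Hypothetical value: number of slots this TM offers. A common starting
# point is the number of CPU cores on the machine.
taskmanager.numberOfTaskSlots: 16
```

The TM has to be restarted to pick up a change to this file.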

On Mon, Jan 14, 2019 at 8:05 AM Ethan Li <ethanopensou...@gmail.com> wrote:

> Hello,
>
> I am setting up a standalone flink cluster and I am wondering what’s the
> best way to distribute TaskManagers.  Do we usually launch one TaskManager
> (with many slots) per node or multiple TaskManagers per node (with smaller
> number of slots per tm) ?  Also with one TaskManager per node, I am seeing
> that TM launches with only 30GB JVM heap by default while the node has 180
> GB. Why is it not launching with more memory since there is a lot
> available?
>
> Thank you very much!
>
> - Ethan
