Hi Andrés,

Sorry for the late reply.

1. Slots are released when the streaming pipeline ends. In principle, it is
not a problem for a slot to stay allocated even when the job is not
processing any incoming messages, so you are not doing anything wrong. How
many records do you receive per pipeline? (Are they idle for multiple hours?)
There is a way to utilize the slots more efficiently: StateFun
(https://statefun.io/), which will be contributed to Flink soon.
StateFun doesn't have a direct slot-to-pipeline mapping.

2. The memory consumption per slot depends greatly on the kind of operator
running in it. A job using a heap state backend might need a few gigabytes,
while a stateless mapper needs almost no memory. Some time ago, I wrote a
blog post on sizing a Flink cluster:
https://www.ververica.com/blog/how-to-size-your-apache-flink-cluster-general-guidelines
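
To make the two points above concrete, here is a rough back-of-the-envelope
sketch in Python. It assumes Flink's default slot sharing, where a job
occupies as many slots as its highest operator parallelism. The function
names, the pipeline counts, and the per-slot memory figures are made-up
placeholders for illustration, not measurements or recommendations.

```python
import math


def slots_needed(job_parallelisms):
    """Total slots for a set of jobs, assuming default slot sharing:
    each job occupies as many slots as its highest operator parallelism."""
    return sum(job_parallelisms)


def task_managers_needed(total_slots, slots_per_tm):
    """Number of TaskManagers required to offer at least total_slots slots."""
    return math.ceil(total_slots / slots_per_tm)


def tm_memory_mb(slot_memory_mb, overhead_mb):
    """Very rough TaskManager memory estimate: the sum of the per-slot
    needs plus a fixed overhead (both are inputs you have to estimate)."""
    return sum(slot_memory_mb) + overhead_mb


# Hypothetical example: 10 pipelines, each with a maximum parallelism of 2.
total = slots_needed([2] * 10)         # 20 slots
tms = task_managers_needed(total, 4)   # 5 TaskManagers with 4 slots each

# Hypothetical per-slot figures in MB: two slots running stateful operators
# on a heap state backend, two running nearly stateless mappers.
mem = tm_memory_mb([2048, 2048, 128, 128], overhead_mb=512)
print(total, tms, mem)
```

Since an idle streaming job keeps its slots, the slot count grows with the
number of pipelines rather than with the message rate, so the memory estimate
per slot is usually the number worth refining first.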

Best,
Robert


On Fri, Dec 13, 2019 at 5:06 PM Andrés Garagiola <andresgaragi...@gmail.com>
wrote:

> Hi
>
> I'm testing Flink for stream processing. In my use case, there are
> multiple pipelines processing messages from multiple Kafka sources. I have
> some questions regarding jobs and slots.
>
> 1) When I deploy a new job, it takes a job slot in the TM. The job never
> ends (I think because it is a streaming pipeline), so the slot is
> never released. This means the slot stays busy even when no new messages
> are coming from the Kafka topic. Is that OK, or am I doing something wrong?
> Is there a way to utilize the job slots more efficiently?
>
> 2) In my use case, I need good job scalability. Potentially I could have
> many pipelines running in the Flink environment; on the other hand,
> increased latency would not be a serious problem for me. Are there any
> recommendations regarding memory per slot? I saw that the CPU
> recommendation is one core per slot. Given that increased
> latency would not be a big problem, do you see another good reason to
> follow this recommendation?
>
> Thank you
> Regards
>
