Re: Data & Task distribution among the available Nodes

Martijn Visser Thu, 29 Jun 2023 01:17:20 -0700

Hi Mahmoud,

While it's not an answer to your questions, I do want to point out
that the DataSet API is deprecated and will be removed in a future
version of Flink. I would recommend moving to either the Table API or
the DataStream API.


Best regards,

Martijn

On Thu, Jun 22, 2023 at 6:14 PM Mahmoud Awad <mahmoud.a4...@hotmail.com> wrote:
>
> Hello everyone,
>
> I am trying to understand the mechanism by which Flink distributes the data 
> and the tasks among the nodes/task managers in the cluster, assuming all TMs 
> have equal resources. I am using the DataSet API on my own machine.
> I will try to frame the issue with the following questions:
>
> - When we first read the data from the source (text, CSV, etc.), how does 
> Flink ensure a fair distribution of data from the source to the next 
> subtask?
>
> - Are there any criteria by which Flink will prefer one task manager over 
> another (assuming all task managers have equal resources)?
>
> - On what basis does Flink choose to deploy a specific task on a specific 
> task manager?
>
> I hope I was able to explain my point. Thank you in advance.
>
> Best regards
> Mahmoud
>
>
>
> Sent from Mail for Windows
>
>
