Hi Mahmoud,

While it's not an answer to your questions, I do want to point out that the DataSet API is deprecated and will be removed in a future version of Flink. I would recommend moving to either the Table API or the DataStream API.

Best regards,
Martijn

On Thu, Jun 22, 2023 at 6:14 PM Mahmoud Awad <mahmoud.a4...@hotmail.com> wrote:
>
> Hello everyone,
>
> I am trying to understand the mechanism by which Flink distributes the data
> and the tasks among the nodes/task managers in the cluster, assuming all TMs
> have equal resources. I am using the DataSet API on my own machine.
> I will try to address the issue with the following questions:
>
> - When we first read the data from the source (text, CSV, etc.), how does
> Flink ensure a fair distribution of data from the source to the next
> subtask?
>
> - Are there any preferences by which Flink will prefer one task manager over
> another (assuming all task managers have equal resources)?
>
> - On what basis does Flink choose to deploy a specific task on a specific
> task manager?
>
> I hope I was able to explain my point. Thank you in advance.
>
> Best regards
> Mahmoud