Hello everyone, I am trying to understand how Flink distributes data and tasks among the nodes/task managers in a cluster, assuming all TMs have equal resources. I am using the DataSet API on my own machine. I will try to address the issue with the following questions:
- When we first read data from a source (text, CSV, etc.), how does Flink ensure that the data is distributed fairly from the source to the next subtask?
- Is there any criterion by which Flink prefers one task manager over another (assuming all task managers have equal resources)?
- On what basis does Flink choose to deploy a specific task on a specific task manager?

I hope I was able to explain my point. Thank you in advance.

Best regards,
Mahmoud