Hello, I'm trying to get some simple rules or guidelines for what values to set for operator or job parallelism. It would seem to me that it should be a number <= the number of available task slots?
For example, suppose I have 2 task manager machines, each with 4 task slots. Assuming no other jobs are running on the cluster, would I set the parallelism for operations like filter and map to 8? If not, what would be a reasonable number?

What happens if you request more parallelism than there are task slots? In the example above, what happens if I set the parallelism to 12 on the operations? I'm assuming it would just use as many slots as are available?

Also, it would seem that you would not want to hardcode the parallelism into your source code, since you would want to have a rough idea of the available task slots when you submit the job?

Should you set the parallelism of all operators to roughly the same value or to different values, and what would guide that decision?

Thanks!
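
P.S. To make the scenario concrete, here is a minimal sketch of what I mean by not hardcoding the parallelism. It is plain Java with no Flink dependency, so it only shows the arithmetic (2 task managers x 4 slots = 8) and the idea of reading the value at submission time. The class name, the method names, and the `--parallelism` argument are all made up for illustration; in the real job I assume the resulting value would be passed to `env.setParallelism(...)`, or supplied with `flink run -p <n>` instead.

```java
public class ParallelismFromArgs {

    // Total slots in the cluster = task managers x slots per task manager.
    public static int totalSlots(int taskManagers, int slotsPerTaskManager) {
        return taskManagers * slotsPerTaskManager;
    }

    // Look for a hypothetical "--parallelism N" argument; use the default if it is absent.
    public static int parallelismFromArgs(String[] args, int defaultParallelism) {
        for (int i = 0; i < args.length - 1; i++) {
            if ("--parallelism".equals(args[i])) {
                return Integer.parseInt(args[i + 1]);
            }
        }
        return defaultParallelism;
    }

    public static void main(String[] args) {
        // Cluster from the example above: 2 task managers with 4 slots each.
        int slots = totalSlots(2, 4);

        // Default to the slot count, but let the submitter override it.
        int parallelism = parallelismFromArgs(args, slots);

        System.out.println("slots=" + slots + ", requested parallelism=" + parallelism);

        // Assumption: in the actual Flink job this value would be handed to
        // env.setParallelism(parallelism) rather than just printed.
    }
}
```

Running it with `--parallelism 12` reproduces the case I am asking about, where the requested parallelism exceeds the 8 available slots.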