Hello, I have a question about the performance of the apply() function after a join. Just for your information, I am running a Flink job on a cluster of 9 machines with 48 cores each. I am working on a benchmark comparing Flink, Spark SQL, and Hive.
I tried to optimize the *join function with Hint* for better performance, and I want to improve it as much as possible. Here are my questions:

1) In the job progress log, the apply() after the join seems to take a fairly long time. Do you think I would get better performance if I did not use apply() to format the tuples? I could instead just specify the column numbers rather than calling apply().

2) When using *join with Hint*, such as Huge or Tiny, is there an ideal ratio between the sizes of the two tables? Currently, if one table is 10 times bigger than the other, I use join with Hint. Otherwise, I usually use the general join().

Best,
Phil