Hello, I have a question about the performance of the apply function after
a join.

Just for your information, I am running a Flink job on a cluster consisting
of 9 machines with 48 cores each. I am working on a benchmark that
compares Flink, Spark SQL, and Hive.

I tried to optimize the *join function with a hint* for better performance.
I want to increase the performance as much as possible.

Here are my questions:
1) In the job progress log, the apply() call after the join function seems
to take a fairly long time. Do you think I would get better performance if
I did not use apply() to format the tuples? I could just specify the
column numbers instead of using apply().

2) When using a *join with a hint* such as Huge or Tiny, is there an ideal
ratio between the sizes of the two tables? Currently, if one table is 10
times bigger than the other, I use a join with a hint. Otherwise, I usually
use the general join().
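
To make the rule from question 2 concrete, here is a minimal sketch of how I
currently decide, written in plain Java without any Flink dependency. The
class name JoinHintRule, the method choose(), and the RATIO_THRESHOLD
constant are only names I made up for this illustration; they are not part
of Flink, and the sizes are assumed to be rough estimates I already have.

```java
public class JoinHintRule {
    // My own rule of thumb (10x), not a Flink setting.
    static final long RATIO_THRESHOLD = 10;

    // Which join call I would pick for first.join*(second).
    enum JoinChoice { JOIN, JOIN_WITH_TINY, JOIN_WITH_HUGE }

    // firstSize and secondSize are estimated sizes (rows or bytes) of the
    // first and second input of the join.
    static JoinChoice choose(long firstSize, long secondSize) {
        if (firstSize <= 0 || secondSize <= 0) {
            return JoinChoice.JOIN; // sizes unknown: fall back to the general join()
        }
        if (firstSize / secondSize >= RATIO_THRESHOLD) {
            return JoinChoice.JOIN_WITH_TINY; // second input is much smaller
        }
        if (secondSize / firstSize >= RATIO_THRESHOLD) {
            return JoinChoice.JOIN_WITH_HUGE; // second input is much larger
        }
        return JoinChoice.JOIN; // sizes are comparable
    }

    public static void main(String[] args) {
        System.out.println(choose(1_000_000L, 50_000L)); // second is 20x smaller
        System.out.println(choose(50_000L, 1_000_000L)); // second is 20x larger
        System.out.println(choose(300L, 100L));          // only 3x apart
    }
}
```

What I would like to know is whether a fixed threshold like this is a
sensible way to choose, or whether the choice should depend on something
else.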

Best,
Phil
