Flink Table SQL, Kafka, partitions and unnecessary shuffling

Dan Hill Tue, 15 Sep 2020 20:36:47 -0700

How do I avoid unnecessary reshuffles when using Kafka as input?  My Kafka
records are keyed (roughly) by userId.  The first few stages do joins that
are usually keyed on (userId, someOtherKeyId).  It makes sense for these
joins to stay on the same machine rather than being shuffled across the
network.
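
To illustrate the concern, here is a toy sketch, not Flink or Kafka code.
It assumes a made-up MD5-based partitioner (partition_for), a hypothetical
partition count of 4, and invented record values.  It only shows that
records already partitioned by userId would not need to move if the join
were distributed by userId alone, while hashing the full composite key
sends most of them elsewhere.

```python
import hashlib

NUM_PARTITIONS = 4  # hypothetical partition count / parallelism


def partition_for(key: str) -> int:
    """Toy deterministic partitioner; NOT Kafka's or Flink's real hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS


# Hypothetical records of the form (userId, someOtherKeyId).
records = [(f"user{u}", f"other{o}") for u in range(50) for o in range(4)]


def count_moved(join_key) -> int:
    """Count records whose join partition differs from their source partition."""
    moved = 0
    for user_id, other_id in records:
        source = partition_for(user_id)  # partition chosen by the userId key
        target = partition_for(join_key(user_id, other_id))
        if source != target:
            moved += 1
    return moved


# Join distributed by userId only: every record stays where it is.
moved_user_only = count_moved(lambda u, o: u)
# Join distributed by the composite key: records get reshuffled.
moved_composite = count_moved(lambda u, o: f"{u}|{o}")

print("moved when distributing by userId:", moved_user_only)
print("moved when distributing by (userId, someOtherKeyId):", moved_composite)
```

Note that this only holds when the source and the join use the same
partitioning function.  If they hash the same key differently, records
would still move.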


What's the best way to avoid unnecessary shuffling when using the Table SQL
interface?  I see PARTITIONED BY on CREATE TABLE, but I'm not sure how to
specify the keys for Kafka.
