How do I avoid unnecessary reshuffles when using Kafka as input?  My
Kafka keys are roughly userId.  The first few stages do joins that are
usually keyed on (userId, someOtherKeyId).  It makes sense for these
joins to stay on the same machine and avoid an extra shuffle.

What's the best way to avoid unnecessary shuffling when using the Table
SQL interface?  I see PARTITION BY on TABLE, but I'm not sure how to
specify the keys for Kafka.
