[GitHub] spark issue #21156: [SPARK-24087][SQL] Avoid shuffle when join keys are a su...

yucai Tue, 10 Jul 2018 02:33:02 -0700

Github user yucai commented on the issue:

    https://github.com/apache/spark/pull/21156
  
    @cloud-fan For bucketed tables, users typically bucket on the primary 
key, so in this case they will not hit the parallelism or data skew issues, 
and we see a good benefit from avoiding the shuffle.
    Do you mean there is a performance regression in some more general cases?
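
    To make the reasoning concrete, here is a toy sketch in plain Python (not Spark code). It assumes a hypothetical bucket count of 4 and uses a simple modulo as a stand-in for Spark's real bucket hash; the column names `id` and `day` are made up for illustration. It shows that bucketing on a unique key gives even buckets, and that a join on a superset of the bucket key (`id`, `day`) can be evaluated bucket by bucket without moving rows:

```python
NUM_BUCKETS = 4  # hypothetical bucket count, chosen only for illustration


def bucket_of(key, num_buckets=NUM_BUCKETS):
    # Stand-in for the real bucket hash: plain modulo on an integer key.
    return key % num_buckets


def bucketize(rows, bucket_col, num_buckets=NUM_BUCKETS):
    # Place each row in the bucket chosen by its bucket column.
    buckets = [[] for _ in range(num_buckets)]
    for row in rows:
        buckets[bucket_of(row[bucket_col], num_buckets)].append(row)
    return buckets


def bucket_local_join(left_buckets, right_buckets, join_cols):
    # Join bucket i of the left side only with bucket i of the right side.
    # Rows that agree on all join columns also agree on the bucket column
    # (it is one of the join columns), so they already share a bucket.
    pairs = []
    for lb, rb in zip(left_buckets, right_buckets):
        index = {}
        for r in rb:
            index.setdefault(tuple(r[c] for c in join_cols), []).append(r)
        for l in lb:
            for r in index.get(tuple(l[c] for c in join_cols), []):
                pairs.append((l["id"], r["id"]))
    return pairs


def global_join(left, right, join_cols):
    # Reference result: compare every left row with every right row.
    return [
        (l["id"], r["id"])
        for l in left
        for r in right
        if all(l[c] == r[c] for c in join_cols)
    ]


# `id` plays the role of the primary key; `day` is a low-cardinality column.
left = [{"id": i, "day": i % 3} for i in range(100)]
right = [{"id": i, "day": i % 3} for i in range(0, 100, 2)]

join_cols = ("id", "day")  # superset of the bucket key ("id",)

local = bucket_local_join(bucketize(left, "id"), bucketize(right, "id"), join_cols)
reference = global_join(left, right, join_cols)

sizes_by_id = [len(b) for b in bucketize(left, "id")]    # even buckets
sizes_by_day = [len(b) for b in bucketize(left, "day")]  # skewed, one bucket empty
```

    Here `sorted(local) == sorted(reference)`, so the bucket-local join loses no matches. Bucketing on `id` yields four buckets of 25 rows each, while bucketing on `day` leaves one bucket empty, which is the kind of skew and lost parallelism the more general cases could run into.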



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
