GitHub user yucai opened a pull request:

    https://github.com/apache/spark/pull/21149

    [SPARK-24076][SQL] Use different seed in HashAggregate to avoid hash conflict

    ## What changes were proposed in this pull request?
    
    HashAggregate uses the same hash algorithm and seed as shuffle, which can
    lead to very bad hash conflicts such as the one reported in [SPARK-24076].
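
A minimal sketch of why sharing the hash and seed can hurt, assuming a toy SHA-256-based seeded hash as a stand-in for the real hash function. The partition count, bucket count, seed values, and helper names below are all hypothetical and chosen only for illustration. After a shuffle, every key in one partition has the same hash value modulo the partition count, so a hash map that reuses that hash can only reach a small fraction of its buckets:

```python
import hashlib

NUM_PARTITIONS = 8   # hypothetical number of shuffle partitions
NUM_BUCKETS = 64     # hypothetical hash map capacity (a multiple of NUM_PARTITIONS)
SHUFFLE_SEED = 42    # illustrative seed used for partitioning
OTHER_SEED = 7       # illustrative alternative seed for the hash map


def seeded_hash(key: int, seed: int) -> int:
    """Toy seeded 32-bit hash built on SHA-256 (a stand-in, not Spark's hash)."""
    digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:4], "big")


def keys_in_partition(partition: int, keys) -> list:
    """Keys that a hash-based shuffle would route to the given partition."""
    return [k for k in keys
            if seeded_hash(k, SHUFFLE_SEED) % NUM_PARTITIONS == partition]


def buckets_used(keys, seed: int) -> int:
    """Number of distinct hash map buckets the keys occupy under the given seed."""
    return len({seeded_hash(k, seed) % NUM_BUCKETS for k in keys})


keys = keys_in_partition(3, range(10_000))

# Same seed: h % 8 is identical for every key here, so h % 64 can take
# at most NUM_BUCKETS // NUM_PARTITIONS distinct values.
same_seed = buckets_used(keys, SHUFFLE_SEED)

# Different seed: the bucket index is unrelated to the partition index.
other_seed = buckets_used(keys, OTHER_SEED)

print(f"{len(keys)} keys: same seed -> {same_seed} buckets, "
      f"different seed -> {other_seed} buckets")
```

With the same seed, the keys are confined to at most `NUM_BUCKETS // NUM_PARTITIONS` buckets, which is the kind of clustering a different seed is meant to avoid.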
    
    ## How was this patch tested?
    
    Unit tests and a production case.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yucai/spark SPARK-24076

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21149.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21149
    
----
commit 5e8846840518648f06e3911ec911f9e92d670134
Author: yucai <yyu1@...>
Date:   2018-04-25T06:58:10Z

    [SPARK-24076][SQL] Use different seed in HashAggregate to avoid hash conflict

----

