GitHub user saucam opened a pull request:

    https://github.com/apache/spark/pull/9858

    SPARK-11878: Eliminate distribute by in case group by is present with
    exactly the same grouping expressions

    For queries like:
        select <> from table group by a distribute by a
    the distribute by can be eliminated, since the group by already
    hash-partitions the data on the same expressions.
    This also applies when the user goes through the DataFrame API and the
    number of partitions in RepartitionByExpression is not specified (None).
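
    The idea can be illustrated with a toy plan tree. This is only a sketch
    under stated assumptions, not the code in this pull request: the classes
    below are hypothetical Python stand-ins for logical plan nodes, the
    function name eliminate_distribute_by is made up for illustration, and
    "exactly the same grouping expressions" is modeled as plain tuple equality.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


# Hypothetical stand-ins for logical plan nodes; not Spark's real classes.
@dataclass(frozen=True)
class Relation:
    name: str


@dataclass(frozen=True)
class Aggregate:
    grouping: Tuple[str, ...]
    child: object


@dataclass(frozen=True)
class RepartitionByExpression:
    exprs: Tuple[str, ...]
    child: object
    num_partitions: Optional[int] = None  # None models "not specified"


def eliminate_distribute_by(plan):
    """Drop a repartition sitting directly on an aggregate over the same expressions."""
    if isinstance(plan, RepartitionByExpression):
        child = eliminate_distribute_by(plan.child)
        if (
            isinstance(child, Aggregate)
            and plan.num_partitions is None      # user did not ask for a partition count
            and plan.exprs == child.grouping     # assumption: exact, ordered match
        ):
            return child
        return RepartitionByExpression(plan.exprs, child, plan.num_partitions)
    if isinstance(plan, Aggregate):
        return Aggregate(plan.grouping, eliminate_distribute_by(plan.child))
    return plan


table = Relation("table")
# group by a distribute by a: the repartition is redundant
redundant = RepartitionByExpression(("a",), Aggregate(("a",), table))
# an explicit partition count is a user request, so the node is kept
explicit = RepartitionByExpression(("a",), Aggregate(("a",), table), num_partitions=8)
# different expressions: the repartition changes the layout, so it is kept
different = RepartitionByExpression(("b",), Aggregate(("a",), table))
```

    In this sketch only the first plan is rewritten to the bare aggregate;
    the other two are returned unchanged, because either the user asked for
    a specific number of partitions or the expressions differ.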

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/saucam/spark eliminatedistribute

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9858.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9858
    
----
commit a86feca6e2b9aaba9babed8854a39c97b59f34cd
Author: Yash Datta <yash.da...@guavus.com>
Date:   2015-11-20T07:43:47Z

    SPARK-11878: Eliminate distribute by in case group by is present with
    exactly the same grouping expressions

----


