GitHub user yaooqinn opened a pull request:

    https://github.com/apache/spark/pull/15071

    [WIP][SPARK-17517][SQL]Improve generated Code for BroadcastHashJoinExec

    ## What changes were proposed in this pull request?
    
    For the current `BroadcastHashJoinExec`, when the join key is not unique 
we generate join code like this:
    ```java
    while (matches.hasNext()) {
      matched = matches.next();
      // check and read stream side row fields
      // check and read build side row fields
      // reset result row
      // write stream side row fields to result row
      // write build side row fields to result row
    }
    ```
    In some cases we don't need to check/read/write the stream side repeatedly 
inside the while loop, e.g. `Inner Join with BuildRight`, or `BuildLeft && all 
left side fields are fixed length`, and so on. In those cases we can generate the code as below:
    ```java
    // check and read stream side row fields
    // reset result row
    // write stream side row fields to result row
    while (matches.hasNext()) {
      matched = matches.next();
      // check and read build side row fields
      // write build side row fields to result row
    }
    ```
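    
    To illustrate the idea outside of Spark, here is a minimal, self-contained 
    sketch. It is not the generated code itself: rows are plain `int[]` arrays, 
    the build side is a `HashMap`, and the names `HoistedJoinSketch`, 
    `joinPerMatch`, `joinHoisted` and `streamWrites` are hypothetical. It 
    assumes the stream side is the left side of the result row (the 
    `BuildRight` shape) and that all build rows have the same fixed length.
    ```java
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    class HoistedJoinSketch {
      // Counts how often stream side fields are copied into a result row.
      static int streamWrites = 0;

      // Current shape: stream side fields are rewritten for every match.
      static List<String> joinPerMatch(int[][] stream, Map<Integer, List<int[]>> build) {
        List<String> out = new ArrayList<>();
        for (int[] s : stream) {
          List<int[]> matches = build.get(s[0]);
          if (matches == null || matches.isEmpty()) continue;
          for (int[] matched : matches) {
            int[] result = new int[s.length + matched.length];  // reset result row
            System.arraycopy(s, 0, result, 0, s.length);         // write stream side
            streamWrites++;
            System.arraycopy(matched, 0, result, s.length, matched.length); // write build side
            out.add(Arrays.toString(result));
          }
        }
        return out;
      }

      // Proposed shape: stream side fields are written once per stream row.
      static List<String> joinHoisted(int[][] stream, Map<Integer, List<int[]>> build) {
        List<String> out = new ArrayList<>();
        for (int[] s : stream) {
          List<int[]> matches = build.get(s[0]);
          if (matches == null || matches.isEmpty()) continue;
          // Assumption: every build row has the same fixed length.
          int[] result = new int[s.length + matches.get(0).length]; // reset result row
          System.arraycopy(s, 0, result, 0, s.length);               // write stream side once
          streamWrites++;
          for (int[] matched : matches) {
            // Only the build side slots are overwritten inside the loop.
            System.arraycopy(matched, 0, result, s.length, matched.length);
            out.add(Arrays.toString(result));
          }
        }
        return out;
      }

      public static void main(String[] args) {
        int[][] stream = {{1, 10}, {2, 20}};
        Map<Integer, List<int[]>> build = new HashMap<>();
        build.put(1, Arrays.asList(new int[]{1, 100}, new int[]{1, 101}, new int[]{1, 102}));

        streamWrites = 0;
        List<String> a = joinPerMatch(stream, build);
        System.out.println(a + " stream writes: " + streamWrites);

        streamWrites = 0;
        List<String> b = joinHoisted(stream, build);
        System.out.println(b + " stream writes: " + streamWrites);
      }
    }
    ```
    Both methods produce the same joined rows; the hoisted variant only 
    reduces how often the stream side fields are copied when a key has 
    several matches.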
    
    
    ## How was this patch tested?
    
    todo
    
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yaooqinn/spark bhj-codegen

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/15071.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #15071
    
----
commit bb731f2bc318ddad03e6543bf45cd5ff7e775206
Author: Kent Yao <yaooq...@hotmail.com>
Date:   2016-09-13T03:14:46Z

    init commit

----


