[ 
https://issues.apache.org/jira/browse/SPARK-17517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kent Yao updated SPARK-17517:
-----------------------------
     Target Version/s:   (was: 2.1.0)
    External issue ID:   (was: 12795)
        Fix Version/s:     (was: 2.1.0)
          Description: 
For the current `BroadcastHashJoinExec`, when the join key is not unique we 
generate join code like this: 

{code:title=processNext.java|borderStyle=solid}
while (matches.hasNext) {
    matched = matches.next
    check and read stream side row fields
    check and read build side row fields
    reset result row
    write stream side row fields to result row
    write build side row fields to result row
    append(result row)
}
{code}

In some cases we don't need to check/read/write the stream side repeatedly 
inside the while loop, e.g. `Inner Join with BuildRight`, or `BuildLeft && all 
left side fields are fixed length`, and so on. We may generate the code as below:

{code:title=processNext.java|borderStyle=solid}
check and read stream side row fields
reset result row
write stream side row fields to result row
while (matches.hasNext)
{
    matched = matches.next
    check and read build side row fields
    write build side row fields to result row
    append(result row)
}
{code}
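
As a rough illustration of the hoisting described above, here is a minimal, self-contained Java sketch. It is not Spark's generated code: the class `HoistedJoinSketch`, the methods `joinPerMatch` and `joinHoisted`, and the use of plain `long[]` arrays as fixed-length rows are all hypothetical stand-ins. It assumes the stream side occupies the leading slots of the result row (the `BuildRight` layout) and that `append` takes a copy of the row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; names and row layout are assumptions, not Spark APIs.
final class HoistedJoinSketch {

    // Per-match version: the stream side is rewritten for every matched build row.
    static List<long[]> joinPerMatch(long[] streamRow, List<long[]> matches) {
        List<long[]> out = new ArrayList<>();
        for (long[] buildRow : matches) {
            // reset result row
            long[] result = new long[streamRow.length + buildRow.length];
            // write stream side row fields to result row
            System.arraycopy(streamRow, 0, result, 0, streamRow.length);
            // write build side row fields to result row
            System.arraycopy(buildRow, 0, result, streamRow.length, buildRow.length);
            out.add(result);
        }
        return out;
    }

    // Hoisted version: the stream side is written once, before the loop.
    // Assumes every build row has exactly buildWidth fixed-length fields,
    // so the build-side slots can simply be overwritten on each iteration.
    static List<long[]> joinHoisted(long[] streamRow, int buildWidth, List<long[]> matches) {
        long[] result = new long[streamRow.length + buildWidth];
        System.arraycopy(streamRow, 0, result, 0, streamRow.length);
        List<long[]> out = new ArrayList<>();
        for (long[] buildRow : matches) {
            // only the build side changes between matches
            System.arraycopy(buildRow, 0, result, streamRow.length, buildWidth);
            // stand-in for append(result row); copies because the buffer is reused
            out.add(result.clone());
        }
        return out;
    }

    public static void main(String[] args) {
        long[] streamRow = {1L, 2L};
        List<long[]> matches = Arrays.asList(new long[]{10L}, new long[]{20L});
        for (long[] row : joinHoisted(streamRow, 1, matches)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Both methods produce the same rows; the hoisted one just performs the stream-side write once per stream row instead of once per match. The sketch also hints at why the fixed-length condition matters: overwriting slots in place is only safe when the field offsets do not move between matches.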



  was:
For the current `BroadcastHashJoinExec`, when the join key is not unique we 
generate join code like this: 

{code:title=Bar.java|borderStyle=solid}
while (matches.hasNext) {
    matched = matches.next
    check and read stream side row fields
    check and read build side row fields
    reset result row
    write stream side row fields to result row
    write build side row fields to result row
    append(result row)
}
{code}

In some cases we don't need to check/read/write the stream side repeatedly 
inside the while loop, e.g. `Inner Join with BuildRight`, or `BuildLeft && all 
left side fields are fixed length`, and so on. We may generate the code as below:

```java
check and read stream side row fields
reset result row
write stream side row fields to result row
while (matches.hasNext)
{
    matched = matches.next
    check and read build side row fields
    write build side row fields to result row
    append(result row)
}
```




> Improve generated Code for BroadcastHashJoinExec
> ------------------------------------------------
>
>                 Key: SPARK-17517
>                 URL: https://issues.apache.org/jira/browse/SPARK-17517
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Kent Yao
>
> For the current `BroadcastHashJoinExec`, when the join key is not unique we 
> generate join code like this: 
> {code:title=processNext.java|borderStyle=solid}
> while (matches.hasNext) {
>     matched = matches.next
>     check and read stream side row fields
>     check and read build side row fields
>     reset result row
>     write stream side row fields to result row
>     write build side row fields to result row
>     append(result row)
> }
> {code}
> In some cases we don't need to check/read/write the stream side repeatedly 
> inside the while loop, e.g. `Inner Join with BuildRight`, or `BuildLeft && 
> all left side fields are fixed length`, and so on. We may generate the code 
> as below:
> {code:title=processNext.java|borderStyle=solid}
> check and read stream side row fields
> reset result row
> write stream side row fields to result row
> while (matches.hasNext)
> {
>     matched = matches.next
>     check and read build side row fields
>     write build side row fields to result row
>     append(result row)
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
