[ https://issues.apache.org/jira/browse/SPARK-17517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kent Yao updated SPARK-17517:
-----------------------------
    Target Version/s:   (was: 2.1.0)
    External issue ID:   (was: 12795)
    Fix Version/s:   (was: 2.1.0)
    Description:
For the current `BroadcastHashJoinExec`, when the join key is not unique we generate join code like this:
{code:title=processNext.java|borderStyle=solid}
while (matches.hasNext) {
    matched = matches.next
    check and read stream side row fields
    check and read build side row fields
    reset result row
    write stream side row fields to result row
    write build side row fields to result row
    append(result row)
}
{code}
In some cases we do not need to check/read/write the stream side repeatedly inside the while loop, e.g. `Inner Join with BuildRight`, or `BuildLeft && all left side fields are fixed length`, and so on. For those cases we may generate the code as below:
{code:title=processNext.java|borderStyle=solid}
check and read stream side row fields
reset result row
write stream side row fields to result row
while (matches.hasNext) {
    matched = matches.next
    check and read build side row fields
    write build side row fields to result row
    append(result row)
}
{code}

  was:
For the current `BroadcastHashJoinExec`, when the join key is not unique we generate join code like this:
{code:title=Bar.java|borderStyle=solid}
while (matches.hasNext) {
    matched = matches.next
    check and read stream side row fields
    check and read build side row fields
    reset result row
    write stream side row fields to result row
    write build side row fields to result row
    append(result row)
}
{code}
In some cases we do not need to check/read/write the stream side repeatedly inside the while loop, e.g. `Inner Join with BuildRight`, or `BuildLeft && all left side fields are fixed length`, and so on.
we may generate the code as below:
```java
check and read stream side row fields
reset result row
write stream side row fields to result row
while (matches.hasNext) {
    matched = matches.next
    check and read build side row fields
    write build side row fields to result row
    append(result row)
}
```

> Improve generated Code for BroadcastHashJoinExec
> ------------------------------------------------
>
>                 Key: SPARK-17517
>                 URL: https://issues.apache.org/jira/browse/SPARK-17517
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Kent Yao
>
> For the current `BroadcastHashJoinExec`, when the join key is not unique we
> generate join code like this:
> {code:title=processNext.java|borderStyle=solid}
> while (matches.hasNext) {
>     matched = matches.next
>     check and read stream side row fields
>     check and read build side row fields
>     reset result row
>     write stream side row fields to result row
>     write build side row fields to result row
>     append(result row)
> }
> {code}
> In some cases we do not need to check/read/write the stream side repeatedly
> inside the while loop, e.g. `Inner Join with BuildRight`, or `BuildLeft &&
> all left side fields are fixed length`, and so on. For those cases we may
> generate the code as below:
> {code:title=processNext.java|borderStyle=solid}
> check and read stream side row fields
> reset result row
> write stream side row fields to result row
> while (matches.hasNext) {
>     matched = matches.next
>     check and read build side row fields
>     write build side row fields to result row
>     append(result row)
> }
> {code}
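
The change proposed in this issue amounts to hoisting loop-invariant work: the stream side row is the same for every match, so its fields can be written to the result row once, before the loop. The following is only a minimal Java sketch of that idea, not Spark's generated code. It assumes integer-only rows so that every field sits in a fixed slot, and the class and method names (`HoistedJoinSketch`, `joinPerMatch`, `joinHoisted`, `sameRows`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Spark.
final class HoistedJoinSketch {

    // Baseline shape: the stream side fields are rewritten for every matched build row.
    static List<int[]> joinPerMatch(int[] streamRow, List<int[]> matches) {
        List<int[]> out = new ArrayList<>();
        for (int[] buildRow : matches) {
            // "reset result row"
            int[] result = new int[streamRow.length + buildRow.length];
            // "write stream side row fields to result row"
            System.arraycopy(streamRow, 0, result, 0, streamRow.length);
            // "write build side row fields to result row"
            System.arraycopy(buildRow, 0, result, streamRow.length, buildRow.length);
            out.add(result);
        }
        return out;
    }

    // Proposed shape: the stream side fields are written once, and only the
    // build side slots are overwritten inside the loop. Assumes every build
    // row has exactly buildWidth fixed-width fields.
    static List<int[]> joinHoisted(int[] streamRow, int buildWidth, List<int[]> matches) {
        List<int[]> out = new ArrayList<>();
        int[] result = new int[streamRow.length + buildWidth];
        System.arraycopy(streamRow, 0, result, 0, streamRow.length);
        for (int[] buildRow : matches) {
            System.arraycopy(buildRow, 0, result, streamRow.length, buildWidth);
            // "append" copies the row, because the buffer is reused for the next match.
            out.add(result.clone());
        }
        return out;
    }

    // Element-wise comparison, since List.equals would compare int[] by reference.
    static boolean sameRows(List<int[]> a, List<int[]> b) {
        if (a.size() != b.size()) {
            return false;
        }
        for (int i = 0; i < a.size(); i++) {
            if (!Arrays.equals(a.get(i), b.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] streamRow = {1, 2};
        List<int[]> matches = Arrays.asList(new int[] {10, 11}, new int[] {20, 21});
        List<int[]> perMatch = joinPerMatch(streamRow, matches);
        List<int[]> hoisted = joinHoisted(streamRow, 2, matches);
        System.out.println("rows agree: " + sameRows(perMatch, hoisted));
    }
}
```

The sketch relies on the build side write touching only its own slots, which is why the issue restricts the optimization to cases such as fixed-length fields. With variable-length fields, writing one side could move or invalidate data already written for the other.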