[ https://issues.apache.org/jira/browse/HIVE-8700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195403#comment-14195403 ]
Suhas Satish commented on HIVE-8700:
------------------------------------

Sure [~szehon]. Attaching my changeset as a patch. This compiles; I was testing it at runtime, so I didn't follow the naming convention (e.g. HIVE-8700-spark.patch), as I don't want unit tests triggered just yet.

> Replace ReduceSink to HashTableSink (or equi.) for small tables [Spark Branch]
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-8700
>                 URL: https://issues.apache.org/jira/browse/HIVE-8700
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Xuefu Zhang
>            Assignee: Suhas Satish
>         Attachments: HIVE-8700.patch
>
>
> With HIVE-8616 enabled, the new plan has a ReduceSinkOperator for the small
> tables. For example, the following represents the operator plan for the small
> table dec1, derived from the query
> {code}explain select /*+ MAPJOIN(dec)*/ * from dec join dec1 on dec.value=dec1.d;{code}
> {code}
>         Map 2
>             Map Operator Tree:
>                 TableScan
>                   alias: dec1
>                   Statistics: Num rows: 0 Data size: 107 Basic stats: PARTIAL Column stats: NONE
>                   Filter Operator
>                     predicate: d is not null (type: boolean)
>                     Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
>                     Reduce Output Operator
>                       key expressions: d (type: decimal(5,2))
>                       sort order: +
>                       Map-reduce partition columns: d (type: decimal(5,2))
>                       Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
>                       value expressions: i (type: int)
> {code}
> With the new design for broadcasting small tables, we need to replace the
> ReduceSinkOperator with a HashTableSinkOperator or equivalent in the new plan.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
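
As a rough illustration of the rewrite the issue asks for, the operator plan above can be modeled as a small tree and the terminal ReduceSink swapped for a HashTableSink. This is only a toy sketch in Python, not Hive code and not the attached patch: the `Op` class, the `replace_reduce_sink` function, and the attribute names are all hypothetical, and the choice to keep only the key and value expressions while dropping the shuffle-related attributes is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Op:
    """Hypothetical stand-in for a plan operator (not a Hive class)."""
    kind: str
    attrs: Dict[str, object] = field(default_factory=dict)
    children: List["Op"] = field(default_factory=list)


def replace_reduce_sink(op: Op) -> Op:
    """Return a copy of the tree with each ReduceSink replaced by a HashTableSink."""
    new_children = [replace_reduce_sink(c) for c in op.children]
    if op.kind == "ReduceSink":
        # Assumption: only key/value expressions carry over; sort order and
        # partition columns describe the shuffle and are dropped here.
        kept = {k: v for k, v in op.attrs.items() if k in ("keys", "values")}
        return Op("HashTableSink", kept, new_children)
    return Op(op.kind, dict(op.attrs), new_children)


def kinds(op: Op) -> List[str]:
    """Flatten the tree into a pre-order list of operator kinds."""
    out = [op.kind]
    for child in op.children:
        out.extend(kinds(child))
    return out


# Toy model of the "Map 2" plan for the small table dec1.
small_table_plan = Op("TableScan", {"alias": "dec1"}, [
    Op("Filter", {"predicate": "d is not null"}, [
        Op("ReduceSink", {
            "keys": ["d"],
            "values": ["i"],
            "sort_order": "+",
            "partition_columns": ["d"],
        }),
    ]),
])

rewritten = replace_reduce_sink(small_table_plan)
print(kinds(rewritten))
```

The function builds a new tree rather than mutating the input, so the original plan stays available for comparison while testing.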