[ 
https://issues.apache.org/jira/browse/HIVE-20491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16604008#comment-16604008
 ] 

Ashutosh Chauhan commented on HIVE-20491:
-----------------------------------------

OK. As it currently stands, we always estimate assuming the fast hashtable and 
always use it. 
What we miss out on: if the estimate is high, we turn off BJ altogether 
instead of falling back to the more memory-efficient optimized hashtable. I 
agree we can take this improvement in a follow-up.
+1
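
A minimal sketch of the trade-off described above, not Hive's actual code. The
function name, the flag, and the threshold value are hypothetical; the two
size figures are the runtime measurements from the table in the issue
description, used only as stand-ins for per-implementation estimates.

```python
from typing import Optional


def choose_hashtable(fast_estimate: int,
                     optimized_estimate: int,
                     threshold: int,
                     fallback_to_optimized: bool) -> Optional[str]:
    """Pick a hashtable implementation, or None if BJ is turned off.

    All parameters are illustrative assumptions, not Hive settings.
    """
    if fast_estimate <= threshold:
        return "FAST"
    # Possible follow-up: try the more memory-efficient implementation
    # before giving up on the join conversion.
    if fallback_to_optimized and optimized_estimate <= threshold:
        return "OPTIMIZED"
    return None


FAST_SIZE = 1513584984        # runtime measurement, FAST row
OPTIMIZED_SIZE = 1168439664   # runtime measurement, OPTIMIZED row
THRESHOLD = 1_300_000_000     # hypothetical memory budget

# Behaviour as described in the comment: only the fast estimate is considered.
current = choose_hashtable(FAST_SIZE, OPTIMIZED_SIZE, THRESHOLD, False)
# Behaviour the follow-up could add: fall back to the optimized hashtable.
follow_up = choose_hashtable(FAST_SIZE, OPTIMIZED_SIZE, THRESHOLD, True)
print(current, follow_up)
```

With this hypothetical budget, the first call gives up on BJ while the second
still selects the optimized hashtable.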

> Fix mapjoin size estimations for Fast implementation
> ----------------------------------------------------
>
>                 Key: HIVE-20491
>                 URL: https://issues.apache.org/jira/browse/HIVE-20491
>             Project: Hive
>          Issue Type: Improvement
>          Components: Statistics
>            Reporter: Zoltan Haindrich
>            Assignee: Zoltan Haindrich
>            Priority: Major
>         Attachments: HIVE-20491.01.patch, HIVE-20491.01wip02.patch, 
> HIVE-20491.02.patch
>
>
> HIVE-19824 fixed the estimations, but it calculated them for the "optimized" 
> impl; the "fast" one has a slightly bigger footprint.
> It also seems that fast is somewhat overestimated at runtime; that should 
> also be taken care of.
> | numkeys | implementation | compiler estimation | runtime estimation | runtime measurement | ce / rm | re / rm |
> | 25M | FAST | 1168435456 | 2189433712 | 1513584984 | 0.77 | 1.44 |
> | 25M | OPTIMIZED | 1168435456 | 1191203764 | 1168439664 | 1.00 | 1.01 |
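
The two ratio columns can be recomputed from the raw figures. The sketch below
assumes that "ce / rm" means compiler estimation divided by runtime
measurement and "re / rm" means runtime estimation divided by runtime
measurement; the variable and function names are illustrative only.

```python
# (compiler estimation, runtime estimation, runtime measurement) per row
ROWS = {
    "FAST": (1168435456, 2189433712, 1513584984),
    "OPTIMIZED": (1168435456, 1191203764, 1168439664),
}


def estimate_ratios(compiler_est: int, runtime_est: int, runtime_meas: int):
    """Return (ce / rm, re / rm) for one table row."""
    return compiler_est / runtime_meas, runtime_est / runtime_meas


for impl, figures in ROWS.items():
    ce_rm, re_rm = estimate_ratios(*figures)
    print(f"{impl}: ce/rm={ce_rm:.3f} re/rm={re_rm:.3f}")
```

A ratio below 1 means the estimate is smaller than the measured size, and a
ratio above 1 means it is larger.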



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
