[ 
https://issues.apache.org/jira/browse/HIVE-19578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16478651#comment-16478651
 ] 

Prasanth Jayachandran commented on HIVE-19578:
----------------------------------------------

Ran a quick benchmark with the following scenarios:
{code:java}
testHLLAddHive - Current Hive implementation that uses TreeMap for HLL Sparse Register
testHLLAddInt2ByteSortedMap - TreeMap replaced with Int2ByteSortedMap for HLL Sparse Register
testHLLAddInt2ByteSortedMapPlusOpt - Int2ByteSortedMap + branch optimizations in HLL add() inner loop
{code}
{code:java}
Benchmark                                          Mode  Cnt   Score    Error  Units
HyperLogLogAdd.testHLLAddInt2ByteSortedMapPlusOpt  avgt   10  12.773 ±  0.382  ns/op
HyperLogLogAdd.testHLLAddInt2ByteSortedMap         avgt   10  25.675 ±  0.439  ns/op
HyperLogLogAdd.testHLLAddHive                      avgt   10  37.559 ±  0.488  ns/op
{code}
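
For context on what the baseline scenario exercises, here is a minimal sketch of a sparse register kept in a boxed TreeMap. It is not the Hive code: the class name SparseRegisterSketch, the parameter p, and the way the hash is split into a register index (low p bits) and a rank (trailing zeros of the rest) are assumptions made for illustration only. The point is that every add() goes through Integer keys and Byte values, which a primitive-specialized sorted map such as Int2ByteSortedMap presumably avoids.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical sketch for illustration; not Hive's HLL implementation.
final class SparseRegisterSketch {
    private final int p; // number of hash bits used as the register index (assumed)

    // Boxed sorted map: keys and values are Integer/Byte objects, not primitives.
    private final SortedMap<Integer, Byte> registers = new TreeMap<>();

    SparseRegisterSketch(int p) {
        if (p < 1 || p > 30) {
            throw new IllegalArgumentException("p must be in [1, 30]");
        }
        this.p = p;
    }

    /** Records one 64-bit hash. Returns true if a register value changed. */
    boolean add(long hash) {
        int index = (int) (hash & ((1L << p) - 1)); // low p bits select the register
        long rest = hash >>> p;                     // remaining 64 - p bits
        // Rank = 1-based position of the lowest set bit in rest, capped when rest == 0.
        byte rank = (byte) (Math.min(Long.numberOfTrailingZeros(rest), 64 - p) + 1);

        Byte current = registers.get(index);        // boxes index, returns a boxed value
        if (current == null || current < rank) {
            registers.put(index, rank);             // keep only the maximum rank seen
            return true;
        }
        return false;
    }

    /** Returns the stored rank for a register, or 0 if it was never set. */
    byte get(int index) {
        Byte v = registers.get(index);
        return v == null ? 0 : v;
    }

    /** Number of registers that currently hold a value. */
    int size() {
        return registers.size();
    }
}
```

With p = 4, add(0b1_0011L) sets register 3 to rank 1, and a later add(0b100_0011L) raises the same register to rank 3 while the map still holds a single entry.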

> HLL merges tempList on every add
> --------------------------------
>
>                 Key: HIVE-19578
>                 URL: https://issues.apache.org/jira/browse/HIVE-19578
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Prasanth Jayachandran
>            Priority: Major
>         Attachments: Screen Shot 2018-05-16 at 15.29.12 .png
>
>
>  See comments on HIVE-18866; this has significant perf overhead after the 
> even bigger overhead from hashing is removed.
> !Screen Shot 2018-05-16 at 15.29.12 .png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
