[ https://issues.apache.org/jira/browse/HIVE-19578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16478651#comment-16478651 ]
Prasanth Jayachandran commented on HIVE-19578:
----------------------------------------------

Ran a quick benchmark with the following scenarios:
{code:java}
testHLLAddHive - Current Hive implementation that uses TreeMap for HLL Sparse Register
testHLLAddInt2ByteSortedMap - TreeMap replaced with Int2ByteSortedMap for HLL Sparse Register
testHLLAddInt2ByteSortedMapPlusOpt - Int2ByteSortedMap + branch optimizations in HLL add() inner loop
{code}
{code:java}
Benchmark                                          Mode  Cnt   Score   Error  Units
HyperLogLogAdd.testHLLAddInt2ByteSortedMapPlusOpt  avgt   10  12.773 ± 0.382  ns/op
HyperLogLogAdd.testHLLAddInt2ByteSortedMap         avgt   10  25.675 ± 0.439  ns/op
HyperLogLogAdd.testHLLAddHive                      avgt   10  37.559 ± 0.488  ns/op
{code}

> HLL merges tempList on every add
> --------------------------------
>
>                 Key: HIVE-19578
>                 URL: https://issues.apache.org/jira/browse/HIVE-19578
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Prasanth Jayachandran
>            Priority: Major
>         Attachments: Screen Shot 2018-05-16 at 15.29.12 .png
>
>
> See comments on HIVE-18866; this has significant perf overhead after the
> even bigger overhead from hashing is removed. !Screen Shot 2018-05-16 at
> 15.29.12 .png!

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)