[jira] [Commented] (HIVE-20892) Benchmark XXhash for 64 bit hashing function instead of Murmum hash
[ https://issues.apache.org/jira/browse/HIVE-20892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16680595#comment-16680595 ]

slim bouguerra commented on HIVE-20892:
---

Forgot to add that we need to look at 32-bit hashes, since that is what Hive uses for joins and grouping.

> Benchmark XXhash for 64 bit hashing function instead of Murmum hash
> ---
>
> Key: HIVE-20892
> URL: https://issues.apache.org/jira/browse/HIVE-20892
> Project: Hive
> Issue Type: Sub-task
> Reporter: slim bouguerra
> Assignee: slim bouguerra
> Priority: Major
>
> https://cyan4973.github.io/xxHash/
> FYI this is used by lot of other MPP systems ...

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20892) Benchmark XXhash for 64 bit hashing function instead of Murmum hash
[ https://issues.apache.org/jira/browse/HIVE-20892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16680592#comment-16680592 ]

slim bouguerra commented on HIVE-20892:
---

[~prasanth_j] thanks, that is what I was planning to do. It seems you have done most of the work; it may be worth re-running it with newer JVMs and on something other than a laptop. I am also curious about the impact on the distribution over actual data such as TPC-H.
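A minimal sketch of the kind of distribution check mentioned in the comment above, under stated assumptions: `murmur3_32` is a plain Python transcription of the public MurmurHash3 x86_32 algorithm, not Hive's own implementation, and `bucket_skew`, the bucket count, and the synthetic keys are hypothetical names and values chosen only for illustration. A real comparison would presumably run on the JVM against actual TPC-H columns and include an xxHash implementation alongside.

```python
from collections import Counter

MASK32 = 0xFFFFFFFF


def rotl32(x: int, r: int) -> int:
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """MurmurHash3 x86_32 over a byte string (illustrative transcription)."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK32
    n = len(data)
    nblocks = n - (n % 4)

    def mix_k(k: int) -> int:
        k = (k * c1) & MASK32
        k = rotl32(k, 15)
        return (k * c2) & MASK32

    # Body: process 4-byte little-endian blocks.
    for i in range(0, nblocks, 4):
        h ^= mix_k(int.from_bytes(data[i:i + 4], "little"))
        h = rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & MASK32

    # Tail: the remaining 1 to 3 bytes, if any.
    tail = data[nblocks:]
    if tail:
        h ^= mix_k(int.from_bytes(tail, "little"))

    # Finalization (fmix32): force all bits to avalanche.
    h ^= n
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def bucket_skew(keys, num_buckets: int) -> float:
    """Ratio of the fullest bucket to the ideal even share (1.0 is perfect)."""
    keys = list(keys)
    counts = Counter(murmur3_32(k) % num_buckets for k in keys)
    return max(counts.values()) / (len(keys) / num_buckets)


if __name__ == "__main__":
    # Hypothetical stand-in for a join key column: sequential integer ids.
    sample = [str(i).encode() for i in range(100_000)]
    print("skew over 64 buckets:", bucket_skew(sample, 64))
```

A skew close to 1.0 means the 32-bit hash spreads the keys evenly across buckets. Swapping in another hash function with the same `bytes -> int` shape would allow a side-by-side comparison on the same keys.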
[jira] [Commented] (HIVE-20892) Benchmark XXhash for 64 bit hashing function instead of Murmum hash
[ https://issues.apache.org/jira/browse/HIVE-20892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16680441#comment-16680441 ]

Prasanth Jayachandran commented on HIVE-20892:
---

[https://github.com/prasanthj/hasher]

Murmur2 performs slightly better than Murmur3, but for the reason given at [https://github.com/apache/hive/blob/master/storage-api/src/java/org/apache/hive/common/util/BloomFilter.java#L37-L40], Murmur3 was chosen for the Bloom filter and HLL in Hive.