[ https://issues.apache.org/jira/browse/IMPALA-7635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062983#comment-17062983 ]
David Rorke commented on IMPALA-7635: ------------------------------------- One simple first step to help assess the memory wasted by the current power-of-two bucket allocation would be to add a profile counter for the number of filled buckets. If we find that we're frequently allocating much more than needed to maintain the MAX_FILL_FACTOR, there's more incentive to consider more incremental allocation. > Reduce size of hash tables in-memory by packing buckets more densely > -------------------------------------------------------------------- > > Key: IMPALA-7635 > URL: https://issues.apache.org/jira/browse/IMPALA-7635 > Project: IMPALA > Issue Type: Improvement > Components: Backend > Affects Versions: Impala 3.1.0 > Reporter: Tim Armstrong > Priority: Major > Labels: perf > > Currently the hash tables used for hash join and aggregation use 16 bytes per > bucket and 24 bytes per additional duplicate for a key: > {code} > /// Linked list of entries used for duplicates. > struct DuplicateNode { > /// Used for full outer and right {outer, anti, semi} joins. Indicates > whether the > /// row in the DuplicateNode has been matched. > /// From an abstraction point of view, this is an awkward place to store > this > /// information. > /// TODO: Fold this flag in the next pointer below. > bool matched; > /// Chain to next duplicate node, NULL when end of list. > DuplicateNode* next; > HtData htdata; > }; > struct Bucket { > /// Whether this bucket contains a vaild entry, or it is empty. > bool filled; > /// Used for full outer and right {outer, anti, semi} joins. Indicates > whether the > /// row in the bucket has been matched. > /// From an abstraction point of view, this is an awkward place to store > this > /// information but it is efficient. This space is otherwise unused. > bool matched; > /// Used in case of duplicates. If true, then the bucketData union should > be used as > /// 'duplicates'. > bool hasDuplicates; > /// Cache of the hash for data. > /// TODO: Do we even have to cache the hash value? > uint32_t hash; > /// Either the data for this bucket or the linked list of duplicates. > union { > HtData htdata; > DuplicateNode* duplicates; > } bucketData; > }; > {code} > There are some comments in the code that suggest folding the boolean values > into the upper bits of the pointers (since on amd64 the address space is only > 48 bits, but moving to 57 bits apparently - see > https://software.intel.com/sites/default/files/managed/2b/80/5-level_paging_white_paper.pdf). > That would reduce the bucket to 12 bytes of actual data. > This would give us the opportunity to reduce memory requirements of joins and > the pressure on caches significantly, provided we can work out the > implementation issues and the cost of the bit manipulation doesn't exceed the > benefit (my intuition is that cache effects are way more important but I > could be wrong). > Here's a rough idea of what we could do: > # Implement folding of booleans into the pointer and mark struct Bucket as > packed so that it doesn't just undo the work with additional padding. > # Modifying Hashtable to work with the new bucket structure. This needs a > little thought since the bucket allocations must be a power-of-two size in > bytes, but we also need the hash table entries to be a power-of-two in order > for masking the hash to get the bucket number to work. I think either we > could just leave wasted space in the buffer or switch to a non-power-of-two > number of buckets and using an alternative method of getting the bucket from > the hash: > https://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/ > # Run benchmarks to see if it's beneficial. The effect probably depends on > the data set size. -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org