[ 
https://issues.apache.org/jira/browse/IMPALA-7635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062983#comment-17062983
 ] 

David Rorke commented on IMPALA-7635:
-------------------------------------

One simple first step to help assess the memory wasted by the current 
power-of-two bucket allocation would be to add a profile counter for the number 
of filled buckets.  If we find that we're frequently allocating much more than 
needed to maintain the MAX_FILL_FACTOR, there's more incentive to consider more 
incremental allocation.

> Reduce size of hash tables in-memory by packing buckets more densely
> --------------------------------------------------------------------
>
>                 Key: IMPALA-7635
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7635
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 3.1.0
>            Reporter: Tim Armstrong
>            Priority: Major
>              Labels: perf
>
> Currently the hash tables used for hash join and aggregation use 16 bytes per 
> bucket and 24 bytes per additional duplicate for a key:
> {code}
>   /// Linked list of entries used for duplicates.
>   struct DuplicateNode {
>     /// Used for full outer and right {outer, anti, semi} joins. Indicates 
> whether the
>     /// row in the DuplicateNode has been matched.
>     /// From an abstraction point of view, this is an awkward place to store 
> this
>     /// information.
>     /// TODO: Fold this flag in the next pointer below.
>     bool matched;
>     /// Chain to next duplicate node, NULL when end of list.
>     DuplicateNode* next;
>     HtData htdata;
>   };
>   struct Bucket {
>     /// Whether this bucket contains a vaild entry, or it is empty.
>     bool filled;
>     /// Used for full outer and right {outer, anti, semi} joins. Indicates 
> whether the
>     /// row in the bucket has been matched.
>     /// From an abstraction point of view, this is an awkward place to store 
> this
>     /// information but it is efficient. This space is otherwise unused.
>     bool matched;
>     /// Used in case of duplicates. If true, then the bucketData union should 
> be used as
>     /// 'duplicates'.
>     bool hasDuplicates;
>     /// Cache of the hash for data.
>     /// TODO: Do we even have to cache the hash value?
>     uint32_t hash;
>     /// Either the data for this bucket or the linked list of duplicates.
>     union {
>       HtData htdata;
>       DuplicateNode* duplicates;
>     } bucketData;
>   };
> {code}
> There are some comments in the code that suggest folding the boolean values 
> into the upper bits of the pointers (since on amd64 the address space is only 
> 48 bits, but moving to 57 bits apparently - see 
> https://software.intel.com/sites/default/files/managed/2b/80/5-level_paging_white_paper.pdf).
>  That would reduce the bucket to 12 bytes of actual data.
> This would give us the opportunity to reduce memory requirements of joins and 
> the pressure on caches significantly, provided we can work out the 
> implementation issues and the cost of the bit manipulation doesn't exceed the 
> benefit (my intuition is that cache effects are way more important but I 
> could be wrong).
> Here's a rough idea of what we could do:
> # Implement folding of booleans into the pointer and mark struct Bucket as 
> packed so that it doesn't just undo the work with additional padding.
> # Modifying Hashtable to work with the new bucket structure. This needs a 
> little thought since the bucket allocations must be a power-of-two size in 
> bytes, but we also need the hash table entries to be a power-of-two in order 
> for masking the hash to get the bucket number to work. I think either we 
> could just leave wasted space in the buffer or switch to a non-power-of-two 
> number of buckets and using an alternative method of getting the bucket from 
> the hash: 
> https://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/
> # Run benchmarks to see if it's beneficial. The effect probably depends on 
> the data set size.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

Reply via email to