[GitHub] [hive] rbalamohan commented on a change in pull request #1250: HIVE-23843: Improve key evictions in VectorGroupByOperator

GitBox Thu, 23 Jul 2020 18:22:26 -0700


rbalamohan commented on a change in pull request #1250:
URL: https://github.com/apache/hive/pull/1250#discussion_r459811176




##########
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java
##########
@@ -561,17 +590,25 @@ private void flush(boolean all) throws HiveException {
             maxHashTblMemory/1024/1024,
             gcCanary.get() == null ? "dead" : "alive"));
       }
+      int avgAccess = computeAvgAccess();
 
       /* Iterate the global (keywrapper,aggregationbuffers) map and emit
        a row for each key */
       Iterator<Map.Entry<KeyWrapper, VectorAggregationBufferRow>> iter =
           mapKeysAggregationBuffers.entrySet().iterator();
       while(iter.hasNext()) {
         Map.Entry<KeyWrapper, VectorAggregationBufferRow> pair = iter.next();
+        if (!all && avgAccess >= 1) {
+          // Retain entries when access pattern is > than average access
+          if (pair.getValue().getAccessCount() > avgAccess) {

Review comment:
       >> keys could retain their places for a long time because of very old 
cache hits - and they will keep their place in the cache
   
   This depends on the incoming data; in the worst case it behaves like the 
earlier implementation. However, there is a corner case (again data-dependent) 
in which a large number of entries in the map exceed the average access 
threshold, which could prevent the 10% flush limit from being reached. Adding 
the reset would help prevent this. I will create a follow-up ticket for it.
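
   To make the corner case concrete, here is a minimal, self-contained sketch. 
It is not the Hive code: the class `AccessAwareFlushSketch`, the `access` 
method, and the `resetRetained` flag are hypothetical names introduced for 
illustration, access counts are held in a plain map instead of 
`VectorAggregationBufferRow`, and the 10% flush limit is not modeled. It only 
shows how retaining entries above the average access count can keep stale keys 
alive, and how resetting the counts of retained entries might avoid that.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the operator's hash map; NOT the Hive implementation.
class AccessAwareFlushSketch {
  // key -> access count (assumed analogue of getAccessCount() on the buffer row)
  final Map<String, Integer> entries = new LinkedHashMap<>();

  // Record one lookup of a key; a new key starts with a count of 1.
  void access(String key) {
    entries.merge(key, 1, Integer::sum);
  }

  // Integer average of all access counts, 0 for an empty map.
  int computeAvgAccess() {
    if (entries.isEmpty()) {
      return 0;
    }
    long total = 0;
    for (int count : entries.values()) {
      total += count;
    }
    return (int) (total / entries.size());
  }

  // Evicts entries and returns how many were flushed.
  // On a partial flush, entries accessed more often than average are retained.
  // resetRetained is the assumed "reset": retained entries restart from zero.
  int flush(boolean all, boolean resetRetained) {
    int avgAccess = computeAvgAccess();
    int flushed = 0;
    Iterator<Map.Entry<String, Integer>> iter = entries.entrySet().iterator();
    while (iter.hasNext()) {
      Map.Entry<String, Integer> pair = iter.next();
      if (!all && avgAccess >= 1 && pair.getValue() > avgAccess) {
        if (resetRetained) {
          pair.setValue(0); // old hits no longer protect the key next time
        }
        continue;
      }
      iter.remove();
      flushed++;
    }
    return flushed;
  }
}
```

   With 99 keys accessed 100 times each and one key accessed once, the average 
is 99, so a partial flush evicts a single entry out of 100. Without the reset, 
those 99 keys keep their old counts and would still be retained on a later 
flush even if only new keys have been seen since; with the reset, they compete 
on recent accesses only.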
   




