[ https://issues.apache.org/jira/browse/HIVE-6518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jitendra Nath Pandey updated HIVE-6518:
---------------------------------------
       Resolution: Fixed
    Fix Version/s: 0.14.0
           Status: Resolved  (was: Patch Available)

I have committed this to trunk. Thanks to Gopal!

[~rhbutani] This is an important fix to vector group by, because the aggregates must flush more aggressively when a GC occurs. Therefore, I intend to commit it to branch-0.13 as well.

> Add a GC canary to the VectorGroupByOperator to flush whenever a GC is triggered
> --------------------------------------------------------------------------------
>
>                 Key: HIVE-6518
>                 URL: https://issues.apache.org/jira/browse/HIVE-6518
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.13.0
>            Reporter: Gopal V
>            Assignee: Gopal V
>            Priority: Minor
>             Fix For: 0.14.0
>
>         Attachments: HIVE-6518.1-tez.patch, HIVE-6518.2-tez.patch, HIVE-6518.2.patch, HIVE-6518.3.patch
>
>
> The current VectorGroupByOperator implementation flushes the in-memory hash
> tables when either the maximum number of entries or the memory-fraction
> threshold is reached.
> This works for most cases, but there are corner cases where we hit GC
> overhead limits or heap size limits before either of those conditions is
> reached, because of memory used by the rest of the pipeline.
> This patch adds a SoftReference as a GC canary. If the soft reference has
> been cleared, then a full GC pass happened in the recent past, and the
> aggregation hash tables should be flushed immediately, before another full GC
> is triggered.
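
For illustration, the canary idea in the description could be sketched in Java roughly as follows. This is not the code from the attached patches: the names GcCanaryDemo, GcCanary, CountingAggregator and the simulateGc() hook are hypothetical, and the only real API assumed is the JDK's java.lang.ref.SoftReference. The aggregator flushes either when its entry limit is reached or when the canary reports that its soft reference was cleared.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a SoftReference-based GC canary; not the HIVE-6518 patch.
final class GcCanaryDemo {

    /** Holds a softly reachable sentinel that the JVM may clear under memory pressure. */
    static final class GcCanary {
        private SoftReference<Object> ref = new SoftReference<>(new Object());

        /** Returns true if the sentinel was cleared since the last check, then re-arms. */
        boolean wasCleared() {
            if (ref.get() != null) {
                return false;
            }
            ref = new SoftReference<>(new Object());
            return true;
        }

        /** Demo hook (assumption): clears the reference by hand to imitate a GC. */
        void simulateGc() {
            ref.clear();
        }
    }

    /** Toy group-by counter that flushes on an entry limit or on a canary trip. */
    static final class CountingAggregator {
        private final Map<String, Long> counts = new HashMap<>();
        private final GcCanary canary;
        private final int maxEntries;
        private int flushCount = 0;

        CountingAggregator(GcCanary canary, int maxEntries) {
            this.canary = canary;
            this.maxEntries = maxEntries;
        }

        void add(String key) {
            counts.merge(key, 1L, Long::sum);
            if (counts.size() >= maxEntries || canary.wasCleared()) {
                flush();
            }
        }

        private void flush() {
            // A real operator would forward the partial aggregates downstream here.
            counts.clear();
            flushCount++;
        }

        int size() {
            return counts.size();
        }

        int flushCount() {
            return flushCount;
        }
    }
}
```

The canary is re-armed after each trip so that every later clearing of the soft reference triggers exactly one more flush. When the JVM actually clears soft references depends on the collector and its settings, so the sketch only shows the control flow.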