[
https://issues.apache.org/jira/browse/HIVE-6518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935444#comment-13935444
]
Gopal V commented on HIVE-6518:
-------------------------------
The test failures don't seem to be related to this fix: the failing tests aren't vectorized.
> Add a GC canary to the VectorGroupByOperator to flush whenever a GC is
> triggered
> --------------------------------------------------------------------------------
>
> Key: HIVE-6518
> URL: https://issues.apache.org/jira/browse/HIVE-6518
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.13.0
> Reporter: Gopal V
> Assignee: Gopal V
> Priority: Minor
> Attachments: HIVE-6518.1-tez.patch, HIVE-6518.2-tez.patch,
> HIVE-6518.2.patch, HIVE-6518.3.patch
>
>
> The current VectorGroupByOperator implementation flushes the in-memory hashes
> when the maximum number of entries or the memory fraction limit is hit.
> This works for most cases, but there are some corner cases where we hit GC
> overhead limits or heap size limits before either of those conditions is
> reached, due to the rest of the pipeline.
> This patch adds a SoftReference as a GC canary. If the soft reference has
> been cleared, then a full GC pass happened in the recent past, and the
> aggregation hashtables should be flushed immediately, before another full GC
> is triggered.
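
The canary idea in the description above can be sketched roughly as follows. This is a minimal illustration, not the code from the attached patches: the class name GcCanary and the methods gcHappened() and simulateGc() are hypothetical. It relies only on java.lang.ref.SoftReference, whose referent the JVM may clear under memory pressure and is guaranteed to have cleared before an OutOfMemoryError is thrown.

```java
import java.lang.ref.SoftReference;

/** Hypothetical sketch of a GC canary; not taken from the HIVE-6518 patches. */
public class GcCanary {
    // Softly reachable marker object. Nothing else references it, so the
    // collector is free to clear this reference when memory gets tight.
    private SoftReference<Object> canary = new SoftReference<>(new Object());

    /**
     * Returns true if the canary was cleared since the last check,
     * and re-arms it so the next clearing can be detected as well.
     */
    public boolean gcHappened() {
        if (canary.get() != null) {
            return false; // referent still alive: no sign of memory pressure
        }
        canary = new SoftReference<>(new Object());
        return true;
    }

    /** Demo-only hook: clears the reference by hand to mimic the collector. */
    public void simulateGc() {
        canary.clear();
    }

    public static void main(String[] args) {
        GcCanary c = new GcCanary();
        c.simulateGc();
        if (c.gcHappened()) {
            // An operator would flush its aggregation hashtables at this point.
            System.out.println("canary cleared: flush now");
        }
    }
}
```

An operator could call gcHappened() once per processed batch, next to its existing entry-count and memory-fraction checks, and flush when it returns true. When exactly the JVM clears soft references depends on the collector and its settings, so the signal is a heuristic rather than a precise GC notification.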