[ 
https://issues.apache.org/jira/browse/HIVE-6518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13915212#comment-13915212
 ] 

Gunther Hagleitner commented on HIVE-6518:
------------------------------------------

I like it. Sounds like this will allow you to be more aggressive with the 
caching/flushing parameters, while having a trigger that flushes the hashes 
when necessary.

+1 (assuming tests pass)

> Add a GC canary to the VectorGroupByOperator to flush whenever a GC is 
> triggered
> --------------------------------------------------------------------------------
>
>                 Key: HIVE-6518
>                 URL: https://issues.apache.org/jira/browse/HIVE-6518
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.13.0
>            Reporter: Gopal V
>            Assignee: Gopal V
>            Priority: Minor
>         Attachments: HIVE-6518.1-tez.patch
>
>
> The current VectorGroupByOperator implementation flushes the in-memory hashes 
> when the maximum number of entries or the memory fraction limit is hit.
> This works for most cases, but there are some corner cases where we hit GC 
> overhead limits or heap size limits before either of those conditions is 
> reached, due to the rest of the pipeline.
> This patch adds a SoftReference as a GC canary. If the soft reference is 
> dead, then a full GC pass happened in the recent past and the 
> aggregation hashtables should be flushed immediately, before another full GC 
> is triggered.
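
The canary idea described above can be sketched roughly as follows. This is a
minimal illustration, not the code in HIVE-6518.1-tez.patch: the class name
GcCanaryAggregator, the MAX_ENTRIES limit, the String-to-count map, and the
simulateGc() hook are all hypothetical and exist only for this example. Note
also that the JVM only guarantees soft references are cleared before an
OutOfMemoryError is thrown; when exactly they are cleared depends on the
collector, so a dead canary is a hint of memory pressure rather than a
precise signal.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a SoftReference GC canary; not the actual Hive patch. */
final class GcCanaryAggregator {
    // Assumed entry limit, chosen only for illustration.
    static final int MAX_ENTRIES = 1_000_000;

    private final Map<String, Long> counts = new HashMap<>();
    // The canary: a softly reachable object the collector may reclaim under memory pressure.
    private SoftReference<Object> canary = new SoftReference<>(new Object());
    private int flushes = 0;

    /** Aggregates one key, then flushes if the entry limit is hit or the canary is dead. */
    void add(String key) {
        counts.merge(key, 1L, Long::sum);
        if (counts.size() >= MAX_ENTRIES || canary.get() == null) {
            flush();
        }
    }

    /** Drops the in-memory hash and arms a fresh canary. */
    void flush() {
        // A real operator would forward the partial aggregates downstream here.
        counts.clear();
        flushes++;
        canary = new SoftReference<>(new Object());
    }

    /** Demo hook: clears the reference by hand, standing in for the collector. */
    void simulateGc() {
        canary.clear();
    }

    int size() {
        return counts.size();
    }

    int flushCount() {
        return flushes;
    }
}
```

In this sketch the canary is checked on every add() call, which costs one
SoftReference.get() per row; a real operator might check it only every N rows
to keep the overhead down.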



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
