[ 
https://issues.apache.org/jira/browse/HIVE-6518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6518:
---------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.14.0
           Status: Resolved  (was: Patch Available)

I have committed this to trunk. Thanks to Gopal!

[~rhbutani] This is an important fix to vector group by because the aggregates 
must flush more aggressively in case of GC. Therefore, I intend to commit it to 
branch-0.13 as well.

> Add a GC canary to the VectorGroupByOperator to flush whenever a GC is 
> triggered
> --------------------------------------------------------------------------------
>
>                 Key: HIVE-6518
>                 URL: https://issues.apache.org/jira/browse/HIVE-6518
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.13.0
>            Reporter: Gopal V
>            Assignee: Gopal V
>            Priority: Minor
>             Fix For: 0.14.0
>
>         Attachments: HIVE-6518.1-tez.patch, HIVE-6518.2-tez.patch, 
> HIVE-6518.2.patch, HIVE-6518.3.patch
>
>
> The current VectorGroupByOperator implementation flushes the in-memory hashes 
> when the maximum number of entries or the memory fraction threshold is hit.
> This works for most cases, but there are some corner cases where we hit GC 
> overhead limits or heap size limits before either of those conditions is 
> reached, due to memory used by the rest of the pipeline.
> This patch adds a SoftReference as a GC canary. If the soft reference is 
> dead, then a full GC pass happened in the recent past, and the 
> aggregation hashtables should be flushed immediately before another full GC 
> is triggered.
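
The canary idea described above can be sketched roughly as follows. This is
an illustration only, not the code from the HIVE-6518 patch: the class
GcCanaryAggregator, its methods, and the plain HashMap standing in for the
aggregation hashtables are all hypothetical names chosen for this example.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a SoftReference-based GC canary; not Hive code. */
class GcCanaryAggregator {
    private final int maxEntries;
    // Stand-in for the in-memory aggregation hashtables.
    private final Map<String, Long> sums = new HashMap<>();
    // The canary: a soft reference to an otherwise unreachable object.
    private SoftReference<Object> gcCanary = new SoftReference<>(new Object());
    private int flushCount = 0;

    public GcCanaryAggregator(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    /** Adds a value to the running sum for key, then checks the flush conditions. */
    public void add(String key, long value) {
        sums.merge(key, value, Long::sum);
        // Flush on the usual size limit, or as soon as the canary has been cleared.
        if (sums.size() >= maxEntries || gcCanaryDied()) {
            flush();
        }
    }

    /** True once the collector (or the demo hook below) has cleared the canary. */
    public boolean gcCanaryDied() {
        return gcCanary.get() == null;
    }

    /** Demo hook (assumption for this example): simulates the collector clearing the canary. */
    public void killCanaryForDemo() {
        gcCanary.clear();
    }

    /** Drops the buffered aggregates and arms a fresh canary. */
    public void flush() {
        // A real operator would forward the partial aggregates downstream here.
        sums.clear();
        flushCount++;
        gcCanary = new SoftReference<>(new Object());
    }

    public int size() {
        return sums.size();
    }

    public int flushCount() {
        return flushCount;
    }
}
```

The JVM guarantees that all softly reachable objects are cleared before it
throws OutOfMemoryError, so a cleared canary is a cheap signal that the heap
has recently been under pressure. Arming a new canary after each flush lets
the same check detect the next such event.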



--
This message was sent by Atlassian JIRA
(v6.2#6252)
