[
https://issues.apache.org/jira/browse/OOZIE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503299#comment-14503299
]
Rohini Palaniswamy commented on OOZIE-2206:
-------------------------------------------
If we are only cleaning up locks that are still there after 5 minutes, why do we
need REAP_INDEFINITELY for a particular path? Won't REAP_UNTIL_GONE (which takes
into account another HA server reaping it) take care of the issue? Why should
a removed lock path be monitored indefinitely?
> Change Reaper mode on ChildReaper in ZKLocksService
> ---------------------------------------------------
>
> Key: OOZIE-2206
> URL: https://issues.apache.org/jira/browse/OOZIE-2206
> Project: Oozie
> Issue Type: Bug
> Affects Versions: trunk
> Reporter: Ryota Egashira
> Assignee: Ryota Egashira
> Fix For: trunk
>
> Attachments: OOZIE-2206.patch
>
>
> OOZIE-1906 added a znode cleanup thread.
> It currently passes Reaper.Mode.REAP_INDEFINITELY, which forces the Oozie
> server to keep reaping a znode even after the znode has been cleaned up.
> (https://github.com/apache/curator/blob/master/curator-recipes/src/main/java/org/apache/curator/framework/recipes/locks/Reaper.java)
>
> This adds memory pressure on the Oozie server. We need to change the mode to
> REAP_UNTIL_GONE or REAP_UNTIL_DELETE.
>
> {code}
> reaper = new ChildReaper(zk.getClient(), LOCKS_NODE,
>         Reaper.Mode.REAP_INDEFINITELY, getExecutorService(),
>         ConfigurationService.getInt(services.getConf(), REAPING_THRESHOLD) * 1000,
>         REAPING_LEADER_PATH);
> {code}
> We hit a scenario in which one ZK quorum slowed down for a short period,
> causing many ZK locks not to be released properly. Right after that,
> ChildReaper (which runs every 5 min) ran and has kept checking that list of
> znodes ever since; in the end, the Oozie server hit OOM.
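
To make the difference between the three modes concrete, here is a minimal toy model in plain Java. It is not Curator code: the class ReaperModeSketch and the method trackedAfter are hypothetical names. Only the three enum constants mirror Reaper.Mode, and their meaning follows my reading of the Curator Javadoc. The sketch assumes every existing znode is already empty and is deleted on the first pass, whereas the real Reaper only deletes a node once it has been seen empty on repeated checks.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ReaperModeSketch {
    // Same constant names as Curator's Reaper.Mode; the logic below is a simplified model.
    public enum Mode { REAP_INDEFINITELY, REAP_UNTIL_DELETE, REAP_UNTIL_GONE }

    /**
     * Simulates `passes` reaping passes and returns how many paths the
     * reaper is still tracking (and therefore still holding in memory).
     * `existingAtStart` holds the znodes that exist in ZooKeeper; paths
     * not in it are assumed to have been removed by another HA server.
     */
    public static int trackedAfter(Mode mode, List<String> paths,
                                   Set<String> existingAtStart, int passes) {
        Set<String> existing = new HashSet<>(existingAtStart);
        List<String> tracked = new ArrayList<>(paths);
        for (int i = 0; i < passes; i++) {
            List<String> next = new ArrayList<>();
            for (String path : tracked) {
                boolean keep = true;
                if (existing.remove(path)) {
                    // This reaper deleted the znode itself.
                    if (mode != Mode.REAP_INDEFINITELY) {
                        keep = false;
                    }
                } else if (mode == Mode.REAP_UNTIL_GONE) {
                    // The znode was already gone; only REAP_UNTIL_GONE stops here.
                    keep = false;
                }
                if (keep) {
                    next.add(path);
                }
            }
            tracked = next;
        }
        return tracked.size();
    }

    public static void main(String[] args) {
        List<String> paths = new ArrayList<>();
        Set<String> existing = new HashSet<>();
        for (int i = 0; i < 1000; i++) {
            String path = "/locks/job-" + i;  // hypothetical lock paths
            paths.add(path);
            if (i % 2 == 0) {
                existing.add(path);           // the other half was reaped elsewhere
            }
        }
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " still tracks "
                    + trackedAfter(mode, paths, existing, 3) + " paths");
        }
    }
}
```

In this model, REAP_INDEFINITELY never drops a path, REAP_UNTIL_DELETE keeps the paths that another server removed, and REAP_UNTIL_GONE drops every path once its znode no longer exists.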