[
https://issues.apache.org/jira/browse/OOZIE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503280#comment-14503280
]
Purshotam Shah commented on OOZIE-2206:
---------------------------------------
{quote}
Posted 9 months, 2 weeks ago (July 3, 2014, 6:59 p.m.)
core/src/main/java/org/apache/oozie/service/ZKLocksService.java (Diff revision 2), line 67:
reaper = new ChildReaper(zk.getClient(), LOCKS_NODE,
Reaper.Mode.REAP_UNTIL_DELETE, getExecutorService(),
Shouldn't the mode be REAP_INDEFINITELY? REAP_UNTIL_DELETE will stop after the
first deletion, right?
https://curator.apache.org/apidocs/org/apache/curator/framework/recipes/locks/Reaper.Mode.html
The issue has been resolved.
Purshotam Shah 9 months, 1 week ago (July 11, 2014, 4:47 p.m.)
REAP_INDEFINITELY: Reap forever, or until removePath is called for the path.
I think this param is passed for each lock. REAP_INDEFINITELY is used as the
default implementation. Will use REAP_INDEFINITELY.
{quote}
We started with REAP_UNTIL_DELETE, but moved to REAP_INDEFINITELY based on the
discussion on Review Board.
We should also check with the Curator team; there could be a bug in the Curator code.
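To make the difference between the modes concrete, here is a minimal sketch of the bookkeeping as I understand it from the Reaper.Mode javadoc. It is a toy model, not Curator's code: the ReaperModel class, its methods, and the /locks/job-... paths are all made up for illustration, and every existing znode is assumed to be empty and eligible for deletion.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

// Toy model of which paths a reaper keeps re-checking. NOT Curator's implementation.
class ReaperModel {
    enum Mode { REAP_INDEFINITELY, REAP_UNTIL_DELETE, REAP_UNTIL_GONE }

    private final Mode mode;
    private final Set<String> tracked = new LinkedHashSet<>(); // paths re-checked on every pass
    private final Set<String> znodes;                          // stand-in for ZooKeeper state

    ReaperModel(Mode mode, Set<String> znodes) {
        this.mode = mode;
        this.znodes = znodes;
    }

    void addPath(String path) { tracked.add(path); }

    int trackedCount() { return tracked.size(); }

    // One reaping pass over a snapshot of the tracked paths.
    void reapOnce() {
        for (String path : new ArrayList<>(tracked)) {
            if (znodes.remove(path)) {
                // The reaper deleted the znode itself.
                if (mode != Mode.REAP_INDEFINITELY) {
                    tracked.remove(path);
                }
            } else if (mode == Mode.REAP_UNTIL_GONE) {
                // The znode was already gone; only REAP_UNTIL_GONE stops tracking it.
                tracked.remove(path);
            }
        }
    }

    // Hypothetical driver: create locks each cycle, reap, and report what is still tracked.
    static int trackedAfter(Mode mode, int cycles, int locksPerCycle) {
        Set<String> znodes = new HashSet<>();
        ReaperModel reaper = new ReaperModel(mode, znodes);
        for (int c = 0; c < cycles; c++) {
            for (int i = 0; i < locksPerCycle; i++) {
                String path = "/locks/job-" + c + "-" + i; // made-up lock path
                znodes.add(path);
                reaper.addPath(path);
            }
            reaper.reapOnce();
        }
        return reaper.trackedCount();
    }

    public static void main(String[] args) {
        for (Mode m : Mode.values()) {
            System.out.println(m + " still tracks " + trackedAfter(m, 3, 100) + " paths");
        }
    }
}
```

In this model REAP_INDEFINITELY still tracks all 300 paths after three cycles even though every znode has been deleted, while the other two modes track none. That ever-growing tracked set is the kind of growth that could explain the memory pressure, if Curator behaves the way the javadoc describes.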
> Change Reaper mode on ChildReaper in ZKLocksService
> ---------------------------------------------------
>
> Key: OOZIE-2206
> URL: https://issues.apache.org/jira/browse/OOZIE-2206
> Project: Oozie
> Issue Type: Bug
> Affects Versions: trunk
> Reporter: Ryota Egashira
> Assignee: Ryota Egashira
> Fix For: trunk
>
> Attachments: OOZIE-2206.patch
>
>
> OOZIE-1906 added a znode cleanup thread.
> It currently passes Reaper.Mode.REAP_INDEFINITELY, which forces the Oozie
> server to keep reaping a znode even after that znode has been cleaned up.
> (https://github.com/apache/curator/blob/master/curator-recipes/src/main/java/org/apache/curator/framework/recipes/locks/Reaper.java)
>
> This adds memory pressure on the Oozie server. We need to change the mode to
> REAP_UNTIL_GONE or REAP_UNTIL_DELETE.
>
> {code}
> reaper = new ChildReaper(zk.getClient(), LOCKS_NODE,
> Reaper.Mode.REAP_INDEFINITELY, getExecutorService(),
> ConfigurationService.getInt(services.getConf(), REAPING_THRESHOLD) * 1000,
> REAPING_LEADER_PATH);
> {code}
> We hit a scenario where a ZK quorum slowed down for a short period, right
> after a ChildReaper run (it runs every 5 min), leaving many ZK locks not
> properly released. ChildReaper has kept checking that list of znodes ever
> since, and in the end the Oozie server hit OOM.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)