[ 
https://issues.apache.org/jira/browse/OOZIE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503280#comment-14503280
 ] 

Purshotam Shah commented on OOZIE-2206:
---------------------------------------

{quote}
Posted 9 months, 2 weeks ago (July 3, 2014, 6:59 p.m.)

core/src/main/java/org/apache/oozie/service/ZKLocksService.java (Diff revision 2)

    reaper = new ChildReaper(zk.getClient(), LOCKS_NODE, Reaper.Mode.REAP_UNTIL_DELETE, getExecutorService(),


Shouldn't the mode be REAP_INDEFINITELY?  REAP_UNTIL_DELETE will stop after the 
first deletion, right?

https://curator.apache.org/apidocs/org/apache/curator/framework/recipes/locks/Reaper.Mode.html
The issue has been resolved.
Purshotam Shah 9 months, 1 week ago (July 11, 2014, 4:47 p.m.)
REAP_INDEFINITELY: Reap forever, or until removePath is called for the path.

I think this param is passed for each lock. REAP_INDEFINITELY is used as 
default implementation. Will use REAP_INDEFINITELY.

{quote}
We started with REAP_UNTIL_DELETE, but moved to REAP_INDEFINITELY based on the 
discussion on Review Board.
We should also check with the Curator team; there could be a bug in the Curator 
code.
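
For reference, here is a toy model of how the three modes differ in what the reaper keeps tracking. It is plain Java with no Curator dependency. The class ReaperModeSketch, the method trackedAfter, and the in-memory sets are hypothetical stand-ins. It only mirrors the Javadoc descriptions of the modes and is not Curator's actual implementation.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

public class ReaperModeSketch {
    // Same names as Curator's Reaper.Mode, redeclared here so the sketch is self-contained.
    enum Mode { REAP_INDEFINITELY, REAP_UNTIL_DELETE, REAP_UNTIL_GONE }

    /** Returns how many paths the toy reaper still tracks after the given number of passes. */
    static int trackedAfter(Mode mode, int lockCount, int passes) {
        Set<String> znodes = new HashSet<>();   // assumption: stand-in for ZooKeeper state
        Set<String> tracked = new HashSet<>();  // assumption: stand-in for the reaper's path list
        for (int i = 0; i < lockCount; i++) {
            String path = "/locks/" + i;
            znodes.add(path);
            tracked.add(path);
        }
        for (int pass = 0; pass < passes; pass++) {
            Iterator<String> it = tracked.iterator();
            while (it.hasNext()) {
                String path = it.next();
                // The "delete" succeeds only if the node still exists.
                boolean deleted = znodes.remove(path);
                // REAP_UNTIL_DELETE stops once a delete succeeds;
                // REAP_UNTIL_GONE stops once the path is found missing;
                // REAP_INDEFINITELY never stops on its own.
                boolean stop = (mode == Mode.REAP_UNTIL_DELETE && deleted)
                        || (mode == Mode.REAP_UNTIL_GONE && !deleted);
                if (stop) {
                    it.remove();
                }
            }
        }
        return tracked.size();
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " still tracks " + trackedAfter(mode, 1000, 3) + " paths");
        }
    }
}
```

In this model, REAP_INDEFINITELY still holds every path after any number of passes, while the other two modes drop a path once it has been deleted or found gone. That matches the memory growth described in the issue, but it does not rule out a separate bug in Curator.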


> Change Reaper mode on ChildReaper in ZKLocksService
> ---------------------------------------------------
>
>                 Key: OOZIE-2206
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2206
>             Project: Oozie
>          Issue Type: Bug
>    Affects Versions: trunk
>            Reporter: Ryota Egashira
>            Assignee: Ryota Egashira
>             Fix For: trunk
>
>         Attachments: OOZIE-2206.patch
>
>
> OOZIE-1906 added a znode cleanup thread.
> It currently passes Reaper.Mode.REAP_INDEFINITELY, which forces the Oozie 
> server to keep reaping znodes even after they have been cleaned up.  
> (https://github.com/apache/curator/blob/master/curator-recipes/src/main/java/org/apache/curator/framework/recipes/locks/Reaper.java)
>  
> This adds memory pressure on the Oozie server. We need to change the mode to 
> REAP_UNTIL_GONE or REAP_UNTIL_DELETE.
>   
> {code}
> reaper = new ChildReaper(zk.getClient(), LOCKS_NODE, 
> Reaper.Mode.REAP_INDEFINITELY, getExecutorService(), 
> ConfigurationService.getInt(services.getConf(), REAPING_THRESHOLD) * 1000, 
> REAPING_LEADER_PATH);
> {code}
> We hit one scenario where the ZK quorum slowed down for a short period, so 
> many ZK locks were not released properly. Right after that, ChildReaper 
> (which runs every 5 minutes) ran and has kept checking that list of znodes 
> ever since. In the end, the Oozie server hit OOM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
