chevaris opened a new issue, #1258:
URL: https://github.com/apache/curator/issues/1258

    PersistentTTLNode (Curator 5.8.0, and probably earlier versions) has a 
corner case that prevents the ZNode from being deleted when the program running 
the recipe is stopped in certain situations.
   
   This is the sequence:
   - Start the recipe with a TTL of 30 seconds -> the container node is created.
   - Stop the program that runs the recipe (or let it crash in production) 
before the touch TTL node is created. The timing is NOT deterministic: a 
background thread is scheduled to run after TTL/2 (by default). In the worst 
case, the TTL node could take up to 15 seconds to be created in this example.
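
   The arithmetic behind the 15-second figure can be sketched as follows. This 
is only an illustration: it assumes the first touch is scheduled TTL divided by 
touchScheduleFactor after the recipe starts, as described above. The names 
`TouchWindow` and `worstCaseWindowMs` are hypothetical and not part of Curator.

```java
import java.time.Duration;

// Hypothetical helper, not part of Curator. It models the delay before the
// first touch node is written, assuming that delay is ttl / touchScheduleFactor.
final class TouchWindow {

    // Longest time (in ms) the container node can exist without a touch child.
    static long worstCaseWindowMs(long ttlMs, int touchScheduleFactor) {
        if (ttlMs <= 0 || touchScheduleFactor <= 0) {
            throw new IllegalArgumentException("ttlMs and touchScheduleFactor must be positive");
        }
        return ttlMs / touchScheduleFactor;
    }

    public static void main(String[] args) {
        long ttlMs = Duration.ofSeconds(30).toMillis();
        // Default factor of 2: the window is half the TTL.
        System.out.println(worstCaseWindowMs(ttlMs, 2));  // prints 15000
        // A larger factor shrinks the window but never removes it.
        System.out.println(worstCaseWindowMs(ttlMs, 10)); // prints 3000
    }
}
```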
   
   When this happens, the CONTAINER node is never deleted. One option is to 
increase the touchScheduleFactor, BUT this workaround still does not look 
correct to me.
   
   
   In my view, the recipe should watch the container node itself and, as soon 
as the node is created, trigger creation of the TOUCH node to minimize the 
window in which the problem can occur.
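
   The effect of touching immediately can be pictured with a plain 
`ScheduledExecutorService`. This sketch does not use Curator and is not the 
watcher-based change in the linked commit: it swaps in a simpler stand-in, 
scheduling a placeholder task with a zero initial delay, and measures how long 
the first "touch" takes. `EagerTouch` and `firstTouchDelayMs` are hypothetical 
names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration, not Curator code: compares a deferred first
// touch (initial delay == period) with an eager one (initial delay == 0).
final class EagerTouch {

    // Returns the elapsed ms until the stand-in touch task first runs,
    // or -1 if it did not run within timeoutMs (or the wait was interrupted).
    static long firstTouchDelayMs(long initialDelayMs, long periodMs, long timeoutMs) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch firstTouch = new CountDownLatch(1);
        long start = System.nanoTime();
        try {
            // The task stands in for "create or update the touch node".
            executor.scheduleAtFixedRate(
                    firstTouch::countDown, initialDelayMs, periodMs, TimeUnit.MILLISECONDS);
            if (!firstTouch.await(timeoutMs, TimeUnit.MILLISECONDS)) {
                return -1;
            }
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        long periodMs = 500;
        System.out.println("deferred first touch after ms: "
                + firstTouchDelayMs(periodMs, periodMs, 2_000));
        System.out.println("eager first touch after ms: "
                + firstTouchDelayMs(0, periodMs, 2_000));
    }
}
```

   Even with a zero initial delay, the first touch is still a separate step 
from creating the container node, so a crash between the two remains possible.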
   
   I attach a test case that shows the problem, and I include a fixed recipe 
that solves it.
   
   
https://github.com/chevaris/curator/commit/6da77252f24841d8f8e85572cad9ac6d86cac5e7
   
   In any case, no matter how quickly the touch ZNode is added, the race 
condition will always be there. In my view, this is a limitation of the 
strategy used by this recipe and should be documented.
   
   Regards,
   
   Cheva

