How would one go about using this feature? Perhaps a pseudo api and client 
code will help me understand. 
Here’s my current API:

zk.createContainer(path, data, acls);   // and standard variations for async

I opted to create a new API because I didn’t want to add a new CreateMode. 
Also, sequential and ephemeral don’t make sense for containers. Usage: in 
Curator, when you create a lock, leader, etc. instance you pass in a path that 
is used to manage things. Now, instead of creating a standard PERSISTENT node, a 
container would be created. Internally, a container node is a normal persistent 
ZNode with a flag (TBD) that marks it as a container.
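
To make the intended behavior concrete, here is a toy in-memory model. It is 
not ZooKeeper code and not the patch: the class and method names are made up 
for illustration, and the boolean stands in for the TBD container flag. It only 
shows the rule from the proposal, that deleting the last child of a container 
also deletes the container, recursively up the tree.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only. Every name here is hypothetical, not a ZooKeeper API.
class ContainerTreeSketch {
    // path -> "is container" flag (stand-in for the TBD marker on the ZNode)
    private final Map<String, Boolean> nodes = new HashMap<>();

    ContainerTreeSketch() {
        nodes.put("/", false); // the root is an ordinary persistent node
    }

    void create(String path, boolean container) {
        if (nodes.containsKey(path)) {
            throw new IllegalStateException("node exists: " + path);
        }
        if (!nodes.containsKey(parentOf(path))) {
            throw new IllegalStateException("no parent for: " + path);
        }
        nodes.put(path, container);
    }

    void delete(String path) {
        if (!nodes.containsKey(path)) {
            throw new IllegalStateException("no node: " + path);
        }
        if (hasChildren(path)) {
            throw new IllegalStateException("not empty: " + path);
        }
        nodes.remove(path);
        // Proposed rule: an empty container parent goes away too,
        // and the check repeats for each container further up the tree.
        String parent = parentOf(path);
        while (!parent.equals("/") && nodes.get(parent) && !hasChildren(parent)) {
            nodes.remove(parent);
            parent = parentOf(parent);
        }
    }

    boolean exists(String path) {
        return nodes.containsKey(path);
    }

    private boolean hasChildren(String path) {
        String prefix = path.equals("/") ? "/" : path + "/";
        for (String p : nodes.keySet()) {
            if (!p.equals(path) && p.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    private static String parentOf(String path) {
        int i = path.lastIndexOf('/');
        return i <= 0 ? "/" : path.substring(0, i);
    }

    public static void main(String[] args) {
        ContainerTreeSketch tree = new ContainerTreeSketch();
        tree.create("/locks", false);         // plain persistent node
        tree.create("/locks/my-lock", true);  // container
        tree.create("/locks/my-lock/member-1", false);
        tree.delete("/locks/my-lock/member-1");
        System.out.println("container still there: " + tree.exists("/locks/my-lock"));
        System.out.println("persistent parent still there: " + tree.exists("/locks"));
    }
}
```

Note that a freshly created, still-empty container is left alone in this 
sketch, because the check only runs when a child is deleted.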

How can we guarantee that the last delete is actually the last delete? What 
if there was a race condition on delete and creation of the new node under 
the same parent?
It’s not necessary. A properly written recipe (IMO of course) re-creates parent 
nodes when necessary. Curator does this (via the creatingParentsIfNeeded() 
operation). So, any Curator recipe will be updated to re-create the container 
if needed.
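
Roughly, the client-side pattern looks like the sketch below. This is a toy 
in-memory stand-in, not Curator or ZooKeeper code: the class, the methods and 
MissingParentException are all hypothetical names, and which exception the 
server would really raise is not decided here. The point is only the retry: if 
the parent has disappeared, put it back and try the create again.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only. Every name here is hypothetical, not a Curator/ZooKeeper API.
class RecreateParentSketch {
    // Stand-in for whatever error a create under a missing parent would raise.
    static class MissingParentException extends Exception {
        MissingParentException(String parent) {
            super("missing parent: " + parent);
        }
    }

    private final Set<String> nodes = new HashSet<>();

    // Plain create: fails if the parent is not there.
    void create(String path) throws MissingParentException {
        String parent = parentOf(path);
        if (!parent.equals("/") && !nodes.contains(parent)) {
            throw new MissingParentException(parent);
        }
        nodes.add(path);
    }

    // Simulates a node going away, e.g. a container being cleaned up.
    void remove(String path) {
        nodes.remove(path);
    }

    boolean exists(String path) {
        return nodes.contains(path);
    }

    // Recipe-side create: re-create missing parents, then retry the create.
    void createWithParents(String path) {
        while (true) {
            try {
                create(path);
                return;
            } catch (MissingParentException e) {
                // The parent vanished between calls: put it back and loop.
                createWithParents(parentOf(path));
            }
        }
    }

    private static String parentOf(String path) {
        int i = path.lastIndexOf('/');
        return i <= 0 ? "/" : path.substring(0, i);
    }

    public static void main(String[] args) {
        RecreateParentSketch zk = new RecreateParentSketch();
        zk.createWithParents("/locks/my-lock/member-1");
        zk.remove("/locks/my-lock/member-1");
        zk.remove("/locks/my-lock"); // container cleaned up behind our back
        zk.createWithParents("/locks/my-lock/member-2");
        System.out.println("member-2 created: " + zk.exists("/locks/my-lock/member-2"));
    }
}
```

A real client would also have to tolerate a "node already exists" error when 
two participants race to re-create the same parent; the set-based toy above 
hides that case.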

At a very high level the big question is: are we tying this feature to a 
specific recipe and the way it's implemented? 
I don’t think so. Of course I’m biased to how Curator does things. But, when 
following the recipes on the doc pages you will always end up in the state 
where there are a bunch of parent nodes lying around. There is actually an 
implied node type of container already being described by the docs - there’s 
just no support for it in ZK.



-Jordan





On April 14, 2015 at 1:52:38 PM, kishore g (g.kish...@gmail.com) wrote:

Hi Jordan,  

I like this feature and always thought it would be useful to have something  
like this for Apache Helix as well. We do have a clean up thread that  
deletes the znodes. But I felt it was tied to Helix.  

Here are some of the questions that made me think it's better to have the  
user of ZooKeeper handle deleting the parent node according to the use case.  

How would one go about using this feature? Perhaps a pseudo api and client  
code will help me understand.  

How can we guarantee that the last delete is actually the last delete? What  
if there was a race condition on delete and creation of the new node under  
the same parent? What kind of exception will we throw when a participant  
tries to create a node under a container node but the parent directory was  
deleted? How should the client handle such an exception?  

What about libraries (such as Curator, zkclient) that provide a mkdir -p kind  
of API where they go ahead and create parent nodes automatically if they  
don't exist?  

At a very high level the big question is: are we tying this feature to a  
specific recipe and the way it's implemented?  

Does this make sense?  

thanks,  
Kishore G  

On Tue, Apr 14, 2015 at 10:49 AM, Camille Fournier <cami...@apache.org>  
wrote:  

> Look at the session managers, they track what sessions are alive and clean  
> up when they aren't.  
>  
> On Tue, Apr 14, 2015 at 1:49 PM, Camille Fournier <c...@renttherunway.com>  
> wrote:  
>  
> > Look at the session managers, they track what sessions are alive and  
> clean  
> > up when they aren't.  
> >  
> > C  
> >  
> > On Tue, Apr 14, 2015 at 1:36 PM, Jordan Zimmerman <  
> > jor...@jordanzimmerman.com> wrote:  
> >  
> >> Another question…  
> >>  
> >> So, my two current questions are:  
> >>  
> >> * noting that a ZNode is a container, would you suggest the hack of a  
> >> special ephemeralOwner value or would you add a new field to Stat?  
> >>  
> >> * is there a current mechanism in the ZK server code (for the leader in  
> >> particular) to handle periodic housecleaning tasks? If so, where is that  
> >> code?  
> >>  
> >> -Jordan  
> >>  
> >>  
> >>  
> >> On April 13, 2015 at 2:48:27 PM, Jordan Zimmerman (  
> >> jor...@jordanzimmerman.com) wrote:  
> >>  
> >> As for noting that a ZNode is a container, would you suggest the hack of  
> >> a special ephemeralOwner value or would you add a new field to Stat?  
> >>  
> >> -Jordan  
> >>  
> >>  
> >>  
> >> On April 10, 2015 at 6:40:23 PM, Patrick Hunt (ph...@apache.org) wrote:  
> >>  
> >> Adding is typically good from a b/w compat perspective. If you use the new  
> >> feature (at runtime) it generally precludes rollback though.  
> >>  
> >> See CreateTxn and CreateTxnV0  
> >>  
> >> A bit of background on convenience vs availability: Originally in ZK's life  
> >> we explicitly stayed away from such operations at the API level (another  
> >> example being "rm -r"). We wanted to have high availability, in the sense  
> >> that a single operation performed a single discrete operation on the  
> >> service. We didn't want to allow "unbounded" sets of changes that might  
> >> affect availability - say a single operation that triggered a thousand  
> >> discrete operations on the service, blocking out clients from doing other  
> >> work. This seems pretty bounded to me though - at worst deleting the entire  
> >> parent chain, which in general should be relatively small.  
> >>  
> >> Patrick  
> >>  
> >> On Thu, Apr 9, 2015 at 12:41 PM, Jordan Zimmerman <  
> >> jor...@jordanzimmerman.com> wrote:  
> >>  
> >> > You don’t even need to look at cversion. If the parent node is a  
> >> container  
> >> > and has no children (i.e. the node being deleted is the last child),  
> it  
> >> > gets deleted.  
> >> >  
> >> > The trouble I’m currently having, though, is that I don’t want to  
> modify  
> >> > the CreateTxn record. I can’t find a place to mark that the node  
> should  
> >> be  
> >> > a container. I guess I’ll have to add a new record type. What are the  
> >> > ramifications of that?  
> >> >  
> >> > -JZ  
> >> >  
> >> > On April 9, 2015 at 2:24:16 PM, Michi Mutsuzaki (  
> mi...@cs.stanford.edu)  
> >> > wrote:  
> >> >  
> >> > I see, so the container znode is a znode that gets deleted if it's  
> >> > empty and it ever had a child (cversion is greater than zero). It  
> >> > sounds good to me. Let's see what other people say.  
> >> >  
> >> > Thanks Jordan!  
> >> >  
> >> > On Thu, Apr 9, 2015 at 10:20 AM, Jordan Zimmerman  
> >> > <jor...@jordanzimmerman.com> wrote:  
> >> > > This sounds great to me, but it sounds a lot like ZOOKEEPER-723.  
> >> > >  
> >> > > The problem with both ZOOKEEPER-723 and ZOOKEEPER-834 is that it  
> >> > overloads  
> >> > > the concept of EPHEMERAL. EPHEMERALs are tied to sessions. In the  
> use  
> >> > cases  
> >> > > that I see, the parent node is always PERSISTENT - i.e. not tied to  
> a  
> >> > > session.  
> >> > >  
> >> > > I haven't looked at the patch yet, but how do you handle the "first  
> >> > > child" problem?  
> >> > >  
> >> > > My solution applies only when a node is deleted. So, there is no need for a  
> >> > > first child check. When a node is deleted, iff its parent has zero children  
> >> > > and is of type CONTAINER, then the parent is deleted, and so on recursively  
> >> > > up the tree.  
> >> > >  
> >> > > -Jordan  
> >> > >  
> >> > > On April 9, 2015 at 12:15:33 PM, Michi Mutsuzaki (  
> >> mi...@cs.stanford.edu)  
> >> > > wrote:  
> >> > >  
> >> > > Hi Jordan.  
> >> > >  
> >> > > This sounds great to me, but it sounds a lot like ZOOKEEPER-723.  
> >> > > Different people had different ideas there, but the original  
> >> > > description was:  
> >> > >  
> >> > > "rather than changing the semantics of ephemeral nodes, i propose  
> >> > > ephemeral parents: znodes that disappear when they have no more  
> >> > > children. this cleanup would happen automatically when the last  
> child  
> >> > > is removed. an ephemeral parent is not tied to any particular  
> session,  
> >> > > so even if the creator goes away, the ephemeral parent will remain  
> as  
> >> > > long as there are children."  
> >> > >  
> >> > > I haven't looked at the patch yet, but how do you handle the "first  
> >> > > child" problem? Is the container znode created with a first child to  
> >> > > prevent getting deleted, or does the client rely on multi to create  
> a  
> >> > > container and its children, or something else?  
> >> > >  
> >> > >  
> >> > > On Thu, Apr 9, 2015 at 8:00 AM, Jordan Zimmerman  
> >> > > <jor...@jordanzimmerman.com> wrote:  
> >> > >> BACKGROUND  
> >> > >> ============  
> >> > >> A recurring problem for ZooKeeper users is garbage collection of  
> >> parent  
> >> > >> nodes. Many recipes (e.g. locks, leaders, etc.) call for the  
> creation  
> >> > of a  
> >> > >> parent node under which participants create sequential nodes. When  
> >> the  
> >> > >> participant is done, it deletes its node. In practice, the  
> ZooKeeper  
> >> > tree  
> >> > >> begins to fill up with orphaned parent nodes that are no longer  
> >> needed.  
> >> > The  
> >> > >> ZooKeeper APIs don't provide a way to clean these. Over time,  
> >> ZooKeeper  
> >> > can  
> >> > >> become unstable due to the number of these nodes.  
> >> > >>  
> >> > >> CURRENT SOLUTIONS  
> >> > >> ===================  
> >> > >> Apache Curator has a workaround solution for this by providing the  
> >> > Reaper  
> >> > >> class which runs in the background looking for orphaned parent  
> nodes  
> >> and  
> >> > >> deleting them. This isn't ideal and it would be better if ZooKeeper  
> >> > >> supported this directly.  
> >> > >>  
> >> > >> PROPOSAL  
> >> > >> =========  
> >> > >> ZOOKEEPER-723 and ZOOKEEPER-834 have been proposed to allow  
> EPHEMERAL  
> >> > >> nodes to contain child nodes. This is not optimum as EPHEMERALs are  
> >> > tied to  
> >> > >> a session and the general use case of parent nodes is for  
> PERSISTENT  
> >> > nodes.  
> >> > >> This proposal adds a new node type, CONTAINER. A CONTAINER node is  
> >> the  
> >> > same  
> >> > >> as a PERSISTENT node with the additional property that when its  
> last  
> >> > child  
> >> > >> is deleted, it is deleted (and CONTAINER nodes recursively up the  
> >> tree  
> >> > are  
> >> > >> deleted if empty).  
> >> > >>  
> >> > >> I have a first pass (untested) straw man proposal open for comment  
> >> here:  
> >> > >>  
> >> > >> https://github.com/apache/zookeeper/pull/28  
> >> > >>  
> >> > >> In order to have minimum impact on existing implementations, a  
> >> container  
> >> > >> node is denoted by having an ephemeralOwner id of Long.MIN_VALUE.  
> >> This  
> >> > is  
> >> > >> pretty hackish, but I think it's the most supportable without  
> causing  
> >> > >> disruption. Also, a container behaves a "little bit" like an  
> >> EPHEMERAL  
> >> > node  
> >> > >> so it isn't totally illogical. Alternatively, a new field could be  
> >> > added to  
> >> > >> STAT.  
> >> > >>  
> >> > >> I look forward to feedback on this. If people think it's worthwhile  
> >> I'll  
> >> > >> open a Jira and work on a Production quality solution. If it's  
> >> > rejected, I'd  
> >> > >> appreciate discussion of an alternate as this is a real need in the  
> >> ZK  
> >> > user  
> >> > >> community.  
> >> > >>  
> >> > >> -Jordan  
> >> > >>  
> >> > >>  
> >> >  
> >>  
> >  
> >  
>  
