Update:

Here is an initial proposal for solving the races between (fd migration with
lock migration) and (client disconnect with lock migration).

Please share your suggestions and comments.

Fuse fd migration:
-------------------
    part1: Fuse fd migration without fd association
        - How it works currently:
            - Fuse initiates the fd migration task from graph1 to graph2.
            - As part of this, a new fd is opened on graph2.
        - Locks are currently associated with both the fd and the client
          (connection id). With the fd association out of the picture, only
          the client information on the locks needs to be updated (a minimal
          sketch follows).
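
To make the "just a new client update" part concrete, here is a minimal
sketch in C of what that could look like. The names here (plock,
client_uid, update_client_on_fd_migration) are hypothetical stand-ins, not
the real posix-locks translator types:

#include <string.h>
#include <sys/types.h>

/* Hypothetical, simplified lock record -- not the real posix_lock_t. */
struct plock {
        unsigned long long  lk_owner;       /* filled by the kernel/VFS   */
        char                client_uid[64]; /* connection id of the owner */
        off_t               fl_start;
        off_t               fl_end;
        short               fl_type;        /* F_RDLCK / F_WRLCK          */
        struct plock       *next;
};

/* With the fd out of the lock's identity, migrating a fuse fd from
 * graph1 to graph2 only has to re-point the locks at the new
 * connection id; nothing fd-specific needs fixing up. */
static void
update_client_on_fd_migration (struct plock *locks,
                               const char *old_uid, const char *new_uid)
{
        struct plock *l = NULL;

        for (l = locks; l != NULL; l = l->next) {
                if (strcmp (l->client_uid, old_uid) == 0) {
                        strncpy (l->client_uid, new_uid,
                                 sizeof (l->client_uid) - 1);
                        l->client_uid[sizeof (l->client_uid) - 1] = '\0';
                }
        }
}

The point being: once the fd is out of the lock's identity, fd migration
reduces to a pure bookkeeping update on the server side.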

    part2: fd migration interaction with lock migration
       - As part of "fuse-fd-lock-migration" we do two operations:
           a. getxattr (lockinfo): get the old fd number on the old graph.
           b. setxattr (lockinfo): set (new fd number + new client info) on
              the new graph through the new fd.
       - So the meta-lock acts as a hint of lock migration for any
         lock-related operation (fd migration, server_connection_cleanup,
         new lk requests, flush, etc.).
       - getxattr need not worry about the meta-lock's presence at all. Once
         it has read the necessary information, the bulk of the job is left
         to setxattr.
       - setxattr (see the sketch after this list):
           - case 1: is the meta-lock present?
               - if YES, wait till meta-unlock is executed on the lock, then
                 unwind the call with EREMOTE. It is then the dht
                 translator's responsibility to look up the file to figure
                 out its location and redirect the setxattr, so the
                 destination will have the new graph's client-id.
               - if NO, set the new client information, which will be
                 migrated by rebalance.
           - case 2: what if setxattr has missed the (meta-lock + unlock)
             window?
               - Meta-unlock, upon successful lock migration, will set a
                 REPLAY flag, which indicates that the data as well as the
                 locks have been migrated.
               - So unwind with EREMOTE and leave the redirection part to
                 dht.
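
To summarise the above decision flow, a hedged sketch follows.
pl_inode_state, wait_for_meta_unlock and set_new_client_on_locks are
made-up stand-ins for whatever the posix-locks translator will actually
use:

#include <errno.h>

/* Hypothetical stand-in for the relevant posix-locks state -- not the
 * real pl_inode_t. */
struct pl_inode_state {
        int metalk_held; /* meta-lock currently held by rebalance      */
        int replay_set;  /* REPLAY flag: data + locks already migrated */
};

/* Stubs standing in for the real wait/update machinery. */
static void wait_for_meta_unlock (struct pl_inode_state *pl) { (void) pl; }
static void set_new_client_on_locks (struct pl_inode_state *pl,
                                     const char *uid) { (void) pl; (void) uid; }

/* setxattr (lockinfo) decision flow as described above: returns 0 on
 * success, -EREMOTE when dht must look the file up again and redirect. */
static int
handle_lockinfo_setxattr (struct pl_inode_state *pl, const char *new_uid)
{
        if (pl->replay_set)         /* case 2: missed (meta-lock + unlock) */
                return -EREMOTE;

        if (pl->metalk_held) {      /* case 1, YES: migration in flight    */
                wait_for_meta_unlock (pl);
                return -EREMOTE;
        }

        /* case 1, NO: just record the new client information; rebalance
         * will migrate the lock along with the data. */
        set_new_client_on_locks (pl, new_uid);
        return 0;
}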

<Question: Until fd migration happens, we operate through the old fd,
right?>

Client talking to the source disconnects during lock migration:
-------------------------------------------------------------
- There are several phases of data + lock migration. The following describes
a disconnect around each of these phases.

phase-1: disconnect before data migration
- Server cleanup will flush the locks. Hence, there are no locks left to
migrate.

phase-2: disconnect before the meta-lock reaches the server
- Same case as phase-1.

phase-3: disconnect just after the meta-lock
- server_cleanup, on seeing the meta-lock, waits till meta-unlock.
- It then flushes the locks on the source.
- Incoming ops (write/lk) will fail with ENOTCONN.
- fd_close on ENOTCONN will refresh its inode to check whether the file has
migrated elsewhere, and will flush the locks (a rough sketch of this cleanup
path follows).
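
A rough sketch of that cleanup path, repeating the hypothetical types from
the setxattr sketch so this snippet stands alone (flush_locks_of_client is
equally made up):

/* Same hypothetical state as in the setxattr sketch above. */
struct pl_inode_state {
        int metalk_held;
        int replay_set;
};

static void wait_for_meta_unlock (struct pl_inode_state *pl) { (void) pl; }
static void flush_locks_of_client (struct pl_inode_state *pl,
                                   const char *uid) { (void) pl; (void) uid; }

static void
server_connection_cleanup_locks (struct pl_inode_state *pl,
                                 const char *client_uid)
{
        /* phase-3: the disconnect raced with lock migration, so do not
         * flush while the meta-lock is held -- wait for meta-unlock. */
        if (pl->metalk_held)
                wait_for_meta_unlock (pl);

        /* Now the source's locks can be flushed safely.  Any later
         * write/lk from this client fails with ENOTCONN, and fd_close
         * on ENOTCONN refreshes the inode to find the file's new home. */
        flush_locks_of_client (pl, client_uid);
}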



Thanks,
Susant



----- Original Message -----
> From: "Susant Palai" <spa...@redhat.com>
> To: "Gluster Devel" <gluster-devel@gluster.org>
> Sent: Thursday, 3 March, 2016 3:09:06 PM
> Subject: Re: [Gluster-devel] Posix lock migration design
> 
> Update on Lock migration design.
> 
> For lock migration we are planning to get rid of the fd association with
> the lock. Rather, we will base our lock operations on the lk-owner (the
> equivalent of a pid), which is the POSIX standard. The fd association does
> not suit the needs of lock migration, as a migrated fd will not be valid
> on the destination, whereas working with the lk-owner is much more
> flexible, since it does not change across servers.
> 
> The current posix lock infrastructure associates the fd with the lock for
> the following operations, which we are planning to replace with
> lk-owner-based handling:
> 
> 1) lock cleanup for protocol client disconnects based on fd
> 
> 2) release call on fd
> 
> 3) fuse fd migration (triggered by a graph switch)
> 
> The new design is being worked out and I will update here once it is ready.
> 
> Please post your suggestions/comments here :)
> 
> Thanks,
> Susant
> 
> ----- Original Message -----
> > From: "Raghavendra G" <raghaven...@gluster.com>
> > To: "Susant Palai" <spa...@redhat.com>
> > Cc: "Raghavendra Gowdappa" <rgowd...@redhat.com>, "Gluster Devel"
> > <gluster-devel@gluster.org>
> > Sent: Tuesday, 1 March, 2016 11:40:54 AM
> > Subject: Re: [Gluster-devel] Posix lock migration design
> > 
> > On Mon, Feb 29, 2016 at 12:52 PM, Susant Palai <spa...@redhat.com> wrote:
> > 
> > > Hi Raghavendra,
> > >    I have a question on the design.
> > >
> > >    Currently, in case of a client disconnection, pl_flush cleans up
> > > the locks associated with the fd created from that client. Per the
> > > design, rebalance will migrate the locks to the new destination. Now,
> > > in case the client gets disconnected from the destination brick, how
> > > is it supposed to clean up the locks, as rebalance/brick have no idea
> > > whether the client has opened an fd on the destination and what that
> > > fd is?
> > >
> > 
> > >    So the question is how to associate the client's fd with the locks
> > > on the destination.
> > >
> > 
> > We don't use fds to clean up the locks during flush. We use the
> > lk-owner, which doesn't change across migration. Note that the lk-owner
> > for posix-locks is filled in by the vfs/kernel on the machine where the
> > glusterfs mount lives.
> > 
> > <pl_flush>
> >         pthread_mutex_lock (&pl_inode->mutex);
> >         {
> >                 __delete_locks_of_owner (pl_inode, frame->root->client,
> >                                          &frame->root->lk_owner);
> >         }
> >         pthread_mutex_unlock (&pl_inode->mutex);
> > </pl_flush>
> > 
> > 
> > > Thanks,
> > > Susant
> > >
> > > ----- Original Message -----
> > > From: "Susant Palai" <spa...@redhat.com>
> > > To: "Gluster Devel" <gluster-devel@gluster.org>
> > > Sent: Friday, 29 January, 2016 3:15:14 PM
> > > Subject: [Gluster-devel] Posix lock migration design
> > >
> > > Hi,
> > >    Here, [1]
> > >
> > > https://docs.google.com/document/d/17SZAKxx5mhM-cY5hdE4qRq9icmFqy3LBaTdewofOXYc/edit?usp=sharing
> > > is a Google document with the proposal for "POSIX_LOCK_MIGRATION". The
> > > problem statement and design are explained in the document itself.
> > >
> > >   Requesting the devel list to go through the document and
> > > comment/analyze/suggest, to take the thoughts forward (either on the
> > > google doc itself or here on the devel list).
> > >
> > >
> > > Thanks,
> > > Susant
> > >
> > 
> > 
> > 
> > --
> > Raghavendra G
> > 
> 