I just checked the EC code and it looks okay. All entry fops also update both the metadata and the data parts of the xattr.
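For reference, here is a rough sketch of the answer-combining rule Xavi describes in the quoted mail: group the answers that match, take the largest group, and accept it only if it has at least #bricks - redundancy members. This is not the actual EC code. Every name in it (combine_answers, answer_key, IGNORED_XATTRS and the xattr key) is made up for illustration, and the combining of xdata into the returned answer is left out.

```python
from collections import defaultdict
import errno

# Hypothetical placeholder: the real set of xattrs that EC ignores when
# comparing answers is not given in this thread.
IGNORED_XATTRS = frozenset({"hypothetical.ignored.xattr"})


def answer_key(answer):
    """Build the grouping key: return code, errno (on failure only), filtered xdata."""
    ret, err, xdata = answer
    kept = tuple(sorted((k, v) for k, v in xdata.items() if k not in IGNORED_XATTRS))
    return (ret, err if ret < 0 else 0, kept)


def combine_answers(answers, bricks, redundancy):
    """Combine per-brick answers; answers maps brick index -> (ret, errno, xdata).

    Returns (ret, errno, bad_bricks), where bad_bricks are the bricks whose
    answer fell outside the winning group and would be marked for self-heal.
    xdata values must be hashable (e.g. bytes) in this sketch.
    """
    groups = defaultdict(list)
    for brick, answer in answers.items():
        groups[answer_key(answer)].append(brick)

    if not groups:
        return (-1, errno.EIO, [])

    # Pick the group with the most members.
    best_key, best_bricks = max(groups.items(), key=lambda item: len(item[1]))

    # Too few matching answers: bricks are inconsistent, so report EIO.
    # Assumption of this sketch: nothing is marked bad in that case.
    if len(best_bricks) < bricks - redundancy:
        return (-1, errno.EIO, [])

    bad = sorted(b for b in answers if b not in best_bricks)
    return (best_key[0], best_key[1], bad)
```

With 3 bricks and redundancy 1, two matching successful answers are enough to form the good group, and the third brick would be the one marked for self-heal.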
On Tue, May 31, 2016 at 12:37 PM, Xavier Hernandez <xhernan...@datalab.es> wrote:

> Hi,
>
> On 31/05/16 07:05, Raghavendra Gowdappa wrote:
>
>> +gluster-devel, +Xavi
>>
>> Hi all,
>>
>> The context is [1], where bricks do pre-operation checks before doing a
>> fop and proceed with fop only if pre-op check is successful.
>>
>> @Xavi,
>>
>> We need your inputs on behavior of EC subvolumes as well.
>>
>
> If I understand correctly, EC shouldn't have any problems here.
>
> EC sends the mkdir request to all subvolumes that are currently considered
> "good" and tries to combine the answers. Answers that match in return code,
> errno (if necessary) and xdata contents (except for some special xattrs
> that are ignored for combination purposes), are grouped.
>
> Then it takes the group with more members/answers. If that group has a
> minimum size of #bricks - redundancy, it is considered the good answer.
> Otherwise EIO is returned because bricks are in an inconsistent state.
>
> If there's any answer in another group, it's considered bad and gets
> marked so that self-heal will repair it using the good information from the
> majority of bricks.
>
> xdata is combined and returned even if return code is -1.
>
> Is that enough to cover the needed behavior ?
>
> Xavi
>
>> [1] http://review.gluster.org/13885
>>
>> regards,
>> Raghavendra
>>
>> ----- Original Message -----
>>
>>> From: "Pranith Kumar Karampuri" <pkara...@redhat.com>
>>> To: "Raghavendra Gowdappa" <rgowd...@redhat.com>
>>> Cc: "team-quine-afr" <team-quine-...@redhat.com>, "rhs-zteam" <rhs-zt...@redhat.com>
>>> Sent: Tuesday, May 31, 2016 10:22:49 AM
>>> Subject: Re: dht mkdir preop check, afr and (non-)readable afr subvols
>>>
>>> I think you should start a discussion on gluster-devel so that Xavi gets a
>>> chance to respond on the mails as well.
>>>
>>> On Tue, May 31, 2016 at 10:21 AM, Raghavendra Gowdappa <rgowd...@redhat.com> wrote:
>>>
>>>> Also note that we've plans to extend this pre-op check to all dentry
>>>> operations which also depend parent layout. So, the discussion need to
>>>> cover all dentry operations like:
>>>>
>>>> 1. create
>>>> 2. mkdir
>>>> 3. rmdir
>>>> 4. mknod
>>>> 5. symlink
>>>> 6. unlink
>>>> 7. rename
>>>>
>>>> We also plan to have similar checks in lock codepath for directories too
>>>> (planning to use hashed-subvolume as lock-subvolume for directories). So,
>>>> more fops :)
>>>> 8. lk (posix locks)
>>>> 9. inodelk
>>>> 10. entrylk
>>>>
>>>> regards,
>>>> Raghavendra
>>>>
>>>> ----- Original Message -----
>>>>
>>>>> From: "Raghavendra Gowdappa" <rgowd...@redhat.com>
>>>>> To: "team-quine-afr" <team-quine-...@redhat.com>
>>>>> Cc: "rhs-zteam" <rhs-zt...@redhat.com>
>>>>> Sent: Tuesday, May 31, 2016 10:15:04 AM
>>>>> Subject: dht mkdir preop check, afr and (non-)readable afr subvols
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I have some queries related to the behavior of afr_mkdir with respect to
>>>>> readable subvols.
>>>>>
>>>>> 1. While winding mkdir to subvols does afr check whether the subvolume is
>>>>> good/readable? Or does it wind to all subvols irrespective of whether a
>>>>> subvol is good/bad? In the latter case, what if
>>>>> a. mkdir succeeds on non-readable subvolume
>>>>> b. fails on readable subvolume
>>>>>
>>>>> What is the result reported to higher layers in the above scenario? If
>>>>> mkdir is failed, is it cleaned up on non-readable subvolume where it
>>>>> failed?
>>>>>
>>>>> I am interested in this case as dht-preop check relies on layout xattrs
>>>>> and I assume layout xattrs in particular (and all xattrs in general) are
>>>>> guaranteed to be correct only on a readable subvolume of afr. So, in
>>>>> essence we shouldn't be winding down mkdir on non-readable subvols as
>>>>> whatever the decision brick makes as part of pre-op check is inherently
>>>>> flawed.
>>>>>
>>>>> regards,
>>>>> Raghavendra
>>>>>
>>> --
>>> Pranith

--
Pranith
_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel