I've filed a bug at [1] to track the issue in afr.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1341429
On Tue, May 31, 2016 at 2:17 PM, Raghavendra G <raghaven...@gluster.com> wrote:

>
> On Tue, May 31, 2016 at 12:37 PM, Xavier Hernandez <xhernan...@datalab.es> wrote:
>
>> Hi,
>>
>> On 31/05/16 07:05, Raghavendra Gowdappa wrote:
>>
>>> +gluster-devel, +Xavi
>>>
>>> Hi all,
>>>
>>> The context is [1], where bricks do pre-operation checks before doing a
>>> fop and proceed with fop only if pre-op check is successful.
>>>
>>> @Xavi,
>>>
>>> We need your inputs on behavior of EC subvolumes as well.
>>>
>>
>> If I understand correctly, EC shouldn't have any problems here.
>>
>> EC sends the mkdir request to all subvolumes that are currently
>> considered "good" and tries to combine the answers. Answers that match in
>> return code, errno (if necessary) and xdata contents (except for some
>> special xattrs that are ignored for combination purposes), are grouped.
>>
>> Then it takes the group with more members/answers. If that group has a
>> minimum size of #bricks - redundancy, it is considered the good answer.
>> Otherwise EIO is returned because bricks are in an inconsistent state.
>>
>> If there's any answer in another group, it's considered bad and gets
>> marked so that self-heal will repair it using the good information from the
>> majority of bricks.
>>
>> xdata is combined and returned even if return code is -1.
>>
>> Is that enough to cover the needed behavior ?
>>
>
> Thanks Xavi. That's sufficient for the feature in question. One of the
> main cases I was interested in was what would be the behaviour if mkdir
> succeeds on "bad" subvolume and fails on "good" subvolume. Since you never
> wind mkdir to "bad" subvolume(s), this situation never arises.
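The grouping rule Xavi describes above could be sketched roughly as follows. This is a toy Python model, not the EC translator's code: `combine_answers`, `IGNORED_XATTRS` and the `(ret, err, xdata)` tuple layout are names made up for illustration. The thread does not say which xattrs are ignored or what gets marked when EIO is returned, so the sketch uses a placeholder key and marks nothing in that case.

```python
import errno
from collections import defaultdict

# Hypothetical placeholder: the thread does not name the xattrs EC ignores.
IGNORED_XATTRS = frozenset({"example.ignored.xattr"})


def combine_answers(answers, bricks, redundancy):
    """Group per-brick answers and pick the largest matching group.

    answers maps a brick index to (ret, err, xdata), where xdata is a dict
    with hashable values. Returns (ret, err, xdata, bad_bricks).
    """
    if not answers:
        return -1, errno.EIO, {}, []

    groups = defaultdict(list)
    for brick, (ret, err, xdata) in answers.items():
        comparable = frozenset(
            (k, v) for k, v in xdata.items() if k not in IGNORED_XATTRS
        )
        # errno is only compared when the call failed.
        key = (ret, err if ret < 0 else 0, comparable)
        groups[key].append(brick)

    # Take the group with the most members.
    (ret, err, comparable), members = max(
        groups.items(), key=lambda item: len(item[1])
    )

    # The winning group needs at least #bricks - redundancy answers.
    if len(members) < bricks - redundancy:
        return -1, errno.EIO, {}, []

    # Answers outside the winning group are the ones self-heal would repair.
    bad_bricks = sorted(b for b in answers if b not in members)
    # xdata is returned whether ret is 0 or -1.
    return ret, err, dict(comparable), bad_bricks
```

With 6 bricks and redundancy 2, five matching successes and one EEXIST failure give a successful result with the odd brick listed as bad. A 3/3 split falls below the minimum of 4 matching answers and gives EIO.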
>
>>
>> Xavi
>>
>>> [1] http://review.gluster.org/13885
>>>
>>> regards,
>>> Raghavendra
>>>
>>> ----- Original Message -----
>>>
>>>> From: "Pranith Kumar Karampuri" <pkara...@redhat.com>
>>>> To: "Raghavendra Gowdappa" <rgowd...@redhat.com>
>>>> Cc: "team-quine-afr" <team-quine-...@redhat.com>, "rhs-zteam" <rhs-zt...@redhat.com>
>>>> Sent: Tuesday, May 31, 2016 10:22:49 AM
>>>> Subject: Re: dht mkdir preop check, afr and (non-)readable afr subvols
>>>>
>>>> I think you should start a discussion on gluster-devel so that Xavi gets a
>>>> chance to respond on the mails as well.
>>>>
>>>> On Tue, May 31, 2016 at 10:21 AM, Raghavendra Gowdappa <rgowd...@redhat.com>
>>>> wrote:
>>>>
>>>>> Also note that we have plans to extend this pre-op check to all dentry
>>>>> operations which also depend on the parent layout. So, the discussion
>>>>> needs to cover all dentry operations like:
>>>>>
>>>>> 1. create
>>>>> 2. mkdir
>>>>> 3. rmdir
>>>>> 4. mknod
>>>>> 5. symlink
>>>>> 6. unlink
>>>>> 7. rename
>>>>>
>>>>> We also plan to have similar checks in the lock codepath for directories
>>>>> too (planning to use hashed-subvolume as lock-subvolume for directories).
>>>>> So, more fops :)
>>>>> 8. lk (posix locks)
>>>>> 9. inodelk
>>>>> 10. entrylk
>>>>>
>>>>> regards,
>>>>> Raghavendra
>>>>>
>>>>> ----- Original Message -----
>>>>>
>>>>>> From: "Raghavendra Gowdappa" <rgowd...@redhat.com>
>>>>>> To: "team-quine-afr" <team-quine-...@redhat.com>
>>>>>> Cc: "rhs-zteam" <rhs-zt...@redhat.com>
>>>>>> Sent: Tuesday, May 31, 2016 10:15:04 AM
>>>>>> Subject: dht mkdir preop check, afr and (non-)readable afr subvols
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I have some queries related to the behavior of afr_mkdir with respect to
>>>>>> readable subvols.
>>>>>>
>>>>>> 1. While winding mkdir to subvols, does afr check whether the subvolume
>>>>>> is good/readable? Or does it wind to all subvols irrespective of whether
>>>>>> a subvol is good/bad?
>>>>>> In the latter case, what if
>>>>>> a. mkdir succeeds on non-readable subvolume
>>>>>> b. fails on readable subvolume
>>>>>>
>>>>>> What is the result reported to higher layers in the above scenario? If
>>>>>> mkdir fails, is it cleaned up on non-readable subvolume where it failed?
>>>>>>
>>>>>> I am interested in this case as dht-preop check relies on layout xattrs
>>>>>> and I assume layout xattrs in particular (and all xattrs in general) are
>>>>>> guaranteed to be correct only on a readable subvolume of afr. So, in
>>>>>> essence we shouldn't be winding down mkdir on non-readable subvols, as
>>>>>> whatever decision the brick makes as part of pre-op check is inherently
>>>>>> flawed.
>>>>>>
>>>>>> regards,
>>>>>> Raghavendra
>>>>>>
>>>> --
>>>> Pranith
>>>>
>> _______________________________________________
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>
>
> --
> Raghavendra G
>

--
Raghavendra G
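To make the concern in the original question concrete, here is a toy Python model of cases (a) and (b): a replica with a stale layout xattr passes a pre-op check that the up-to-date replica rejects. Everything in it is an assumption for illustration (`ToyBrick`, the string layout values, the -1/0 return codes); it is not GlusterFS code and does not claim to show how afr or the brick actually implement the check.

```python
class ToyBrick:
    """Hypothetical stand-in for one brick of a replica; not GlusterFS code."""

    def __init__(self, parent_layout):
        # Stands in for the on-disk layout xattr of the parent directory.
        self.parent_layout = parent_layout
        self.entries = set()

    def mkdir(self, name, client_layout):
        # Pre-op check: proceed only if the client's view of the parent
        # layout matches what this brick has on disk.
        if client_layout != self.parent_layout:
            return -1
        self.entries.add(name)
        return 0


readable = ToyBrick(parent_layout="layout-v2")      # up to date
non_readable = ToyBrick(parent_layout="layout-v1")  # stale, not yet healed

# A client still holding the stale layout winds mkdir to both replicas.
results = {
    "readable": readable.mkdir("dir", "layout-v1"),
    "non_readable": non_readable.mkdir("dir", "layout-v1"),
}
```

In this model the readable brick correctly rejects the stale client while the non-readable brick accepts it and creates the directory, which is why a decision made on a non-readable subvolume cannot be trusted.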