To answer my own question: I now see why it's important to create directories specifically, and not files. Directories are created on every replica pair regardless of the hash, so if a host is down, the changes get marked as pending for that host. With files it's different: there the name of the file *is* important, because it determines the single pair the file is placed on.
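To make this concrete, here is a minimal sketch of the marker sequence from the guide, run against the FUSE mount of the healthy node while its partner is down. The mount point /mnt/r2 is the guide's example, and the exact trusted.afr.* attribute name depends on the volume name and client index, so the last command is illustrative rather than exact:

    # On the surviving node, against the FUSE mount of the volume
    mkdir /mnt/r2/some-nonexistent-dir     # directory operations hit every replica pair
    rmdir /mnt/r2/some-nonexistent-dir
    setfattr -n trusted.non-existent-key -v abc /mnt/r2
    setfattr -x trusted.non-existent-key /mnt/r2

    # The healthy brick root should now show non-zero pending counters for its
    # down partner, e.g. something like trusted.afr.r2-client-1 (name varies per volume)
    getfattr -d -m . -e hex /path/to/healthy/brick

Because these operations land on the root directory of every replica pair, there is no need to pick a name whose DHT hash happens to target the affected pair.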
On Mon, Oct 10, 2016 at 3:47 PM, Sergei Gerasenko <sgerasenk...@gmail.com> wrote:

> The guide here:
> https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Managing%20Volumes/#replace-faulty-brick
> suggests running the following while the partner host is down:
>
>     mkdir /mnt/r2/<name-of-nonexistent-dir>
>     rmdir /mnt/r2/<name-of-nonexistent-dir>
>     setfattr -n trusted.non-existent-key -v abc /mnt/r2
>     setfattr -x trusted.non-existent-key /mnt/r2
>
> That should set an extended attribute on the healthy replica partner
> indicating that there are pending changes for the partner host.
>
> Remembering that we're in a distributed, replicated situation, I don't
> quite understand this, because the created directory could end up on any
> pair, not necessarily the one we're fixing. I think the name of the
> directory would have to be chosen such that its DHT value lands the file
> on the affected brick (the healthy one of the two replica hosts). That's
> not easy to do.
>
> Does somebody have any suggestions?
>
>
> On Thu, Oct 6, 2016 at 10:47 PM, Sergei Gerasenko <sgerasenk...@gmail.com> wrote:
>
>> Step 10 isn't really necessary. The changes should probably be monitored
>> under the brick directory.
>>
>> On Thu, Oct 6, 2016 at 10:25 PM, Sergei Gerasenko <sgerasenk...@gmail.com> wrote:
>>
>>> I've simulated the problem on 4 VMs in a distributed replicated setup
>>> with a replica factor of 2. I've repeatedly torn down and brought up a VM
>>> from a snapshot in each of my tests.
>>>
>>> What has worked so far is this:
>>>
>>> 1. Make a copy of /var/lib/glusterd from the affected machine and save
>>>    it elsewhere.
>>> 2. Configure your new machine (in my case I reverted to a VM snapshot).
>>>    Assign the same IP and hostname!
>>> 3. Install gluster.
>>> 4. Stop the daemons if they are running.
>>> 5. Nuke the /var/lib/glusterd directory and replace it with the copy
>>>    saved in step 1.
>>> 6. Create the brick directory.
>>> 7. Get the extended volume-id attribute from a healthy node like so:
>>>    getfattr -e base64 -n trusted.glusterfs.volume-id /data/brick_dir
>>> 8. Apply that volume-id extended attribute like so:
>>>    setfattr -n trusted.glusterfs.volume-id -v 'the_value_you_got_in_7==' /data/brick_dir
>>> 9. Start the daemons.
>>> 10. FUSE mount the gluster volume through the daemons running locally.
>>>     So /etc/fstab would contain something like:
>>>     localhost:/gluster_volume /mnt/gluster glusterfs _netdev,defaults 0 0
>>> 11. On the healthy partner machine, with another FUSE mount point to the
>>>     same volume, do something like: find /mnt/fuse | xargs stat
>>> 12. Step 8 will make files appear under the mount point on the new box,
>>>     but the files are not going to be physically in the brick directory
>>>     yet. See 10.
>>> 13. Run the heal command from the same host where you ran find. That
>>>     will finally sync the files to the brick. Run the heal info command
>>>     periodically and the number of files being healed should eventually
>>>     go down to 0.
>>>
>>> That's my experience with the VMs today.
>>>
>>> On Wed, Oct 5, 2016 at 4:46 PM, Joe Julian <j...@julianfamily.org> wrote:
>>>
>>>> What I always do is just shut it down, repair (or replace) the brick,
>>>> then start it up again with "... start $volname force".
>>>>
>>>> On October 5, 2016 11:27:36 PM GMT+02:00, Sergei Gerasenko <sgerasenk...@gmail.com> wrote:
>>>>>
>>>>> Hi, sorry if this has been asked before, but the documentation is a
>>>>> bit conflicting in various sources on what to do exactly.
>>>>>
>>>>> I have a 6-node distributed replicated cluster with a replica factor
>>>>> of 2, so it's 3 pairs of servers. I need to remove a server from one
>>>>> of those replica sets, rebuild it and put it back in.
>>>>>
>>>>> What's the tried and proven sequence of steps for this? Any pointers
>>>>> would be very useful.
>>>>>
>>>>> Thanks!
>>>>> Sergei
>>>>
>>>> --
>>>> Sent from my Android device with K-9 Mail. Please excuse my brevity.
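For quick reference, here is a consolidated sketch of the brick-level commands behind steps 7 through 13 of the procedure quoted above. The brick path /data/brick_dir and the volume name gluster_volume are the example values from the thread; the service start, mount, and heal invocations are my assumptions about the intended commands, so treat this as a sketch to adapt rather than a verified script:

    # Step 7: on a healthy node, read the volume-id from its brick
    getfattr -e base64 -n trusted.glusterfs.volume-id /data/brick_dir

    # Steps 6 and 8: on the rebuilt node, recreate the brick directory and
    # stamp it with the same volume-id
    mkdir -p /data/brick_dir
    setfattr -n trusted.glusterfs.volume-id -v 'VALUE_FROM_STEP_7==' /data/brick_dir

    # Step 9: start the daemons (init commands vary by distribution)
    systemctl start glusterd

    # Step 10: FUSE-mount the volume locally on the rebuilt node
    mount -t glusterfs localhost:/gluster_volume /mnt/gluster

    # Step 11: on the healthy partner's FUSE mount, stat every path to queue heals
    find /mnt/fuse | xargs stat > /dev/null

    # Step 13: kick off the heal and watch the pending count drain to zero
    gluster volume heal gluster_volume full
    gluster volume heal gluster_volume info

Joe's approach quoted above amounts to repairing the brick in place and then restarting the brick process with something like:

    gluster volume start gluster_volume force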
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users