Re: [Gluster-users] Fixing heal / split-brain when the entry is a directory

2014-03-05 Thread Shawn Heisey
> From my short Gluster experience, I noticed that fix-layout, when run
> after adding new bricks, re-creates the directories on the new bricks.
> Could you maybe try a fix-layout, possibly after you change the trusted
> xattrs? Or try some combination of the two.
>
> Also, I assume you ran
> gluster vol heal <VOLNAME> full

There are tens of millions of files taking up about 60TB of space. A
heal full would take days, maybe weeks. Half of my bricks are at 95
percent capacity. I need to get the rebalance going again as soon as
possible ... We had a near-catastrophic rebalance failure in November on
3.3.1, upgraded a few days ago, and tried again. These heal troubles are
the result.

Thanks,
Shawn


___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Fixing heal / split-brain when the entry is a directory

2014-03-04 Thread Shawn Heisey

On 3/4/2014 5:20 PM, Viktor Villafuerte wrote:
> You may have tried this already.. but what if you leave both trusted.afr
> entries, change only one to '0' and then self-heal?


The lack of a Reply-To header on some lists always trips me up.  I end 
up just replying to the sender.


Setting one entry to 0 and leaving the other as non-zero did not work 
either.  Note that I changed the same xattr on both copies.


# file: bricks/d00v00/mdfs/REDACTED/mdfs/RNI/rniphotos/docs/030
trusted.afr.mdfs-client-0=0x0046
trusted.afr.mdfs-client-1=0x
trusted.gfid=0x505e4a1f512042c68b4c2dd38c497534
trusted.glusterfs.dht=0x00012ffd3ffb

Doing a 'stat' on the fuse mount changes the entry in 'heal info' from a 
gfid to the real location, which answers another question I had.
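
For anyone scripting the stat step, here is a minimal Python sketch that maps a path on the brick to the same path under the FUSE mount, so it can be stat'ed through the client. The brick root /bricks/d00v00/mdfs comes from the paths in this thread; the mount point /mnt/mdfs and the function name are assumptions for illustration only.

```python
import os

def brick_to_mount_path(brick_root, mount_root, brick_path):
    """Translate a path under the brick root into the matching path under the mount."""
    rel = os.path.relpath(brick_path, brick_root)
    if rel == os.pardir or rel.startswith(os.pardir + os.sep):
        raise ValueError("path is not under the brick root")
    return os.path.normpath(os.path.join(mount_root, rel))

if __name__ == "__main__":
    # /mnt/mdfs is a hypothetical mount point, not taken from the thread.
    target = brick_to_mount_path(
        "/bricks/d00v00/mdfs",
        "/mnt/mdfs",
        "/bricks/d00v00/mdfs/REDACTED/mdfs/RNI/rniphotos/docs/030",
    )
    print(target)
    # os.stat(target) would then perform the lookup through the FUSE client.
```

The os.stat call is left commented out because it only makes sense on a host where the volume is actually mounted.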


Thanks,
Shawn



Re: [Gluster-users] Fixing heal / split-brain when the entry is a directory

2014-03-04 Thread Viktor Villafuerte
You may have tried this already.. but what if you leave both trusted.afr
entries, change only one to '0' and then self-heal?

v


On Tue 04 Mar 2014 16:46:14, Shawn Heisey wrote:
> I have a bunch of heal problems on a volume.  For this email, I
> won't speculate about what caused them - that's a whole other
> discussion that I may have at some point in the future.  This will
> concentrate on fixing the immediate problems so I can move forward.
> 
> Thanks to JoeJulian's blog posts and talking to him in the IRC
> channel, I have a pretty good handle on how to fix entries in the
> 'heal $vol info' output ... but only if the entry given refers to a
> real *file* or a gluster link file.  Almost all of the entries in my
> report are directories, and I have no idea how to fix them.
> 
> All I have for these entries is gfid values, so I first locate the
> entry in .glusterfs.  In this case, it's a symlink.
> 
> [root@slc01dfs001a ~]# stat /bricks/d00v00/mdfs/.glusterfs/fe/93/fe93de6e-5b91-4193-a31c-786726886ff1
>   File: `/bricks/d00v00/mdfs/.glusterfs/fe/93/fe93de6e-5b91-4193-a31c-786726886ff1' -> `../../a7/30/a730505c-84f3-407f-ac27-d45465a17f40/331'
>   Size: 52  Blocks: 0  IO Block: 4096   symbolic link
> Device: fd06h/64774d  Inode: 2152112572  Links: 1
> Access: (0777/lrwxrwxrwx)  Uid: (0/root)   Gid: (0/root)
> Access: 2013-06-21 03:17:27.740839811 -0600
> Modify: 2013-06-21 03:17:27.740839811 -0600
> Change: 2013-06-21 03:17:27.740839811 -0600
> 
> To figure out what the actual directory name is, I use readlink:
> 
> [root@slc01dfs001a ~]# readlink -f /bricks/d00v00/mdfs/.glusterfs/fe/93/fe93de6e-5b91-4193-a31c-786726886ff1
> /bricks/d00v00/mdfs/REDACTED/mdfs/RTR/rtrphotosfour/docs/331
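
The gfid-to-path lookup described above can be sketched in Python. This is a minimal sketch, assuming the layout visible in the stat output (.glusterfs/<first two hex digits>/<next two>/<gfid as a dashed UUID>); the function name is made up for illustration.

```python
import os

def gfid_to_backend_path(brick_root, gfid_hex):
    """Build the .glusterfs path for a gfid given as hex (0x prefix and dashes are optional)."""
    digits = gfid_hex.lower().replace("-", "")
    if digits.startswith("0x"):
        digits = digits[2:]
    if len(digits) != 32 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError("gfid must be 32 hex digits")
    # Re-insert the dashes in the usual 8-4-4-4-12 UUID grouping.
    uuid_text = "-".join(
        [digits[0:8], digits[8:12], digits[12:16], digits[16:20], digits[20:32]]
    )
    return os.path.join(brick_root, ".glusterfs", digits[0:2], digits[2:4], uuid_text)

if __name__ == "__main__":
    path = gfid_to_backend_path(
        "/bricks/d00v00/mdfs", "0xfe93de6e5b914193a31c786726886ff1"
    )
    print(path)
    # On the brick host, os.path.realpath(path) follows the symlink much like
    # 'readlink -f' does for a directory entry.
```

With the trusted.gfid value from this message, the function returns the same .glusterfs path that was passed to stat and readlink.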
> 
> I can get the extended attributes. I know from talking to Joe Julian
> that the following output means both copies think the other needs
> healing.  If I compare 'ls -al' output from the brick directory on
> both copies, they are the same.
> 
> [root@slc01dfs001a ~]# getfattr -m . -d -e hex /bricks/d00v00/mdfs/REDACTED/mdfs/RTR/rtrphotosfour/docs/331
> getfattr: Removing leading '/' from absolute path names
> # file: bricks/d00v00/mdfs/REDACTED/mdfs/RTR/rtrphotosfour/docs/331
> trusted.afr.mdfs-client-0=0x006e
> trusted.afr.mdfs-client-1=0x006e
> trusted.gfid=0xfe93de6e5b914193a31c786726886ff1
> trusted.glusterfs.dht=0x00013ffc4ffa
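
To read the trusted.afr values, it can help to split them into their counters. The sketch below assumes the usual 12-byte AFR changelog layout: three big-endian 32-bit pending counters for data, metadata, and entry operations, in that order. The values as they appear in this archive look shortened, so the example input is a made-up full-length value, not one copied from the output above; the function name is also hypothetical.

```python
import struct

def parse_afr_changelog(hex_value):
    """Split a 12-byte trusted.afr.* value into data/metadata/entry pending counters."""
    text = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(text)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}

if __name__ == "__main__":
    # Hypothetical value: only the entry counter is non-zero.
    example = "0x" "00000000" "00000000" "0000006e"
    print(parse_afr_changelog(example))
```

For a directory, a non-zero entry counter on both copies would match the "each side thinks the other needs healing" situation described above.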
> 
> Now for the big question ... what do I do, in a step-by-step format,
> to eliminate this entry from the heal info output?  On another
> entry, I tried four things: deleting the second trusted.afr entry on
> both copies, deleting them both, deleting one and setting the other
> to zero, and setting them both to zero.  In between each of these, I
> did a stat on the directory via the FUSE mount.  None of them changed
> the heal info output.
> 
> Thanks,
> Shawn

-- 
Regards

Viktor Villafuerte
Optus Internet Engineering
t: 02 808-25265