Three files with information, plus two 0-byte files with the same name.
Checking the 0-byte files:
[root@gluster01 ~]# getfattr -m . -d -e hex /export/brick14gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
getfattr: Removing leading '/' from absolute path names
# file: export/brick14gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.sr_vol01-client-34=0x000000000000000000000000
trusted.afr.sr_vol01-client-35=0x000000000000000000000000
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417
[root@gluster03 ~]# getfattr -m . -d -e hex /export/brick14gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
getfattr: Removing leading '/' from absolute path names
# file: export/brick14gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.sr_vol01-client-34=0x000000000000000000000000
trusted.afr.sr_vol01-client-35=0x000000000000000000000000
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417
This is not a glusterfs link file since there is no
"trusted.glusterfs.dht.linkto", am I correct?
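For cross-checking the copies, the trusted.gfid value shown above can also be mapped to the file's hardlink under the brick's .glusterfs tree. A small shell sketch of the mapping (the layout rule, to my understanding, is .glusterfs/<first two hex chars>/<next two>/<uuid>):

```shell
# Sketch: map a trusted.gfid xattr value to the brick's .glusterfs
# hardlink path. The hex string is the one from the getfattr output
# above; the .glusterfs/xx/yy/uuid layout rule is my understanding.
hex=aefd184508414a8f8408f1ab8aa7a417   # trusted.gfid minus the 0x
uuid=$(printf %s "$hex" | sed -E 's/^(.{8})(.{4})(.{4})(.{4})(.{12})$/\1-\2-\3-\4-\5/')
d1=$(printf %s "$hex" | cut -c1-2)
d2=$(printf %s "$hex" | cut -c3-4)
echo ".glusterfs/$d1/$d2/$uuid"
# -> .glusterfs/ae/fd/aefd1845-0841-4a8f-8408-f1ab8aa7a417
```

Comparing that .glusterfs entry (inode, size, link count) against the named file on each brick helps tell real data apart from the 0-byte copies.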
And checking the "good" files:
# file: export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.sr_vol01-client-32=0x000000000000000000000000
trusted.afr.sr_vol01-client-33=0x000000000000000000000000
trusted.afr.sr_vol01-client-34=0x000000000000000000000000
trusted.afr.sr_vol01-client-35=0x000000010000000100000000
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417
[root@gluster02 ~]# getfattr -m . -d -e hex /export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
getfattr: Removing leading '/' from absolute path names
# file: export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.sr_vol01-client-32=0x000000000000000000000000
trusted.afr.sr_vol01-client-33=0x000000000000000000000000
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417
[root@gluster03 ~]# getfattr -m . -d -e hex /export/brick13gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
getfattr: Removing leading '/' from absolute path names
# file: export/brick13gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.sr_vol01-client-40=0x000000000000000000000000
trusted.afr.sr_vol01-client-41=0x000000000000000000000000
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417
Seen from a client via a glusterfs mount:
[root@client ~]# ls -al /mnt/glusterfs/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/300*
-rw-r--r--. 1 root root 0 Feb 18 00:51 /mnt/glusterfs/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 1 root root 0 Feb 18 00:51 /mnt/glusterfs/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 1 root root 0 Feb 18 00:51 /mnt/glusterfs/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
Via NFS (just after unmounting and remounting the volume):
[root@client ~]# ls -al /mnt/nfs/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/300*
-rw-r--r--. 1 root root 44332659200 Feb 17 23:55 /mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 1 root root 44332659200 Feb 17 23:55 /mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 1 root root 44332659200 Feb 17 23:55 /mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
Doing the same list a couple of seconds later:
[root@client ~]# ls -al /mnt/nfs/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/300*
-rw-r--r--. 1 root root 0 Feb 18 00:51 /mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 1 root root 0 Feb 18 00:51 /mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 1 root root 0 Feb 18 00:51 /mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
And again, and again, and again:
[root@client ~]# ls -al /mnt/nfs/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/300*
-rw-r--r--. 1 root root 0 Feb 18 00:51 /mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 1 root root 0 Feb 18 00:51 /mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 1 root root 0 Feb 18 00:51 /mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
This really seems odd. Why do we get to see the real data file only once?
It seems more and more that this crazy file duplication (and writing
of sticky-bit files) was actually triggered by rebooting one of the
three nodes while there was still an active NFS connection (even with
no data exchange at all), since all 0-byte files (of the non-sticky-bit
type) were created at either 00:51 or 00:41, the exact moments at which
one of the three nodes in the cluster was rebooted.
This would mean that replication with GlusterFS currently creates
hardly any redundancy. Quite the opposite: if one of the machines goes
down, all of your data gets seriously disorganised. I am busy
configuring a test installation to see how this can best be reproduced
for a bug report..
Does anyone have a suggestion on how best to get rid of the duplicates,
or rather how to get this mess organised the way it should be?
This is a cluster with millions of files. A rebalance does not fix the
issue, and neither does a rebalance fix-layout. Since this is a
replicated volume, all files should be there 2x, not 3x. Can I safely
just remove all the 0-byte files outside of the .glusterfs directory,
including the sticky-bit files?
The empty 0-byte files outside of .glusterfs on every brick I can
probably remove safely like this, no?

find /export/* -path "*/.glusterfs" -prune -o -type f -size 0 -perm 1000 -exec rm {} \;
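Before running it on real bricks, the command can be rehearsed on a throwaway mock layout. Everything below is made up for the demo (a temp directory stands in for /export, "brick01" for a brick); only the find expression matches the one proposed above:

```shell
# Sketch: rehearse the cleanup on a mock brick layout before touching
# real data. All paths and names here are illustrative only.
export_dir=$(mktemp -d)
mkdir -p "$export_dir/brick01/.glusterfs/ae/fd"
touch "$export_dir/brick01/stale.vhd"
chmod 1000 "$export_dir/brick01/stale.vhd"              # 0-byte sticky-bit file: should go
echo payload > "$export_dir/brick01/good.vhd"           # real data: must survive
touch "$export_dir/brick01/.glusterfs/ae/fd/keep"
chmod 1000 "$export_dir/brick01/.glusterfs/ae/fd/keep"  # inside .glusterfs: must survive
# dry run first (-print instead of -exec rm) to see what would be deleted
find "$export_dir"/* -path "*/.glusterfs" -prune -o -type f -size 0 -perm 1000 -print
# then for real
find "$export_dir"/* -path "*/.glusterfs" -prune -o -type f -size 0 -perm 1000 -exec rm {} \;
ls "$export_dir/brick01"                   # good.vhd remains, stale.vhd is gone
ls "$export_dir/brick01/.glusterfs/ae/fd"  # keep remains
rm -rf "$export_dir"
```

Running the dry-run (-print) variant on one real brick first and eyeballing the list seems a cheap safeguard before the -exec rm pass.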
Thanks!
Cheers,
Olav
On 18/02/15 22:10, Olav Peeters wrote:
Thanks Tom and Joe,
for the fast response!
Before I started my upgrade I stopped all clients using the volume
and stopped all VMs with VHDs on the volume, but I guess (and this
may be the missing piece for reproducing this in a lab) I did not
detach the NFS shared-storage mount from the XenServer pool to this
volume, since that is an extremely risky business. I also did not
stop the volume. That, I guess, was a bit stupid, but since I had done
upgrades this way in the past without any issues I skipped this
step (a really bad habit). I'll make amends and file a proper bug
report :-). I agree with you Joe, this should never happen, even
when someone ignores the advice to stop the volume. If it were also
necessary to detach shared-storage NFS connections to a volume, then
frankly glusterfs would be unusable in a private cloud. No one can
afford downtime of the whole infrastructure just for a glusterfs
upgrade. Ideally a replicated gluster volume should even be able to
remain online and in use during (at least a minor-version) upgrade.
I don't know whether a heal was perhaps busy when I started the
upgrade. I forgot to check. I did check the CPU activity on the
gluster nodes, which was very low (in the 0.0X range via top), so I
doubt it. I will add this to the bug report as a suggestion, should
they not be able to reproduce it with an open NFS connection.
By the way, is it sufficient to do:
service glusterd stop
service glusterfsd stop
and then a:
ps aux | grep gluster
to see if everything has stopped, and kill any leftovers should this
be necessary?
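The leftover check can also be sketched with pgrep, assuming the daemon process names are glusterd, glusterfsd and glusterfs (exact-name matching avoids the grep line matching itself):

```shell
# Sketch: check for leftover gluster daemons by exact process name.
# Names assumed: glusterd (mgmt), glusterfsd (brick), glusterfs (client/NFS).
found=0
for name in glusterd glusterfsd glusterfs; do
  if pgrep -x "$name" >/dev/null 2>&1; then
    echo "still running: $name"
    found=1
  fi
done
[ "$found" -eq 0 ] && echo "no gluster processes left"
```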
For the fix, do you agree that if I run e.g.:
find /export/* -type f -size 0 -perm 1000 -exec /bin/rm {} \;
on every node (/export being the location of all my bricks), this
will be safe, also in a replicated set-up?
No necessary 0-byte files will be deleted in e.g. the .glusterfs of
every brick?
Thanks for your support!
Cheers,
Olav
On 18/02/15 20:51, Joe Julian wrote:
On 02/18/2015 11:43 AM, tben...@3vgeomatics.com wrote:
Hi Olav,
I have a hunch that our problem was caused by improper unmounting
of the gluster volume, and have since found that the proper order
should be: kill all jobs using volume -> unmount volume on
clients -> gluster volume stop -> stop gluster service (if necessary)
In my case, I wrote a Python script to find duplicate files on
the mounted volume, then delete the corresponding link files on
the bricks (making sure to also delete files in the .glusterfs
directory)
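The detection step of that script can be sketched in shell as well: on an affected mount the same name appears several times in one directory listing, so sort | uniq -d extracts the duplicated names. The printf below merely simulates `ls -1` output on such a mount (file names are made up):

```shell
# Sketch: find names that appear more than once in a directory listing.
# The printf simulates `ls -1 <dir>` on a mount showing duplicates;
# sort | uniq -d prints each duplicated name exactly once.
printf '%s\n' 3009f448.vhd 3009f448.vhd other.vhd | sort | uniq -d
# -> 3009f448.vhd
```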
However, your find command was also suggested to me and I think
it's a simpler solution. I believe removing all link files (even
ones that are not causing duplicates) is fine, since on the next
file access gluster will do a lookup on all bricks and recreate any
link files as necessary. Hopefully a gluster expert can chime in
on this point as I'm not completely sure.
You are correct.
Keep in mind your setup is somewhat different than mine as I have
only 5 bricks with no replication.
Regards,
Tom
--------- Original Message ---------
Subject: Re: [Gluster-users] Hundreds of duplicate files
From: "Olav Peeters" <opeet...@gmail.com>
Date: 2/18/15 10:52 am
To: gluster-users@gluster.org, tben...@3vgeomatics.com
Hi all,
I'm having this problem after upgrading from 3.5.3 to 3.6.2.
At the moment I am still waiting for a heal to finish (on a
31TB volume with 42 bricks, replicated over three nodes).
Tom,
how did you remove the duplicates?
With 42 bricks I will not be able to do this manually..
Did a:
find $brick_root -type f -size 0 -perm 1000 -exec /bin/rm {} \;
work for you?
Should this type of thing ideally not be checked and mended
by a heal?
Does anyone have an idea yet how this happens in the first
place? Can it be connected to upgrading?
Cheers,
Olav
On 01/01/15 03:07, tben...@3vgeomatics.com wrote:
No, the files can be read on a newly mounted client! I
went ahead and deleted all of the link files associated
with these duplicates, and then remounted the volume. The
problem is fixed!
Thanks again for the help, Joe and Vijay.
Tom
--------- Original Message ---------
Subject: Re: [Gluster-users] Hundreds of duplicate files
From: "Vijay Bellur" <vbel...@redhat.com>
Date: 12/28/14 3:23 am
To: tben...@3vgeomatics.com, gluster-users@gluster.org
On 12/28/2014 01:20 PM, tben...@3vgeomatics.com wrote:
> Hi Vijay,
> Yes the files are still readable from the
.glusterfs path.
> There is no explicit error. However, trying to read
a text file in
> python simply gives me null characters:
>
> >>> open('ott_mf_itab').readlines()
>
['\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00']
>
> And reading binary files does the same
>
Is this behavior seen with a freshly mounted client too?
-Vijay
> --------- Original Message ---------
> Subject: Re: [Gluster-users] Hundreds of duplicate
files
> From: "Vijay Bellur" <vbel...@redhat.com>
> Date: 12/27/14 9:57 pm
> To: tben...@3vgeomatics.com, gluster-users@gluster.org
>
> On 12/28/2014 10:13 AM, tben...@3vgeomatics.com wrote:
> > Thanks Joe, I've read your blog post as well as
your post
> regarding the
> > .glusterfs directory.
> > I found some unneeded duplicate files which were
not being read
> > properly. I then deleted the link file from the
brick. This always
> > removes the duplicate file from the listing, but
the file does not
> > always become readable. If I also delete the
associated file in the
> > .glusterfs directory on that brick, then some
more files become
> > readable. However this solution still doesn't
work for all files.
> > I know the file on the brick is not corrupt as it
can be read
> directly
> > from the brick directory.
>
> For files that are not readable from the client,
can you check if the
> file is readable from the .glusterfs/ path?
>
> What is the specific error that is seen while
trying to read one such
> file from the client?
>
> Thanks,
> Vijay
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>