Re: [Gluster-users] Is rebalance completely broken on 3.5.3 ?

2015-03-20 Thread Olav Peeters

Hi Alessandro,
what you describe here reminds me of this issue:
http://www.spinics.net/lists/gluster-users/msg20144.html

And now that you mention it, the mess on our cluster could indeed have 
been triggered by an aborted rebalance.
This is a very important clue, since apparently developers were never 
able to reproduce the issue in the lab. I also tried to reproduce the 
issue on a test cluster, but never succeeded.


The example you describe below seems to me relatively easy to fix. A 
rebalance fix-layout would eventually get rid of the sticky-bit files 
(-T) on your bricks 5 and 6, and you could manually remove the 
files created on 10/03 as long as you also remove the corresponding link 
file in the .glusterfs dir on that brick.
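For what it's worth, a minimal sketch of that manual clean-up on one brick 
(not from the original thread; it assumes the usual .glusterfs/<xx>/<yy>/<gfid> 
hard-link layout and uses an example path, so verify and dry-run before 
deleting anything):

brick=/data/glusterfs/home/brick2        # example brick, adjust
f=$brick/seviri/.forward                 # the copy created on 10/03
# read the gfid and rebuild its dashed form
gfid=$(getfattr -n trusted.gfid -e hex "$f" 2>/dev/null | awk -F= '/trusted.gfid/{print substr($2,3)}')
g="${gfid:0:8}-${gfid:8:4}-${gfid:12:4}-${gfid:16:4}-${gfid:20:12}"
link=$brick/.glusterfs/${g:0:2}/${g:2:2}/$g
ls -li "$f" "$link"                      # both names should show the same inode
# rm "$f" "$link"                        # only after double-checking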


I wholeheartedly agree with you that this needs the urgent attention of 
developers before they start working on new features. A mess like this 
in a distributed file system makes the file system unusable for 
production. This should never happen, never! And if it does, a rebalance 
should be able to detect and fix it... fast and efficiently. I also 
agree that the status of a rebalance should be more telling, giving a 
clear idea of how long it would still take to complete. On large clusters 
a rebalance often takes ages and makes the entire cluster extremely 
vulnerable. (Another scary operation is remove-brick, but that is 
another story.)


What I did in our case, and maybe this could help you too as a quick fix 
for the most critical directories, is to rsync to a different storage 
(via a mount point). rsync only copies one instance of each duplicated 
file, and you could separately copy a good version of the problem files 
(in the case below e.g.: -rw-r--r-- 2 seviri users 68 May 26 2014 
/data/glusterfs/home/brick1/seviri/.forward). But probably, as soon as 
you remove the files created on 10/03 (incl. the gluster link file in 
.glusterfs), the listing via your NFS mount will be restored. Try this 
out with a couple of files you have backed up, to be sure.
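As a rough illustration of that quick fix (a hedged sketch only; the client 
mount point and rescue location are made-up examples):

# copy the critical tree off the volume via a client mount; duplicated names collapse into one copy
rsync -av /mnt/gluster/home/seviri/ /mnt/otherstorage/seviri-rescue/
# then overwrite any suspect file in the rescue copy with a known-good version taken straight from a brick
cp -a /data/glusterfs/home/brick1/seviri/.forward /mnt/otherstorage/seviri-rescue/.forward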


Hope this helps!

Cheers,
Olav





On 20/03/15 12:22, Alessandro Ipe wrote:


Hi,

After launching a rebalance on an idle gluster system one week ago, its 
status told me it has scanned more than 23 million files on each of my 6 
bricks. However, without knowing at least the total number of files to 
be scanned, this status is USELESS from an end-user perspective, because 
it does not allow you to know WHEN the rebalance could eventually 
complete (one day, one week, one year or never). From my point of view, 
the total files per brick could be obtained and maintained when 
activating quota, since the whole filesystem has to be crawled...

After one week of being offline and still no clue when the rebalance 
would complete, I decided to stop it... Enormous mistake... It seems 
that rebalance cannot manage not to screw up some files. For example, on 
the only client mounting the gluster system, ls -la /home/seviri returns

ls: cannot access /home/seviri/.forward: Stale NFS file handle

ls: cannot access /home/seviri/.forward: Stale NFS file handle

-? ? ? ? ? ? .forward

-? ? ? ? ? ? .forward

while this file could perfectly well be accessed before (being rebalanced) 
and has not been modified for at least 3 years.

Getting the extended attributes on the various bricks 3, 4, 5, 6 (3-4 
replicate, 5-6 replicate)


Brick 3:

ls -l /data/glusterfs/home/brick?/seviri/.forward

-rw-r--r-- 2 seviri users 68 May 26 2014 
/data/glusterfs/home/brick1/seviri/.forward


-rw-r--r-- 2 seviri users 68 Mar 10 10:22 
/data/glusterfs/home/brick2/seviri/.forward


getfattr -d -m . -e hex /data/glusterfs/home/brick?/seviri/.forward

# file: data/glusterfs/home/brick1/seviri/.forward

trusted.afr.home-client-8=0x

trusted.afr.home-client-9=0x

trusted.gfid=0xc1d268beb17443a39d914de917de123a

# file: data/glusterfs/home/brick2/seviri/.forward

trusted.afr.home-client-10=0x

trusted.afr.home-client-11=0x

trusted.gfid=0x14a1c10eb1474ef2bf72f4c6c64a90ce

trusted.glusterfs.quota.4138a9fa-a453-4b8e-905a-e02cce07d717.contri=0x0200

trusted.pgfid.4138a9fa-a453-4b8e-905a-e02cce07d717=0x0001

Brick 4:

ls -l /data/glusterfs/home/brick?/seviri/.forward

-rw-r--r-- 2 seviri users 68 May 26 2014 
/data/glusterfs/home/brick1/seviri/.forward


-rw-r--r-- 2 seviri users 68 Mar 10 10:22 
/data/glusterfs/home/brick2/seviri/.forward


getfattr -d -m . -e hex /data/glusterfs/home/brick?/seviri/.forward

# file: data/glusterfs/home/brick1/seviri/.forward

trusted.afr.home-client-8=0x

trusted.afr.home-client-9=0x

trusted.gfid=0xc1d268beb17443a39d914de917de123a

# file: data/glusterfs/home/brick2/seviri/.forward

trusted.afr.home-client-10=0x

trusted.afr.home-client-11=0x

trusted.gfid=0x14a1c10eb1474ef2bf72f4c6c64a90ce
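Note, for anyone comparing the listings above: the copies on brick1 carry 
gfid 0xc1d2... while the copies on brick2 carry 0x14a1..., i.e. the two 
replica pairs disagree about the file's identity. A quick hedged way to 
compare just the gfids (same wildcard path as in the listings above):

getfattr -n trusted.gfid -e hex /data/glusterfs/home/brick?/seviri/.forward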


Re: [Gluster-users] Hundreds of duplicate files

2015-02-22 Thread Olav Peeters
trusted.afr.sr_vol01-client-41=0x
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417

My bet would be that I can delete the first two of these files.
For the rest they look identical:

[root@gluster01 ~]# ls -al 
/export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 
/export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd


[root@gluster02 ~]# ls -al 
/export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 
/export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd


[root@gluster02 ~]# ls -al 
/export/brick15gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 
/export/brick15gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd


[root@gluster03 ~]# ls -al 
/export/brick13gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 
/export/brick13gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd


Cheers,
Olav




On 21/02/15 01:37, Olav Peeters wrote:

It looks even worse than I had feared... :-(
This really is a crazy bug.

If I understand you correctly, the only sane pairing of the xattrs is 
that of the two 0-bit files, since this is the full list of bricks:


[root@gluster01 ~]# gluster volume info

Volume Name: sr_vol01
Type: Distributed-Replicate
Volume ID: c6d6147e-2d91-4d98-b8d9-ba05ec7e4ad6
Status: Started
Number of Bricks: 21 x 2 = 42
Transport-type: tcp
Bricks:
Brick1: gluster01:/export/brick1gfs01
Brick2: gluster02:/export/brick1gfs02
Brick3: gluster01:/export/brick4gfs01
Brick4: gluster03:/export/brick4gfs03
Brick5: gluster02:/export/brick4gfs02
Brick6: gluster03:/export/brick1gfs03
Brick7: gluster01:/export/brick2gfs01
Brick8: gluster02:/export/brick2gfs02
Brick9: gluster01:/export/brick5gfs01
Brick10: gluster03:/export/brick5gfs03
Brick11: gluster02:/export/brick5gfs02
Brick12: gluster03:/export/brick2gfs03
Brick13: gluster01:/export/brick3gfs01
Brick14: gluster02:/export/brick3gfs02
Brick15: gluster01:/export/brick6gfs01
Brick16: gluster03:/export/brick6gfs03
Brick17: gluster02:/export/brick6gfs02
Brick18: gluster03:/export/brick3gfs03
Brick19: gluster01:/export/brick8gfs01
Brick20: gluster02:/export/brick8gfs02
Brick21: gluster01:/export/brick9gfs01
Brick22: gluster02:/export/brick9gfs02
Brick23: gluster01:/export/brick10gfs01
Brick24: gluster03:/export/brick10gfs03
Brick25: gluster01:/export/brick11gfs01
Brick26: gluster03:/export/brick11gfs03
Brick27: gluster02:/export/brick10gfs02
Brick28: gluster03:/export/brick8gfs03
Brick29: gluster02:/export/brick11gfs02
Brick30: gluster03:/export/brick9gfs03
Brick31: gluster01:/export/brick12gfs01
Brick32: gluster02:/export/brick12gfs02
Brick33: gluster01:/export/brick13gfs01
Brick34: gluster02:/export/brick13gfs02
Brick35: gluster01:/export/brick14gfs01
Brick36: gluster03:/export/brick14gfs03
Brick37: gluster01:/export/brick15gfs01
Brick38: gluster03:/export/brick15gfs03
Brick39: gluster02:/export/brick14gfs02
Brick40: gluster03:/export/brick12gfs03
Brick41: gluster02:/export/brick15gfs02
Brick42: gluster03:/export/brick13gfs03


The two 0-bit files are on bricks 35 and 36, as the getfattr output 
correctly lists.


Another sane pairing could be this (if the first file did not also 
refer to client-34 and client-35):


[root@gluster01 ~]# getfattr -m . -d -e hex 
/export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

getfattr: Removing leading '/' from absolute path names
# file: 
export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.sr_vol01-client-32=0x
trusted.afr.sr_vol01-client-33=0x
trusted.afr.sr_vol01-client-34=0x
trusted.afr.sr_vol01-client-35=0x00010001
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417

[root@gluster02 ~]# getfattr -m . -d -e hex 
/export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

getfattr: Removing leading '/' from absolute path names
# file: 
export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.sr_vol01-client-32=0x
trusted.afr.sr_vol01-client-33=0x
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417

But why is the security.selinux label different?
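For reference, the two hex values decode to different SELinux labels: 
unconfined_u:object_r:file_t:s0 on gluster01 versus 
system_u:object_r:file_t:s0 on gluster02. A hedged one-liner to decode 
such a value yourself (assuming xxd is installed):

echo 756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000 | xxd -r -p; echo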


You mention

Re: [Gluster-users] Hundreds of duplicate files

2015-02-20 Thread Olav Peeters

Thanks Joe,
for the answers!

I was not clear enough about the set-up, apparently.
The Gluster cluster consists of 3 nodes with 14 bricks each. The bricks 
are formatted as xfs and mounted locally as xfs. There is one volume, type: 
Distributed-Replicate (replica 2). The configuration is such that bricks 
are mirrored on two different nodes.


The NFS mounts which were alive but not used during the reboot when the 
problem started are from clients (2 XenServer machines configured as a 
pool - a shared storage set-up). The comparisons I give below are 
between (other) clients mounting via either glusterfs or NFS. Similar 
problem, with the exception that the first listing (via ls) after a fresh 
mount via NFS actually does find the files with data. A second listing 
only finds the 0-bit file with the same name.


So all the 0-bit files in mode 0644 can be safely removed?

Why do I see three files with the same name (and modification timestamp 
etc.) via either a glusterfs or NFS mount from a client? Deleting one of 
the three will probably not solve the issue either... this seems to me to 
be an indexing issue in the gluster cluster.


How do I get Gluster to replicate the files correctly, only 2 versions 
of the same file, not three, and on two bricks on different machines?


Cheers,
Olav




On 20/02/15 21:51, Joe Julian wrote:


On 02/20/2015 12:21 PM, Olav Peeters wrote:
Let's take one file (3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd) as an 
example...
On the 3 nodes where all bricks are formatted as XFS and mounted in 
/export and 272b2366-dfbf-ad47-2a0f-5d5cc40863e3 is the mounting 
point of a NFS shared storage connection from XenServer machines:
Did I just read this correctly? Your bricks are NFS mounts? i.e., 
GlusterFS Client -> GlusterFS Server -> NFS -> XFS


[root@gluster01 ~]# find 
/export/*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/ -name '300*' -exec ls 
-la {} \;
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 
/export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

Supposedly, this is the actual file.
-rw-r--r--. 2 root root 0 Feb 18 00:51 
/export/brick14gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
This is not a linkfile. Note it's mode 0644. How it got there with 
those permissions would be a matter of history and would require 
information that's probably lost.


[root@gluster02 ~]# find 
/export/*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/ -name '300*' -exec ls 
-la {} \;
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 
/export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd


[root@gluster03 ~]# find 
/export/*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/ -name '300*' -exec ls 
-la {} \;
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 
/export/brick13gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 2 root root 0 Feb 18 00:51 
/export/brick14gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

Same analysis as above.


3 files with information, 2 x a 0-bit file with the same name

Checking the 0-bit files:
[root@gluster01 ~]# getfattr -m . -d -e hex 
/export/brick14gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

getfattr: Removing leading '/' from absolute path names
# file: 
export/brick14gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.sr_vol01-client-34=0x
trusted.afr.sr_vol01-client-35=0x
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417

[root@gluster03 ~]# getfattr -m . -d -e hex 
/export/brick14gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

getfattr: Removing leading '/' from absolute path names
# file: 
export/brick14gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.sr_vol01-client-34=0x
trusted.afr.sr_vol01-client-35=0x
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417

This is not a glusterfs link file since there is no 
trusted.glusterfs.dht.linkto, am I correct?

You are correct.
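For reference, a hedged way to check that directly on a brick 
(trusted.glusterfs.dht.linkto is the xattr a real DHT link file carries; 
a stray empty copy like the ones above will just report that the 
attribute does not exist):

getfattr -n trusted.glusterfs.dht.linkto -e text \
  /export/brick14gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd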


And checking the good files:

# file: 
export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.sr_vol01-client-32=0x
trusted.afr.sr_vol01-client-33=0x
trusted.afr.sr_vol01-client-34=0x
trusted.afr.sr_vol01-client-35=0x00010001
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417

Re: [Gluster-users] Hundreds of duplicate files

2015-02-20 Thread Olav Peeters
 in the current state could do more harm than good?
I launched a second rebalance in the hope that the system would mend 
itself after all...


Thanks a million for your support in this darkest hour of my time as a 
glusterfs user :-)


Cheers,
Olav



On 20/02/15 23:10, Joe Julian wrote:


On 02/20/2015 01:47 PM, Olav Peeters wrote:

Thanks Joe,
for the answers!

I was not clear enough about the set-up, apparently.
The Gluster cluster consists of 3 nodes with 14 bricks each. The 
bricks are formatted as xfs and mounted locally as xfs. There is one 
volume, type: Distributed-Replicate (replica 2). The configuration is 
such that bricks are mirrored on two different nodes.


The NFS mounts which were alive but not used during the reboot when the 
problem started are from clients (2 XenServer machines configured as 
a pool - a shared storage set-up). The comparisons I give below are 
between (other) clients mounting via either glusterfs or NFS. Similar 
problem, with the exception that the first listing (via ls) after a 
fresh mount via NFS actually does find the files with data. A second 
listing only finds the 0-bit file with the same name.


So all the 0bit files in mode 0644 can be safely removed?

Probably? Is it likely that you have any empty files? I don't know.


Why do I see three files with the same name (and modification 
timestamp etc.) via either a glusterfs or NFS mount from a client? 
Deleting one of the three will probably not solve the issue either.. 
this seems to me an indexing issue in the gluster cluster.
Very good question. I don't know. The xattrs tell a strange story that 
I haven't seen before. One legit file shows sr_vol01-client-32 and 33. 
This would be normal, assuming the filename hash would put it on that 
replica pair (we can't tell, since the rebalance has changed the hash 
map). Another file shows sr_vol01-client-32, 33, 34, and 35, with 
pending updates scheduled for 35. I have no idea which brick this is 
(see gluster volume info and map the digits (35) to the bricks, offset 
by 1: client-35 is brick 36). That last one is on 40 and 41.


I don't know how these files all got on different replica sets. My 
speculations include hostname changes, long-running net-split 
conditions with different dht maps (failed rebalances), moved bricks, 
load balancers between client and server, mercury in retrograde (lol)...


How do I get Gluster to replicate the files correctly, only 2 
versions of the same file, not three, and on two bricks on different 
machines?




Identify which replica is correct by using the little python script at 
http://joejulian.name/blog/dht-misses-are-expensive/ to get the hash 
of the filename. Examine the dht map to see which replica pair 
*should* have that hash and remove the others (and their hardlink in 
.glusterfs). There is no 1-liner that's going to do this. I would 
probably script the logic in python, have it print out what it was 
going to do, check that for sanity and, if sane, execute it.
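A hedged sketch of the "examine the dht map" part (not from the thread; it 
only dumps the trusted.glusterfs.dht layout ranges each brick stores for 
the parent directory, so you can see which replica pair should own the 
file's hash):

# run on each gluster server
for d in /export/brick*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3; do
  getfattr -n trusted.glusterfs.dht -e hex "$d" 2>/dev/null
done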


But mostly figure out how Bricks 32 and/or 33 can become 34 and/or 35 
and/or 40 and/or 41. That's the root of the whole problem.



Cheers,
Olav




On 20/02/15 21:51, Joe Julian wrote:


On 02/20/2015 12:21 PM, Olav Peeters wrote:
Let's take one file (3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd) as 
an example...
On the 3 nodes where all bricks are formatted as XFS and mounted in 
/export and 272b2366-dfbf-ad47-2a0f-5d5cc40863e3 is the mounting 
point of a NFS shared storage connection from XenServer machines:
Did I just read this correctly? Your bricks are NFS mounts? i.e., 
GlusterFS Client -> GlusterFS Server -> NFS -> XFS


[root@gluster01 ~]# find 
/export/*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/ -name '300*' -exec 
ls -la {} \;
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 
/export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

Supposedly, this is the actual file.
-rw-r--r--. 2 root root 0 Feb 18 00:51 
/export/brick14gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
This is not a linkfile. Note it's mode 0644. How it got there with 
those permissions would be a matter of history and would require 
information that's probably lost.


[root@gluster02 ~]# find 
/export/*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/ -name '300*' -exec 
ls -la {} \;
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 
/export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd


[root@gluster03 ~]# find 
/export/*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/ -name '300*' -exec 
ls -la {} \;
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 
/export/brick13gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 2 root root 0 Feb 18 00:51 
/export/brick14gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

Same analysis as above.


3 files with information, 2 x a 0-bit file with the same name

Checking the 0-bit files:
[root@gluster01

Re: [Gluster-users] Hundreds of duplicate files

2015-02-20 Thread Olav Peeters
@client ~]# ls -al 
/mnt/glusterfs/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/300*
-rw-r--r--. 1 root root 0 Feb 18 00:51 
/mnt/glusterfs/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 1 root root 0 Feb 18 00:51 
/mnt/glusterfs/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 1 root root 0 Feb 18 00:51 
/mnt/glusterfs/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd




Via NFS (just after performing a umount and mounting the volume again):
[root@client ~]# ls -al /mnt/nfs/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/300*
-rw-r--r--. 1 root root 44332659200 Feb 17 23:55 
/mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 1 root root 44332659200 Feb 17 23:55 
/mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 1 root root 44332659200 Feb 17 23:55 
/mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd


Doing the same list a couple of seconds later:
[root@client ~]# ls -al /mnt/nfs/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/300*
-rw-r--r--. 1 root root 0 Feb 18 00:51 
/mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 1 root root 0 Feb 18 00:51 
/mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 1 root root 0 Feb 18 00:51 
/mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

And again, and again, and again:
[root@client ~]# ls -al /mnt/nfs/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/300*
-rw-r--r--. 1 root root 0 Feb 18 00:51 
/mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 1 root root 0 Feb 18 00:51 
/mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 1 root root 0 Feb 18 00:51 
/mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd


This really seems odd. Why do we get to see the real data file only once?

It seems more and more that this crazy file duplication (and writing of 
sticky-bit files) was actually triggered by rebooting one of the three 
nodes while there was still an active NFS connection (even though there 
was no data exchange at all), since all 0-bit files (of the non 
sticky-bit type) were created at either 00:51 or 00:41, the exact moments 
at which nodes in the cluster were rebooted. This would mean that 
replication with GlusterFS currently creates hardly any redundancy. 
Quite the opposite: if one of the machines goes down, all of your data 
gets seriously disorganised. I am busy configuring a test installation 
to see how this can best be reproduced for a bug report.


Does anyone have a suggestion how best to get rid of the duplicates, or 
rather how to get this mess organised the way it should be?
This is a cluster with millions of files. A rebalance does not fix the 
issue, and neither does a rebalance fix-layout. Since this is a 
replicated volume, all files should be there 2x, not 3x. Can I safely 
just remove all the 0-bit files outside of the .glusterfs directory, 
including the sticky-bit files?


The empty 0-bit files outside of .glusterfs on every brick can probably 
be safely removed like this:
find /export/* -path '*/.glusterfs' -prune -o -type f -size 0 -perm 1000 -exec rm {} \;

no?
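Before actually deleting anything, a dry run along those lines might be 
safer (hedged sketch; it only prints the candidate link files so they can 
be reviewed first):

find /export/* -path '*/.glusterfs' -prune -o -type f -size 0 -perm 1000 -print > /tmp/stale-linkfiles.txt
wc -l /tmp/stale-linkfiles.txt           # sanity-check the count before removing
# xargs -d '\n' rm -v < /tmp/stale-linkfiles.txt   # then feed the reviewed list to rm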

Thanks!

Cheers,
Olav


On 18/02/15 22:10, Olav Peeters wrote:

Thanks Tom and Joe,
for the fast response!

Before I started my upgrade I stopped all clients using the volume and 
stopped all VM's with VHDs on the volume, but I guess, and this may be 
the missing thing to reproduce this in a lab, I did not detach an NFS 
shared storage mount from a XenServer pool to this volume, since this 
is an extremely risky business. I also did not stop the volume. This I 
guess was a bit stupid, but since I did upgrades in the past this way 
without any issues I skipped this step (a really bad habit). I'll make 
amends and file a proper bug report :-). I agree with you Joe, this 
should never happen, even when someone ignores the advice of stopping 
the volume. If it were also necessary to detach shared storage NFS 
connections to a volume, then frankly, glusterfs would be unusable in a 
private cloud. No one can afford downtime of the whole infrastructure 
just for a glusterfs upgrade. Ideally a replicated gluster volume 
should even be able to remain online and in use during (at least a minor 
version) upgrade.


I don't know whether a heal was maybe busy when I started the 
upgrade. I forgot to check. I did check the CPU activity on the 
gluster nodes, which was very low (in the 0.0X range via top), so I 
doubt it. I will add this to the bug report as a suggestion should 
they not be able to reproduce with an open NFS connection.


By the way, is it sufficient to do:
service glusterd stop
service glusterfsd stop
and do a:
ps aux | grep gluster

Re: [Gluster-users] Hundreds of duplicate files

2015-02-18 Thread Olav Peeters

Hi all,
I'm having this problem after upgrading from 3.5.3 to 3.6.2.
At the moment I am still waiting for a heal to finish (on a 31TB volume 
with 42 bricks, replicated over three nodes).


Tom,
how did you remove the duplicates?
with 42 bricks I will not be able to do this manually..
Did a:
find $brick_root -type f -size 0 -perm 1000 -exec /bin/rm {} \;
work for you?

Should this type of thing ideally not be checked and mended by a heal?

Does anyone have an idea yet how this happens in the first place? Can it 
be connected to upgrading?


Cheers,
Olav

On 01/01/15 03:07, tben...@3vgeomatics.com wrote:
No, the files can be read on a newly mounted client! I went ahead and 
deleted all of the link files associated with these duplicates, and 
then remounted the volume. The problem is fixed!

Thanks again for the help, Joe and Vijay.
Tom

- Original Message -
Subject: Re: [Gluster-users] Hundreds of duplicate files
From: Vijay Bellur vbel...@redhat.com
Date: 12/28/14 3:23 am
To: tben...@3vgeomatics.com, gluster-users@gluster.org

On 12/28/2014 01:20 PM, tben...@3vgeomatics.com wrote:
 Hi Vijay,
 Yes the files are still readable from the .glusterfs path.
 There is no explicit error. However, trying to read a text file in
 python simply gives me null characters:

  open('ott_mf_itab').readlines()


['\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00']

 And reading binary files does the same


Is this behavior seen with a freshly mounted client too?

-Vijay

 - Original Message -
 Subject: Re: [Gluster-users] Hundreds of duplicate files
 From: Vijay Bellur vbel...@redhat.com
 Date: 12/27/14 9:57 pm
 To: tben...@3vgeomatics.com, gluster-users@gluster.org

 On 12/28/2014 10:13 AM, tben...@3vgeomatics.com wrote:
  Thanks Joe, I've read your blog post as well as your post
 regarding the
  .glusterfs directory.
  I found some unneeded duplicate files which were not being read
  properly. I then deleted the link file from the brick. This always
  removes the duplicate file from the listing, but the file does not
  always become readable. If I also delete the associated file
in the
  .glusterfs directory on that brick, then some more files become
  readable. However this solution still doesn't work for all files.
  I know the file on the brick is not corrupt as it can be read
 directly
  from the brick directory.

 For files that are not readable from the client, can you check
if the
 file is readable from the .glusterfs/ path?

 What is the specific error that is seen while trying to read one
such
 file from the client?

 Thanks,
 Vijay




Re: [Gluster-users] Hundreds of duplicate files

2015-02-18 Thread Olav Peeters

Thanks Tom and Joe,
for the fast response!

Before I started my upgrade I stopped all clients using the volume and 
stopped all VM's with VHDs on the volume, but I guess, and this may be 
the missing thing to reproduce this in a lab, I did not detach an NFS 
shared storage mount from a XenServer pool to this volume, since this is 
an extremely risky business. I also did not stop the volume. This I 
guess was a bit stupid, but since I did upgrades in the past this way 
without any issues I skipped this step (a really bad habit). I'll make 
amends and file a proper bug report :-). I agree with you Joe, this 
should never happen, even when someone ignores the advice of stopping 
the volume. If it were also necessary to detach shared storage NFS 
connections to a volume, then frankly, glusterfs would be unusable in a 
private cloud. No one can afford downtime of the whole infrastructure just 
for a glusterfs upgrade. Ideally a replicated gluster volume should even 
be able to remain online and in use during (at least a minor version) upgrade.


I don't know whether a heal was maybe busy when I started the upgrade. 
I forgot to check. I did check the CPU activity on the gluster nodes, 
which was very low (in the 0.0X range via top), so I doubt it. I will 
add this to the bug report as a suggestion should they not be able to 
reproduce with an open NFS connection.


By the way, is it sufficient to do:
service glusterd stop
service glusterfsd stop
and do a:
ps aux | grep gluster
to see if everything has stopped, and kill any leftovers should this be 
necessary?
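Something along those lines should do for the check (hedged sketch; 
pgrep/pkill are assumed to be available, and killing brick processes is 
only safe once nothing is using the volume):

service glusterd stop
service glusterfsd stop
pgrep -fl gluster            # list anything gluster-related still running
# pkill -f gluster           # only if the list above is not empty and the volume is no longer in use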


For the fix, do you agree that if I run e.g.:
find /export/* -type f -size 0 -perm 1000 -exec /bin/rm {} \;
on every node, if /export is the location of all my bricks, also in a 
replicated set-up, this will be safe?
No necessary 0-bit files will be deleted in e.g. the .glusterfs of every 
brick?


Thanks for your support!

Cheers,
Olav






On 18/02/15 20:51, Joe Julian wrote:


On 02/18/2015 11:43 AM, tben...@3vgeomatics.com wrote:

Hi Olav,

I have a hunch that our problem was caused by improper unmounting of 
the gluster volume, and have since found that the proper order should 
be: kill all jobs using the volume -> unmount the volume on clients -> 
gluster volume stop -> stop the gluster service (if necessary).
In my case, I wrote a Python script to find duplicate files on the 
mounted volume, then delete the corresponding link files on the 
bricks (making sure to also delete the files in the .glusterfs directory).
However, your find command was also suggested to me and I think it's 
a simpler solution. I believe removing all link files (even ones that 
are not causing duplicates) is fine, since on the next file access 
gluster will do a lookup on all bricks and recreate any link files if 
necessary. Hopefully a gluster expert can chime in on this point as 
I'm not completely sure.


You are correct.

Keep in mind your setup is somewhat different than mine as I have 
only 5 bricks with no replication.

Regards,
Tom

- Original Message -
Subject: Re: [Gluster-users] Hundreds of duplicate files
From: Olav Peeters opeet...@gmail.com
Date: 2/18/15 10:52 am
To: gluster-users@gluster.org, tben...@3vgeomatics.com

Hi all,
I'm having this problem after upgrading from 3.5.3 to 3.6.2.
At the moment I am still waiting for a heal to finish (on a 31TB
volume with 42 bricks, replicated over three nodes).

Tom,
how did you remove the duplicates?
with 42 bricks I will not be able to do this manually..
Did a:
find $brick_root -type f -size 0 -perm 1000 -exec /bin/rm {} \;
work for you?

Should this type of thing ideally not be checked and mended by a
heal?

Does anyone have an idea yet how this happens in the first place?
Can it be connected to upgrading?

Cheers,
Olav

  


On 01/01/15 03:07, tben...@3vgeomatics.com wrote:

No, the files can be read on a newly mounted client! I went
ahead and deleted all of the link files associated with these
duplicates, and then remounted the volume. The problem is fixed!
Thanks again for the help, Joe and Vijay.
Tom

- Original Message -
Subject: Re: [Gluster-users] Hundreds of duplicate files
From: Vijay Bellur vbel...@redhat.com
Date: 12/28/14 3:23 am
To: tben...@3vgeomatics.com, gluster-users@gluster.org

On 12/28/2014 01:20 PM, tben...@3vgeomatics.com wrote:
 Hi Vijay,
 Yes the files are still readable from the .glusterfs path.
 There is no explicit error. However, trying to read a
text file in
 python simply gives me null characters:

  open('ott_mf_itab').readlines()


['\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00

[Gluster-users] problems after gluster volume remove-brick

2015-01-21 Thread Olav Peeters

Hi,
two days ago I started a gluster volume remove-brick on a 
Distributed-Replicate volume (21 x 2 bricks spread over 3 nodes).


I wanted to remove 4 bricks per node which are smaller than the others 
(on each node I have 7 x 2TB disks and 4 x 500GB disks).
I am still on gluster 3.5.2. and I was not aware that using disks of 
different sizes is only supported as of 3.6.x (am I correct?)


I started with 2 paired disks like so:
gluster volume remove-brick VOLNAME node03:/export/brick8node03 
node02:/export/brick10node02 start


I followed the progress (which was very slow):
gluster volume remove-brick volume_name node03:/export/brick8node03 
node02:/export/brick10node02 status
After a day the progress of node03:/export/brick8node03 showed 
'completed'; the other brick remained 'in progress'.


This morning several VM's with vdi's on the volume started showing disk 
errors, and a couple of glusterfs mounts returned a 'disk is full' type of 
error on the volume, which is only ca. 41% filled with data currently.


Via df -h I saw that most of the 500GB disks were indeed 100% full. 
Others were meanwhile nearly empty...

Gluster seems to have gone nuts a bit during rebalancing the data.

I did a:
gluster volume remove-brick VOLNAME node03:/export/brick8node03 
node02:/export/brick10node02 stop

and a:
gluster volume rebalance VOLNAME start

Progress is again very slow and some of the disks/bricks which were ca. 
98% full are now 100% full.
The situation seems to be both getting worse in some cases and slowly 
improving in others, e.g. for another pair of bricks (from 100% to 97%).


There clearly has been some data corruption. Some VM's don't want to 
boot anymore, throwing disk errors.


How do I proceed?
Wait a very long time for the rebalance to complete and hope that the 
data corruption is automatically mended?


Upgrade to 3.6.x and hope that the issues (which might be related to me 
using bricks of different sizes) are resolved and again risk a 
remove-brick operation?


Should I rather do a:
gluster volume rebalance VOLNAME migrate-data start

Should I have done a replace-brick instead of a remove-brick operation 
originally? I thought that replace-brick is becoming obsolete.


Thanks,
Olav





Re: [Gluster-users] problems after gluster volume remove-brick

2015-01-21 Thread Olav Peeters

Adding to my previous mail..
I find a couple of strange errors in the rebalance log 
(/var/log/glusterfs/sr_vol01-rebalance.log)

e.g.:
[2015-01-21 10:00:32.123999] E 
[afr-self-heal-entry.c:1135:afr_sh_entry_impunge_newfile_cbk] 
0-sr_vol01-replicate-11: creation of /some/file/on/the/volume.data on 
sr_vol01-client-23 failed (No space left on device)


Why is the rebalance seemingly not taking into account the space left 
on the available disks?

This is the current situation on this particular node:
[root@gluster03 ~]# df -h
FilesystemSize  Used Avail Use% Mounted on
/dev/mapper/VolGroup-lv_root
   50G  2.4G   45G   5% /
tmpfs 7.8G 0  7.8G   0% /dev/shm
/dev/sda1 485M   95M  365M  21% /boot
/dev/sdb1 1.9T  577G  1.3T  31% /export/brick1gfs03
/dev/sdc1 1.9T  154G  1.7T   9% /export/brick2gfs03
/dev/sdd1 1.9T  413G  1.5T  23% /export/brick3gfs03
/dev/sde1 1.9T  1.5T  417G  78% /export/brick4gfs03
/dev/sdf1 1.9T  1.6T  286G  85% /export/brick5gfs03
/dev/sdg1 1.9T  1.4T  443G  77% /export/brick6gfs03
/dev/sdh1 1.9T   33M  1.9T   1% /export/brick7gfs03
/dev/sdi1 466G   62G  405G  14% /export/brick8gfs03
/dev/sdj1 466G  166G  301G  36% /export/brick9gfs03
/dev/sdk1 466G  466G   20K 100% /export/brick10gfs03
/dev/sdl1 466G  450G   16G  97% /export/brick11gfs03
/dev/sdm1 1.9T  206G  1.7T  12% /export/brick12gfs03
/dev/sdn1 1.9T  306G  1.6T  17% /export/brick13gfs03
/dev/sdo1 1.9T  107G  1.8T   6% /export/brick14gfs03
/dev/sdp1 1.9T  252G  1.6T  14% /export/brick15gfs03

Why are brick10 and brick11 over-utilised when there is plenty of space 
on bricks 6, 14, etc.?

Anyone any idea?

Cheers,
Olav



On 21/01/15 13:18, Olav Peeters wrote:

Hi,
two days ago I started a gluster volume remove-brick on a 
Distributed-Replicate volume (21 x 2 bricks spread over 3 nodes).


I wanted to remove 4 bricks per node which are smaller than the others 
(on each node I have 7 x 2TB disks and 4 x 500GB disks).
I am still on gluster 3.5.2. and I was not aware that using disks of 
different sizes is only supported as of 3.6.x (am I correct?)


I started with 2 paired disks like so:
gluster volume remove-brick VOLNAME node03:/export/brick8node03 
node02:/export/brick10node02 start


I followed the progress (which was very slow):
gluster volume remove-brick volume_name node03:/export/brick8node03 
node02:/export/brick10node02 status
after a day the progress of node03:/export/brick8node03 showed 
completed, the other brick remained in progress


This morning several VM's with vdi's on the volume started showing 
disk errors, and a couple of glusterfs mounts returned a 'disk is full' type 
of error on the volume, which is only ca. 41% filled with data currently.


Via df -h I saw that most of the 500GB disks were indeed 100% full. 
Others were meanwhile nearly empty...

Gluster seems to have gone nuts a bit during rebalancing the data.

I did a:
gluster volume remove-brick VOLNAME node03:/export/brick8node03 
node02:/export/brick10node02 stop

and a:
gluster volume rebalance VOLNAME start

progress is again very slow and some of the disks/bricks which were 
ca. 98% are now 100% full.
The situation seems to be both getting worse in some cases and slowly 
improving e.g. for another pair of bricks (from 100% to 97%).


There clearly has been some data corruption. Some VM's don't want to 
boot anymore, throwing disk errors.


How do I proceed?
Wait a very long time for the rebalance to complete and hope that the 
data corruption is automatically mended?


Upgrade to 3.6.x and hope that the issues (which might be related to 
me using bricks of different sizes) are resolved and again risk a 
remove-brick operation?


Should I rather do a:
gluster volume rebalance VOLNAME migrate-data start

Should I have done a replace-brick instead of a remove-brick operation 
originally? I thought that replace-brick is becoming obsolete.


Thanks,
Olav






Re: [Gluster-users] socket.c:2161:socket_connect_finish (Connection refused)

2014-06-11 Thread Olav Peeters

OK, thanks for the info!
Regards,
Olav

On 11/06/14 08:38, Pranith Kumar Karampuri wrote:


On 06/11/2014 12:03 PM, Olav Peeters wrote:

Thanks Pranith!

I see this at the end of the log files of one of the problem bricks 
(the first two errors are repeated several times):


[2014-06-10 09:55:28.354659] E [rpcsvc.c:1206:rpcsvc_submit_generic] 
0-rpc-service: failed to submit message (XID: 0x103c59, Program: 
GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport 
(tcp.sr_vol01-server)
[2014-06-10 09:55:28.354683] E [server.c:190:server_submit_reply] 
(--/usr/lib64/glusterfs/3.5.0/xlator/performance/io-threads.so(iot_finodelk_cbk+0xb9) 
[0x7f8c8e82f189] 
(--/usr/lib64/glusterfs/3.5.0/xlator/debug/io-stats.so(io_stats_finodelk_cbk+0xed) 
[0x7f8c8e1f22ed] 
(--/usr/lib64/glusterfs/3.5.0/xlator/protocol/server.so(server_finodelk_cbk+0xad) 
[0x7f8c8dfc555d]))) 0-: Reply submission failed

pending frames:
frame : type(0) op(30)
frame : type(0) op(30)
frame : type(0) op(30)
frame : type(0) op(30)
...
...

frame : type(0) op(30)
frame : type(0) op(30)
frame : type(0) op(30)

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2014-06-10 09:55:28
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.5.0
/lib64/libc.so.6(+0x329a0)[0x7f8c94aac9a0]
/usr/lib64/glusterfs/3.5.0/xlator/features/locks.so(grant_blocked_inode_locks+0xc1)[0x7f8c8ea54061] 

/usr/lib64/glusterfs/3.5.0/xlator/features/locks.so(pl_inodelk_client_cleanup+0x249)[0x7f8c8ea54569] 

/usr/lib64/glusterfs/3.5.0/xlator/features/locks.so(+0x6f0a)[0x7f8c8ea49f0a] 


/usr/lib64/libglusterfs.so.0(gf_client_disconnect+0x5d)[0x7f8c964d701d]
/usr/lib64/glusterfs/3.5.0/xlator/protocol/server.so(server_connection_cleanup+0x458)[0x7f8c8dfbda48] 

/usr/lib64/glusterfs/3.5.0/xlator/protocol/server.so(server_rpc_notify+0x183)[0x7f8c8dfb9713] 


/usr/lib64/libgfrpc.so.0(rpcsvc_handle_disconnect+0x105)[0x7f8c96261d35]
/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x1a0)[0x7f8c96263880]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x7f8c96264f98]
/usr/lib64/glusterfs/3.5.0/rpc-transport/socket.so(+0xa9a1)[0x7f8c914c39a1] 


/usr/lib64/libglusterfs.so.0(+0x672f7)[0x7f8c964d92f7]
/usr/sbin/glusterfsd(main+0x564)[0x4075e4]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7f8c94a98d1d]
/usr/sbin/glusterfsd[0x404679]
-

Again no info to be found online about the error.
Any idea?


This is because of bug 1089470, which is fixed in 3.5.1, which will be 
released shortly.


Pranith

Olav





On 11/06/14 04:42, Pranith Kumar Karampuri wrote:

Olav,
 Check logs of the bricks to see why the bricks went down.

Pranith

On 06/11/2014 04:02 AM, Olav Peeters wrote:

Hi,
I upgraded from glusterfs 3.4 to 3.5 about 8 days ago. Everything 
was running fine until this morning. In a fuse mount we were having 
write issues. Creating and deleting files became an issue all of a 
sudden without any new changes to the cluster.


In /var/log/glusterfs/glustershd.log every couple of seconds I'm 
getting this:


[2014-06-10 22:23:52.055128] I [rpc-clnt.c:1685:rpc_clnt_reconfig] 
0-sr_vol01-client-13: changing port to 49156 (from 0)
[2014-06-10 22:23:52.060153] E 
[socket.c:2161:socket_connect_finish] 0-sr_vol01-client-13: 
connection to ip-of-one-of-the-gluster-nodes:49156 failed 
(Connection refused)


# gluster volume status sr_vol01
shows that two bricks of the 18 are offline.

rebalance fails

Iptables was stopped on all nodes

If I cd into the two bricks which are offline according to the 
gluster v status, I can read/write without any problems... The 
disks are clearly fine. They are mounted, they are available.


I cannot find much info online about the error.
Does anyone have an idea what could be wrong?
How can I get the two bricks back online?

Cheers,
Olav



Re: [Gluster-users] socket.c:2161:socket_connect_finish (Connection refused)

2014-06-11 Thread Olav Peeters

Thanks a lot, Pranith!
All seems back to normal again.
Looking forward to the release of 3.5.1 !
Cheers,
Olav


On 11/06/14 09:30, Pranith Kumar Karampuri wrote:

hey
Just do gluster volume start volname force and things 
should be back to normal


Pranith

On 06/11/2014 12:56 PM, Olav Peeters wrote:

Pranith,
how could I move all data from the two problem bricks temporarily 
until the release of 3.5.1?

Like this?
# gluster volume replace-brick VOLNAME BRICK NEW-BRICK start
Will this work if the bricks are offline?
Or is there some other way to get the bricks back online manually?
Would it help to do all fuse connections via NFS until after the fix?
Cheers,
Olav

On 11/06/14 08:44, Olav Peeters wrote:

OK, thanks for the info!
Regards,
Olav

On 11/06/14 08:38, Pranith Kumar Karampuri wrote:


On 06/11/2014 12:03 PM, Olav Peeters wrote:

Thanks Pranith!

I see this at the end of the log files of one of the problem 
bricks (the first two errors are repeated several times):


[2014-06-10 09:55:28.354659] E 
[rpcsvc.c:1206:rpcsvc_submit_generic] 0-rpc-service: failed to 
submit message (XID: 0x103c59, Program: GlusterFS 3.3, ProgVers: 
330, Proc: 30) to rpc-transport (tcp.sr_vol01-server)
[2014-06-10 09:55:28.354683] E [server.c:190:server_submit_reply] 
(--/usr/lib64/glusterfs/3.5.0/xlator/performance/io-threads.so(iot_finodelk_cbk+0xb9) 
[0x7f8c8e82f189] 
(--/usr/lib64/glusterfs/3.5.0/xlator/debug/io-stats.so(io_stats_finodelk_cbk+0xed) 
[0x7f8c8e1f22ed] 
(--/usr/lib64/glusterfs/3.5.0/xlator/protocol/server.so(server_finodelk_cbk+0xad) 
[0x7f8c8dfc555d]))) 0-: Reply submission failed

pending frames:
frame : type(0) op(30)
frame : type(0) op(30)
frame : type(0) op(30)
frame : type(0) op(30)
...
...

frame : type(0) op(30)
frame : type(0) op(30)
frame : type(0) op(30)

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2014-06-10 09:55:28
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.5.0
/lib64/libc.so.6(+0x329a0)[0x7f8c94aac9a0]
/usr/lib64/glusterfs/3.5.0/xlator/features/locks.so(grant_blocked_inode_locks+0xc1)[0x7f8c8ea54061] 

/usr/lib64/glusterfs/3.5.0/xlator/features/locks.so(pl_inodelk_client_cleanup+0x249)[0x7f8c8ea54569] 

/usr/lib64/glusterfs/3.5.0/xlator/features/locks.so(+0x6f0a)[0x7f8c8ea49f0a] 

/usr/lib64/libglusterfs.so.0(gf_client_disconnect+0x5d)[0x7f8c964d701d] 

/usr/lib64/glusterfs/3.5.0/xlator/protocol/server.so(server_connection_cleanup+0x458)[0x7f8c8dfbda48] 

/usr/lib64/glusterfs/3.5.0/xlator/protocol/server.so(server_rpc_notify+0x183)[0x7f8c8dfb9713] 

/usr/lib64/libgfrpc.so.0(rpcsvc_handle_disconnect+0x105)[0x7f8c96261d35] 


/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x1a0)[0x7f8c96263880]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x7f8c96264f98]
/usr/lib64/glusterfs/3.5.0/rpc-transport/socket.so(+0xa9a1)[0x7f8c914c39a1] 


/usr/lib64/libglusterfs.so.0(+0x672f7)[0x7f8c964d92f7]
/usr/sbin/glusterfsd(main+0x564)[0x4075e4]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7f8c94a98d1d]
/usr/sbin/glusterfsd[0x404679]
-

Again no info to be found online about the error.
Any idea?


This is because of bug 1089470, which is fixed in 3.5.1, which will 
be released shortly.


Pranith

Olav





On 11/06/14 04:42, Pranith Kumar Karampuri wrote:

Olav,
 Check logs of the bricks to see why the bricks went down.

Pranith

On 06/11/2014 04:02 AM, Olav Peeters wrote:

Hi,
I upgraded from glusterfs 3.4 to 3.5 about 8 days ago. 
Everything was running fine until this morning. In a fuse mount 
we were having write issues. Creating and deleting files became 
an issue all of a sudden without any new changes to the cluster.


In /var/log/glusterfs/glustershd.log every couple of seconds I'm 
getting this:


[2014-06-10 22:23:52.055128] I 
[rpc-clnt.c:1685:rpc_clnt_reconfig] 0-sr_vol01-client-13: 
changing port to 49156 (from 0)
[2014-06-10 22:23:52.060153] E 
[socket.c:2161:socket_connect_finish] 0-sr_vol01-client-13: 
connection to ip-of-one-of-the-gluster-nodes:49156 failed 
(Connection refused)


# gluster volume status sr_vol01
shows that two bricks of the 18 are offline.

rebalance fails

Iptables was stopped on all nodes

If I cd into the two bricks which are offline according to the 
gluster v status, I can read/write without any problems... The 
disks are clearly fine. They are mounted, they are available.


I cannot find much info online about the error.
Does anyone have an idea what could be wrong?
How can I get the two bricks back online?

Cheers,
Olav



[Gluster-users] socket.c:2161:socket_connect_finish (Connection refused)

2014-06-10 Thread Olav Peeters

Hi,
I upgraded from glusterfs 3.4 to 3.5 about 8 days ago. Everything was 
running fine until this morning. In a fuse mount we were having write 
issues. Creating and deleting files became an issue all of a sudden 
without any new changes to the cluster.


In /var/log/glusterfs/glustershd.log every couple of seconds I'm getting 
this:


[2014-06-10 22:23:52.055128] I [rpc-clnt.c:1685:rpc_clnt_reconfig] 
0-sr_vol01-client-13: changing port to 49156 (from 0)
[2014-06-10 22:23:52.060153] E [socket.c:2161:socket_connect_finish] 
0-sr_vol01-client-13: connection to ip-of-one-of-the-gluster-nodes:49156 
failed (Connection refused)


# gluster volume status sr_vol01
shows that two bricks of the 18 are offline.

rebalance fails

Iptables was stopped on all nodes

If I cd into the two bricks which are offline according to the gluster v 
status, I can read/write without any problems... The disks are clearly 
fine. They are mounted, they are available.


I cannot find much info online about the error.
Does anyone have an idea what could be wrong?
How can I get the two bricks back online?

Cheers,
Olav



Re: [Gluster-users] Planing Update gluster 3.4 to gluster 3.5 on centos 6.4

2014-05-15 Thread Olav Peeters

Thanks Franco,
for the feed-back!
Did you stop gluster before updating? Or were there maybe no active 
reads/writes since it was a test system?

Cheers,
Olav

On 15/05/14 02:36, Franco Broi wrote:

On Wed, 2014-05-14 at 12:31 +0200, Olav Peeters wrote:

Hi,
from what I read here:
http://www.gluster.org/community/documentation/index.php/Upgrade_to_3.5

... if you are on 3.4.0 AND have NO quota configured, it should be
safe to just replace a version
specific /etc/yum.repos.d/glusterfs-epel.repo with e.g.:
http://download.gluster.org/pub/gluster/glusterfs/3.4/LATEST/EPEL.repo/glusterfs-epel.repo
(thus referring to LATEST and not e.g.
http://download.gluster.org/pub/gluster/glusterfs/3.4/3.4.0/EPEL.repo)
and just do a:
yum upgrade
to upgrade both your system and glusterfs together one cluster node at
a time (if you are on CentOS or Fedora), right?
Has anyone successfully done it this way yet on CentOS 6.4?

Yes. Nothing bad happened but it was a test system, I've yet to do it
for real on our production system.


Cheers,
Olav



On 14/05/14 09:01, Humble Devassy Chirammal wrote:


Please refer  #
http://www.gluster.org/community/documentation/index.php/Upgrade_to_3.5


--Humble



On Wed, May 14, 2014 at 12:09 PM, Daniel Müller
muel...@tropenklinik.de wrote:
 Hello to all,
 I am planning on updating gluster 3.4 to the recent version 3.5. Is
 there any issue
 concerning my replicated vols? Or can I simply yum
 install...
 
 Greetings

 Daniel
 
 
 EDV Daniel Müller
 
 Leitung EDV

 Tropenklinik Paul-Lechler-Krankenhaus
 Paul-Lechler-Str. 24
 72076 Tübingen
 Tel.: 07071/206-463, Fax: 07071/206-499
 eMail: muel...@tropenklinik.de
 Internet: www.tropenklinik.de
 
 
 
 
 

Re: [Gluster-users] Planing Update gluster 3.4 to gluster 3.5 on centos 6.4

2014-05-14 Thread Olav Peeters

Hi,
from what I read here:
http://www.gluster.org/community/documentation/index.php/Upgrade_to_3.5

... if you are on 3.4.0 AND have NO quota configured, it should be safe 
to just replace a version specific /etc/yum.repos.d/glusterfs-epel.repo 
with e.g.:

http://download.gluster.org/pub/gluster/glusterfs/3.4/LATEST/EPEL.repo/glusterfs-epel.repo
(thus referring to LATEST and not e.g. 
http://download.gluster.org/pub/gluster/glusterfs/3.4/3.4.0/EPEL.repo)

and just do a:
yum upgrade
to upgrade both your system and glusterfs together one cluster node at a 
time (if you are on CentOS or Fedora), right?
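Spelled out, the procedure described above would look roughly like this on 
each node, one node at a time (a hedged sketch using the URLs quoted above, 
not a tested recipe):

cd /etc/yum.repos.d
mv glusterfs-epel.repo glusterfs-epel.repo.bak
curl -O http://download.gluster.org/pub/gluster/glusterfs/3.4/LATEST/EPEL.repo/glusterfs-epel.repo
yum upgrade
# check gluster volume status / peer status before moving on to the next node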

Has anyone successfully done it this way yet on CentOS 6.4?

Cheers,
Olav



On 14/05/14 09:01, Humble Devassy Chirammal wrote:
Please refer  # 
http://www.gluster.org/community/documentation/index.php/Upgrade_to_3.5


--Humble


On Wed, May 14, 2014 at 12:09 PM, Daniel Müller 
muel...@tropenklinik.de wrote:


Hello to all,
I am planning on updating gluster 3.4 to the recent version 3.5. Is there
any issue concerning my replicated vols? Or can I simply yum install...

Greetings
Daniel


EDV Daniel Müller

Leitung EDV
Tropenklinik Paul-Lechler-Krankenhaus
Paul-Lechler-Str. 24
72076 Tübingen
Tel.: 07071/206-463, Fax: 07071/206-499
 eMail: muel...@tropenklinik.de
 Internet: www.tropenklinik.de




