Re: [Gluster-users] Is rebalance completely broken on 3.5.3 ?
Hi Alessandro,

What you describe here reminds me of this issue: http://www.spinics.net/lists/gluster-users/msg20144.html

And now that you mention it, the mess on our cluster could indeed have been triggered by an aborted rebalance. This is a very important clue, since apparently the developers were never able to reproduce the issue in the lab. I also tried to reproduce the issue on a test cluster, but never succeeded.

The example you describe below seems to me relatively easy to fix. A rebalance fix-layout would eventually get rid of the sticky-bit files (-T) on your bricks 5 and 6, and you could manually remove the files created on 10/03, as long as you also remove the corresponding link file in the .glusterfs dir on that brick.

I wholeheartedly agree with you that this needs the urgent attention of the developers before they start working on new features. A mess like this in a distributed file system makes the file system unusable for production. This should never happen, never! And if it does, a rebalance should be able to detect and fix it... fast and efficiently.

I also agree that the status of a rebalance should be more telling, giving a clear idea of how long it would still take to complete. On large clusters a rebalance often takes ages and makes the entire cluster extremely vulnerable. (Another scary operation is remove-brick, but that is another story.)

What I did in our case, and maybe this could help you too as a quick fix for the most critical directories, is to rsync to a different storage (via a mount point). rsync only copies one file of a duplicated pair, and you could separately copy a good version of the problem files (in the case below, e.g.:
-rw-r--r-- 2 seviri users 68 May 26 2014 /data/glusterfs/home/brick1/seviri/.forward).

But probably, as soon as you remove the files created on 10/03 (incl. the gluster link file in .glusterfs), the listing via your NFS mount will be restored. Try this out with a couple of files you have backed up, to be sure.

Hope this helps!

Cheers,
Olav

On 20/03/15 12:22, Alessandro Ipe wrote:

Hi,

After launching a rebalance on an idle gluster system one week ago, its status told me it had scanned more than 23 million files on each of my 6 bricks. However, without knowing at least the total number of files to be scanned, this status is USELESS from an end-user perspective, because it does not allow you to know WHEN the rebalance could eventually complete (one day, one week, one year or never). From my point of view, the total number of files per brick could be obtained and maintained when activating quota, since the whole filesystem has to be crawled...

After one week being offline and still no clue when the rebalance would complete, I decided to stop it... Enormous mistake... It seems that rebalance cannot manage not to screw up some files. Example: on the only client mounting the gluster system,

ls -la /home/seviri

returns

ls: cannot access /home/seviri/.forward: Stale NFS file handle
ls: cannot access /home/seviri/.forward: Stale NFS file handle
-? ? ? ? ? ? .forward
-? ? ? ? ? ? .forward

while this file could perfectly well be accessed before (being rebalanced) and has not been modified for at least 3 years.

Getting the extended attributes on the various bricks 3, 4, 5, 6 (3-4 replicate, 5-6 replicate):

Brick 3:

ls -l /data/glusterfs/home/brick?/seviri/.forward
-rw-r--r-- 2 seviri users 68 May 26  2014 /data/glusterfs/home/brick1/seviri/.forward
-rw-r--r-- 2 seviri users 68 Mar 10 10:22 /data/glusterfs/home/brick2/seviri/.forward

getfattr -d -m . -e hex /data/glusterfs/home/brick?/seviri/.forward
# file: data/glusterfs/home/brick1/seviri/.forward
trusted.afr.home-client-8=0x
trusted.afr.home-client-9=0x
trusted.gfid=0xc1d268beb17443a39d914de917de123a
# file: data/glusterfs/home/brick2/seviri/.forward
trusted.afr.home-client-10=0x
trusted.afr.home-client-11=0x
trusted.gfid=0x14a1c10eb1474ef2bf72f4c6c64a90ce
trusted.glusterfs.quota.4138a9fa-a453-4b8e-905a-e02cce07d717.contri=0x0200
trusted.pgfid.4138a9fa-a453-4b8e-905a-e02cce07d717=0x0001

Brick 4:

ls -l /data/glusterfs/home/brick?/seviri/.forward
-rw-r--r-- 2 seviri users 68 May 26  2014 /data/glusterfs/home/brick1/seviri/.forward
-rw-r--r-- 2 seviri users 68 Mar 10 10:22 /data/glusterfs/home/brick2/seviri/.forward

getfattr -d -m . -e hex /data/glusterfs/home/brick?/seviri/.forward
# file: data/glusterfs/home/brick1/seviri/.forward
trusted.afr.home-client-8=0x
trusted.afr.home-client-9=0x
trusted.gfid=0xc1d268beb17443a39d914de917de123a
# file: data/glusterfs/home/brick2/seviri/.forward
trusted.afr.home-client-10=0x
trusted.afr.home-client-11=0x
trusted.gfid=0x14a1c10eb1474ef2bf72f4c6c64a90ce
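For anyone who wants to script the manual fix described above (removing a bad copy together with its hardlink under .glusterfs), here is a minimal sketch. It is untested and the brick/file paths are just the placeholders from this example; it only echoes what it would delete, so review the output (and keep a backup) before swapping echo for rm.

BRICK=/data/glusterfs/home/brick2   # brick holding the bad copy (placeholder)
FILE=seviri/.forward                # file path relative to the brick (placeholder)

# read the gfid, e.g. 0x14a1c10eb1474ef2bf72f4c6c64a90ce, and rewrite it
# into the aa/bb/aabbcccc-...-... layout used inside .glusterfs
hex=$(getfattr -n trusted.gfid -e hex --only-values "$BRICK/$FILE" | sed 's/^0x//')
gfid=$(echo "$hex" | sed 's/\(.\{8\}\)\(.\{4\}\)\(.\{4\}\)\(.\{4\}\)\(.\{12\}\)/\1-\2-\3-\4-\5/')

echo "would remove: $BRICK/$FILE"
echo "would remove: $BRICK/.glusterfs/${gfid:0:2}/${gfid:2:2}/$gfid"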
Re: [Gluster-users] Hundreds of duplicate files
trusted.afr.sr_vol01-client-41=0x
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417

My bet would be that I can delete the first two of these files. For the rest they look identical:

[root@gluster01 ~]# ls -al /export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 /export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

[root@gluster02 ~]# ls -al /export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 /export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

[root@gluster02 ~]# ls -al /export/brick15gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 /export/brick15gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

[root@gluster03 ~]# ls -al /export/brick13gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 /export/brick13gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

Cheers,
Olav

On 21/02/15 01:37, Olav Peeters wrote:

It looks even worse than I had feared.. :-( This really is a crazy bug. If I understand you correctly, the only sane pairing of the xattrs is that of the two 0-bit files, since this is the full list of bricks:

[root@gluster01 ~]# gluster volume info

Volume Name: sr_vol01
Type: Distributed-Replicate
Volume ID: c6d6147e-2d91-4d98-b8d9-ba05ec7e4ad6
Status: Started
Number of Bricks: 21 x 2 = 42
Transport-type: tcp
Bricks:
Brick1: gluster01:/export/brick1gfs01
Brick2: gluster02:/export/brick1gfs02
Brick3: gluster01:/export/brick4gfs01
Brick4: gluster03:/export/brick4gfs03
Brick5: gluster02:/export/brick4gfs02
Brick6: gluster03:/export/brick1gfs03
Brick7: gluster01:/export/brick2gfs01
Brick8: gluster02:/export/brick2gfs02
Brick9: gluster01:/export/brick5gfs01
Brick10: gluster03:/export/brick5gfs03
Brick11: gluster02:/export/brick5gfs02
Brick12: gluster03:/export/brick2gfs03
Brick13: gluster01:/export/brick3gfs01
Brick14: gluster02:/export/brick3gfs02
Brick15: gluster01:/export/brick6gfs01
Brick16: gluster03:/export/brick6gfs03
Brick17: gluster02:/export/brick6gfs02
Brick18: gluster03:/export/brick3gfs03
Brick19: gluster01:/export/brick8gfs01
Brick20: gluster02:/export/brick8gfs02
Brick21: gluster01:/export/brick9gfs01
Brick22: gluster02:/export/brick9gfs02
Brick23: gluster01:/export/brick10gfs01
Brick24: gluster03:/export/brick10gfs03
Brick25: gluster01:/export/brick11gfs01
Brick26: gluster03:/export/brick11gfs03
Brick27: gluster02:/export/brick10gfs02
Brick28: gluster03:/export/brick8gfs03
Brick29: gluster02:/export/brick11gfs02
Brick30: gluster03:/export/brick9gfs03
Brick31: gluster01:/export/brick12gfs01
Brick32: gluster02:/export/brick12gfs02
Brick33: gluster01:/export/brick13gfs01
Brick34: gluster02:/export/brick13gfs02
Brick35: gluster01:/export/brick14gfs01
Brick36: gluster03:/export/brick14gfs03
Brick37: gluster01:/export/brick15gfs01
Brick38: gluster03:/export/brick15gfs03
Brick39: gluster02:/export/brick14gfs02
Brick40: gluster03:/export/brick12gfs03
Brick41: gluster02:/export/brick15gfs02
Brick42: gluster03:/export/brick13gfs03

The two 0-bit files are on bricks 35 and 36, as the getfattr correctly lists.
Another sane pairing could be this (if the first file did not also refer to client-34 and client-35):

[root@gluster01 ~]# getfattr -m . -d -e hex /export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
getfattr: Removing leading '/' from absolute path names
# file: export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.sr_vol01-client-32=0x
trusted.afr.sr_vol01-client-33=0x
trusted.afr.sr_vol01-client-34=0x
trusted.afr.sr_vol01-client-35=0x00010001
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417

[root@gluster02 ~]# getfattr -m . -d -e hex /export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
getfattr: Removing leading '/' from absolute path names
# file: export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.sr_vol01-client-32=0x
trusted.afr.sr_vol01-client-33=0x
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417

But why is the security.selinux hash different? You mention
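On the selinux question: with -e hex, getfattr just prints the attribute's raw bytes, so the two values can be decoded directly. A quick check (assuming xxd is available) shows the difference is only the SELinux user component of the label, not corruption - one copy was labelled by a system process, the other created in an unconfined context:

echo 73797374656d5f753a6f626a6563745f723a66696c655f743a733000 | xxd -r -p
# -> system_u:object_r:file_t:s0
echo 756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000 | xxd -r -p
# -> unconfined_u:object_r:file_t:s0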
Re: [Gluster-users] Hundreds of duplicate files
Thanks Joe, for the answers!

I was not clear enough about the set-up, apparently. The Gluster cluster consists of 3 nodes with 14 bricks each. The bricks are formatted as XFS and mounted locally as XFS. There is one volume, type Distributed-Replicate (replica 2). The configuration is such that bricks are mirrored on two different nodes. The NFS mount which was alive but not used during the reboot when the problem started is from clients (2 XenServer machines configured as a pool - a shared storage set-up). The comparisons I give below are between (other) clients mounting via either glusterfs or NFS. Similar problem, with the exception that the first listing (via ls) after a fresh mount via NFS actually does find the files with data. A second listing only finds the 0-bit file with the same name.

So all the 0-bit files in mode 0644 can be safely removed?

Why do I see three files with the same name (and modification timestamp etc.) via either a glusterfs or NFS mount from a client? Deleting one of the three will probably not solve the issue either.. this seems to me an indexing issue in the gluster cluster.

How do I get Gluster to replicate the files correctly - only 2 versions of the same file, not three, and on two bricks on different machines?

Cheers,
Olav

On 20/02/15 21:51, Joe Julian wrote:

On 02/20/2015 12:21 PM, Olav Peeters wrote:

Let's take one file (3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd) as an example... On the 3 nodes, where all bricks are formatted as XFS and mounted in /export, and 272b2366-dfbf-ad47-2a0f-5d5cc40863e3 is the mounting point of an NFS shared storage connection from XenServer machines:

Did I just read this correctly? Your bricks are NFS mounts? ie, GlusterFS Client - GlusterFS Server - NFS - XFS

[root@gluster01 ~]# find /export/*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/ -name '300*' -exec ls -la {} \;
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 /export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

Supposedly, this is the actual file.

-rw-r--r--. 2 root root 0 Feb 18 00:51 /export/brick14gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

This is not a linkfile. Note its mode: 0644. How it got there with those permissions would be a matter of history and would require information that's probably lost.

[root@gluster02 ~]# find /export/*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/ -name '300*' -exec ls -la {} \;
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 /export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

[root@gluster03 ~]# find /export/*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/ -name '300*' -exec ls -la {} \;
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 /export/brick13gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 2 root root 0 Feb 18 00:51 /export/brick14gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

Same analysis as above.

3 files with information, 2 x a 0-bit file with the same name. Checking the 0-bit files:

[root@gluster01 ~]# getfattr -m . -d -e hex /export/brick14gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
getfattr: Removing leading '/' from absolute path names
# file: export/brick14gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.sr_vol01-client-34=0x
trusted.afr.sr_vol01-client-35=0x
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417

[root@gluster03 ~]# getfattr -m . -d -e hex /export/brick14gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
getfattr: Removing leading '/' from absolute path names
# file: export/brick14gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.sr_vol01-client-34=0x
trusted.afr.sr_vol01-client-35=0x
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417

This is not a glusterfs link file since there is no trusted.glusterfs.dht.linkto, am I correct?

You are correct.

And checking the good files:

# file: export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.sr_vol01-client-32=0x
trusted.afr.sr_vol01-client-33=0x
trusted.afr.sr_vol01-client-34=0x
trusted.afr.sr_vol01-client-35=0x00010001
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417
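Collecting these per-node listings by hand gets tedious; a small sketch that gathers the same ls and getfattr output for one file from every node in one go (hostnames and paths taken from this thread, root ssh access assumed):

for h in gluster01 gluster02 gluster03; do
    echo "=== $h ==="
    # list every copy of the file on this node's bricks and dump its xattrs
    ssh root@"$h" "find /export/*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3 \
        -maxdepth 1 -name '3009f448*' \
        -exec ls -la {} \; -exec getfattr -m . -d -e hex {} \;"
done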
Re: [Gluster-users] Hundreds of duplicate files
in the current state could do more harm than good? I launched a second rebalance in the hope that the system would mend itself after all...

Thanks a million for your support in this darkest hour of my time as a glusterfs user :-)

Cheers,
Olav

On 20/02/15 23:10, Joe Julian wrote:

On 02/20/2015 01:47 PM, Olav Peeters wrote:

Thanks Joe, for the answers! I was not clear enough about the set-up, apparently. The Gluster cluster consists of 3 nodes with 14 bricks each. The bricks are formatted as XFS and mounted locally as XFS. There is one volume, type Distributed-Replicate (replica 2). The configuration is such that bricks are mirrored on two different nodes. The NFS mount which was alive but not used during the reboot when the problem started is from clients (2 XenServer machines configured as a pool - a shared storage set-up). The comparisons I give below are between (other) clients mounting via either glusterfs or NFS. Similar problem, with the exception that the first listing (via ls) after a fresh mount via NFS actually does find the files with data. A second listing only finds the 0-bit file with the same name.

So all the 0-bit files in mode 0644 can be safely removed?

Probably? Is it likely that you have any empty files? I don't know.

Why do I see three files with the same name (and modification timestamp etc.) via either a glusterfs or NFS mount from a client? Deleting one of the three will probably not solve the issue either.. this seems to me an indexing issue in the gluster cluster.

Very good question. I don't know. The xattrs tell a strange story that I haven't seen before. One legit file shows sr_vol01-client-32 and 33. This would be normal, assuming the filename hash would put it on that replica pair (we can't tell, since the rebalance has changed the hash map). Another file shows sr_vol01-client-32, 33, 34, and 35, with pending updates scheduled for 35. I have no idea which brick this is (see gluster volume info and map the digits (35) to the bricks, offset by 1: client-35 is brick 36). That last one is on 40,41. I don't know how these files all got on different replica sets. My speculations include hostname changes, long-running net-split conditions with different dht maps (failed rebalances), moved bricks, load balancers between client and server, mercury in retrograde (lol)...

How do I get Gluster to replicate the files correctly, only 2 versions of the same file, not three, and on two bricks on different machines?

Identify which replica is correct by using the little python script at http://joejulian.name/blog/dht-misses-are-expensive/ to get the hash of the filename. Examine the dht map to see which replica pair *should* have that hash and remove the others (and their hardlink in .glusterfs). There is no 1-liner that's going to do this. I would probably script the logic in python, have it print out what it was going to do, check that for sanity and, if sane, execute it.

But mostly, figure out how bricks 32 and/or 33 can become 34 and/or 35 and/or 40 and/or 41. That's the root of the whole problem.

Cheers,
Olav

On 20/02/15 21:51, Joe Julian wrote:

On 02/20/2015 12:21 PM, Olav Peeters wrote:

Let's take one file (3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd) as an example... On the 3 nodes, where all bricks are formatted as XFS and mounted in /export, and 272b2366-dfbf-ad47-2a0f-5d5cc40863e3 is the mounting point of an NFS shared storage connection from XenServer machines:

Did I just read this correctly? Your bricks are NFS mounts? ie, GlusterFS Client - GlusterFS Server - NFS - XFS

[root@gluster01 ~]# find /export/*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/ -name '300*' -exec ls -la {} \;
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 /export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

Supposedly, this is the actual file.

-rw-r--r--. 2 root root 0 Feb 18 00:51 /export/brick14gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

This is not a linkfile. Note its mode: 0644. How it got there with those permissions would be a matter of history and would require information that's probably lost.

[root@gluster02 ~]# find /export/*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/ -name '300*' -exec ls -la {} \;
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 /export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

[root@gluster03 ~]# find /export/*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/ -name '300*' -exec ls -la {} \;
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 /export/brick13gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 2 root root 0 Feb 18 00:51 /export/brick14gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

Same analysis as above.

3 files with information, 2 x a 0-bit file with the same name. Checking the 0-bit files:

[root@gluster01
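Joe's client-index-to-brick mapping (client-N is brick N+1 in the volume info listing) can be generated rather than counted by hand. A one-liner sketch, assuming the BrickN: lines are formatted exactly as in the gluster volume info output quoted earlier in this thread:

gluster volume info sr_vol01 | awk '/^Brick[0-9]+:/ {
    idx = $1; gsub(/[^0-9]/, "", idx)            # "Brick35:" -> 35
    printf("client-%d -> Brick%d: %s\n", idx - 1, idx, $2)
}'
# e.g.: client-34 -> Brick35: gluster01:/export/brick14gfs01

And the dht map Joe refers to can be read straight off the bricks: each directory carries its assigned hash range in the trusted.glusterfs.dht xattr, for example:

getfattr -n trusted.glusterfs.dht -e hex \
    /export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3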
Re: [Gluster-users] Hundreds of duplicate files
[root@client ~]# ls -al /mnt/glusterfs/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/300*
-rw-r--r--. 1 root root 0 Feb 18 00:51 /mnt/glusterfs/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 1 root root 0 Feb 18 00:51 /mnt/glusterfs/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 1 root root 0 Feb 18 00:51 /mnt/glusterfs/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

Via NFS (just after performing a umount and mounting the volume again):

[root@client ~]# ls -al /mnt/nfs/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/300*
-rw-r--r--. 1 root root 44332659200 Feb 17 23:55 /mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 1 root root 44332659200 Feb 17 23:55 /mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 1 root root 44332659200 Feb 17 23:55 /mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

Doing the same list a couple of seconds later:

[root@client ~]# ls -al /mnt/nfs/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/300*
-rw-r--r--. 1 root root 0 Feb 18 00:51 /mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 1 root root 0 Feb 18 00:51 /mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 1 root root 0 Feb 18 00:51 /mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

And again, and again, and again:

[root@client ~]# ls -al /mnt/nfs/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/300*
-rw-r--r--. 1 root root 0 Feb 18 00:51 /mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 1 root root 0 Feb 18 00:51 /mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 1 root root 0 Feb 18 00:51 /mnt/test/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

This really seems odd. Why do we get to see the real data files only once?

It seems more and more that this crazy file duplication (and writing of sticky-bit files) was actually triggered by rebooting one of the three nodes while there was still an active NFS connection (even though there was no data exchange at all), since all 0-bit files (of the non-sticky-bit type) were created at either 00:51 or 00:41, the exact moments nodes in the cluster were rebooted. This would mean that replication with GlusterFS currently creates hardly any redundancy. Quite the opposite: if one of the machines goes down, all of your data gets seriously disorganised. I am busy configuring a test installation to see how this can best be reproduced for a bug report..

Does anyone have a suggestion how to best get rid of the duplicates, or rather get this mess organised the way it should be? This is a cluster with millions of files. A rebalance does not fix the issue; neither does a rebalance fix-layout help. Since this is a replicated volume, all files should be there 2x, not 3x. Can I safely just remove all the 0-bit files outside of the .glusterfs directory, including the sticky-bit files? The empty 0-bit files outside of .glusterfs on every brick I can probably safely remove like this:

find /export/* -path */.glusterfs -prune -o -type f -size 0 -perm 1000 -exec rm {} \;

not?

Thanks!

Cheers,
Olav

On 18/02/15 22:10, Olav Peeters wrote:

Thanks Tom and Joe, for the fast response!

Before I started my upgrade I stopped all clients using the volume and stopped all VMs with VHDs on the volume, but I guess - and this may be the missing thing to reproduce this in a lab - I did not detach an NFS shared storage mount from a XenServer pool to this volume, since this is an extremely risky business. I also did not stop the volume. This, I guess, was a bit stupid, but since I had done upgrades in the past this way without any issues I skipped this step (a really bad habit). I'll make amends and file a proper bug report :-).

I agree with you, Joe, this should never happen, even when someone ignores the advice of stopping the volume. If it were also necessary to detach shared storage NFS connections to a volume, then frankly, glusterfs would be unusable in a private cloud. No one can afford downtime of the whole infrastructure just for a glusterfs upgrade. Ideally a replicated gluster volume should even be able to remain online and in use during (at least a minor version) upgrade.

I don't know whether a heal was maybe busy when I started the upgrade. I forgot to check. I did check the CPU activity on the gluster nodes, which was very low (in the 0.0X range via top), so I doubt it. I will add this to the bug report as a suggestion, should they not be able to reproduce with an open NFS connection.

By the way, is it sufficient to do:

service glusterd stop
service glusterfsd stop

and do a:

ps aux | grep gluster
Re: [Gluster-users] Hundreds of duplicate files
Hi all,

I'm having this problem after upgrading from 3.5.3 to 3.6.2. At the moment I am still waiting for a heal to finish (on a 31TB volume with 42 bricks, replicated over three nodes).

Tom, how did you remove the duplicates? With 42 bricks I will not be able to do this manually.. Did a:

find $brick_root -type f -size 0 -perm 1000 -exec /bin/rm {} \;

work for you? Should this type of thing ideally not be checked and mended by a heal? Does anyone have an idea yet how this happens in the first place? Can it be connected to upgrading?

Cheers,
Olav

On 01/01/15 03:07, tben...@3vgeomatics.com wrote:

No, the files can be read on a newly mounted client! I went ahead and deleted all of the link files associated with these duplicates, and then remounted the volume. The problem is fixed! Thanks again for the help, Joe and Vijay.

Tom

- Original Message -
Subject: Re: [Gluster-users] Hundreds of duplicate files
From: Vijay Bellur vbel...@redhat.com
Date: 12/28/14 3:23 am
To: tben...@3vgeomatics.com, gluster-users@gluster.org

On 12/28/2014 01:20 PM, tben...@3vgeomatics.com wrote:

Hi Vijay,

Yes, the files are still readable from the .glusterfs path. There is no explicit error. However, trying to read a text file in python simply gives me null characters:

open('ott_mf_itab').readlines()
['\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00']

And reading binary files does the same.

Is this behavior seen with a freshly mounted client too?

-Vijay

- Original Message -
Subject: Re: [Gluster-users] Hundreds of duplicate files
From: Vijay Bellur vbel...@redhat.com
Date: 12/27/14 9:57 pm
To: tben...@3vgeomatics.com, gluster-users@gluster.org

On 12/28/2014 10:13 AM, tben...@3vgeomatics.com wrote:

Thanks Joe, I've read your blog post as well as your post regarding the .glusterfs directory. I found some unneeded duplicate files which were not being read properly. I then deleted the link file from the brick. This always removes the duplicate file from the listing, but the file does not always become readable. If I also delete the associated file in the .glusterfs directory on that brick, then some more files become readable. However, this solution still doesn't work for all files. I know the file on the brick is not corrupt, as it can be read directly from the brick directory.

For files that are not readable from the client, can you check if the file is readable from the .glusterfs/ path? What is the specific error that is seen while trying to read one such file from the client?
Thanks,
Vijay
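On the find one-liner Olav asks about above: a slightly safer sketch is to confirm that each 0-byte sticky-bit candidate really is a DHT link file (i.e. it carries the trusted.glusterfs.dht.linkto xattr) and to print before deleting. Untested; brick layout assumed as elsewhere in this thread:

find /export/brick*/ -path '*/.glusterfs' -prune -o \
     -type f -size 0 -perm 1000 -print |
while read -r f; do
    # only flag it when the dht.linkto xattr is really there
    if getfattr -n trusted.glusterfs.dht.linkto "$f" >/dev/null 2>&1; then
        echo rm "$f"    # drop the echo once the printed list looks sane
    fi
done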
Re: [Gluster-users] Hundreds of duplicate files
Thanks Tom and Joe, for the fast response!

Before I started my upgrade I stopped all clients using the volume and stopped all VMs with VHDs on the volume, but I guess - and this may be the missing thing to reproduce this in a lab - I did not detach an NFS shared storage mount from a XenServer pool to this volume, since this is an extremely risky business. I also did not stop the volume. This, I guess, was a bit stupid, but since I had done upgrades in the past this way without any issues I skipped this step (a really bad habit). I'll make amends and file a proper bug report :-).

I agree with you, Joe, this should never happen, even when someone ignores the advice of stopping the volume. If it were also necessary to detach shared storage NFS connections to a volume, then frankly, glusterfs would be unusable in a private cloud. No one can afford downtime of the whole infrastructure just for a glusterfs upgrade. Ideally a replicated gluster volume should even be able to remain online and in use during (at least a minor version) upgrade.

I don't know whether a heal was maybe busy when I started the upgrade. I forgot to check. I did check the CPU activity on the gluster nodes, which was very low (in the 0.0X range via top), so I doubt it. I will add this to the bug report as a suggestion, should they not be able to reproduce with an open NFS connection.

By the way, is it sufficient to do:

service glusterd stop
service glusterfsd stop

and do a:

ps aux | grep gluster

to see if everything has stopped, and kill any leftovers should this be necessary?

For the fix, do you agree that if I run e.g.:

find /export/* -type f -size 0 -perm 1000 -exec /bin/rm {} \;

on every node, if /export is the location of all my bricks, also in a replicated set-up, this will be safe? No necessary 0-bit files will be deleted in e.g. the .glusterfs of every brick?

Thanks for your support!

Cheers,
Olav

On 18/02/15 20:51, Joe Julian wrote:

On 02/18/2015 11:43 AM, tben...@3vgeomatics.com wrote:

Hi Olav,

I have a hunch that our problem was caused by improper unmounting of the gluster volume, and have since found that the proper order should be: kill all jobs using the volume - unmount the volume on clients - gluster volume stop - stop the gluster service (if necessary).

In my case, I wrote a Python script to find duplicate files on the mounted volume, then delete the corresponding link files on the bricks (making sure to also delete the files in the .glusterfs directory). However, your find command was also suggested to me and I think it's a simpler solution. I believe removing all link files (even ones that are not causing duplicates) is fine, since on the next file access gluster will do a lookup on all bricks and recreate any link files if necessary. Hopefully a gluster expert can chime in on this point as I'm not completely sure.

You are correct.

Keep in mind your setup is somewhat different than mine, as I have only 5 bricks with no replication.

Regards,
Tom

- Original Message -
Subject: Re: [Gluster-users] Hundreds of duplicate files
From: Olav Peeters opeet...@gmail.com
Date: 2/18/15 10:52 am
To: gluster-users@gluster.org, tben...@3vgeomatics.com

Hi all,

I'm having this problem after upgrading from 3.5.3 to 3.6.2. At the moment I am still waiting for a heal to finish (on a 31TB volume with 42 bricks, replicated over three nodes).

Tom, how did you remove the duplicates? With 42 bricks I will not be able to do this manually.. Did a:

find $brick_root -type f -size 0 -perm 1000 -exec /bin/rm {} \;

work for you?

Should this type of thing ideally not be checked and mended by a heal? Does anyone have an idea yet how this happens in the first place? Can it be connected to upgrading?

Cheers,
Olav

On 01/01/15 03:07, tben...@3vgeomatics.com wrote:

No, the files can be read on a newly mounted client! I went ahead and deleted all of the link files associated with these duplicates, and then remounted the volume. The problem is fixed! Thanks again for the help, Joe and Vijay.

Tom

- Original Message -
Subject: Re: [Gluster-users] Hundreds of duplicate files
From: Vijay Bellur vbel...@redhat.com
Date: 12/28/14 3:23 am
To: tben...@3vgeomatics.com, gluster-users@gluster.org

On 12/28/2014 01:20 PM, tben...@3vgeomatics.com wrote:

Hi Vijay,

Yes, the files are still readable from the .glusterfs path. There is no explicit error. However, trying to read a text file in python simply gives me null characters:

open('ott_mf_itab').readlines()
['\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
[Gluster-users] problems after gluster volume remove-brick
Hi,

Two days ago I started a gluster volume remove-brick on a Distributed-Replicate volume, 21 x 2 bricks over 3 nodes. I wanted to remove 4 bricks per node which are smaller than the others (on each node I have 7 x 2TB disks and 4 x 500GB disks). I am still on gluster 3.5.2, and I was not aware that using disks of different sizes is only supported as of 3.6.x (am I correct?).

I started with 2 paired disks, like so:

gluster volume remove-brick VOLNAME node03:/export/brick8node03 node02:/export/brick10node02 start

I followed the progress (which was very slow):

gluster volume remove-brick VOLNAME node03:/export/brick8node03 node02:/export/brick10node02 status

After a day the progress of node03:/export/brick8node03 showed completed; the other brick remained in progress. This morning several VMs with VDIs on the volume started showing disk errors, and a couple of glusterfs mounts returned a "disk is full" type of error on the volume, which is currently only ca. 41% filled with data. Via df -h I saw that most of the 500GB disks were indeed 100% full. Others were meanwhile nearly empty.. Gluster seems to have gone a bit nuts while rebalancing the data.

I did a:

gluster volume remove-brick VOLNAME node03:/export/brick8node03 node02:/export/brick10node02 stop

and a:

gluster volume rebalance VOLNAME start

Progress is again very slow, and some of the disks/bricks which were ca. 98% full are now 100% full. The situation seems to be both getting worse in some cases and slowly improving e.g. for another pair of bricks (from 100% to 97%). There clearly has been some data corruption. Some VMs don't want to boot anymore, throwing disk errors.

How do I proceed? Wait a very long time for the rebalance to complete and hope that the data corruption is automatically mended? Upgrade to 3.6.x and hope that the issues (which might be related to me using bricks of different sizes) are resolved, and again risk a remove-brick operation? Should I rather do a:

gluster volume rebalance VOLNAME migrate-data start

Should I have done a replace-brick instead of a remove-brick operation originally? I thought that replace-brick is becoming obsolete.

Thanks,
Olav
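For reference, the drain-then-drop sequence for remove-brick looks roughly like this (a sketch using the brick names from the message above; on 3.5.x, commit only after status reports completed for every brick being removed, since commit is what makes the removal permanent):

gluster volume remove-brick VOLNAME \
    node03:/export/brick8node03 node02:/export/brick10node02 start

# poll until every listed brick shows "completed"
gluster volume remove-brick VOLNAME \
    node03:/export/brick8node03 node02:/export/brick10node02 status

# data has been migrated off; now drop the bricks for good
gluster volume remove-brick VOLNAME \
    node03:/export/brick8node03 node02:/export/brick10node02 commit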
Re: [Gluster-users] problems after gluster volume remove-brick
Adding to my previous mail.. I found a couple of strange errors in the rebalance log (/var/log/glusterfs/sr_vol01-rebalance.log), e.g.:

[2015-01-21 10:00:32.123999] E [afr-self-heal-entry.c:1135:afr_sh_entry_impunge_newfile_cbk] 0-sr_vol01-replicate-11: creation of /some/file/on/the/volume.data on sr_vol01-client-23 failed (No space left on device)

Why is the rebalance seemingly not taking account of the space left on the available disks? This is the current situation on this particular node:

[root@gluster03 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup-lv_root
                       50G  2.4G   45G   5% /
tmpfs                 7.8G     0  7.8G   0% /dev/shm
/dev/sda1             485M   95M  365M  21% /boot
/dev/sdb1             1.9T  577G  1.3T  31% /export/brick1gfs03
/dev/sdc1             1.9T  154G  1.7T   9% /export/brick2gfs03
/dev/sdd1             1.9T  413G  1.5T  23% /export/brick3gfs03
/dev/sde1             1.9T  1.5T  417G  78% /export/brick4gfs03
/dev/sdf1             1.9T  1.6T  286G  85% /export/brick5gfs03
/dev/sdg1             1.9T  1.4T  443G  77% /export/brick6gfs03
/dev/sdh1             1.9T   33M  1.9T   1% /export/brick7gfs03
/dev/sdi1             466G   62G  405G  14% /export/brick8gfs03
/dev/sdj1             466G  166G  301G  36% /export/brick9gfs03
/dev/sdk1             466G  466G   20K 100% /export/brick10gfs03
/dev/sdl1             466G  450G   16G  97% /export/brick11gfs03
/dev/sdm1             1.9T  206G  1.7T  12% /export/brick12gfs03
/dev/sdn1             1.9T  306G  1.6T  17% /export/brick13gfs03
/dev/sdo1             1.9T  107G  1.8T   6% /export/brick14gfs03
/dev/sdp1             1.9T  252G  1.6T  14% /export/brick15gfs03

Why are brick10 and brick11 over-utilised when there is plenty of space on bricks 6, 14, etc.? Anyone any idea?

Cheers,
Olav

On 21/01/15 13:18, Olav Peeters wrote:

Hi,

Two days ago I started a gluster volume remove-brick on a Distributed-Replicate volume, 21 x 2 bricks over 3 nodes. I wanted to remove 4 bricks per node which are smaller than the others (on each node I have 7 x 2TB disks and 4 x 500GB disks). I am still on gluster 3.5.2, and I was not aware that using disks of different sizes is only supported as of 3.6.x (am I correct?).

I started with 2 paired disks, like so:

gluster volume remove-brick VOLNAME node03:/export/brick8node03 node02:/export/brick10node02 start

I followed the progress (which was very slow):

gluster volume remove-brick VOLNAME node03:/export/brick8node03 node02:/export/brick10node02 status

After a day the progress of node03:/export/brick8node03 showed completed; the other brick remained in progress. This morning several VMs with VDIs on the volume started showing disk errors, and a couple of glusterfs mounts returned a "disk is full" type of error on the volume, which is currently only ca. 41% filled with data. Via df -h I saw that most of the 500GB disks were indeed 100% full. Others were meanwhile nearly empty.. Gluster seems to have gone a bit nuts while rebalancing the data.

I did a:

gluster volume remove-brick VOLNAME node03:/export/brick8node03 node02:/export/brick10node02 stop

and a:

gluster volume rebalance VOLNAME start

Progress is again very slow, and some of the disks/bricks which were ca. 98% full are now 100% full. The situation seems to be both getting worse in some cases and slowly improving e.g. for another pair of bricks (from 100% to 97%). There clearly has been some data corruption. Some VMs don't want to boot anymore, throwing disk errors.

How do I proceed? Wait a very long time for the rebalance to complete and hope that the data corruption is automatically mended? Upgrade to 3.6.x and hope that the issues (which might be related to me using bricks of different sizes) are resolved, and again risk a remove-brick operation? Should I rather do a:

gluster volume rebalance VOLNAME migrate-data start

Should I have done a replace-brick instead of a remove-brick operation originally? I thought that replace-brick is becoming obsolete.

Thanks,
Olav
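One thing that might be worth checking here (my suggestion, not something confirmed in the thread): DHT has a cluster.min-free-disk volume option - 10% by default - that is meant to make new file creations spill over to other subvolumes once a brick crosses the threshold. With bricks this different in size, a more generous setting may keep the small 500GB bricks from filling up:

# make DHT avoid bricks with less than 15% free space
gluster volume set sr_vol01 cluster.min-free-disk 15%

# confirm the option now appears under "Options Reconfigured"
gluster volume info sr_vol01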
Re: [Gluster-users] socket.c:2161:socket_connect_finish (Connection refused)
OK, thanks for the info!

Regards,
Olav

On 11/06/14 08:38, Pranith Kumar Karampuri wrote:

On 06/11/2014 12:03 PM, Olav Peeters wrote:

Thanks Pranith!

I see this at the end of the log files of one of the problem bricks (the first two errors are repeated several times):

[2014-06-10 09:55:28.354659] E [rpcsvc.c:1206:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x103c59, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.sr_vol01-server)
[2014-06-10 09:55:28.354683] E [server.c:190:server_submit_reply] (--/usr/lib64/glusterfs/3.5.0/xlator/performance/io-threads.so(iot_finodelk_cbk+0xb9) [0x7f8c8e82f189] (--/usr/lib64/glusterfs/3.5.0/xlator/debug/io-stats.so(io_stats_finodelk_cbk+0xed) [0x7f8c8e1f22ed] (--/usr/lib64/glusterfs/3.5.0/xlator/protocol/server.so(server_finodelk_cbk+0xad) [0x7f8c8dfc555d]))) 0-: Reply submission failed

pending frames:
frame : type(0) op(30)
frame : type(0) op(30)
frame : type(0) op(30)
frame : type(0) op(30)
...
...
frame : type(0) op(30)
frame : type(0) op(30)
frame : type(0) op(30)

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2014-06-10 09:55:28
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.5.0
/lib64/libc.so.6(+0x329a0)[0x7f8c94aac9a0]
/usr/lib64/glusterfs/3.5.0/xlator/features/locks.so(grant_blocked_inode_locks+0xc1)[0x7f8c8ea54061]
/usr/lib64/glusterfs/3.5.0/xlator/features/locks.so(pl_inodelk_client_cleanup+0x249)[0x7f8c8ea54569]
/usr/lib64/glusterfs/3.5.0/xlator/features/locks.so(+0x6f0a)[0x7f8c8ea49f0a]
/usr/lib64/libglusterfs.so.0(gf_client_disconnect+0x5d)[0x7f8c964d701d]
/usr/lib64/glusterfs/3.5.0/xlator/protocol/server.so(server_connection_cleanup+0x458)[0x7f8c8dfbda48]
/usr/lib64/glusterfs/3.5.0/xlator/protocol/server.so(server_rpc_notify+0x183)[0x7f8c8dfb9713]
/usr/lib64/libgfrpc.so.0(rpcsvc_handle_disconnect+0x105)[0x7f8c96261d35]
/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x1a0)[0x7f8c96263880]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x7f8c96264f98]
/usr/lib64/glusterfs/3.5.0/rpc-transport/socket.so(+0xa9a1)[0x7f8c914c39a1]
/usr/lib64/libglusterfs.so.0(+0x672f7)[0x7f8c964d92f7]
/usr/sbin/glusterfsd(main+0x564)[0x4075e4]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7f8c94a98d1d]
/usr/sbin/glusterfsd[0x404679]

Again no info to be found online about the error. Any idea?

This is because of bug 1089470, which is fixed for 3.5.1, which will be releasing shortly.

Pranith

Olav

On 11/06/14 04:42, Pranith Kumar Karampuri wrote:

Olav,
Check the logs of the bricks to see why the bricks went down.

Pranith

On 06/11/2014 04:02 AM, Olav Peeters wrote:

Hi,

I upgraded from glusterfs 3.4 to 3.5 about 8 days ago. Everything was running fine until this morning. In a fuse mount we were having write issues. Creating and deleting files became an issue all of a sudden, without any new changes to the cluster. In /var/log/glusterfs/glustershd.log every couple of seconds I'm getting this:

[2014-06-10 22:23:52.055128] I [rpc-clnt.c:1685:rpc_clnt_reconfig] 0-sr_vol01-client-13: changing port to 49156 (from 0)
[2014-06-10 22:23:52.060153] E [socket.c:2161:socket_connect_finish] 0-sr_vol01-client-13: connection to ip-of-one-of-the-gluster-nodes:49156 failed (Connection refused)

# gluster volume status sr_vol01
shows that two bricks of the 18 are offline.

rebalance fails

Iptables was stopped on all nodes.

If I cd into the two bricks which are offline according to the gluster v status, I can read/write without any problems... The disks are clearly fine. They are mounted, they are available.

I cannot find much info online about the error. Does anyone have an idea what could be wrong? How can I get the two bricks back online?

Cheers,
Olav
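Pranith's "check the logs of the bricks" step can be done like this; a sketch assuming the default log location, where each brick gets a log file named after its path (e.g. /export/brick1gfs01 becomes export-brick1gfs01.log):

# last errors across all brick logs on this node
grep ' E \[' /var/log/glusterfs/bricks/*.log | tail -n 20

# and the tail of one suspect brick's log, where a crash backtrace would appear
tail -n 50 /var/log/glusterfs/bricks/export-brick1gfs01.log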
Re: [Gluster-users] socket.c:2161:socket_connect_finish (Connection refused)
Thanks a lot, Pranith! All seems back to normal again. Looking forward to the release of 3.5.1!

Cheers,
Olav

On 11/06/14 09:30, Pranith Kumar Karampuri wrote:

hey,
Just do:

gluster volume start volname force

and things should be back to normal.

Pranith

On 06/11/2014 12:56 PM, Olav Peeters wrote:

Pranith,
How could I move all data from the two problem bricks temporarily until the release of 3.5.1? Like this?

# gluster volume replace-brick VOLNAME BRICK NEW-BRICK start

Will this work if the bricks are offline? Or is there some other way to get the bricks back online manually? Would it help to do all fuse connections via NFS until after the fix?

Cheers,
Olav

On 11/06/14 08:44, Olav Peeters wrote:

OK, thanks for the info!

Regards,
Olav

On 11/06/14 08:38, Pranith Kumar Karampuri wrote:

On 06/11/2014 12:03 PM, Olav Peeters wrote:

Thanks Pranith!

I see this at the end of the log files of one of the problem bricks (the first two errors are repeated several times):

[2014-06-10 09:55:28.354659] E [rpcsvc.c:1206:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x103c59, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.sr_vol01-server)
[2014-06-10 09:55:28.354683] E [server.c:190:server_submit_reply] (--/usr/lib64/glusterfs/3.5.0/xlator/performance/io-threads.so(iot_finodelk_cbk+0xb9) [0x7f8c8e82f189] (--/usr/lib64/glusterfs/3.5.0/xlator/debug/io-stats.so(io_stats_finodelk_cbk+0xed) [0x7f8c8e1f22ed] (--/usr/lib64/glusterfs/3.5.0/xlator/protocol/server.so(server_finodelk_cbk+0xad) [0x7f8c8dfc555d]))) 0-: Reply submission failed

pending frames:
frame : type(0) op(30)
frame : type(0) op(30)
frame : type(0) op(30)
frame : type(0) op(30)
...
...
frame : type(0) op(30)
frame : type(0) op(30)
frame : type(0) op(30)

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2014-06-10 09:55:28
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.5.0
/lib64/libc.so.6(+0x329a0)[0x7f8c94aac9a0]
/usr/lib64/glusterfs/3.5.0/xlator/features/locks.so(grant_blocked_inode_locks+0xc1)[0x7f8c8ea54061]
/usr/lib64/glusterfs/3.5.0/xlator/features/locks.so(pl_inodelk_client_cleanup+0x249)[0x7f8c8ea54569]
/usr/lib64/glusterfs/3.5.0/xlator/features/locks.so(+0x6f0a)[0x7f8c8ea49f0a]
/usr/lib64/libglusterfs.so.0(gf_client_disconnect+0x5d)[0x7f8c964d701d]
/usr/lib64/glusterfs/3.5.0/xlator/protocol/server.so(server_connection_cleanup+0x458)[0x7f8c8dfbda48]
/usr/lib64/glusterfs/3.5.0/xlator/protocol/server.so(server_rpc_notify+0x183)[0x7f8c8dfb9713]
/usr/lib64/libgfrpc.so.0(rpcsvc_handle_disconnect+0x105)[0x7f8c96261d35]
/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x1a0)[0x7f8c96263880]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x7f8c96264f98]
/usr/lib64/glusterfs/3.5.0/rpc-transport/socket.so(+0xa9a1)[0x7f8c914c39a1]
/usr/lib64/libglusterfs.so.0(+0x672f7)[0x7f8c964d92f7]
/usr/sbin/glusterfsd(main+0x564)[0x4075e4]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7f8c94a98d1d]
/usr/sbin/glusterfsd[0x404679]

Again no info to be found online about the error. Any idea?

This is because of bug 1089470, which is fixed for 3.5.1, which will be releasing shortly.

Pranith

Olav

On 11/06/14 04:42, Pranith Kumar Karampuri wrote:

Olav,
Check the logs of the bricks to see why the bricks went down.

Pranith

On 06/11/2014 04:02 AM, Olav Peeters wrote:

Hi,

I upgraded from glusterfs 3.4 to 3.5 about 8 days ago. Everything was running fine until this morning. In a fuse mount we were having write issues. Creating and deleting files became an issue all of a sudden, without any new changes to the cluster. In /var/log/glusterfs/glustershd.log every couple of seconds I'm getting this:

[2014-06-10 22:23:52.055128] I [rpc-clnt.c:1685:rpc_clnt_reconfig] 0-sr_vol01-client-13: changing port to 49156 (from 0)
[2014-06-10 22:23:52.060153] E [socket.c:2161:socket_connect_finish] 0-sr_vol01-client-13: connection to ip-of-one-of-the-gluster-nodes:49156 failed (Connection refused)

# gluster volume status sr_vol01
shows that two bricks of the 18 are offline.

rebalance fails

Iptables was stopped on all nodes.

If I cd into the two bricks which are offline according to the gluster v status, I can read/write without any problems... The disks are clearly fine. They are mounted, they are available.

I cannot find much info online about the error. Does anyone have an idea what could be wrong? How can I get the two bricks back online?

Cheers,
Olav
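To verify that the force start actually brought the bricks back, something along these lines (volume name from this thread) is a reasonable follow-up; heal info shows the self-heal backlog that accumulated while the bricks were down:

gluster volume start sr_vol01 force   # restart any brick processes that are down
gluster volume status sr_vol01        # every brick should now show Online: Y
gluster volume heal sr_vol01 info     # watch the pending-heal counts drain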
[Gluster-users] socket.c:2161:socket_connect_finish (Connection refused)
Hi,

I upgraded from glusterfs 3.4 to 3.5 about 8 days ago. Everything was running fine until this morning. In a fuse mount we were having write issues. Creating and deleting files became an issue all of a sudden, without any new changes to the cluster. In /var/log/glusterfs/glustershd.log every couple of seconds I'm getting this:

[2014-06-10 22:23:52.055128] I [rpc-clnt.c:1685:rpc_clnt_reconfig] 0-sr_vol01-client-13: changing port to 49156 (from 0)
[2014-06-10 22:23:52.060153] E [socket.c:2161:socket_connect_finish] 0-sr_vol01-client-13: connection to ip-of-one-of-the-gluster-nodes:49156 failed (Connection refused)

# gluster volume status sr_vol01
shows that two bricks of the 18 are offline.

rebalance fails

Iptables was stopped on all nodes.

If I cd into the two bricks which are offline according to the gluster v status, I can read/write without any problems... The disks are clearly fine. They are mounted, they are available.

I cannot find much info online about the error. Does anyone have an idea what could be wrong? How can I get the two bricks back online?

Cheers,
Olav
Re: [Gluster-users] Planning Update gluster 3.4 to gluster 3.5 on centos 6.4
Thanks Franco, for the feedback! Did you stop gluster before updating? Or were there maybe no active reads/writes, since it was a test system?

Cheers,
Olav

On 15/05/14 02:36, Franco Broi wrote:

On Wed, 2014-05-14 at 12:31 +0200, Olav Peeters wrote:

Hi,

From what I read here: http://www.gluster.org/community/documentation/index.php/Upgrade_to_3.5

... if you are on 3.4.0 AND have NO quota configured, it should be safe to just replace a version-specific /etc/yum.repos.d/glusterfs-epel.repo with e.g.: http://download.gluster.org/pub/gluster/glusterfs/3.4/LATEST/EPEL.repo/glusterfs-epel.repo (thus referring to LATEST and not e.g. http://download.gluster.org/pub/gluster/glusterfs/3.4/3.4.0/EPEL.repo) and just do a:

yum upgrade

to upgrade both your system and glusterfs together, one cluster node at a time (if you are on CentOS or Fedora), right? Has anyone successfully done it this way yet on CentOS 6.4?

Yes. Nothing bad happened, but it was a test system; I've yet to do it for real on our production system.

Cheers,
Olav

On 14/05/14 09:01, Humble Devassy Chirammal wrote:

Please refer # http://www.gluster.org/community/documentation/index.php/Upgrade_to_3.5

--Humble

On Wed, May 14, 2014 at 12:09 PM, Daniel Müller muel...@tropenklinik.de wrote:

Hello to all,

I am planning updating gluster 3.4 to the recent version 3.5. Is there any issue concerning my replicating vols? Or can I simply yum install...

Greetings
Daniel

EDV Daniel Müller
Leitung EDV
Tropenklinik Paul-Lechler-Krankenhaus
Paul-Lechler-Str. 24
72076 Tübingen
Tel.: 07071/206-463, Fax: 07071/206-499
eMail: muel...@tropenklinik.de
Internet: www.tropenklinik.de
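For what it's worth, the per-node sequence being discussed would look roughly like this (my own summary of the thread's advice, not an official procedure; it assumes replica pairs span nodes, so that one node can be down at a time):

# on each node in turn, with no heal in progress:
service glusterd stop
service glusterfsd stop            # stops the brick processes
ps aux | grep gluster              # verify nothing is left running
yum upgrade                        # pulls the new glusterfs from the updated repo
service glusterd start
gluster volume heal VOLNAME info   # let heals settle before the next node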
Re: [Gluster-users] Planning Update gluster 3.4 to gluster 3.5 on centos 6.4
Hi,

From what I read here: http://www.gluster.org/community/documentation/index.php/Upgrade_to_3.5

... if you are on 3.4.0 AND have NO quota configured, it should be safe to just replace a version-specific /etc/yum.repos.d/glusterfs-epel.repo with e.g.: http://download.gluster.org/pub/gluster/glusterfs/3.4/LATEST/EPEL.repo/glusterfs-epel.repo (thus referring to LATEST and not e.g. http://download.gluster.org/pub/gluster/glusterfs/3.4/3.4.0/EPEL.repo) and just do a:

yum upgrade

to upgrade both your system and glusterfs together, one cluster node at a time (if you are on CentOS or Fedora), right? Has anyone successfully done it this way yet on CentOS 6.4?

Cheers,
Olav

On 14/05/14 09:01, Humble Devassy Chirammal wrote:

Please refer # http://www.gluster.org/community/documentation/index.php/Upgrade_to_3.5

--Humble

On Wed, May 14, 2014 at 12:09 PM, Daniel Müller muel...@tropenklinik.de wrote:

Hello to all,

I am planning updating gluster 3.4 to the recent version 3.5. Is there any issue concerning my replicating vols? Or can I simply yum install...

Greetings
Daniel

EDV Daniel Müller
Leitung EDV
Tropenklinik Paul-Lechler-Krankenhaus
Paul-Lechler-Str. 24
72076 Tübingen
Tel.: 07071/206-463, Fax: 07071/206-499
eMail: muel...@tropenklinik.de
Internet: www.tropenklinik.de
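The repo swap described above amounts to something like this on each node (a sketch; the URL is the one quoted in the message, and keeping the old repo file around as a backup is my own precaution):

cd /etc/yum.repos.d
mv glusterfs-epel.repo glusterfs-epel.repo.bak   # keep the version-pinned repo
wget http://download.gluster.org/pub/gluster/glusterfs/3.4/LATEST/EPEL.repo/glusterfs-epel.repo
yum clean metadata
yum upgrade    # upgrades the system and glusterfs together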