Re: [Gluster-users] 3.8.3 Bitrot signature process
Hi Kotresh,

2280 is a brick process; I have not tried with a dist-rep volume. I have not seen any fd in the bitd process on any of the nodes, and bitd CPU usage stays at 0%, only occasionally rising to 0.3%.

Thanks,
Amudhan

On Thursday, September 22, 2016, Kotresh Hiremath Ravishankar < khire...@redhat.com> wrote:

> Hi Amudhan,
>
> No, bitrot signer is a different process by itself and is not part of brick process.
> I believe the process 2280 is a brick process ? Did you check with dist-rep volume?
> Is the same behavior being observed there as well? We need to figure out why brick
> process is holding that fd for such a long time.
>
> Thanks and Regards,
> Kotresh H R
>
> - Original Message -
>> From: "Amudhan P"
>> To: "Kotresh Hiremath Ravishankar"
>> Sent: Wednesday, September 21, 2016 8:15:33 PM
>> Subject: Re: [Gluster-users] 3.8.3 Bitrot signature process
>>
>> Hi Kotresh,
>>
>> As soon as fd closes from brick1 pid, i can see bitrot signature for the
>> file in brick.
>>
>> So, it looks like fd opened by brick process to calculate signature.
>>
>> output of the file:
>>
>> -rw-r--r-- 2 root root 250M Sep 21 18:32
>> /media/disk1/brick1/data/G/test59-bs10M-c100.nul
>>
>> getfattr: Removing leading '/' from absolute path names
>> # file: media/disk1/brick1/data/G/test59-bs10M-c100.nul
>> trusted.bit-rot.signature=0x010200e9474e4cc673c0c227a6e807e04aa4ab1f88d3744243950a290869c53daa65df
>> trusted.bit-rot.version=0x020057d6af3200012a13
>> trusted.ec.config=0x080501000200
>> trusted.ec.size=0x3e80
>> trusted.ec.version=0x1f401f40
>> trusted.gfid=0x4c091145429448468fffe358482c63e1
>>
>> stat /media/disk1/brick1/data/G/test59-bs10M-c100.nul
>>   File: ‘/media/disk1/brick1/data/G/test59-bs10M-c100.nul’
>>   Size: 262144000  Blocks: 512000  IO Block: 4096  regular file
>> Device: 811h/2065d  Inode: 402653311  Links: 2
>> Access: (0644/-rw-r--r--)  Uid: (0/root)  Gid: (0/root)
>> Access: 2016-09-21 18:34:43.722712751 +0530
>> Modify: 2016-09-21 18:32:41.650712946 +0530
>> Change: 2016-09-21 19:14:41.698708914 +0530
>>  Birth: -
>>
>> In other 2 bricks in same set, still signature is not updated for the same
>> file.
>>
>> On Wed, Sep 21, 2016 at 6:48 PM, Amudhan P wrote:
>>
>> > Hi Kotresh,
>> >
>> > I am very sure, No read was going on from mount point.
>> >
>> > Again i did same test but after writing data to mount point. I have
>> > unmounted mount point.
>> >
>> > after 120 seconds i am seeing this file fd entry in brick 1 pid
>> >
>> > getfattr -m. -e hex -d test59-bs10
>> > # file: test59-bs10M-c100.nul
>> > trusted.bit-rot.version=0x020057bed574000ed534
>> > trusted.ec.config=0x080501000200
>> > trusted.ec.size=0x3e80
>> > trusted.ec.version=0x1f401f40
>> > trusted.gfid=0x4c091145429448468fffe358482c63e1
>> >
>> > ls -l /proc/2280/fd
>> > lr-x-- 1 root root 64 Sep 21 13:08 19 -> /media/disk1/brick1/.glusterfs/4c/09/4c091145-4294-4846-8fff-e358482c63e1
>> >
>> > Volume is a EC - 4+1
>> >
>> > On Wed, Sep 21, 2016 at 6:17 PM, Kotresh Hiremath Ravishankar <
>> > khire...@redhat.com> wrote:
>> >
>> >> Hi Amudhan,
>> >>
>> >> If you see the ls output, some process has a fd opened in the backend.
>> >> That is the reason bitrot is not considering for the signing.
>> >> Could you please observe, after 120 secs of closure of
>> >> "/media/disk2/brick2/.glusterfs/6e/7c/6e7c49e6-094e-4435-85bf-f21f99fd8764"
>> >> the signing happens. If so we need to figure out who holds this fd for
>> >> such a long time.
>> >> And also we need to figure is this issue specific to EC volume.
>> >>
>> >> Thanks and Regards,
>> >> Kotresh H R
>> >>
>> >> - Original Message -
>> >> > From: "Amudhan P"
>> >> > To: "Kotresh Hiremath Ravishankar"
>> >> > Cc: "Gluster Users"
>> >> > Sent: Wednesday, September 21, 2016 4:56:40 PM
>> >> > Subject: Re: [Gluster-users] 3.8.3 Bitrot signature process
>> >> >
>> >> > Hi Kotresh,
>> >> >
>> >> > Writing new file.
>> >> >
>> >> > getfattr -m. -e hex -d /media/disk2/brick2/data/G/test58-bs10M-c100.nul
>> >> > getfattr: Removing leading '/' from absolute path names
>> >> > # file: media/disk2/brick2/data/G/test58-bs10M-c100.nul
>> >> > trusted.bit-rot.version=0x020057da8b23000b120e
>> >> > trusted.ec.config=0x080501000200
>> >> > trusted.ec.size=0x3e80
>> >> > trusted.ec.version=0x1f401f40
>> >> > trusted.gfid=0x6e7c49e6094e443585bff21f99fd8764
>> >> >
>> >> > Running ls -l in brick 2 pid
>> >> >
>> >> > ls -l /proc/30162/fd
>> >> >
>> >> > lr-x-- 1 root root 64 Sep 21 16:22 59 ->
>> >> > /media/disk2/brick2/.glusterfs/quanrantine
>> >> > lrwx-- 1 root root 64 Sep 21 16:22 6 ->
Re: [Gluster-users] 3.8.3 Bitrot signature process
Hi Amudhan, No, bitrot signer is a different process by itself and is not part of brick process. I believe the process 2280 is a brick process ? Did you check with dist-rep volume? Is the same behavior being observed there as well? We need to figure out why brick process is holding that fd for such a long time. Thanks and Regards, Kotresh H R - Original Message - > From: "Amudhan P"> To: "Kotresh Hiremath Ravishankar" > Sent: Wednesday, September 21, 2016 8:15:33 PM > Subject: Re: [Gluster-users] 3.8.3 Bitrot signature process > > Hi Kotresh, > > As soon as fd closes from brick1 pid, i can see bitrot signature for the > file in brick. > > So, it looks like fd opened by brick process to calculate signature. > > output of the file: > > -rw-r--r-- 2 root root 250M Sep 21 18:32 > /media/disk1/brick1/data/G/test59-bs10M-c100.nul > > getfattr: Removing leading '/' from absolute path names > # file: media/disk1/brick1/data/G/test59-bs10M-c100.nul > trusted.bit-rot.signature=0x010200e9474e4cc673c0c227a6e807e04aa4ab1f88d3744243950a290869c53daa65df > trusted.bit-rot.version=0x020057d6af3200012a13 > trusted.ec.config=0x080501000200 > trusted.ec.size=0x3e80 > trusted.ec.version=0x1f401f40 > trusted.gfid=0x4c091145429448468fffe358482c63e1 > > stat /media/disk1/brick1/data/G/test59-bs10M-c100.nul > File: ‘/media/disk1/brick1/data/G/test59-bs10M-c100.nul’ > Size: 262144000 Blocks: 512000 IO Block: 4096 regular file > Device: 811h/2065d Inode: 402653311 Links: 2 > Access: (0644/-rw-r--r--) Uid: (0/root) Gid: (0/root) > Access: 2016-09-21 18:34:43.722712751 +0530 > Modify: 2016-09-21 18:32:41.650712946 +0530 > Change: 2016-09-21 19:14:41.698708914 +0530 > Birth: - > > > In other 2 bricks in same set, still signature is not updated for the same > file. > > > On Wed, Sep 21, 2016 at 6:48 PM, Amudhan P wrote: > > > Hi Kotresh, > > > > I am very sure, No read was going on from mount point. > > > > Again i did same test but after writing data to mount point. 
I have > > unmounted mount point. > > > > after 120 seconds i am seeing this file fd entry in brick 1 pid > > > > getfattr -m. -e hex -d test59-bs10 > > # file: test59-bs10M-c100.nul > > trusted.bit-rot.version=0x020057bed574000ed534 > > trusted.ec.config=0x080501000200 > > trusted.ec.size=0x3e80 > > trusted.ec.version=0x1f401f40 > > trusted.gfid=0x4c091145429448468fffe358482c63e1 > > > > > > ls -l /proc/2280/fd > > lr-x-- 1 root root 64 Sep 21 13:08 19 -> /media/disk1/brick1/. > > glusterfs/4c/09/4c091145-4294-4846-8fff-e358482c63e1 > > > > Volume is a EC - 4+1 > > > > On Wed, Sep 21, 2016 at 6:17 PM, Kotresh Hiremath Ravishankar < > > khire...@redhat.com> wrote: > > > >> Hi Amudhan, > >> > >> If you see the ls output, some process has a fd opened in the backend. > >> That is the reason bitrot is not considering for the signing. > >> Could you please observe, after 120 secs of closure of > >> "/media/disk2/brick2/.glusterfs/6e/7c/6e7c49e6-094e-4435- > >> 85bf-f21f99fd8764" > >> the signing happens. If so we need to figure out who holds this fd for > >> such a long time. > >> And also we need to figure is this issue specific to EC volume. > >> > >> Thanks and Regards, > >> Kotresh H R > >> > >> - Original Message - > >> > From: "Amudhan P" > >> > To: "Kotresh Hiremath Ravishankar" > >> > Cc: "Gluster Users" > >> > Sent: Wednesday, September 21, 2016 4:56:40 PM > >> > Subject: Re: [Gluster-users] 3.8.3 Bitrot signature process > >> > > >> > Hi Kotresh, > >> > > >> > > >> > Writing new file. > >> > > >> > getfattr -m. 
-e hex -d /media/disk2/brick2/data/G/test58-bs10M-c100.nul > >> > getfattr: Removing leading '/' from absolute path names > >> > # file: media/disk2/brick2/data/G/test58-bs10M-c100.nul > >> > trusted.bit-rot.version=0x020057da8b23000b120e > >> > trusted.ec.config=0x080501000200 > >> > trusted.ec.size=0x3e80 > >> > trusted.ec.version=0x1f401f40 > >> > trusted.gfid=0x6e7c49e6094e443585bff21f99fd8764 > >> > > >> > > >> > Running ls -l in brick 2 pid > >> > > >> > ls -l /proc/30162/fd > >> > > >> > lr-x-- 1 root root 64 Sep 21 16:22 59 -> > >> > /media/disk2/brick2/.glusterfs/quanrantine > >> > lrwx-- 1 root root 64 Sep 21 16:22 6 -> > >> > /var/lib/glusterd/vols/glsvol1/run/10.1.2.2-media-disk2-brick2.pid > >> > lr-x-- 1 root root 64 Sep 21 16:25 60 -> > >> > /media/disk2/brick2/.glusterfs/6e/7c/6e7c49e6-094e-4435- > >> 85bf-f21f99fd8764 > >> > lr-x-- 1 root root 64 Sep 21 16:22 61 -> > >> > /media/disk2/brick2/.glusterfs/quanrantine > >> > > >> > > >> > find /media/disk2/ -samefile > >> > /media/disk2/brick2/.glusterfs/6e/7c/6e7c49e6-094e-4435- > >> 85bf-f21f99fd8764
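For anyone scripting the check being done by hand in this thread, here is a minimal sketch (my own helper, not a gluster tool) that parses `getfattr -m. -e hex -d` output and reports whether the bitrot signature xattr has appeared yet. The attribute names and the sample values are taken from the outputs above.

```python
def parse_getfattr(output):
    """Parse `getfattr -m. -e hex -d <file>` output into a {name: hex_value} dict."""
    attrs = {}
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue  # skip the "# file: ..." header and blank/diagnostic lines
        name, _, value = line.partition("=")
        attrs[name] = value
    return attrs

def is_signed(output):
    """True once the bitrot signer has written trusted.bit-rot.signature."""
    return "trusted.bit-rot.signature" in parse_getfattr(output)

# Sample taken from the getfattr output earlier in this thread.
signed = """\
# file: media/disk1/brick1/data/G/test59-bs10M-c100.nul
trusted.bit-rot.signature=0x010200e9474e4cc673c0c227a6e807e04aa4ab1f88d3744243950a290869c53daa65df
trusted.bit-rot.version=0x020057d6af3200012a13
trusted.gfid=0x4c091145429448468fffe358482c63e1
"""
print(is_signed(signed))  # -> True
```

Polling this per brick (e.g. via ssh in a loop) would show exactly when the signature lands relative to the fd closing.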
Re: [Gluster-users] gluster 3.7 healing errors (no data available, buf->ia_gfid is null)
On 09/21/2016 10:54 PM, Pasi Kärkkäinen wrote:
> Let's see.
>
> # getfattr -m . -d -e hex /bricks/vol1/brick1/foo
> getfattr: Removing leading '/' from absolute path names
> # file: bricks/vol1/brick1/foo
> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000
>
> So hmm.. no trusted.gfid it seems.. is that perhaps because this node was down when the file was created?

No, even if that were the case, the gfid should have been set while healing the file to this node. Can you try doing a
setfattr -n trusted.gfid -v 0xc1ca778ed2af4828b981171c0c5bd45e
on the file, and launch heal again? What about the .glusterfs hardlink - does that exist?
-Ravi

> On another node:
>
> # getfattr -m . -d -e hex /bricks/vol1/brick1/foo
> getfattr: Removing leading '/' from absolute path names
> # file: bricks/vol1/brick1/foo
> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
> trusted.afr.dirty=0x
> trusted.afr.gvol1-client-1=0x16620001
> trusted.bit-rot.version=0x020057e00db5000624ed
> trusted.gfid=0xc1ca778ed2af4828b981171c0c5bd45e
>
> So there we have the gfid.. How do I fix this and allow healing process to continue/finish.. ?
>
> Thanks,
>
> -- Pasi

___ Gluster-users mailing list Gluster-users@gluster.org http://www.gluster.org/mailman/listinfo/gluster-users
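The .glusterfs hardlink Ravi mentions is derived deterministically from the gfid: the first two bytes of the UUID become the two directory levels. A small sketch of that mapping (my own helper, not a gluster tool), using the gfid value from this thread:

```python
import uuid

def gfid_to_backend_path(gfid_hex):
    """Map a trusted.gfid hex xattr value to its .glusterfs hardlink path
    relative to the brick root."""
    raw = gfid_hex[2:] if gfid_hex.startswith("0x") else gfid_hex
    g = str(uuid.UUID(raw))  # canonical dashed form, e.g. c1ca778e-d2af-...
    return f".glusterfs/{g[:2]}/{g[2:4]}/{g}"

# The gfid seen on the healthy node in this thread:
print(gfid_to_backend_path("0xc1ca778ed2af4828b981171c0c5bd45e"))
# -> .glusterfs/c1/ca/c1ca778e-d2af-4828-b981-171c0c5bd45e
```

After setting the gfid with setfattr, checking that this path exists (and has link count >= 2 with the data file) confirms the backend entry is complete.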
Re: [Gluster-users] incorrect usage value on a directory
Great! Thank you, Manikandan. > On Sep 15, 2016, at 6:23 AM, Raghavendra Gowdappawrote: > > Hi Sergei, > > You can set marker "dirty" xattr using key trusted.glusterfs.quota.dirty. You > have two choices: > > 1. Setting through a gluster mount. This will set key on _all_ bricks. > > [root@unused personal]# gluster volume info > No volumes present > [root@unused personal]# rm -rf /home/export/ptop-1 && gluster volume create > ptop-1 booradley:/home/export/ptop-1/ > volume create: ptop-1: success: please start the volume to access data > [root@unused personal]# gluster volume start ptop-1 > volume start: ptop-1: success > > > [root@unused personal]# mount -t glusterfs booradley:/ptop-1 /mnt/glusterfs > [root@unused personal]# cd /mnt/glusterfs > [root@unused glusterfs]# ls > [root@unused glusterfs]# mkdir dir > [root@unused glusterfs]# ls > dir > [root@unused glusterfs]# setfattr -n trusted.glusterfs.quota.dirty -v 1 dir > [root@unused glusterfs]# getfattr -e hex -m . -d dir > # file: dir > security.selinux=0x73797374656d5f753a6f626a6563745f723a6675736566735f743a733000 > > [root@unused glusterfs]# getfattr -e hex -m . -d /home/export/ptop-1/dir/ > getfattr: Removing leading '/' from absolute path names > # file: home/export/ptop-1/dir/ > security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a686f6d655f726f6f745f743a733000 > trusted.gfid=0xbea41d7780e4445e93dc379b0a43bb7a > trusted.glusterfs.dht=0x0001 > trusted.glusterfs.quota.dirty=0x31 > > 2. If you find usage wrong only on an individual brick, you can just set the > xattr on the backend directly. 
For eg., in the volume above, we can also do, > setfattr -n trusted.glusterfs.quota.dirty -v 1 /home/export/ptop-1/dir > > regards, > Raghavendra > > - Original Message - >> From: "Manikandan Selvaganesh" >> To: "Sergei Gerasenko" >> Cc: "Sergei Gerasenko" , "gluster-users" >> >> Sent: Tuesday, August 30, 2016 10:57:33 PM >> Subject: Re: [Gluster-users] incorrect usage value on a directory >> >> Hi Sergei, >> >> Apologies for the delay. I am extremely sorry, I was struck on something >> important >> It's great that you figured out the solution. >> >> Whenever you set a dirty flag as mentioned in the previous thread, the quota >> values will be recalcualted. >> Yep, as you mentioned there are lot of changes that has gone in from 3.7. We >> have >> introduced Inode-quota feature in 3.7, then we have implemented the Quota >> versioning >> in 3.7.5 and then enhance quota enable/disable feature in 3.7.12. So a lot of >> code changes >> has been done. >> >> In case would you like to know more, you can refer our specs[1]. >> >> [1] https://github.com/gluster/glusterfs-specs >> >> On Tue, Aug 30, 2016 at 9:27 PM, Sergei Gerasenko < gera...@gmail.com > >> wrote: >> >> >> >> The problem must have started because of an upgrade to 3.7.12 from an older >> version. Not sure exactly how. >> >> >> >> >> On Aug 30, 2016, at 10:44 AM, Sergei Gerasenko < gera...@gmail.com > wrote: >> >> It seems that it did the trick. The usage is being recalculated. I’m glad to >> be posting a solution to the original problem on this thread. It’s so >> frequent that threads contain only incomplete or partially complete >> solutions. 
>> >> Thanks, >> Sergei >> >> >> >> >> On Aug 29, 2016, at 3:41 PM, Sergei Gerasenko < sgerasenk...@gmail.com > >> wrote: >> >> I found an informative thread on a similar problem: >> >> http://www.spinics.net/lists/gluster-devel/msg18400.html >> >> According to the thread, it seems that the solution is to disable the quota, >> which will clear the relevant xattrs and then re-enable the quota which >> should force a recalc. I will try this tomorrow. >> >> On Thu, Aug 11, 2016 at 9:31 AM, Sergei Gerasenko < gera...@gmail.com > >> wrote: >> >> >> >> Hi Selvaganesh, >> >> Thanks so much for your help. I didn’t have that option on probably because I >> originally had a lower version of cluster and then upgraded. I turned the >> option on just now. >> >> The usage is still off. Should I wait a certain time? >> >> Thanks, >> Sergei >> >> >> >> >> On Aug 9, 2016, at 7:26 AM, Manikandan Selvaganesh < mselv...@redhat.com > >> wrote: >> >> Hi Sergei, >> >> When quota is enabled, quota-deem-statfs should be set to ON(By default with >> the recent versions). But apparently >> from your 'gluster v info' output, it is like quota-deem-statfs is not on. >> >> Could you please check and confirm the same on >> /var/lib/glusterd/vols//info. If you do not find an option >> 'features.quota-deem-statfs=on', then this feature is turned off. Did you >> turn off this one? You could turn it on by doing this >> 'gluster volume set quota-deem-statfs on'. >> >> To know more about this feature, please refer here[1] >> >> [1] >>
[Gluster-users] write performance with NIC bonding
Hi,

I'm using gluster 3.7.5 and I'm trying to get port bonding working properly with the gluster protocol. I've bonded the NICs using round-robin because I also bond at the switch level with link aggregation. I've used this type of bonding without a problem with my other applications, but for some reason gluster does not want to utilize all 3 NICs for writes, though it does for reads... have any of you come across this, or know why?

Here's the traffic output for the NICs; you can see that RX is unbalanced but TX is completely balanced across the 3 NICs. I've tried mounting via both glusterfs and nfs; both result in the same imbalance. Am I missing some configuration?

root@e-gluster-01:~# ifconfig
bond0  Link encap:Ethernet
       inet addr:  Bcast:128.33.23.255  Mask:255.255.248.0
       inet6 addr: fe80::46a8:42ff:fe43:8817/64 Scope:Link
       UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
       RX packets:160972852 errors:0 dropped:0 overruns:0 frame:0
       TX packets:122295229 errors:0 dropped:0 overruns:0 carrier:0
       collisions:0 txqueuelen:0
       RX bytes:152800624950 (142.3 GiB)  TX bytes:138720356365 (129.1 GiB)

em1    Link encap:Ethernet
       UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
       RX packets:160793725 errors:0 dropped:0 overruns:0 frame:0
       TX packets:40763142 errors:0 dropped:0 overruns:0 carrier:0
       collisions:0 txqueuelen:1000
       RX bytes:152688146880 (142.2 GiB)  TX bytes:46239971255 (43.0 GiB)
       Interrupt:41

em2    Link encap:Ethernet
       UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
       RX packets:92451 errors:0 dropped:0 overruns:0 frame:0
       TX packets:40750031 errors:0 dropped:0 overruns:0 carrier:0
       collisions:0 txqueuelen:1000
       RX bytes:9001370 (8.5 MiB)  TX bytes:46216513162 (43.0 GiB)
       Interrupt:45

em3    Link encap:Ethernet
       UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
       RX packets:86676 errors:0 dropped:0 overruns:0 frame:0
       TX packets:40782056 errors:0 dropped:0 overruns:0 carrier:0
       collisions:0 txqueuelen:1000
       RX bytes:103476700 (98.6 MiB)  TX bytes:46263871948 (43.0 GiB)
       Interrupt:40
___ Gluster-users mailing list Gluster-users@gluster.org http://www.gluster.org/mailman/listinfo/gluster-users
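A note on the numbers above: with balance-rr the host spreads its own transmits across slaves, but the receive side is decided by the switch's link-aggregation hashing, which typically pins each peer's flows to a single slave. Quantifying the imbalance from the RX/TX byte counters in the post (a quick sketch, values copied from the ifconfig output):

```python
# RX/TX byte counters copied from the ifconfig output above (bond0 slaves).
rx = {"em1": 152688146880, "em2": 9001370, "em3": 103476700}
tx = {"em1": 46239971255, "em2": 46216513162, "em3": 46263871948}

def shares(counters):
    """Percentage of total traffic carried by each slave NIC."""
    total = sum(counters.values())
    return {nic: round(100.0 * v / total, 1) for nic, v in counters.items()}

print(shares(rx))  # em1 carries ~99.9% of inbound traffic
print(shares(tx))  # outbound is split almost evenly, ~33% per NIC
```

That pattern - TX balanced, RX concentrated on one slave - is what you would expect if the switch hashes all of the peers' flows to the same egress port.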
Re: [Gluster-users] EC clarification
> 2016-09-21 20:56 GMT+02:00 Serkan Çoban: > > Then you can use 8+3 with 11 servers. > > Stripe size won't be good: 512*(8-3) = 2560 and not 2048 (or multiple) It's not really 512*(8+3) though. Even though there are 11 fragments, they only contain 8 fragments' worth of data. They just encode it with enough redundancy that *any* 8 contains the whole. ___ Gluster-users mailing list Gluster-users@gluster.org http://www.gluster.org/mailman/listinfo/gluster-users
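As I read the correction above together with Serkan's advice elsewhere in the thread: each fragment is written in 512-byte blocks, an n+k set carries 512*n bytes of user data per stripe (the k redundancy fragments add no capacity), and n is best chosen as a power of two. A quick sketch of that arithmetic (my own helper, not a gluster tool):

```python
def ec_layout(total_bricks, redundancy):
    """For an n+k disperse set, return (data fragments n, full data stripe
    in bytes, whether n is a power of two).

    Each fragment is written in 512-byte blocks, so the set carries
    512 * n bytes of user data per stripe; any n of the n+k fragments
    recover the whole.
    """
    n = total_bricks - redundancy
    stripe = 512 * n
    return n, stripe, (n & (n - 1)) == 0

print(ec_layout(11, 3))  # 8+3 -> (8, 4096, True): stripe aligns with 4 KiB
print(ec_layout(10, 2))  # 8+2 -> (8, 4096, True)
print(ec_layout(10, 3))  # 7+3 -> (7, 3584, False): n is not a power of two
```

On this reading, 8+3 on 11 servers gives both the 3-brick redundancy requested and a 4 KiB-aligned data stripe.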
Re: [Gluster-users] EC clarification
2016-09-21 20:56 GMT+02:00 Serkan Çoban: > Then you can use 8+3 with 11 servers. Stripe size won't be good: 512*(8-3) = 2560 and not 2048 (or multiple) ___ Gluster-users mailing list Gluster-users@gluster.org http://www.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] EC clarification
Then you can use 8+3 with 11 servers.

On Wed, Sep 21, 2016 at 9:17 PM, Gandalf Corvotempesta wrote:
> 2016-09-21 16:13 GMT+02:00 Serkan Çoban:
>> 8+2 is recommended for 10 servers. From n+k servers it will be good to
>> choose n with a power of 2(4,8,16,vs)
>> You need to add 10 bricks if you want to extend the volume.
>
> 8+2 means at least 2 failed bricks, right?
> That's too low. I need at least 3 bricks

___ Gluster-users mailing list Gluster-users@gluster.org http://www.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] EC clarification
2016-09-21 16:13 GMT+02:00 Serkan Çoban: > 8+2 is recommended for 10 servers. From n+k servers it will be good to > choose n with a power of 2(4,8,16,vs) > You need to add 10 bricks if you want to extend the volume. 8+2 means at least 2 failed bricks, right? That's too low. I need at least 3 bricks ___ Gluster-users mailing list Gluster-users@gluster.org http://www.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] gluster 3.7 healing errors (no data available, buf->ia_gfid is null)
Hi, On Wed, Sep 21, 2016 at 10:12:44PM +0530, Ravishankar N wrote: > On 09/21/2016 06:45 PM, Pasi Kärkkäinen wrote: > >Hello, > > > >I have a pretty basic two-node gluster 3.7 setup, with a volume > >replicated/mirrored to both servers. > > > >One of the servers was down for hardware maintenance, and later when it got > >back up, the healing process started, re-syncing files. > >In the beginning there was some 200 files that need to be synced, and now > >the number of files is down to 10, but it seems the last 10 files don't seem > >to get synced.. > > > >So the problem is the healing/re-sync never ends for these files.. > > > > > ># gluster volume heal gvol1 info > >Brick gnode1:/bricks/vol1/brick1 > >/foo > >/ - Possibly undergoing heal > > > >/foo6 > >/foo8 > >/foo7 > >/foo9 > >/foo2 > >/foo5 > >/foo4 > >/foo3 > >Status: Connected > >Number of entries: 10 > > > >Brick gnode2:/bricks/vol1/brick1 > >/ > >Status: Connected > >Number of entries: 1 > > > > > >In the brick logs for the volume I see these errors repeating: > > > >[2016-09-21 12:41:43.063209] E [MSGID: 113002] [posix.c:252:posix_lookup] > >0-gvol1-posix: buf->ia_gfid is null for /bricks/vol1/brick1/foo [No data > >available] > >[2016-09-21 12:41:43.063266] E [MSGID: 115050] > >[server-rpc-fops.c:179:server_lookup_cbk] 0-gvol1-server: 1484202: LOOKUP > >/foo (----0001/foo) ==> (No data available) [No > >data available] > > > > > >Any idea what might cause those errors? (/foo is exactly the file that is > >being healed, but fails to heal) > >Any tricks to try? > > Can you check if the 'trusted.gfid' xattr is present for those files > on the bricks and the files also have the associated hardlink inside > .glusterfs? You can refer to > https://joejulian.name/blog/what-is-this-new-glusterfs-directory-in-33/ > if you are not familiar with the .glusterfs directory. > Let's see. # getfattr -m . 
-d -e hex /bricks/vol1/brick1/foo getfattr: Removing leading '/' from absolute path names # file: bricks/vol1/brick1/foo security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000 So hmm.. no trusted.gfid it seems.. is that perhaps because this node was down when the file was created? On another node: # getfattr -m . -d -e hex /bricks/vol1/brick1/foo getfattr: Removing leading '/' from absolute path names # file: bricks/vol1/brick1/foo security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000 trusted.afr.dirty=0x trusted.afr.gvol1-client-1=0x16620001 trusted.bit-rot.version=0x020057e00db5000624ed trusted.gfid=0xc1ca778ed2af4828b981171c0c5bd45e So there we have the gfid.. How do I fix this and allow healing process to continue/finish.. ? Thanks, -- Pasi > -Ravi > > > > >Software versions: CentOS 7 with gluster37 repo (running Gluster 3.7.15), > >and nfs-ganesha 2.3.3. > > > > > >Thanks a lot, > > > >-- Pasi > > ___ Gluster-users mailing list Gluster-users@gluster.org http://www.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] gluster 3.7 healing errors (no data available, buf->ia_gfid is null)
On 09/21/2016 06:45 PM, Pasi Kärkkäinen wrote: Hello, I have a pretty basic two-node gluster 3.7 setup, with a volume replicated/mirrored to both servers. One of the servers was down for hardware maintenance, and later when it got back up, the healing process started, re-syncing files. In the beginning there was some 200 files that need to be synced, and now the number of files is down to 10, but it seems the last 10 files don't seem to get synced.. So the problem is the healing/re-sync never ends for these files.. # gluster volume heal gvol1 info Brick gnode1:/bricks/vol1/brick1 /foo / - Possibly undergoing heal /foo6 /foo8 /foo7 /foo9 /foo2 /foo5 /foo4 /foo3 Status: Connected Number of entries: 10 Brick gnode2:/bricks/vol1/brick1 / Status: Connected Number of entries: 1 In the brick logs for the volume I see these errors repeating: [2016-09-21 12:41:43.063209] E [MSGID: 113002] [posix.c:252:posix_lookup] 0-gvol1-posix: buf->ia_gfid is null for /bricks/vol1/brick1/foo [No data available] [2016-09-21 12:41:43.063266] E [MSGID: 115050] [server-rpc-fops.c:179:server_lookup_cbk] 0-gvol1-server: 1484202: LOOKUP /foo (----0001/foo) ==> (No data available) [No data available] Any idea what might cause those errors? (/foo is exactly the file that is being healed, but fails to heal) Any tricks to try? Can you check if the 'trusted.gfid' xattr is present for those files on the bricks and the files also have the associated hardlink inside .glusterfs? You can refer to https://joejulian.name/blog/what-is-this-new-glusterfs-directory-in-33/ if you are not familiar with the .glusterfs directory. -Ravi Software versions: CentOS 7 with gluster37 repo (running Gluster 3.7.15), and nfs-ganesha 2.3.3. Thanks a lot, -- Pasi ___ Gluster-users mailing list Gluster-users@gluster.org http://www.gluster.org/mailman/listinfo/gluster-users ___ Gluster-users mailing list Gluster-users@gluster.org http://www.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] EC clarification
8+2 is recommended for 10 servers. From n+k servers it will be good to choose n with a power of 2(4,8,16,vs) You need to add 10 bricks if you want to extend the volume. On Wed, Sep 21, 2016 at 4:12 PM, Gandalf Corvotempestawrote: > 2016-09-21 14:42 GMT+02:00 Xavier Hernandez : >> You *must* ensure that *all* bricks forming a single disperse set are placed >> in a different server. There are no 4 special fragments. All fragments have >> the same importance. The way to do that is ordering them when the volume is >> created: >> >> gluster volume create test disperse 16 redundancy 4 >> server{1..20}:/bricks/test1 server{1..20}:/bricks/test2 >> server{1..20}:/bricks/test3 >> >> This way all 20 fragments from each disperse set will be placed in a >> different server. However each server will have 3 bricks and no fragment >> from a single file will be stored in more than one brick of each server. > > Now it's clear. > So, at very minimum, EC is good starting from 7+3 (10 servers with 1 > brick each) because: 512*(7-3) = 2048 > Any smaller combinations would mean less redundancy (6+2) or not > optimized stripe size like 512*(5-3)=1024 > > Is this correct ? And what if I have to add some brick to the current > servers or I have to add new servers? > Can I add them freely or I have to follow some rules ? > ___ > Gluster-users mailing list > Gluster-users@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-users ___ Gluster-users mailing list Gluster-users@gluster.org http://www.gluster.org/mailman/listinfo/gluster-users
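Xavier's brick-ordering rule from the quoted message can be sketched programmatically. This is a hypothetical helper, not part of gluster - it just emits bricks in the order his `server{1..20}:/bricks/test1 server{1..20}:/bricks/test2 ...` example describes and checks that every disperse set lands on distinct servers (the mount-point naming is an assumption borrowed from his example):

```python
def ordered_bricks(servers, bricks_per_server, mount="/bricks/test"):
    """Emit bricks with mount points outermost and servers innermost, so each
    consecutive disperse set of len(servers) bricks lands on distinct servers."""
    return [f"{s}:{mount}{b}" for b in range(1, bricks_per_server + 1) for s in servers]

servers = [f"server{i}" for i in range(1, 21)]
bricks = ordered_bricks(servers, 3)

set_size = 20  # disperse 16 redundancy 4 -> 20 bricks per set
for i in range(0, len(bricks), set_size):
    hosts = {b.split(":")[0] for b in bricks[i:i + set_size]}
    assert len(hosts) == set_size  # no server holds two fragments of one set

print(bricks[0], bricks[20], bricks[40])
# -> server1:/bricks/test1 server1:/bricks/test2 server1:/bricks/test3
```

The emitted list is exactly the argument order for `gluster volume create`: each server contributes one brick to each of the three disperse sets, so losing one server costs each set only one fragment.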
[Gluster-users] gluster 3.7 healing errors (no data available, buf->ia_gfid is null)
Hello, I have a pretty basic two-node gluster 3.7 setup, with a volume replicated/mirrored to both servers. One of the servers was down for hardware maintenance, and later when it got back up, the healing process started, re-syncing files. In the beginning there was some 200 files that need to be synced, and now the number of files is down to 10, but it seems the last 10 files don't seem to get synced.. So the problem is the healing/re-sync never ends for these files.. # gluster volume heal gvol1 info Brick gnode1:/bricks/vol1/brick1 /foo / - Possibly undergoing heal /foo6 /foo8 /foo7 /foo9 /foo2 /foo5 /foo4 /foo3 Status: Connected Number of entries: 10 Brick gnode2:/bricks/vol1/brick1 / Status: Connected Number of entries: 1 In the brick logs for the volume I see these errors repeating: [2016-09-21 12:41:43.063209] E [MSGID: 113002] [posix.c:252:posix_lookup] 0-gvol1-posix: buf->ia_gfid is null for /bricks/vol1/brick1/foo [No data available] [2016-09-21 12:41:43.063266] E [MSGID: 115050] [server-rpc-fops.c:179:server_lookup_cbk] 0-gvol1-server: 1484202: LOOKUP /foo (----0001/foo) ==> (No data available) [No data available] Any idea what might cause those errors? (/foo is exactly the file that is being healed, but fails to heal) Any tricks to try? Software versions: CentOS 7 with gluster37 repo (running Gluster 3.7.15), and nfs-ganesha 2.3.3. Thanks a lot, -- Pasi ___ Gluster-users mailing list Gluster-users@gluster.org http://www.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] Weekly Community Meeting - 21-Sep-2016
This week's meeting started slow but snowballed into quite an active meeting. Thank you to all who attended! The meeting logs are available at the links below, and the minutes have been pasted at the end.

- Minutes: https://meetbot.fedoraproject.org/gluster-meeting/2016-09-21/weekly_community_meeting_21-sep-2016.2016-09-21-11.59.html
- Minutes (text): https://meetbot.fedoraproject.org/gluster-meeting/2016-09-21/weekly_community_meeting_21-sep-2016.2016-09-21-11.59.txt
- Log: https://meetbot.fedoraproject.org/gluster-meeting/2016-09-21/weekly_community_meeting_21-sep-2016.2016-09-21-11.59.log.html

Next week's meeting will be hosted by Samikshan. See you all next week, same place, same time.

Cheers,
Kaushal

Meeting summary
---
* Roll Call (kshlm, 12:00:06)
* Next weeks host (kshlm, 12:08:57)
* samikshan is next weeks host (kshlm, 12:10:19)
* Project Infrastructure (kshlm, 12:10:26)
* GlusterFS-4.0 (kshlm, 12:15:50)
* LINK: https://www.gluster.org/pipermail/gluster-devel/2016-September/050928.html (kshlm, 12:18:55)
* GlusterFS-3.9 (kshlm, 12:21:30)
* GlusterFS-3.8 (kshlm, 12:27:05)
* GlusterFS-3.7 (kshlm, 12:29:45)
* NFS Ganesha (kshlm, 12:34:27)
* Samba (kshlm, 12:37:34)
* Last weeks AIs (kshlm, 12:39:14)
* rastar_afk/ndevos/jdarcy to improve cleanup to control the processes that test starts. (kshlm, 12:39:26)
* ACTION: rastar_afk/ndevos/jdarcy to improve cleanup to control the processes that test starts. (kshlm, 12:40:27)
* RC tagging to be done by this week for 3.9 by aravindavk. (kshlm, 12:41:47)
* RC tagging to be done by this week for 3.9 by aravindavk/pranithk (kshlm, 12:42:19)
* ACTION: RC tagging to be done by this week for 3.9 by aravindavk/pranithk (kshlm, 12:42:34)
* jdarcy will bug amye regarding a public announcement for Gluster Summit talks (kshlm, 12:42:39)
* LINK: https://www.gluster.org/pipermail/gluster-devel/2016-September/050888.html (kshlm, 12:43:27)
* Open floor (kshlm, 12:43:42)
* RHEL5 build issues (kshlm, 12:43:58)
* LINK: https://www.gluster.org/pipermail/gluster-infra/2016-September/002821.html (kshlm, 13:01:52)
* Updates on documentation (kshlm, 13:02:06)
* LINK: https://rajeshjoseph.gitbooks.io/test-guide/content/ (rjoseph, 13:03:51)
* LINK: https://github.com/rajeshjoseph/doctest (rjoseph, 13:06:23)
* Announcements (kshlm, 13:08:02)

Meeting ended at 13:08:26 UTC.

Action Items
* rastar_afk/ndevos/jdarcy to improve cleanup to control the processes that test starts.
* RC tagging to be done by this week for 3.9 by aravindavk/pranithk

Action Items, by person
---
* aravindavk
* RC tagging to be done by this week for 3.9 by aravindavk/pranithk
* ndevos
* rastar_afk/ndevos/jdarcy to improve cleanup to control the processes that test starts.
* **UNASSIGNED**
* (none)

People Present (lines said)
---
* kshlm (156)
* nigelb (42)
* ndevos (27)
* kkeithley (22)
* misc (20)
* rjoseph (20)
* aravindavk (11)
* samikshan (4)
* zodbot (4)
* amye (4)
* ankitraj (1)
* Klas (1)

___ Gluster-users mailing list Gluster-users@gluster.org http://www.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] EC clarification
2016-09-21 14:42 GMT+02:00 Xavier Hernandez: > You *must* ensure that *all* bricks forming a single disperse set are placed > in a different server. There are no 4 special fragments. All fragments have > the same importance. The way to do that is ordering them when the volume is > created: > > gluster volume create test disperse 16 redundancy 4 > server{1..20}:/bricks/test1 server{1..20}:/bricks/test2 > server{1..20}:/bricks/test3 > > This way all 20 fragments from each disperse set will be placed in a > different server. However each server will have 3 bricks and no fragment > from a single file will be stored in more than one brick of each server. Now it's clear. So, at very minimum, EC is good starting from 7+3 (10 servers with 1 brick each) because: 512*(7-3) = 2048 Any smaller combinations would mean less redundancy (6+2) or not optimized stripe size like 512*(5-3)=1024 Is this correct ? And what if I have to add some brick to the current servers or I have to add new servers? Can I add them freely or I have to follow some rules ? ___ Gluster-users mailing list Gluster-users@gluster.org http://www.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] 3.8.3 Bitrot signature process
Hi Amudhan, If you see the ls output, some process has a fd opened in the backend. That is the reason bitrot is not considering for the signing. Could you please observe, after 120 secs of closure of "/media/disk2/brick2/.glusterfs/6e/7c/6e7c49e6-094e-4435-85bf-f21f99fd8764" the signing happens. If so we need to figure out who holds this fd for such a long time. And also we need to figure is this issue specific to EC volume. Thanks and Regards, Kotresh H R - Original Message - > From: "Amudhan P"> To: "Kotresh Hiremath Ravishankar" > Cc: "Gluster Users" > Sent: Wednesday, September 21, 2016 4:56:40 PM > Subject: Re: [Gluster-users] 3.8.3 Bitrot signature process > > Hi Kotresh, > > > Writing new file. > > getfattr -m. -e hex -d /media/disk2/brick2/data/G/test58-bs10M-c100.nul > getfattr: Removing leading '/' from absolute path names > # file: media/disk2/brick2/data/G/test58-bs10M-c100.nul > trusted.bit-rot.version=0x020057da8b23000b120e > trusted.ec.config=0x080501000200 > trusted.ec.size=0x3e80 > trusted.ec.version=0x1f401f40 > trusted.gfid=0x6e7c49e6094e443585bff21f99fd8764 > > > Running ls -l in brick 2 pid > > ls -l /proc/30162/fd > > lr-x-- 1 root root 64 Sep 21 16:22 59 -> > /media/disk2/brick2/.glusterfs/quanrantine > lrwx-- 1 root root 64 Sep 21 16:22 6 -> > /var/lib/glusterd/vols/glsvol1/run/10.1.2.2-media-disk2-brick2.pid > lr-x-- 1 root root 64 Sep 21 16:25 60 -> > /media/disk2/brick2/.glusterfs/6e/7c/6e7c49e6-094e-4435-85bf-f21f99fd8764 > lr-x-- 1 root root 64 Sep 21 16:22 61 -> > /media/disk2/brick2/.glusterfs/quanrantine > > > find /media/disk2/ -samefile > /media/disk2/brick2/.glusterfs/6e/7c/6e7c49e6-094e-4435-85bf-f21f99fd8764 > /media/disk2/brick2/.glusterfs/6e/7c/6e7c49e6-094e-4435-85bf-f21f99fd8764 > /media/disk2/brick2/data/G/test58-bs10M-c100.nul > > > > On Wed, Sep 21, 2016 at 3:28 PM, Kotresh Hiremath Ravishankar < > khire...@redhat.com> wrote: > > > Hi Amudhan, > > > > Don't grep for the filename, glusterfs maintains hardlink in .glusterfs 
> > directory > > for each file. Just check 'ls -l /proc//fd' for any > > fds opened > > for a file in .glusterfs and check if it's the same file. > > > > Thanks and Regards, > > Kotresh H R > > > > - Original Message - > > > From: "Amudhan P" > > > To: "Kotresh Hiremath Ravishankar" > > > Cc: "Gluster Users" > > > Sent: Wednesday, September 21, 2016 1:33:10 PM > > > Subject: Re: [Gluster-users] 3.8.3 Bitrot signature process > > > > > > Hi Kotresh, > > > > > > i have used below command to verify any open fd for file. > > > > > > "ls -l /proc/*/fd | grep filename". > > > > > > as soon as write completes there no open fd's, if there is any alternate > > > option. please let me know will also try that. > > > > > > > > > > > > > > > Also, below is my scrub status in my test setup. number of skipped files > > > slow reducing day by day. I think files are skipped due to bitrot > > signature > > > process is not completed yet. > > > > > > where can i see scrub skipped files? > > > > > > > > > Volume name : glsvol1 > > > > > > State of scrub: Active (Idle) > > > > > > Scrub impact: normal > > > > > > Scrub frequency: daily > > > > > > Bitrot error log location: /var/log/glusterfs/bitd.log > > > > > > Scrubber error log location: /var/log/glusterfs/scrub.log > > > > > > > > > = > > > > > > Node: localhost > > > > > > Number of Scrubbed files: 1644 > > > > > > Number of Skipped files: 1001 > > > > > > Last completed scrub time: 2016-09-20 11:59:58 > > > > > > Duration of last scrub (D:M:H:M:S): 0:0:39:26 > > > > > > Error count: 0 > > > > > > > > > = > > > > > > Node: 10.1.2.3 > > > > > > Number of Scrubbed files: 1644 > > > > > > Number of Skipped files: 1001 > > > > > > Last completed scrub time: 2016-09-20 10:50:00 > > > > > > Duration of last scrub (D:M:H:M:S): 0:0:38:17 > > > > > > Error count: 0 > > > > > > > > > = > > > > > > Node: 10.1.2.4 > > > > > > Number of Scrubbed files: 981 > > > > > > Number of Skipped files: 1664 > > > > > > Last completed scrub time: 
2016-09-20 12:38:01 > > > > > > Duration of last scrub (D:M:H:M:S): 0:0:35:19 > > > > > > Error count: 0 > > > > > > > > > = > > > > > > Node: 10.1.2.1 > > > > > > Number of Scrubbed files: 1263 > > > > > > Number of Skipped files: 1382 > > > > > > Last completed scrub time: 2016-09-20 11:57:21 > > > > > > Duration of last scrub (D:M:H:M:S): 0:0:37:17 > > > > > > Error count: 0 > > > > > > > > > = > > > > > > Node: 10.1.2.2 > > > > > > Number of Scrubbed files: 1644 > > >
Re: [Gluster-users] EC clarification
On 21/09/16 14:36, Gandalf Corvotempesta wrote:
> On 01 Sep 2016 10:18 AM, "Xavier Hernandez" wrote:
>> If you put more than one fragment into the same server, you will lose
>> all the fragments if the server goes down. If there are more than 4
>> fragments on that server, the file will be unrecoverable until the
>> server is brought up again.
>>
>> Putting more than one fragment into a single server only makes sense to
>> account for disk failures, since the protection against server failures
>> is lower.
>
> Exactly, what I would like to ensure is that the 4 segments needed for
> recovery are placed automatically on at least 4 servers (and not on 4
> different bricks that could be on the same server)

You *must* ensure that *all* bricks forming a single disperse set are placed
in a different server. There are no 4 special fragments. All fragments have
the same importance. The way to do that is ordering them when the volume is
created:

gluster volume create test disperse 16 redundancy 4
server{1..20}:/bricks/test1 server{1..20}:/bricks/test2
server{1..20}:/bricks/test3

This way all 20 fragments from each disperse set will be placed in a
different server. However each server will have 3 bricks and no fragment
from a single file will be stored in more than one brick of each server.

Xavi
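Xavi's ordering rule can be made concrete with a small generator that expands the brick list in the same order as the brace expansion on his command line (server names are illustrative): because the list is emitted one brick directory at a time across all servers, every consecutive run of 20 bricks — one disperse set — lands on 20 distinct servers.

```python
def brick_order(servers, bricks_per_server):
    """Expand server{1..N}:/bricks/testK for K = 1..bricks_per_server in
    command-line order: all N servers for test1, then all N for test2, ...
    so each consecutive group of N bricks spans all N servers."""
    order = []
    for k in range(1, bricks_per_server + 1):
        for s in range(1, servers + 1):
            order.append("server%d:/bricks/test%d" % (s, k))
    return order

bricks = brick_order(20, 3)
# First disperse set (disperse 16 redundancy 4 -> 20 bricks) uses
# 20 distinct servers:
assert len({b.split(":")[0] for b in bricks[:20]}) == 20
```

The same property holds for the second and third sets, which is why no fragment of any file ends up twice on one server even though each server carries three bricks.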
Re: [Gluster-users] EC clarification
On 01 Sep 2016 10:18 AM, "Xavier Hernandez" wrote:
> If you put more than one fragment into the same server, you will lose all
> the fragments if the server goes down. If there are more than 4 fragments
> on that server, the file will be unrecoverable until the server is
> brought up again.
>
> Putting more than one fragment into a single server only makes sense to
> account for disk failures, since the protection against server failures
> is lower.

Exactly, what I would like to ensure is that the 4 segments needed for
recovery are placed automatically on at least 4 servers (and not on 4
different bricks that could be on the same server)
Re: [Gluster-users] Upgrading Gluster Client without upgrading server
You should first upgrade your servers, followed by the clients.

On Wed, Sep 21, 2016 at 2:44 PM, mabi wrote:
> Hi,
>
> I have a GlusterFS server version 3.7.12 and mount my volumes on my
> clients using FUSE native GlusterFS. Now I was wondering if it is safe to
> upgrade the GlusterFS client on my clients to 3.7.15 without upgrading my
> server to 3.7.15?
>
> Regards,
> M.

--
--Atin
Re: [Gluster-users] 3.8.3 Bitrot signature process
Hi Kotresh, Writing new file. getfattr -m. -e hex -d /media/disk2/brick2/data/G/test58-bs10M-c100.nul getfattr: Removing leading '/' from absolute path names # file: media/disk2/brick2/data/G/test58-bs10M-c100.nul trusted.bit-rot.version=0x020057da8b23000b120e trusted.ec.config=0x080501000200 trusted.ec.size=0x3e80 trusted.ec.version=0x1f401f40 trusted.gfid=0x6e7c49e6094e443585bff21f99fd8764 Running ls -l in brick 2 pid ls -l /proc/30162/fd lr-x-- 1 root root 64 Sep 21 16:22 59 -> /media/disk2/brick2/.glusterfs/quanrantine lrwx-- 1 root root 64 Sep 21 16:22 6 -> /var/lib/glusterd/vols/glsvol1/run/10.1.2.2-media-disk2-brick2.pid lr-x-- 1 root root 64 Sep 21 16:25 60 -> /media/disk2/brick2/.glusterfs/6e/7c/6e7c49e6-094e-4435-85bf-f21f99fd8764 lr-x-- 1 root root 64 Sep 21 16:22 61 -> /media/disk2/brick2/.glusterfs/quanrantine find /media/disk2/ -samefile /media/disk2/brick2/.glusterfs/6e/7c/6e7c49e6-094e-4435-85bf-f21f99fd8764 /media/disk2/brick2/.glusterfs/6e/7c/6e7c49e6-094e-4435-85bf-f21f99fd8764 /media/disk2/brick2/data/G/test58-bs10M-c100.nul On Wed, Sep 21, 2016 at 3:28 PM, Kotresh Hiremath Ravishankar < khire...@redhat.com> wrote: > Hi Amudhan, > > Don't grep for the filename, glusterfs maintains hardlink in .glusterfs > directory > for each file. Just check 'ls -l /proc//fd' for any > fds opened > for a file in .glusterfs and check if it's the same file. > > Thanks and Regards, > Kotresh H R > > - Original Message - > > From: "Amudhan P"> > To: "Kotresh Hiremath Ravishankar" > > Cc: "Gluster Users" > > Sent: Wednesday, September 21, 2016 1:33:10 PM > > Subject: Re: [Gluster-users] 3.8.3 Bitrot signature process > > > > Hi Kotresh, > > > > i have used below command to verify any open fd for file. > > > > "ls -l /proc/*/fd | grep filename". > > > > as soon as write completes there no open fd's, if there is any alternate > > option. please let me know will also try that. > > > > > > > > > > Also, below is my scrub status in my test setup. 
number of skipped files > > slow reducing day by day. I think files are skipped due to bitrot > signature > > process is not completed yet. > > > > where can i see scrub skipped files? > > > > > > Volume name : glsvol1 > > > > State of scrub: Active (Idle) > > > > Scrub impact: normal > > > > Scrub frequency: daily > > > > Bitrot error log location: /var/log/glusterfs/bitd.log > > > > Scrubber error log location: /var/log/glusterfs/scrub.log > > > > > > = > > > > Node: localhost > > > > Number of Scrubbed files: 1644 > > > > Number of Skipped files: 1001 > > > > Last completed scrub time: 2016-09-20 11:59:58 > > > > Duration of last scrub (D:M:H:M:S): 0:0:39:26 > > > > Error count: 0 > > > > > > = > > > > Node: 10.1.2.3 > > > > Number of Scrubbed files: 1644 > > > > Number of Skipped files: 1001 > > > > Last completed scrub time: 2016-09-20 10:50:00 > > > > Duration of last scrub (D:M:H:M:S): 0:0:38:17 > > > > Error count: 0 > > > > > > = > > > > Node: 10.1.2.4 > > > > Number of Scrubbed files: 981 > > > > Number of Skipped files: 1664 > > > > Last completed scrub time: 2016-09-20 12:38:01 > > > > Duration of last scrub (D:M:H:M:S): 0:0:35:19 > > > > Error count: 0 > > > > > > = > > > > Node: 10.1.2.1 > > > > Number of Scrubbed files: 1263 > > > > Number of Skipped files: 1382 > > > > Last completed scrub time: 2016-09-20 11:57:21 > > > > Duration of last scrub (D:M:H:M:S): 0:0:37:17 > > > > Error count: 0 > > > > > > = > > > > Node: 10.1.2.2 > > > > Number of Scrubbed files: 1644 > > > > Number of Skipped files: 1001 > > > > Last completed scrub time: 2016-09-20 11:59:25 > > > > Duration of last scrub (D:M:H:M:S): 0:0:39:18 > > > > Error count: 0 > > > > = > > > > > > > > > > Thanks > > Amudhan > > > > > > On Wed, Sep 21, 2016 at 11:45 AM, Kotresh Hiremath Ravishankar < > > khire...@redhat.com> wrote: > > > > > Hi Amudhan, > > > > > > I don't think it's the limitation with read data from the brick. 
> > > To limit the usage of CPU, throttling is done using token bucket > > > algorithm. The log message showed is related to it. But even then > > > I think it should not take 12 minutes for check-sum calculation unless > > > there is an fd open (might be internal). Could you please cross verify > > > if there are any fd opened on that file by looking into /proc? I will > > > also test it out in the mean time and get back to you. > > > > > > Thanks and Regards, > > > Kotresh H R > > > > > > - Original Message - > > > > From: "Amudhan P" > > > > To:
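For readers unfamiliar with the throttling Kotresh mentions above: a token bucket limits work to a steady average rate while allowing bounded bursts. The sketch below is a generic illustration, not Gluster's actual implementation (the explicit `now` parameter is just an injected clock to keep the example deterministic; a real implementation would use a monotonic clock):

```python
class TokenBucket:
    """Minimal token bucket: `rate` tokens are refilled per second up to
    `capacity`; a request for n tokens succeeds only if n are available."""

    def __init__(self, rate, capacity, now):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.tokens = float(capacity)  # start full
        self.last = now

    def consume(self, n, now):
        # Refill based on elapsed time, then try to take n tokens.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if n <= self.tokens:
            self.tokens -= n
            return True
        return False
```

A caller that checksums file chunks would consume one token per chunk (or per byte) and sleep when `consume` returns False, which caps the CPU the signer can burn regardless of how fast the disk can be read.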
[Gluster-users] Upgrading Gluster Client without upgrading server
Hi,

I have a GlusterFS server version 3.7.12 and mount my volumes on my clients
using FUSE native GlusterFS. Now I was wondering if it is safe to upgrade
the GlusterFS client on my clients to 3.7.15 without upgrading my server to
3.7.15?

Regards,
M.
Re: [Gluster-users] 3.8.3 Bitrot signature process
Hi Kotresh, i have used below command to verify any open fd for file. "ls -l /proc/*/fd | grep filename". as soon as write completes there no open fd's, if there is any alternate option. please let me know will also try that. Also, below is my scrub status in my test setup. number of skipped files slow reducing day by day. I think files are skipped due to bitrot signature process is not completed yet. where can i see scrub skipped files? Volume name : glsvol1 State of scrub: Active (Idle) Scrub impact: normal Scrub frequency: daily Bitrot error log location: /var/log/glusterfs/bitd.log Scrubber error log location: /var/log/glusterfs/scrub.log = Node: localhost Number of Scrubbed files: 1644 Number of Skipped files: 1001 Last completed scrub time: 2016-09-20 11:59:58 Duration of last scrub (D:M:H:M:S): 0:0:39:26 Error count: 0 = Node: 10.1.2.3 Number of Scrubbed files: 1644 Number of Skipped files: 1001 Last completed scrub time: 2016-09-20 10:50:00 Duration of last scrub (D:M:H:M:S): 0:0:38:17 Error count: 0 = Node: 10.1.2.4 Number of Scrubbed files: 981 Number of Skipped files: 1664 Last completed scrub time: 2016-09-20 12:38:01 Duration of last scrub (D:M:H:M:S): 0:0:35:19 Error count: 0 = Node: 10.1.2.1 Number of Scrubbed files: 1263 Number of Skipped files: 1382 Last completed scrub time: 2016-09-20 11:57:21 Duration of last scrub (D:M:H:M:S): 0:0:37:17 Error count: 0 = Node: 10.1.2.2 Number of Scrubbed files: 1644 Number of Skipped files: 1001 Last completed scrub time: 2016-09-20 11:59:25 Duration of last scrub (D:M:H:M:S): 0:0:39:18 Error count: 0 = Thanks Amudhan On Wed, Sep 21, 2016 at 11:45 AM, Kotresh Hiremath Ravishankar < khire...@redhat.com> wrote: > Hi Amudhan, > > I don't think it's the limitation with read data from the brick. > To limit the usage of CPU, throttling is done using token bucket > algorithm. The log message showed is related to it. 
But even then > I think it should not take 12 minutes for check-sum calculation unless > there is an fd open (might be internal). Could you please cross verify > if there are any fd opened on that file by looking into /proc? I will > also test it out in the mean time and get back to you. > > Thanks and Regards, > Kotresh H R > > - Original Message - > > From: "Amudhan P"> > To: "Kotresh Hiremath Ravishankar" > > Cc: "Gluster Users" > > Sent: Tuesday, September 20, 2016 3:19:28 PM > > Subject: Re: [Gluster-users] 3.8.3 Bitrot signature process > > > > Hi Kotresh, > > > > Please correct me if i am wrong, Once a file write completes and as soon > as > > closes fds, bitrot waits for 120 seconds and starts hashing and update > > signature for the file in brick. > > > > But, what i am feeling that bitrot takes too much of time to complete > > hashing. > > > > below is test result i would like to share. > > > > > > writing data in below path using dd : > > > > /mnt/gluster/data/G (mount point) > > -rw-r--r-- 1 root root 10M Sep 20 12:19 test53-bs10M-c1.nul > > -rw-r--r-- 1 root root 100M Sep 20 12:19 test54-bs10M-c10.nul > > > > No any other write or read process is going on. > > > > > > Checking file data in one of the brick. > > > > -rw-r--r-- 2 root root 2.5M Sep 20 12:23 test53-bs10M-c1.nul > > -rw-r--r-- 2 root root 25M Sep 20 12:23 test54-bs10M-c10.nul > > > > file's stat and getfattr info from brick, after write process completed. 
> > > > gfstst-node5:/media/disk2/brick2/data/G$ stat test53-bs10M-c1.nul > > File: ‘test53-bs10M-c1.nul’ > > Size: 2621440 Blocks: 5120 IO Block: 4096 regular file > > Device: 821h/2081d Inode: 536874168 Links: 2 > > Access: (0644/-rw-r--r--) Uid: (0/root) Gid: (0/root) > > Access: 2016-09-20 12:23:28.798886647 +0530 > > Modify: 2016-09-20 12:23:28.994886646 +0530 > > Change: 2016-09-20 12:23:28.998886646 +0530 > > Birth: - > > > > gfstst-node5:/media/disk2/brick2/data/G$ stat test54-bs10M-c10.nul > > File: ‘test54-bs10M-c10.nul’ > > Size: 26214400Blocks: 51200 IO Block: 4096 regular file > > Device: 821h/2081d Inode: 536874169 Links: 2 > > Access: (0644/-rw-r--r--) Uid: (0/root) Gid: (0/root) > > Access: 2016-09-20 12:23:42.902886624 +0530 > > Modify: 2016-09-20 12:23:44.378886622 +0530 > > Change: 2016-09-20 12:23:44.378886622 +0530 > > Birth: - > > > > gfstst-node5:/media/disk2/brick2/data/G$ sudo getfattr -m. -e hex -d > > test53-bs10M-c1.nul > > # file: test53-bs10M-c1.nul > > trusted.bit-rot.version=0x020057daa7b50002e5b4 > >
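The signing lifecycle described in this thread — a file becomes eligible for signing only once 120 seconds have passed since its last close, and any new close restarts the countdown — can be modeled with a small sketch (class and method names are mine, not Gluster's; the 120s figure is the one quoted in the mails):

```python
SIGN_DELAY = 120  # seconds a file must stay closed before it is signed

class SignerQueue:
    """Track last-close times and report which files are due for signing."""

    def __init__(self):
        self.last_close = {}

    def on_close(self, path, now):
        # Any close (re)starts the 120-second countdown for that file.
        self.last_close[path] = now

    def due(self, now):
        """Files whose last close happened at least SIGN_DELAY ago."""
        return [p for p, t in self.last_close.items()
                if now - t >= SIGN_DELAY]
```

This models why an fd held open on the backend delays signing: as long as the brick keeps reopening (or never closes) the gfid hardlink, the file never accumulates 120 quiet seconds and the signer keeps skipping it.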
Re: [Gluster-users] 3.8.3 Bitrot signature process
Hi Amudhan, I don't think it's the limitation with read data from the brick. To limit the usage of CPU, throttling is done using token bucket algorithm. The log message showed is related to it. But even then I think it should not take 12 minutes for check-sum calculation unless there is an fd open (might be internal). Could you please cross verify if there are any fd opened on that file by looking into /proc? I will also test it out in the mean time and get back to you. Thanks and Regards, Kotresh H R - Original Message - > From: "Amudhan P"> To: "Kotresh Hiremath Ravishankar" > Cc: "Gluster Users" > Sent: Tuesday, September 20, 2016 3:19:28 PM > Subject: Re: [Gluster-users] 3.8.3 Bitrot signature process > > Hi Kotresh, > > Please correct me if i am wrong, Once a file write completes and as soon as > closes fds, bitrot waits for 120 seconds and starts hashing and update > signature for the file in brick. > > But, what i am feeling that bitrot takes too much of time to complete > hashing. > > below is test result i would like to share. > > > writing data in below path using dd : > > /mnt/gluster/data/G (mount point) > -rw-r--r-- 1 root root 10M Sep 20 12:19 test53-bs10M-c1.nul > -rw-r--r-- 1 root root 100M Sep 20 12:19 test54-bs10M-c10.nul > > No any other write or read process is going on. > > > Checking file data in one of the brick. > > -rw-r--r-- 2 root root 2.5M Sep 20 12:23 test53-bs10M-c1.nul > -rw-r--r-- 2 root root 25M Sep 20 12:23 test54-bs10M-c10.nul > > file's stat and getfattr info from brick, after write process completed. 
> > gfstst-node5:/media/disk2/brick2/data/G$ stat test53-bs10M-c1.nul > File: ‘test53-bs10M-c1.nul’ > Size: 2621440 Blocks: 5120 IO Block: 4096 regular file > Device: 821h/2081d Inode: 536874168 Links: 2 > Access: (0644/-rw-r--r--) Uid: (0/root) Gid: (0/root) > Access: 2016-09-20 12:23:28.798886647 +0530 > Modify: 2016-09-20 12:23:28.994886646 +0530 > Change: 2016-09-20 12:23:28.998886646 +0530 > Birth: - > > gfstst-node5:/media/disk2/brick2/data/G$ stat test54-bs10M-c10.nul > File: ‘test54-bs10M-c10.nul’ > Size: 26214400Blocks: 51200 IO Block: 4096 regular file > Device: 821h/2081d Inode: 536874169 Links: 2 > Access: (0644/-rw-r--r--) Uid: (0/root) Gid: (0/root) > Access: 2016-09-20 12:23:42.902886624 +0530 > Modify: 2016-09-20 12:23:44.378886622 +0530 > Change: 2016-09-20 12:23:44.378886622 +0530 > Birth: - > > gfstst-node5:/media/disk2/brick2/data/G$ sudo getfattr -m. -e hex -d > test53-bs10M-c1.nul > # file: test53-bs10M-c1.nul > trusted.bit-rot.version=0x020057daa7b50002e5b4 > trusted.ec.config=0x080501000200 > trusted.ec.size=0x00a0 > trusted.ec.version=0x00500050 > trusted.gfid=0xe2416bd1aae4403c88f44286273bbe99 > > gfstst-node5:/media/disk2/brick2/data/G$ sudo getfattr -m. -e hex -d > test54-bs10M-c10.nul > # file: test54-bs10M-c10.nul > trusted.bit-rot.version=0x020057daa7b50002e5b4 > trusted.ec.config=0x080501000200 > trusted.ec.size=0x0640 > trusted.ec.version=0x03200320 > trusted.gfid=0x54e018dd8c5a4bd79e0317729d8a57c5 > > > > file's stat and getfattr info from brick, after bitrot signature updated. 
> > gfstst-node5:/media/disk2/brick2/data/G$ stat test53-bs10M-c1.nul > File: ‘test53-bs10M-c1.nul’ > Size: 2621440 Blocks: 5120 IO Block: 4096 regular file > Device: 821h/2081d Inode: 536874168 Links: 2 > Access: (0644/-rw-r--r--) Uid: (0/root) Gid: (0/root) > Access: 2016-09-20 12:25:31.494886450 +0530 > Modify: 2016-09-20 12:23:28.994886646 +0530 > Change: 2016-09-20 12:27:00.994886307 +0530 > Birth: - > > > gfstst-node5:/media/disk2/brick2/data/G$ sudo getfattr -m. -e hex -d > test53-bs10M-c1.nul > # file: test53-bs10M-c1.nul > trusted.bit-rot.signature=0x0102006de7493c5c90f643357c268fbaaf461c1567e0334e4948023ce17268403aa37a > trusted.bit-rot.version=0x020057daa7b50002e5b4 > trusted.ec.config=0x080501000200 > trusted.ec.size=0x00a0 > trusted.ec.version=0x00500050 > trusted.gfid=0xe2416bd1aae4403c88f44286273bbe99 > > > gfstst-node5:/media/disk2/brick2/data/G$ stat test54-bs10M-c10.nul > File: ‘test54-bs10M-c10.nul’ > Size: 26214400Blocks: 51200 IO Block: 4096 regular file > Device: 821h/2081d Inode: 536874169 Links: 2 > Access: (0644/-rw-r--r--) Uid: (0/root) Gid: (0/root) > Access: 2016-09-20 12:25:47.510886425 +0530 > Modify: 2016-09-20 12:23:44.378886622 +0530 > Change: 2016-09-20 12:38:05.954885243 +0530 > Birth: - > > > gfstst-node5:/media/disk2/brick2/data/G$ sudo getfattr -m. -e hex -d > test54-bs10M-c10.nul > # file: test54-bs10M-c10.nul >