Re: [Gluster-devel] missing files
Any updates on this issue? Thanks in advance...

David

-- Original Message --
From: Shyam srang...@redhat.com
To: David F. Robinson david.robin...@corvidtec.com; Justin Clift jus...@gluster.org
Cc: Gluster Devel gluster-devel@gluster.org
Sent: 2/11/2015 10:02:09 PM
Subject: Re: [Gluster-devel] missing files

On 02/11/2015 08:28 AM, David F. Robinson wrote:
My base filesystem has 40 TB and the tar takes 19 minutes. After I copied over 10 TB, the tar extraction time went from 1 minute to 7 minutes. My suspicion is that it is related to the number of files and not necessarily file size. Shyam is looking into reproducing this behavior on a Red Hat system.

I am able to reproduce the issue on a similar setup internally (at least on the surface it seems to be similar to what David is facing). I will continue the investigation for the root cause.

Shyam
Re: [Gluster-devel] missing files
Shyam,

You asked me to stop/start the slow volume to see if it fixed the timing issue. I stopped/started homegfs_backup (the production volume with 40+ TB) and it didn't make it faster. I hadn't stopped/started the fast volume to see if it made it slower; I just did that and sent out an email. I saw a similar result as Pranith. However, I tried the test below and saw no issues. So I don't know why restarting the older test3brick volume slowed it down, but the test below shows no slowdown.

#... Create 2 new volumes
gluster volume create test4brick gfsib01bkp.corvidtec.com:/data/brick01bkp/test4brick gfsib01bkp.corvidtec.com:/data/brick02bkp/test4brick
gluster volume create test5brick gfsib01bkp.corvidtec.com:/data/brick01bkp/test5brick gfsib01bkp.corvidtec.com:/data/brick02bkp/test5brick
gluster volume start test4brick
gluster volume start test5brick
mount /test4brick
mount /test5brick
cp /root/boost_1_57_0.tar /test4brick
cp /root/boost_1_57_0.tar /test5brick

#... Stop/start test4brick to see if this causes a timing issue
umount /test4brick
gluster volume stop test4brick
gluster volume start test4brick
mount /test4brick

#... Run test on both new volumes
cd /test4brick
time tar -xPf boost_1_57_0.tar; time rm -rf boost_1_57_0

real    1m29.712s
user    0m0.415s
sys     0m2.772s

real    0m18.866s
user    0m0.087s
sys     0m0.556s

cd /test5brick
time tar -xPf boost_1_57_0.tar; time rm -rf boost_1_57_0

real    1m28.243s
user    0m0.366s
sys     0m2.502s

real    0m18.193s
user    0m0.075s
sys     0m0.543s

#... Repeat again after stop/start of test4brick
umount /test4brick
gluster volume stop test4brick
gluster volume start test4brick
mount /test4brick
cd /test4brick
time tar -xPf boost_1_57_0.tar; time rm -rf boost_1_57_0

real    1m25.277s
user    0m0.466s
sys     0m3.107s

real    0m16.575s
user    0m0.084s
sys     0m0.577s

-- Original Message --
From: Shyam srang...@redhat.com
To: Pranith Kumar Karampuri pkara...@redhat.com; Justin Clift jus...@gluster.org
Cc: Gluster Devel gluster-devel@gluster.org; David F. Robinson david.robin...@corvidtec.com
Sent: 2/12/2015 10:46:14 AM
Subject: Re: [Gluster-devel] missing files

On 02/12/2015 06:22 AM, Pranith Kumar Karampuri wrote:
On 02/12/2015 03:05 PM, Pranith Kumar Karampuri wrote:
On 02/12/2015 09:14 AM, Justin Clift wrote:
On 12 Feb 2015, at 03:02, Shyam srang...@redhat.com wrote:
On 02/11/2015 08:28 AM, David F. Robinson wrote:

Just to increase confidence I performed one more test. Stopped the volumes and re-started. Now the numbers on both volumes are almost the same:

[root@gqac031 gluster-mount]# time rm -rf boost_1_57_0 ; time tar xf boost_1_57_0.tar.gz

real    1m15.074s
user    0m0.550s
sys     0m4.656s

real    2m46.866s
user    0m5.347s
sys     0m16.047s

[root@gqac031 gluster-mount]# cd /gluster-emptyvol/
[root@gqac031 gluster-emptyvol]# ls
boost_1_57_0.tar.gz
[root@gqac031 gluster-emptyvol]# time tar xf boost_1_57_0.tar.gz

real    2m31.467s
user    0m5.475s
sys     0m15.471s

gqas015.sbu.lab.eng.bos.redhat.com:testvol on /gluster-mount type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
gqas015.sbu.lab.eng.bos.redhat.com:emotyvol on /gluster-emptyvol type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)

If I remember right, we performed a similar test on David's setup, but I believe there was no significant performance gain there. David, could you clarify? Just so we know where we are headed :)

Shyam
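For anyone who wants to repeat this comparison elsewhere, below is a minimal sketch that wraps the same create/extract/time cycle from the commands above into one script. The volume name, server, mount point and tarball path are assumptions to adjust for your own environment, not values from David's setup.

    #!/bin/bash
    # Sketch only: compare tar-extraction time on a freshly created volume vs.
    # the same volume after a stop/start cycle.
    set -e
    VOL=test6brick                        # hypothetical new test volume
    SRV=gfsib01bkp.corvidtec.com          # example server from this thread
    TAR=/root/boost_1_57_0.tar
    MNT=/mnt/$VOL

    gluster volume create $VOL $SRV:/data/brick01bkp/$VOL $SRV:/data/brick02bkp/$VOL
    gluster volume start $VOL
    mkdir -p $MNT
    mount -t glusterfs $SRV:/$VOL $MNT
    cp $TAR $MNT

    run_test () {
        cd $MNT
        time tar -xPf $(basename $TAR)
        time rm -rf boost_1_57_0
        cd /
    }

    echo "== freshly created volume =="
    run_test

    umount $MNT
    gluster --mode=script volume stop $VOL   # script mode skips the interactive prompt
    gluster volume start $VOL
    mount -t glusterfs $SRV:/$VOL $MNT

    echo "== after stop/start =="
    run_test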
Re: [Gluster-devel] missing files
That is very interesting. I tried this test and received a similar result: stopping/starting the volume causes a timing issue on the blank volume. It seems like there is some parameter that gets set when you create a volume and gets reset when you stop/start it. Or, something gets set during the stop/start operation that causes the problem. Is there a way to list all parameters that are set for a volume? 'gluster volume info' only shows the ones that the user has changed from the defaults.

[root@gfs01bkp ~]# gluster volume stop test3brick
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: test3brick: success
[root@gfs01bkp ~]# gluster volume start test3brick
volume start: test3brick: success
[root@gfs01bkp ~]# mount /test3brick
[root@gfs01bkp ~]# cd /test3brick/
[root@gfs01bkp test3brick]# date; time tar -xPf boost_1_57_0.tar ; time rm -rf boost_1_57_0
Thu Feb 12 10:42:43 EST 2015

real    3m46.002s
user    0m0.421s
sys     0m2.812s

real    0m15.406s
user    0m0.092s
sys     0m0.549s

-- Original Message --
From: Pranith Kumar Karampuri pkara...@redhat.com
To: Justin Clift jus...@gluster.org; Shyam srang...@redhat.com
Cc: Gluster Devel gluster-devel@gluster.org; David F. Robinson david.robin...@corvidtec.com
Sent: 2/12/2015 6:22:23 AM
Subject: Re: [Gluster-devel] missing files

On 02/12/2015 03:05 PM, Pranith Kumar Karampuri wrote:
On 02/12/2015 09:14 AM, Justin Clift wrote:
On 12 Feb 2015, at 03:02, Shyam srang...@redhat.com wrote:
On 02/11/2015 08:28 AM, David F. Robinson wrote:

My base filesystem has 40 TB and the tar takes 19 minutes. After I copied over 10 TB, the tar extraction time went from 1 minute to 7 minutes. My suspicion is that it is related to the number of files and not necessarily file size. Shyam is looking into reproducing this behavior on a Red Hat system.

I am able to reproduce the issue on a similar setup internally (at least on the surface it seems to be similar to what David is facing). I will continue the investigation for the root cause.

Here is the initial analysis of my investigation (thanks for providing me with the setup, Shyam; keep the setup, we may need it for further analysis):

On bad volume:

 %-latency   Avg-latency   Min-Latency    Max-Latency   No. of calls         Fop
 ---------   -----------   -----------    -----------   ------------        ----
      0.00       0.00 us       0.00 us        0.00 us         937104      FORGET
      0.00       0.00 us       0.00 us        0.00 us         872478     RELEASE
      0.00       0.00 us       0.00 us        0.00 us          23668  RELEASEDIR
      0.00      41.86 us      23.00 us       86.00 us             92        STAT
      0.01      39.40 us      24.00 us      104.00 us            218      STATFS
      0.28      55.99 us      43.00 us     1152.00 us           4065    SETXATTR
      0.58      56.89 us      25.00 us     4505.00 us           8236     OPENDIR
      0.73      26.80 us      11.00 us      257.00 us          22238       FLUSH
      0.77     152.83 us      92.00 us     8819.00 us           4065       RMDIR
      2.57      62.00 us      21.00 us      409.00 us          33643       WRITE
      5.46     199.16 us     108.00 us   469938.00 us          22238      UNLINK
      6.70      69.83 us      43.00 us         .00 us          77809      LOOKUP
      6.97     447.60 us      21.00 us    54875.00 us          12631    READDIRP
      7.73      79.42 us      33.00 us     1535.00 us          78909     SETATTR
     14.11    2815.00 us     176.00 us  2106305.00 us           4065       MKDIR
     54.09    1972.62 us     138.00 us  1520773.00 us          22238      CREATE

On good volume:

 %-latency   Avg-latency   Min-Latency    Max-Latency   No. of calls         Fop
 ---------   -----------   -----------    -----------   ------------        ----
      0.00       0.00 us       0.00 us        0.00 us          58870      FORGET
      0.00       0.00 us       0.00 us        0.00 us          66016     RELEASE
      0.00       0.00 us       0.00 us        0.00 us          16480  RELEASEDIR
      0.00      61.50 us      58.00 us       65.00 us              2        OPEN
      0.01      39.56 us      16.00 us      112.00 us             71        STAT
      0.02      41.29 us      27.00 us       79.00 us            163      STATFS
      0.03      36.06 us      17.00 us       98.00 us            301       FSTAT
      0.79      62.38 us      39.00 us      269.00 us           4065    SETXATTR
      1.14     242.99 us      25.00 us    28636.00 us           1497        READ
      1.54      59.76 us      25.00 us     6325.00 us           8236     OPENDIR
      1.70     133.75 us      89.00 us      374.00 us           4065       RMDIR
      2.25      32.65 us      15.00 us      265.00 us          22006       FLUSH
      3.37     265.05 us     172.00 us     2349.00 us           4065       MKDIR
      7.14      68.34 us      21.00 us    21902.00 us          33357       WRITE
     11.00     159.68 us     107.00 us     2567.00 us          22003      UNLINK
     13.82     200.54 us     133.00 us    21762.00 us          22003      CREATE
     17.85     448.85 us      22.00 us    54046.00 us          12697    READDIRP
     18.37      76.12 us      45.00 us      294.00 us          77044      LOOKUP
     20.95      85.54 us      35.00 us     1404.00 us          78204     SETATTR

As we can see here, FORGET/RELEASE counts are far higher on the brick from the full volume than on the brick from the empty volume. This seems to suggest that the inode table on the volume with lots of data is carrying too many passive inodes, which need to be displaced to create new ones. Need to check if they come in the fop path. Need to continue my investigation further; will let you know.

Just to increase confidence I performed one more test. Stopped the volumes and re-started. Now the numbers on both volumes are almost the same:

[root@gqac031 gluster-mount]# time rm -rf boost_1_57_0 ; time tar xf boost_1_57_0.tar.gz

real    1m15.074s
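The per-FOP latency tables above come from gluster's io-stats layer. For anyone wanting to gather the same numbers on their own volumes, a minimal sequence would be the profile interface shown below (the volume name is just an example). On David's question about listing every option a volume is running with: 'gluster volume info' only shows changed options; newer gluster releases add 'gluster volume get <volname> all', and on 3.6 the closest equivalent is reading the generated volfiles under /var/lib/glusterd/vols/<volname>/.

    # Sketch: collect per-FOP latency like the tables above (volume name is an example)
    gluster volume profile homegfs_bkp start     # turn on io-stats counters on the bricks
    # ... run the tar extraction test on the mount ...
    gluster volume profile homegfs_bkp info      # dump cumulative per-FOP latency and call counts
    gluster volume profile homegfs_bkp stop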
Re: [Gluster-devel] missing files
-- Original Message --
From: Shyam srang...@redhat.com
To: David F. Robinson david.robin...@corvidtec.com; Pranith Kumar Karampuri pkara...@redhat.com; Justin Clift jus...@gluster.org
Cc: Gluster Devel gluster-devel@gluster.org
Sent: 2/12/2015 11:26:51 AM
Subject: Re: [Gluster-devel] missing files

On 02/12/2015 11:18 AM, David F. Robinson wrote:
Shyam, You asked me to stop/start the slow volume to see if it fixed the timing issue. I stopped/started homegfs_backup (the production volume with 40+ TB) and it didn't make it faster. I didn't stop/start the fast volume to see if it made it slower. I just did that and sent out an email. I saw a similar result as Pranith.

Just to be clear: even after restart of the slow volume, we see ~19 minutes for the tar to complete, correct?

Correct

Versus, on the fast volume it is anywhere between 00:55 - 3:00 minutes, irrespective of start, fresh create, etc., correct?

Correct

Shyam
Re: [Gluster-devel] missing files
FWIW, starting/stopping a fast volume doesn't consistently make it slow. I just tried it again on an older volume and it didn't make it slow. I also went back and re-ran the test on test3brick and it isn't slow any longer. Maybe there is a time lag after stopping/starting a volume before it becomes fast again. Either way, stopping/starting a fast volume only makes it slow for some period of time, and it doesn't consistently make it slow. I don't think this is the issue; red herring.

[root@gfs01bkp /]# gluster volume stop test2brick
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
[root@gfs01bkp /]# gluster volume start test2brick
volume start: test2brick: success
[root@gfs01bkp /]# mount /test2brick
[root@gfs01bkp /]# cd /test2brick
[root@gfs01bkp test2brick]# time tar -xPf boost_1_57_0.tar; time rm -rf boost_1_57_0

real    1m1.124s
user    0m0.432s
sys     0m3.136s

real    0m16.630s
user    0m0.083s
sys     0m0.570s

#... Retest on test3brick after it has been up for 20 minutes following a volume restart. Compare this to running the test immediately after a restart, which gave a time of 3.5 minutes.
[root@gfs01bkp test3brick]# time tar -xPf boost_1_57_0.tar; time rm -rf boost_1_57_0

real    1m17.786s
user    0m0.502s
sys     0m3.278s

real    0m18.103s
user    0m0.101s
sys     0m0.684s

-- Original Message --
From: Shyam srang...@redhat.com
To: David F. Robinson david.robin...@corvidtec.com; Pranith Kumar Karampuri pkara...@redhat.com; Justin Clift jus...@gluster.org
Cc: Gluster Devel gluster-devel@gluster.org
Sent: 2/12/2015 11:26:51 AM
Subject: Re: [Gluster-devel] missing files

On 02/12/2015 11:18 AM, David F. Robinson wrote:
Shyam, You asked me to stop/start the slow volume to see if it fixed the timing issue. I stopped/started homegfs_backup (the production volume with 40+ TB) and it didn't make it faster. I didn't stop/start the fast volume to see if it made it slower. I just did that and sent out an email. I saw a similar result as Pranith.

Just to be clear: even after restart of the slow volume, we see ~19 minutes for the tar to complete, correct?

Versus, on the fast volume it is anywhere between 00:55 - 3:00 minutes, irrespective of start, fresh create, etc., correct?

Shyam
Re: [Gluster-devel] Fw: Re[2]: missing files
I will forward the emails to Shyam to the devel list.

David (Sent from mobile)

On Feb 11, 2015, at 8:21 AM, Pranith Kumar Karampuri pkara...@redhat.com wrote:
On 02/11/2015 06:49 PM, Pranith Kumar Karampuri wrote:
On 02/11/2015 08:36 AM, Shyam wrote:

Did some analysis with David today on this; here is a gist for the list:

1) Volumes were classified as slow (i.e. with a lot of pre-existing data) and fast (new volumes carved from the same backend file system that the slow bricks are on, with little or no data)

2) We ran an strace of tar and also collected io-stats outputs from these volumes; both show that create and mkdir are slower on the slow volume compared to the fast volume. This seems to be the overall reason for the slowness.

Did you happen to do an strace of the brick when this happened? If not, David, can we get that information as well? It would be nice to compare the difference in syscalls on the bricks of the two volumes to see if there are any extra syscalls adding to the delay.

Pranith

3) The tarball extraction is to a new directory on the gluster mount, so all lookups etc. happen within this new namespace on the volume

4) Checked memory footprints of the slow bricks and fast bricks etc.; nothing untoward noticed there

5) Restarted the slow volume, just as a test case to do things from scratch; no improvement in performance.

Currently attempting to reproduce this on a local system to see if the same behavior is seen so that it becomes easier to debug etc. Others on the list can chime in as they see fit.

Thanks, Shyam

On 02/10/2015 09:58 AM, David F. Robinson wrote:
Forwarding to devel list as recommended by Justin...

David

-- Forwarded Message --
From: David F. Robinson david.robin...@corvidtec.com
To: Justin Clift jus...@gluster.org
Sent: 2/10/2015 9:49:09 AM
Subject: Re[2]: [Gluster-devel] missing files

Bad news... I don't think it is the old linkto files. Bad because if that was the issue, cleaning up all of the bad linkto files would have fixed it. It seems like the system just gets slower as you add data.

First, I set up a new clean volume (test2brick) on the same system as the old one (homegfs_bkp). See 'gluster v info' below. I ran my simple tar extraction test on the new volume and it took 58 seconds to complete (which, BTW, is 10 seconds faster than my old non-gluster system, so kudos). The time on homegfs_bkp is 19 minutes.

Next, I copied 10 terabytes of data over to test2brick and re-ran the test, which then took 7 minutes. I created a test3brick and ran the test and it took 53 seconds. To confirm all of this, I deleted all of the data from test2brick and re-ran the test. It took 51 seconds!!!

BTW, I also checked .glusterfs for stale linkto files (find . -type f -size 0 -perm 1000 -exec ls -al {} \;). There are many, many thousands of these types of files on the old volume and none on the new one, so I don't think this is related to the performance issue.

Let me know how I should proceed. Send this to the devel list? Pranith? Others? Thanks...

[root@gfs01bkp .glusterfs]# gluster volume info homegfs_bkp

Volume Name: homegfs_bkp
Type: Distribute
Volume ID: 96de8872-d957-4205-bf5a-076e3f35b294
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: gfsib01bkp.corvidtec.com:/data/brick01bkp/homegfs_bkp
Brick2: gfsib01bkp.corvidtec.com:/data/brick02bkp/homegfs_bkp

[root@gfs01bkp .glusterfs]# gluster volume info test2brick

Volume Name: test2brick
Type: Distribute
Volume ID: 123259b2-3c61-4277-a7e8-27c7ec15e550
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: gfsib01bkp.corvidtec.com:/data/brick01bkp/test2brick
Brick2: gfsib01bkp.corvidtec.com:/data/brick02bkp/test2brick

[root@gfs01bkp glusterfs]# gluster volume info test3brick

Volume Name: test3brick
Type: Distribute
Volume ID: 9b1613fc-f7e5-4325-8f94-e3611a5c3701
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: gfsib01bkp.corvidtec.com:/data/brick01bkp/test3brick
Brick2: gfsib01bkp.corvidtec.com:/data/brick02bkp/test3brick

From homegfs_bkp:

# find . -type f -size 0 -perm 1000 -exec ls -al {} \;
---------T 2 gmathur  pme_ics 0 Jan  9 16:59 ./00/16/00169a69-1a7a-44c9-b2d8-991671ee87c4
---------T 3 jcowan   users   0 Jan  9 17:51 ./00/16/0016a0a0-fd22-4fb5-b6fb-5d7f9024ab74
---------T 2 morourke sbir    0 Jan  9 18:17 ./00/16/0016b36f-32fc-4f2c-accd-e36be2f6c602
---------T 2 carpentr irl     0 Jan  9 18:52 ./00/16/00163faf-741c-4e40-8081-784786b3cc71
---------T 3 601      raven   0 Jan  9 22:49 ./00/16/00163385-a332-4050-8104-1b1af6cd8249
---------T 3 bangell  sbir    0 Jan  9 22:56 ./00/16/00167803-0244-46de-8246
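Regarding Pranith's request above for an strace of the bricks: one way that comparison could be captured is sketched below. The brick PIDs and output file names are placeholders, not values from this setup.

    # Client side: trace the tar itself with per-syscall timings
    strace -f -tt -T -o /tmp/tar_fast.strace tar -xPf boost_1_57_0.tar   # on the fast volume's mount
    strace -f -tt -T -o /tmp/tar_slow.strace tar -xPf boost_1_57_0.tar   # on the slow volume's mount

    # Server side: attach to each brick process while the test runs
    # (brick PIDs can be found with 'gluster volume status <volname>')
    strace -f -tt -T -p <fast_brick_pid> -o /tmp/brick_fast.strace
    strace -f -tt -T -p <slow_brick_pid> -o /tmp/brick_slow.strace

    # A quick per-syscall count/summary instead of a full trace
    strace -c -f -p <slow_brick_pid>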
Re: [Gluster-devel] missing files
My base filesystem has 40 TB and the tar takes 19 minutes. After I copied over 10 TB, the tar extraction time went from 1 minute to 7 minutes. My suspicion is that it is related to the number of files and not necessarily file size. Shyam is looking into reproducing this behavior on a Red Hat system.

David (Sent from mobile)

On Feb 11, 2015, at 7:38 AM, Justin Clift jus...@gluster.org wrote:
On 11 Feb 2015, at 12:31, David F. Robinson david.robin...@corvidtec.com wrote:

Some time ago I had a similar performance problem (with 3.4 if I remember correctly): a just-created volume started out working fine, but after some time using it performance was worse. Removing all files from the volume didn't improve the performance again.

I guess my problem is a little better, depending on how you look at it. If I delete the data from the volume, the performance goes back to that of an empty volume. I don't have to delete the .glusterfs entries to regain my performance; I only have to delete the data from the mount point.

Interesting. Do you have somewhat accurate stats on how much data (e.g. # of entries, size of files) was in the data set that did this? Wondering if it's repeatable, so we can replicate the problem and solve it. :)

+ Justin
Re: [Gluster-devel] missing files
I don't think it is the underlying file system. /data/brickxx is the underlying xfs, and performance to it is fine. When I create a volume, it just puts the data in /data/brick/test2; the underlying filesystem shouldn't know/care that it is in a new directory.

Also, if I create a /data/brick/test2 volume and put data on it, it gets slow in gluster, but writing to /data/brick is still fine. And, after test2 gets slow, I can create a /data/test3 volume that is empty and its speed is fine. My knowledge is admittedly very limited here, but I don't see how it could be the underlying filesystem if the slowdown only occurs on the gluster mount and not on the underlying xfs filesystem.

David (Sent from mobile)

On Feb 11, 2015, at 12:18 AM, Justin Clift jus...@gluster.org wrote:
On 11 Feb 2015, at 03:06, Shyam srang...@redhat.com wrote:
<snip>
2) We ran an strace of tar and also collected io-stats outputs from these volumes; both show that create and mkdir are slower on the slow volume compared to the fast volume. This seems to be the overall reason for the slowness.

Any ideas on why the create and mkdir are slower? Wondering if it's a case of underlying filesystem parameters (for the bricks) + maybe the physical storage structure having become badly optimised over time, e.g. if it's on spinning rust, not SSD, and sector placement is now bad.

Any idea if there are tools that can analyse this kind of thing? e.g. metadata placement / fragmentation on a drive for XFS/ext4.

+ Justin
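On Justin's question about tools for looking at metadata placement/fragmentation on the XFS bricks, a couple of standard read-only checks could be run against the brick filesystems. The file path below is a placeholder; the device name is taken from the df output later in this thread and should be adjusted to match the brick being examined.

    # Extent layout of one suspect file on a brick
    xfs_bmap -v /data/brick01bkp/homegfs_bkp/path/to/some/file

    # Filesystem-wide fragmentation factor (read-only)
    xfs_db -r -c frag /dev/mapper/vg01-lvol1

    # Free-space fragmentation summary (read-only)
    xfs_db -r -c freesp /dev/mapper/vg01-lvol1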
[Gluster-devel] stale file handle
I am seeing the following on one of my FUSE clients (indy.rst and indy.rst.old show up with '??? ???' for their attributes). Has anyone seen this before? Any idea what causes this for a given client? If I try to access the file, I get a stale file handle:

# cp indy.rst dfr.rst
cp: cannot stat `indy.rst': Stale file handle

Bottom of the log has:

[2015-02-10 22:20:34.913632] I [dht-rename.c:1344:dht_rename] 0-homegfs-dht: renaming /hpc_shared/motorsports/gmics/Raven/p4/133/dc3.tmp (hash=homegfs-replicate-2/cache=homegfs-replicate-2) = /hpc_shared/motorsports/gmics/Raven/p4/133/data_collected3 (hash=homegfs-replicate-3/cache=homegfs-replicate-2)
[2015-02-10 22:40:04.138594] W [client-rpc-fops.c:504:client3_3_stat_cbk] 0-homegfs-client-1: remote operation failed: Stale file handle
[2015-02-10 22:40:04.158855] W [MSGID: 108008] [afr-read-txn.c:221:afr_read_txn] 0-homegfs-replicate-0: Unreadable subvolume -1 found with event generation 2. (Possible split-brain)
[2015-02-10 22:40:04.202696] W [fuse-bridge.c:779:fuse_attr_cbk] 0-glusterfs-fuse: 1396664: STAT() /hpc_shared/motorsports/gmics/Raven/p3/70_sst_r4_1em3/indy.rst.old = -1 (Stale file handle)
The message W [MSGID: 108008] [afr-read-txn.c:221:afr_read_txn] 0-homegfs-replicate-0: Unreadable subvolume -1 found with event generation 2. (Possible split-brain) repeated 14 times between [2015-02-10 22:40:04.158855] and [2015-02-10 22:41:00.610296]
[2015-02-10 22:41:45.339419] W [MSGID: 108008] [afr-read-txn.c:221:afr_read_txn] 0-homegfs-replicate-0: Unreadable subvolume -1 found with event generation 2. (Possible split-brain)
The message W [MSGID: 108008] [afr-read-txn.c:221:afr_read_txn] 0-homegfs-replicate-0: Unreadable subvolume -1 found with event generation 2. (Possible split-brain) repeated 31 times between [2015-02-10 22:41:45.339419] and [2015-02-10 22:43:11.483421]
[2015-02-10 22:43:37.498720] W [MSGID: 108008] [afr-read-txn.c:221:afr_read_txn] 0-homegfs-replicate-0: Unreadable subvolume -1 found with event generation 2. (Possible split-brain)

However, the same files on my other FUSE clients look fine.

From the storage system:
From another client:
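Given the 'Possible split-brain' warnings in that log, a quick check from one of the servers would be gluster's heal-info interface (volume name taken from the log messages above):

    gluster volume heal homegfs info               # entries still pending self-heal
    gluster volume heal homegfs info split-brain   # entries AFR currently considers split-brain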
Re: [Gluster-devel] cannot delete non-empty directory
So, just to be sure before I do this: is it okay to do the following if I want to get rid of everything in the /old_shelf4/Aegis directory and below?

rm -rf /data/brick*/homegfs_bkp/backup.0/old_shelf4/Aegis

What happens to all of the files in the .glusterfs directory? Does this get rebuilt, or do the links stay there for files that now no longer exist?

And, is this same issue what causes all of the broken links in .glusterfs? See attached image for an example. There appear to be a lot of broken links in the .glusterfs directories. Is this normal or does it indicate another problem?

Finally, if I search through the /data/brick* directories, should I find no entries of ---------T permission, zero-length files? Do I need to clean all of these up somehow? A quick look at /data/brick01bkp/homegfs_bkp/.glusterfs/2f/54 shows many of these files. They look like:

---------T 3 rbhinge pme_ics 0 Jan  9 16:45 2f54d7d6-968b-442f-8cfe-eff01d6cefe7
---------T 2 rbhinge pme_ics 0 Jan  9 21:40 2f54d7e7-b198-4fd4-aec7-f5d0ff020f72

How do I find out what file these entries were pointing to?

David

-- Original Message --
From: Shyam srang...@redhat.com
To: David F. Robinson david.robin...@corvidtec.com; Gluster Devel gluster-devel@gluster.org; gluster-us...@gluster.org gluster-us...@gluster.org; Susant Palai spa...@redhat.com
Sent: 2/9/2015 11:11:20 AM
Subject: Re: [Gluster-devel] cannot delete non-empty directory

On 02/08/2015 12:19 PM, David F. Robinson wrote:
I am seeing these messages after I delete large amounts of data using gluster 3.6.2.

cannot delete non-empty directory: old_shelf4/Aegis/!!!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final

From the FUSE mount (as root), the directory shows up as empty:

# pwd
/backup/homegfs/backup.0/old_shelf4/Aegis/!!!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final
# ls -al
total 5
d---------  2 root root    4106 Feb  6 13:55 .
drwxrws---  3 601  dmiller   72 Feb  6 13:55 ..

However, when you look at the bricks, the files are still there (none on brick01bkp, all files are on brick02bkp). All of the files are 0-length and have ---------T permissions.

These files are linkto files that are created by DHT, which basically means the files were either renamed or the brick layout changed (I suspect the former to be the cause). These files should have been deleted when the files that they point to were deleted; it looks like this did not happen.

Can I get the following information for some of the files here?

- getfattr -d -m . -e text <path to file on brick>
- The output of the trusted.glusterfs.dht.linkto xattr should state where the real file belongs; in this case, as there are only 2 bricks, it should be the brick01bkp subvol
- As the second brick is empty, we should be able to safely delete these files from the brick and proceed to do an rmdir on the mount point of the volume, as the directory is now empty
- Please check the one sub-directory that is showing up in this case as well, save1

Any suggestions on how to fix this and how to prevent it from happening?

I believe there are renames happening here, possibly by the archive creator. One way to prevent the rename from creating a linkto file is to use the DHT set parameter that defines a pattern so that the file-name hash considers only the static part of the name. The set parameter is cluster.extra-hash-regex.

A link to a similar problem and how to use this set parameter (there are a few in the gluster forums) would be:
http://www.gluster.org/pipermail/gluster-devel/2014-November/042863.html

Additionally, there is a bug here: the unlink of the file should have cleaned up the linkto as well, so that all of the above is not required. We have noticed this with NFS and FUSE mounts (ref bugs 1117923, 1139992), and investigation is in progress on the same. We will step up the priority on this so that we have a clean fix that can be used to prevent this in the future.

Shyam
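To make Shyam's two suggestions concrete: the linkto xattr can be read directly with getfattr, and the hash-regex option is set with a normal volume-set. The path below is a placeholder, and the regex value is only an illustration of the form (the first capture group is the part of the name that gets hashed); it is not a tested recommendation for this workload.

    # Read just the linkto xattr of a suspected stale file on a brick
    getfattr --only-values -n trusted.glusterfs.dht.linkto <path-to-file-on-brick>

    # Hash only the static part of a file name so rename-based writers
    # (rsync-style temporary names) do not leave linkto files behind
    gluster volume set homegfs_bkp cluster.extra-hash-regex '^(.+)\.[0-9a-zA-Z]+$'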
Re: [Gluster-devel] cannot delete non-empty directory
/backup/homegfs/backup.0/old_shelf4/Aegis/!!!Programs/Nextel_Cup/SHR/Backup/shr/Airbox/C24
[root@gfs01bkp C24]# ls -al
total 1
drwx------ 2 jcowan users 39 Feb  6 12:41 .
drwxrw-rw- 4 jcowan users 62 Feb  6 19:19 ..

[root@gfs01bkp C24]# ls -al /data/brick*/homegfs_bkp/backup.0/old_shelf4/Aegis/\!\!\!Programs/Nextel_Cup/SHR/Backup/shr/Airbox/C24/z_slices/
total 4
drwxrw-rw-+ 2 jcowan users 4096 Feb  6 12:41 .
drwxrw-rw-+ 3 jcowan users   29 Feb  6 12:41 ..
---------T  5 jcowan users    0 Nov 19 23:30 c24-airbox_vr_z=25_zoom.jpeg
---------T  5 jcowan users    0 Nov 19 23:30 c24-airbox_vr_z=26.jpeg
---------T  5 jcowan users    0 Nov 19 23:30 c24-airbox_vr_z=27.jpeg
---------T  5 jcowan users    0 Nov 19 23:30 c24-airbox_vr_z=28.jpeg
---------T  5 jcowan users    0 Nov 19 23:30 c24-airbox_vr_z=29.5_zoom.jpeg
---------T  5 jcowan users    0 Nov 19 23:30 c24-airbox_vr_z=30.jpeg
---------T  5 jcowan users    0 Nov 19 23:30 c24-airbox_vr_z=31.jpeg
---------T  5 jcowan users    0 Nov 19 23:30 c24-airbox_vr_z=32.5.jpeg

[root@gfs01bkp C24]# getfattr -d -m . -e text /data/brick*bkp/homegfs_bkp/backup.0/old_shelf4/Aegis/\!\!\!Programs/Nextel_Cup/SHR/Backup/shr/Airbox/C24/z_slices/*
getfattr: Removing leading '/' from absolute path names

# file: data/brick01bkp/homegfs_bkp/backup.0/old_shelf4/Aegis/!!!Programs/Nextel_Cup/SHR/Backup/shr/Airbox/C24/z_slices/c24-airbox_vr_z=25_zoom.jpeg
trusted.gfid=îr'V*N©ÍÆF¿
trusted.glusterfs.dht.linkto=homegfs_bkp-client-1

# file: data/brick01bkp/homegfs_bkp/backup.0/old_shelf4/Aegis/!!!Programs/Nextel_Cup/SHR/Backup/shr/Airbox/C24/z_slices/c24-airbox_vr_z=26.jpeg
trusted.gfid=Là¾}®ÀLdza¥U
trusted.glusterfs.dht.linkto=homegfs_bkp-client-1

# file: data/brick01bkp/homegfs_bkp/backup.0/old_shelf4/Aegis/!!!Programs/Nextel_Cup/SHR/Backup/shr/Airbox/C24/z_slices/c24-airbox_vr_z=27.jpeg
trusted.gfid=©.ñªû2@¬ºÜdíÁ?%_
trusted.glusterfs.dht.linkto=homegfs_bkp-client-1

# file: data/brick01bkp/homegfs_bkp/backup.0/old_shelf4/Aegis/!!!Programs/Nextel_Cup/SHR/Backup/shr/Airbox/C24/z_slices/c24-airbox_vr_z=28.jpeg
trusted.gfid=0¥ /DªÒx?Ïý
trusted.glusterfs.dht.linkto=homegfs_bkp-client-1

# file: data/brick01bkp/homegfs_bkp/backup.0/old_shelf4/Aegis/!!!Programs/Nextel_Cup/SHR/Backup/shr/Airbox/C24/z_slices/c24-airbox_vr_z=29.5_zoom.jpeg
trusted.gfid=¼9T'$²Cí¯Eÿx!1
trusted.glusterfs.dht.linkto=homegfs_bkp-client-1

# file: data/brick01bkp/homegfs_bkp/backup.0/old_shelf4/Aegis/!!!Programs/Nextel_Cup/SHR/Backup/shr/Airbox/C24/z_slices/c24-airbox_vr_z=30.jpeg
trusted.gfid=tè 8rð
trusted.glusterfs.dht.linkto=homegfs_bkp-client-1

# file: data/brick01bkp/homegfs_bkp/backup.0/old_shelf4/Aegis/!!!Programs/Nextel_Cup/SHR/Backup/shr/Airbox/C24/z_slices/c24-airbox_vr_z=31.jpeg
trusted.gfid=x´Å EŦ¡ZmØWà
trusted.glusterfs.dht.linkto=homegfs_bkp-client-1

# file: data/brick01bkp/homegfs_bkp/backup.0/old_shelf4/Aegis/!!!Programs/Nextel_Cup/SHR/Backup/shr/Airbox/C24/z_slices/c24-airbox_vr_z=32.5.jpeg
trusted.gfid=d+0ÇxþM¯GxÑ@
trusted.glusterfs.dht.linkto=homegfs_bkp-client-1

-- Original Message --
From: Shyam srang...@redhat.com
To: David F. Robinson david.robin...@corvidtec.com; Gluster Devel gluster-devel@gluster.org; gluster-us...@gluster.org gluster-us...@gluster.org; Susant Palai spa...@redhat.com
Sent: 2/9/2015 11:11:20 AM
Subject: Re: [Gluster-devel] cannot delete non-empty directory

On 02/08/2015 12:19 PM, David F. Robinson wrote:
I am seeing these messages after I delete large amounts of data using gluster 3.6.2.

cannot delete non-empty directory: old_shelf4/Aegis/!!!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final

From the FUSE mount (as root), the directory shows up as empty:

# pwd
/backup/homegfs/backup.0/old_shelf4/Aegis/!!!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final
# ls -al
total 5
d---------  2 root root    4106 Feb  6 13:55 .
drwxrws---  3 601  dmiller   72 Feb  6 13:55 ..

However, when you look at the bricks, the files are still there (none on brick01bkp, all files are on brick02bkp). All of the files are 0-length and have ---------T permissions.

These files are linkto files that are created by DHT, which basically means the files were either renamed or the brick layout changed (I suspect the former to be the cause). These files should have been deleted when the files that they point to were deleted; it looks like this did not happen.

Can I get the following information for some of the files here?

- getfattr -d -m . -e text <path to file on brick>
- The output of the trusted.glusterfs.dht.linkto xattr should state where the real file belongs; in this case, as there are only 2 bricks, it should be the brick01bkp subvol
- As the second brick is empty, we should be able to safely delete these files from the brick and proceed to do an rmdir on the mount point of the volume, as the directory is now empty
- Please check the one sub-directory that is showing up in this case as well, save1

Any suggestions
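Below is a sketch of how the stale linkto entries could be enumerated on a brick, combining the find command used earlier in this thread with the getfattr check Shyam describes. The rm stays commented out until the output has been reviewed and the real files are verified to exist on the other brick.

    #!/bin/bash
    # List zero-length, mode-1000 (sticky bit) files on a brick and show which
    # subvolume their dht.linkto xattr points at before deciding to delete them.
    BRICK=/data/brick01bkp/homegfs_bkp     # example brick path from this thread

    find "$BRICK" -type f -size 0 -perm 1000 | while read -r f; do
        linkto=$(getfattr --only-values -n trusted.glusterfs.dht.linkto "$f" 2>/dev/null)
        echo "$f -> ${linkto:-<no linkto xattr>}"
        # rm -f "$f"    # uncomment only after verifying the real file exists elsewhere
    done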
[Gluster-devel] cannot delete non-empty directory
I am seeing these messages after I delete large amounts of data using gluster 3.6.2.

cannot delete non-empty directory: old_shelf4/Aegis/!!!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final

From the FUSE mount (as root), the directory shows up as empty:

# pwd
/backup/homegfs/backup.0/old_shelf4/Aegis/!!!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final
# ls -al
total 5
d---------  2 root root    4106 Feb  6 13:55 .
drwxrws---  3 601  dmiller   72 Feb  6 13:55 ..

However, when you look at the bricks, the files are still there (none on brick01bkp, all files are on brick02bkp). All of the files are 0-length and have ---------T permissions. Any suggestions on how to fix this and how to prevent it from happening?

# ls -al /data/brick*/homegfs_bkp/backup.0/old_shelf4/Aegis/\!\!\!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final

/data/brick01bkp/homegfs_bkp/backup.0/old_shelf4/Aegis/!!!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final:
total 4
d---------+ 2 root root  10 Feb  6 13:55 .
drwxrws---+ 3 601  raven 36 Feb  6 13:55 ..

/data/brick02bkp/homegfs_bkp/backup.0/old_shelf4/Aegis/!!!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final:
total 8
d---------+ 3 root root 4096 Dec 31  1969 .
drwxrws---+ 3 601  raven   36 Feb  6 13:55 ..
---------T  5 601  raven    0 Nov 20 00:08 read_inset.f.gz
---------T  5 601  raven    0 Nov 20 00:08 readbc.f.gz
---------T  5 601  raven    0 Nov 20 00:08 readcn.f.gz
---------T  5 601  raven    0 Nov 20 00:08 readinp.f.gz
---------T  5 601  raven    0 Nov 20 00:08 readinp_v1_2.f.gz
---------T  5 601  raven    0 Nov 20 00:08 readinp_v1_3.f.gz
---------T  5 601  raven    0 Nov 20 00:08 rotatept.f.gz
d---------+ 2 root root   118 Feb  6 13:54 save1
---------T  5 601  raven    0 Nov 20 00:08 sepvec.f.gz
---------T  5 601  raven    0 Nov 20 00:08 shadow.f.gz
---------T  5 601  raven    0 Nov 20 00:08 snksrc.f.gz
---------T  5 601  raven    0 Nov 20 00:08 source.f.gz
---------T  5 601  raven    0 Nov 20 00:08 step.f.gz
---------T  5 601  raven    0 Nov 20 00:08 stoprog.f.gz
---------T  5 601  raven    0 Nov 20 00:08 summer6.f.gz
---------T  5 601  raven    0 Nov 20 00:08 totforc.f.gz
---------T  5 601  raven    0 Nov 20 00:08 tritet.f.gz
---------T  5 601  raven    0 Nov 20 00:08 wallrsd.f.gz
---------T  5 601  raven    0 Nov 20 00:08 wheat.f.gz
---------T  5 601  raven    0 Nov 20 00:08 write_inset.f.gz

This is using gluster 3.6.2 on a distributed gluster volume that resides on a single machine. Both of the bricks are on one machine, consisting of 2x RAID-6 arrays.

df -h | grep brick
/dev/mapper/vg01-lvol1   88T   22T   66T  25% /data/brick01bkp
/dev/mapper/vg02-lvol1   88T   22T   66T  26% /data/brick02bkp

# gluster volume info homegfs_bkp

Volume Name: homegfs_bkp
Type: Distribute
Volume ID: 96de8872-d957-4205-bf5a-076e3f35b294
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: gfsib01bkp.corvidtec.com:/data/brick01bkp/homegfs_bkp
Brick2: gfsib01bkp.corvidtec.com:/data/brick02bkp/homegfs_bkp
Options Reconfigured:
storage.owner-gid: 100
performance.io-thread-count: 32
server.allow-insecure: on
network.ping-timeout: 10
performance.cache-size: 128MB
performance.write-behind-window-size: 128MB
server.manage-gids: on
changelog.rollover-time: 15
changelog.fsync-interval: 3
Re: [Gluster-devel] [Gluster-users] missing files
I don't think I understood what you sent enough to give it a try. I'll wait until it comes out in a beta or release version.

David

-- Original Message --
From: Ben Turner btur...@redhat.com
To: Justin Clift jus...@gluster.org; David F. Robinson david.robin...@corvidtec.com
Cc: Benjamin Turner bennytu...@gmail.com; gluster-us...@gluster.org; Gluster Devel gluster-devel@gluster.org
Sent: 2/6/2015 3:33:42 PM
Subject: Re: [Gluster-devel] [Gluster-users] missing files

- Original Message -
From: Justin Clift jus...@gluster.org
To: Benjamin Turner bennytu...@gmail.com
Cc: David F. Robinson david.robin...@corvidtec.com, gluster-us...@gluster.org, Gluster Devel gluster-devel@gluster.org, Ben Turner btur...@redhat.com
Sent: Friday, February 6, 2015 3:27:53 PM
Subject: Re: [Gluster-devel] [Gluster-users] missing files

On 6 Feb 2015, at 02:05, Benjamin Turner bennytu...@gmail.com wrote:
I think that the multi-threaded epoll changes that _just_ landed in master will help resolve this, but they are so new I haven't been able to test this. I'll know more when I get a chance to test tomorrow.

Which multi-threaded epoll code just landed in master? Are you thinking of this one?
http://review.gluster.org/#/c/3842/
If so, it's not in master yet. ;)

Doh! I just saw "Required patches are all upstream now" and assumed they were merged. I have been in class all week so I am not up to date with everything. I gave instructions on compiling it from the gerrit patches + master, so if David wants to give it a go he can. Sorry for the confusion.

-b

+ Justin

-b

On Thu, Feb 5, 2015 at 6:04 PM, David F. Robinson david.robin...@corvidtec.com wrote:
Isn't rsync what geo-rep uses?

David (Sent from mobile)

On Feb 5, 2015, at 5:41 PM, Ben Turner btur...@redhat.com wrote:
- Original Message -
From: Ben Turner btur...@redhat.com
To: David F. Robinson david.robin...@corvidtec.com
Cc: Pranith Kumar Karampuri pkara...@redhat.com, Xavier Hernandez xhernan...@datalab.es, Benjamin Turner bennytu...@gmail.com, gluster-us...@gluster.org, Gluster Devel gluster-devel@gluster.org
Sent: Thursday, February 5, 2015 5:22:26 PM
Subject: Re: [Gluster-users] [Gluster-devel] missing files

- Original Message -
From: David F. Robinson david.robin...@corvidtec.com
To: Ben Turner btur...@redhat.com
Cc: Pranith Kumar Karampuri pkara...@redhat.com, Xavier Hernandez xhernan...@datalab.es, Benjamin Turner bennytu...@gmail.com, gluster-us...@gluster.org, Gluster Devel gluster-devel@gluster.org
Sent: Thursday, February 5, 2015 5:01:13 PM
Subject: Re: [Gluster-users] [Gluster-devel] missing files

I'll send you the emails I sent Pranith with the logs. What causes these disconnects?

Thanks David! Disconnects happen when there are interruptions in communication between peers; normally there is a ping timeout that happens. It could be anything from a flaky NW to the system being too busy to respond to the pings. My initial take is more towards the latter, as rsync is absolutely the worst use case for gluster - IIRC it writes in 4kb blocks. I try to keep my writes at least 64KB, as in my testing that is the smallest block size I can write with before perf starts to really drop off. I'll try something similar in the lab.

Ok, I do think that the file being self-healed is the RCA for what you were seeing. Let's look at one of the disconnects:

data-brick02a-homegfs.log:[2015-02-03 20:54:02.772180] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01b.corvidtec.com-4175-2015/02/02-16:44:31:179119-homegfs-client-2-0-1

And in the glustershd.log from the gfs01b_glustershd.log file:

[2015-02-03 20:55:48.001797] I [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0: performing entry selfheal on 6c79a368-edaa-432b-bef9-ec690ab42448
[2015-02-03 20:55:49.341996] I [afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0: Completed entry selfheal on 6c79a368-edaa-432b-bef9-ec690ab42448. source=1 sinks=0
[2015-02-03 20:55:49.343093] I [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0: performing entry selfheal on 792cb0d6-9290-4447-8cd7-2b2d7a116a69
[2015-02-03 20:55:50.463652] I [afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0: Completed entry selfheal on 792cb0d6-9290-4447-8cd7-2b2d7a116a69. source=1 sinks=0
[2015-02-03 20:55:51.465289] I [afr-self-heal-metadata.c:54:__afr_selfheal_metadata_do] 0-homegfs-replicate-0: performing metadata selfheal on 403e661a-1c27-4e79-9867-c0572aba2b3c
[2015-02-03 20:55:51.466515] I [afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0: Completed metadata selfheal on 403e661a-1c27-4e79-9867-c0572aba2b3c.
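To line the disconnects up against the self-heals the way Ben describes, something along these lines works; the log file names are the ones already quoted in this thread and the date filter is just an example.

    # Disconnect timestamps from the brick log
    grep 'disconnecting connection' /var/log/glusterfs/bricks/data-brick02a-homegfs.log | awk '{print $1, $2}'

    # Self-heals completed shortly afterwards (narrow the second grep to the window of interest)
    grep 'Completed .* selfheal' /var/log/glusterfs/glustershd.log | grep '2015-02-03 20:5'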
Re: [Gluster-devel] missing files
/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 2 root root 10 Feb  4 18:12 .
drwxrws--x 6 root root 95 Feb  4 18:12 ..

[root@gfs02a ~]# ls -alR /data/brick0*/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References

/data/brick01a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 3 root      root  41 Feb  4 18:12 .
drwxrws--x 7 root      root 118 Feb  4 18:12 ..
drwxrws--- 2 streadway sbir  80 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick01a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR:
total 72
drwxrws--- 2 streadway sbir    80 Jan 23 14:46 .
drwxrws--- 3 root      root    41 Feb  4 18:12 ..
-rwxrw---- 2 streadway sbir 17248 Jun 19  2014 COMPARISON OF SOLUTIONS.one
-rwxrw---- 2 streadway sbir 49736 Jan 21 13:18 GIVEN TRADE SPACE.one

/data/brick02a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 3 root      root  41 Feb  4 18:12 .
drwxrws--x 7 root      root 118 Feb  4 18:12 ..
drwxrws--- 2 streadway sbir  79 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick02a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR:
total 84
drwxrws--- 2 streadway sbir    79 Jan 23 14:46 .
drwxrws--- 3 root      root    41 Feb  4 18:12 ..
-rwxrw---- 2 streadway sbir 42440 Jun 19  2014 ARMOR PACKAGES.one
-rwxrw---- 2 streadway sbir 38184 Jun 19  2014 CURRENT STANDARD ARMORING.one

[root@gfs02b ~]# ls -alR /data/brick0*/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References

/data/brick01b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 3 root      root  41 Feb  4 18:12 .
drwxrws--x 7 root      root 118 Feb  4 18:12 ..
drwxrws--- 2 streadway sbir  80 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick01b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR:
total 72
drwxrws--- 2 streadway sbir    80 Jan 23 14:46 .
drwxrws--- 3 root      root    41 Feb  4 18:12 ..
-rwxrw---- 2 streadway sbir 17248 Jun 19  2014 COMPARISON OF SOLUTIONS.one
-rwxrw---- 2 streadway sbir 49736 Jan 21 13:18 GIVEN TRADE SPACE.one

/data/brick02b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 3 root      root  41 Feb  4 18:12 .
drwxrws--x 7 root      root 118 Feb  4 18:12 ..
drwxrws--- 2 streadway sbir  79 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick02b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR:
total 84
drwxrws--- 2 streadway sbir    79 Jan 23 14:46 .
drwxrws--- 3 root      root    41 Feb  4 18:12 ..
-rwxrw---- 2 streadway sbir 42440 Jun 19  2014 ARMOR PACKAGES.one
-rwxrw---- 2 streadway sbir 38184 Jun 19  2014 CURRENT STANDARD ARMORING.one

-- Original Message --
From: Xavier Hernandez xhernan...@datalab.es
To: David F. Robinson david.robin...@corvidtec.com; Benjamin Turner bennytu...@gmail.com; Pranith Kumar Karampuri pkara...@redhat.com
Cc: gluster-us...@gluster.org gluster-us...@gluster.org; Gluster Devel gluster-devel@gluster.org
Sent: 2/5/2015 5:14:22 AM
Subject: Re: [Gluster-devel] missing files

Is the failure repeatable ? with the same directories ?

It's very weird that the directories appear on the volume when you do an 'ls' on the bricks. Could it be that you only made a single 'ls' on fuse mount which not showed the directory ?

Is it possible that this 'ls' triggered a self-heal that repaired the problem, whatever it was, and when you did another 'ls' on the fuse mount after the 'ls' on the bricks, the directories were there ? The first 'ls' could have healed the files, causing that the following 'ls' on the bricks showed the files as if nothing were damaged. If that's the case, it's possible that there were some disconnections during the copy.

Added Pranith because he knows better replication and self-heal details.

Xavi

On 02/04/2015 07:23 PM, David F. Robinson wrote:
Distributed/replicated

Volume Name: homegfs
Type: Distributed-Replicate
Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs
Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs
Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs
Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs
Brick5: gfsib02a.corvidtec.com:/data/brick01a/homegfs
Brick6: gfsib02b.corvidtec.com:/data/brick01b/homegfs
Brick7: gfsib02a.corvidtec.com:/data/brick02a/homegfs
Brick8: gfsib02b.corvidtec.com:/data/brick02b/homegfs
Options Reconfigured:
performance.io-thread-count: 32
performance.cache-size: 128MB
performance.write-behind-window-size: 128MB
server.allow-insecure: on
network.ping-timeout: 10
storage.owner-gid: 100
geo-replication.indexing: off
geo
Re: [Gluster-devel] [Gluster-users] missing files
It was a mix of files from very small to very large, and many terabytes of data - approximately 20 TB.

David (Sent from mobile)

On Feb 5, 2015, at 4:55 PM, Ben Turner btur...@redhat.com wrote:
- Original Message -
From: Pranith Kumar Karampuri pkara...@redhat.com
To: Xavier Hernandez xhernan...@datalab.es, David F. Robinson david.robin...@corvidtec.com, Benjamin Turner bennytu...@gmail.com
Cc: gluster-us...@gluster.org, Gluster Devel gluster-devel@gluster.org
Sent: Thursday, February 5, 2015 5:30:04 AM
Subject: Re: [Gluster-users] [Gluster-devel] missing files

On 02/05/2015 03:48 PM, Pranith Kumar Karampuri wrote:
I believe David already fixed this. I hope this is the same permissions issue he told us about.
Oops, it is not. I will take a look.

Yes David, exactly like these:

data-brick02a-homegfs.log:[2015-02-03 19:09:34.568842] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs02a.corvidtec.com-18563-2015/02/03-19:07:58:519134-homegfs-client-2-0-0
data-brick02a-homegfs.log:[2015-02-03 19:09:41.286551] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01a.corvidtec.com-12804-2015/02/03-19:09:38:497808-homegfs-client-2-0-0
data-brick02a-homegfs.log:[2015-02-03 19:16:35.906412] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs02b.corvidtec.com-27190-2015/02/03-19:15:53:458467-homegfs-client-2-0-0
data-brick02a-homegfs.log:[2015-02-03 19:51:22.761293] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01a.corvidtec.com-25926-2015/02/03-19:51:02:89070-homegfs-client-2-0-0
data-brick02a-homegfs.log:[2015-02-03 20:54:02.772180] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01b.corvidtec.com-4175-2015/02/02-16:44:31:179119-homegfs-client-2-0-1

You can 100% verify my theory if you can correlate the time on the disconnects to the time that the missing files were healed. Can you have a look at /var/log/glusterfs/glustershd.log? That has all of the healed files + timestamps. If we can see a disconnect during the rsync and a self-heal of the missing file, I think we can safely assume that the disconnects may have caused this. I'll try this on my test systems. How much data did you rsync? What size-ish of files / an idea of the dir layout?

@Pranith - Could bricks flapping up and down during the rsync cause the files to be missing on the first ls (written to 1 subvol but not the other because it was down), the ls triggered SH, and that's why the files were there for the second ls - could that be a possible cause here?

-b

Pranith
Pranith

On 02/05/2015 03:44 PM, Xavier Hernandez wrote:
Is the failure repeatable ? with the same directories ?

It's very weird that the directories appear on the volume when you do an 'ls' on the bricks. Could it be that you only made a single 'ls' on fuse mount which not showed the directory ?

Is it possible that this 'ls' triggered a self-heal that repaired the problem, whatever it was, and when you did another 'ls' on the fuse mount after the 'ls' on the bricks, the directories were there ? The first 'ls' could have healed the files, causing that the following 'ls' on the bricks showed the files as if nothing were damaged. If that's the case, it's possible that there were some disconnections during the copy.

Added Pranith because he knows better replication and self-heal details.

Xavi

On 02/04/2015 07:23 PM, David F. Robinson wrote:
Distributed/replicated

Volume Name: homegfs
Type: Distributed-Replicate
Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs
Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs
Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs
Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs
Brick5: gfsib02a.corvidtec.com:/data/brick01a/homegfs
Brick6: gfsib02b.corvidtec.com:/data/brick01b/homegfs
Brick7: gfsib02a.corvidtec.com:/data/brick02a/homegfs
Brick8: gfsib02b.corvidtec.com:/data/brick02b/homegfs
Options Reconfigured:
performance.io-thread-count: 32
performance.cache-size: 128MB
performance.write-behind-window-size: 128MB
server.allow-insecure: on
network.ping-timeout: 10
storage.owner-gid: 100
geo-replication.indexing: off
geo-replication.ignore-pid-check: on
changelog.changelog: on
changelog.fsync-interval: 3
changelog.rollover-time: 15
server.manage-gids: on

-- Original Message --
From: Xavier Hernandez xhernan...@datalab.es
To: David F. Robinson david.robin...@corvidtec.com
Re: [Gluster-devel] [Gluster-users] missing files
Should I run my rsync with --block-size set to something other than the default? Is there an optimal value? I think 128k is the max, from my quick search. I didn't dig into it thoroughly though.

David (Sent from mobile)

On Feb 5, 2015, at 5:41 PM, Ben Turner btur...@redhat.com wrote:
- Original Message -
From: Ben Turner btur...@redhat.com
To: David F. Robinson david.robin...@corvidtec.com
Cc: Pranith Kumar Karampuri pkara...@redhat.com, Xavier Hernandez xhernan...@datalab.es, Benjamin Turner bennytu...@gmail.com, gluster-us...@gluster.org, Gluster Devel gluster-devel@gluster.org
Sent: Thursday, February 5, 2015 5:22:26 PM
Subject: Re: [Gluster-users] [Gluster-devel] missing files

- Original Message -
From: David F. Robinson david.robin...@corvidtec.com
To: Ben Turner btur...@redhat.com
Cc: Pranith Kumar Karampuri pkara...@redhat.com, Xavier Hernandez xhernan...@datalab.es, Benjamin Turner bennytu...@gmail.com, gluster-us...@gluster.org, Gluster Devel gluster-devel@gluster.org
Sent: Thursday, February 5, 2015 5:01:13 PM
Subject: Re: [Gluster-users] [Gluster-devel] missing files

I'll send you the emails I sent Pranith with the logs. What causes these disconnects?

Thanks David! Disconnects happen when there are interruptions in communication between peers; normally there is a ping timeout that happens. It could be anything from a flaky NW to the system being too busy to respond to the pings. My initial take is more towards the latter, as rsync is absolutely the worst use case for gluster - IIRC it writes in 4kb blocks. I try to keep my writes at least 64KB, as in my testing that is the smallest block size I can write with before perf starts to really drop off. I'll try something similar in the lab.

Ok, I do think that the file being self-healed is the RCA for what you were seeing. Let's look at one of the disconnects:

data-brick02a-homegfs.log:[2015-02-03 20:54:02.772180] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01b.corvidtec.com-4175-2015/02/02-16:44:31:179119-homegfs-client-2-0-1

And in the glustershd.log from the gfs01b_glustershd.log file:

[2015-02-03 20:55:48.001797] I [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0: performing entry selfheal on 6c79a368-edaa-432b-bef9-ec690ab42448
[2015-02-03 20:55:49.341996] I [afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0: Completed entry selfheal on 6c79a368-edaa-432b-bef9-ec690ab42448. source=1 sinks=0
[2015-02-03 20:55:49.343093] I [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0: performing entry selfheal on 792cb0d6-9290-4447-8cd7-2b2d7a116a69
[2015-02-03 20:55:50.463652] I [afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0: Completed entry selfheal on 792cb0d6-9290-4447-8cd7-2b2d7a116a69. source=1 sinks=0
[2015-02-03 20:55:51.465289] I [afr-self-heal-metadata.c:54:__afr_selfheal_metadata_do] 0-homegfs-replicate-0: performing metadata selfheal on 403e661a-1c27-4e79-9867-c0572aba2b3c
[2015-02-03 20:55:51.466515] I [afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0: Completed metadata selfheal on 403e661a-1c27-4e79-9867-c0572aba2b3c. source=1 sinks=0
[2015-02-03 20:55:51.467098] I [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0: performing entry selfheal on 403e661a-1c27-4e79-9867-c0572aba2b3c
[2015-02-03 20:55:55.257808] I [afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0: Completed entry selfheal on 403e661a-1c27-4e79-9867-c0572aba2b3c. source=1 sinks=0
[2015-02-03 20:55:55.258548] I [afr-self-heal-metadata.c:54:__afr_selfheal_metadata_do] 0-homegfs-replicate-0: performing metadata selfheal on c612ee2f-2fb4-4157-a9ab-5a2d5603c541
[2015-02-03 20:55:55.259367] I [afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0: Completed metadata selfheal on c612ee2f-2fb4-4157-a9ab-5a2d5603c541. source=1 sinks=0
[2015-02-03 20:55:55.259980] I [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0: performing entry selfheal on c612ee2f-2fb4-4157-a9ab-5a2d5603c541

As you can see, the self-heal logs are just spammed with files being healed, and I looked at a couple of disconnects and I see self-heals getting run shortly after on the bricks that were down. Now we need to find the cause of the disconnects. I am thinking that once the disconnects are resolved the files should be properly copied over without SH having to fix things. Like I said, I'll give this a go on my lab systems and see if I can repro the disconnects; I'll have time to run through it tomorrow. If in the meantime anyone else has
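On the --block-size question: rsync's -B/--block-size sets the checksum block size for its delta-transfer algorithm (capped around 128 KB in rsync versions of that era), which is not the same thing as the size of the writes it issues. That said, the two variations below are the usual things to try when the destination is a gluster mount; the paths are placeholders and this is a sketch, not a tested recommendation for this cluster.

    # Skip the delta algorithm entirely and copy whole files
    rsync -av --whole-file /source/dir/ /gluster/mount/dir/

    # If the delta algorithm is wanted, use a large checksum block size
    rsync -av --inplace --block-size=131072 /source/dir/ /gluster/mount/dir/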
Re: [Gluster-devel] [Gluster-users] missing files
I'll send you the emails I sent Pranith with the logs. What causes these disconnects? David (Sent from mobile) === David F. Robinson, Ph.D. President - Corvid Technologies 704.799.6944 x101 [office] 704.252.1310 [cell] 704.799.7974 [fax] david.robin...@corvidtec.com http://www.corvidtechnologies.com On Feb 5, 2015, at 4:55 PM, Ben Turner btur...@redhat.com wrote: - Original Message - From: Pranith Kumar Karampuri pkara...@redhat.com To: Xavier Hernandez xhernan...@datalab.es, David F. Robinson david.robin...@corvidtec.com, Benjamin Turner bennytu...@gmail.com Cc: gluster-us...@gluster.org, Gluster Devel gluster-devel@gluster.org Sent: Thursday, February 5, 2015 5:30:04 AM Subject: Re: [Gluster-users] [Gluster-devel] missing files On 02/05/2015 03:48 PM, Pranith Kumar Karampuri wrote: I believe David already fixed this. I hope this is the same issue he told about permissions issue. Oops, it is not. I will take a look. Yes David exactly like these: data-brick02a-homegfs.log:[2015-02-03 19:09:34.568842] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs02a.corvidtec.com-18563-2015/02/03-19:07:58:519134-homegfs-client-2-0-0 data-brick02a-homegfs.log:[2015-02-03 19:09:41.286551] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01a.corvidtec.com-12804-2015/02/03-19:09:38:497808-homegfs-client-2-0-0 data-brick02a-homegfs.log:[2015-02-03 19:16:35.906412] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs02b.corvidtec.com-27190-2015/02/03-19:15:53:458467-homegfs-client-2-0-0 data-brick02a-homegfs.log:[2015-02-03 19:51:22.761293] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01a.corvidtec.com-25926-2015/02/03-19:51:02:89070-homegfs-client-2-0-0 data-brick02a-homegfs.log:[2015-02-03 20:54:02.772180] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01b.corvidtec.com-4175-2015/02/02-16:44:31:179119-homegfs-client-2-0-1 You can 100% verify my theory if you can correlate the time on the disconnects to the time that the missing files were healed. Can you have a look at /var/log/glusterfs/glustershd.log? That has all of the healed files + timestamps, if we can see a disconnect during the rsync and a self heal of the missing file I think we can safely assume that the disconnects may have caused this. I'll try this on my test systems, how much data did you rsync? What size ish of files / an idea of the dir layout? @Pranith - Could bricks flapping up and down during the rsync cause the files to be missing on the first ls(written to 1 subvol but not the other cause it was down), the ls triggered SH, and thats why the files were there for the second ls be a possible cause here? -b Pranith Pranith On 02/05/2015 03:44 PM, Xavier Hernandez wrote: Is the failure repeatable ? with the same directories ? It's very weird that the directories appear on the volume when you do an 'ls' on the bricks. Could it be that you only made a single 'ls' on fuse mount which not showed the directory ? Is it possible that this 'ls' triggered a self-heal that repaired the problem, whatever it was, and when you did another 'ls' on the fuse mount after the 'ls' on the bricks, the directories were there ? The first 'ls' could have healed the files, causing that the following 'ls' on the bricks showed the files as if nothing were damaged. If that's the case, it's possible that there were some disconnections during the copy. 
Added Pranith because he knows better replication and self-heal details. Xavi On 02/04/2015 07:23 PM, David F. Robinson wrote: Distributed/replicated Volume Name: homegfs Type: Distributed-Replicate Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071 Status: Started Number of Bricks: 4 x 2 = 8 Transport-type: tcp Bricks: Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs Brick5: gfsib02a.corvidtec.com:/data/brick01a/homegfs Brick6: gfsib02b.corvidtec.com:/data/brick01b/homegfs Brick7: gfsib02a.corvidtec.com:/data/brick02a/homegfs Brick8: gfsib02b.corvidtec.com:/data/brick02b/homegfs Options Reconfigured: performance.io-thread-count: 32 performance.cache-size: 128MB performance.write-behind-window-size: 128MB server.allow-insecure: on network.ping-timeout: 10 storage.owner-gid: 100 geo-replication.indexing: off geo-replication.ignore-pid-check: on changelog.changelog: on changelog.fsync-interval: 3 changelog.rollover-time: 15 server.manage-gids: on -- Original Message -- From: Xavier Hernandez xhernan...@datalab.es To: David F. Robinson david.robin...@corvidtec.com; Benjamin
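A quick way to test the flapping-brick theory raised above is to compare what each replica brick holds against what the FUSE mount shows for an affected directory. The sketch below is illustrative only: the brick list comes from the homegfs volume info quoted in this thread, but the mount point, the directory, and the passwordless-ssh access to the brick hosts are assumptions, not part of the original report.

#!/bin/bash
# Rough sketch: list entries present on a brick but missing from the FUSE mount.
DIR="some/affected/dir"          # hypothetical path relative to the volume root
MOUNT="/homegfs"                 # assumed FUSE mount point
BRICKS="
gfsib01a.corvidtec.com:/data/brick01a/homegfs
gfsib01b.corvidtec.com:/data/brick01b/homegfs
gfsib01a.corvidtec.com:/data/brick02a/homegfs
gfsib01b.corvidtec.com:/data/brick02b/homegfs
gfsib02a.corvidtec.com:/data/brick01a/homegfs
gfsib02b.corvidtec.com:/data/brick01b/homegfs
gfsib02a.corvidtec.com:/data/brick02a/homegfs
gfsib02b.corvidtec.com:/data/brick02b/homegfs
"

ls "$MOUNT/$DIR" 2>/dev/null | sort > /tmp/mount.list
for b in $BRICKS; do
    host=${b%%:*}; path=${b#*:}
    echo "== $b =="
    # Each brick only holds part of a distributed volume, so interpret per replica pair.
    ssh "$host" "ls $path/$DIR 2>/dev/null | sort" | comm -23 - /tmp/mount.list
done

Anything printed under a brick heading is an entry that brick has but the mount does not; if it appears on only one brick of a replica pair, that supports the written-to-one-subvolume theory.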
Re: [Gluster-devel] [Gluster-users] missing files
Isn't rsync what geo-rep uses? David (Sent from mobile) === David F. Robinson, Ph.D. President - Corvid Technologies 704.799.6944 x101 [office] 704.252.1310 [cell] 704.799.7974 [fax] david.robin...@corvidtec.com http://www.corvidtechnologies.com On Feb 5, 2015, at 5:41 PM, Ben Turner btur...@redhat.com wrote: - Original Message - From: Ben Turner btur...@redhat.com To: David F. Robinson david.robin...@corvidtec.com Cc: Pranith Kumar Karampuri pkara...@redhat.com, Xavier Hernandez xhernan...@datalab.es, Benjamin Turner bennytu...@gmail.com, gluster-us...@gluster.org, Gluster Devel gluster-devel@gluster.org Sent: Thursday, February 5, 2015 5:22:26 PM Subject: Re: [Gluster-users] [Gluster-devel] missing files - Original Message - From: David F. Robinson david.robin...@corvidtec.com To: Ben Turner btur...@redhat.com Cc: Pranith Kumar Karampuri pkara...@redhat.com, Xavier Hernandez xhernan...@datalab.es, Benjamin Turner bennytu...@gmail.com, gluster-us...@gluster.org, Gluster Devel gluster-devel@gluster.org Sent: Thursday, February 5, 2015 5:01:13 PM Subject: Re: [Gluster-users] [Gluster-devel] missing files I'll send you the emails I sent Pranith with the logs. What causes these disconnects? Thanks David! Disconnects happen when there are interruptions in communication between peers; normally it is the ping timeout that is hit. It could be anything from a flaky NW to the system being too busy to respond to the pings. My initial take is more towards the latter, as rsync is absolutely the worst use case for gluster - IIRC it writes in 4kb blocks. I try to keep my writes to at least 64KB, as in my testing that is the smallest block size I can write with before perf starts to really drop off. I'll try something similar in the lab. OK, I do think that the files being self-healed is the RCA for what you were seeing. Let's look at one of the disconnects: data-brick02a-homegfs.log:[2015-02-03 20:54:02.772180] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01b.corvidtec.com-4175-2015/02/02-16:44:31:179119-homegfs-client-2-0-1 And in the glustershd.log from the gfs01b_glustershd.log file: [2015-02-03 20:55:48.001797] I [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0: performing entry selfheal on 6c79a368-edaa-432b-bef9-ec690ab42448 [2015-02-03 20:55:49.341996] I [afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0: Completed entry selfheal on 6c79a368-edaa-432b-bef9-ec690ab42448. source=1 sinks=0 [2015-02-03 20:55:49.343093] I [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0: performing entry selfheal on 792cb0d6-9290-4447-8cd7-2b2d7a116a69 [2015-02-03 20:55:50.463652] I [afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0: Completed entry selfheal on 792cb0d6-9290-4447-8cd7-2b2d7a116a69. source=1 sinks=0 [2015-02-03 20:55:51.465289] I [afr-self-heal-metadata.c:54:__afr_selfheal_metadata_do] 0-homegfs-replicate-0: performing metadata selfheal on 403e661a-1c27-4e79-9867-c0572aba2b3c [2015-02-03 20:55:51.466515] I [afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0: Completed metadata selfheal on 403e661a-1c27-4e79-9867-c0572aba2b3c. source=1 sinks=0 [2015-02-03 20:55:51.467098] I [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0: performing entry selfheal on 403e661a-1c27-4e79-9867-c0572aba2b3c [2015-02-03 20:55:55.257808] I [afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0: Completed entry selfheal on 403e661a-1c27-4e79-9867-c0572aba2b3c.
source=1 sinks=0 [2015-02-03 20:55:55.258548] I [afr-self-heal-metadata.c:54:__afr_selfheal_metadata_do] 0-homegfs-replicate-0: performing metadata selfheal on c612ee2f-2fb4-4157-a9ab-5a2d5603c541 [2015-02-03 20:55:55.259367] I [afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0: Completed metadata selfheal on c612ee2f-2fb4-4157-a9ab-5a2d5603c541. source=1 sinks=0 [2015-02-03 20:55:55.259980] I [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0: performing entry selfheal on c612ee2f-2fb4-4157-a9ab-5a2d5603c541 As you can see the self heal logs are just spammed with files being healed, and I looked at a couple of disconnects and I see self heals getting run shortly after on the bricks that were down. Now we need to find the cause of the disconnects, I am thinking once the disconnects are resolved the files should be properly copied over without SH having to fix things. Like I said I'll give this a go on my lab systems and see if I can repro the disconnects, I'll have time to run through it tomorrow. If in the mean time anyone else has a theory / anything to add here it would be appreciated. -b -b David (Sent from mobile) === David F. Robinson, Ph.D
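The correlation described above (a brick-side disconnect followed shortly by self-heals in glustershd.log) can be pulled out of the logs mechanically instead of by eye. A rough sketch, assuming the log locations quoted in this thread; it simply prints each disconnect and the first few heal completions logged at or after that timestamp:

#!/bin/bash
# Rough sketch: for each disconnect in a brick log, show the next few completed
# self-heals from glustershd.log. Paths are examples; adjust per server.
BRICKLOG=/var/log/glusterfs/bricks/data-brick02a-homegfs.log
SHDLOG=/var/log/glusterfs/glustershd.log

grep 'disconnecting connection' "$BRICKLOG" |
sed 's/.*\[\(20[^]]*\)\].*/\1/' |
while read ts; do
    echo "disconnect at $ts"
    awk -v s="$ts" '/Completed .* selfheal/ {
        t = substr($0, index($0, "[") + 1, 26);   # ISO-style timestamp, string-comparable
        if (t >= s) { print "    " $0; if (++n == 5) exit }
    }' "$SHDLOG"
done

Heals landing within a minute or two of a disconnect on the same replica would be consistent with the theory.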
Re: [Gluster-devel] [Gluster-users] missing files
copy that. Thanks for looking into the issue. David -- Original Message -- From: Benjamin Turner bennytu...@gmail.com To: David F. Robinson david.robin...@corvidtec.com Cc: Ben Turner btur...@redhat.com; Pranith Kumar Karampuri pkara...@redhat.com; Xavier Hernandez xhernan...@datalab.es; gluster-us...@gluster.org gluster-us...@gluster.org; Gluster Devel gluster-devel@gluster.org Sent: 2/5/2015 9:05:43 PM Subject: Re: [Gluster-users] [Gluster-devel] missing files Correct! I have seen (back in the day, it's been 3-ish years since I have seen it) having, say, 50+ volumes each with a geo-rep session push system load to the point where pings couldn't be serviced within the ping timeout. So it is known to happen, but there has been a lot of work in the geo-rep space to help here, some of which is discussed: https://medium.com/@msvbhat/distributed-geo-replication-in-glusterfs-ec95f4393c50 (think tar + ssh and other fixes). Your symptoms remind me of that case of 50+ geo-rep'd volumes, which is why I mentioned it from the start. My current shoot-from-the-hip theory is that when rsyncing all that data the servers got too busy to service the pings, and that led to disconnects. This is common across all of the clustering / distributed software I have worked on: if the system gets too busy to service heartbeat within the timeout, things go crazy (think fork bomb on a single host). Now this could be a case of me putting symptoms from an old issue into what you are describing, but that's where my head is at. If I'm correct I should be able to repro using a similar workload. I think that the multi threaded epoll changes that _just_ landed in master will help resolve this, but they are so new I haven't been able to test this. I'll know more when I get a chance to test tomorrow. -b On Thu, Feb 5, 2015 at 6:04 PM, David F. Robinson david.robin...@corvidtec.com wrote: Isn't rsync what geo-rep uses? David (Sent from mobile) === David F. Robinson, Ph.D. President - Corvid Technologies 704.799.6944 x101 [office] 704.252.1310 [cell] 704.799.7974 [fax] david.robin...@corvidtec.com http://www.corvidtechnologies.com On Feb 5, 2015, at 5:41 PM, Ben Turner btur...@redhat.com wrote: - Original Message - From: Ben Turner btur...@redhat.com To: David F. Robinson david.robin...@corvidtec.com Cc: Pranith Kumar Karampuri pkara...@redhat.com, Xavier Hernandez xhernan...@datalab.es, Benjamin Turner bennytu...@gmail.com, gluster-us...@gluster.org, Gluster Devel gluster-devel@gluster.org Sent: Thursday, February 5, 2015 5:22:26 PM Subject: Re: [Gluster-users] [Gluster-devel] missing files - Original Message - From: David F. Robinson david.robin...@corvidtec.com To: Ben Turner btur...@redhat.com Cc: Pranith Kumar Karampuri pkara...@redhat.com, Xavier Hernandez xhernan...@datalab.es, Benjamin Turner bennytu...@gmail.com, gluster-us...@gluster.org, Gluster Devel gluster-devel@gluster.org Sent: Thursday, February 5, 2015 5:01:13 PM Subject: Re: [Gluster-users] [Gluster-devel] missing files I'll send you the emails I sent Pranith with the logs. What causes these disconnects? Thanks David! Disconnects happen when there are interruptions in communication between peers; normally it is the ping timeout that is hit. It could be anything from a flaky NW to the system being too busy to respond to the pings. My initial take is more towards the latter, as rsync is absolutely the worst use case for gluster - IIRC it writes in 4kb blocks.
I try to keep my writes at least 64KB as in my testing that is the smallest block size I can write with before perf starts to really drop off. I'll try something similar in the lab. Ok I do think that the file being self healed is RCA for what you were seeing. Lets look at one of the disconnects: data-brick02a-homegfs.log:[2015-02-03 20:54:02.772180] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01b.corvidtec.com-4175-2015/02/02-16:44:31:179119-homegfs-client-2-0-1 And in the glustershd.log from the gfs01b_glustershd.log file: [2015-02-03 20:55:48.001797] I [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0: performing entry selfheal on 6c79a368-edaa-432b-bef9-ec690ab42448 [2015-02-03 20:55:49.341996] I [afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0: Completed entry selfheal on 6c79a368-edaa-432b-bef9-ec690ab42448. source=1 sinks=0 [2015-02-03 20:55:49.343093] I [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0: performing entry selfheal on 792cb0d6-9290-4447-8cd7-2b2d7a116a69 [2015-02-03 20:55:50.463652] I [afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0: Completed entry selfheal on 792cb0d6-9290-4447-8cd7-2b2d7a116a69. source=1 sinks=0 [2015-02-03 20:55:51.465289] I [afr-self-heal-metadata.c:54:__afr_selfheal_metadata_do] 0-homegfs
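The 4 KB versus 64 KB point above is easy to sanity-check on a FUSE mount with dd. A minimal sketch, assuming a scratch directory on the mount with a few hundred MB free; the path and sizes are made up for illustration, and dd is obviously not rsync, but it shows where throughput falls off as the write size shrinks:

#!/bin/bash
# Rough sketch: sequential write throughput on a gluster FUSE mount at several block sizes.
# Each test writes 512 MB total; the mount path is an assumption.
MOUNT=/homegfs/scratch

for spec in "4k 131072" "16k 32768" "64k 8192" "256k 2048" "1M 512"; do
    set -- $spec
    bs=$1; count=$2
    echo "== block size $bs =="
    dd if=/dev/zero of="$MOUNT/ddtest.tmp" bs=$bs count=$count conv=fdatasync 2>&1 | tail -1
    rm -f "$MOUNT/ddtest.tmp"
done

If throughput drops sharply below 64 KB blocks, that supports the idea that the rsync write pattern, not the data volume itself, is what loads the servers.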
Re: [Gluster-devel] missing files
Distributed/replicated Volume Name: homegfs Type: Distributed-Replicate Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071 Status: Started Number of Bricks: 4 x 2 = 8 Transport-type: tcp Bricks: Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs Brick5: gfsib02a.corvidtec.com:/data/brick01a/homegfs Brick6: gfsib02b.corvidtec.com:/data/brick01b/homegfs Brick7: gfsib02a.corvidtec.com:/data/brick02a/homegfs Brick8: gfsib02b.corvidtec.com:/data/brick02b/homegfs Options Reconfigured: performance.io-thread-count: 32 performance.cache-size: 128MB performance.write-behind-window-size: 128MB server.allow-insecure: on network.ping-timeout: 10 storage.owner-gid: 100 geo-replication.indexing: off geo-replication.ignore-pid-check: on changelog.changelog: on changelog.fsync-interval: 3 changelog.rollover-time: 15 server.manage-gids: on -- Original Message -- From: Xavier Hernandez xhernan...@datalab.es To: David F. Robinson david.robin...@corvidtec.com; Benjamin Turner bennytu...@gmail.com Cc: gluster-us...@gluster.org gluster-us...@gluster.org; Gluster Devel gluster-devel@gluster.org Sent: 2/4/2015 6:03:45 AM Subject: Re: [Gluster-devel] missing files On 02/04/2015 01:30 AM, David F. Robinson wrote: Sorry. Thought about this a little more. I should have been clearer. The files were on both bricks of the replica, not just one side. So, both bricks had to have been up... The files/directories just don't show up on the mount. I was reading and saw a related bug (https://bugzilla.redhat.com/show_bug.cgi?id=1159484). I saw it suggested to run: find mount -d -exec getfattr -h -n trusted.ec.heal {} \; This command is specific for a dispersed volume. It won't do anything (aside from the error you are seeing) on a replicated volume. I think you are using a replicated volume, right ? In this case I'm not sure what can be happening. Is your volume a pure replicated one or a distributed-replicated ? on a pure replicated it doesn't make sense that some entries do not show in an 'ls' when the file is in both replicas (at least without any error message in the logs). On a distributed-replicated it could be caused by some problem while combining contents of each replica set. What's the configuration of your volume ? Xavi I get a bunch of errors for operation not supported: [root@gfs02a homegfs]# find wks_backup -d -exec getfattr -h -n trusted.ec.heal {} \; find: warning: the -d option is deprecated; please use -depth instead, because the latter is a POSIX-compliant feature. wks_backup/homer_backup/backup: trusted.ec.heal: Operation not supported wks_backup/homer_backup/logs/2014_05_20.log: trusted.ec.heal: Operation not supported wks_backup/homer_backup/logs/2014_05_21.log: trusted.ec.heal: Operation not supported wks_backup/homer_backup/logs/2014_05_18.log: trusted.ec.heal: Operation not supported wks_backup/homer_backup/logs/2014_05_19.log: trusted.ec.heal: Operation not supported wks_backup/homer_backup/logs/2014_05_22.log: trusted.ec.heal: Operation not supported wks_backup/homer_backup/logs: trusted.ec.heal: Operation not supported wks_backup/homer_backup: trusted.ec.heal: Operation not supported -- Original Message -- From: Benjamin Turner bennytu...@gmail.com mailto:bennytu...@gmail.com To: David F. 
Robinson david.robin...@corvidtec.com mailto:david.robin...@corvidtec.com Cc: Gluster Devel gluster-devel@gluster.org mailto:gluster-devel@gluster.org; gluster-us...@gluster.org gluster-us...@gluster.org mailto:gluster-us...@gluster.org Sent: 2/3/2015 7:12:34 PM Subject: Re: [Gluster-devel] missing files It sounds to me like the files were only copied to one replica, werent there for the initial for the initial ls which triggered a self heal, and were there for the last ls because they were healed. Is there any chance that one of the replicas was down during the rsync? It could be that you lost a brick during copy or something like that. To confirm I would look for disconnects in the brick logs as well as checking glusterfshd.log to verify the missing files were actually healed. -b On Tue, Feb 3, 2015 at 5:37 PM, David F. Robinson david.robin...@corvidtec.com mailto:david.robin...@corvidtec.com wrote: I rsync'd 20-TB over to my gluster system and noticed that I had some directories missing even though the rsync completed normally. The rsync logs showed that the missing files were transferred. I went to the bricks and did an 'ls -al /data/brick*/homegfs/dir/*' the files were on the bricks. After I did this 'ls', the files then showed up on the FUSE mounts. 1) Why are the files hidden on the fuse mount? 2) Why does the ls make them show up on the FUSE mount? 3) How can I prevent this from happening again? Note, I also mounted the gluster volume using NFS and saw
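Since trusted.ec.heal only exists on dispersed volumes, the closest equivalent check on a replicated volume like homegfs is to read the AFR changelog xattrs directly on a brick. A rough sketch; the path is just an example and the xattr names assume the usual trusted.afr.<volume>-client-N scheme:

# Rough sketch: run on a brick server, against the brick copy of a suspect directory/file.
getfattr -d -m . -e hex /data/brick01a/homegfs/wks_backup/homer_backup/backup

# Expected output includes lines such as:
#   trusted.afr.homegfs-client-0=0x000000000000000000000000
#   trusted.afr.homegfs-client-1=0x000000000000000000000000
#   trusted.gfid=0x...
# All-zero trusted.afr.* values mean nothing is pending; non-zero counters mean data,
# metadata, or entry changes that still need to be healed to the other replica.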
Re: [Gluster-devel] gluster 3.6.2 ls issues
Cancel this issue. I found the problem. Explanation below... We use Active Directory to manage our user accounts; however, sssd doesn't seem to play well with gluster. When I turn it on, the cpu load shoots up to between 80-100% and stays there (previously submitted bug report). So, what I did on my gluster machines to keep the uid/gid info updated (required due to server.manage-gids=on) was to write a script that starts sssd, grabs all of the groups/users from the server, parses them out into the /etc/group and /etc/passwd files, and then shuts down sssd. I didn't realize that sssd uses the locally cached file. My script was running faster than sssd was updating the cache file, so this particular user wasn't in the SBIR group on all of the machines. He was in that group on gfs01a, but not on gfs01b (replica pair) or gfs02a/02b. I guess this gave him enough permission to cd into the directory, but for some strange reason he couldn't do an ls and have the directory name show up. The only reason I do any of this is because I had to use server.manage-gids to overcome the 32-group limitation. This requires that my storage system have all of the user accounts and groups. The preferred option would be to simply use sssd on my storage systems, but it doesn't seem to play well with gluster. David -- Original Message -- From: David F. Robinson david.robin...@corvidtec.com To: Gluster Devel gluster-devel@gluster.org; gluster-us...@gluster.org gluster-us...@gluster.org Sent: 2/3/2015 12:56:40 PM Subject: gluster 3.6.2 ls issues On my gluster filesystem mount, I have a user who does an ls and not all of the directories show up. Note that the A15-029 directory doesn't show up. However, as kbutz I can cd into the directory. As root (also tested as several other users), I get the following from an ls -al [root@sb1 2015.1]# ls -al total 16 drwxrws--x 13 streadway sbir 868 Feb 3 12:48 . drwxrws--- 46 root sbir 16384 Feb 3 10:50 .. drwxrws--x 5 cczech sbir 606 Jan 30 12:58 A15-007 drwxrws--x 5 kbutz sbir 291 Feb 3 12:11 A15-029 drwxrws--x 3 randerson sbir 219 Feb 3 11:52 A15-063 drwxrws--x 4 abirnbaum sbir 223 Feb 3 10:14 A15-088 drwxrws--x 2 anelson sbir 270 Jan 27 14:30 AF151-058 drwxrws--x 3 tanderson sbir 216 Jan 28 09:43 AF151-072 drwxrws--x 3 streadway sbir 162 Jan 21 13:28 AF151-102 drwxrws--x 4 aaronward sbir 493 Feb 3 09:58 AF151-114 drwxrws--x 3 streadway sbir 162 Feb 3 12:07 AF151-174 drwxrws--x 3 dstowe sbir 192 Jan 27 12:25 AF15-AT28 drwxrws--x 3 kboyett sbir 199 Jan 28 09:43 NASA As user kbutz, I get the following: sb1:sbir/2015.1 ls -al total 16 drwxrws--x 13 streadway sbir 868 Feb 3 12:48 ./ drwxrws--- 46 root sbir 16384 Feb 3 10:50 ../ drwxrws--x 3 randerson sbir 219 Feb 3 11:52 A15-063/ drwxrws--x 4 abirnbaum sbir 223 Feb 3 10:14 A15-088/ drwxrws--x 2 anelson sbir 270 Jan 27 14:30 AF151-058/ drwxrws--x 3 streadway sbir 162 Jan 21 13:28 AF151-102/ drwxrws--x 3 streadway sbir 162 Feb 3 12:07 AF151-174/ drwxrws--x 3 kboyett sbir 199 Jan 28 09:43 NASA/ Note that I can still cd into the non-listed directory as kbutz: [kbutz@sb1 ~]$ cd /homegfs/documentation/programs/sbir/2015.1 A15-063/ A15-088/ AF151-058/ AF151-102/ AF151-174/ NASA/ sb1:sbir/2015.1 cd A15-029 A15-029_proposal_draft_rev1.docx* CB_work/ gun_work/ Refs/ David === David F. Robinson, Ph.D. 
President - Corvid Technologies 704.799.6944 x101 [office] 704.252.1310 [cell] 704.799.7974 [fax] david.robin...@corvidtec.com http://www.corvidtechnologies.com ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
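For reference, the sync approach described above (bring sssd up long enough to enumerate the AD users and groups, flatten them into the local files, then stop sssd) might look roughly like the sketch below. This is a guess at the idea, not the actual script: it assumes sssd is configured with enumerate = true so getent can list the domain, that AD IDs live above 10000, and it skips the safety checks a real script would need before rewriting /etc/passwd and /etc/group.

#!/bin/bash
# Rough sketch: pull AD users/groups via sssd, merge into local files, then stop sssd.
# UID/GID cutoff, paths, and service names are assumptions for illustration only.
service sssd start
sleep 10                                    # let sssd come up and reach the domain

getent passwd | awk -F: '$3 >= 10000' > /tmp/ad_passwd   # AD-provided users
getent group  | awk -F: '$3 >= 10000' > /tmp/ad_group    # AD-provided groups

service sssd stop                           # avoid the runaway-CPU issue mentioned above

# Keep local (low-ID) entries, append the AD-derived ones, and back up each file first.
awk -F: '$3 < 10000' /etc/passwd | cat - /tmp/ad_passwd > /etc/passwd.new
awk -F: '$3 < 10000' /etc/group  | cat - /tmp/ad_group  > /etc/group.new
cp -p /etc/passwd /etc/passwd.bak && mv /etc/passwd.new /etc/passwd
cp -p /etc/group  /etc/group.bak  && mv /etc/group.new  /etc/group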
[Gluster-devel] missing files
I rsync'd 20-TB over to my gluster system and noticed that I had some directories missing even though the rsync completed normally. The rsync logs showed that the missing files were transferred. I went to the bricks and did an 'ls -al /data/brick*/homegfs/dir/*'; the files were on the bricks. After I did this 'ls', the files then showed up on the FUSE mounts. 1) Why are the files hidden on the fuse mount? 2) Why does the ls make them show up on the FUSE mount? 3) How can I prevent this from happening again? Note, I also mounted the gluster volume using NFS and saw the same behavior. The files/directories were not shown until I did the ls on the bricks. David === David F. Robinson, Ph.D. President - Corvid Technologies 704.799.6944 x101 [office] 704.252.1310 [cell] 704.799.7974 [fax] david.robin...@corvidtec.com http://www.corvidtechnologies.com ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
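Until the root cause of the missing entries is understood, the heal can also be kicked off from a client rather than by running ls on the bricks, since it is the lookup itself that triggers self-heal. A minimal sketch (the directory name is an example):

# Rough sketch: force lookups from the FUSE mount so self-heal runs without touching bricks.
find /homegfs/dir -exec stat {} \; > /dev/null

# Or let the self-heal daemon sweep the whole volume and then check what is left:
gluster volume heal homegfs
gluster volume heal homegfs info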
Re: [Gluster-devel] missing files
Sorry. Thought about this a little more. I should have been clearer. The files were on both bricks of the replica, not just one side. So, both bricks had to have been up... The files/directories just don't show up on the mount. I was reading and saw a related bug (https://bugzilla.redhat.com/show_bug.cgi?id=1159484). I saw it suggested to run: find mount -d -exec getfattr -h -n trusted.ec.heal {} \; I get a bunch of errors for operation not supported: [root@gfs02a homegfs]# find wks_backup -d -exec getfattr -h -n trusted.ec.heal {} \; find: warning: the -d option is deprecated; please use -depth instead, because the latter is a POSIX-compliant feature. wks_backup/homer_backup/backup: trusted.ec.heal: Operation not supported wks_backup/homer_backup/logs/2014_05_20.log: trusted.ec.heal: Operation not supported wks_backup/homer_backup/logs/2014_05_21.log: trusted.ec.heal: Operation not supported wks_backup/homer_backup/logs/2014_05_18.log: trusted.ec.heal: Operation not supported wks_backup/homer_backup/logs/2014_05_19.log: trusted.ec.heal: Operation not supported wks_backup/homer_backup/logs/2014_05_22.log: trusted.ec.heal: Operation not supported wks_backup/homer_backup/logs: trusted.ec.heal: Operation not supported wks_backup/homer_backup: trusted.ec.heal: Operation not supported -- Original Message -- From: Benjamin Turner bennytu...@gmail.com To: David F. Robinson david.robin...@corvidtec.com Cc: Gluster Devel gluster-devel@gluster.org; gluster-us...@gluster.org gluster-us...@gluster.org Sent: 2/3/2015 7:12:34 PM Subject: Re: [Gluster-devel] missing files It sounds to me like the files were only copied to one replica, werent there for the initial for the initial ls which triggered a self heal, and were there for the last ls because they were healed. Is there any chance that one of the replicas was down during the rsync? It could be that you lost a brick during copy or something like that. To confirm I would look for disconnects in the brick logs as well as checking glusterfshd.log to verify the missing files were actually healed. -b On Tue, Feb 3, 2015 at 5:37 PM, David F. Robinson david.robin...@corvidtec.com wrote: I rsync'd 20-TB over to my gluster system and noticed that I had some directories missing even though the rsync completed normally. The rsync logs showed that the missing files were transferred. I went to the bricks and did an 'ls -al /data/brick*/homegfs/dir/*' the files were on the bricks. After I did this 'ls', the files then showed up on the FUSE mounts. 1) Why are the files hidden on the fuse mount? 2) Why does the ls make them show up on the FUSE mount? 3) How can I prevent this from happening again? Note, I also mounted the gluster volume using NFS and saw the same behavior. The files/directories were not shown until I did the ls on the bricks. David === David F. Robinson, Ph.D. President - Corvid Technologies 704.799.6944 x101 [office] 704.252.1310 [cell] 704.799.7974 [fax] david.robin...@corvidtec.com http://www.corvidtechnologies.com ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] missing files
Like these? data-brick02a-homegfs.log:[2015-02-03 19:09:34.568842] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs02a.corvidtec.com-18563-2015/02/03-19:07:58:519134-homegfs-client-2-0-0 data-brick02a-homegfs.log:[2015-02-03 19:09:41.286551] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01a.corvidtec.com-12804-2015/02/03-19:09:38:497808-homegfs-client-2-0-0 data-brick02a-homegfs.log:[2015-02-03 19:16:35.906412] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs02b.corvidtec.com-27190-2015/02/03-19:15:53:458467-homegfs-client-2-0-0 data-brick02a-homegfs.log:[2015-02-03 19:51:22.761293] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01a.corvidtec.com-25926-2015/02/03-19:51:02:89070-homegfs-client-2-0-0 data-brick02a-homegfs.log:[2015-02-03 20:54:02.772180] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01b.corvidtec.com-4175-2015/02/02-16:44:31:179119-homegfs-client-2-0-1 data-brick02a-homegfs.log:[2015-02-03 22:44:47.458905] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01a.corvidtec.com-29467-2015/02/03-22:44:05:838129-homegfs-client-2-0-0 data-brick02a-homegfs.log:[2015-02-03 22:47:42.830866] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01a.corvidtec.com-30069-2015/02/03-22:47:37:209436-homegfs-client-2-0-0 data-brick02a-homegfs.log:[2015-02-03 22:48:26.785931] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01a.corvidtec.com-30256-2015/02/03-22:47:55:203659-homegfs-client-2-0-0 data-brick02a-homegfs.log:[2015-02-03 22:53:25.530836] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01a.corvidtec.com-30658-2015/02/03-22:53:21:627538-homegfs-client-2-0-0 data-brick02a-homegfs.log:[2015-02-03 22:56:14.033823] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01a.corvidtec.com-30893-2015/02/03-22:56:01:450507-homegfs-client-2-0-0 data-brick02a-homegfs.log:[2015-02-03 22:56:55.622800] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01a.corvidtec.com-31080-2015/02/03-22:56:32:665370-homegfs-client-2-0-0 data-brick02a-homegfs.log:[2015-02-03 22:59:11.445742] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01a.corvidtec.com-31383-2015/02/03-22:58:45:190874-homegfs-client-2-0-0 data-brick02a-homegfs.log:[2015-02-03 23:06:26.482709] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01a.corvidtec.com-31720-2015/02/03-23:06:11:340012-homegfs-client-2-0-0 data-brick02a-homegfs.log:[2015-02-03 23:10:54.807725] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01a.corvidtec.com-32083-2015/02/03-23:10:22:131678-homegfs-client-2-0-0 data-brick02a-homegfs.log:[2015-02-03 23:13:35.545513] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01a.corvidtec.com-32284-2015/02/03-23:13:21:26552-homegfs-client-2-0-0 data-brick02a-homegfs.log:[2015-02-03 23:14:19.065271] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01a.corvidtec.com-32471-2015/02/03-23:13:48:221126-homegfs-client-2-0-0 data-brick02a-homegfs.log:[2015-02-04 00:18:20.261428] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from 
gfs01a.corvidtec.com-1369-2015/02/04-00:16:53:613570-homegfs-client-2-0-0 -- Original Message -- From: Benjamin Turner bennytu...@gmail.com To: David F. Robinson david.robin...@corvidtec.com Cc: Gluster Devel gluster-devel@gluster.org; gluster-us...@gluster.org gluster-us...@gluster.org Sent: 2/3/2015 7:12:34 PM Subject: Re: [Gluster-devel] missing files It sounds to me like the files were only copied to one replica, werent there for the initial for the initial ls which triggered a self heal, and were there for the last ls because they were healed. Is there any chance that one of the replicas was down during the rsync? It could be that you lost a brick during copy or something like that. To confirm I would look for disconnects in the brick logs as well as checking glusterfshd.log to verify the missing files were actually healed. -b On Tue, Feb 3, 2015 at 5:37 PM, David F. Robinson david.robin...@corvidtec.com wrote: I rsync'd 20-TB over to my gluster system and noticed that I had some directories missing even though the rsync completed normally. The rsync logs showed that the missing files were transferred. I went to the bricks and did an 'ls -al /data/brick*/homegfs/dir/*' the files were on the bricks. After I did this 'ls', the files then showed up on the FUSE mounts. 1) Why are the files hidden on the fuse mount? 2) Why does the ls
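Rather than reading the raw disconnect lines, it can help to summarize them: how many disconnects per client host, and how they cluster in time. A rough sketch over the brick logs (the log directory is the usual location for brick logs and is an assumption here):

# Rough sketch: count 'disconnecting connection' events per client host...
grep -h 'disconnecting connection' /var/log/glusterfs/bricks/*.log |
  sed 's/.*connection from \([^.]*\)\..*/\1/' | sort | uniq -c | sort -rn

# ...and per hour, to see whether they cluster around the rsync window.
grep -h 'disconnecting connection' /var/log/glusterfs/bricks/*.log |
  sed 's/^\[\(....-..-.. ..\).*/\1:00/' | sort | uniq -c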
[Gluster-devel] failed heal
I have several files that gluster says it cannot heal. I deleted the files from all of the bricks (/data/brick0*/hpc_shared/motorsports/gmics/Raven/p3/*) and ran a full heal using 'gluster volume heal homegfs full'. Even after the full heal, the entries below still show up. How do I clear these? [root@gfs01a ~]# gluster volume heal homegfs info Gathering list of entries to be healed on volume homegfs has been successful Brick gfsib01a.corvidtec.com:/data/brick01a/homegfs Number of entries: 10 /hpc_shared/motorsports/gmics/Raven/p3/70_rke/Movies gfid:a6fc9011-74ad-4128-a232-4ccd41215ac8 gfid:bc17fa79-c1fd-483d-82b1-2c0d3564ddc5 gfid:ec804b5c-8bfc-4e7b-91e3-aded7952e609 gfid:ba62e340-4fad-477c-b450-704133577cbb gfid:4843aa40-8361-4a97-88d5-d37fc28e04c0 gfid:c90a8f1c-c49e-4476-8a50-2bfb0a89323c gfid:090042df-855a-4f5d-8929-c58feec10e33 /hpc_shared/motorsports/gmics/Raven/p3/70_rke/.Convrg.swp /hpc_shared/motorsports/gmics/Raven/p3/70_rke Brick gfsib01b.corvidtec.com:/data/brick01b/homegfs Number of entries: 2 gfid:f96b4ddf-8a75-4abb-a640-15dbe41fdafa /hpc_shared/motorsports/gmics/Raven/p3/70_rke Brick gfsib01a.corvidtec.com:/data/brick02a/homegfs Number of entries: 7 gfid:5d08fe1d-17b3-4a76-ab43-c708e346162f /hpc_shared/motorsports/gmics/Raven/p3/70_rke/PICTURES/.tmpcheck /hpc_shared/motorsports/gmics/Raven/p3/70_rke/PICTURES /hpc_shared/motorsports/gmics/Raven/p3/70_rke/Movies gfid:427d3738-3a41-4e51-ba2b-f0ba7254d013 gfid:8ad88a4d-8d5e-408f-a1de-36116cf6d5c1 gfid:0e034160-cd50-4108-956d-e45858f27feb Brick gfsib01b.corvidtec.com:/data/brick02b/homegfs Number of entries: 0 Brick gfsib02a.corvidtec.com:/data/brick01a/homegfs Number of entries: 0 Brick gfsib02b.corvidtec.com:/data/brick01b/homegfs Number of entries: 0 Brick gfsib02a.corvidtec.com:/data/brick02a/homegfs Number of entries: 0 Brick gfsib02b.corvidtec.com:/data/brick02b/homegfs Number of entries: 0 === David F. Robinson, Ph.D. President - Corvid Technologies 704.799.6944 x101 [office] 704.252.1310 [cell] 704.799.7974 [fax] david.robin...@corvidtec.com http://www.corvidtechnologies.com ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
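For what it's worth, the gfid:... entries in heal info can usually be mapped back to a path by going through the .glusterfs directory on a brick: the first two and next two characters of the gfid name the two directory levels, and for regular files the entry there is a hard link to the real file. A rough sketch using the first gfid above (the brick path is one of the ones listed in the output):

# Rough sketch: resolve a gfid from 'gluster volume heal homegfs info' to a brick path.
BRICK=/data/brick01a/homegfs
GFID=a6fc9011-74ad-4128-a232-4ccd41215ac8          # example taken from the output above

GPATH=$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID
ls -l "$GPATH"                                     # directories show up as symlinks here

# For regular files the .glusterfs entry is a hard link, so the real path is:
find "$BRICK" -path "$BRICK/.glusterfs" -prune -o -samefile "$GPATH" -print 2>/dev/null

If a gfid resolves to nothing outside .glusterfs on any brick, the entry is an orphan left behind by the deletion; clearing the corresponding index entry under .glusterfs/indices/xattrop (only after confirming it is truly orphaned) is what usually makes it disappear from heal info.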
Re: [Gluster-devel] [Gluster-users] v3.6.2
After shutting down all NFS and gluster processes, there was still an NFS registration. [root@gfs01bkp ~]# rpcinfo -p program vers proto port service 100000 4 tcp 111 portmapper 100000 3 tcp 111 portmapper 100000 2 tcp 111 portmapper 100000 4 udp 111 portmapper 100000 3 udp 111 portmapper 100000 2 udp 111 portmapper 100005 3 tcp 38465 mountd 100005 1 tcp 38466 mountd 100003 3 tcp 2049 nfs 100024 1 udp 34738 status 100024 1 tcp 37269 status [root@gfs01bkp ~]# netstat -anp | grep 2049 [root@gfs01bkp ~]# netstat -anp | grep 38465 [root@gfs01bkp ~]# netstat -anp | grep 38466 I removed the stale registrations using rpcinfo -d [root@gfs01bkp ~]# rpcinfo -p program vers proto port service 100000 4 tcp 111 portmapper 100000 3 tcp 111 portmapper 100000 2 tcp 111 portmapper 100000 4 udp 111 portmapper 100000 3 udp 111 portmapper 100000 2 udp 111 portmapper 100024 1 udp 34738 status 100024 1 tcp 37269 status Then I restarted glusterd and did a 'mount -a'. Worked perfectly. And the errors that were showing up in the logs every 3 seconds stopped. Thanks for your help. Greatly appreciated. David -- Original Message -- From: Xavier Hernandez xhernan...@datalab.es To: David F. Robinson david.robin...@corvidtec.com; Kaushal M kshlms...@gmail.com Cc: Gluster Users gluster-us...@gluster.org; Gluster Devel gluster-devel@gluster.org Sent: 1/27/2015 10:02:31 AM Subject: Re: [Gluster-devel] [Gluster-users] v3.6.2 Hi, I had a similar problem once. It happened after doing some unrelated tests with NFS. I thought it was a problem I generated doing weird things, so I didn't investigate the cause further. To see if this is the same case, try this: * Unmount all NFS mounts and stop all gluster volumes * Check that there are no gluster processes running (ps ax | grep gluster), specially any glusterfs. glusterd is ok. * Check that there are no NFS processes running (ps ax | grep nfs) * Check with 'rpcinfo -p' that there's no nfs service registered The output should be similar to this: program vers proto port service 10 4 tcp 111 portmapper 10 3 tcp 111 portmapper 10 2 tcp 111 portmapper 10 4 udp 111 portmapper 10 3 udp 111 portmapper 10 2 udp 111 portmapper 100024 1 udp 33482 status 100024 1 tcp 37034 status If there are more services registered, you can directly delete them or check if they correspond to an active process. For example, if the output is this: program vers proto port service 10 4 tcp 111 portmapper 10 3 tcp 111 portmapper 10 2 tcp 111 portmapper 10 4 udp 111 portmapper 10 3 udp 111 portmapper 10 2 udp 111 portmapper 100021 3 udp 39618 nlockmgr 100021 3 tcp 41067 nlockmgr 100024 1 udp 33482 status 100024 1 tcp 37034 status You can do a netstat -anp | grep 39618 to see if there is some process really listening at the nlockmgr port. You can repeat this for port 41067. If there is some process, you should stop it. If there is no process listening on that port, you should remove it with a command like this: rpcinfo -d 100021 3 You must execute this command for all stale ports for any services other than portmapper and status. Once done you should get the output shown before. After that, you can try to start your volume and see if everything is registered (rpcinfo -p) and if gluster has started the nfs server (gluster volume status). If everything is ok, you should be able to mount the volume using NFS. Xavi On 01/27/2015 03:18 PM, David F. Robinson wrote: Turning off nfslock did not help. 
Also, still getting these messages every 3-seconds: [2015-01-27 14:16:12.921880] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:15.922431] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:18.923080] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:21.923748] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:24.924472] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:27.925192] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:30.925895] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run
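The cleanup above (find registrations that no longer have a listener, then rpcinfo -d them) can be scripted along the lines below. Same logic as Xavi's instructions; the listener check is only a rough heuristic, and portmapper and status are left alone as he notes.

#!/bin/bash
# Rough sketch: drop stale portmapper registrations that have no process listening.
rpcinfo -p | awk 'NR > 1 && $5 != "portmapper" && $5 != "status" {print $1, $2, $4}' |
sort -u | while read prog vers port; do
    if ! netstat -an | grep -q "[:.]${port}[[:space:]]"; then
        echo "no listener on port $port -> unregistering program $prog version $vers"
        rpcinfo -d "$prog" "$vers"
    fi
done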
Re: [Gluster-devel] [Gluster-users] v3.6.2
I rebooted the machine to see if the problem would return and it does. Same issue after a reboot. Any suggestions? One other thing I tested was to comment out the NFS mounts in /etc/fstab: # gfsib01bkp.corvidtec.com:/homegfs_bkp /backup_nfs/homegfs nfs vers=3,intr,bg,rsize=32768,wsize=32768 0 0 After the machine comes back up, I remove the comment and do a 'mount -a'. The mount works fine. It looks like it is a timing during startup issue. Is it trying to do the NFS mount while glusterd is still starting up? David -- Original Message -- From: Xavier Hernandez xhernan...@datalab.es To: David F. Robinson david.robin...@corvidtec.com; Kaushal M kshlms...@gmail.com Cc: Gluster Users gluster-us...@gluster.org; Gluster Devel gluster-devel@gluster.org Sent: 1/27/2015 10:02:31 AM Subject: Re: [Gluster-devel] [Gluster-users] v3.6.2 Hi, I had a similar problem once. It happened after doing some unrelated tests with NFS. I thought it was a problem I generated doing weird things, so I didn't investigate the cause further. To see if this is the same case, try this: * Unmount all NFS mounts and stop all gluster volumes * Check that there are no gluster processes running (ps ax | grep gluster), specially any glusterfs. glusterd is ok. * Check that there are no NFS processes running (ps ax | grep nfs) * Check with 'rpcinfo -p' that there's no nfs service registered The output should be similar to this: program vers proto port service 10 4 tcp 111 portmapper 10 3 tcp 111 portmapper 10 2 tcp 111 portmapper 10 4 udp 111 portmapper 10 3 udp 111 portmapper 10 2 udp 111 portmapper 100024 1 udp 33482 status 100024 1 tcp 37034 status If there are more services registered, you can directly delete them or check if they correspond to an active process. For example, if the output is this: program vers proto port service 10 4 tcp 111 portmapper 10 3 tcp 111 portmapper 10 2 tcp 111 portmapper 10 4 udp 111 portmapper 10 3 udp 111 portmapper 10 2 udp 111 portmapper 100021 3 udp 39618 nlockmgr 100021 3 tcp 41067 nlockmgr 100024 1 udp 33482 status 100024 1 tcp 37034 status You can do a netstat -anp | grep 39618 to see if there is some process really listening at the nlockmgr port. You can repeat this for port 41067. If there is some process, you should stop it. If there is no process listening on that port, you should remove it with a command like this: rpcinfo -d 100021 3 You must execute this command for all stale ports for any services other than portmapper and status. Once done you should get the output shown before. After that, you can try to start your volume and see if everything is registered (rpcinfo -p) and if gluster has started the nfs server (gluster volume status). If everything is ok, you should be able to mount the volume using NFS. Xavi On 01/27/2015 03:18 PM, David F. Robinson wrote: Turning off nfslock did not help. 
Also, still getting these messages every 3-seconds: [2015-01-27 14:16:12.921880] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:15.922431] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:18.923080] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:21.923748] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:24.924472] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:27.925192] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:30.925895] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:33.926563] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:36.927248] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) -- Original Message -- From: Kaushal M kshlms...@gmail.com mailto:kshlms...@gmail.com To: David F. Robinson david.robin...@corvidtec.com mailto:david.robin...@corvidtec.com Cc: Joe Julian j...@julianfamily.org mailto:j...@julianfamily.org; Gluster Users gluster-us...@gluster.org mailto:gluster-us...@gluster.org; Gluster Devel gluster-devel@gluster.org mailto:gluster-devel@gluster.org Sent: 1/27/2015 1:49:56 AM Subject: Re: Re[2]: [Gluster-devel] [Gluster
Re: [Gluster-devel] [Gluster-users] v3.6.2
In my /etc/fstab, I have the following: gfsib01bkp.corvidtec.com:/homegfs_bkp /backup/homegfs glusterfs transport=tcp,_netdev 0 0 gfsib01bkp.corvidtec.com:/Software_bkp /backup/Software glusterfs transport=tcp,_netdev 0 0 gfsib01bkp.corvidtec.com:/Source_bkp /backup/Source glusterfs transport=tcp,_netdev 0 0 #... Setup NFS mounts as well gfsib01bkp.corvidtec.com:/homegfs_bkp /backup_nfs/homegfs nfs vers=3,intr,bg,rsize=32768,wsize=32768 0 0 It looks like it is trying to start the nfs mount before gluster has finished coming up and that this is hanging the nfs ports. I have _netdev in the glusterfs mount point to make sure the network has come up (including infiniband) prior to starting gluster. Shouldn't the gluster init scripts check for gluster startup prior to starting the nfs mount? It doesn't look like this is working properly. David -- Original Message -- From: Xavier Hernandez xhernan...@datalab.es To: David F. Robinson david.robin...@corvidtec.com; Kaushal M kshlms...@gmail.com Cc: Gluster Users gluster-us...@gluster.org; Gluster Devel gluster-devel@gluster.org Sent: 1/27/2015 10:02:31 AM Subject: Re: [Gluster-devel] [Gluster-users] v3.6.2 Hi, I had a similar problem once. It happened after doing some unrelated tests with NFS. I thought it was a problem I generated doing weird things, so I didn't investigate the cause further. To see if this is the same case, try this: * Unmount all NFS mounts and stop all gluster volumes * Check that there are no gluster processes running (ps ax | grep gluster), specially any glusterfs. glusterd is ok. * Check that there are no NFS processes running (ps ax | grep nfs) * Check with 'rpcinfo -p' that there's no nfs service registered The output should be similar to this: program vers proto port service 10 4 tcp 111 portmapper 10 3 tcp 111 portmapper 10 2 tcp 111 portmapper 10 4 udp 111 portmapper 10 3 udp 111 portmapper 10 2 udp 111 portmapper 100024 1 udp 33482 status 100024 1 tcp 37034 status If there are more services registered, you can directly delete them or check if they correspond to an active process. For example, if the output is this: program vers proto port service 10 4 tcp 111 portmapper 10 3 tcp 111 portmapper 10 2 tcp 111 portmapper 10 4 udp 111 portmapper 10 3 udp 111 portmapper 10 2 udp 111 portmapper 100021 3 udp 39618 nlockmgr 100021 3 tcp 41067 nlockmgr 100024 1 udp 33482 status 100024 1 tcp 37034 status You can do a netstat -anp | grep 39618 to see if there is some process really listening at the nlockmgr port. You can repeat this for port 41067. If there is some process, you should stop it. If there is no process listening on that port, you should remove it with a command like this: rpcinfo -d 100021 3 You must execute this command for all stale ports for any services other than portmapper and status. Once done you should get the output shown before. After that, you can try to start your volume and see if everything is registered (rpcinfo -p) and if gluster has started the nfs server (gluster volume status). If everything is ok, you should be able to mount the volume using NFS. Xavi On 01/27/2015 03:18 PM, David F. Robinson wrote: Turning off nfslock did not help. 
Also, still getting these messages every 3-seconds: [2015-01-27 14:16:12.921880] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:15.922431] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:18.923080] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:21.923748] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:24.924472] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:27.925192] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:30.925895] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:33.926563] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:36.927248] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) -- Original Message -- From: Kaushal M kshlms...@gmail.com
Re: [Gluster-devel] [Gluster-users] v3.6.2
Not elegant, but here is my short-term fix to prevent the issue after a reboot: added 'noauto' to the NFS mounts in /etc/fstab and moved the mount commands to /etc/rc.local: /etc/fstab: #... Note: Used 'noauto' for the NFS mounts and put the mount in /etc/rc.local to ensure that #... gluster has been started before attempting to mount using NFS. Otherwise, it hangs the ports #... during startup. gfsib01bkp.corvidtec.com:/homegfs_bkp /backup_nfs/homegfs nfs vers=3,intr,bg,rsize=32768,wsize=32768,noauto 0 0 gfsib01a.corvidtec.com:/homegfs /homegfs_nfs nfs vers=3,intr,bg,rsize=32768,wsize=32768,noauto 0 0 /etc/rc.local: /etc/init.d/glusterd restart (sleep 20; mount -a; mount /backup_nfs/homegfs) -- Original Message -- From: Xavier Hernandez xhernan...@datalab.es To: David F. Robinson david.robin...@corvidtec.com; Kaushal M kshlms...@gmail.com Cc: Gluster Users gluster-us...@gluster.org; Gluster Devel gluster-devel@gluster.org Sent: 1/27/2015 10:02:31 AM Subject: Re: [Gluster-devel] [Gluster-users] v3.6.2 Hi, I had a similar problem once. It happened after doing some unrelated tests with NFS. I thought it was a problem I generated doing weird things, so I didn't investigate the cause further. To see if this is the same case, try this: * Unmount all NFS mounts and stop all gluster volumes * Check that there are no gluster processes running (ps ax | grep gluster), specially any glusterfs. glusterd is ok. * Check that there are no NFS processes running (ps ax | grep nfs) * Check with 'rpcinfo -p' that there's no nfs service registered The output should be similar to this: program vers proto port service 10 4 tcp 111 portmapper 10 3 tcp 111 portmapper 10 2 tcp 111 portmapper 10 4 udp 111 portmapper 10 3 udp 111 portmapper 10 2 udp 111 portmapper 100024 1 udp 33482 status 100024 1 tcp 37034 status If there are more services registered, you can directly delete them or check if they correspond to an active process. For example, if the output is this: program vers proto port service 10 4 tcp 111 portmapper 10 3 tcp 111 portmapper 10 2 tcp 111 portmapper 10 4 udp 111 portmapper 10 3 udp 111 portmapper 10 2 udp 111 portmapper 100021 3 udp 39618 nlockmgr 100021 3 tcp 41067 nlockmgr 100024 1 udp 33482 status 100024 1 tcp 37034 status You can do a netstat -anp | grep 39618 to see if there is some process really listening at the nlockmgr port. You can repeat this for port 41067. If there is some process, you should stop it. If there is no process listening on that port, you should remove it with a command like this: rpcinfo -d 100021 3 You must execute this command for all stale ports for any services other than portmapper and status. Once done you should get the output shown before. After that, you can try to start your volume and see if everything is registered (rpcinfo -p) and if gluster has started the nfs server (gluster volume status). If everything is ok, you should be able to mount the volume using NFS. Xavi On 01/27/2015 03:18 PM, David F. Robinson wrote: Turning off nfslock did not help. 
Also, still getting these messages every 3-seconds: [2015-01-27 14:16:12.921880] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:15.922431] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:18.923080] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:21.923748] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:24.924472] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:27.925192] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:30.925895] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:33.926563] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-27 14:16:36.927248] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) -- Original Message -- From: Kaushal M kshlms...@gmail.com mailto:kshlms...@gmail.com To: David F. Robinson david.robin...@corvidtec.com mailto:david.robin...@corvidtec.com Cc: Joe Julian j...@julianfamily.org mailto:j...@julianfamily.org; Gluster Users gluster-us...@gluster.org mailto:gluster-us...@gluster.org; Gluster Devel
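A slightly more robust variant of the fixed sleep is to poll until the Gluster NFS server (RPC program 100003) actually answers before running the NFS mounts. A rough sketch for /etc/rc.local; the 60-second cap and the mount point are assumptions:

#!/bin/bash
# Rough sketch: wait for gluster's NFS service to register instead of sleeping a fixed time.
/etc/init.d/glusterd restart

for i in $(seq 1 60); do
    rpcinfo -t localhost 100003 3 >/dev/null 2>&1 && break
    sleep 1
done

mount -a                        # picks up the glusterfs (FUSE) entries
mount /backup_nfs/homegfs       # the noauto NFS mount from /etc/fstab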
Re: [Gluster-devel] v3.6.2
Tried shutting down glusterd and glusterfsd and restarting. [2015-01-26 14:52:53.548330] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2015-01-26 14:52:53.549763] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2015-01-26 14:52:53.551245] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2015-01-26 14:52:53.552819] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2015-01-26 14:52:53.554289] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2015-01-26 14:52:53.555769] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2015-01-26 14:52:53.564429] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2015-01-26 14:52:53.565578] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/0cdef7faa934cfe52676689ff8c0110f.socket failed (Invalid argument) [2015-01-26 14:52:53.566488] I [MSGID: 106005] [glusterd-handler.c:4142:__glusterd_brick_rpc_notify] 0-management: Brick gfsib01bkp.corvidtec.com:/data/brick01bkp/Software_bkp has disconnected from glusterd. [2015-01-26 14:52:53.567453] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/09e734d5e8d52bb796896c7a33d0a3ff.socket failed (Invalid argument) [2015-01-26 14:52:53.568248] I [MSGID: 106005] [glusterd-handler.c:4142:__glusterd_brick_rpc_notify] 0-management: Brick gfsib01bkp.corvidtec.com:/data/brick02bkp/Software_bkp has disconnected from glusterd. [2015-01-26 14:52:53.569009] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/3f6844c74682f39fa7457082119628c5.socket failed (Invalid argument) [2015-01-26 14:52:53.569851] I [MSGID: 106005] [glusterd-handler.c:4142:__glusterd_brick_rpc_notify] 0-management: Brick gfsib01bkp.corvidtec.com:/data/brick01bkp/Source_bkp has disconnected from glusterd. [2015-01-26 14:52:53.570818] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/34d5cc70aba63082bbb467ab450bd08b.socket failed (Invalid argument) [2015-01-26 14:52:53.571777] I [MSGID: 106005] [glusterd-handler.c:4142:__glusterd_brick_rpc_notify] 0-management: Brick gfsib01bkp.corvidtec.com:/data/brick02bkp/Source_bkp has disconnected from glusterd. [2015-01-26 14:52:53.572681] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/0cd747876dca36cb21ecc7a36f7f897c.socket failed (Invalid argument) [2015-01-26 14:52:53.573533] I [MSGID: 106005] [glusterd-handler.c:4142:__glusterd_brick_rpc_notify] 0-management: Brick gfsib01bkp.corvidtec.com:/data/brick01bkp/homegfs_bkp has disconnected from glusterd. [2015-01-26 14:52:53.574433] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/88744e1365b414d41e720e480700716a.socket failed (Invalid argument) [2015-01-26 14:52:53.575399] I [MSGID: 106005] [glusterd-handler.c:4142:__glusterd_brick_rpc_notify] 0-management: Brick gfsib01bkp.corvidtec.com:/data/brick02bkp/homegfs_bkp has disconnected from glusterd. [2015-01-26 14:52:53.575434] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-26 14:52:53.575447] I [MSGID: 106006] [glusterd-handler.c:4257:__glusterd_nodesvc_rpc_notify] 0-management: nfs has disconnected from glusterd. 
[2015-01-26 14:52:53.579663] I [glusterd-pmap.c:227:pmap_registry_bind] 0-pmap: adding brick /data/brick01bkp/homegfs_bkp on port 49152 [2015-01-26 14:52:53.581943] I [glusterd-pmap.c:227:pmap_registry_bind] 0-pmap: adding brick /data/brick02bkp/Source_bkp on port 49156 [2015-01-26 14:52:53.583487] I [glusterd-pmap.c:227:pmap_registry_bind] 0-pmap: adding brick /data/brick01bkp/Source_bkp on port 49153 [2015-01-26 14:52:53.584921] I [glusterd-pmap.c:227:pmap_registry_bind] 0-pmap: adding brick /data/brick02bkp/Software_bkp on port 49157 [2015-01-26 14:52:53.585719] I [glusterd-pmap.c:227:pmap_registry_bind] 0-pmap: adding brick /data/brick01bkp/Software_bkp on port 49154 [2015-01-26 14:52:53.586281] I [glusterd-pmap.c:227:pmap_registry_bind] 0-pmap: adding brick /data/brick02bkp/homegfs_bkp on port 49155 -- Original Message -- From: David F. Robinson david.robin...@corvidtec.com To: gluster-us...@gluster.org gluster-us...@gluster.org; Gluster Devel gluster-devel@gluster.org Sent: 1/26/2015 9:50:09 AM Subject: v3.6.2 I have a server with v3.6.2 from which I cannot mount using NFS. The FUSE mount works, however, I cannot get the NFS mount to work. From /var/log/message: Jan 26 09:27:28 gfs01bkp mount[2810]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying Jan 26 09:27:53 gfs01bkp mount[4456]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying Jan 26 09:29:28 gfs01bkp mount[2810]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying Jan 26
Re: [Gluster-devel] v3.6.2
No firewall used on that machine. [root@gfs01bkp ~]# /etc/init.d/iptables status iptables: Firewall is not running. [ [root@gfs01bkp ~]# cat /etc/selinux/config # This file controls the state of SELinux on the system. # SELINUX= can take one of these three values: # enforcing - SELinux security policy is enforced. # permissive - SELinux prints warnings instead of enforcing. # disabled - No SELinux policy is loaded. SELINUX=disabled # SELINUXTYPE= can take one of these two values: # targeted - Targeted processes are protected, # mls - Multi Level Security protection. SELINUXTYPE=targeted -- Original Message -- From: Justin Clift jus...@gluster.org To: David F. Robinson david.robin...@corvidtec.com Cc: Gluster Users gluster-us...@gluster.org; Gluster Devel gluster-devel@gluster.org Sent: 1/26/2015 11:11:15 AM Subject: Re: [Gluster-devel] v3.6.2 On 26 Jan 2015, at 14:50, David F. Robinson david.robin...@corvidtec.com wrote: I have a server with v3.6.2 from which I cannot mount using NFS. The FUSE mount works, however, I cannot get the NFS mount to work. From /var/log/message: Jan 26 09:27:28 gfs01bkp mount[2810]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying Jan 26 09:27:53 gfs01bkp mount[4456]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying Jan 26 09:29:28 gfs01bkp mount[2810]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying Jan 26 09:29:53 gfs01bkp mount[4456]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying Jan 26 09:31:28 gfs01bkp mount[2810]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying Jan 26 09:31:53 gfs01bkp mount[4456]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying Jan 26 09:33:28 gfs01bkp mount[2810]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying Jan 26 09:33:53 gfs01bkp mount[4456]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying Jan 26 09:35:28 gfs01bkp mount[2810]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying Jan 26 09:35:53 gfs01bkp mount[4456]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying I also am continually getting the following errors in /var/log/glusterfs: [root@gfs01bkp glusterfs]# tail -f etc-glusterfs-glusterd.vol.log [2015-01-26 14:41:51.260827] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-26 14:41:54.261240] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-26 14:41:57.261642] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-26 14:42:00.262073] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-26 14:42:03.262504] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-26 14:42:06.262935] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-26 14:42:09.263334] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) 
[2015-01-26 14:42:12.263761] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-26 14:42:15.264177] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-26 14:42:18.264623] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-26 14:42:21.265053] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-26 14:42:24.265504] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) ^C Also, when I try to NFS mount my gluster volume, I am getting Any chance there's a network or host based firewall stopping some of the ports? + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
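A quick way to test Justin's firewall theory, assuming the standard rpcinfo and showmount tools are available on both ends, is to compare what the gluster NFS server has registered with the portmapper against what a client can actually reach. A rough sketch (host name taken from the messages above; the port numbers are the usual ones for gluster's built-in NFS server and may differ by release):

# On the server: list what is registered with rpcbind
rpcinfo -p localhost
# Gluster NFS normally registers nfs on 2049 and mountd/nlockmgr in the 38465-38467 range
# From the client: the same query plus the export list must succeed through any firewall in the path
rpcinfo -p gfsib01bkp.corvidtec.com
showmount -e gfsib01bkp.corvidtec.com

If the remote rpcinfo works but the portmapper shows no nfs/mountd entries at all, the problem is the NFS server failing to register rather than a blocked port.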
Re: [Gluster-devel] [Gluster-users] v3.6.2
Tried that... Still having errors starting gluster NFS... From the /var/log/gluster/nfs.log file: [2015-01-26 19:51:25.996481] I [MSGID: 100030] [glusterfsd.c:2018:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.6.2 (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket) [2015-01-26 19:51:26.005501] I [rpcsvc.c:2142:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured rpc.outstanding-rpc-limit with value 16 [2015-01-26 19:51:26.054144] E [nlm4.c:2481:nlm4svc_init] 0-nfs-NLM: unable to start /sbin/rpc.statd [2015-01-26 19:51:26.054183] E [nfs.c:1342:init] 0-nfs: Failed to initialize protocols [2015-01-26 19:51:26.054191] E [xlator.c:425:xlator_init] 0-nfs-server: Initialization of volume 'nfs-server' failed, review your volfile again [2015-01-26 19:51:26.054198] E [graph.c:322:glusterfs_graph_init] 0-nfs-server: initializing translator failed [2015-01-26 19:51:26.054205] E [graph.c:525:glusterfs_graph_activate] 0-graph: init failed [2015-01-26 19:51:26.05] W [glusterfsd.c:1194:cleanup_and_exit] (-- 0-: received signum (0), shutting down -- Original Message -- From: Anatoly Pugachev mator...@gmail.com To: David F. Robinson david.robin...@corvidtec.com Cc: gluster-us...@gluster.org gluster-us...@gluster.org; Gluster Devel gluster-devel@gluster.org Sent: 1/26/2015 2:48:08 PM Subject: Re: [Gluster-users] v3.6.2 David, can you stop glusterfs on affected machine and remove gluster related socket extension files from /var/run ? Start glusterfs service again and try once more ? On Mon, Jan 26, 2015 at 5:57 PM, David F. Robinson david.robin...@corvidtec.com wrote: Tried shutting down glusterd and glusterfsd and restarting. [2015-01-26 14:52:53.548330] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2015-01-26 14:52:53.549763] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2015-01-26 14:52:53.551245] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2015-01-26 14:52:53.552819] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2015-01-26 14:52:53.554289] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2015-01-26 14:52:53.555769] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2015-01-26 14:52:53.564429] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2015-01-26 14:52:53.565578] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/0cdef7faa934cfe52676689ff8c0110f.socket failed (Invalid argument) [2015-01-26 14:52:53.566488] I [MSGID: 106005] [glusterd-handler.c:4142:__glusterd_brick_rpc_notify] 0-management: Brick gfsib01bkp.corvidtec.com:/data/brick01bkp/Software_bkp has disconnected from glusterd. [2015-01-26 14:52:53.567453] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/09e734d5e8d52bb796896c7a33d0a3ff.socket failed (Invalid argument) [2015-01-26 14:52:53.568248] I [MSGID: 106005] [glusterd-handler.c:4142:__glusterd_brick_rpc_notify] 0-management: Brick gfsib01bkp.corvidtec.com:/data/brick02bkp/Software_bkp has disconnected from glusterd. 
[2015-01-26 14:52:53.569009] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/3f6844c74682f39fa7457082119628c5.socket failed (Invalid argument) [2015-01-26 14:52:53.569851] I [MSGID: 106005] [glusterd-handler.c:4142:__glusterd_brick_rpc_notify] 0-management: Brick gfsib01bkp.corvidtec.com:/data/brick01bkp/Source_bkp has disconnected from glusterd. [2015-01-26 14:52:53.570818] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/34d5cc70aba63082bbb467ab450bd08b.socket failed (Invalid argument) [2015-01-26 14:52:53.571777] I [MSGID: 106005] [glusterd-handler.c:4142:__glusterd_brick_rpc_notify] 0-management: Brick gfsib01bkp.corvidtec.com:/data/brick02bkp/Source_bkp has disconnected from glusterd. [2015-01-26 14:52:53.572681] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/0cd747876dca36cb21ecc7a36f7f897c.socket failed (Invalid argument) [2015-01-26 14:52:53.573533] I [MSGID: 106005] [glusterd-handler.c:4142:__glusterd_brick_rpc_notify] 0-management: Brick gfsib01bkp.corvidtec.com:/data/brick01bkp/homegfs_bkp has disconnected from glusterd. [2015-01-26 14:52:53.574433] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/88744e1365b414d41e720e480700716a.socket failed (Invalid argument) [2015-01-26 14:52:53.575399] I [MSGID: 106005] [glusterd-handler.c:4142:__glusterd_brick_rpc_notify] 0-management: Brick gfsib01bkp.corvidtec.com:/data/brick02bkp/homegfs_bkp has disconnected from glusterd. [2015-01-26 14:52:53.575434] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run
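For reference, the sequence Anatoly is suggesting would look roughly like the sketch below. The socket paths follow the /var/run/<hash>.socket names seen in the logs above; the glob should be narrowed to the gluster-owned socket files before running it on a real system:

# Stop glusterd (and, only if the bricks can safely be taken down, the brick/NFS processes too)
service glusterd stop
pkill glusterfs; pkill glusterfsd
# Remove the stale management sockets glusterd keeps in /var/run
rm -f /var/run/*.socket        # narrow this pattern to the gluster socket files first
# Start glusterd again and watch whether the readv errors return
service glusterd start
tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log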
Re: [Gluster-devel] [Gluster-users] v3.6.2
[root@gfs01bkp bricks]# ps -ef | grep rpcbind rpc 2306 1 0 11:32 ?00:00:00 rpcbind root 5265 4638 0 11:55 pts/000:00:00 grep rpcbind -- Original Message -- From: Joe Julian j...@julianfamily.org To: David F. Robinson david.robin...@corvidtec.com; gluster-us...@gluster.org gluster-us...@gluster.org; Gluster Devel gluster-devel@gluster.org Sent: 1/26/2015 11:55:09 AM Subject: Re: [Gluster-users] v3.6.2 Is rpcbind running? On January 26, 2015 6:57:44 AM PST, David F. Robinson david.robin...@corvidtec.com wrote: Tried shutting down glusterd and glusterfsd and restarting. [2015-01-26 14:52:53.548330] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2015-01-26 14:52:53.549763] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2015-01-26 14:52:53.551245] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2015-01-26 14:52:53.552819] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2015-01-26 14:52:53.554289] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2015-01-26 14:52:53.555769] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2015-01-26 14:52:53.564429] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2015-01-26 14:52:53.565578] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/0cdef7faa934cfe52676689ff8c0110f.socket failed (Invalid argument) [2015-01-26 14:52:53.566488] I [MSGID: 106005] [glusterd-handler.c:4142:__glusterd_brick_rpc_notify] 0-management: Brick gfsib01bkp.corvidtec.com:/data/brick01bkp/Software_bkp has disconnected from glusterd. [2015-01-26 14:52:53.567453] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/09e734d5e8d52bb796896c7a33d0a3ff.socket failed (Invalid argument) [2015-01-26 14:52:53.568248] I [MSGID: 106005] [glusterd-handler.c:4142:__glusterd_brick_rpc_notify] 0-management: Brick gfsib01bkp.corvidtec.com:/data/brick02bkp/Software_bkp has disconnected from glusterd. [2015-01-26 14:52:53.569009] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/3f6844c74682f39fa7457082119628c5.socket failed (Invalid argument) [2015-01-26 14:52:53.569851] I [MSGID: 106005] [glusterd-handler.c:4142:__glusterd_brick_rpc_notify] 0-management: Brick gfsib01bkp.corvidtec.com:/data/brick01bkp/Source_bkp has disconnected from glusterd. [2015-01-26 14:52:53.570818] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/34d5cc70aba63082bbb467ab450bd08b.socket failed (Invalid argument) [2015-01-26 14:52:53.571777] I [MSGID: 106005] [glusterd-handler.c:4142:__glusterd_brick_rpc_notify] 0-management: Brick gfsib01bkp.corvidtec.com:/data/brick02bkp/Source_bkp has disconnected from glusterd. [2015-01-26 14:52:53.572681] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/0cd747876dca36cb21ecc7a36f7f897c.socket failed (Invalid argument) [2015-01-26 14:52:53.573533] I [MSGID: 106005] [glusterd-handler.c:4142:__glusterd_brick_rpc_notify] 0-management: Brick gfsib01bkp.corvidtec.com:/data/brick01bkp/homegfs_bkp has disconnected from glusterd. 
[2015-01-26 14:52:53.574433] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/88744e1365b414d41e720e480700716a.socket failed (Invalid argument) [2015-01-26 14:52:53.575399] I [MSGID: 106005] [glusterd-handler.c:4142:__glusterd_brick_rpc_notify] 0-management: Brick gfsib01bkp.corvidtec.com:/data/brick02bkp/homegfs_bkp has disconnected from glusterd. [2015-01-26 14:52:53.575434] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-26 14:52:53.575447] I [MSGID: 106006] [glusterd-handler.c:4257:__glusterd_nodesvc_rpc_notify] 0-management: nfs has disconnected from glusterd. [2015-01-26 14:52:53.579663] I [glusterd-pmap.c:227:pmap_registry_bind] 0-pmap: adding brick /data/brick01bkp/homegfs_bkp on port 49152 [2015-01-26 14:52:53.581943] I [glusterd-pmap.c:227:pmap_registry_bind] 0-pmap: adding brick /data/brick02bkp/Source_bkp on port 49156 [2015-01-26 14:52:53.583487] I [glusterd-pmap.c:227:pmap_registry_bind] 0-pmap: adding brick /data/brick01bkp/Source_bkp on port 49153 [2015-01-26 14:52:53.584921] I [glusterd-pmap.c:227:pmap_registry_bind] 0-pmap: adding brick /data/brick02bkp/Software_bkp on port 49157 [2015-01-26 14:52:53.585719] I [glusterd-pmap.c:227:pmap_registry_bind] 0-pmap: adding brick /data/brick01bkp/Software_bkp on port 49154 [2015-01-26 14:52:53.586281] I [glusterd-pmap.c:227:pmap_registry_bind] 0-pmap: adding brick /data/brick02bkp/homegfs_bkp on port 49155 -- Original Message -- From: David F. Robinson david.robin...@corvidtec.com To: gluster-us...@gluster.org gluster-us...@gluster.org; Gluster Devel gluster-devel@gluster.org
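Since rpcbind is confirmed running, the earlier nfs.log error (unable to start /sbin/rpc.statd) points at the lock-manager side instead. A hedged checklist on an EL6-style box such as this one might be:

# Is rpc.statd up? Gluster's NLM support needs it
ps -ef | grep rpc.statd
service nfslock start          # provides rpc.statd on RHEL/SL 6 (nfs-utils package)
# Make sure the kernel NFS server is not already holding the nfs/mountd/nlockmgr registrations
service nfs stop
rpcinfo -p localhost
# Then restart glusterd so it respawns its NFS server and re-registers with the portmapper
service glusterd restart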
Re: [Gluster-devel] v3.6.2
[root@gfs01bkp ~]# gluster volume status homegfs_bkp
Status of volume: homegfs_bkp
Gluster process                                               Port    Online  Pid
----------------------------------------------------------------------------------
Brick gfsib01bkp.corvidtec.com:/data/brick01bkp/homegfs_bkp   49152   Y       4087
Brick gfsib01bkp.corvidtec.com:/data/brick02bkp/homegfs_bkp   49155   Y       4092
NFS Server on localhost                                       N/A     N       N/A

Task Status of Volume homegfs_bkp
----------------------------------------------------------------------------------
Task       : Rebalance
ID         : 6d4c6c4e-16da-48c9-9019-dccb7d2cfd66
Status     : completed

-- Original Message -- From: Atin Mukherjee amukh...@redhat.com To: Pranith Kumar Karampuri pkara...@redhat.com; Justin Clift jus...@gluster.org; David F. Robinson david.robin...@corvidtec.com Cc: Gluster Users gluster-us...@gluster.org; Gluster Devel gluster-devel@gluster.org Sent: 1/26/2015 11:51:13 PM Subject: Re: [Gluster-devel] v3.6.2 On 01/27/2015 07:33 AM, Pranith Kumar Karampuri wrote: On 01/26/2015 09:41 PM, Justin Clift wrote: On 26 Jan 2015, at 14:50, David F. Robinson david.robin...@corvidtec.com wrote: I have a server with v3.6.2 from which I cannot mount using NFS. The FUSE mount works, however, I cannot get the NFS mount to work. From /var/log/message: Jan 26 09:27:28 gfs01bkp mount[2810]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying Jan 26 09:27:53 gfs01bkp mount[4456]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying Jan 26 09:29:28 gfs01bkp mount[2810]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying Jan 26 09:29:53 gfs01bkp mount[4456]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying Jan 26 09:31:28 gfs01bkp mount[2810]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying Jan 26 09:31:53 gfs01bkp mount[4456]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying Jan 26 09:33:28 gfs01bkp mount[2810]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying Jan 26 09:33:53 gfs01bkp mount[4456]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying Jan 26 09:35:28 gfs01bkp mount[2810]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying Jan 26 09:35:53 gfs01bkp mount[4456]: mount to NFS server 'gfsib01bkp.corvidtec.com' failed: Connection refused, retrying I also am continually getting the following errors in /var/log/glusterfs: [root@gfs01bkp glusterfs]# tail -f etc-glusterfs-glusterd.vol.log [2015-01-26 14:41:51.260827] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-26 14:41:54.261240] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-26 14:41:57.261642] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-26 14:42:00.262073] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-26 14:42:03.262504] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-26 14:42:06.262935] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-26 14:42:09.263334] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument)
[2015-01-26 14:42:12.263761] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-26 14:42:15.264177] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-26 14:42:18.264623] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-26 14:42:21.265053] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) [2015-01-26 14:42:24.265504] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/1f0cee5a2d074e39b32ee5a81c70e68c.socket failed (Invalid argument) I believe this error message comes when the socket file is not present. I see the following commit which changed the location
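Once the rpc side is sorted out, the gluster NFS daemon itself can be respawned without touching the running bricks. A sketch using the volume from this thread:

# 'start ... force' respawns any missing daemons (bricks, NFS server) for the volume
gluster volume start homegfs_bkp force
gluster volume status homegfs_bkp      # NFS Server on localhost should now show a port and PID
# If it still shows N/A, the reason is usually in the NFS log
tail -n 50 /var/log/glusterfs/nfs.log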
Re: [Gluster-devel] 3.6.1 issue
That did not fix the issue (see below). I also have run into another possibly related issue. After untarring the boost directory and compiling the software, I cannot delete the source directory structure. It says directory not empty.

corvidpost5:temp3/gfs \rm -r boost_1_57_0
rm: cannot remove `boost_1_57_0/libs/numeric/odeint/test': Directory not empty
corvidpost5:temp3/gfs cd boost_1_57_0/libs/numeric/odeint/test/
corvidpost5:odeint/test ls -al
total 0
drwxr-x--- 3 dfrobins users  94 Dec 20 01:51 ./
drwx-- 3 dfrobins users 100 Dec 20 01:51 ../

Setting cluster.read-hash-mode to 2 results in the following:

corvidpost5:TankExamples/DakotaList ls -al
total 5
drwxr-x--- 2 dfrobins users  166 Dec 22 11:16 ./
drwxr-x--- 6 dfrobins users  445 Dec 22 11:16 ../
lrwxrwxrwx 1 dfrobins users   25 Dec 22 11:16 EvalTank.py -> ../tank_model/EvalTank.py*
---------- 1 dfrobins users    0 Dec 22 11:16 FEMTank.py
-rwx--x--- 1 dfrobins users  734 Nov  7 11:05 RunTank.sh*
-rw--- 1 dfrobins users 1432 Nov  7 11:05 dakota_PandL_list.in
-rw--- 1 dfrobins users 1860 Nov  7 11:05 dakota_Ponly_list.in

gluster volume info homegfs Volume Name: homegfs Type: Distributed-Replicate Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071 Status: Started Number of Bricks: 4 x 2 = 8 Transport-type: tcp Bricks: Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs Brick5: gfsib02a.corvidtec.com:/data/brick01a/homegfs Brick6: gfsib02b.corvidtec.com:/data/brick01b/homegfs Brick7: gfsib02a.corvidtec.com:/data/brick02a/homegfs Brick8: gfsib02b.corvidtec.com:/data/brick02b/homegfs Options Reconfigured: cluster.read-hash-mode: 2 performance.stat-prefetch: off performance.io-thread-count: 32 performance.cache-size: 128MB performance.write-behind-window-size: 128MB server.allow-insecure: on network.ping-timeout: 10 storage.owner-gid: 100 geo-replication.indexing: off geo-replication.ignore-pid-check: on changelog.changelog: on changelog.fsync-interval: 3 changelog.rollover-time: 15 server.manage-gids: on -- Original Message -- From: Vijay Bellur vbel...@redhat.com To: David F. Robinson david.robin...@corvidtec.com Cc: Justin Clift jus...@gluster.org; Gluster Devel gluster-devel@gluster.org Sent: 12/22/2014 9:23:44 AM Subject: Re: [Gluster-devel] 3.6.1 issue On 12/21/2014 11:10 PM, David F. Robinson wrote: So for now it is up to all of the individual users to know they cannot use tar without the -P switch if they are accessing a data storage system that uses gluster? Setting volume option cluster.read-hash-mode to 2 could help here. Can you please check if this resolves the problem without -P switch? -Vijay On Dec 21, 2014, at 12:30 PM, Vijay Bellur vbel...@redhat.com wrote: On 12/20/2014 12:09 PM, David F. Robinson wrote: Seems to work with -xPf. I obviously couldn't check all of the files, but the two specific ones that I noted in my original email do not show any problems when using -P... This is related to the way tar extracts symbolic links by default and its interaction with GlusterFS. In a nutshell the following steps are involved in creation of symbolic links on the destination: a) Create an empty regular placeholder file with permission bits set to 0 and the name being that of the symlink source file. b) Record the device, inode numbers and the mtime of the placeholder file through stat. c) After the first pass of extraction is complete, there is a second pass involved to set right symbolic links.
In this phase a stat is performed on the placeholder file. If all attributes recorded in b) are in sync with the latest information from stat buf, only then the placeholder is unlinked and a new symbolic link is created. If any attribute is out of sync, the unlink and creation of symbolic link do not happen. In the case of replicated GlusterFS volumes, the mtimes can vary across nodes during the creation of placeholder files. If the stat calls in steps b) and c) land on different nodes, then there is a very good likelihood that tar would skip creation of symbolic links and leave behind the placeholder files. A little more detail about this particular implementation behavior of symlinks for tar can be found at [1]. To overcome this behavior, we can make use of the P switch with tar command during extraction which will create the link file directly and not go ahead with the above set of steps. Keeping timestamps in sync across the cluster will help to an extent in preventing this situation. There are ongoing refinements in replicate's selection of read-child which will help in addressing this problem. -Vijay [1] http://lists.debian.org/debian-user/2003/03/msg03249.html ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
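For anyone hitting the same thing, the workaround Vijay describes, plus a quick way to spot the leftovers, looks roughly like this (the find expression is only an illustration built from the placeholder description above: mode 0 and size 0):

# Extract with -P so tar creates the symlinks directly instead of placeholder files
tar -xPf boost_1_57_0.tar
# After a non -P extraction, leftover placeholders can be spotted by their empty, mode-0 signature
find boost_1_57_0 -type f -perm 0000 -size 0 -ls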
Re: [Gluster-devel] 3.6.1 issue
So for now it is up to all of the individual users to know they cannot use tar without the -P switch if they are accessing a data storage system that uses gluster? David (Sent from mobile) === David F. Robinson, Ph.D. President - Corvid Technologies 704.799.6944 x101 [office] 704.252.1310 [cell] 704.799.7974 [fax] david.robin...@corvidtec.com http://www.corvidtechnologies.com On Dec 21, 2014, at 12:30 PM, Vijay Bellur vbel...@redhat.com wrote: On 12/20/2014 12:09 PM, David F. Robinson wrote: Seems to work with -xPf. I obviously couldn't check all of the files, but the two specific ones that I noted in my original email do not show any problems when using -P... This is related to the way tar extracts symbolic links by default its interaction with GlusterFS. In a nutshell the following steps are involved in creation of symbolic links on the destination: a) Create an empty regular placeholder file with permission bits set to 0 and the name being that of the symlink source file. b) Record the device, inode numbers and the mtime of the placeholder file through stat. c) After the first pass of extraction is complete, there is a second pass involved to set right symbolic links. In this phase a stat is performed on the placeholder file. If all attributes recorded in b) are in sync with the latest information from stat buf, only then the placeholder is unlinked and a new symbolic link is created. If any attribute is out of sync, the unlink and creation of symbolic link do not happen. In the case of replicated GlusterFS volumes, the mtimes can vary across nodes during the creation of placeholder files. If the stat calls in steps b) and c) land on different nodes, then there is a very good likelihood that tar would skip creation of symbolic links and leave behind the placeholder files. A little more detail about this particular implementation behavior of symlinks for tar can be found at [1]. To overcome this behavior, we can make use of the P switch with tar command during extraction which will create the link file directly and not go ahead with the above set of steps. Keeping timestamps in sync across the cluster will help to an extent in preventing this situation. There are ongoing refinements in replicate's selection of read-child which will help in addressing this problem. -Vijay [1] http://lists.debian.org/debian-user/2003/03/msg03249.html ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] 3.6.1 issue
Seems to work with -xPf. I obviously couldn't check all of the files, but the two specific ones that I noted in my original email do not show any problems when using -P... David -- Original Message -- From: Vijay Bellur vbel...@redhat.com To: David F. Robinson david.robin...@corvidtec.com; Justin Clift jus...@gluster.org; Gluster Devel gluster-devel@gluster.org Sent: 12/20/2014 1:04:57 AM Subject: Re: [Gluster-devel] 3.6.1 issue On 12/16/2014 10:59 PM, David F. Robinson wrote: Gluster 3.6.1 seems to be having an issue creating symbolic links. To reproduce this issue, I downloaded the file dakota-6.1-public.src_.tar.gz from https://dakota.sandia.gov/download.html # gunzip dakota-6.1-public.src_.tar.gz # tar -xf dakota-6.1-public.src_.tar Can you please try with tar -xPf ... and check the results? Thanks, Vijay ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] glusterfs-3.5.3beta1 has been released for testing
When I installed the 3.5.3beta on my HPC cluster, I get the following warnings during the mounts: WARNING: getfattr not found, certain checks will be skipped.. I do not have attr installed on my compute nodes. Is this something that I need in order for gluster to work properly or can this safely be ignored? David -- Original Message -- From: Niels de Vos nde...@redhat.com To: gluster-us...@gluster.org; gluster-devel@gluster.org Sent: 10/5/2014 8:44:59 AM Subject: [Gluster-users] glusterfs-3.5.3beta1 has been released for testing GlusterFS 3.5.3 (beta1) has been released and is now available for testing. Get the tarball from here: - http://bits.gluster.org/pub/gluster/glusterfs/src/glusterfs-3.5.3beta1.tar.gz Packages for different distributions will land on the download server over the next few days. When packages become available, the package maintainers will send a notification to this list. With this beta release, we make it possible for bug reporters and testers to check if issues have indeed been fixed. All community members are invited to test and/or comment on this release. This release for the 3.5 stable series includes the following bug fixes: - 1081016: glusterd needs xfsprogs and e2fsprogs packages - 1129527: DHT :- data loss - file is missing on renaming same file from multiple client at same time - 1129541: [DHT:REBALANCE]: Rebalance failures are seen with error message remote operation failed: File exists - 1132391: NFS interoperability problem: stripe-xlator removes EOF at end of READDIR - 1133949: Minor typo in afr logging - 1136221: The memories are exhausted quickly when handle the message which has multi fragments in a single record - 1136835: crash on fsync - 1138922: DHT + rebalance : rebalance process crashed + data loss + few Directories are present on sub-volumes but not visible on mount point + lookup is not healing directories - 1139103: DHT + Snapshot :- If snapshot is taken when Directory is created only on hashed sub-vol; On restoring that snapshot Directory is not listed on mount point and lookup on parent is not healing - 1139170: DHT :- rm -rf is not removing stale link file and because of that unable to create file having same name as stale link file - 1139245: vdsm invoked oom-killer during rebalance and Killed process 4305, UID 0, (glusterfs nfs process) - 1140338: rebalance is not resulting in the hash layout changes being available to nfs client - 1140348: Renaming file while rebalance is in progress causes data loss - 1140549: DHT: Rebalance process crash after add-brick and `rebalance start' operation - 1140556: Core: client crash while doing rename operations on the mount - 1141558: AFR : gluster volume heal volume_name info prints some random characters - 1141733: data loss when rebalance + renames are in progress and bricks from replica pairs goes down and comes back - 1142052: Very high memory usage during rebalance - 1142614: files with open fd's getting into split-brain when bricks goes offline and comes back online - 1144315: core: all brick processes crash when quota is enabled - 1145000: Spec %post server does not wait for the old glusterd to exit - 1147243: nfs: volume set help says the rmtab file is in /var/lib/glusterd/rmtab To get more information about the above bugs, go to https://bugzilla.redhat.com, enter the bug number in the search box and press enter. If a bug from this list has not been sufficiently fixed, please open the bug report, leave a comment with details of the testing and change the status of the bug to ASSIGNED. 
In case someone has successfully verified a fix for a bug, please change the status of the bug to VERIFIED. The release notes have been posted for review, and a blog post contains an easier readable version: - http://review.gluster.org/8903 - http://blog.nixpanic.net/2014/10/glusterfs-353beta1-has-been-released.html Comments in bug reports, over email or on IRC (#gluster on Freenode) are much appreciated. Thanks for testing, Niels ___ Gluster-users mailing list gluster-us...@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-users ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
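On the getfattr question at the top of this thread: the warning comes from the mount.glusterfs wrapper script, which uses getfattr for a few optional sanity checks; the mount itself still works without it. Installing the attr package on the compute nodes should silence it (RHEL/SL-style command shown; the package name is assumed to be the same on other distributions):

# getfattr is shipped in the attr package
yum install attr
getfattr --version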
Re: [Gluster-devel] Fw: Re: Corvid gluster testing
Just to clarify a little, there are two cases where I was evaluating performance. 1) The first case that Pranith was working on involved 20-nodes with 4-processors on each node for a total of 80-processors. Each processor does its own independent i/o. These files are roughly 100-200MB each and there are several hundred of them. When I mounted the gluster system using fuse, it took 1.5-hours to do the i/o. When I mounted the same system using NFS, it took 30-minutes. Note that in order to get the gluster mounted file-system down to 1.5-hours, I had to get rid of the replicated volume (this was done during troubleshooting with Pranith to rule out other possible issues). The timing was significantly worse (3+ hours) when I was using a replicated pair. 2) The second case was the output of a larger single file (roughly 2.5TB). For this case, it takes the gluster mounted filesystem 60-seconds (although I got that down to 52-seconds with some gluster parameter tuning). The NFS mount takes 38-seconds. I sent the results of this to the developer list first as this case is much easier to test (50-seconds versus what could be 3+ hours). I am headed out of town for a few days and will not be able to do additional testing until Monday. For the second case, I will turn off cluster.eager-lock and send the results to the email list. If there is any other testing that you would like to see for the first case, let me know and I will be happy to perform the tests and send in the results... Sorry for the confusion... David -- Original Message -- From: Pranith Kumar Karampuri pkara...@redhat.com To: Anand Avati av...@gluster.org Cc: David F. Robinson david.robin...@corvidtec.com; Gluster Devel gluster-devel@gluster.org Sent: 8/6/2014 9:51:11 PM Subject: Re: [Gluster-devel] Fw: Re: Corvid gluster testing On 08/07/2014 07:18 AM, Anand Avati wrote: It would be worth checking the perf numbers without -o acl (in case it was enabled, as seen in the other gid thread). Client side -o acl mount option can have a negative impact on performance because of the increased number of up-calls from FUSE for access(). Actually it is all write intensive. Here are the numbers they gave me from earlier runs:
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
 ---------   -----------   -----------   -----------   ------------        ----
      0.00       0.00 us       0.00 us       0.00 us             99      FORGET
      0.00       0.00 us       0.00 us       0.00 us           1093     RELEASE
      0.00       0.00 us       0.00 us       0.00 us            468  RELEASEDIR
      0.00      60.00 us      26.00 us     107.00 us              4     SETATTR
      0.00      91.56 us      42.00 us     157.00 us             27      UNLINK
      0.00      20.75 us      12.00 us      55.00 us            132    GETXATTR
      0.00      19.03 us       9.00 us      95.00 us            152    READLINK
      0.00      43.19 us      12.00 us     106.00 us             83        OPEN
      0.00      18.37 us       8.00 us      92.00 us            257      STATFS
      0.00      32.42 us      11.00 us     118.00 us            322     OPENDIR
      0.00      36.09 us       5.00 us     109.00 us            359       FSTAT
      0.00      51.14 us      37.00 us     183.00 us            663      RENAME
      0.00      33.32 us       6.00 us     123.00 us           1451        STAT
      0.00     821.79 us      21.00 us   22678.00 us             84        READ
      0.00      34.88 us       3.00 us     139.00 us           2326       FLUSH
      0.01     789.33 us      72.00 us   64054.00 us            347      CREATE
      0.01    1144.63 us      43.00 us  280735.00 us            337   FTRUNCATE
      0.01      47.82 us      16.00 us   19817.00 us          16513      LOOKUP
      0.02     604.85 us      11.00 us    1233.00 us           1423    READDIRP
     99.95      17.51 us       6.00 us  212701.00 us      300715967       WRITE

Duration: 5390 seconds
Data Read: 1495257497 bytes
Data Written: 166546887668 bytes

Pranith Thanks On Wed, Aug 6, 2014 at 6:26 PM, Pranith Kumar Karampuri pkara...@redhat.com wrote: On 08/07/2014 06:48 AM, Anand Avati wrote: On Wed, Aug 6, 2014 at 6:05 PM, Pranith Kumar Karampuri pkara...@redhat.com wrote: We checked this performance with plain distribute as well and on nfs it gave 25 minutes where as on nfs it gave around 90 minutes after disabling throttling in both situations. This sentence is very confusing. Can you please state it more clearly? sorry :-D. We checked this performance on plain distribute volume by disabling throttling. On nfs the run took 25 minutes. On fuse the run took 90 minutes. Pranith Thanks I was wondering if any of you guys know what could contribute to this difference. Pranith On 08/07/2014 01:33 AM, Anand Avati wrote: Seems like heavy FINODELK contention. As a diagnostic step
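For completeness, output like the table above is what gluster's built-in profiler produces; collecting it on a volume looks like:

# Enable profiling, run the workload, then dump per-brick latency/fop statistics
gluster volume profile homegfs start
gluster volume profile homegfs info
gluster volume profile homegfs stop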
Re: [Gluster-devel] Fw: Re: Corvid gluster testing
Forgot to attach profile info in previous email. Attached... David -- Original Message -- From: David F. Robinson david.robin...@corvidtec.com To: gluster-devel@gluster.org Sent: 8/5/2014 2:41:34 PM Subject: Fw: Re: Corvid gluster testing I have been testing some of the fixes that Pranith incorporated into the 3.5.2-beta to see how they performed for moderate levels of i/o. All of the stability issues that I had seen in previous versions seem to have been fixed in 3.5.2; however, there still seem to be some significant performance issues. Pranith suggested that I send this to the gluster-devel email list, so here goes: I am running an MPI job that saves a restart file to the gluster file system. When I use the following in my fstab to mount the gluster volume, the i/o time for the 2.5GB file is roughly 45-seconds. gfsib01a.corvidtec.com:/homegfs /homegfs glusterfs transport=tcp,_netdev 0 0 When I switch this to use the NFS protocol (see below), the i/o time is 2.5-seconds. gfsib01a.corvidtec.com:/homegfs /homegfs nfs vers=3,intr,bg,rsize=32768,wsize=32768 0 0 The read-times for gluster are 10-20% faster than NFS, but the write times are almost 20x slower. I am running SL 6.4 and glusterfs-3.5.2-0.1.beta1.el6.x86_64... [root@gfs01a glusterfs]# gluster volume info homegfs Volume Name: homegfs Type: Distributed-Replicate Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071 Status: Started Number of Bricks: 2 x 2 = 4 Transport-type: tcp Bricks: Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs David -- Forwarded Message -- From: Pranith Kumar Karampuri pkara...@redhat.com To: David Robinson david.robin...@corvidtec.com Cc: Young Thomas tom.yo...@corvidtec.com Sent: 8/5/2014 2:25:38 AM Subject: Re: Corvid gluster testing gluster-devel@gluster.org is the email-id for the mailing list. We should probably start with the initial run numbers and the comparison for glusterfs mount and nfs mounts. May be something like glusterfs mount: 90 minutes nfs mount: 25 minutes And profile outputs, volume config, number of mounts, hardware configuration should be a good start. Pranith On 08/05/2014 09:28 AM, David Robinson wrote: Thanks pranith === David F. Robinson, Ph.D. President - Corvid Technologies 704.799.6944 x101 [office] 704.252.1310 [cell] 704.799.7974 [fax] david.robin...@corvidtec.com http://www.corvidtechnologies.com On Aug 4, 2014, at 11:22 PM, Pranith Kumar Karampuri pkara...@redhat.com wrote: On 08/05/2014 08:33 AM, Pranith Kumar Karampuri wrote: On 08/05/2014 08:29 AM, David F. Robinson wrote: On 08/05/2014 12:51 AM, David F. Robinson wrote: No. I don't want to use nfs. It eliminates most of the benefits of why I want to use gluster. Failover redundancy of the pair, load balancing, etc. What is the meaning of 'Failover redundancy of the pair, load balancing ' Could you elaborate more? smb/nfs/glusterfs are just access protocols that gluster supports functionality is almost same Here is my understanding. Please correct me where I am wrong. With gluster, if I am doing a write and one of the replicated pairs goes down, there is no interruption to the I/o. The failover is handled by gluster and the fuse client. This isn't done if I use an nfs mount unless the component of the pair that goes down isn't the one I used for the mount. With nfs, I will have to mount one of the bricks. 
So, if I have gfs01a, gfs01b, gfs02a, gfs02b, gfs03a, gfs03b, etc and my fstab mounts gfs01a, it is my understanding that all of my I/o will go through gfs01a which then gets distributed to all of the other bricks. Gfs01a throughput becomes a bottleneck. Where if I do a gluster mount using fuse, the load balancing is handled at the client side , not the server side. If I have 1000-nodes accessing 20-gluster bricks, I need the load balancing aspect. I cannot have all traffic going through the network interface on a single brick. If I am wrong with the above assumptions, I guess my question is why would one ever use the gluster mount instead of nfs and/or samba? Tom: feel free to chime in if I have missed anything. I see your point now. Yes the gluster server where you did the mount is kind of a bottle neck. Now that we established the problem is in the clients/protocols, you should send out a detailed mail on gluster-devel and see if anyone can help with you on performance xlators that can improve it a bit more. My area of expertise is more on replication. I am sub-maintainer for replication,locks components. I also know connection management/io-threads related issues which lead to hangs as I worked on them before. Performance xlators are black box to me. Performance xlators are enabled only on fuse gluster stack. On nfs server mounts we
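One detail worth adding to David's summary: for a fuse mount, the server named in fstab is only contacted to fetch the volume layout; after that the client talks to every brick directly, so that host is not an I/O bottleneck the way a single NFS server is. The remaining single point of failure is mount time itself, which can be covered with a backup volfile server. A sketch (the option spelling varies between releases, backupvolfile-server vs backup-volfile-servers):

gfsib01a.corvidtec.com:/homegfs /homegfs glusterfs transport=tcp,backupvolfile-server=gfsib01b.corvidtec.com,_netdev 0 0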
[Gluster-devel] Fw: Re: Corvid gluster testing
I have been testing some of the fixes that Pranith incorporated into the 3.5.2-beta to see how they performed for moderate levels of i/o. All of the stability issues that I had seen in previous versions seem to have been fixed in 3.5.2; however, there still seem to be some significant performance issues. Pranith suggested that I send this to the gluster-devel email list, so here goes: I am running an MPI job that saves a restart file to the gluster file system. When I use the following in my fstab to mount the gluster volume, the i/o time for the 2.5GB file is roughly 45-seconds. gfsib01a.corvidtec.com:/homegfs /homegfs glusterfs transport=tcp,_netdev 0 0 When I switch this to use the NFS protocol (see below), the i/o time is 2.5-seconds. gfsib01a.corvidtec.com:/homegfs /homegfs nfs vers=3,intr,bg,rsize=32768,wsize=32768 0 0 The read-times for gluster are 10-20% faster than NFS, but the write times are almost 20x slower. I am running SL 6.4 and glusterfs-3.5.2-0.1.beta1.el6.x86_64... [root@gfs01a glusterfs]# gluster volume info homegfs Volume Name: homegfs Type: Distributed-Replicate Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071 Status: Started Number of Bricks: 2 x 2 = 4 Transport-type: tcp Bricks: Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs David -- Forwarded Message -- From: Pranith Kumar Karampuri pkara...@redhat.com To: David Robinson david.robin...@corvidtec.com Cc: Young Thomas tom.yo...@corvidtec.com Sent: 8/5/2014 2:25:38 AM Subject: Re: Corvid gluster testing gluster-devel@gluster.org is the email-id for the mailing list. We should probably start with the initial run numbers and the comparison for glusterfs mount and nfs mounts. May be something like glusterfs mount: 90 minutes nfs mount: 25 minutes And profile outputs, volume config, number of mounts, hardware configuration should be a good start. Pranith On 08/05/2014 09:28 AM, David Robinson wrote: Thanks pranith === David F. Robinson, Ph.D. President - Corvid Technologies 704.799.6944 x101 [office] 704.252.1310 [cell] 704.799.7974 [fax] david.robin...@corvidtec.com http://www.corvidtechnologies.com On Aug 4, 2014, at 11:22 PM, Pranith Kumar Karampuri pkara...@redhat.com wrote: On 08/05/2014 08:33 AM, Pranith Kumar Karampuri wrote: On 08/05/2014 08:29 AM, David F. Robinson wrote: On 08/05/2014 12:51 AM, David F. Robinson wrote: No. I don't want to use nfs. It eliminates most of the benefits of why I want to use gluster. Failover redundancy of the pair, load balancing, etc. What is the meaning of 'Failover redundancy of the pair, load balancing ' Could you elaborate more? smb/nfs/glusterfs are just access protocols that gluster supports functionality is almost same Here is my understanding. Please correct me where I am wrong. With gluster, if I am doing a write and one of the replicated pairs goes down, there is no interruption to the I/o. The failover is handled by gluster and the fuse client. This isn't done if I use an nfs mount unless the component of the pair that goes down isn't the one I used for the mount. With nfs, I will have to mount one of the bricks. So, if I have gfs01a, gfs01b, gfs02a, gfs02b, gfs03a, gfs03b, etc and my fstab mounts gfs01a, it is my understanding that all of my I/o will go through gfs01a which then gets distributed to all of the other bricks. Gfs01a throughput becomes a bottleneck. 
Where if I do a gluster mount using fuse, the load balancing is handled at the client side , not the server side. If I have 1000-nodes accessing 20-gluster bricks, I need the load balancing aspect. I cannot have all traffic going through the network interface on a single brick. If I am wrong with the above assumptions, I guess my question is why would one ever use the gluster mount instead of nfs and/or samba? Tom: feel free to chime in if I have missed anything. I see your point now. Yes the gluster server where you did the mount is kind of a bottle neck. Now that we established the problem is in the clients/protocols, you should send out a detailed mail on gluster-devel and see if anyone can help with you on performance xlators that can improve it a bit more. My area of expertise is more on replication. I am sub-maintainer for replication,locks components. I also know connection management/io-threads related issues which lead to hangs as I worked on them before. Performance xlators are black box to me. Performance xlators are enabled only on fuse gluster stack. On nfs server mounts we disable all the performance xlators except write-behind as nfs client does lots of things for improving performance. I suggest you guys follow up more on gluster-devel. Appreciate all the help you did for improving the product :-). Thanks a ton! Pranith
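Since the performance translators Pranith mentions are per-volume options, they can be toggled individually while chasing the fuse write-path difference. A sketch with example values only, not recommendations:

gluster volume set homegfs performance.write-behind on
gluster volume set homegfs performance.write-behind-window-size 4MB
gluster volume set homegfs performance.flush-behind on
gluster volume info homegfs     # 'Options Reconfigured' shows what has been changed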