[Gluster-users] Problem: rsync files to glusterfs fail randomly~
Hello, everyone~

A few days ago I asked the same question and got a reply from joel vennin (thanks again). He said they had hit a similar problem, and that once they removed the readahead configuration everything worked fine. I took his advice and the situation improved at the time. But today the same problem occurred again, and I'm sure our configuration already excludes the readahead feature. Here's the log from the gfs client where the rsync operation failed:

……
[2010-05-21 10:25:03] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/uigs/pblog/web/201005/20100521) inode (ptr=0x18ea9960, ino=231986435693, gen=5471125106254156105) found conflict (ptr=0x190032b0, ino=231986435693, gen=5471125106254156522)
[2010-05-21 10:25:03] W [fuse-bridge.c:1719:fuse_create_cbk] glusterfs-fuse: 35641: /uigs/pblog/web/201005/20100521/.pb_access_log.201005211020.10.15.4.61.nginx1.5FyvzI => -1 (No such file or directory)
[2010-05-21 10:25:03] W [fuse-bridge.c:1719:fuse_create_cbk] glusterfs-fuse: 35644: /uigs/pblog/web/201005/20100521/pb_access_log.201005211020.10.15.4.61.nginx1 => -1 (No such file or directory)
……

Does anybody know what may cause such a problem? Thanks in advance, any suggestion would be appreciated~

On Mon, May 10, 2010 at 9:12 PM, bonn deng wrote:
> Hello, everyone~
> We're using glusterfs as our data storage tool. After we upgraded gfs from version 2.0.7 to 3.0.3, we encountered some weird problems: we need to rsync some files to the gfs cluster every five minutes, but randomly some files cannot be transferred correctly, or even cannot be transferred at all. I ssh'd to the computer where the rsync operation failed and checked the log under the directory "/var/log/glusterfs", which reads:
>
> ……
> [2010-05-10 20:32:05] W [fuse-bridge.c:1719:fuse_create_cbk] glusterfs-fuse: 4499440: /uigs/sugg/.sugg_access_log.2010051012.10.11.89.102.nginx1.cMi7LW => -1 (No such file or directory)
> [2010-05-10 20:32:13] W [fuse-bridge.c:1719:fuse_create_cbk] glusterfs-fuse: 4499542: /sogou-logs/nginx-logs/proxy/.proxy_access_log.2010051019.10.11.89.102.nginx1.MnUaIR => -1 (No such file or directory)
>
> [2010-05-10 20:35:12] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/uigs/pblog/bdweb/201005/20100510) inode (ptr=0x2c010fb0, ino=183475774468, gen=5467705122580597717) found conflict (ptr=0x1d75640, ino=183475774468, gen=5467705122580599136)
> [2010-05-10 20:35:16] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/uigs/pblog/suggweb/201005/20100510) inode (ptr=0x1d783b0, ino=245151107323, gen=5467705122580597722) found conflict (ptr=0x2c0bc4b0, ino=245151107323, gen=5467705122580598133)
>
> [2010-05-10 20:40:08] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/uigs/pblog/bdweb/201005/20100510) inode (ptr=0x2aaab806cca0, ino=183475774468, gen=5467705122580597838) found conflict (ptr=0x1d75640, ino=183475774468, gen=5467705122580599136)
> [2010-05-10 20:40:12] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/uigs/pblog/suggweb/201005/20100510) inode (ptr=0x1d7c190, ino=245151107323, gen=5467705122580597843) found conflict (ptr=0x2c0bc4b0, ino=245151107323, gen=5467705122580598133)
>
> [2010-05-10 20:45:10] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/uigs/pblog/bdweb/201005/20100510) inode (ptr=0x2aaab00a6a90, ino=183475774468, gen=5467705122580597838) found conflict (ptr=0x1d75640, ino=183475774468, gen=5467705122580599136)
> [2010-05-10 20:45:14] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/uigs/pblog/suggweb/201005/20100510) inode (ptr=0x2aaab80960e0, ino=245151107323, gen=5467705122580597669) found conflict (ptr=0x2c0bc4b0, ino=245151107323, gen=5467705122580598133)
> ……
>
> Does anybody know what's wrong with our gfs? And another question: in order to trace the problem, we want to know which machine the failed file should have been put on. Where can I get this information, or what can I do?
> By the way, we're now using glusterfs version 3.0.3, and we have nearly 200 data servers in the gfs cluster (in distribute mode, not replicate). What else do I need to put here to make our problem clear, if it isn't already?
> Thanks for your help! Any suggestion would be appreciated~

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
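For anyone trying to correlate these warnings, here is a minimal Python sketch (not part of the original posts) that pulls the failed create paths and the "found conflict" lookups out of a client log in the format quoted above. The regular expressions are assumptions derived only from the excerpts in this message, so other GlusterFS versions may log differently, and the names `summarize` and `creates_under_conflicts` are made up for illustration.

```python
import posixpath
import re
from collections import Counter

# Assumed shape of a fuse_create_cbk warning, based on the excerpts above.
CREATE_FAIL = re.compile(
    r"^\[(?P<ts>[^\]]+)\] W \[fuse-bridge\.c:\d+:fuse_create_cbk\] "
    r"glusterfs-fuse: (?P<req>\d+): (?P<path>\S+) => -1 \((?P<err>[^)]*)\)"
)

# Assumed shape of a fuse_entry_cbk "found conflict" warning.
CONFLICT = re.compile(
    r"^\[(?P<ts>[^\]]+)\] W \[fuse-bridge\.c:\d+:fuse_entry_cbk\] "
    r"glusterfs-fuse: LOOKUP\((?P<path>[^)]+)\) inode \(ptr=[^,]+, "
    r"ino=(?P<ino>\d+), gen=(?P<gen>\d+)\) found conflict "
    r"\(ptr=[^,]+, ino=\d+, gen=(?P<other_gen>\d+)\)"
)

# Three lines taken from the 2010-05-21 excerpt, with wrap damage repaired.
SAMPLE_LOG = [
    "[2010-05-21 10:25:03] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: "
    "LOOKUP(/uigs/pblog/web/201005/20100521) inode (ptr=0x18ea9960, "
    "ino=231986435693, gen=5471125106254156105) found conflict "
    "(ptr=0x190032b0, ino=231986435693, gen=5471125106254156522)",
    "[2010-05-21 10:25:03] W [fuse-bridge.c:1719:fuse_create_cbk] glusterfs-fuse: "
    "35641: /uigs/pblog/web/201005/20100521/"
    ".pb_access_log.201005211020.10.15.4.61.nginx1.5FyvzI => -1 "
    "(No such file or directory)",
    "[2010-05-21 10:25:03] W [fuse-bridge.c:1719:fuse_create_cbk] glusterfs-fuse: "
    "35644: /uigs/pblog/web/201005/20100521/"
    "pb_access_log.201005211020.10.15.4.61.nginx1 => -1 "
    "(No such file or directory)",
]


def summarize(lines):
    """Return (failed_creates, conflict_counts) for an iterable of log lines."""
    failed = []            # list of (timestamp, path, error text)
    conflicts = Counter()  # directory path -> number of conflict warnings
    for line in lines:
        line = line.strip()
        match = CREATE_FAIL.match(line)
        if match:
            failed.append((match["ts"], match["path"], match["err"]))
            continue
        match = CONFLICT.match(line)
        if match:
            conflicts[match["path"]] += 1
    return failed, conflicts


def creates_under_conflicts(failed, conflicts):
    """Failed creates whose parent directory also logged a lookup conflict."""
    return [entry for entry in failed if posixpath.dirname(entry[1]) in conflicts]
```

Running `summarize(SAMPLE_LOG)` and passing the result to `creates_under_conflicts` lists the failed files that sit in a directory with a conflicting inode. That is only a correlation to check across a full log, not a diagnosis of the cause.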
Re: [Gluster-users] Big I/O or gluster proccess problem
Are you using performance/io-threads filters on *both* the client and the server side? Despite what the docs say, I saw similar behaviour until I did that.

Ian

On 20/05/10 12:19, Ran wrote:
> Hi all,
>
> Our problem is simple but quite critical. I posted a few months ago regarding this issue and there were some good responses, but no fix.
>
> What happens is that gluster gets stuck when there is a big write to it. For example:
>
> time dd if=/dev/zero of=file bs=10240 count=100
>
> or mv 20gig_file.img into the gluster mount.
>
> When that happens, the whole storage freezes for the entire process: mails, a few VPSs, a simple dir, etc.
>
> Our setup is quite simple at this point: 2 servers in distribute mode, each has a 1 TB brick that replicates to the other using DRBD.
>
> I've monitored everything closely during these big writes and noticed that it's not a memory, processor, or network problem. I've also checked the same setup with simple NFS and it does not happen there.
>
> Does anyone have any idea how to fix this? If you can't write big files into gluster without making the storage non-functional, then you can't really do anything.
>
> Please advise,
Re: [Gluster-users] Big I/O or gluster proccess problem
You should give 3.0.4 a try. I used to run 3.0.2 and saw the same behavior: when writing in a directory, other processes couldn't access it until the write had finished. Since I upgraded to 3.0.4, I no longer have the issue.

cheers

On Thu, May 20, 2010 at 3:10 PM, Tejas N. Bhise wrote:
> If I am not mistaken, you have a single server for GlusterFS, and this is mirrored to a second (non-GlusterFS) server using DRBD. If you have only a single server to export data from, why use GlusterFS? Also, we don't officially support DRBD replication with a GlusterFS backend.
>
> Maybe you can consider GlusterFS replication across the two servers?
Re: [Gluster-users] Big I/O or gluster proccess problem
If I am not mistaken, you have a single server for GlusterFS, and this is mirrored to a second (non-GlusterFS) server using DRBD. If you have only a single server to export data from, why use GlusterFS? Also, we don't officially support DRBD replication with a GlusterFS backend.

Maybe you can consider GlusterFS replication across the two servers?

----- Original Message -----
From: "Ran"
To: "Tejas N. Bhise"
Sent: Thursday, May 20, 2010 6:30:21 PM
Subject: Re: [Gluster-users] Big I/O or gluster proccess problem

Tejas hi,

The 2 servers are a DRBD pair with HA, so gluster actually has 1 server that exports 1 hard disk (1 TB), and this disk is DRBD'd to the other server. This 1 TB disk is also RAID 1 with Linux RAID (I know it's not optimal, but it is robust). In this setup, if 1 server goes down the other continues. DRBD is more robust than gluster replication, especially for VPSs etc.

I didn't check iowait, but the load on the server is about 5 while the CPUs are at only 10-50%, so that says it all (there are I/O waits).

I was thinking of breaking the RAID 1, since this disk already has a full mirror with DRBD (to server 2), but I'm not sure it will resolve this problem, since with NFS it's not the same: it slows things down, but not to the point of being non-functional.

Client vol file (192.168.0.9 is the HA IP of this pair). I've also tested with a plain config (no writebehind etc.):

# file: /etc/glusterfs/glusterfs.vol
volume storage1-2
  type protocol/client
  option transport-type tcp
  option remote-host 192.168.0.9
  option remote-subvolume b1
  option ping-timeout 120
  option username ..
  option password ..
end-volume

volume cluster
  type cluster/distribute
  option lookup-unhashed yes
  subvolumes storage1-2
end-volume

#volume writebehind
# type performance/write-behind
# option cache-size 3MB
# subvolumes cluster
#end-volume

#volume readahead
# type performance/read-ahead
# option page-count 4
# subvolumes writebehind
#end-volume

volume iothreads
  type performance/io-threads
  option thread-count 4
  subvolumes cluster
end-volume

volume io-cache
  type performance/io-cache
  option cache-size 128MB
  option page-size 256KB #128KB is default option
  option force-revalidate-timeout 10 # default is 1
  subvolumes iothreads
end-volume

volume writebehind
  type performance/write-behind
  option aggregate-size 512KB # default is 0bytes
  option flush-behind on # default is 'off'
  subvolumes io-cache
end-volume

Server vol file:

# file: /etc/glusterfs/glusterfs-server.vol
volume posix
  type storage/posix
  option directory /data/gluster
# option o-direct enable
  option background-unlink yes
# option span-devices 8
end-volume

volume locks
  type features/locks
  subvolumes posix
end-volume

volume b1
  type performance/io-threads
  option thread-count 8
  subvolumes locks
end-volume

volume server
  type protocol/server
  option transport.socket.nodelay on
  option transport-type tcp
# option auth.addr.b1.allow *
  option auth.login.b1.allow ..
  option auth.login.gluster.password
  subvolumes b1
end-volume

2010/5/20 Tejas N. Bhise < te...@gluster.com >
> Ran,
>
> Can you please elaborate on "2 servers in distribute mode, each has a 1 TB brick that replicates to the other using DRBD"?
> Also, how many drives do you have, and what does iowait look like when you write a big file? Tell us more about the configs of your servers and share the volume files.
>
> Regards,
> Tejas.
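The volume files quoted above stack translators by name through their `subvolumes` lines. As a rough aid for checking which translator types a given file actually loads (for example, whether performance/io-threads is present and performance/read-ahead is really disabled), here is a small Python sketch. It assumes only the syntax visible in the quoted files: `volume NAME`, `type`, `subvolumes`, `end-volume`, and `#` starting a comment. The helper names are hypothetical and this is not a full volfile parser.

```python
# A fragment of the client volume file quoted above, used as sample input.
CLIENT_SNIPPET = """
volume cluster
  type cluster/distribute
  option lookup-unhashed yes
  subvolumes storage1-2
end-volume

#volume readahead
# type performance/read-ahead
# option page-count 4
# subvolumes writebehind
#end-volume

volume iothreads
  type performance/io-threads
  option thread-count 4
  subvolumes cluster
end-volume
"""


def parse_volfile(text):
    """Return a list of dicts (name, type, subvolumes) for active volumes."""
    volumes = []
    current = None
    for raw in text.splitlines():
        # Assumption: '#' begins a comment anywhere on the line.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        words = line.split()
        if words[0] == "volume" and len(words) >= 2:
            current = {"name": words[1], "type": None, "subvolumes": []}
        elif current is None:
            continue  # ignore anything outside a volume block
        elif words[0] == "type" and len(words) >= 2:
            current["type"] = words[1]
        elif words[0] == "subvolumes":
            current["subvolumes"] = words[1:]
        elif words[0] == "end-volume":
            volumes.append(current)
            current = None
    return volumes


def has_translator(volumes, translator_type):
    """True if any active volume uses the given translator type."""
    return any(volume["type"] == translator_type for volume in volumes)
```

With the fragment above, `has_translator(parse_volfile(CLIENT_SNIPPET), "performance/io-threads")` is true, while the commented-out read-ahead block is ignored. Running the same check over both the client and the server file is one way to confirm that io-threads is loaded on both sides.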
Re: [Gluster-users] Big I/O or gluster proccess problem
Version 3.0.0

2010/5/20 Bedis 9
> Hey,
>
> If you're using Gluster, please tell us which version you're running...
>
> cheers
Re: [Gluster-users] Big I/O or gluster proccess problem
Hey,

If you're using Gluster, please tell us which version you're running...

cheers

On Thu, May 20, 2010 at 2:12 PM, Tejas N. Bhise wrote:
> Ran,
>
> Can you please elaborate on "2 servers in distribute mode, each has a 1 TB brick that replicates to the other using DRBD"?
> Also, how many drives do you have, and what does iowait look like when you write a big file? Tell us more about the configs of your servers and share the volume files.
>
> Regards,
> Tejas.
Re: [Gluster-users] Big I/O or gluster proccess problem
Ran,

Can you please elaborate on "2 servers in distribute mode, each has a 1 TB brick that replicates to the other using DRBD"? Also, how many drives do you have, and what does iowait look like when you write a big file? Tell us more about the configs of your servers and share the volume files.

Regards,
Tejas.

----- Original Message -----
From: "Ran"
To: Gluster-users@gluster.org
Sent: Thursday, May 20, 2010 4:49:52 PM
Subject: [Gluster-users] Big I/O or gluster proccess problem

Hi all,

Our problem is simple but quite critical. I posted a few months ago regarding this issue and there were some good responses, but no fix.

What happens is that gluster gets stuck when there is a big write to it. For example:

time dd if=/dev/zero of=file bs=10240 count=100

or mv 20gig_file.img into the gluster mount.

When that happens, the whole storage freezes for the entire process: mails, a few VPSs, a simple dir, etc.

Our setup is quite simple at this point: 2 servers in distribute mode, each has a 1 TB brick that replicates to the other using DRBD.

I've monitored everything closely during these big writes and noticed that it's not a memory, processor, or network problem. I've also checked the same setup with simple NFS and it does not happen there.

Does anyone have any idea how to fix this? If you can't write big files into gluster without making the storage non-functional, then you can't really do anything.

Please advise,
[Gluster-users] Big I/O or gluster proccess problem
Hi all,

Our problem is simple but quite critical. I posted a few months ago regarding this issue and there were some good responses, but no fix.

What happens is that gluster gets stuck when there is a big write to it. For example:

time dd if=/dev/zero of=file bs=10240 count=100

or mv 20gig_file.img into the gluster mount.

When that happens, the whole storage freezes for the entire process: mails, a few VPSs, a simple dir, etc.

Our setup is quite simple at this point: 2 servers in distribute mode, each has a 1 TB brick that replicates to the other using DRBD.

I've monitored everything closely during these big writes and noticed that it's not a memory, processor, or network problem. I've also checked the same setup with simple NFS and it does not happen there.

Does anyone have any idea how to fix this? If you can't write big files into gluster without making the storage non-functional, then you can't really do anything.

Please advise,
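A note on reproducing this: as written, the dd command writes 10240 × 100 = 1,024,000 bytes (about 1 MB), so a test of the freeze probably needs a much larger count. The Python sketch below is one hedged way to measure the symptom rather than eyeball it. It writes zeros to a file in a background thread, much like dd from /dev/zero, and times `os.listdir` on the same directory while the write runs. The function names, the temporary file name `bigwrite.tmp`, and the polling interval are all illustrative assumptions, and nothing here is specific to GlusterFS.

```python
import os
import threading
import time


def write_zeros(path, block_size, count):
    """Write `count` blocks of `block_size` zero bytes and return the total size."""
    block = bytes(block_size)  # a buffer of zero bytes
    with open(path, "wb") as handle:
        for _ in range(count):
            handle.write(block)
        handle.flush()
        os.fsync(handle.fileno())  # push the data out before returning
    return block_size * count


def probe_latency_during_write(directory, block_size, count, interval=0.05):
    """Time os.listdir(directory) repeatedly while a writer thread is running.

    Returns the list of listing latencies in seconds (at least one sample).
    """
    target = os.path.join(directory, "bigwrite.tmp")  # hypothetical file name
    writer = threading.Thread(target=write_zeros, args=(target, block_size, count))
    samples = []
    writer.start()
    while True:
        start = time.monotonic()
        os.listdir(directory)
        samples.append(time.monotonic() - start)
        if not writer.is_alive():
            break
        time.sleep(interval)
    writer.join()
    os.remove(target)
    return samples
```

For example, `max(probe_latency_during_write("/mnt/gluster", 1 << 20, 20480))` would write 20 GiB in 1 MiB blocks and report the slowest directory listing seen meanwhile; `/mnt/gluster` is a placeholder for the real mount point. Running the same call against an NFS mount of the same storage gives a like-for-like comparison with the NFS observation above.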