Re: [Gluster-users] Gluster 3.2.0 and ucarp not working
When a client connects to any Gluster node, it automatically receives the list of all other nodes for that volume.

Matus

On 8.6.2011 8:13, Joshua Baker-LePain jl...@duke.edu wrote:
> On Mon, 6 Jun 2011 at 1:30am, Craig Carl wrote
>> Matus -
>> If you are using the Gluster native client (mount -t glusterfs ...) then ucarp/CTDB is NOT required and you should not install it. Always use the real IPs when you are mounting with 'mount -t glusterfs ...'.
>
> Hrm. That wasn't my understanding. Say my fstab line looks like this:
>
> 192.168.2.100:/distrep /mnt/distrep glusterfs defaults,_netdev 0 0
>
> Now, let's say that at mount time 192.168.2.100 is down. How does the Gluster native client know which other IP addresses to contact to get the volume file? Is there a way to put multiple hosts in the fstab line?
>
> --
> Joshua Baker-LePain
> QB3 Shared Cluster Sysadmin
> UCSF
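On the question of multiple hosts in fstab: later versions of the mount.glusterfs helper accept a backupvolfile-server option naming a second server to fetch the volume file from when the first is down; whether the 3.2.0 helper already supports it should be verified before relying on it. A sketch of such an entry (addresses are illustrative):

    # second volfile server is tried only if 192.168.2.100 is unreachable at mount time
    192.168.2.100:/distrep /mnt/distrep glusterfs defaults,_netdev,backupvolfile-server=192.168.2.101 0 0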
Re: [Gluster-users] Gluster 3.2.0 and ucarp not working
On Wed, 8 Jun 2011 at 8:16am, bxma...@gmail.com wrote
> When a client connects to any Gluster node, it automatically receives the list of all other nodes for that volume.

Yes, but what if the node it first tries to contact (i.e., the one on the fstab line) is down?

--
Joshua Baker-LePain
QB3 Shared Cluster Sysadmin
UCSF
[Gluster-users] Glusterfs 3.2.0 NFS Problem
Hi,

I noticed strange behavior with NFS and GlusterFS 3.2.0: three of our servers are losing the mount, but when you restart the volume on the server it works again without a remount. On the server I noticed these entries in the glusterfs NFS log file when the mount on the client becomes unavailable:

[2011-06-08 14:37:02.568693] I [afr-inode-read.c:270:afr_stat] 0-ksc-replicate-0: /: no child is up
[2011-06-08 14:37:02.569212] I [afr-inode-read.c:270:afr_stat] 0-ksc-replicate-0: /: no child is up
[2011-06-08 14:37:02.611910] I [afr-inode-read.c:270:afr_stat] 0-ksc-replicate-0: /: no child is up
[2011-06-08 14:37:02.624477] I [afr-inode-read.c:270:afr_stat] 0-ksc-replicate-0: /: no child is up
[2011-06-08 14:37:04.288272] I [afr-inode-read.c:270:afr_stat] 0-ksc-replicate-0: /: no child is up
[...]
[2011-06-08 14:37:09.382387] I [afr-inode-read.c:270:afr_stat] 0-ksc-replicate-0: /: no child is up

(the same afr_stat message repeats dozens of times between 14:37:02 and 14:37:09; only the timestamp changes)

Thanks for the help
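The workaround described - restarting the volume on the server - corresponds to the standard CLI sequence below; the volume name ksc is inferred from the 0-ksc-replicate-0 log prefix and may differ in your setup. Note that stopping a volume interrupts all clients until it is started again.

    # the stop command asks for confirmation interactively
    gluster volume stop ksc
    gluster volume start ksc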
Re: [Gluster-users] Gluster 3.2.0 and ucarp not working
On Wed, Jun 8, 2011 at 12:09 PM, Joshua Baker-LePain jl...@duke.edu wrote:
> On Wed, 8 Jun 2011 at 9:40am, Mohit Anchlia wrote
>> On Tue, Jun 7, 2011 at 11:20 PM, Joshua Baker-LePain jl...@duke.edu wrote:
>>> On Wed, 8 Jun 2011 at 8:16am, bxma...@gmail.com wrote
>>>> When a client connects to any Gluster node, it automatically receives the list of all other nodes for that volume.
>>> Yes, but what if the node it first tries to contact (i.e., the one on the fstab line) is down?
>> For the client side, use DNS round robin with all the hosts in your cluster.
> And if you use /etc/hosts rather than DNS...?
>
> --
> Joshua Baker-LePain
> QB3 Shared Cluster Sysadmin
> UCSF

What do you think should happen?
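DNS round robin here means publishing a single name with one A record per server, so the mount does not depend on any single box; /etc/hosts cannot express this, since lookups there typically yield only the first matching entry. A sketch of the zone records (name and addresses are illustrative):

    ; clients mount gluster.example.com:/distrep
    gluster  IN  A  192.168.2.100
    gluster  IN  A  192.168.2.101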
Re: [Gluster-users] uninterruptible processes writing to glusterfs share
I'm using kernel 2.6.34 + fuse 2.5.5 + gluster 3.2 from the beginning, and it happened again today ... php-fpm froze and a reboot was the only solution.

Matus

2011/6/7 Markus Fröhlich markus.froehl...@xidras.com:
> hi!
>
> there is no relevant output from dmesg. no entries in the server log - only the one line in the client-server log I already posted.
>
> the glusterfs version on the server had been updated to gfs 3.2.0 more than a month ago. because of the troubles on the backup server, I deleted the whole backup share and started from scratch.
>
> I looked for an update of fuse and upgraded from 2.7.2-61.18.1 to 2.8.5-41.1 - maybe this helps. here is the changelog info:
>
> Authors: Miklos Szeredi mik...@szeredi.hu
> Distribution: systemsmanagement:baracus / SLE_11_SP1
> * Tue Mar 29 2011 db...@novell.com - remove the --no-canonicalize usage for suse_version = 11.3
> * Mon Mar 21 2011 co...@novell.com - licenses package is about to die
> * Thu Feb 17 2011 mszer...@suse.cz - In case of failure to add to /etc/mtab don't umount. [bnc#668820] [CVE-2011-0541]
> * Tue Nov 16 2010 mszer...@suse.cz - Fix symlink attack for mount and umount [bnc#651598]
> * Wed Oct 27 2010 mszer...@suse.cz - Remove /etc/init.d/boot.fuse [bnc#648843]
> * Tue Sep 28 2010 mszer...@suse.cz - update to 2.8.5: fix option escaping for fusermount [bnc#641480]
> * Wed Apr 28 2010 mszer...@suse.cz - keep examples and internal docs in devel package (from jnweiger)
> * Mon Apr 26 2010 mszer...@suse.cz - update to 2.8.4: fix checking for symlinks in umount from /tmp; fix umounting if /tmp is a symlink
>
> kind regards
> markus froehlich
>
> On 06.06.2011 21:19, Anthony J. Biacco wrote:
>> Could be fuse, check 'dmesg' for kernel module timeouts.
>> In a similar vein, has anyone seen significant performance/reliability differences with different fuse versions? say, latest source vs. RHEL distro RPMs.
>>
>> -Tony
>>
>> -----Original Message-----
>> From: Mohit Anchlia mohitanch...@gmail.com
>> Sent: June 06, 2011 1:14 PM
>> To: Markus Fröhlich markus.froehl...@xidras.com
>> Cc: gluster-users@gluster.org
>> Subject: Re: [Gluster-users] uninterruptible processes writing to glusterfs share
>>
>> Is there anything in the server logs? Does it follow any particular pattern before going into this mode? Did you upgrade Gluster or is this a new install?
>>
>> 2011/6/6 Markus Fröhlich markus.froehl...@xidras.com:
>>> hi!
>>>
>>> sometimes we have hanging uninterruptible processes on some client-servers (ps aux stat is "D"), and on one of them the CPU wait I/O grows within a few minutes to 100%. you are not able to kill such processes - even kill -9 doesn't work - and when you attach strace to such a process, you won't see anything and you cannot detach it again.
>>>
>>> there are only two possibilities: killing the glusterfs process (umount the GFS share) or rebooting the server.
>>>
>>> the only log entry I found was on one client - just a single line:
>>> [2011-06-06 10:44:18.593211] I [afr-common.c:581:afr_lookup_collect_xattr] 0-office-data-replicate-0: data self-heal is pending for /pc-partnerbet-public/Promotionaktionen/Mailakquise_2009/Webmaster_2010/HTML/bilder/Thumbs.db.
>>>
>>> one of the client-servers is a samba server, the other one a backup server based on rsync with millions of small files.
>>>
>>> gfs-servers + gfs-clients: SLES11 x86_64, glusterfs V 3.2.0
>>>
>>> and here are the configs from server and client:
>>>
>>> server config /etc/glusterd/vols/office-data/office-data.gfs-01-01.GFS-office-data02.vol:
>>>
>>> volume office-data-posix
>>>   type storage/posix
>>>   option directory /GFS/office-data02
>>> end-volume
>>>
>>> volume office-data-access-control
>>>   type features/access-control
>>>   subvolumes office-data-posix
>>> end-volume
>>>
>>> volume office-data-locks
>>>   type features/locks
>>>   subvolumes office-data-access-control
>>> end-volume
>>>
>>> volume office-data-io-threads
>>>   type performance/io-threads
>>>   subvolumes office-data-locks
>>> end-volume
>>>
>>> volume office-data-marker
>>>   type features/marker
>>>   option volume-uuid 3c6e633d-a0bb-4c52-8f05-a2db9bc9c659
>>>   option timestamp-file /etc/glusterd/vols/office-data/marker.tstamp
>>>   option xtime off
>>>   option quota off
>>>   subvolumes office-data-io-threads
>>> end-volume
>>>
>>> volume /GFS/office-data02
>>>   type debug/io-stats
>>>   option latency-measurement off
>>>   option count-fop-hits off
>>>   subvolumes office-data-marker
>>> end-volume
>>>
>>> volume office-data-server
>>>   type protocol/server
>>>   option transport-type tcp
>>>   option auth.addr./GFS/office-data02.allow *
>>>   subvolumes /GFS/office-data02
>>> end-volume
>>>
>>> --
>>>
>>> client config /etc/glusterd/vols/office-data/office-data-fuse.vol:
>>>
>>> volume office-data-client-0
>>>   type protocol/client
>>>   option remote-host gfs-01-01
>>>   option remote-subvolume /GFS/office-data02
>>>   option transport-type tcp
>>> end-volume
>>>
>>> volume office-data-replicate-0
>>>   type cluster/replicate
>>>   subvolumes office-data-client-0
>>> end-volume
>>>
>>> volume office-data-write-behind
>>>   type performance/write-behind
>>>   subvolumes
Re: [Gluster-users] uninterruptible processes writing to glusterfs share
Hopefully this will help some people... try disabling the io-cache routine in the fuse configuration for your share. Let me know if you need instructions on doing this. It solved all of the lockup issues I was experiencing. I believe there is some sort of as-yet-undetermined memory leak here.

Justice London

-----Original Message-----
From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of bxma...@gmail.com
Sent: Wednesday, June 08, 2011 12:22 PM
To: gluster-users@gluster.org
Subject: Re: [Gluster-users] uninterruptible processes writing to glusterfs share

> I'm using kernel 2.6.34 + fuse 2.5.5 + gluster 3.2 from the beginning, and it happened again today ... php-fpm froze and a reboot was the only solution.
>
> Matus
>
> 2011/6/7 Markus Fröhlich markus.froehl...@xidras.com:
>> [...]
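Disabling io-cache as suggested means removing the corresponding translator block from the client (fuse) volfile and re-pointing whatever block listed it as a subvolume. A sketch of the kind of block to remove - the volume names here are illustrative, since the io-cache section falls in the truncated part of the config above:

    volume office-data-io-cache
      type performance/io-cache
      subvolumes office-data-read-ahead
    end-volume

On newer releases the same effect is exposed through the CLI as "gluster volume set office-data performance.io-cache off"; whether 3.2.0 accepts that option is worth checking with "gluster volume set help".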
Re: [Gluster-users] uninterruptible processes writing to glusterfs share
On Wed, Jun 8, 2011 at 12:29 PM, Justice London jlon...@lawinfo.com wrote:
> Hopefully this will help some people... try disabling the io-cache routine in the fuse configuration for your share. Let me know if you need instructions on doing this. It solved all of the lockup issues I was experiencing. I believe there is some sort of as-yet-undetermined memory leak here.

Was there a bug filed? If you think this is a bug, filing it will help others as well.

> [...]
Re: [Gluster-users] uninterruptible processes writing to glusterfs share
How do I disable the io-cache routine? I will try it and report back :)

thanks
Matus

2011/6/8 Mohit Anchlia mohitanch...@gmail.com:
> On Wed, Jun 8, 2011 at 12:29 PM, Justice London jlon...@lawinfo.com wrote:
>> Hopefully this will help some people... try disabling the io-cache routine in the fuse configuration for your share. Let me know if you need instructions on doing this. It solved all of the lockup issues I was experiencing. I believe there is some sort of as-yet-undetermined memory leak here.
>
> Was there a bug filed? If you think this is a bug, filing it will help others as well.
>
> [...]
Re: [Gluster-users] uninterruptible processes writing to glusterfs share
I have been working with the Gluster support department (under paid support, I should mention) with no resolution so far. A bug report specifically was not filed by myself, though.

Justice London

-----Original Message-----
From: Mohit Anchlia [mailto:mohitanch...@gmail.com]
Sent: Wednesday, June 08, 2011 12:29 PM
To: Justice London
Cc: bxma...@gmail.com; gluster-users@gluster.org
Subject: Re: [Gluster-users] uninterruptible processes writing to glusterfs share

> On Wed, Jun 8, 2011 at 12:29 PM, Justice London jlon...@lawinfo.com wrote:
>> Hopefully this will help some people... try disabling the io-cache routine in the fuse configuration for your share. Let me know if you need instructions on doing this. It solved all of the lockup issues I was experiencing. I believe there is some sort of as-yet-undetermined memory leak here.
>
> Was there a bug filed? If you think this is a bug, filing it will help others as well.
>
> [...]
Re: [Gluster-users] Gluster 3.2.0 and ucarp not working
On Wed, 8 Jun 2011 at 12:12pm, Mohit Anchlia wrote
> On Wed, Jun 8, 2011 at 12:09 PM, Joshua Baker-LePain jl...@duke.edu wrote:
>> And if you use /etc/hosts rather than DNS...?
>
> What do you think should happen?

I'm simply trying to find the most robust way to automatically mount GlusterFS volumes at boot time in my environment. In previous versions of Gluster there was no issue, since the volume files were on the clients. I understand that can still be done, but one loses the ability to manage all changes from the servers.

Prior to this thread, I thought the best method was to use ucarp on the servers and mount using the ucarp address. If that won't work right (I haven't had time to fully test my setup yet), then I need to find another way. I don't run DNS on my cluster, so that solution is out. As far as I can tell, the only other solution is to mount in rc.local, with logic to detect a mount failure and go on to the next server.

--
Joshua Baker-LePain
QB3 Shared Cluster Sysadmin
UCSF
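The rc.local fallback described above can be a short loop; a sketch, with server addresses, volume name, and mount point all illustrative:

    #!/bin/sh
    # Try each Gluster server in turn; stop at the first successful mount.
    for srv in 192.168.2.100 192.168.2.101 192.168.2.102; do
        if mount -t glusterfs ${srv}:/distrep /mnt/distrep; then
            break
        fi
    done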
Re: [Gluster-users] Gluster 3.2.0 and ucarp not working
On Wed, Jun 8, 2011 at 1:17 PM, Mohit Anchlia mohitanch...@gmail.com wrote:
> On Wed, Jun 8, 2011 at 1:00 PM, Joshua Baker-LePain jl...@duke.edu wrote:
>> Prior to this thread, I thought the best method was to use ucarp on the servers and mount using the ucarp address. If that won't work right (I haven't had time to fully test my setup yet), then I need to find another way. I don't run DNS on my cluster, so that solution is out. As far as I can tell, the only other solution is to mount in rc.local, with logic to detect a mount failure and go on to the next server.
>
> ucarp should work, and so would a script at startup that checks hosts before mounting. BTW: you need a virtual IP for ucarp.

--
Joshua Baker-LePain
QB3 Shared Cluster Sysadmin
UCSF
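For reference, a minimal ucarp invocation on each server might look like the following; the interface, VHID, password, addresses, and scripts are all illustrative:

    # advertise VIP 192.168.2.200; the up/down scripts add and remove it from eth0
    ucarp --interface=eth0 --srcip=192.168.2.10 --vhid=42 --pass=secret \
          --addr=192.168.2.200 \
          --upscript=/etc/vip-up.sh --downscript=/etc/vip-down.sh &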
Re: [Gluster-users] Gluster 3.2.0 and ucarp not working
On Wed, 8 Jun 2011 at 1:18pm, Mohit Anchlia wrote
>> Prior to this thread, I thought the best method was to use ucarp on the servers and mount using the ucarp address. If that won't work right (I haven't had time to fully test my setup yet), then I need to find another way. I don't run DNS on my cluster, so that solution is out. As far as I can tell, the only other solution is to mount in rc.local, with logic to detect a mount failure and go on to the next server.
>
> ucarp should work, and so would a script at startup that checks hosts before mounting. BTW: you need a virtual IP for ucarp.

As I said, that's what I'm doing now -- using the virtual IP address managed by ucarp in my fstab line. But Craig Carl from Gluster told the OP in this thread specifically to mount using the real IP address of a server when using the GlusterFS client, *not* to use the ucarp VIP. So I'm officially confused.

--
Joshua Baker-LePain
QB3 Shared Cluster Sysadmin
UCSF
Re: [Gluster-users] Gluster 3.2.0 and ucarp not working
On 06/08/2011 04:37 PM, Joshua Baker-LePain wrote:
>> BTW: you need a virtual IP for ucarp
>
> As I said, that's what I'm doing now -- using the virtual IP address managed by ucarp in my fstab line. But Craig Carl from Gluster told the OP in this thread specifically to mount using the real IP address of a server when using the GlusterFS client, *not* to use the ucarp VIP. So I'm officially confused.

The GlusterFS client side gets its config from the server and makes connections to each server. Any of the GlusterFS servers may be used for the mount, and the client will connect to all of them. If one of the servers goes away, and you have a replicated or HA setup, you shouldn't see any client-side issues.

GlusterFS through an NFS client presents a mount point and a single point of connection to the server. Any of the GlusterFS servers may be used for the mount. If one of the servers goes away, and you have a replicated or HA setup, you shouldn't see any client-side issues as long as you are not attached to that server for your mount. Otherwise, the mount may hang.

Does this make it clearer or less so? ucarp would be needed for the NFS side of the equation. Round-robin DNS is useful in both cases.

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: land...@scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
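Concretely, the two mount styles would look like this in fstab; the addresses are illustrative, with 192.168.2.200 standing in for the ucarp VIP, and the NFS options may need tuning for a given client (Gluster's built-in NFS server speaks NFSv3 over TCP):

    # native client: any real server works; the client then connects to all bricks
    192.168.2.100:/distrep  /mnt/distrep      glusterfs  defaults,_netdev             0 0
    # NFS client: mount through the ucarp-managed VIP so the single connection can fail over
    192.168.2.200:/distrep  /mnt/distrep-nfs  nfs        defaults,_netdev,vers=3,tcp  0 0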
Re: [Gluster-users] Gluster 3.2.0 and ucarp not working
On Wed, Jun 8, 2011 at 1:37 PM, Joshua Baker-LePain jl...@duke.edu wrote:
> As I said, that's what I'm doing now -- using the virtual IP address managed by ucarp in my fstab line. But Craig Carl from Gluster told the OP in this thread specifically to mount using the real IP address of a server when using the GlusterFS client, *not* to use the ucarp VIP. So I'm officially confused.

I understand your confusion. On the client side there are two concerns:

1) What if the server I am trying to mount from is down during reboot, restart, remount, etc.?
2) What if I mounted my file system using server A and server A goes down?

I think Craig was trying to address point 2, but you are worried about point 1, correct?

Regarding point 1 - my suggestions are targeted towards point 1.

Regarding point 2 - if the file system is already mounted using server A and server A then goes down, this will NOT impact the client mount. As Joe mentioned, the list of servers in the cluster is fetched from server A during the mount. After that, if server A goes down, the client still has that list and knows the configuration of the cluster. So in short, this is built into the native glusterfs mount and provides HA on the client side. If you use NFS instead, you can solve that by using ucarp.

Clear as mud :)?
[Gluster-users] Red Hat packages with Scientific Linux
Hi,

Is anyone running Gluster on SL6, and if so, what are your experiences?

Thanks,

--
Jean Franco