[Gluster-users] Gluster Barclamp?
Hey all, does anyone know if there is a barclamp module (for the Dell Crowbar project) that has been built for Gluster? I'm guessing that if there isn't one, Red Hat might not be such a fan of automating this step, but I'm crossing my fingers!

Justice London

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
Re: [Gluster-users] Switching clients to NFS
I have read specifically that people with small files basically switched to NFS to solve small-file read issues and speed issues more generally. It seems to me (from my own testing as well) that 3.3 is still not working all that well with small files. An image folder with about 20k images in it takes 20+ seconds to 'ls' over the native client, and about three seconds over NFS. It was enough of a speed difference on the front-end site to make me change my thinking and start working on getting keepalived working with Gluster so that I can have a consistent NFS view.

Justice London

-----Original Message-----
From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of David Coulson
Sent: Friday, June 22, 2012 3:58 AM
To: Fernando Frediani (Qube)
Cc: 'gluster-users@gluster.org'
Subject: Re: [Gluster-users] Switching clients to NFS

On 6/22/12 6:18 AM, Fernando Frediani (Qube) wrote:
> I have seen a few people recently saying they are using NFS instead of the native Gluster client. I would imagine that the Gluster client would always be better and faster, besides giving automatic failover, but it makes me wonder what sort of problems they are experiencing with the Gluster client.

I ran FUSE mounts for a couple of months then switched to NFS. In general I saw an approximately 2x improvement in read performance, and writes appeared to be a little quicker. Since my environment is 'mostly reads', and NFS utilizes the OS filesystem cache, I saw a substantial drop in network traffic between Gluster nodes.

I was also never able to get FUSE to mount volumes with an explicit SELinux context set. Not sure if it's a bug in FUSE on RHEL6, or just something broken with FUSE, but it just ignored the secontext= mount parameter. NFS works with this, so I was able to run SELinux in enforcing mode while using Gluster.
David
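The keepalived + NFS setup Justice describes amounts to mounting the volume through Gluster's built-in NFS server via a floating virtual IP, so the mount point survives a server failure. A minimal sketch of the client side (the VIP 192.168.1.100 and volume name gv0 are hypothetical):

```
# Gluster's built-in NFS server speaks NFSv3 over TCP only, so pin
# those options explicitly; the address is the keepalived VIP, not
# an individual server, so failover is transparent to the client.
mount -t nfs -o vers=3,proto=tcp,nolock 192.168.1.100:/gv0 /mnt/gv0
```

When the active server dies, keepalived moves the VIP to a surviving node and in-flight NFS operations retry against it, which is what gives the "consistent NFS view" without the FUSE client's native failover.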
Re: [Gluster-users] Gluster 3.2.6 for XenServer
No, it was finally fixed in 3.3 so it does per-byte locking rather than per-file.

Justice

-----Original Message-----
From: Nathan Stratton [mailto:nat...@robotics.net]
Sent: Thursday, April 19, 2012 8:56 AM
To: Justice London
Cc: 'Gerald Brandt'; 'gluster-users'
Subject: Re: [Gluster-users] Gluster 3.2.6 for XenServer

On Thu, 19 Apr 2012, Justice London wrote:
> My suggestion is actually to not use Gluster currently for XenServer.
> A bug I filed about two years ago for using Gluster as a XenServer
> back-end was just fixed in 3.3. If you're going to use Gluster+Xen, use 3.3.
>
> As for NFS, make sure that you changed the NFS port to 2049 in the
> configuration and that should fix pretty much the front-end issues
> with Gluster and Xen.

Is the self-heal issue still there in 3.3 where the file needs to be locked to heal?

><>
Nathan Stratton
nathan at robotics.net
http://www.robotics.net
Re: [Gluster-users] Gluster 3.2.6 for XenServer
My suggestion is actually to not use Gluster currently for XenServer. A bug I filed about two years ago for using Gluster as a XenServer back-end was just fixed in 3.3. If you're going to use Gluster+Xen, use 3.3.

As for NFS, make sure that you changed the NFS port to 2049 in the configuration, and that should fix pretty much all the front-end issues with Gluster and Xen.

Justice

-----Original Message-----
From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of Gerald Brandt
Sent: Thursday, April 19, 2012 7:46 AM
To: gluster-users
Subject: [Gluster-users] Gluster 3.2.6 for XenServer

Hi,

I have Gluster 3.2.6 RPMs for Citrix XenServer 6.0. I've installed and mounted exports, but that's where I stopped. My issues are:

1. XenServer mounts the NFS server's SR subdirectory, not the export. Gluster won't do that.
-- I can, apparently, mount the gluster export somewhere else, and then 'mount --bind' the subdir to the right place.

2. I don't really know Python, though I could fuddle my way through it, I guess.

What I thought would be cool was to modify the nfs.py mount script. If the 'Advanced Options' field contained the word gluster, then instead of an NFS mount happening, it would do the gluster mount. All the magic of the NFS mount would remain, and happen on the gluster one.

Does this sound feasible? Is there anyone fluent in Python that wants to help?

Gerald
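The 'mount --bind' workaround from point 1 above can be sketched in two commands (the server name gfs-01, volume name vmstore, subdirectory, and SR UUID path are all hypothetical placeholders):

```
# Mount the Gluster volume somewhere out of XenServer's way first...
mount -t glusterfs gfs-01:/vmstore /mnt/vmstore

# ...then bind-mount the subdirectory into the path XenServer
# expects for the storage repository, so the SR sees only its subdir.
mount --bind /mnt/vmstore/sr-subdir /var/run/sr-mount/<SR-UUID>
```

The bind mount gives XenServer the subdirectory-as-mount-root view it wants, which plain glusterfs mounts cannot provide directly.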
Re: [Gluster-users] Apache hung tasks still occur with glusterfs 3.2.1
Disable io-cache and up the threads to 64 and your problems should disappear. They did for me when I made both of these changes.

Justice London

-----Original Message-----
From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of Jiri Lunacek
Sent: Monday, June 13, 2011 1:49 AM
To: gluster-users@gluster.org
Subject: [Gluster-users] Apache hung tasks still occur with glusterfs 3.2.1

Hi all.

We have been having problems with hung tasks of apache reading from a glusterfs 2-replica volume ever since upgrading to 3.2.0. The problems were identical to those described here:
http://gluster.org/pipermail/gluster-users/2011-May/007697.html

Yesterday we updated to 3.2.1. A good thing is that the hung tasks stopped appearing when gluster is in "intact" operation, i.e. when there are no modifications to the gluster configs at all. Today we modified some other volume exported by the same cluster (but not sharing anything with the volume used by the apache process). And, once again, two requests of apache reading from the glusterfs volume are stuck.

Any help with this issue would be very much appreciated, as right now we have to nightly-reboot the machine because the processes are stuck in iowait -> unkillable. I really do not want to go through the downgrade to 3.1.4, since it seems from the mailing list that it may not go exactly smoothly. We are exporting millions of files, and any large operation on the exported filesystem takes days.

I am attaching tech info on the problem.

client: CentOS 5.6
2.6.18-238.9.1.el5
fuse-2.7.4-8.el5
glusterfs-fuse-3.2.1-1
glusterfs-core-3.2.1-1

servers: CentOS 5.6
2.6.18-194.32.1.el5
fuse-2.7.4-8.el5
glusterfs-fuse-3.2.1-1
glusterfs-core-3.2.1-1

dmesg:
INFO: task httpd:1246 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
httpd D 81000101d7a0 0 1246 2394 1247 1191 (NOTLB)
 81013ee7dc38 0082 0092 81013ee7dcd8
 81013ee7dd04 000a 810144d0f7e0 81019fc28100
 308f8b444727 14ee 810144d0f9c8 00038006e608
Call Trace:
 [] do_gettimeofday+0x40/0x90
 [] sync_page+0x0/0x43
 [] io_schedule+0x3f/0x67
 [] sync_page+0x3e/0x43
 [] __wait_on_bit_lock+0x36/0x66
 [] __lock_page+0x5e/0x64
 [] wake_bit_function+0x0/0x23
 [] pagevec_lookup+0x17/0x1e
 [] invalidate_inode_pages2_range+0x73/0x1bd
 [] finish_wait+0x32/0x5d
 [] :fuse:wait_answer_interruptible+0xb6/0xbd
 [] autoremove_wake_function+0x0/0x2e
 [] recalc_sigpending+0xe/0x25
 [] sigprocmask+0xb7/0xdb
 [] :fuse:fuse_finish_open+0x36/0x62
 [] :fuse:fuse_open_common+0x147/0x158
 [] :fuse:fuse_open+0x0/0x7
 [] __dentry_open+0xd9/0x1dc
 [] do_filp_open+0x2a/0x38
 [] do_sys_open+0x44/0xbe
 [] tracesys+0xd5/0xe0
INFO: task httpd:1837 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
httpd D 810001004420 0 1837 2394 1856 1289 (NOTLB)
 81013c6f9c38 0086 81013c6f9bf8 fffe
 810170ce7000 000a 81019c0ae7a0 80311b60
 308c0f83d792 0ec4 81019c0ae988 8006e608
Call Trace:
 [] do_gettimeofday+0x40/0x90
 [] sync_page+0x0/0x43
 [] io_schedule+0x3f/0x67
 [] sync_page+0x3e/0x43
 [] __wait_on_bit_lock+0x36/0x66
 [] __lock_page+0x5e/0x64
 [] wake_bit_function+0x0/0x23
 [] pagevec_lookup+0x17/0x1e
 [] invalidate_inode_pages2_range+0x73/0x1bd
 [] finish_wait+0x32/0x5d
 [] :fuse:wait_answer_interruptible+0xb6/0xbd
 [] autoremove_wake_function+0x0/0x2e
 [] recalc_sigpending+0xe/0x25
 [] sigprocmask+0xb7/0xdb
 [] :fuse:fuse_finish_open+0x36/0x62
 [] :fuse:fuse_open_common+0x147/0x158
 [] :fuse:fuse_open+0x0/0x7
 [] __dentry_open+0xd9/0x1dc
 [] do_filp_open+0x2a/0x38
 [] do_sys_open+0x44/0xbe
 [] tracesys+0xd5/0xe0
INFO: task httpd:383 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
httpd D 81019fa21100 0 383 2394 534 (NOTLB)
 81013e497c08 0082 810183eb8910 884b9219
 81019e41c600 0009 81019b1e2100 81019fa21100
 308c0e2c2bfb 00016477 81019b1e22e8 00038006e608
Call Trace:
 [] :fuse:flush_bg_queue+0x2b/0x48
 [] do_gettimeofday+0x40/0x90
 [] getnstimeofday+0x10/0x28
 [] sync_page+0x0/0x43
 [] io_schedule+0x3f/0x67
 [] sync_page+0x3e/0x43
 [] __wait_on_bit_lock+0x36/0x66
 [] __lock_page+0x5e/0x64
 [] wake_bit_function+0x0/0x23
 [] do_generic_mapping_read+0x1df/0x359
 [] file_read_actor+0x0/0x159
 [] __generic_file_aio_read+0x14c/0x198
 [] generic_file_read+0xac/0xc5
 [] autoremove_wake_function+0x0/0x2e
 [] selinux_file_permission+0x9f/0xb4
 [] vfs_read+0xcb/0x171
 [] sys_read+0x45/0x6e
 [] tracesys+0xd5/0
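The fix suggested at the top of this message ("disable io-cache and up the threads to 64") can be applied through the volume-set interface rather than by hand-editing volfiles. A sketch, assuming a volume named www (exact option names can vary between 3.x releases, so check them against your version first):

```
# Turn off the io-cache translator for the volume...
gluster volume set www performance.io-cache off

# ...and raise the io-threads worker count from the default.
gluster volume set www performance.io-thread-count 64
```

Settings applied this way persist in the volume configuration across restarts, unlike direct volfile edits, which glusterd may regenerate.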
Re: [Gluster-users] uninterruptible processes writing to glusterfs share
I have been working with the Gluster support department (under paid support, I should mention)... with no resolution or similar. A bug report specifically was not filed by myself, though.

Justice London

-----Original Message-----
From: Mohit Anchlia [mailto:mohitanch...@gmail.com]
Sent: Wednesday, June 08, 2011 12:29 PM
To: Justice London
Cc: bxma...@gmail.com; gluster-users@gluster.org
Subject: Re: [Gluster-users] uninterruptible processes writing to glusterfs share

On Wed, Jun 8, 2011 at 12:29 PM, Justice London wrote:
> Hopefully this will help some people... try disabling the io-cache routine
> in the fuse configurations for your share. Let me know if you need
> instruction on doing this. It solved all of the lockup issues I was
> experiencing. I believe there is some sort of as-yet-undetermined memory
> leak here.

Was there a bug filed? If you think this is a bug, filing it will help others as well.

> Justice London
>
> -----Original Message-----
> From: gluster-users-boun...@gluster.org
> [mailto:gluster-users-boun...@gluster.org] On Behalf Of bxma...@gmail.com
> Sent: Wednesday, June 08, 2011 12:22 PM
> To: gluster-users@gluster.org
> Subject: Re: [Gluster-users] uninterruptible processes writing to glusterfs share
>
> I'm using kernel 2.6.34 + fuse 2.5.5 + gluster 3.2 from the beginning and
> it happened again today... php-fpm froze and a reboot was the only solution.
>
> Matus
>
> 2011/6/7 Markus Fröhlich:
>> hi!
>>
>> there is no relevant output from dmesg.
>> no entries in the server log - only the one line in the client-server log I
>> already posted.
>>
>> the glusterfs version on the server had been updated to gfs 3.2.0 more than
>> a month ago.
>> because of the troubles on the backup server, I deleted the whole backup
>> share and started from scratch.
>>
>> I looked for an update of "fuse" and upgraded from 2.7.2-61.18.1 to
>> 2.8.5-41.1
>> maybe this helps.
>>
>> here is the changelog info:
>>
>> Authors:
>> Miklos Szeredi
>> Distribution: systemsmanagement:baracus / SLE_11_SP1
>>
>> * Tue Mar 29 2011 db...@novell.com
>> - remove the --no-canonicalize usage for suse_version <= 11.3
>>
>> * Mon Mar 21 2011 co...@novell.com
>> - licenses package is about to die
>>
>> * Thu Feb 17 2011 mszer...@suse.cz
>> - In case of failure to add to /etc/mtab don't umount. [bnc#668820] [CVE-2011-0541]
>>
>> * Tue Nov 16 2010 mszer...@suse.cz
>> - Fix symlink attack for mount and umount [bnc#651598]
>>
>> * Wed Oct 27 2010 mszer...@suse.cz
>> - Remove /etc/init.d/boot.fuse [bnc#648843]
>>
>> * Tue Sep 28 2010 mszer...@suse.cz
>> - update to 2.8.5
>>   * fix option escaping for fusermount [bnc#641480]
>>
>> * Wed Apr 28 2010 mszer...@suse.cz
>> - keep examples and internal docs in devel package (from jnweiger)
>>
>> * Mon Apr 26 2010 mszer...@suse.cz
>> - update to 2.8.4
>>   * fix checking for symlinks in umount from /tmp
>>   * fix umounting if /tmp is a symlink
>>
>> kind regards
>> markus froehlich
>>
>> Am 06.06.2011 21:19, schrieb Anthony J. Biacco:
>>>
>>> Could be fuse, check 'dmesg' for kernel module timeouts.
>>>
>>> In a similar vein, has anyone seen significant performance/reliability
>>> differences with different fuse versions? say, latest source vs. RHEL distro RPMs.
>>>
>>> -Tony
>>>
>>> -----Original Message-----
>>> From: Mohit Anchlia
>>> Sent: June 06, 2011 1:14 PM
>>> To: Markus Fröhlich
>>> Cc: gluster-users@gluster.org
>>> Subject: Re: [Gluster-users] uninterruptible processes writing to glusterfs share
>>>
>>> Is there anything in the server logs? Does it follow any particular
>>> pattern before going into this mode?
>>>
>>> Did you upgrade Gluster or is this a new install?
>>>
>>> 2011/6/6 Markus Fröhlich:
>>>>
>>>> hi!
>>>>
>>>> sometimes we have on some client-servers hanging uninterruptible processes
>>>> ("ps aux" stat is "D") and on one the CPU wait I/O grows within some
>>>> minutes to 100%.
>>>> you are not able to kill such processes - even "kill -9" doesn't work -
>>>> when you connect via "strace" to such a process, you won't see anything and
>>>> you cannot detach it again.
>>>>
>>>> there are only two possibilities:
>>>> killing the glusterfs process (umount GFS share) or rebooting the server.
Re: [Gluster-users] uninterruptible processes writing to glusterfs share
Hopefully this will help some people... try disabling the io-cache routine in the fuse configurations for your share. Let me know if you need instruction on doing this. It solved all of the lockup issues I was experiencing. I believe there is some sort of as-yet-undetermined memory leak here.

Justice London

-----Original Message-----
From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of bxma...@gmail.com
Sent: Wednesday, June 08, 2011 12:22 PM
To: gluster-users@gluster.org
Subject: Re: [Gluster-users] uninterruptible processes writing to glusterfs share

I'm using kernel 2.6.34 + fuse 2.5.5 + gluster 3.2 from the beginning and it happened again today... php-fpm froze and a reboot was the only solution.

Matus

2011/6/7 Markus Fröhlich:
> hi!
>
> there is no relevant output from dmesg.
> no entries in the server log - only the one line in the client-server log I
> already posted.
>
> the glusterfs version on the server had been updated to gfs 3.2.0 more than
> a month ago.
> because of the troubles on the backup server, I deleted the whole backup
> share and started from scratch.
>
> I looked for an update of "fuse" and upgraded from 2.7.2-61.18.1 to
> 2.8.5-41.1
> maybe this helps.
>
> here is the changelog info:
>
> Authors:
> Miklos Szeredi
> Distribution: systemsmanagement:baracus / SLE_11_SP1
>
> * Tue Mar 29 2011 db...@novell.com
> - remove the --no-canonicalize usage for suse_version <= 11.3
>
> * Mon Mar 21 2011 co...@novell.com
> - licenses package is about to die
>
> * Thu Feb 17 2011 mszer...@suse.cz
> - In case of failure to add to /etc/mtab don't umount.
> [bnc#668820] [CVE-2011-0541]
>
> * Tue Nov 16 2010 mszer...@suse.cz
> - Fix symlink attack for mount and umount [bnc#651598]
>
> * Wed Oct 27 2010 mszer...@suse.cz
> - Remove /etc/init.d/boot.fuse [bnc#648843]
>
> * Tue Sep 28 2010 mszer...@suse.cz
> - update to 2.8.5
>   * fix option escaping for fusermount [bnc#641480]
>
> * Wed Apr 28 2010 mszer...@suse.cz
> - keep examples and internal docs in devel package (from jnweiger)
>
> * Mon Apr 26 2010 mszer...@suse.cz
> - update to 2.8.4
>   * fix checking for symlinks in umount from /tmp
>   * fix umounting if /tmp is a symlink
>
> kind regards
> markus froehlich
>
> Am 06.06.2011 21:19, schrieb Anthony J. Biacco:
>>
>> Could be fuse, check 'dmesg' for kernel module timeouts.
>>
>> In a similar vein, has anyone seen significant performance/reliability
>> differences with different fuse versions? say, latest source vs. RHEL distro RPMs.
>>
>> -Tony
>>
>> -----Original Message-----
>> From: Mohit Anchlia
>> Sent: June 06, 2011 1:14 PM
>> To: Markus Fröhlich
>> Cc: gluster-users@gluster.org
>> Subject: Re: [Gluster-users] uninterruptible processes writing to glusterfs share
>>
>> Is there anything in the server logs? Does it follow any particular
>> pattern before going into this mode?
>>
>> Did you upgrade Gluster or is this a new install?
>>
>> 2011/6/6 Markus Fröhlich:
>>>
>>> hi!
>>>
>>> sometimes we have on some client-servers hanging uninterruptible processes
>>> ("ps aux" stat is "D") and on one the CPU wait I/O grows within some
>>> minutes to 100%.
>>> you are not able to kill such processes - even "kill -9" doesn't work -
>>> when you connect via "strace" to such a process, you won't see anything and
>>> you cannot detach it again.
>>>
>>> there are only two possibilities:
>>> killing the glusterfs process (umount GFS share) or rebooting the server.
>>>
>>> the only log entry I found was on one client - just a single line:
>>> [2011-06-06 10:44:18.593211] I [afr-common.c:581:afr_lookup_collect_xattr]
>>> 0-office-data-replicate-0: data self-heal is pending for
>>> /pc-partnerbet-public/Promotionaktionen/Mailakquise_2009/Webmaster_2010/HTML/bilder/Thumbs.db.
>>>
>>> one of the client-servers is a samba-server, the other one a backup-server
>>> based on rsync with millions of small files.
>>>
>>> gfs-servers + gfs-clients: SLES11 x86_64, glusterfs V 3.2.0
>>>
>>> and here are the configs from server and client:
>>> server config
>>> "/etc/glusterd/vols/office-data/office-data.gfs-01-01.GFS-office-data02.vol":
>>> volume office-data-posix
>>>   type storage/posix
>>>   option directory /GFS/office-data02
>>> end-volume
Re: [Gluster-users] gluster 3.2.0 - totally broken?
Ah, well, I was actually looking around to see if with 3.2 there was a command to set the performance options... that answers the question. Either way, though, I think setting the threads will help a lot.

Justice London

-----Original Message-----
From: Burnash, James [mailto:jburn...@knight.com]
Sent: Wednesday, May 18, 2011 12:57 PM
To: 'Justice London'; 'Tomasz Chmielewski'; 'Anthony J. Biacco'
Cc: gluster-users@gluster.org
Subject: RE: [Gluster-users] gluster 3.2.0 - totally broken?

I believe that it is more consistent and repeatable to just use the gluster command to set this. Example from this page:
http://www.gluster.com/community/documentation/index.php/Gluster_3.1:_Setting_Volume_Options

gluster volume set VOLNAME performance.io-thread-count 64

This also means that your changes will persist across any other "gluster volume set" commands. Generally speaking, hand-editing the volume config files is a bad idea, IMHO.

James

-----Original Message-----
From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of Justice London
Sent: Wednesday, May 18, 2011 3:49 PM
To: 'Tomasz Chmielewski'; 'Anthony J. Biacco'
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] gluster 3.2.0 - totally broken?

Whoops, and forgot the threads edit for the brick instance config:

volume <volname>-io-threads
  type performance/io-threads
  option thread-count 64
  subvolumes <volname>-locks
end-volume

Justice London
Systems Administrator
phone 800-397-3743 ext. 7005
fax 760-510-0299
web www.lawinfo.com
e-mail jlon...@lawinfo.com

PLEASE NOTE: This message, including any attachments, may include privileged, confidential and/or inside information. Any distribution or use of this communication by anyone other than the intended recipient(s) is strictly prohibited and may be unlawful. If you are not the intended recipient, please notify the sender by replying to this message and then delete it from your system.
-----Original Message-----
From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of Tomasz Chmielewski
Sent: Wednesday, May 18, 2011 10:05 AM
To: Anthony J. Biacco
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] gluster 3.2.0 - totally broken?

On 18.05.2011 18:56, Anthony J. Biacco wrote:
> I'm using it in real-world production, lots of small files (apache
> webroot mounts mostly). I've seen a bunch of split-brain and self-heal
> failures when I first did the switch. After I removed and recreated the
> dirs it seemed to be fine for about a week now; yeah, not long, I know.
>
> I 2nd the notion that it'd be nice to see a list of what files/dirs
> gluster thinks are out of sync or can't heal. Right now you gotta go
> diving into the logs.
>
> I'm actually thinking of downgrading to 3.1.3 from 3.2.0. Wonder if
> I'd have any ill effects on the volume with a simple rpm downgrade and
> daemon restart.

I've been using 3.2.0 for a while, but I had a problem with userspace programs "hanging" on accessing some files on the gluster mount (described here on the list). I downgraded to 3.1.4 (remove 3.2.0 rpm and config, install 3.1.4 rpm, add nodes) and it works fine for me.

3.0.x was also crashing for me when an SSHFS-like mount was used to the server with the gluster mount (and reads/writes were made from the gluster mount through it).

--
Tomasz Chmielewski
http://wpkg.org
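James's point about persistence can be verified directly from the CLI; a sketch, assuming a hypothetical volume named sitestore:

```
# Set the option through glusterd so it survives config regeneration...
gluster volume set sitestore performance.io-thread-count 64

# ...then confirm it is recorded in the volume's reconfigured options.
gluster volume info sitestore
```

Options set this way should appear under the "Options Reconfigured" section of the volume info output, whereas hand edits to the volfiles can be silently overwritten the next time glusterd rewrites them.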
Re: [Gluster-users] gluster 3.2.0 - totally broken?
Whoops, and forgot the threads edit for the brick instance config:

volume <volname>-io-threads
  type performance/io-threads
  option thread-count 64
  subvolumes <volname>-locks
end-volume

Justice London
Systems Administrator
phone 800-397-3743 ext. 7005
fax 760-510-0299
web www.lawinfo.com
e-mail jlon...@lawinfo.com

-----Original Message-----
From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of Tomasz Chmielewski
Sent: Wednesday, May 18, 2011 10:05 AM
To: Anthony J. Biacco
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] gluster 3.2.0 - totally broken?

On 18.05.2011 18:56, Anthony J. Biacco wrote:
> I'm using it in real-world production, lots of small files (apache
> webroot mounts mostly). I've seen a bunch of split-brain and self-heal
> failures when I first did the switch. After I removed and recreated the
> dirs it seemed to be fine for about a week now; yeah, not long, I know.
>
> I 2nd the notion that it'd be nice to see a list of what files/dirs
> gluster thinks are out of sync or can't heal. Right now you gotta go
> diving into the logs.
>
> I'm actually thinking of downgrading to 3.1.3 from 3.2.0. Wonder if
> I'd have any ill effects on the volume with a simple rpm downgrade and
> daemon restart.

I've been using 3.2.0 for a while, but I had a problem with userspace programs "hanging" on accessing some files on the gluster mount (described here on the list). I downgraded to 3.1.4 (remove 3.2.0 rpm and config, install 3.1.4 rpm, add nodes) and it works fine for me.

3.0.x was also crashing for me when an SSHFS-like mount was used to the server with the gluster mount (and reads/writes were made from the gluster mount through it).

--
Tomasz Chmielewski
http://wpkg.org
Re: [Gluster-users] gluster 3.2.0 - totally broken?
I had issues with hanging of mounts as well with 3.2. I fixed it via upping the number of connections allowed to the fuse mount... by default it's something silly low like 16. I don't quite understand why the basic options are not at least slightly optimized for more than a few connections. Try this and see if it helps with 3.2. It helped me, but this also works as a 3.1.x fix...

Edit /etc/glusterd/vols/<volname>/<volname>-fuse.vol (the volume in this example is named sitestore):

volume sitestore-write-behind
  type performance/write-behind
  option cache-size 4MB
  option flush-behind on
  subvolumes sitestore-replicate-0
end-volume

volume sitestore-read-ahead
  type performance/read-ahead
  option page-count 4
  subvolumes sitestore-write-behind
end-volume

volume sitestore-io-cache
  type performance/io-cache
  option cache-size 768MB
  option cache-timeout 1
  subvolumes sitestore-read-ahead
end-volume

volume sitestore-quick-read
  type performance/quick-read
  option cache-timeout 1
  option cache-size 768MB
  option max-file-size 64kB
  subvolumes sitestore-io-cache
end-volume

#volume sitestore-stat-prefetch
#  type performance/stat-prefetch
#  subvolumes sitestore-quick-read
#end-volume

volume sitestore
  type debug/io-stats
  option latency-measurement off
  option count-fop-hits off
  #subvolumes sitestore-stat-prefetch
  subvolumes sitestore-quick-read
end-volume

Justice London

-----Original Message-----
From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of Tomasz Chmielewski
Sent: Wednesday, May 18, 2011 10:05 AM
To: Anthony J. Biacco
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] gluster 3.2.0 - totally broken?

On 18.05.2011 18:56, Anthony J. Biacco wrote:
> I'm using it in real-world production, lots of small files (apache
> webroot mounts mostly). I've seen a bunch of split-brain and self-heal
> failures when I first did the switch. After I removed and recreated the
> dirs it seemed to be fine for about a week now; yeah, not long, I know.
>
> I 2nd the notion that it'd be nice to see a list of what files/dirs
> gluster thinks are out of sync or can't heal. Right now you gotta go
> diving into the logs.
>
> I'm actually thinking of downgrading to 3.1.3 from 3.2.0. Wonder if
> I'd have any ill effects on the volume with a simple rpm downgrade and
> daemon restart.

I've been using 3.2.0 for a while, but I had a problem with userspace programs "hanging" on accessing some files on the gluster mount (described here on the list). I downgraded to 3.1.4 (remove 3.2.0 rpm and config, install 3.1.4 rpm, add nodes) and it works fine for me.

3.0.x was also crashing for me when an SSHFS-like mount was used to the server with the gluster mount (and reads/writes were made from the gluster mount through it).

--
Tomasz Chmielewski
http://wpkg.org
[Gluster-users] Gluster 3.2 optimization options?
Hey, are there any optimization options for 3.2 like there were for the 3.1 versions? I specifically need to have more client connections and server connections. Without these, 3.1 would lock up all the time, as I was using it for web image storage. After allowing more than the stock number of connections it ran like a champ. I have been having issues with 3.2 and I believe it is because the default configs are not optimized. Please let me know!

Justice London
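For anyone following the thread, the tunables in question can at least be inspected and adjusted through the 3.x CLI without volfile surgery; a sketch, assuming a hypothetical volume named images (option names and defaults vary by release, so verify against your version's documentation):

```
# List the volume's current configuration, including any
# options that have been reconfigured from their defaults.
gluster volume info images

# Raise the server-side io-threads worker count, one of the
# settings the 3.1-era volfile edits used to change by hand.
gluster volume set images performance.io-thread-count 64
```

This keeps the change inside glusterd's own configuration, so it survives daemon restarts and later "gluster volume set" calls.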
Re: [Gluster-users] Self heal with VM Storage
After the self heal finishes, it sort of works. Usually this destroys InnoDB if you're running a database. Most often, though, it also causes some libraries and similar to not be properly read in by the VM guest, which means you have to reboot it to fix this.

It should be fairly easy to reproduce... just shut down a storage brick (any configuration... it doesn't seem to matter). Make sure, of course, that you have a running VM guest (KVM, etc.) using the gluster mount. You'll then turn off (unplug, etc.) one of the storage bricks and wait a few minutes... then re-enable it.

Justice London
jlon...@lawinfo.com

-----Original Message-----
From: Tejas N. Bhise [mailto:te...@gluster.com]
Sent: Thursday, April 15, 2010 7:41 PM
To: Justice London
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] Self heal with VM Storage

Justice,

Thanks for the description. So, does this mean that after the self heal is over after some time, the guest starts to work fine? We will reproduce this in-house and get back.

Regards,
Tejas.

----- Original Message -----
From: "Justice London"
To: "Tejas N. Bhise"
Cc: gluster-users@gluster.org
Sent: Friday, April 16, 2010 1:18:36 AM
Subject: RE: [Gluster-users] Self heal with VM Storage

Okay, but what happens on a brick shutting down and being added back to the cluster? This would be after some live data has been written to the other bricks. From what I was seeing, access to the file is locked. Is this not the case? If file access is being locked it will obviously cause issues for anything trying to read/write to the guest at the time.

Justice London
jlon...@lawinfo.com

-----Original Message-----
From: Tejas N. Bhise [mailto:te...@gluster.com]
Sent: Thursday, April 15, 2010 12:33 PM
To: Justice London
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] Self heal with VM Storage

Justice,

From posts from the community on this user list, I know that there are folks that run hundreds of VMs out of gluster. So it's probably more about the data usage than just the generic viability statement you made in your post. Gluster does not support databases, though many people use them on gluster without much problem. Please let me know if you see some problem with unstructured file data on VMs. I would be happy to help debug that problem.

Regards,
Tejas.

----- Original Message -----
From: "Justice London"
To: gluster-users@gluster.org
Sent: Friday, April 16, 2010 12:52:19 AM
Subject: [Gluster-users] Self heal with VM Storage

I am running gluster as a storage backend for VM storage (KVM guests). If one of the bricks is taken offline (even for an instant), on bringing it back up it runs the metadata check. This causes the guest both to stop responding until the check finishes and to ruin data that was in process (sql data, for instance). I'm guessing the file is being locked while checked. Is there any way to fix this? Without being able to fix this, I'm not certain how viable gluster will be, or can be, for VM storage.

Justice London
jlon...@lawinfo.com
Re: [Gluster-users] Self heal with VM Storage
Okay, but what happens on a brick shutting down and being added back to the cluster? This would be after some live data has been written to the other bricks. From what I was seeing, access to the file is locked. Is this not the case? If file access is being locked it will obviously cause issues for anything trying to read/write to the guest at the time. Justice London jlon...@lawinfo.com -----Original Message----- From: Tejas N. Bhise [mailto:te...@gluster.com] Sent: Thursday, April 15, 2010 12:33 PM To: Justice London Cc: gluster-users@gluster.org Subject: Re: [Gluster-users] Self heal with VM Storage Justice, From posts from the community on this user list, I know that there are folks that run hundreds of VMs out of gluster. So it's probably more about the data usage than just a generic viability statement as you made in your post. Gluster does not support databases, though many people use them on gluster without much problem. Please let me know if you see some problem with unstructured file data on VMs. I would be happy to help debug that problem. Regards, Tejas. ----- Original Message ----- From: "Justice London" To: gluster-users@gluster.org Sent: Friday, April 16, 2010 12:52:19 AM Subject: [Gluster-users] Self heal with VM Storage I am running gluster as a storage backend for VM storage (KVM guests). If one of the bricks is taken offline (even for an instant), on bringing it back up it runs the metadata check. This causes the guest both to stop responding until the check finishes and to ruin data that was in flight (SQL data, for instance). I'm guessing the file is being locked while checked. Is there any way to fix this? Without a fix, I'm not certain how viable gluster will be, or can be, for VM storage.
Justice London jlon...@lawinfo.com
[Gluster-users] Self heal with VM Storage
I am running gluster as a storage backend for VM storage (KVM guests). If one of the bricks is taken offline (even for an instant), on bringing it back up it runs the metadata check. This causes the guest both to stop responding until the check finishes and to ruin data that was in flight (SQL data, for instance). I'm guessing the file is being locked while checked. Is there any way to fix this? Without a fix, I'm not certain how viable gluster will be, or can be, for VM storage. Justice London jlon...@lawinfo.com
Re: [Gluster-users] Announcement: Alpha Release of Native NFS for GlusterFS
I did indeed start with 'glusterfsd' for the NFS mount! Justice London jlon...@lawinfo.com On Sat, 2010-03-20 at 13:25 -0600, Tejas N. Bhise wrote: > Justice, > > A quick question - did you start nfs with 'glusterfs' or 'glusterfsd'? If you > used 'glusterfs', please retry with 'glusterfsd' and let me know the results. > > We have mentioned in the release notes to use 'glusterfsd', but I saw a > couple of other users face a problem because they used 'glusterfs' (probably > because of the way unfsd or knfs was used). > > Regards, > Tejas. > > ----- Original Message ----- > From: "Justice London" > To: "Tejas N. Bhise" > Cc: gluster-users@gluster.org, gluster-de...@nongnu.org, nfs-al...@gluster.com > Sent: Saturday, March 20, 2010 2:39:27 AM GMT +05:30 Chennai, Kolkata, > Mumbai, New Delhi > Subject: Re: [Gluster-users] Announcement: Alpha Release of Native NFS for > GlusterFS > > I'm sorry... I don't know how you guys tested this, but using a > bare-bones configuration with the NFS translator and a mirror > configuration between two systems (no performance translators, etc.) I > can lock up the entire system after writing 160-180 MB of data. > > Basically: > dd if=/dev/full of=testfile bs=1M count=1000 is enough to lock up the > entire machine. > > This is on a CentOS 5.4 system with a Xen backend (for testing). > > I don't know what you guys tested with, but I can't get this stable... > at all. > > Justice London > jlon...@lawinfo.com > On Thu, 2010-03-18 at 10:36 -0600, Tejas N. Bhise wrote: > > Dear Community Users, > > > > Gluster is happy to announce the ALPHA release of the native NFS Server. > > The native NFS server is implemented as an NFS Translator and hence > > integrates the NFS protocol on one side with the GlusterFS protocol > > on the other.
> > > > This is an important step in our strategy to extend the benefits of > > Gluster to other operating systems, which can benefit from a better > > NFS-based data service, while enjoying all the backend smarts that Gluster > > provides. > > > > The new NFS Server also strongly supports our efforts towards > > becoming a virtualization storage of choice. > > > > The release notes of the NFS ALPHA Release are available at - > > > > http://ftp.gluster.com/pub/gluster/glusterfs/qa-releases/nfs-alpha/GlusterFS_NFS_Alpha_Release_Notes.pdf > > > > The Release notes describe where RPMs and source code can be obtained > > and where bugs found in this ALPHA release can be filed. Some examples > > of usage are also provided. > > > > Please be aware that this is an ALPHA release and in no way should it be > > used in production. Gluster is not responsible for any loss of data > > or service resulting from the use of this ALPHA NFS Release. > > > > Feel free to send feedback, comments and questions to: nfs-al...@gluster.com > > > > Regards, > > Tejas Bhise.
Re: [Gluster-users] Announcement: Alpha Release of Native NFS for GlusterFS
I'm sorry... I don't know how you guys tested this, but using a bare-bones configuration with the NFS translator and a mirror configuration between two systems (no performance translators, etc.) I can lock up the entire system after writing 160-180 MB of data. Basically: dd if=/dev/full of=testfile bs=1M count=1000 is enough to lock up the entire machine. This is on a CentOS 5.4 system with a Xen backend (for testing). I don't know what you guys tested with, but I can't get this stable... at all. Justice London jlon...@lawinfo.com On Thu, 2010-03-18 at 10:36 -0600, Tejas N. Bhise wrote: > Dear Community Users, > > Gluster is happy to announce the ALPHA release of the native NFS Server. > The native NFS server is implemented as an NFS Translator and hence > integrates the NFS protocol on one side with the GlusterFS protocol > on the other. > > This is an important step in our strategy to extend the benefits of > Gluster to other operating systems, which can benefit from a better > NFS-based data service, while enjoying all the backend smarts that Gluster > provides. > > The new NFS Server also strongly supports our efforts towards > becoming a virtualization storage of choice. > > The release notes of the NFS ALPHA Release are available at - > > http://ftp.gluster.com/pub/gluster/glusterfs/qa-releases/nfs-alpha/GlusterFS_NFS_Alpha_Release_Notes.pdf > > The Release notes describe where RPMs and source code can be obtained > and where bugs found in this ALPHA release can be filed. Some examples > of usage are also provided. > > Please be aware that this is an ALPHA release and in no way should it be > used in production. Gluster is not responsible for any loss of data > or service resulting from the use of this ALPHA NFS Release. > > Feel free to send feedback, comments and questions to: nfs-al...@gluster.com > > Regards, > Tejas Bhise.
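The lockup report above boils down to a plain sequential write through the mount. A minimal sketch of that smoke test, assuming a local default path (point TARGET at the GlusterFS or NFS mount under test); note that reading /dev/full returns zeros on Linux, so /dev/zero is equivalent here:

```shell
#!/bin/sh
# Sequential-write smoke test; TARGET defaults to a hypothetical local path.
TARGET="${TARGET:-/tmp/gluster_write_test}"
# 10 MiB of zeros written in 1 MiB blocks, mirroring the reported dd command.
dd if=/dev/zero of="$TARGET" bs=1M count=10 2>/dev/null
# Confirm the expected byte count landed on disk.
wc -c "$TARGET"
```

On a healthy mount this completes in well under a second; on the setup described above, writes reportedly stalled after 160-180 MB.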
Re: [Gluster-users] mmap() shared read/write support - exists with FUSE 2.8.1 and GlusterFS 2.0.7 on RHEL5 kernel 2.6.18-164?
Older versions of the fuse module (like the one in the kernel you just mentioned) do not support mmap; anything with kernel 2.6.27 or newer fully supports it. Justice London jlon...@lawinfo.com -----Original Message----- From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of Max Sent: Wednesday, October 14, 2009 12:15 PM To: gluster-users@gluster.org Subject: [Gluster-users] mmap() shared read/write support - exists with FUSE 2.8.1 and GlusterFS 2.0.7 on RHEL5 kernel 2.6.18-164? I have been doing a lot of reading and re-learning about FUSE over the last week as I have been setting up GlusterFS in our dev environment. I cannot seem to get mmap() support to work. This is my setup: * RHEL5 / CentOS5 * FUSE 2.8.1 * GlusterFS 2.0.7 * RHEL5 kernel 2.6.18-164 (RHEL5.3, CentOS 5.3) * fuse.ko built from kernel source I continually get this when trying to use rrdtool to create or update RRD files on a shared front end: ERROR: mmaping file '/data/nagios/pnp/db/nagios-report/Test_2_-_192.168.79.132.rrd': No such device I can create and replicate files through scripts or vi/vim just fine on my two-front-end, two-back-end cluster. Any ideas of what I might be doing wrong, or is this just a configuration / version mix that plain does not support mmap shared write access? Any pointers / tips / advice appreciated. Thanks, Max
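The "No such device" (ENODEV) error above is exactly what a shared writable mmap() returns on a filesystem that lacks mmap support. A quick way to probe a mount for it is a sketch like the following, assuming python3 is available; the default path is hypothetical (pass a path on the mount under test as the first argument):

```shell
#!/bin/sh
# Probe whether a filesystem supports shared writable mmap().
TESTFILE="${1:-/tmp/mmap_test_file}"
# Create a one-page file to map.
dd if=/dev/zero of="$TESTFILE" bs=4096 count=1 2>/dev/null
python3 - "$TESTFILE" <<'EOF'
import mmap, sys
with open(sys.argv[1], "r+b") as f:
    # On mounts without mmap support this raises OSError (ENODEV),
    # matching rrdtool's "No such device" failure.
    m = mmap.mmap(f.fileno(), 4096)
    m[0:4] = b"test"   # shared writable mapping: the write hits the file
    m.close()
print("shared mmap OK")
EOF
```

Run it against the FUSE mount; if it prints the error instead of "shared mmap OK", the kernel/FUSE combination does not support shared mmap.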
[Gluster-users] Mount issue with CentOS kernel 2.6.18-164
I am unable to mount using the new built-in fuse module with the newest CentOS/RHEL kernel:

[2009-10-02 14:25:18] D [glusterfsd.c:1290:main] glusterfs: running in pid 3445
[2009-10-02 14:25:18] D [client-protocol.c:6127:init] vmclient1: defaulting frame-timeout to 30mins
[2009-10-02 14:25:18] D [client-protocol.c:6138:init] vmclient1: defaulting ping-timeout to 10
[2009-10-02 14:25:18] D [transport.c:141:transport_load] transport: attempt to load file /usr/lib64/glusterfs/2.0.7/transport/socket.so
[2009-10-02 14:25:18] D [transport.c:141:transport_load] transport: attempt to load file /usr/lib64/glusterfs/2.0.7/transport/socket.so
[2009-10-02 14:25:18] D [client-protocol.c:6127:init] vmclient2: defaulting frame-timeout to 30mins
[2009-10-02 14:25:18] D [client-protocol.c:6138:init] vmclient2: defaulting ping-timeout to 10
[2009-10-02 14:25:18] D [transport.c:141:transport_load] transport: attempt to load file /usr/lib64/glusterfs/2.0.7/transport/socket.so
[2009-10-02 14:25:18] D [transport.c:141:transport_load] transport: attempt to load file /usr/lib64/glusterfs/2.0.7/transport/socket.so
[2009-10-02 14:25:18] D [io-threads.c:2280:init] threads: io-threads: Autoscaling: off, min_threads: 16, max_threads: 16
[2009-10-02 14:25:18] D [read-ahead.c:824:init] readahead: Using conf->page_count = 8
[2009-10-02 14:25:18] D [write-behind.c:1980:init] writeback: disabling write-behind for first 1 bytes
[2009-10-02 14:25:18] D [write-behind.c:2031:init] writeback: enabling flush-behind
[2009-10-02 14:25:18] D [fuse-bridge.c:2825:init] glusterfs-fuse: fuse_lowlevel_new() failed with error Success on mount point /mnt/vmstoreexport
[2009-10-02 14:25:18] E [xlator.c:736:xlator_init_rec] xlator: Initialization of volume 'fuse' failed, review your volfile again
[2009-10-02 14:25:18] E [glusterfsd.c:599:_xlator_graph_init] glusterfs: initializing translator failed
[2009-10-02 14:25:18] E [glusterfsd.c:1302:main] glusterfs: translator initialization failed. Exiting

Kind of odd that 'Success' counts as an error :P Justice London jlon...@lawinfo.com
Re: [Gluster-users] Slow unfs3 + booster
I have tried both the 2.0.6 and 2.0.7rc3 releases. I also tried the git repo version, but there is some sort of an issue with lockups with it right now, so I'm just using the release version. I am not modifying the wsize value on this particular mount, as unfortunately the way the server works does not allow setting options. I don't expect it to run as fast as possible like that, but it should still get more than kilobytes per second... megabytes or more are what I expect to see. The servers are currently at the default slot entry numbers. I tried increasing them to 128, which is what the NFS optimization guides out there recommend, and it only very slightly helps with the write speed. I do have write-behind enabled. I just tried disabling it to see if that made any difference. It went from ~600KB/sec write speed to ~800KB/sec. Justice London jlon...@lawinfo.com On Fri, 2009-09-18 at 09:29 +0530, Shehjar Tikoo wrote: > Justice London wrote: > > I am having issues with slow writes to a gluster replicated setup using > > booster, on the order of about 800KB/sec. When writing a file to a fuse > > mounted version of the same filesystem I am able to write, of course, at many, > > many times that speed. Has anyone gotten this to successfully work at this > > point? If so, with what changes or configs? I have tried both the standard > > and modified versions of unfs3. > > > > > > Do you have write-behind in the booster volfile? > > Which release of GlusterFS? > > With what wsize value are you mounting the NFS server at the client? > The preferable size is 64KiB, but I think the Linux client default is 32KiB. > > Use -o wsize=65536 with the mount command to increase this value. > > Please tell me the output of: > $ cat /proc/sys/sunrpc/tcp_slot_table_entries > > The default is 16 on Linux. Try increasing this to say 64 first, and > then to 128. Do tell me if this increases the throughput.
> > -Shehjar > > > > > Justice London > > E-mail: jlon...@lawinfo.com
[Gluster-users] Slow unfs3 + booster
I am having issues with slow writes to a gluster replicated setup using booster, on the order of about 800KB/sec. When writing a file to a fuse-mounted version of the same filesystem I am able to write, of course, at many, many times that speed. Has anyone gotten this to successfully work at this point? If so, with what changes or configs? I have tried both the standard and modified versions of unfs3. Justice London E-mail: jlon...@lawinfo.com
Re: [Gluster-users] The continuing story ...
I have had plenty of instances, for many different reasons, where apache (a userspace program) has brought the entire system down. The kernel does not seem to be very good about telling apache 'no' at this point. Whether gluster is in the same boat I couldn't say for sure, but I'm certainly dealing with my own gluster lockup issues. Justice London E-mail: jlon...@lawinfo.com -----Original Message----- From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of Mark Mielke Sent: Thursday, September 10, 2009 12:09 PM To: David Saez Padros Cc: Anand Avati; gluster-users@gluster.org Subject: Re: [Gluster-users] The continuing story ... On 09/10/2009 09:38 AM, David Saez Padros wrote: > >> In particular, if you read about the intent of FUSE - the technology >> being used to create a file system, I think you will find that what >> Anand is saying is the *exact* purpose for this project. > > the lockups are on the server side, not the client side, and fuse is > not used on the server side I think there is Stephan's problem and your problem, and I'm losing track over which one is being discussed. Sorry. :-) Server side, pure user space, with hardware locking up, or the kernel not being able to use a hardware resource - is a kernel problem. Yes, user space can trigger it - for example, by opening so many sockets and other such kernel resources as to fill low memory - but as we found out recently, this is where the kernel is supposed to come in and kick the user program out with an out-of-memory killer, or not grant the resources in the first place. As it is - do we have evidence that GlusterFS is using up a large number of file descriptors, sockets, processes, virtual memory, or other kernel resources? It seems to me that the failure in the case with the logs was the kernel finding the CPU not waking up for a long period of time?
I'm not saying ignore GlusterFS in your evaluation - but I am saying that if you truly want a resolution, you really should consider trying the Linux developers and seeing what they think. If they say this is a GlusterFS-specific problem, I'm sure Anand and gluster.com would take a very serious second look at it. Until then - they gave it a shot, and don't have the ability to diagnose or fix your problem. You could say they are incompetent and uncaring about their users - but a more accurate statement would probably be that this is entirely out of their domain, they are unable to help you, and their professional recommendation and mine is to contact RedHat if you have a subscription, or if you do not, try the Linux developers. I have no doubt at all that user space programs can hurt the kernel - but in every situation I can think of, the problem is really a *kernel* problem. The user space is just discovering the problem - which is unfortunate - but honestly, shit happens. We recently dealt with load builds failing due to the out-of-memory issue I referenced above, as a 32-bit Linux kernel doesn't work very well with 32 GBytes of RAM. Another problem we dealt with was Subversion mod_dav_fs quickly consuming all virtual memory in the machine, eventually leading to machine failure. For the Subversion issue - mod_dav_fs or something it uses should not be continually consuming more memory - so they have a bug - but the kernel *also* has a bug, because it should not allow httpd to bring the machine to a halt due to exhausted virtual memory. In the Subversion case, it's low on our priority list to solve, since we can work around it by having Apache recycle the process space more frequently and avoid the symptoms - but we should be taking this to both the Subversion developers at Collab.net *and* the Linux kernel developers.
(I know what the Linux kernel developers will say, though - a 32-bit kernel was not designed for 32 GBytes of RAM, upgrade to a 64-bit kernel - but we have an RHEL subscription, so perhaps we could take it that route...) Cheers, mark -- Mark Mielke
Re: [Gluster-users] Replication not working on server hang
Just wanted to chime in that the EXACT same issue has occurred for me. I was going to work through the support chain, but given that others are seeing it and hopefully have logs, perhaps I don't need to do so. Basically, I hope it can be fixed! Justice London E-mail: jlon...@lawinfo.com -----Original Message----- From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of Stephan von Krawczynski Sent: Friday, August 28, 2009 4:33 AM To: David Saez Padros Cc: Anand Avati; gluster-users Subject: Re: [Gluster-users] Replication not working on server hang > [...] > The glusterfs log only shows lines like these: > > [2009-08-28 09:19:28] E [client-protocol.c:292:call_bail] data2: bailing > out frame LOOKUP(32) frame sent = 2009-08-28 08:49:18. frame-timeout = 1800 > [2009-08-28 09:23:38] E [client-protocol.c:292:call_bail] data2: bailing > out frame LOOKUP(32) frame sent = 2009-08-28 08:53:28. frame-timeout = 1800 > > Once server2 has been rebooted all gluster filesystems become available > again on all clients and the hung df and ls processes terminate, > but it's difficult to understand why a replicated share that should survive > a failure on one server does not. You are suffering from the problem we talked about a few days ago on the list. If your local fs somehow produces a deadlock on one server, glusterfs is currently unable to cope with the situation and just _waits_ for things to come. This deadlocks your clients too, without any need. Your experience backs my criticism of the handling of these situations. -- Regards, Stephan
Re: [Gluster-users] 'Primary' brick outage or reboot issues
Well, I'm wondering now if this might all be fixed with the rc4 release that was just posted. What kind of lockup issues did that fix? Basically, I was able to replicate an issue by bringing down the first storage brick, where apache sessions would stall and bring the system load to 100+. This same issue was occurring for no apparent reason on the cluster and I wasn't able to determine a root cause. Justice London E-mail: jlon...@lawinfo.com -----Original Message----- From: Vikas Gorur [mailto:vi...@gluster.com] Sent: Friday, August 07, 2009 2:26 AM To: Justice London Cc: gluster-users@gluster.org Subject: Re: [Gluster-users] 'Primary' brick outage or reboot issues - "Justice London" wrote: > It appears that if the first brick in a replicated/distributed > configuration is rebooted or suffers some sort of a temporary issue, it both means > that the system doesn't appear to be dropped after 10 seconds from the > cluster and also that after it comes back up, pending transactions have issues > for the next 10 minutes or so. Is this a locks issue or is this a bug? If the first subvolume silently goes down (without resetting the connection) then an 'ls' will hang for 10 seconds (this is the "ping-pong" timeout) because replicate will not notice until then that the server has failed. Other operations should work fine, though. Can you elaborate on what you mean by 'pending transactions' and what kind of issues they face? Vikas -- Engineer - http://gluster.com/
[Gluster-users] 'Primary' brick outage or reboot issues
It appears that if the first brick in a replicated/distributed configuration is rebooted or suffers some sort of temporary issue, the system doesn't appear to be dropped from the cluster after 10 seconds, and after it comes back up, pending transactions have issues for the next 10 minutes or so. Is this a locks issue or is this a bug? Justice London E-mail: jlon...@lawinfo.com
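The 10-second failover window discussed in this thread corresponds to the client translator's ping timeout (the debug logs elsewhere in this archive show clients "defaulting ping-timeout to 10"). A hedged sketch of where that option would sit in a client volfile; the volume name and address are hypothetical:

```
volume remotebrick1
  type protocol/client
  option transport-type tcp
  option remote-host 192.168.1.35   # hypothetical server address
  option ping-timeout 10            # seconds before a silent server is considered down
  option remote-subvolume brick
end-volume
```

Lowering this value shortens the hang when a server dies silently, at the cost of more false positives on a congested network.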
Re: [Gluster-users] io-threads/io-cache
I think it's an error with the info that was entered. Hopefully one of the admins sees this and corrects it. Justice London Systems Administrator E-mail: jlon...@lawinfo.com Website: www.lawinfo.com Office: 800-397-3743 x105 | Fax: 800-220-4546 From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of Hiren Joshi Sent: Wednesday, July 15, 2009 7:24 AM To: gluster-users@gluster.org Subject: [Gluster-users] io-threads/io-cache Hi All, http://www.gluster.org/docs/index.php/Translators/performance Can anyone tell me the difference between the two? It has the same description. Thanks, Josh.
[Gluster-users] FS mount lockup on server reboot
When I reboot any of the servers in a four-server configuration (replicated to server1, server3 and server2, server4, and then distributed across both), the filesystem mount appears to have trouble. It becomes unreadable at that point until the server comes back up. Is there some way or setting to prevent this? Justice London E-mail: jlon...@lawinfo.com
Re: [Gluster-users] Mounting GlusterFS volume but with the fsid=10 option for FUSE and NFS
The fsid goes in your exports file as an addition to the export options (like rw,fsid=10). Justice London E-mail: jlon...@lawinfo.com From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of John Simmonds Sent: Friday, July 10, 2009 11:43 AM To: gluster-users@gluster.org Subject: [Gluster-users] Mounting GlusterFS volume but with the fsid=10 option for FUSE and NFS Hi, I want to mount a glusterfs volume on a client machine, then have that client machine export it via NFS. The problem is that NFS can't export a FUSE filesystem unless it is mounted with the fsid=10 option. I am using Gluster 2.0.1 with fuse 2.7.4 and kernel 2.6.25 on a Gentoo distro. I have tried "mount -t glusterfs -o fsid=10 /etc/glusterfs/glusterfsd.vol /mnt/gluster" but no luck. When I run mount it doesn't show the fsid=10 option. I also tried adding "/etc/glusterfs/glusterfsd.vol /mnt/gluster glusterfs fsid=10 0 0" to fstab, but it still drops the fsid=10 option. Any ideas? Regards, John Simmonds
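In other words, with knfsd the fsid is assigned per export line in /etc/exports rather than as a mount option on the FUSE mount. A minimal sketch, with a hypothetical mount path and client network:

```
/mnt/gluster 192.168.1.0/24(rw,fsid=10,no_subtree_check)
```

After editing, `exportfs -ra` reloads the export table; without an explicit fsid, knfsd cannot derive a stable filesystem identifier for a FUSE mount and refuses to export it.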
Re: [Gluster-users] Gluster (2.0.1 -> git) with fuse 2.8 crashes NFS
It is fuse 2.8.0-pre3 that I tried with the patch you mentioned. When I tested I used 1M block sizes for a count of 100 and got 1.4MB/s when that was done over NFS. That was using both standard NFS with direct-io disabled and unfs3 with direct-io enabled. The same test of 1M blocks for a count of 100 gave 45MB/s on the same filesystem, but over the local gluster mount instead of NFS. When using NFS to one of the same machines, but with a local-disk mount, I get around 50MB/s. Justice London E-mail: jlon...@lawinfo.com From: harshavardhanac...@gmail.com [mailto:harshavardhanac...@gmail.com] On Behalf Of Harshavardhana Sent: Thursday, July 09, 2009 9:22 PM To: Justice London Cc: Anand Avati; gluster-users Subject: Re: [Gluster-users] Gluster (2.0.1 -> git) with fuse 2.8 crashes NFS Justice, which libfuse version is being used with glusterfs? Just wanted to know what metrics you observed while testing - e.g. the block size in which writes/reads were measured. Regards -- Harshavardhana Z Research Inc http://www.zresearch.com/ On Thu, Jul 9, 2009 at 11:45 PM, Justice London wrote: Well, mostly it seems to be on the throughput. I haven't really measured for metadata improvements yet. Of note is that NFS is now working, but it appears to be EXTREMELY slow. I was only able to manage about 1-2MB/s. Justice London E-mail: jlon...@lawinfo.com -----Original Message----- From: anand.av...@gmail.com [mailto:anand.av...@gmail.com] On Behalf Of Anand Avati Sent: Tuesday, July 07, 2009 4:02 PM To: Justice London Cc: gluster-users; Harshavardhana Subject: Re: [Gluster-users] Gluster (2.0.1 -> git) with fuse 2.8 crashes NFS > The 2.0.3 release of gluster appears so far to have fixed the crash issue I > was experiencing. What was the specific patch that fixed it, I was > wondering? It was http://patches.gluster.com/patch/664/. A less ugly fix is lined up for 2.1 > Great job either way!
It appears that with fuse 2.8 and newer kernels, > gluster absolutely flies. With a replication environment between two crummy > testbed machines it's probably about twice as fast as with 2.7.4-based fuse! Just curious, are the observed performance improvements in terms of IO throughput or metadata latency? Avati
Re: [Gluster-users] Gluster (2.0.1 -> git) with fuse 2.8 crashes NFS
Well, mostly it seems to be on the throughput. I haven't really measured for metadata improvements yet. Of note is that NFS is now working, but it appears to be EXTREMELY slow. I was only able to manage about 1-2MB/s. Justice London E-mail: jlon...@lawinfo.com -----Original Message----- From: anand.av...@gmail.com [mailto:anand.av...@gmail.com] On Behalf Of Anand Avati Sent: Tuesday, July 07, 2009 4:02 PM To: Justice London Cc: gluster-users; Harshavardhana Subject: Re: [Gluster-users] Gluster (2.0.1 -> git) with fuse 2.8 crashes NFS > The 2.0.3 release of gluster appears so far to have fixed the crash issue I > was experiencing. What was the specific patch that fixed it, I was > wondering? It was http://patches.gluster.com/patch/664/. A less ugly fix is lined up for 2.1 > Great job either way! It appears that with fuse 2.8 and newer kernels, > gluster absolutely flies. With a replication environment between two crummy > testbed machines it's probably about twice as fast as with 2.7.4-based fuse! Just curious, are the observed performance improvements in terms of IO throughput or metadata latency? Avati
Re: [Gluster-users] Gluster (2.0.1 -> git) with fuse 2.8 crashes NFS
NFS was not crashing; the mountpoint (glusterfs) was. I just applied the patch and mounted the NFS share. It hasn't crashed yet! I'll give it a try and see what happens. Justice London jlon...@lawinfo.com -----Original Message----- From: anand.av...@gmail.com [mailto:anand.av...@gmail.com] On Behalf Of Anand Avati Sent: Tuesday, July 07, 2009 5:25 PM To: Justice London Cc: Raghavendra G; gluster-users; Harshavardhana Subject: Re: [Gluster-users] Gluster (2.0.1 -> git) with fuse 2.8 crashes NFS On Wed, Jul 8, 2009 at 5:32 AM, Justice London wrote: > Actually, I spoke too soon. NFS still crashes, even if the mountpoint > doesn't. Justice, 2.0.3 fixes issues with 2.8.0-pre2. fuse-2.8.0-pre3 needs one more fix (http://patches.gluster.com/patch/693/) which is lined up for the next release. Just curious, what do you mean by NFS still crashing even if the mountpoint doesn't? Are you running an unfs3 server on top of the fuse mountpoint, and the unfs3 server crashes? Avati
Re: [Gluster-users] Gluster (2.0.1 -> git) with fuse 2.8 crashes NFS
The 2.0.3 release of gluster appears so far to have fixed the crash issue I was experiencing. I was wondering, what was the specific patch that fixed it? Great job either way! It appears that with fuse 2.8 and newer kernels gluster absolutely flies. In a replication environment between two crummy testbed machines it's probably about twice as fast as fuse 2.7.4!

Justice London jlon...@lawinfo.com

_ From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of Justice London
Sent: Thursday, July 02, 2009 12:33 PM
To: 'Raghavendra G'
Cc: 'gluster-users'; 'Harshavardhana'
Subject: Re: [Gluster-users] Gluster (2.0.1 -> git) with fuse 2.8 crashes NFS

Sure:

Server:

### Export volume "brick" with the contents of "/home/export" directory.
volume posix
  type storage/posix                          # POSIX FS translator
  option directory /home/gluster/vmglustore   # Export this directory
  option background-unlink yes
end-volume

volume locks
  type features/posix-locks
  subvolumes posix
end-volume

volume brick
  type performance/io-threads
  option thread-count 32
# option autoscaling yes
# option min-threads 8
# option max-threads 200
  subvolumes locks
end-volume

### Add network serving capability to above brick.
volume brick-server
  type protocol/server
  option transport-type tcp
# option transport-type unix
# option transport-type ib-sdp
# option transport.socket.bind-address 192.168.1.10   # Default is to listen on all interfaces
# option transport.socket.listen-port 6996            # Default is 6996
# option transport-type ib-verbs
# option transport.ib-verbs.bind-address 192.168.1.10 # Default is to listen on all interfaces
# option transport.ib-verbs.listen-port 6996          # Default is 6996
# option transport.ib-verbs.work-request-send-size 131072
# option transport.ib-verbs.work-request-send-count 64
# option transport.ib-verbs.work-request-recv-size 131072
# option transport.ib-verbs.work-request-recv-count 64
  option client-volume-filename /etc/glusterfs/glusterfs.vol
  subvolumes brick
# NOTE: Access to any volume through protocol/server is denied by
# default. You need to explicitly grant access through the "auth" option.
  option auth.addr.brick.allow *   # Allow access to "brick" volume
end-volume

Client:

### Add client feature and attach to remote subvolume
volume remotebrick1
  type protocol/client
  option transport-type tcp
# option transport-type unix
# option transport-type ib-sdp
  option remote-host 192.168.1.35              # IP address of the remote brick
# option transport.socket.remote-port 6996     # default server port is 6996
# option transport-type ib-verbs
# option transport.ib-verbs.remote-port 6996   # default server port is 6996
# option transport.ib-verbs.work-request-send-size 1048576
# option transport.ib-verbs.work-request-send-count 16
# option transport.ib-verbs.work-request-recv-size 1048576
# option transport.ib-verbs.work-request-recv-count 16
# option transport-timeout 30   # seconds to wait for a reply from server for each request
  option remote-subvolume brick   # name of the remote volume
end-volume

volume remotebrick2
  type protocol/client
  option transport-type tcp
  option remote-host 192.168.1.36
  option remote-subvolume brick
end-volume

volume brick-replicate
  type cluster/replicate
  subvolumes remotebrick1 remotebrick2
end-volume

volume threads
  type performance/io-threads
  option thread-count 8
# option autoscaling yes
# option min-threads 8
# option max-threads 200
  subvolumes brick-replicate
end-volume

### Add readahead feature
volume readahead
  type performance/read-ahead
  option page-count 4   # cache per file = (page-count x page-size)
  option force-atime-update off
  subvolumes threads
end-volume

### Add IO-Cache feature
#volume iocache
#  type performance/io-cache
#  option page-size 1MB
#  option cache-size 64MB
#  subvolumes readahead
#end-volume

### Add writeback feature
volume writeback
  type performance/write-behind
  option cache-size 8MB
  option flush-behind on
  subvolumes readahead
end-volume

Justice London jlon...@lawinfo.com

_ From: Raghavendra G [mailto:raghavendra...@gmail.com]
Sent: Thursday, July 02, 2009 10:17 AM
To: Justice London
Cc: Harshavardhana; gluster-users
Subject: Re: [Gluster-users] Gluster (2.0.1 -> git) with fuse 2.8 crashes NFS

Hi, Can you send across the volume specification files you are using? regards, Raghavendra.

2009/6/24 Justice London
Here you go. Let me know if you need anything else:

Core was generated by `/usr/local/sbin/glusterfsd -p /var/run/glusterfsd.pid -f /etc/gluste
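With server and client volume specifications like the ones quoted above in place, bringing the pair up under glusterfs 2.x amounts to the following. The volfile filenames and mountpoint are illustrative; only /etc/glusterfs/glusterfs.vol appears in the configs themselves:

```shell
# On each server (192.168.1.35 / 192.168.1.36 in the configs): start the
# server daemon with the server-side volfile (filename here is an example).
glusterfsd -f /etc/glusterfs/glusterfsd.vol -p /var/run/glusterfsd.pid

# On the client: mount the replicated volume via the client volfile
# (the mountpoint /mnt/glusterfs is an example).
glusterfs -f /etc/glusterfs/glusterfs.vol /mnt/glusterfs
```

Note that the client volfile references both bricks under cluster/replicate, so every write from the client goes to both servers, which is why crummy testbed hardware on either side caps the whole pair.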
Re: [Gluster-users] Gluster (2.0.1 -> git) with fuse 2.8 crashes NFS
Sure:

Server:

### Export volume "brick" with the contents of "/home/export" directory.
volume posix
  type storage/posix                          # POSIX FS translator
  option directory /home/gluster/vmglustore   # Export this directory
  option background-unlink yes
end-volume

volume locks
  type features/posix-locks
  subvolumes posix
end-volume

volume brick
  type performance/io-threads
  option thread-count 32
# option autoscaling yes
# option min-threads 8
# option max-threads 200
  subvolumes locks
end-volume

### Add network serving capability to above brick.
volume brick-server
  type protocol/server
  option transport-type tcp
# option transport-type unix
# option transport-type ib-sdp
# option transport.socket.bind-address 192.168.1.10   # Default is to listen on all interfaces
# option transport.socket.listen-port 6996            # Default is 6996
# option transport-type ib-verbs
# option transport.ib-verbs.bind-address 192.168.1.10 # Default is to listen on all interfaces
# option transport.ib-verbs.listen-port 6996          # Default is 6996
# option transport.ib-verbs.work-request-send-size 131072
# option transport.ib-verbs.work-request-send-count 64
# option transport.ib-verbs.work-request-recv-size 131072
# option transport.ib-verbs.work-request-recv-count 64
  option client-volume-filename /etc/glusterfs/glusterfs.vol
  subvolumes brick
# NOTE: Access to any volume through protocol/server is denied by
# default. You need to explicitly grant access through the "auth" option.
  option auth.addr.brick.allow *   # Allow access to "brick" volume
end-volume

Client:

### Add client feature and attach to remote subvolume
volume remotebrick1
  type protocol/client
  option transport-type tcp
# option transport-type unix
# option transport-type ib-sdp
  option remote-host 192.168.1.35              # IP address of the remote brick
# option transport.socket.remote-port 6996     # default server port is 6996
# option transport-type ib-verbs
# option transport.ib-verbs.remote-port 6996   # default server port is 6996
# option transport.ib-verbs.work-request-send-size 1048576
# option transport.ib-verbs.work-request-send-count 16
# option transport.ib-verbs.work-request-recv-size 1048576
# option transport.ib-verbs.work-request-recv-count 16
# option transport-timeout 30   # seconds to wait for a reply from server for each request
  option remote-subvolume brick   # name of the remote volume
end-volume

volume remotebrick2
  type protocol/client
  option transport-type tcp
  option remote-host 192.168.1.36
  option remote-subvolume brick
end-volume

volume brick-replicate
  type cluster/replicate
  subvolumes remotebrick1 remotebrick2
end-volume

volume threads
  type performance/io-threads
  option thread-count 8
# option autoscaling yes
# option min-threads 8
# option max-threads 200
  subvolumes brick-replicate
end-volume

### Add readahead feature
volume readahead
  type performance/read-ahead
  option page-count 4   # cache per file = (page-count x page-size)
  option force-atime-update off
  subvolumes threads
end-volume

### Add IO-Cache feature
#volume iocache
#  type performance/io-cache
#  option page-size 1MB
#  option cache-size 64MB
#  subvolumes readahead
#end-volume

### Add writeback feature
volume writeback
  type performance/write-behind
  option cache-size 8MB
  option flush-behind on
  subvolumes readahead
end-volume

Justice London jlon...@lawinfo.com

_ From: Raghavendra G [mailto:raghavendra...@gmail.com]
Sent: Thursday, July 02, 2009 10:17 AM
To: Justice London
Cc: Harshavardhana; gluster-users
Subject: Re: [Gluster-users] Gluster (2.0.1 -> git) with fuse 2.8 crashes NFS

Hi, Can you send across the volume specification files you are using? regards, Raghavendra.

2009/6/24 Justice London
Here you go. Let me know if you need anything else:

Core was generated by `/usr/local/sbin/glusterfsd -p /var/run/glusterfsd.pid -f /etc/glusterfs/gluster'.
Program terminated with signal 11, Segmentation fault.
[New process 653] [New process 656] [New process 687] [New process 657] [New process 658] [New process 659] [New process 660] [New process 661] [New process 662] [New process 663] [New process 665] [New process 666] [New process 667] [New process 668] [New process 669] [New process 670] [New process 671] [New process 672] [New process 679] [New process 680] [New process 681] [New process 682] [New process 683] [New process 684] [New process 686] [New process 676] [New process 685] [New process 674] [New process 675] [New process 677] [New process 654] [New process 673] [New process 678] [New process 664]
#0  0xb808ee9c in __glusterfs_this_locat...@plt () from /usr/local/lib/lib
Re: [Gluster-users] Gluster (2.0.1 -> git) with fuse 2.8 crashes NFS
Here you go. Let me know if you need anything else:

Core was generated by `/usr/local/sbin/glusterfsd -p /var/run/glusterfsd.pid -f /etc/glusterfs/gluster'.
Program terminated with signal 11, Segmentation fault.
[New process 653] [New process 656] [New process 687] [New process 657] [New process 658] [New process 659] [New process 660] [New process 661] [New process 662] [New process 663] [New process 665] [New process 666] [New process 667] [New process 668] [New process 669] [New process 670] [New process 671] [New process 672] [New process 679] [New process 680] [New process 681] [New process 682] [New process 683] [New process 684] [New process 686] [New process 676] [New process 685] [New process 674] [New process 675] [New process 677] [New process 654] [New process 673] [New process 678] [New process 664]
#0  0xb808ee9c in __glusterfs_this_locat...@plt () from /usr/local/lib/libglusterfs.so.0
(gdb) backtrace
#0  0xb808ee9c in __glusterfs_this_locat...@plt () from /usr/local/lib/libglusterfs.so.0
#1  0xb809b935 in default_fxattrop (frame=0x809cc68, this=0x8055a80, fd=0x809ca20, flags=GF_XATTROP_ADD_ARRAY, dict=0x809cac8) at defaults.c:1122
#2  0xb809b930 in default_fxattrop (frame=0x8063570, this=0x8055f80, fd=0x809ca20, flags=GF_XATTROP_ADD_ARRAY, dict=0x809cac8) at defaults.c:1122
#3  0xb76b3c35 in server_fxattrop (frame=0x809cc28, bound_xl=0x8055f80, hdr=0x8064c88, hdrlen=150, iobuf=0x0) at server-protocol.c:4596
#4  0xb76a9f1b in protocol_server_interpret (this=0x8056500, trans=0x8064698, hdr_p=0x8064c88 "", hdrlen=150, iobuf=0x0) at server-protocol.c:7502
#5  0xb76aa1cc in protocol_server_pollin (this=0x8056500, trans=0x8064698) at server-protocol.c:7783
#6  0xb76aa24f in notify (this=0x8056500, event=2, data=0x8064698) at server-protocol.c:7839
#7  0xb809737f in xlator_notify (xl=0x8056500, event=2, data=0x8064698) at xlator.c:912
#8  0xb4ea08dd in socket_event_poll_in (this=0x8064698) at socket.c:713
#9  0xb4ea099b in socket_event_handler (fd=8, idx=1, data=0x8064698, poll_in=1, poll_out=0, poll_err=0) at socket.c:813
#10 0xb80b168a in event_dispatch_epoll (event_pool=0x8050d58) at event.c:804
#11 0xb80b0471 in event_dispatch (event_pool=0x8051338) at event.c:975
#12 0x0804b880 in main (argc=5, argv=0xbfae1044) at glusterfsd.c:1263
Current language: auto; currently asm

Justice London jlon...@lawinfo.com

On Mon, 2009-06-22 at 10:47 +0530, Harshavardhana wrote:
> Hi Justice,
>
> Can you get a backtrace from the segfault through gdb?
>
> Regards
> -- Harshavardhana
> Z Research Inc http://www.zresearch.com/
>
> On Sat, Jun 20, 2009 at 10:47 PM, wrote:
> Sure, the kernel version is 2.6.29 and the fuse release is the just
> released 2.8.0-pre3 (although I can use pre2 if needed).
>
> Justice London
> jlon...@lawinfo.com
>
> > Hi Justice,
> >
> > There are certain modifications required in fuse-extra.c to make
> > glusterfs work properly for the fuse 2.8.0 release. The glusterfs 2.0.1 release is
> > not tested against the fuse 2.8.0 release and certainly will not work without
> > those modifications. May I know the kernel version you are trying to use,
> > and the version of fuse under use? pre1 or pre2 release?
> >
> > Regards
> > -- Harshavardhana
> > Z Research Inc http://www.zresearch.com/
> >
> > On Fri, Jun 19, 2009 at 11:14 PM, Justice London
> > wrote:
> >
> >> No matter what I do I cannot seem to get gluster to stay stable when
> >> doing any sort of writes to the mount, when using gluster in combination
> >> with fuse 2.8.0-preX and NFS. I tried both unfs3 and the standard kernel NFS
> >> and no matter what, any sort of data transaction seems to crash gluster
> >> immediately. The error log is as such:
> >>
> >> pending frames:
> >>
> >> patchset: git://git.sv.gnu.org/gluster.git
> >>
> >> signal received: 11
> >>
> >> configuration details: argp 1
> >>
> >> backtrace 1
> >>
> >> bdb->cursor->get 1
> >>
> >> db.h 1
> &
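For anyone else chasing a similar segfault, a thread backtrace like the one above can be pulled from a core file non-interactively along these lines (the binary and core paths are examples; adjust to your install):

```shell
# Enable core dumps in the shell that launches the daemon, before
# reproducing the crash.
ulimit -c unlimited

# Feed the resulting core to gdb in batch mode and capture backtraces
# for every thread into a file you can post to the list.
gdb /usr/local/sbin/glusterfsd /path/to/core \
    -batch -ex "thread apply all bt" > backtrace.txt 2>&1
```

Batch mode (`-batch` with `-ex`) avoids gdb's interactive pager, so you get the whole trace without the "Type <return> to continue" prompts visible in the transcript above.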
[Gluster-users] Gluster (2.0.1 -> git) with fuse 2.8 crashes NFS
No matter what I do I cannot seem to get gluster to stay stable when doing any sort of writes to the mount, when using gluster in combination with fuse 2.8.0-preX and NFS. I tried both unfs3 and the standard kernel NFS and no matter what, any sort of data transaction seems to crash gluster immediately. The error log is as such:

pending frames:

patchset: git://git.sv.gnu.org/gluster.git
signal received: 11
configuration details: argp 1
backtrace 1
bdb->cursor->get 1
db.h 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 2.0.0git
[0xf57fe400]
/usr/local/lib/libglusterfs.so.0(default_fxattrop+0xc0)[0xb7f4d530]
/usr/local/lib/glusterfs/2.0.0git/xlator/protocol/server.so(server_fxattrop+0x175)[0xb7565af5]
/usr/local/lib/glusterfs/2.0.0git/xlator/protocol/server.so(protocol_server_interpret+0xbb)[0xb755beeb]
/usr/local/lib/glusterfs/2.0.0git/xlator/protocol/server.so(protocol_server_pollin+0x9c)[0xb755c19c]
/usr/local/lib/glusterfs/2.0.0git/xlator/protocol/server.so(notify+0x7f)[0xb755c21f]
/usr/local/lib/libglusterfs.so.0(xlator_notify+0x3f)[0xb7f4937f]
/usr/local/lib/glusterfs/2.0.0git/transport/socket.so(socket_event_poll_in+0x3d)[0xb4d528dd]
/usr/local/lib/glusterfs/2.0.0git/transport/socket.so(socket_event_handler+0xab)[0xb4d5299b]
/usr/local/lib/libglusterfs.so.0[0xb7f6321a]
/usr/local/lib/libglusterfs.so.0(event_dispatch+0x21)[0xb7f62001]
/usr/local/sbin/glusterfsd(main+0xb3b)[0x804b81b]
/lib/libc.so.6(__libc_start_main+0xe5)[0xb7df3455]
/usr/local/sbin/glusterfsd[0x8049db1]

Any ideas on whether there is a solution, or whether one is upcoming in either gluster or fuse? Other than with NFS, the git version of gluster seems to be really, really fast with fuse 2.8.

Justice London jlon...@lawinfo.com
Re: [Gluster-users] NFS export under Centos 5.3
Well, I ended up finding that I had to mount with 'glusterfs' rather than using a mount command to get this to work properly. Perhaps that was an error on my part, but I got it working in general. The issue now is that disabling direct-io brings gluster from at-disk speeds down to a crawling 18-24MB/s. That means exporting through NFS slows things down further, somewhere in the range of 4-7MB/s. Given that this is over gigabit Ethernet I should expect ~100MB/s in good conditions, and when simply reading/writing plain files over NFS that is the case. I did try unfs3 with direct-io turned back on, and there are speed increases, but even on a plain filesystem unfs3 seems to top out at about 24MB/s. I'm not sure if this is an issue with the fuse lib/module, but certainly if I want to do any real work over NFS this isn't going to be a workable solution (either way, unfs3 or plain NFS with direct-io disabled). Any further suggestions to get this working (different fuse library, etc.), or at this point is this just how things will be with gluster+NFS?

Justice London E-mail: jlon...@lawinfo.com

_ From: Raghavendra G [mailto:raghavendra...@gmail.com]
Sent: Tuesday, April 14, 2009 6:15 AM
To: Justice London
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] NFS export under Centos 5.3

Hi, glusterfs has to be mounted with direct-io disabled for NFS re-export to work properly. You can disable direct-io using the --disable-direct-io option to glusterfs. regards,

On Mon, Apr 13, 2009 at 11:06 PM, Justice London wrote:
Has anyone successfully gotten NFS exports of a gluster filesystem to work properly? When the export is mounted, filenames and folders can be properly created, but actually writing data to any of these files or folders results in a permissions/write error. I can send out further details on configs, etc., but perhaps someone has simply encountered this before and worked around it.
Justice London E-mail: jlon...@lawinfo.com

-- Raghavendra G
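For completeness, the direct-io workaround Raghavendra describes looks something like the sketch below. Assumptions: the flag is spelled --disable-direct-io-mode in glusterfs 2.x builds (verify against `glusterfs --help` on your version), and the mountpoint, export network, and fsid are placeholders:

```shell
# Mount glusterfs with direct-io disabled so the kernel NFS server can
# re-export it (flag spelling per glusterfs 2.x; check --help on your build).
glusterfs --disable-direct-io-mode -f /etc/glusterfs/glusterfs.vol /mnt/glusterfs

# Example /etc/exports entry; fuse filesystems typically need an explicit
# fsid= so knfsd can identify the export:
#
#   /mnt/glusterfs 192.168.1.0/24(rw,fsid=10,no_root_squash)

# Re-read the exports table:
exportfs -ra
```

The throughput numbers reported above suggest this path works correctly but slowly, so it is a functional workaround rather than a performance solution.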
[Gluster-users] NFS export under Centos 5.3
Has anyone successfully gotten NFS exports of a gluster filesystem to work properly? When the export is mounted, filenames and folders can be properly created, but actually writing data to any of these files or folders results in a permissions/write error. I can send out further details on configs, etc., but perhaps someone has simply encountered this before and worked around it.

Justice London E-mail: jlon...@lawinfo.com