Re: [Gluster-users] glusterfs and pacemaker
Hello Marcel,

Thank you very much for the patch. Great job, it works on the first try. Mounting and migration of the filesystem work. So far I have not been able to test a hard reset of a cluster node, because a colleague is currently using the cluster.

I applied the following parameters:

  Fstype: glusterfs
  Mountdir: /virtfs
  Glustervolume: 192.168.50.1:/gl_vol1

Maybe you can answer a question for my better understanding? My second node is 192.168.50.2, but in the Filesystem RA I have referenced 192.168.50.1 (see above). During my first test node1 was up and running, but what happens if node1 is completely gone and that address is unreachable?

Thx
Uwe

Uwe Weiss
weiss edv-consulting
Lattenkamp 14
22299 Hamburg
Phone: +49 40 51323431
Fax: +49 40 51323437
eMail: u.we...@netz-objekte.de

-----Original Message-----
From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On behalf of Marcel Pennewiß
Sent: Sunday, 17 July 2011 17:37
To: gluster-users@gluster.org
Subject: Re: [Gluster-users] glusterfs and pacemaker

On Friday 15 July 2011 13:12:02 Marcel Pennewiß wrote:
> > My idea is that pacemaker starts and monitors the glusterfs mountpoints and migrates some resources to the remaining node if one or more mountpoint(s) fail.
> For using mountpoints, please have a look at the OCF Filesystem agent.

Uwe informed me (via PM) that this didn't work; we had not used this agent ourselves until now. After some investigation you'll see that ocf::Filesystem does not detect/work with glusterfs shares :( A few changes are necessary to create basic support for glusterfs.

@Uwe: Please have a look at [1] and try to patch your Filesystem OCF script (which may be located in /usr/lib/ocf/resource.d/heartbeat).

[1] http://subversion.fem.tu-ilmenau.de/repository/fem-overlay/trunk/sys-cluster/resource-agents/files/filesystem-glusterfs-support.patch

best regards
Marcel
Re: [Gluster-users] glusterfs and pacemaker
On Monday 18 July 2011 12:10:36 Uwe Weiss wrote:
> My second node is 192.168.50.2. But in the Filesystem RA I have referenced 192.168.50.1 (see above). During my first test node1 was up and running, but what happens if node1 is completely gone and that address is unreachable?

We're using a replicated setup, and both nodes share an IPv4/IPv6 address (via pacemaker) which is used for accessing/mounting the glusterfs share and the nfs share (from the backup server).

Marcel
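For illustration, a minimal Pacemaker configuration along the lines described in this thread might look like the sketch below (crm shell syntax). The resource names, the shared address 192.168.50.10, and the netmask are hypothetical; the volume and mountpoint come from Uwe's mail, and the patched Filesystem agent from [1] is assumed to be installed:

  # Shared service IP used for mounting the glusterfs share
  primitive p_ip ocf:heartbeat:IPaddr2 \
      params ip=192.168.50.10 cidr_netmask=24 \
      op monitor interval=10s

  # GlusterFS mount via the (patched) OCF Filesystem agent
  primitive p_glusterfs ocf:heartbeat:Filesystem \
      params device="192.168.50.10:/gl_vol1" directory="/virtfs" fstype="glusterfs" \
      op monitor interval=20s

  # Mount only where the shared IP runs, and bring the IP up first
  colocation col_fs_with_ip inf: p_glusterfs p_ip
  order ord_ip_before_fs inf: p_ip p_glusterfs

Mounting through the shared IP rather than a fixed node address sidesteps the failure mode Uwe asks about: the volfile fetch at mount time reaches whichever node currently holds the IP.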
Re: [Gluster-users] glusterfs and pacemaker
I don't know from which version on, but if you use the native client for mounting the volumes, the IP only has to be active at the moment of mounting. After that, the native client will transparently handle node failures.

Best regards,
Samuel.

On 18 July 2011 13:14, Marcel Pennewiß mailingli...@pennewiss.de wrote:
> On Monday 18 July 2011 12:10:36 Uwe Weiss wrote:
> > My second node is 192.168.50.2. But in the Filesystem RA I have referenced 192.168.50.1 (see above). During my first test node1 was up and running, but what happens if node1 is completely gone and that address is unreachable?
>
> We're using a replicated setup, and both nodes share an IPv4/IPv6 address (via pacemaker) which is used for accessing/mounting the glusterfs share and the nfs share (from the backup server).
>
> Marcel
Re: [Gluster-users] glusterfs and pacemaker
On Monday 18 July 2011 13:26:00 samuel wrote:
> I don't know from which version on, but if you use the native client for mounting the volumes, the IP only has to be active at the moment of mounting. After that, the native client will transparently handle node failures.

ACK, that's why we use this shared IP (e.g. for backups via nfs). AFAIR glusterFS retrieves the volfile (via the shared IP) and then connects to the nodes directly.

Marcel
Re: [Gluster-users] Issue with Gluster Quota
Updated to the 3.2.2 release and found that quotas do not work when using the POSIX-ACL translator. In fact, after I disabled the ACLs, I had to remove the quota'd directory (presumably, to remove some attributes) and start over in order to get them to work. Once I disabled ACLs and re-created my directory, quotas worked as expected. Is this a known limitation of using POSIX ACLs? I happen to need both features, so that could pose an issue :)

-Brian

Brian Smith
Senior Systems Administrator
IT Research Computing, University of South Florida
4202 E. Fowler Ave. ENB308
Office Phone: +1 813 974-1467
Organization URL: http://rc.usf.edu

On 07/11/2011 12:48 PM, Brian Smith wrote:
> According to the logs, the last commit was:
>
>   commit 5c20eb3bbf870edadd22d06babb5d38dad222533
>   Author: shishir gowda shishi...@gluster.com
>   Date: Tue Jul 5 03:41:51 2011 +0000
>
> [root@gluster1 glusterfs-3.2git]# gluster volume quota home list
>   path    limit_set    size
>   -----------------------------
>   /brs    10485760     81965056
>
> [root@gluster1 glusterfs-3.2git]# gluster volume info
>   Volume Name: home
>   Type: Distribute
>   Status: Started
>   Number of Bricks: 2
>   Transport-type: tcp,rdma
>   Bricks:
>   Brick1: gluster1:/glusterfs/home
>   Brick2: gluster2:/glusterfs/home
>   Options Reconfigured:
>   features.limit-usage: /brs:10MB
>   features.quota: on
>
> -Brian

On 07/11/2011 08:29 AM, Saurabh Jain wrote:
> Hello Brian,
>
> I synced my gluster repository back to July 5th and tried quota on a certain dir of a distribute volume, and the quota was implemented properly on it. Here are the logs:
>
> [root@centos-qa-client-3 glusterfs]# /root/july6git/inst/sbin/gluster volume quota dist list
>   path    limit_set    size
>   -----------------------------
>   /dir    10485760     10485760
>
> [root@centos-qa-client-3 glusterfs]# /root/july6git/inst/sbin/gluster volume info
>   Volume Name: dist
>   Type: Distribute
>   Status: Started
>   Number of Bricks: 2
>   Transport-type: tcp,rdma
>   Bricks:
>   Brick1: 10.1.12.134:/mnt/dist
>   Brick2: 10.1.12.135:/mnt/dist
>   Options Reconfigured:
>   features.limit-usage: /dir:10MB
>   features.quota: on
>
> Requesting you to please inform us of the commit id to which your workspace is synced.
>
> Thanks,
> Saurabh

On Thu, 07 Jul 2011 13:10:06 -0400, Brian Smith b...@usf.edu wrote:
> Sorry about that. I re-populated with an 82MB dump from dd:
>
> [root@gluster1 ~]# gluster volume quota home list
>   path    limit_set    size
>   -----------------------------
>   /brs    10485760     81965056
>
> [root@gluster1 ~]# getfattr -m . -d -e hex /glusterfs/home/brs
> getfattr: Removing leading '/' from absolute path names
> # file: glusterfs/home/brs
> security.selinux=0x726f6f743a6f626a6563745f723a66696c655f743a733000
> trusted.gfid=0x1bbcb9a08bf64406b440f3bb3ad334ed
> trusted.glusterfs.dht=0x00017fff
> trusted.glusterfs.quota.----0001.contri=0x6000
> trusted.glusterfs.quota.dirty=0x3000
> trusted.glusterfs.quota.size=0x6000
>
> [root@gluster2 ~]# getfattr -m . -d -e hex /glusterfs/home/brs
> getfattr: Removing leading '/' from absolute path names
> # file: glusterfs/home/brs
> security.selinux=0x726f6f743a6f626a6563745f723a66696c655f743a733000
Re: [Gluster-users] not sure how to troubleshoot SMB / CIFS overload when using GlusterFS
Whit,

Genius! This morning I set out to remove as many variables as possible and whittle the repro case down as far as I could. I've become pretty good at debugging memory dumps on the Windows side over the years, and even inspected the web processes. Nothing looked out of the ordinary there, just a bunch of threads waiting to get file attribute data from the Gluster share.

So then, to follow your lead, I reduced the Page of Death from thousands of images down to just five. I tried accessing the page, and boom, everything's frozen for minutes. Interesting. So I reduced it to one image, accessed the page, and boom, everything's dead instantly. That one image is a file that doesn't exist.

So now, knowing that GlusterFS is kicking into overdrive fretting about a file it can't find, I decided to eliminate the web server altogether. I opened up Windows Explorer and typed in a directory that didn't exist, and sure enough, I'm unable to navigate through the share in another Explorer window until it finally responds again a minute later.

I think the Page of Death was exhibiting such a massive freeze (e.g. only able to respond again upwards of five minutes later) because it was systematically trying to access several files that weren't found, and each one it can't find causes the SMB connection to hang for close to a minute.

I feel like this is major progress toward pinpointing the problem for a possible resolution. Here are some additional details that may help:

The GlusterFS directory in question, /storage, has about 80,000 subdirs in it. As such, I'm using ext4 to overcome the subdir limitations of ext3. The non-existent image file that is able to cause everything to freeze lives at /storage/thisdirdoesntexist/images/blah.gif, where thisdirdoesntexist sits in that storage directory alongside those 80,000 real subdirs. I know it's a pretty laborious thing for Gluster to piece together a directory listing, and combined with Joseph's recognition of the flood of getdents, does it seem reasonable that Gluster or Samba is freezing because it's for some reason generating a subdir listing of /storage whenever it can't find one of its subdirs?

As another test, I accessed a file inside a non-existent subdir of a dir that has only five subdirs, and nothing froze. So the freezing seems to be a function of the number of subdirectories that are siblings of the first part of the path that doesn't exist, if that makes sense. So in /this/is/a/long/path, if "is" doesn't exist, then Samba will generate a list of subdirs under /this. And if /this has 100,000 immediate subdirs under it, then you're about to experience a world of hurt.

I read somewhere that FUSE's implementation of readdir() is a blocking operation. If true, the above explanation, plus FUSE's readdir(), are to blame. And I am therefore up a creek. It is not feasible to force the system to only have a few subdirs at any given level to prevent the lockup. Unless somebody, after reading this novel, has some ideas for me to try. =) Any magical ways to get FUSE not to block, or any trickery on Samba's side?

Ken

On Sun, Jul 17, 2011 at 10:29 PM, Whit Blauvelt whit.glus...@transpect.com wrote:
> On Sun, Jul 17, 2011 at 10:19:00PM -0500, Ken Randall wrote:
> > (The no such file or directory part is expected since some of the image references don't exist.)
>
> Wild guess on that: Gluster may work harder at files it doesn't find than files it finds. It's going to look on one side or the other of the replicated store first, and if it finds the file, deliver it. But if it doesn't find the file, wouldn't it then check the other side of the replicated storage to make sure this wasn't a replication error? Might be interesting to run a version of the test where all the images referenced do exist, to see if it's the missing files that are driving up the CPU cycles.
>
> Whit
Re: [Gluster-users] not sure how to troubleshoot SMB / CIFS overload when using GlusterFS
Joseph,

Thank you for your response. Yours, combined with Whit's, led me to come up with a pretty solid repro case and a pinpointing of what I think is going on.

I tried your additional SMB configuration settings and was hopeful, but they didn't alleviate the issue. Your interpretation of the logs was helpful, though. It makes sense now that Samba was pounding on GlusterFS, doing its string of getdents operations.

I also took your advice last night on stat-cache (I assume that was on the Gluster side, which I enabled), and wasn't sure where "fast lookups" was. That didn't seem to make a noticeable difference either.

I think the lockups are happening as a result of being crippled by GlusterFS's relatively slow directory listing (5x-10x slower generating a dir listing than a raw SMB share), combined with FUSE's blocking readdir(). I'm not positive on that last point, since there was only one mention of it on the internet. Am praying that somebody will see this and say, "oh yeah, well sure, just change this one thing in FUSE and you're good to go!" Somehow I don't think that's going to happen. :)

Ken

On Sun, Jul 17, 2011 at 10:35 PM, Joe Landman land...@scalableinformatics.com wrote:
> On 07/17/2011 11:19 PM, Ken Randall wrote:
> > Joe, Thank you for your response. After seeing what you wrote, I bumped the performance.cache-size up to 4096MB, the max allowed, and ran into the same wall. Hmmm ... I wouldn't think that any SMB caching would help in this case, since the same Samba server on top of the raw Gluster data wasn't exhibiting any trouble, or am I deceived?
>
> Samba could cache better so it didn't have to hit Gluster so hard.
>
> > I haven't used strace before, but I ran it on the glusterfs process, and saw a lot of:
> >
> >   epoll_wait(3, {{EPOLLIN, {u32=9, u64=9}}}, 257, 4294967295) = 1
> >   readv(9, [{"\200\0\16,", 4}], 1) = 4
> >   readv(9, [{"\0\n;\227\0\0\0\1", 8}], 1) = 8
> >   readv(9, [{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\31\0\0\0\0\0\0\0\1\0\0\0\0"..., 3620}], 1) = 1436
> >   readv(9, 0xa90b1b8, 1) = -1 EAGAIN (Resource temporarily unavailable)
>
> Interesting ... I am not sure why it's reporting an EAGAIN for readv, other than it can't fill the vector from the read.
>
> > And when I ran it on smbd, I saw a constant stream of this kind of activity:
> >
> >   getdents(29, /* 25 entries */, 32768) = 840
> >   getdents(29, /* 25 entries */, 32768) = 856
> >   getdents(29, /* 25 entries */, 32768) = 848
> >   getdents(29, /* 24 entries */, 32768) = 856
> >   getdents(29, /* 25 entries */, 32768) = 864
> >   getdents(29, /* 24 entries */, 32768) = 832
> >   getdents(29, /* 25 entries */, 32768) = 832
> >   getdents(29, /* 24 entries */, 32768) = 856
> >   getdents(29, /* 25 entries */, 32768) = 840
> >   getdents(29, /* 24 entries */, 32768) = 832
> >   getdents(29, /* 25 entries */, 32768) = 784
> >   getdents(29, /* 25 entries */, 32768) = 824
> >   getdents(29, /* 25 entries */, 32768) = 808
> >   getdents(29, /* 25 entries */, 32768) = 840
> >   getdents(29, /* 25 entries */, 32768) = 864
> >   getdents(29, /* 25 entries */, 32768) = 872
> >   getdents(29, /* 25 entries */, 32768) = 832
> >   getdents(29, /* 24 entries */, 32768) = 832
> >   getdents(29, /* 25 entries */, 32768) = 840
> >   getdents(29, /* 25 entries */, 32768) = 824
> >   getdents(29, /* 25 entries */, 32768) = 824
> >   getdents(29, /* 24 entries */, 32768) = 864
> >   getdents(29, /* 25 entries */, 32768) = 848
> >   getdents(29, /* 24 entries */, 32768) = 840
>
> Get directory entries. This is the stuff that NTFS is caching for its web server, and it appears Samba is not.
>
> Try:
>
>   aio read size = 32768
>   csc policy = documents
>   dfree cache time = 60
>   directory name cache size = 10
>   fake oplocks = yes
>   getwd cache = yes
>   level2 oplocks = yes
>   max stat cache size = 16384
>
> > That chunk would get repeated over and over and over again as fast as the screen could go, with the occasional (every 5-10 seconds or so) appearance of anything that you'd normally expect to see, such as:
> >
> >   close(29) = 0
> >   stat("Storage/01", 0x7fff07dae870) = -1 ENOENT (No such file or directory)
> >   write(23, "\0\0\0#\377SMB24\0\0\300\210A\310\0\0\0\0\0\0\0\0\0\0\0\0\1\0d\233"..., 39) = 39
> >   select(38, [5 20 23 27 30 31 35 36 37], [], NULL, {60, 0}) = 1 (in [23], left {60, 0})
> >   read(23, "\0\0\0x", 4) = 4
> >   read(23, "\377SMB2\0\0\0\0\30\7\310\0\0\0\0\0\0\0\0\0\0\0\0\1\0\250P\273\0[8"..., 120) = 120
> >   stat("Storage", {st_mode=S_IFDIR|0755, st_size=1581056, ...}) = 0
> >   stat("Storage/011235", 0x7fff07dad470) = -1 ENOENT (No such file or directory)
> >   stat("Storage/011235", 0x7fff07dad470) = -1 ENOENT (No such file or directory)
> >   open("Storage", O_RDONLY|O_NONBLOCK|O_DIRECTORY) = 29
> >   fcntl(29, F_SETFD, FD_CLOEXEC) = 0
> >
> > (The no such file or directory part is expected since some of the image references don't exist.)
>
> Ok. It looks like Samba is pounding on GlusterFS metadata (getdents).
Re: [Gluster-users] not sure how to troubleshoot SMB / CIFS overload when using GlusterFS
Please find responses inline.

> So now, knowing that GlusterFS is kicking into overdrive fretting about a file it can't find, I decided to eliminate the web server altogether. I opened up Windows Explorer and typed in a directory that didn't exist, and sure enough, I'm unable to navigate through the share in another Explorer window until it finally responds again a minute later. I think the Page of Death was exhibiting such a massive death (e.g. only able to respond again upwards of five minutes later) because it was systematically trying to access several files that weren't found, and each one it can't find causes the SMB connection to hang for close to a minute.

Gluster does not really care if a file is not found. It just looks up the filename on all servers and returns -ENOENT. End of story for Gluster. What's happening here is that Samba is 'searching' through all filenames in the directory, matching each against the requested name with strcasecmp(), to provide a case-insensitive match to the user.

> I feel like this is a bit of major progress toward pinpointing the problem for a possible resolution. Here are some additional details that may help: The GlusterFS directory in question, /storage, has about 80,000 subdirs in it. As such, I'm using ext4 to overcome the subdir limitations of ext3. The non-existent image file that is able to cause everything to freeze lives at /storage/thisdirdoesntexist/images/blah.gif, where thisdirdoesntexist is in that storage directory along with those 80,000 real subdirs. I know it's a pretty laborious thing for Gluster to piece together a directory listing, and combined with Joseph's recognition of the flood of getdents, does it seem reasonable that Gluster or Samba is freezing because it's for some reason generating a subdir listing of /storage whenever it can't find one of its subdirs?

Yes, it is Samba searching around for the case-insensitive match.

> As another test, if I access a file inside a non-existent subdir of a dir that only has five subdirs, nothing freezes.

That is because iterating over 5 names to determine the non-existence of a case-insensitive match is trivially fast.

> So the freezing seems to be a function of the number of subdirectories that are siblings of the first part of the path that doesn't exist, if that makes sense. So in /this/is/a/long/path, if "is" doesn't exist, then Samba will generate a list of subdirs under /this. And if /this has 100,000 immediate subdirs under it, then you're about to experience a world of hurt. I read somewhere that FUSE's implementation of readdir() is a blocking operation. If true, the above explanation, plus FUSE's readdir(), are to blame.

What do you mean by that? FUSE's readdir() is as blocking or non-blocking as the rest of its open/create/getattr/setattr etc. What you probably meant is that the fuse kernel module does not cache dirents?

> And I am therefore up a creek. It is not feasible to force the system to only have a few subdirs at any given level to prevent the lockup. Unless somebody, after reading this novel, has some ideas for me to try. =) Any magical ways to get FUSE not to block, or any trickery on Samba's side?

It is not FUSE blocking that is your problem. You need a quicker trick to achieve case insensitivity.

Avati
Re: [Gluster-users] not sure how to troubleshoot SMB / CIFS overload when using GlusterFS
> I also took your advice last night on stat-cache (I assume that was on the Gluster side, which I enabled), and wasn't sure where "fast lookups" was. That didn't seem to make a noticeable difference either.

The stat-prefetch xlator does not help here. It quickens lookups, but Samba is not interested in lookups/attributes. It is only looking for the existence of directory entry names, without caring whether each is a file or a directory.

> I think the lockups are happening as a result of being crippled by GlusterFS's relatively slow directory listing (5x-10x slower generating a dir listing than a raw SMB share), combined with FUSE's blocking readdir(). I'm not positive on that last point since there was only one mention of that on the internet. Am praying that somebody will see this and say, "oh yeah, well sure, just change this one thing in FUSE and you're good to go!" Somehow I don't think that's going to happen. :)

GlusterFS uses readdirp() internally even when FUSE asks for readdir(), because that makes generating a unique list of entry names in distribute much more efficient and simple. This might be causing the readdir ops themselves to be slow. Doing an 'strace -Tc' on smbd can show you what percentage of time is spent in getdents().

One test you could try is checking whether a pure replicated setup without stat-prefetch has the same performance hit as a distributed (+replicated?) setup. Both stat-prefetch and distribute upgrade readdirs into readdirps (one for performance, the other for unique list generation).

Avati
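For reference, a minimal sketch of the measurement Avati suggests (it assumes a single smbd worker is serving the share; with several workers you would pick the PID of the busy one from smbstatus or top):

  # attach to the running smbd, let the workload run, then hit Ctrl-C
  strace -T -c -p "$(pidof smbd | awk '{print $1}')"

On exit, strace prints a per-syscall summary; the "% time" row for getdents shows how much of smbd's syscall time goes to directory scanning.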
Re: [Gluster-users] Tools for the admin
Or you can also use tr to remove the ':' characters.

On Sun, Jul 17, 2011 at 8:17 PM, Whit Blauvelt whit.glus...@transpect.com wrote:
> On Mon, Jul 18, 2011 at 03:48:11AM +0100, Dan Bretherton wrote:
> > I had a closer look at this. It is the output of gfid-mismatch causing the problem; paths are shown with a trailing colon, as in GlusterFS log files. The cut -f1 -d: used to extract the paths obviously removes all the colons. I'm sure there is an easy way to remove the trailing ':' from filenames, but I can't think of one off hand (and it is 3:30 AM).
>
> Something along the lines of sed 's/.$//', as in:
>
>   dog=doggy:; echo $dog | sed 's/.$//'
>
> That would remove any last character. To remove just a trailing ':':
>
>   dog=doggy:; echo $dog | sed 's/:$//'
>
> (No, I didn't know that. I googled.)
>
> Whit
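One caveat worth noting: tr -d ':' would also delete colons that appear inside filenames, which is exactly the case Dan is dealing with, so anchoring on the trailing colon with sed is the safer of the two. A hedged sketch of plugging it into the workflow (the tool invocation and the output filename are hypothetical):

  # keep embedded colons, strip only the trailing one from each reported path
  ./gfid-mismatch /path/to/brick | sed 's/:$//' > mismatched-paths.txt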
Re: [Gluster-users] not sure how to troubleshoot SMB / CIFS overload when using GlusterFS
On Mon, Jul 18, 2011 at 09:49:42PM +0530, Anand Avati wrote:
> It is not FUSE blocking that is your problem. You need a quicker trick to achieve case insensitivity.

Could this help?

  5.4.2.1 case sensitive

  This share-level option, which has the obtuse synonym casesignames, specifies whether Samba should preserve case when resolving filenames in a specific share. The default value for this option is no, which is how Windows handles file resolution. If clients are using an operating system that takes advantage of case-sensitive filenames, you can set this configuration option to yes as shown here:

    [accounting]
        case sensitive = yes

  Otherwise, we recommend that you leave this option set to its default.

From http://oreilly.com/catalog/samba/chapter/book/ch05_04.html

As I read that, "case sensitive = yes" tells Samba not to bother with any case substitutions. That is, the quicker trick to achieve case insensitivity may be to just get rid of case insensitivity.

Whit
Re: [Gluster-users] Issue with Gluster Quota
After further tests, it appears both now work after the update. It seems the attributes set on the 'brs' directory while using the git source build were gummed up. When I recreated the directory and remounted with -o acl, ACLs worked and so did quota enforcement. I'll keep testing and will post if anything else comes up. So far, so good.

-Brian

Brian Smith
Senior Systems Administrator
IT Research Computing, University of South Florida
4202 E. Fowler Ave. ENB308
Office Phone: +1 813 974-1467
Organization URL: http://rc.usf.edu

On 07/18/2011 11:57 AM, Brian Smith wrote:
> Updated to the 3.2.2 release and found that quotas do not work when using the POSIX-ACL translator. In fact, after I disabled the ACLs, I had to remove the quota'd directory (presumably, to remove some attributes) and start over in order to get them to work. Once I disabled ACLs and re-created my directory, quotas worked as expected. Is this a known limitation of using POSIX ACLs? I happen to need both features, so that could pose an issue :)
>
> [earlier quoted messages from Brian and Saurabh snipped; see the previous posts in this thread]
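For reference, a minimal sketch of the combination being verified in this thread, using the volume from Brian's mails (the client mountpoint and the server used as the mount source are assumptions; the quota syntax matches the 3.2 CLI output shown elsewhere in the thread):

  # server side: enable quota and set the 10MB limit on /brs
  gluster volume quota home enable
  gluster volume quota home limit-usage /brs 10MB
  gluster volume quota home list

  # client side: mount with POSIX-ACL support
  mount -t glusterfs -o acl gluster1:/home /gluster/home

  # then exercise both features together, e.g.
  setfacl -m u:someuser:rwx /gluster/home/brs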
[Gluster-users] GlusterFS v3.1.5 Stable Configuration
Hi,

We've been using GlusterFS to manage shared files across a number of hosts for the past few months and have run into a few problems: roughly one every month. The problems are occasionally extremely difficult to track down to GlusterFS, as they often masquerade as something else in our application log files. They have been one instance of split-brain, a number of instances of stuck files (i.e. any stat call would block for an hour and then time out with an error), and a couple of instances of ghost files (the file is removed, but GlusterFS continues to show it for a little while until the cache times out).

We do *not* place a large amount of load on GlusterFS, and don't have any significant performance issues to deal with. With that in mind, the core question of this e-mail is: how can I modify our configuration to be the absolute *most* stable (problem free) that it can be, even if it means sacrificing performance? In sum, I don't have any particular performance concerns at this moment, but the GlusterFS bugs that we encounter are quite problematic, so I'm willing to entertain any suggested stability improvement, even one with a negative impact on performance. (I suspect that the answer here is "just turn off all performance-enhancing gluster caching", but I wanted to validate that this is actually true before going so far.)

Thus please suggest anything that could be done to improve the stability of our setup. As an aside, I think this would be an advantageous thing to add to the FAQ: right now the FAQ contains information for *performance* tuning, but not for *stability* tuning.

Thanks for any help that you can give/suggestions that you can make. Here are the details of our environment:

OS: RHEL5
GlusterFS Version: 3.1.5
Mount method: glusterfsd/FUSE
GlusterFS Servers: web01, web02
GlusterFS Clients: web01, web02, dj01, dj02

$ sudo gluster volume info

Volume Name: shared-application-data
Type: Replicate
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: web01:/var/glusterfs/bricks/shared
Brick2: web02:/var/glusterfs/bricks/shared
Options Reconfigured:
network.ping-timeout: 5
nfs.disable: on

Configuration file contents:

/etc/glusterd/vols/shared-application-data/shared-application-data-fuse.vol:

volume shared-application-data-client-0
    type protocol/client
    option remote-host web01
    option remote-subvolume /var/glusterfs/bricks/shared
    option transport-type tcp
    option ping-timeout 5
end-volume

volume shared-application-data-client-1
    type protocol/client
    option remote-host web02
    option remote-subvolume /var/glusterfs/bricks/shared
    option transport-type tcp
    option ping-timeout 5
end-volume

volume shared-application-data-replicate-0
    type cluster/replicate
    subvolumes shared-application-data-client-0 shared-application-data-client-1
end-volume

volume shared-application-data-write-behind
    type performance/write-behind
    subvolumes shared-application-data-replicate-0
end-volume

volume shared-application-data-read-ahead
    type performance/read-ahead
    subvolumes shared-application-data-write-behind
end-volume

volume shared-application-data-io-cache
    type performance/io-cache
    subvolumes shared-application-data-read-ahead
end-volume

volume shared-application-data-quick-read
    type performance/quick-read
    subvolumes shared-application-data-io-cache
end-volume

volume shared-application-data-stat-prefetch
    type performance/stat-prefetch
    subvolumes shared-application-data-quick-read
end-volume

volume shared-application-data
    type debug/io-stats
    subvolumes shared-application-data-stat-prefetch
end-volume

/etc/glusterfs/glusterd.vol:

volume management
    type mgmt/glusterd
    option working-directory /etc/glusterd
    option transport-type socket,rdma
    option transport.socket.keepalive-time 10
    option transport.socket.keepalive-interval 2
end-volume

--
Remi Broemeling
System Administrator
Clio - Practice Management Simplified
1-888-858-2546 x(2^5) | r...@goclio.com
www.goclio.com | blog http://www.goclio.com/blog | twitter http://www.twitter.com/goclio | facebook http://www.facebook.com/goclio
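If "turn off the performance translators" turns out to be the route taken, a hedged sketch of what that looks like with the gluster CLI (option names as used in the 3.1/3.2 releases; verify them against the volume set help on your version, and expect slower small-file reads and writes with these off):

  gluster volume set shared-application-data performance.quick-read off
  gluster volume set shared-application-data performance.stat-prefetch off
  gluster volume set shared-application-data performance.io-cache off
  gluster volume set shared-application-data performance.read-ahead off
  gluster volume set shared-application-data performance.write-behind off

Each option disables the corresponding translator in the generated client volfile shown above, so the FUSE client talks to cluster/replicate more directly.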
[Gluster-users] Reminder: Monitoring GlusterFS Webinar is Tomorrow
Greetings - if you're curious about monitoring GlusterFS performance, be sure to sign up for tomorrow's webinar. We will also post the recording online in case you are not able to make it.

Introducing the Gluster for Geeks Technical Webinar Series

In this Gluster for Geeks technical webinar, Craig Carl, Senior Systems Engineer, will explain and demonstrate how to monitor your Gluster cluster for availability and performance.

Register: https://www3.gotomeeting.com/register/542541630

Topics covered will include:
- What services and logs to watch
- How to run baseline performance testing
- How to use Ganglia for ongoing performance monitoring

Craig will demonstrate how to use Ganglia to collate performance data from across a Gluster cluster and present it in a usable format. Time permitting, we will demonstrate performance monitoring using Amazon CloudWatch and RightScale's monitoring tools.

Webinar: Monitoring GlusterFS 3.2
Tuesday, July 19 at 10am PT / 1pm ET / 6pm UK (London)
Speaker: Craig Carl, Senior Systems Engineer

This will be a 90-minute webinar, with the first hour dedicated to content and the last 30 minutes scheduled for Q&A. We look forward to seeing you there! If you can't make it, register anyway and we'll send you the recording.

Register: https://www3.gotomeeting.com/register/542541630
Re: [Gluster-users] not sure how to troubleshoot SMB / CIFS overload when using GlusterFS
Anand, Whit, and Joseph,

I appreciate your help very, very much. Anand's assertion about Samba doing string comparisons was spot on, and Whit's suggestion to change smb.conf to make the share case sensitive did the trick. I am also forcing "default case = lower" and "preserve case = no" in smb.conf to make sure everything stays lower-case going in.

With those changes in place, I believe I can hear a song in the back of my head: "It's a whole new world..."

On the web app side we will be writing a request handler that automatically lower-cases any incoming requests, so any referenced images and files will work no matter the casing specified. (I've noticed that not many Linux-hosted sites and SaaS platforms handle casing well; I'm not sure why.) We will have some users complain about case sensitivity not being maintained on their files, but I think the huge win of being able to use GlusterFS is worth it. There are no great Windows solutions for ever-expandable storage, and we're well past the published limitations of Windows DFS-R. DFS-R is an amazing, refined piece of technology, but it is a solution for a different kind of problem.

Thanks again, guys, I never would have navigated to this solution without you.

Ken
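Pulled together, the share-level settings Ken describes would look something like this in smb.conf (the share name and path are hypothetical; the three parameters are standard Samba options):

  [storage]
      path = /mnt/glusterfs/storage
      # skip Samba's case-insensitive directory scan on lookups
      case sensitive = yes
      # store new names lower-cased, so requests normalized by the
      # web tier always match the on-disk names
      default case = lower
      preserve case = no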
Re: [Gluster-users] GlusterFS v3.1.5 Stable Configuration
On Mon, Jul 18, 2011 at 10:53 AM, Remi Broemeling r...@goclio.com wrote:
> Hi,
>
> We've been using GlusterFS to manage shared files across a number of hosts for the past few months and have run into a few problems: roughly one every month. The problems are occasionally extremely difficult to track down to GlusterFS, as they often masquerade as something else in our application log files. They have been one instance of split-brain, a number of instances of stuck files (i.e. any stat call would block for an hour and then time out with an error), and a couple of instances of ghost files (the file is removed, but GlusterFS continues to show it for a little while until the cache times out).
>
> We do not place a large amount of load on GlusterFS, and don't have any significant performance issues to deal with. With that in mind, the core question of this e-mail is: how can I modify our configuration to be the absolute most stable (problem free) that it can be, even if it means sacrificing performance?

It depends on the kind of bugs or issues you are encountering. There might be a solution for some bugs and not for others.

> [remainder of the question and the volume configuration snipped; see the original message above]
[Gluster-users] Questions on Replication and Design
Hi, all,

We're looking to replace our DRBD/Ext4/NFS file storage configuration with RHEL cluster and GlusterFS 3.2.2 (or greater, depending on the timeline). Currently, our configuration consists of two four-node clusters at different sites, with:

1. 8 @ 1.5TB LVM LVs on top of an HP P2000 storage array.
2. Each LV is mirrored to an identical LV on a cluster at a remote site using DRBD.
3. Each LV/DRBD is an Ext4 volume and an NFS mount-point.
4. DRBD is started as primary at one site and secondary at the remote site, and can be switched easily in the event of a failure.
5. RHEL cluster (with some patches) runs DRBD, a floating IP, ext4, and NFS as a service for each of the 8 mounts.

We went this route because it was the only way to get POSIX-ACL support and a working quota implementation with replication of a large-ish volume without spending incredible amounts of money. With GlusterFS 3.2.2, these are both supported features. My proposed layout for the new configuration would look like so:

1. 8 @ 1.5TB LVM LVs on top of an HP P2000 storage array.
2. Each LV is an Ext4 FS.
3. RHEL cluster runs a glusterd instance, a floating IP, and an ext4 mount for each of the 8 LVs.
4. Each of the 8 LVs is configured with a replicated pair at our remote site, while they distribute across the local site. For instance:

   site1-node1:              site2-node1:
     gluster1:/glusterfs  --  gluster9:/glusterfs
     gluster2:/glusterfs  --  gluster10:/glusterfs
   site1-node2:
     gluster3:/glusterfs  --  gluster11:/glusterfs
     gluster4:/glusterfs  --  gluster12:/glusterfs
   site1-node3:
     gluster5:/glusterfs  --  gluster13:/glusterfs
     gluster6:/glusterfs  --  gluster14:/glusterfs
   site1-node4:
     gluster7:/glusterfs  --  gluster15:/glusterfs
     gluster8:/glusterfs  --  gluster16:/glusterfs

               Distributed
      |        |        |        |
   client   client   client   client

We will still use RHEL cluster to facilitate HA and failover of the glusterd/IP/fs instances at each cluster site.

What say the experts about this approach, and what caveats/issues should I be looking out for? I'll be building a test environment, but was wondering, before I start, whether this is a supportable configuration in the event we decide to get support, etc.

Many thanks in advance!

-Brian

--
Brian Smith
Senior Systems Administrator
IT Research Computing, University of South Florida
4202 E. Fowler Ave. ENB308
Office Phone: +1 813 974-1467
Organization URL: http://rc.usf.edu
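As a sketch of how the replica pairing in this layout maps onto the gluster CLI (the volume name is hypothetical; with "replica 2", each consecutive pair of bricks forms one replica set, so listing a site1 brick followed by its site2 partner places every replica across the two sites):

  gluster volume create shared replica 2 transport tcp \
      gluster1:/glusterfs gluster9:/glusterfs \
      gluster2:/glusterfs gluster10:/glusterfs \
      gluster3:/glusterfs gluster11:/glusterfs \
      gluster4:/glusterfs gluster12:/glusterfs \
      gluster5:/glusterfs gluster13:/glusterfs \
      gluster6:/glusterfs gluster14:/glusterfs \
      gluster7:/glusterfs gluster15:/glusterfs \
      gluster8:/glusterfs gluster16:/glusterfs
  gluster volume start shared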
Re: [Gluster-users] Tools for the admin
On 17 July 2011 18:05, Dan Bretherton d.a.brether...@reading.ac.uk wrote:
> Dear Vikas,
>
> Thanks for providing these tools. Unfortunately I think I have found a problem with the procedure outlined in the README: I don't think it works for files with names containing the colon character. I still have a lot of gfid errors in my logs after running the gfid tools on one volume, and all the filenames have one or more ':' characters. There are 1677 files still affected with "gfid different", so I don't think it can be a coincidence.

Thanks for pointing this out, Dan. There was some urgency in writing the tool and I forgot to document that it wouldn't handle files with a ':' in them. It'll be fixed soon.

--
Vikas Gorur
Engineer - Gluster