Re: [Gluster-users] iowait - ext4 + ssd journal
On Mon, 3 Nov 2014 at 5:37pm, Lindsay Mathieson wrote:

> On 4 November 2014 11:28, Paul Robert Marino wrote:
>> Use XFS instead of EXT4. There are many very good reasons it's the new default filesystem in RHEL 7.
>
> xfs can't use an external journal.

Actually, it can. I remember playing with it *way* back in the day when XFS was first ported to Linux. From 'man mkfs.xfs':

    The metadata log can be placed on another device to reduce the number
    of disk seeks. To create a filesystem on the first partition on the
    first SCSI disk with a 10000 block log located on the first partition
    on the second SCSI disk, use:

    mkfs.xfs -l logdev=/dev/sdb1,size=10000b /dev/sda1

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
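A filesystem built with an external log also has to be pointed at that log at mount time. A minimal sketch, reusing the device names from the man page example above; the mount point is arbitrary:

    mkfs.xfs -l logdev=/dev/sdb1,size=10000b /dev/sda1
    mount -o logdev=/dev/sdb1 /dev/sda1 /mnt/brick

The same logdev= option goes in the fstab options field for a persistent mount.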
Re: [Gluster-users] Has anyone used encrypted filesystems with Gluster?
On Thu, 13 Sep 2012 at 2:30pm, Whit Blauvelt wrote:

> This may be crazy, but has anyone used filesystem encryption (e.g. LUKS) under Gluster? Or integrated encryption with Gluster in some other way? There's a certain demand to encrypt some of our storage, in case the hypothetical bad guy breaks into the server room and walks out with the servers. Is this a case where we can have encryption's advantages _or_ Gluster's? Or is there a practical way to have both?

I haven't, but given that Gluster runs on top of a standard FS, I don't see any reason why this wouldn't work. Rather than just Gluster on top of ext3/4/XFS, it would be Gluster on top of ext3/4/XFS on top of a LUKS-encrypted partition. The main stumbling block I see isn't Gluster related at all; it's simply how to do an unattended boot of a system with an encrypted partition...

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
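A sketch of one way to set that up, assuming a keyfile on the (unencrypted) root filesystem so the brick can unlock at boot without a console passphrase -- which only protects against theft of the data disks, not the whole server. Device names and paths here are just examples:

    # one-time setup of the brick device
    cryptsetup luksFormat /dev/sdb1
    dd if=/dev/urandom of=/etc/keys/brick0.key bs=4096 count=1
    chmod 0400 /etc/keys/brick0.key
    cryptsetup luksAddKey /dev/sdb1 /etc/keys/brick0.key
    cryptsetup luksOpen --key-file /etc/keys/brick0.key /dev/sdb1 brick0
    mkfs.xfs /dev/mapper/brick0

    # /etc/crypttab entry for unattended unlock at boot
    brick0  /dev/sdb1  /etc/keys/brick0.key  luks

    # /etc/fstab entry for the brick filesystem
    /dev/mapper/brick0  /export/brick0  xfs  defaults  0 0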
Re: [Gluster-users] Gluster 3.2.0 and ucarp not working
On Wed, 8 Jun 2011 at 4:44pm, Joe Landman wrote:

> On 06/08/2011 04:37 PM, Joshua Baker-LePain wrote:
>>> BTW: You need a virtual ip for ucarp
>>
>> As I said, that's what I'm doing now -- using the virtual IP address managed by ucarp in my fstab line. But Craig Carl from Gluster told the OP in this thread specifically to mount using the real IP address of a server when using the GlusterFS client, *not* to use the ucarp VIP. So I'm officially confused.
>
> GlusterFS client side gets its config from the server, and makes connections to each server. Any of the GlusterFS servers may be used for the mount, and the client will connect to all of them. If one of the servers goes away, and you have a replicated or HA setup, you shouldn't see any client side issues.

Hrm, apparently I'm not making myself clear. I fully understand the redundancy of a replicated glusterfs volume mounted on a client. After the mount, the client will not see any issues unless both members of a replica pair (or all 4 members of a replica quad, etc.) go down.

My concern is at mount time. Mounting via the glusterfs client (at the command line or via fstab) requires a single IP address. That server is contacted to get the volume config (which includes the IP addresses of the rest of the servers). If that IP address is a regular IP address that points at a single server, and that server is down *at client mount time*, then the mount will fail. I have set up ucarp for the sole purpose of using the ucarp-managed VIP in my fstab lines, so that mounts will succeed even if some of the servers are down. All the "gluster" commands to create the volumes were done using real IP addresses. Does Craig Carl's advice not to use ucarp with the native glusterfs client apply in my situation?

> ucarp would be needed for the NFS side of the equation. round robin DNS is useful in both cases.

Again, I don't use DNS for my cluster, so that solution is out.

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
Re: [Gluster-users] Gluster 3.2.0 and ucarp not working
On Wed, 8 Jun 2011 at 1:18pm, Mohit Anchlia wrote:

>> Prior to this thread, I thought the best method was to use ucarp on the servers and mount using the ucarp address. If that won't work right (I haven't had time to fully test my setup yet), then I need to find another way. I don't run DNS on my cluster, so that solution is out. As far as I can tell, the only other solution is to mount in rc.local, with logic to detect a mount failure and go on to the next server.
>
> ucarp should work, and so would the script at startup to check hosts before mounting. BTW: You need a virtual ip for ucarp

As I said, that's what I'm doing now -- using the virtual IP address managed by ucarp in my fstab line. But Craig Carl from Gluster told the OP in this thread specifically to mount using the real IP address of a server when using the GlusterFS client, *not* to use the ucarp VIP. So I'm officially confused.

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
Re: [Gluster-users] Gluster 3.2.0 and ucarp not working
On Wed, 8 Jun 2011 at 12:12pm, Mohit Anchlia wrote:

> On Wed, Jun 8, 2011 at 12:09 PM, Joshua Baker-LePain wrote:
>> On Wed, 8 Jun 2011 at 9:40am, Mohit Anchlia wrote:
>>> On Tue, Jun 7, 2011 at 11:20 PM, Joshua Baker-LePain wrote:
>>>> On Wed, 8 Jun 2011 at 8:16am, bxma...@gmail.com wrote:
>>>>> When a client connects to any gluster node it automatically receives a list of all other nodes for that volume.
>>>>
>>>> Yes, but what if the node it first tries to contact (i.e., the one on the fstab line) is down?
>>>
>>> For the client side, use DNS round robin with all the hosts in your cluster.
>>
>> And if you use /etc/hosts rather than DNS...?
>
> What do you think should happen?

I'm simply trying to find the most robust way to automatically mount GlusterFS volumes at boot time in my environment. In previous versions of Gluster, there was no issue, since the volume files were on the clients. I understand that can still be done, but one loses the ability to manage all changes from the servers.

Prior to this thread, I thought the best method was to use ucarp on the servers and mount using the ucarp address. If that won't work right (I haven't had time to fully test my setup yet), then I need to find another way. I don't run DNS on my cluster, so that solution is out. As far as I can tell, the only other solution is to mount in rc.local, with logic to detect a mount failure and go on to the next server.

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
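That last option would look something like this -- a minimal sketch of the rc.local fallback logic, with placeholder server addresses, volume name, and mount point:

    #!/bin/sh
    # Try each Gluster server in turn until one mount succeeds.
    for server in 172.19.12.1 172.19.12.2 172.19.12.3; do
        if mount -t glusterfs ${server}:/myvol /mnt/myvol; then
            break
        fi
    done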
Re: [Gluster-users] Gluster 3.2.0 and ucarp not working
On Wed, 8 Jun 2011 at 9:40am, Mohit Anchlia wrote:

> On Tue, Jun 7, 2011 at 11:20 PM, Joshua Baker-LePain wrote:
>> On Wed, 8 Jun 2011 at 8:16am, bxma...@gmail.com wrote:
>>> When a client connects to any gluster node it automatically receives a list of all other nodes for that volume.
>>
>> Yes, but what if the node it first tries to contact (i.e., the one on the fstab line) is down?
>
> For the client side, use DNS round robin with all the hosts in your cluster.

And if you use /etc/hosts rather than DNS...?

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
Re: [Gluster-users] Gluster 3.2.0 and ucarp not working
On Wed, 8 Jun 2011 at 8:16am, bxma...@gmail.com wrote:

> When a client connects to any gluster node it automatically receives a list of all other nodes for that volume.

Yes, but what if the node it first tries to contact (i.e., the one on the fstab line) is down?

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
Re: [Gluster-users] Gluster 3.2.0 and ucarp not working
On Mon, 6 Jun 2011 at 1:30am, Craig Carl wrote:

> Matus - If you are using the Gluster native client (mount -t glusterfs ...) then ucarp/CTDB is NOT required and you should not install it. Always use the real IPs when you are mounting with 'mount -t glusterfs ...'.

Hrm. That wasn't my understanding. Say my fstab line looks like this:

    192.168.2.100:/distrep  /mnt/distrep  glusterfs  defaults,_netdev  0 0

Now, let's say that at mount time 192.168.2.100 is down. How does the Gluster native client know which other IP addresses to contact to get the volume file? Is there a way to put multiple hosts in the fstab line?

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
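GlusterFS releases after the 3.2 series discussed in this thread grew a mount option for exactly this; if memory serves, it is spelled backupvolfile-server in 3.3/3.4 and backup-volfile-servers in 3.5 and later. A sketch of the fstab form, with placeholder addresses:

    192.168.2.100:/distrep  /mnt/distrep  glusterfs  defaults,_netdev,backup-volfile-servers=192.168.2.101:192.168.2.102  0 0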
Re: [Gluster-users] Scaling Gluster
On Mon, 23 May 2011 at 9:55am, Anthony J. Biacco wrote:

> I think I read somewhere the server in the mount string is only used to retrieve the cluster config. So as long as that server is up at time of mount, you're fine. If the server goes down after mount, it doesn't matter, as the cluster config on the mounting servers knows about all gluster servers. Somebody correct me if I'm wrong.

That's correct -- it's the mount-time failure that ucarp can help avoid.

> Personally, I copy the cluster config to all my mounting servers and use that file in the mount command instead of a gluster server hostname.

But then you lose the ability to easily and automatically distribute configuration changes made via the 'gluster' command on the servers. I'm not saying that can't be worked around, but everything is a tradeoff.

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
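The copy-the-config approach looks roughly like this -- a sketch assuming the client volfile has been copied from a server to a local path (the path and volume name are examples):

    # mount from a local volume file instead of a server
    mount -t glusterfs /etc/glusterfs/distrep-fuse.vol /mnt/distrep

    # or invoke the FUSE client directly
    glusterfs -f /etc/glusterfs/distrep-fuse.vol /mnt/distrep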
Re: [Gluster-users] Scaling Gluster
On Thu, 19 May 2011 at 12:40am, José Celano wrote:

> I have 2 replicated servers and several Apache servers that mount the same volume. When I mount a Gluster volume from one of the Apache servers:
>
>     mount.glusterfs ip-XX-XXX-XXX-XXX.eu-west-1.compute.internal:/dab578f4-06fa-4584-ac1d-c6ef14e6d0cf /mnt/glusterfs
>
> how can I get high availability if that server dies?

Use ucarp to share a virtual IP address between the Gluster servers, and mount using that IP address.

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
Re: [Gluster-users] [SPAM?] Do clients need to run glusterd?
On Mon, 9 May 2011 at 10:33am, Vikas Gorur wrote:

>> I think the question is why there's a single init.d script that starts or shuts down both daemon and client at once.
>
> The init.d/glusterd script has nothing whatsoever to do with the client. It only controls starting/stopping the server. The client is an independent process that is started by mounting and stopped by unmounting.

In theory that's how it should work. In practice, it isn't. Just look at the script itself:

    stop() {
        echo -n $"Stopping $BASE:"
        killproc $BASE
        echo
        pidof -c -o %PPID -x $GLUSTERFSD &> /dev/null
        [ $? -eq 0 ] && killproc $GLUSTERFSD &> /dev/null
        pidof -c -o %PPID -x $GLUSTERFS &> /dev/null
        [ $? -eq 0 ] && killproc $GLUSTERFS &> /dev/null
    }

So it kills the glusterd ($BASE), glusterfsd, *and* glusterfs processes. That last one unmounts any mounted gluster filesystems. If one wanted to, e.g., shut down one server node of a replicated pair *but* still access the glusterfs mount from that node, one would have to remount the FS after doing a "service glusterd stop".

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
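In other words, on such a node the workaround looks something like this; the server address, volume name, and mount point are examples:

    service glusterd stop                              # also kills the local glusterfs client, unmounting the volume
    mount -t glusterfs 172.19.12.2:/myvol /mnt/myvol   # remount, fetching the volfile from a surviving server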
Re: [Gluster-users] [SPAM?] Do clients need to run glusterd?
On Fri, 6 May 2011 at 5:00pm, Joshua Baker-LePain wrote:

>> Glusterd is the management daemon, and it needs to run on all the servers. If you shut down the glusterd service, it stops all volumes by killing the GlusterFS processes. This is why you will see your clients return errors. Glusterd interacts with the 'gluster' command-line tool, and is responsible for creating and starting volumes and making changes to volume configuration.
>
> I'm not seeing errors on the clients. The errors (well, warnings actually) I posted are in /var/log/glusterfs/etc-glusterfs-glusterd.vol.log on the *servers*. I can safely ignore them, then?

OK, a quick test reveals that I'm seeing this with or without glusterd running on the clients. Is there anything to that warning? Again, I'm seeing this in the server logs whenever a client mounts the filesystem (using the fuse client):

    [2011-05-06 15:20:16.451959] W [socket.c:1494:__socket_proto_state_machine] 0-socket.management: reading from socket failed. Error (Transport endpoint is not connected), peer (172.19.12.4:1022)

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
Re: [Gluster-users] [SPAM?] Do clients need to run glusterd?
On Fri, 6 May 2011 at 10:57pm, Tomasz Chmielewski wrote:

> Assuming we have a distributed replica of two gluster servers (server1, server2) and several clients (client1, client2, ..., clientN), if we use the following command line:
>
>     mount -t glusterfs server1:/test-volume /mnt/glusterfs
>
> our mount will die if server1 is offline. Similarly, if we use this entry in /etc/fstab:
>
>     server1:/test-volume /mnt/glusterfs glusterfs defaults,_netdev 0 0
>
> If server1 is offline, but server2 is online, the client will not be able to access the data. Could anyone shed some light on achieving high availability with glusterfs?

One way is to use ucarp. Create a hostname, e.g., "server". Then set up ucarp on each serverN with the IP address of "server". If server1 goes down, server2 will assume the IP address of "server" (in addition to its own). Obviously, then, use "server" in place of server1 in your mount or fstab lines.

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
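A minimal sketch of that setup, with placeholder addresses, interface, and shared secret; each server runs ucarp with its own --srcip (-s):

    # /etc/hosts on every client and server
    10.0.0.10   server

    # on server1 (server2 runs the same command with -s 10.0.0.12)
    ucarp -i eth0 -v 42 -p sharedsecret -a 10.0.0.10 -s 10.0.0.11 \
        --upscript=/etc/vip-up.sh --downscript=/etc/vip-down.sh &

    # ucarp passes the interface name as $1 to the scripts:
    # /etc/vip-up.sh:   ip addr add 10.0.0.10/24 dev "$1"
    # /etc/vip-down.sh: ip addr del 10.0.0.10/24 dev "$1"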
Re: [Gluster-users] [SPAM?] Do clients need to run glusterd?
On Fri, 6 May 2011 at 1:44pm, Vikas Gorur wrote:

> The clients do *not* need to run glusterd.

This is what I hoped to hear. But...

> Glusterd is the management daemon, and it needs to run on all the servers. If you shut down the glusterd service, it stops all volumes by killing the GlusterFS processes. This is why you will see your clients return errors. Glusterd interacts with the 'gluster' command-line tool, and is responsible for creating and starting volumes and making changes to volume configuration.

I'm not seeing errors on the clients. The errors (well, warnings actually) I posted are in /var/log/glusterfs/etc-glusterfs-glusterd.vol.log on the *servers*. I can safely ignore them, then?

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
[Gluster-users] Do clients need to run glusterd?
The glusterfs-core RPM installs the glusterd daemon and sets it to start automatically. I figured this wasn't needed on clients, so I turned it off. However, I now see tons of messages like this in the logs of the servers:

    [2011-05-06 00:03:57.253243] W [socket.c:1494:__socket_proto_state_machine] 0-socket.management: reading from socket failed. Error (Transport endpoint is not connected), peer (172.19.12.160:1022)

where 172.19.12.160 is an example client address. The simple tests I've run seem to work (file creation, can see new files from other clients, etc.). But the warning message gave me pause.

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
Re: [Gluster-users] SRPMs?
On Wed, 13 Apr 2011 at 3:23pm, Bernard Li wrote:

> On Wed, Apr 13, 2011 at 2:21 PM, Joshua Baker-LePain wrote:
>> I figured that was the case, and it's easy enough to tweak the 3.1.1 spec file to build with the 3.1.4 tarball. But there are enough changes and things moving about that it's nice to have an "official" spec file to work from.
>
> The tarball includes the spec file; you can build (S)RPMs by running:
>
>     rpmbuild -ta glusterfs-3.1.4.tar.gz

Well, color me embarrassed -- so it does. Thanks for pointing that out.

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
Re: [Gluster-users] SRPMs?
On Wed, 13 Apr 2011 at 2:11pm, Vijay Bellur wrote:

> Thanks for bringing this to our notice. It is not intentional, and the RPMs are being built from the same tarball available on the download site.

I figured that was the case, and it's easy enough to tweak the 3.1.1 spec file to build with the 3.1.4 tarball. But there are enough changes and things moving about that it's nice to have an "official" spec file to work from. Thanks.

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
[Gluster-users] SRPMs?
I can't find SRPMs for any versions after 3.1.1 on download.gluster.com. Is this a conscious decision (if so, what's the rationale?) or just an oversight? Thanks.

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
Re: [Gluster-users] Possible workaround for a problem with a "permissions" or "file access" error
On Mon, 20 Dec 2010 at 3:49pm, Joe Landman wrote:

> On 12/20/2010 03:43 PM, Joshua Baker-LePain wrote:
>> On Mon, 20 Dec 2010 at 3:36pm, Joe Landman wrote:
>>> Has someone somewhere compiled a complete list of the settable config elements and their meanings? I don't know if the settings simply set key-value pairs, or if they actually impact things. A listing of these would be nice, though I found what we found by some creative guessing based upon older configurations.
>>
>> This came up recently, and the wiki has been updated with all the options and their default values:
>
> Actually, these options weren't listed on that page. What I've noticed is that the options follow a naming convention similar to the translators of 3.0.x and below, so that performance/thingamabob becomes performance.thingamabob. And the gluster tool helps you when you make a mistake, as in

Whoops -- I missed that you were looking for NFS options rather than volume options. I absolutely agree that those should be returned by the gluster command as well.

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
Re: [Gluster-users] Possible workaround for a problem with a "permissions" or "file access" error
On Mon, 20 Dec 2010 at 3:36pm, Joe Landman wrote:

> Has someone somewhere compiled a complete list of the settable config elements and their meanings? I don't know if the settings simply set key-value pairs, or if they actually impact things. A listing of these would be nice, though I found what we found by some creative guessing based upon older configurations.

This came up recently, and the wiki has been updated with all the options and their default values:

http://www.gluster.com/community/documentation/index.php/Gluster_3.1:_Setting_Volume_Options

Also, an RFE for "gluster volume info" was submitted such that it always displays the volume options in use. Currently it only displays values that have been changed from their defaults.

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
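To make the "only changed options" behavior concrete -- a sketch with a placeholder volume name; performance.cache-size is one of the documented options:

    gluster volume set myvol performance.cache-size 256MB
    gluster volume info myvol    # the option now appears under "Options Reconfigured"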
Re: [Gluster-users] GlusterFS 3.1.1 - local volume mount
On Thu, 9 Dec 2010 at 4:29pm, Jacob Shucart wrote:

> That is correct. If at the time you try to mount the volume the server is down, then you won't be able to mount the volume in the first place. If this is a concern, you can set up something like ucarp to handle IP failover so your mount command will work, but usually mounting is something that is not done frequently, right? Or you could have a round-robin DNS entry that points to all of your storage nodes so that your mount command is essentially hitting a different server each time.

I'll have several hundred clients which reboot at random times (cluster nodes which get kernel updates and then reboot when the jobs running on them have finished, as well as the usual hardware maintenance on random nodes), so it's better not to have to worry about whether or not a particular gluster server is down at any point in time. It's also a bit of a bummer that, for this particular situation, the deprecated volume file approach is more robust than the recommended one. I do understand that there are other advantages, though.

Since I don't do DNS on the cluster, I'll have to look into ucarp. Thanks for the info.

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
Re: [Gluster-users] GlusterFS 3.1.1 - local volume mount
On Thu, 9 Dec 2010 at 4:19pm, Jacob Shucart wrote:

> When mounting as a glusterfs, the mount command below is used to establish the connection to the cluster, but once the connection is established the client system has connections open to all of the servers, not just the one used to mount, so there is still no single point of failure.

But if the server listed in the mount command is down *when the client attempts to establish the connection* (i.e., at mount time), then the client won't be able to discover the rest of the cluster and the mount will fail. Right?

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
Re: [Gluster-users] GlusterFS 3.1.1 - local volume mount
On Thu, 9 Dec 2010 at 4:05pm, Jacob Shucart wrote:

> With Gluster 3.1.1, you no longer need to do anything with the vol files. If you create a volume like you did below, then you simply mount it like:
>
>     mount -t glusterfs 172.16.16.50:/pool /pool/mount
>
> Gluster automatically gets the volume information when mounting. This is described at:
> http://www.gluster.com/community/documentation/index.php/Gluster_3.1:_Manually_Mounting_Volumes

This brings up an issue I've been wondering about. With the old-style (vol-file-on-the-clients based) mounting, there was no single point of failure when it came to mounting (assuming, of course, that if the first server listed in the vol file was down, the gluster client would try the next one). In the syntax quoted above, if that particular server happens to be down, the mount will fail. Is there any way in 3.1.1 to avoid that SPOF?

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
Re: [Gluster-users] Updated documentation - volume options
On Wed, 8 Dec 2010 at 1:06pm, Craig Carl wrote:

> Joshua - 'gluster volume info VOLNAME' or 'gluster volume info all'. Only options that are not at their default value are displayed.

Ah, that would explain why I didn't see them there. IMHO, there should be a way to see the values regardless of whether or not they differ from the defaults. With the defaults now on the wiki, there's maybe less impetus for this. But net access from servers isn't always guaranteed.

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
Re: [Gluster-users] Updated documentation - volume options
On Tue, 7 Dec 2010 at 9:46pm, Craig Carl wrote:

> We have had several requests to update the volume options documentation with more details. The first version of the updated documentation is available at http://www.gluster.com/community/documentation/index.php/Gluster_3.1:_Setting_Volume_Options. Let us know if you see anything that needs to be changed. Please also remember that the documentation wiki is publicly editable; helping with documentation is a great way for community members to support Gluster.

Maybe I'm missing it, but there doesn't seem to be any way to query the options of a configured volume. Is this something that's on the roadmap, or should I file an RFE?

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
Re: [Gluster-users] Encryption?
On Tue, 19 Oct 2010 at 3:46pm, Christopher J Bidwell wrote:

> Does GlusterFS have any form of encryption with it? If not, will it soon be available?

Can you just use encrypted block devices to build your Gluster volumes?

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
[Gluster-users] Release announcements?
Is there a channel I'm not seeing to get release announcements? I only found out 3.0.5 was out by seeing it recommended in mailing list threads. Just wondering...

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
Re: [Gluster-users] Small File and "ls" performance ..
On Wed, 5 May 2010 at 1:00am, Tejas N. Bhise wrote:

> We have recently made some code changes in an effort to improve small file and 'ls' performance. The patches are:
>
>     selective readdirp - http://patches.gluster.com/patch/3203/
>     dht lookup revalidation optimization - http://patches.gluster.com/patch/3204/
>     updated write-behind default values - http://patches.gluster.com/patch/3223/
>
> DISCLAIMER: These patches have not made it to any supported release and have not been tested yet. Don't use them in production. I am providing this information only as some advance notice for those in the community who might be interested in trying out these changes and providing feedback.

I would like to cast my vote firmly in favor of these patches. I did a bit of a torture test with my scratch gluster setup:

    Storage bricks: 10 HP DL160 G5s, each with a single 7200 RPM SATA disk
    Client:         same hardware
    Network:        everything connected via GbE to the same switch
    Gluster setup:  Gluster 3.0.4, standard replicate-then-distribute setup
                    created via gluster-volgen

Test: an old version of <http://people.redhat.com/dledford/memtest.shtml>. This script unpacks *lots* of copies of the Linux kernel tarball (the count is based on memory size -- for this client, it was 98), diffs all of them against the first copy, and then removes them all. So, lots of small files.

    Length of 1 run before patches: 5622m56.020s
    Length of 1 run after patches:   711m54.006s

Wow. And the run with the test patches didn't generate any errors.

> Once these are fully tested they will make it to an officially supported release.

I rather look forward to that.

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
[Gluster-users] Newbie questions
I'm a Gluster newbie trying to get myself up to speed. I've been through the bulk of the website docs and I'm in the midst of some small (although increasing) scale test setups. But I wanted to poll the list's collective wisdom on how best to fit Gluster into my setup.

As background, I currently have over 550 nodes with over 3000 cores in my (SGE scheduled) cluster, and we expand on a roughly biannual basis. The cluster is all gigabit ethernet -- each rack has a switch, and these switches each have 4-port trunks to our central switch. Despite the number of nodes in each rack, these trunks are not currently oversubscribed. The cluster is shared among many research groups, and the vast majority of the jobs are embarrassingly parallel. Our current storage is an active-active pair of NetApp FAS3070s with a total of 8 shelves of disks. Unsurprisingly, it's fairly easy for any one user to flatten either head (or both) of the NetApp.

I'm looking at Gluster for 2 purposes:

1) To host our "database" volume. This volume has copies of several protein and gene databases (PDB, UniProt, etc). The databases generally consist of tens of thousands of small (a few hundred KB at most) files. Users often start array jobs with hundreds or thousands of tasks, each task of which accesses many of these files.

2) To host a cluster-wide scratch space. Users waste a lot of time (and bandwidth) copying (often temporary) results back and forth between the network storage and the nodes' scratch disks. And scaling the NetApp is difficult, not least because it is rather difficult to convince PIs to spring for storage rather than more cores.

For purpose 1, clearly I'm looking at a replicated volume. For purpose 2, I'm assuming that distributed is the way to go (rather than striped), although for reliability reasons I'd likely go replicated then distributed. For storage bricks, I'm looking at something like HP's DL180 G6, where I would have 25 internal SAS disks (or, alternatively, I could put the same number in a SAS-attached external chassis).

In addition to any general advice folks could give, I have these specific questions:

1) My initial leaning would be to RAID10 the disks at the server level, and then use the RAID volumes as gluster exports. But I could also see running the disks in JBOD mode and doing all the redundancy at the Gluster level. The latter would seem to make management (and, e.g., hot swap) more difficult, but is it preferred from a Gluster perspective? How difficult would it make disk and/or brick maintenance?

2) Is it frowned upon to create 2 volumes out of the same physical set of disks? I'd like to maximize the spindle count in both volumes (especially the scratch volume), but will it overly degrade performance? Would it be better to simply create one replicated and distributed volume and use that for both of the above purposes?

3) Is it crazy to think of doing a distributed (or NUFA) volume with the scratch disks in the whole cluster? Especially given that we have nodes of many ages and see not infrequent node crashes due to bad memory/HDDs/user code?

If you've made it this far, thanks very much for reading. Any and all advice (and/or pointers at more documentation) would be much appreciated.

-- Joshua Baker-LePain, QB3 Shared Cluster Sysadmin, UCSF
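For reference, the replicated-then-distributed layout described above is created in the 3.x CLI roughly like this -- a sketch with placeholder hostnames, volume name, and brick paths. With "replica 2", consecutive bricks on the command line form the replica pairs, and the volume is distributed across those pairs:

    gluster volume create scratch replica 2 \
        server1:/export/brick1 server2:/export/brick1 \
        server3:/export/brick1 server4:/export/brick1
    gluster volume start scratch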