Re: [Gluster-users] gluster-fuse and rebalance (Re: (3.1.X) add-brick during re-balancing)
Hi Tomo,

That is correct. These are harmless, and if you still want to migrate these files, then use the force option.

With regards,
Shishir

- Original Message -
From: "Tomoaki Sato"
To: "Shishir Gowda"
Cc: gluster-users@gluster.org, "Amar Tumballi"
Sent: Thursday, June 28, 2012 10:23:59 AM
Subject: Re: [Gluster-users] gluster-fuse and rebalance (Re: (3.1.X) add-brick during re-balancing)

Shishir,

Thank you for providing the requested information. I saw a non-zero 'failures' count in the output of 'gluster volume rebalance status' and saw the following log messages. Let me confirm that these are harmless.

/var/log/glusterfs/vol1-rebalance.log:[2012-06-28 12:49:06.381196] W [dht-rebalance.c:353:__dht_check_free_space] 0-vol1-dht: data movement attempted from node (vol1-client-0) with higher disk space to a node (vol1-client-2) with lesser disk space (/dir1/file-1)

Regards,
Tomo

(2012/06/28 13:29), Shishir Gowda wrote:
> Hi Tomo,
>
> gluster volume rebalance will re-distribute data only when the destination server/brick has more available space (also taking into account the file to be migrated) than the source server/brick. This is done to maintain a balance across the bricks/servers.
>
> A force option would bypass this check, and would migrate the file as long as the destination has enough space to accommodate the new file.
>
> With regards,
> Shishir
>
> - Original Message -
> From: "Tomoaki Sato"
> To: "Shishir Gowda"
> Cc: gluster-users@gluster.org, "Amar Tumballi"
> Sent: Thursday, June 28, 2012 9:33:12 AM
> Subject: Re: [Gluster-users] gluster-fuse and rebalance (Re: (3.1.X) add-brick during re-balancing)
>
> Shishir,
>
> Thank you for your prompt reply. When should I specify the 'force' option for 'gluster volume rebalance start'?
>
> -- excerpt from manual page --
> 8.5.2. Rebalancing Volume to Fix Layout and Migrate Data
>
> After expanding or shrinking a volume (using the add-brick and remove-brick commands respectively), you need to rebalance the data among the servers.
>
> To rebalance a volume to fix layout and migrate the existing data:
>
> • Start the rebalance operation on any one of the servers using the following command:
>   # gluster volume rebalance VOLNAME start
>   For example:
>   # gluster volume rebalance test-volume start
>   Starting rebalancing on volume test-volume has been successful
>
> • Start the migration operation forcefully on any one of the servers using the following command:
>   # gluster volume rebalance VOLNAME start force
>   For example:
>   # gluster volume rebalance test-volume start force
>   Starting rebalancing on volume test-volume has been successful
> -- --
>
> Regards,
>
> Tomo
>
> (2012/06/27 17:32), Shishir Gowda wrote:
>> Hi Tomo,
>>
>> That is correct. The gluster volume rebalance is no longer dependent on the gluster-fuse package.
>>
>> With regards,
>> Shishir
>>
>> - Original Message -
>> From: "Tomoaki Sato"
>> To: "Amar Tumballi"
>> Cc: gluster-users@gluster.org
>> Sent: Wednesday, June 27, 2012 1:36:44 PM
>> Subject: [Gluster-users] gluster-fuse and rebalance (Re: (3.1.X) add-brick during re-balancing)
>>
>> Amar,
>>
>> Let me confirm that the gluster-fuse package is not required for NFS-server use of GlusterFS-3.3.0, as 'gluster volume rebalance' does not use FUSE.
>>
>> Regards,
>>
>> Tomo
>>
>> (2011/08/29 15:09), Amar Tumballi wrote:
>>>
>>> 1. The fuse module is required for 'gluster volume rebalance'.
>>>
>>> 2. 'gluster volume rebalance start' and 'gluster volume rebalance stop' must be issued on exactly the same node.
>>>
>>> By the way, why can multiple rebalance operations on a volume exist in a cluster?
>>>
>>>
>>> An enhancement to make the 'rebalance' operation co-operate with other peers is in progress, and will be available in the 3.3.0 release. For now, yes, there is a possibility that a single volume can have multiple rebalance operations going on (on different machines). Even though it should not break anything, it will degrade performance. For now, preventing this should be taken care of by the admin.
>>>
>>> Regards,
>>> Amar

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
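A quick way to decide whether a non-zero 'failures' count is only the harmless free-space skip discussed above is to sum the per-node failure counts before retrying with the force option. A minimal sketch; the column layout of the sample status output below is an assumption and may differ between GlusterFS releases:

```shell
# Hypothetical sample of 'gluster volume rebalance status' output; the real
# column layout may differ between releases.
status_sample='Node Rebalanced-files size scanned failures status
node1 120 1.2GB 4000 0 completed
node2 98 0.9GB 3800 3 completed'

# Sum the failures column (5th field) across the data rows.
total_failures=$(echo "$status_sample" | awk 'NR>1 {sum += $5} END {print sum+0}')
echo "total failures: $total_failures"

# If the failures are only the free-space skips logged by
# __dht_check_free_space, a forced migration can be retried:
#   gluster volume rebalance vol1 start force
```

In practice you would pipe the live `gluster volume rebalance VOLNAME status` output into the same awk filter instead of the sample string.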
[Gluster-users] Gluster 3.2.5 client crash report
Hello there,

Our gluster client crashed twice last night. The gluster partition was unavailable for a long time and I had to remount it manually. In the gluster log, I saw entries like these:

pending frames:
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2012-06-28 00:19:24
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.2.5
/lib64/libc.so.6[0x3cf0c32900]

Here is my "gluster volume info" output:

Volume Name: xx
Type: Distribute
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: Gnode1:/mnt/store
Brick2: Gnode2:/mnt/store2
Options Reconfigured:
nfs.addr-namelookup: off
nfs.rpc-auth-allow: 10.0.0.245,10.0.0.244,10.0.0.247,10.0.0.54,10.0.0.55
auth.allow: 10.*
cluster.min-free-disk: 5
performance.io-thread-count: 64
performance.cache-size: 512MB
nfs.disable: off
performance.write-behind-window-size: 4MB
cluster.data-self-heal: off
performance.stat-prefetch: off

I googled around and found some people having the same problem as us, but haven't found a solution or any clues about what happened yet. Is this a bug? If so, could you please tell me how I can work around it?

Many thanks
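Until the crash itself is diagnosed, a small watchdog can detect the dropped fuse mount and remount it instead of waiting for a manual fix. A hedged sketch; the mount point and volume spec are assumptions for illustration, and the remount commands are left commented so the check can be dry-run:

```shell
# Assumed mount point and volume; adjust for your setup.
MNT=/mnt/gluster
VOLSPEC=Gnode1:/xx

# Returns 0 if the glusterfs mount is present and responsive.
mount_ok() {
  mountpoint -q "$MNT" && timeout 5 ls "$MNT" >/dev/null 2>&1
}

if ! mount_ok; then
  echo "stale or missing mount, remounting $MNT"
  # A real cron job would run the next two lines instead of just reporting:
  # umount -l "$MNT"                       # lazy-unmount the wedged fuse mount
  # mount -t glusterfs "$VOLSPEC" "$MNT"
fi
```

Run from cron every minute or so; the `timeout` around `ls` catches the case where the mount exists but the fuse process is hung.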
Re: [Gluster-users] Gluster-3.3 Puzzler
Which OS are you using? I believe 3.3 will install but won't run on older CentOS releases (5.7/5.8) due to libc skew. And you did 'modprobe fuse' before you tried to mount it...?

hjm

On Wed, Jun 27, 2012 at 12:46 PM, Robin, Robin wrote:
> Hi,
>
> Just updated to Gluster-3.3; I can't seem to mount my initial test volume. I did the mount on the gluster server itself (which works on Gluster-3.2).
>
> # rpm -qa | grep -i gluster
> glusterfs-fuse-3.3.0-1.el6.x86_64
> glusterfs-server-3.3.0-1.el6.x86_64
> glusterfs-3.3.0-1.el6.x86_64
>
> # gluster volume info all
>
> Volume Name: vmvol
> Type: Replicate
> Volume ID: b105560a-e157-4b94-bac9-39378db6c6c9
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: mualglup01:/mnt/gluster/vmvol001
> Brick2: mualglup02:/mnt/gluster/vmvol001
> Options Reconfigured:
> auth.allow: 127.0.0.1,134.53.*,10.*
>
> ## mount -t glusterfs mualglup01.mcs.muohio.edu:vmvol /mnt/test (did this on the gluster machine itself)
>
> I'm getting the following in the logs:
> [2012-06-27 15:40:52.116160] I [rpc-clnt.c:1660:rpc_clnt_reconfig] 0-vmvol-client-0: changing port to 24009 (from 0)
> [2012-06-27 15:40:52.116479] I [rpc-clnt.c:1660:rpc_clnt_reconfig] 0-vmvol-client-1: changing port to 24009 (from 0)
> [2012-06-27 15:40:56.055124] I [client-handshake.c:1636:select_server_supported_programs] 0-vmvol-client-0: Using Program GlusterFS 3.3.0, Num (1298437), Version (330)
> [2012-06-27 15:40:56.055575] I [client-handshake.c:1433:client_setvolume_cbk] 0-vmvol-client-0: Connected to 10.0.72.132:24009, attached to remote volume '/mnt/gluster/vmvol001'.
> [2012-06-27 15:40:56.055610] I [client-handshake.c:1445:client_setvolume_cbk] 0-vmvol-client-0: Server and Client lk-version numbers are not same, reopening the fds
> [2012-06-27 15:40:56.055682] I [afr-common.c:3627:afr_notify] 0-vmvol-replicate-0: Subvolume 'vmvol-client-0' came back up; going online.
> [2012-06-27 15:40:56.055871] I [client-handshake.c:453:client_set_lk_version_cbk] 0-vmvol-client-0: Server lk version = 1
> [2012-06-27 15:40:56.057871] I [client-handshake.c:1636:select_server_supported_programs] 0-vmvol-client-1: Using Program GlusterFS 3.3.0, Num (1298437), Version (330)
> [2012-06-27 15:40:56.058277] I [client-handshake.c:1433:client_setvolume_cbk] 0-vmvol-client-1: Connected to 10.0.72.133:24009, attached to remote volume '/mnt/gluster/vmvol001'.
> [2012-06-27 15:40:56.058304] I [client-handshake.c:1445:client_setvolume_cbk] 0-vmvol-client-1: Server and Client lk-version numbers are not same, reopening the fds
> [2012-06-27 15:40:56.063514] I [fuse-bridge.c:4193:fuse_graph_setup] 0-fuse: switched to graph 0
> [2012-06-27 15:40:56.063638] I [client-handshake.c:453:client_set_lk_version_cbk] 0-vmvol-client-1: Server lk version = 1
> [2012-06-27 15:40:56.063802] I [fuse-bridge.c:4093:fuse_thread_proc] 0-fuse: unmounting /mnt/test
> [2012-06-27 15:40:56.064207] W [glusterfsd.c:831:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x35f0ce592d] (-->/lib64/libpthread.so.0() [0x35f14077f1] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xdd) [0x405cfd]))) 0-: received signum (15), shutting down
> [2012-06-27 15:40:56.064250] I [fuse-bridge.c:4643:fini] 0-fuse: Unmounting '/mnt/test'.
>
> The server and client should be the same version (as attested by the rpm). I've seen that some other people are getting the same errors in the archive; no solutions were offered.
>
> Any help is appreciated.
>
> Thanks,
> Robin

--
Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine [m/c 2225] / 92697
Google Voice Multiplexer: (949) 478-4487
415 South Circle View Dr, Irvine, CA, 92697 [shipping]
MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)
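For anyone hitting the same immediate-unmount symptom, the checks Harry suggests can be run as a short pre-mount checklist. This is a sketch of standard admin commands; the volume name, host name, and client log path below are taken from this thread, and the log file naming is an assumption based on the usual mount-point-to-log mapping:

```shell
# Sanity checks before mounting a glusterfs 3.3 volume over fuse.
lsmod | grep -qw fuse || modprobe fuse          # fuse module must be loaded first
rpm -q glusterfs glusterfs-fuse                 # client packages should match the server version
gluster volume info vmvol | grep '^Status'      # the volume should be Started
mount -t glusterfs mualglup01.mcs.muohio.edu:vmvol /mnt/test
mount | grep /mnt/test || tail -n 20 /var/log/glusterfs/mnt-test.log
```

If the final `mount | grep` finds nothing, the tail of the client log usually shows why the fuse bridge unmounted itself right after connecting.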
Re: [Gluster-users] about HA infrastructure for hypervisors
Why don't you have KVM running on the Gluster bricks as well? We have a 4-node cluster (each with 4x 300GB 15k SAS drives in RAID10) and 10 gigabit SFP+ Ethernet (with redundant switching). Each node participates in a distribute+replicate Gluster namespace and runs KVM. We found this to be the most efficient (and fastest) way to run the cluster.

This works well for us, although (due to Gluster using fuse) it isn't as fast as we would like. We are currently waiting for the KVM driver that has been discussed a few times recently; that should make a huge difference to performance for us.

Cheers,
Thomas

-Original Message-
From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of Nicolas Sebrecht
Sent: Wednesday, 27 June 2012 9:13 PM
To: Gerald Brandt
Cc: gluster-users
Subject: [Gluster-users] Re: about HA infrastructure for hypervisors

The 27/06/12, Gerald Brandt wrote:

> Hi,
>
> If your switch breaks, you are done. Put each Gluster server on its own switch.

Right. Handling switch failures isn't what I'm most worried about, but I guess that I'll need to add a network link between the KVM hypervisors, too. Thanks for this tip, though.

> [diagram (reconstructed): two switches; each KVM hypervisor uplinks to one switch, each Glusterfs 3.3 server uplinks to both switches, and the two hypervisors share a direct link]

--
Nicolas Sebrecht
Re: [Gluster-users] about HA infrastructure for hypervisors
On Wed, 27 Jun 2012, Brian Candler wrote:

> For a 16-disk array, your IOPS is not bad. But are you actually storing a VM image on it, and then doing lots of I/O within that VM (as opposed to mounting the volume from within the VM)? If so, can you specify your exact configuration, including OS and kernel versions?

2.6.32-220.23.1.el6.x86_64

[root@virt01 ~]# gluster volume info share

Volume Name: share
Type: Distributed-Replicate
Volume ID: 09bfc0c3-e3d4-441b-af6f-acd263884920
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: 10.59.0.11:/export
Brick2: 10.59.0.12:/export
Brick3: 10.59.0.13:/export
Brick4: 10.59.0.14:/export
Brick5: 10.59.0.15:/export
Brick6: 10.59.0.16:/export
Brick7: 10.59.0.17:/export
Brick8: 10.59.0.18:/export
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
nfs.nlm: off
auth.allow: *
nfs.disable: off

> I did my tests on two quad-core/8GB nodes, 12 disks in each (md RAID10), running ubuntu 12.04, and 10GE RJ45 direct connection. The disk arrays locally perform at 350MB/s for streaming writes.

Well, I would first ditch Ubuntu and install CentOS, but... My disk arrays are slow:

[root@virt01 ~]# dd if=/dev/zero of=foo bs=1M count=5k
5120+0 records in
5120+0 records out
5368709120 bytes (5.4 GB) copied, 26.8408 s, 200 MB/s

> But doing a dd if=/dev/zero bs=1024k within a VM, whose image was mounted on glusterfs, I was getting only 6-25MB/s.

[root@test ~]# dd if=/dev/zero of=foo bs=1M count=5k
5120+0 records in
5120+0 records out
5368709120 bytes (5.4 GB) copied, 172.706 s, 31.1 MB/s

While this is slower than I would like to see, it's faster than what I was getting to my NetApp, and it scales better! :)

Nathan Stratton
nathan at robotics.net
http://www.robotics.net
[Gluster-users] Gluster Barclamp?
Hey all, does anyone know if there has been a barclamp module (for the Dell Crowbar project) built for Gluster? I'm guessing if there is not one that RedHat might not be such a fan of automating this step, but I'm crossing my fingers!

Justice London
Re: [Gluster-users] about HA infrastructure for hypervisors
On Wed, Jun 27, 2012 at 03:07:21PM -0500, Nathan Stratton wrote:
> > I've made a test setup like this, but unfortunately I haven't yet been able to get half-decent performance out of glusterfs 3.3 as a KVM backend. It may work better if you use local disk for the VM images, and within the VM mount the glusterfs volume for application data.
>
> What is considered half-decent? I have an 8-node distribute+replicate setup and I am getting about 65MB/s and about 1.5K IOPS. Considering that I am only using a single two-disk SAS stripe in each host, I think that is not bad.

For a 16-disk array, your IOPS is not bad. But are you actually storing a VM image on it, and then doing lots of I/O within that VM (as opposed to mounting the volume from within the VM)? If so, can you specify your exact configuration, including OS and kernel versions?

I did my tests on two quad-core/8GB nodes, 12 disks in each (md RAID10), running ubuntu 12.04, and a 10GE RJ45 direct connection. The disk arrays locally perform at 350MB/s for streaming writes. But doing a dd if=/dev/zero bs=1024k within a VM, whose image was mounted on glusterfs, I was getting only 6-25MB/s.

http://gluster.org/pipermail/gluster-users/2012-June/010553.html
http://gluster.org/pipermail/gluster-users/2012-June/010560.html
http://gluster.org/pipermail/gluster-users/2012-June/010570.html

I get much better performance on locally-attached storage with O_DIRECT (kvm option "cache=none"), but have been unable to get O_DIRECT to work with glusterfs. After a kernel upgrade (to a 3.4+ kernel which supports O_DIRECT for fuse), and using the mount option direct-io-mode=enable, the VM simply wouldn't boot:

http://gluster.org/pipermail/gluster-users/2012-June/010572.html
http://gluster.org/pipermail/gluster-users/2012-June/010573.html

Hence I'm keen to learn the recipe for good performance with glusterfs storing VM images; if it exists, it doesn't seem to be well documented at all.

Regards,

Brian.
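For completeness, the cache mode Brian mentions is set per-drive on the qemu/KVM command line. A hedged sketch; the image path is hypothetical, and since cache=none (O_DIRECT) was rejected by the fuse client in the tests above, writethrough is shown as the fallback:

```shell
IMG=/mnt/gluster/vm/test.img   # hypothetical image path on the fuse mount

# cache=none requests O_DIRECT and may fail on a fuse-mounted image;
# cache=writethrough is a safer starting point for benchmarking.
qemu-kvm -m 2048 -smp 2 \
  -drive file="$IMG",if=virtio,cache=writethrough \
  -net nic,model=virtio -net user
```

Comparing the same in-guest dd run under cache=writethrough, writeback, and (where it works) none is a quick way to isolate how much of the slowdown is the fuse layer versus the cache mode.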
Re: [Gluster-users] about HA infrastructure for hypervisors
On Wed, 27 Jun 2012, Brian Candler wrote:

> I've made a test setup like this, but unfortunately I haven't yet been able to get half-decent performance out of glusterfs 3.3 as a KVM backend. It may work better if you use local disk for the VM images, and within the VM mount the glusterfs volume for application data.

What is considered half-decent? I have an 8-node distribute+replicate setup and I am getting about 65MB/s and about 1.5K IOPS. Considering that I am only using a single two-disk SAS stripe in each host, I think that is not bad.

> Alternatively, look at something like ganeti (which by default runs on top of drbd+LVM, although you can also use it to manage a cluster which uses a shared file store backend like gluster). Maybe 3.3.1 will be better. But today, your investment in SSDs is quite likely to be wasted :-(
>
>> The idea is to have HA if either one KVM hypervisor or one Glusterfs server stop working (failure, maintenance, etc).
>
> You'd also need some mechanism for starting each VM on node B if node A fails. You can probably script that, although there are lots of hazards for the unwary. Maybe better to have the failover done manually.

Also check out oVirt; it integrates with Gluster and provides HA.

>> 2. We still didn't decide what physical network to choose between FC, FCoE and Infiniband.
>
> Have you ruled out 10G ethernet? If so, why?

I agree, we went all 10GBase-T.

> (note: using SFP+ ports, either with fibre SFP+s or SFP+ coax cables, gives much better latency than 10G over RJ45/CAT6)

Actually, with the new switches like Arista this is less of an issue.

>> 3. Would it be better to split the Glusterfs namespace into two gluster volumes (one for each hypervisor), each running on a Glusterfs server (for the normal case where all servers are running)?
>
> I don't see how that would help - I expect you would mount both volumes on both KVM nodes anyway, to allow you to do live migration.
Yep.

Nathan Stratton
nathan at robotics.net
http://www.robotics.net
Re: [Gluster-users] about HA infrastructure for hypervisors
On Wed, Jun 27, 2012 at 10:06:30AM +0200, Nicolas Sebrecht wrote:
> We are going to try glusterfs for our new HA servers.
>
> To get full HA, I'm thinking of building it this way:
>
> [diagram (reconstructed): two KVM hypervisors and two Glusterfs 3.3 servers (A and B), all connected through a single switch]

I've made a test setup like this, but unfortunately I haven't yet been able to get half-decent performance out of glusterfs 3.3 as a KVM backend. It may work better if you use local disk for the VM images, and within the VM mount the glusterfs volume for application data.

Alternatively, look at something like ganeti (which by default runs on top of drbd+LVM, although you can also use it to manage a cluster which uses a shared file store backend like gluster).

Maybe 3.3.1 will be better. But today, your investment in SSDs is quite likely to be wasted :-(

> The idea is to have HA if either one KVM hypervisor or one Glusterfs server stop working (failure, maintenance, etc).

You'd also need some mechanism for starting each VM on node B if node A fails. You can probably script that, although there are lots of hazards for the unwary. Maybe better to have the failover done manually.

> 2. We still didn't decide what physical network to choose between FC, FCoE and Infiniband.

Have you ruled out 10G ethernet? If so, why? (note: using SFP+ ports, either with fibre SFP+s or SFP+ coax cables, gives much better latency than 10G over RJ45/CAT6)

> 3. Would it be better to split the Glusterfs namespace into two gluster volumes (one for each hypervisor), each running on a Glusterfs server (for the normal case where all servers are running)?

I don't see how that would help - I expect you would mount both volumes on both KVM nodes anyway, to allow you to do live migration.
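The "start each VM on node B if node A fails" mechanism Brian describes can be scripted crudely. A dry-run sketch with assumed peer and guest names; note the hazards he warns about: without proper fencing, a network blip can start the same guest on both nodes and corrupt its image on the shared volume:

```shell
PEER=kvm-a            # hypothetical name of the other hypervisor
GUESTS="web01 db01"   # hypothetical libvirt guest names

# If the peer stops answering pings, report (or start) its guests locally.
if ! ping -c 3 -W 2 "$PEER" >/dev/null 2>&1; then
  for g in $GUESTS; do
    echo "would start $g on $(hostname)"
    # virsh start "$g"   # images live on the shared gluster volume
  done
fi
```

A ping check alone is a very weak failure detector; for production, manual failover (as Brian suggests) or a proper cluster manager with fencing is safer.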
Re: [Gluster-users] about HA infrastructure for hypervisors
If you do decide to use 2 switches:

1) For the KVM hosts, use 2 NICs, bridge them, and run KVM on the bridge (usually br0), with each NIC linked to a different switch.
2) Interlink those switches.

Dan

-----Original Message-----
From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of Nicolas Sebrecht
Sent: Wednesday, June 27, 2012 4:13 AM
To: Gerald Brandt
Cc: gluster-users
Subject: [Gluster-users] Re: about HA infrastructure for hypervisors

The 27/06/12, Gerald Brandt wrote:

> Hi,
>
> If your switch breaks, you are done. Put each Gluster server on its own
> switch.

Right. Handling switch failures isn't what I'm most worried about, but I guess that I'll need to add a network link between the KVM hypervisors, too. Thanks for this tip, though.

> +----------------+                +----------------+
> |                |----------------|                |
> | KVM hypervisor |---+        +---| KVM hypervisor |
> |                |   |        |   |                |
> +----------------+   |        |   +----------------+
>                      |        |
>                   +--------+ +--------+
>                   | switch | | switch |
>                   +--------+ +--------+
>                     |    |     |    |
> +---------------+   |    |     |    |   +---------------+
> |               |---+    |     |    +---|               |
> | Glusterfs 3.3 |--------)-----+        | Glusterfs 3.3 |
> |   server A    |        |              |   server B    |
> |               |        +--------------|               |
> +---------------+                       +---------------+

--
Nicolas Sebrecht
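Dan's two points can be sketched for a RHEL6-era KVM host. This is an assumption-laden illustration, not from the thread: the interface names (eth0/eth1) are placeholders, and bridge-utils (brctl) is assumed to be installed.

```shell
# One bridge over two NICs, each NIC cabled to a different switch.
# eth0/eth1 are placeholder interface names.
brctl addbr br0
brctl addif br0 eth0      # uplink to switch 1
brctl addif br0 eth1      # uplink to switch 2

# With the two switches interlinked (Dan's point 2), br0 plus the
# interlink forms a loop, so spanning tree must be enabled:
brctl stp br0 on

ip link set br0 up
# KVM guests then attach their tap devices to br0.
```

A common loop-free alternative is an active-backup bond (bond0) enslaving both NICs, with the bridge built on top of the bond instead of on the raw NICs.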
Re: [Gluster-users] Gluster 3.3.0 and VMware ESXi 5
On Wed, Jun 27, 2012 at 12:23 AM, Brian Candler wrote:

> On Tue, Jun 26, 2012 at 02:08:42PM -0700, Simon Blackstein wrote:
> >    Thanks Brian.
> >    Yes, got rid of the .glusterfs and .vSphereHA directory that VMware
> >    makes. Rebooted, so yes it was remounted and used a different mount
> >    point name. Also got rid of attributes I found set on the root:
> >    setfattr -x trusted.gfid / && setfattr -x trusted.glusterfs.dht /
> >    Any other tips? :)
>
> Not from me I'm afraid... anyone else?

Looks like Simon already found the stale attribute on the fourth node and is all set now.

Avati
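For anyone else reusing bricks, the stale attributes Simon removed can be spotted per brick with getfattr. A sketch only: /mnt/gluster/brick is a placeholder path, and the commands must run as root on each brick server.

```shell
# Dump every extended attribute on the brick root, in hex.
# A brick previously used by another volume still carries entries
# such as trusted.gfid and trusted.glusterfs.dht.
getfattr -d -m . -e hex /mnt/gluster/brick

# Clear the stale attributes before reusing the brick, as Simon did
# (he ran the equivalent against the root of the affected node):
setfattr -x trusted.gfid /mnt/gluster/brick
setfattr -x trusted.glusterfs.dht /mnt/gluster/brick
```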
[Gluster-users] Re: about HA infrastructure for hypervisors
The 27/06/12, Gerald Brandt wrote:

> Hi,
>
> If your switch breaks, you are done. Put each Gluster server on its own
> switch.

Right. Handling switch failures isn't what I'm most worried about, but I guess that I'll need to add a network link between the KVM hypervisors, too. Thanks for this tip, though.

> +----------------+                +----------------+
> |                |----------------|                |
> | KVM hypervisor |---+        +---| KVM hypervisor |
> |                |   |        |   |                |
> +----------------+   |        |   +----------------+
>                      |        |
>                   +--------+ +--------+
>                   | switch | | switch |
>                   +--------+ +--------+
>                     |    |     |    |
> +---------------+   |    |     |    |   +---------------+
> |               |---+    |     |    +---|               |
> | Glusterfs 3.3 |--------)-----+        | Glusterfs 3.3 |
> |   server A    |        |              |   server B    |
> |               |        +--------------|               |
> +---------------+                       +---------------+

--
Nicolas Sebrecht
Re: [Gluster-users] about HA infrastructure for hypervisors
----- Original Message -----
> From: "Nicolas Sebrecht"
> To: "gluster-users"
> Sent: Wednesday, June 27, 2012 3:06:30 AM
> Subject: [Gluster-users] about HA infrastructure for hypervisors
>
> Hi,
>
> We are going to try glusterfs for our new HA servers.
>
> To get full HA, I'm thinking of building it this way:
>
> +----------------+             +----------------+
> |                |             |                |
> | KVM hypervisor |----+   +----| KVM hypervisor |
> |                |    |   |    |                |
> +----------------+    |   |    +----------------+
>                       |   |
>                    +--------+
>                    | switch |
>                    +--------+
>                       |   |
> +---------------+     |   |     +---------------+
> |               |     |   |     |               |
> | Glusterfs 3.3 |-----+   +-----| Glusterfs 3.3 |
> |   server A    |               |   server B    |
> |               |               |               |
> +---------------+               +---------------+
>
> The idea is to have HA if either one KVM hypervisor or one Glusterfs
> server stops working (failure, maintenance, etc).
>
> Some points:
> - We don't care much about duplicating the network (we're going to have
>   spare materials only).
> - Glusterfs servers will use gluster replication to get HA.
> - Each Glusterfs server will have SSD disks in a RAID (1 or 10, I guess).
> - Most of the time, both KVM hypervisors will have VMs running.
>
> 1. Is this a correct/typical infrastructure?
>
> 2. We still didn't decide what physical network to choose between FC, FCoE
> and Infiniband. What would you suggest for both performance and easy
> configuration?
>
> Is it possible to use FC or FCoE for a HA Glusterfs cluster? If so, how
> to configure Glusterfs nodes?
>
> 3. Would it be better to split the Glusterfs namespace into two gluster
> volumes (one for each hypervisor), each running on a Glusterfs server
> (for the normal case where all servers are running)?
>
> Thanks,
>
> --
> Nicolas Sebrecht

Hi,

If your switch breaks, you are done. Put each Gluster server on its own switch.

> +----------------+                +----------------+
> |                |                |                |
> | KVM hypervisor |---+        +---| KVM hypervisor |
> |                |   |        |   |                |
> +----------------+   |        |   +----------------+
>                      |        |
>                   +--------+ +--------+
>                   | switch | | switch |
>                   +--------+ +--------+
>                     |    |     |    |
> +---------------+   |    |     |    |   +---------------+
> |               |---+    |     |    +---|               |
> | Glusterfs 3.3 |--------)-----+        | Glusterfs 3.3 |
> |   server A    |        |              |   server B    |
> |               |        +--------------|               |
> +---------------+                       +---------------+
Re: [Gluster-users] gluster-fuse and rebalance (Re: (3.1.X) add-brick during re-balancing)
Hi Tomo,

That is correct. The gluster volume rebalance no longer depends on the gluster-fuse package.

With regards,
Shishir

----- Original Message -----
From: "Tomoaki Sato"
To: "Amar Tumballi"
Cc: gluster-users@gluster.org
Sent: Wednesday, June 27, 2012 1:36:44 PM
Subject: [Gluster-users] gluster-fuse and rebalance (Re: (3.1.X) add-brick during re-balancing)

Amar,

Let me confirm that the gluster-fuse package is not required for NFS-server use of GlusterFS-3.3.0, as 'gluster volume rebalance' does not use FUSE.

Regards,

Tomo

(2011/08/29 15:09), Amar Tumballi wrote:
>
> 1. fuse module is required to 'gluster volume rebalance'.
>
> 2. 'gluster volume rebalance start' and 'gluster volume rebalance stop'
> must be issued on exactly the same node.
>
> By the way, why can multiple re-balancing operations on a volume exist in a
> cluster?
>
>
> An enhancement to make the 'rebalance' operation co-operate with other peers is
> in progress, and will be available in the 3.3.0 release. For now, yes, there is a
> possibility that a single volume can have multiple rebalance operations going
> on (on different machines). Even though it should not break anything, it
> will degrade performance. For now, preventing this should be taken care
> of by the admin.
>
> Regards,
> Amar
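As a recap of the commands discussed in this thread (a sketch only; 'vol1' is a placeholder volume name), the whole rebalance cycle runs server-side in 3.3 with no FUSE mount involved:

```shell
# Kick off data migration across the bricks.
gluster volume rebalance vol1 start

# Watch progress: rebalanced file counts and failures per node.
gluster volume rebalance vol1 status

# Stop it again -- per Amar's note, issue this on the same node
# where 'start' was run.
gluster volume rebalance vol1 stop
```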
[Gluster-users] gluster-fuse and rebalance (Re: (3.1.X) add-brick during re-balancing)
Amar,

Let me confirm that the gluster-fuse package is not required for NFS-server use of GlusterFS-3.3.0, as 'gluster volume rebalance' does not use FUSE.

Regards,

Tomo

(2011/08/29 15:09), Amar Tumballi wrote:

1. fuse module is required to 'gluster volume rebalance'.

2. 'gluster volume rebalance start' and 'gluster volume rebalance stop' must be issued on exactly the same node.

By the way, why can multiple re-balancing operations on a volume exist in a cluster?

An enhancement to make the 'rebalance' operation co-operate with other peers is in progress, and will be available in the 3.3.0 release. For now, yes, there is a possibility that a single volume can have multiple rebalance operations going on (on different machines). Even though it should not break anything, it will degrade performance. For now, preventing this should be taken care of by the admin.

Regards,
Amar
[Gluster-users] about HA infrastructure for hypervisors
Hi,

We are going to try glusterfs for our new HA servers.

To get full HA, I'm thinking of building it this way:

+----------------+             +----------------+
|                |             |                |
| KVM hypervisor |----+   +----| KVM hypervisor |
|                |    |   |    |                |
+----------------+    |   |    +----------------+
                      |   |
                   +--------+
                   | switch |
                   +--------+
                      |   |
+---------------+     |   |     +---------------+
|               |     |   |     |               |
| Glusterfs 3.3 |-----+   +-----| Glusterfs 3.3 |
|   server A    |               |   server B    |
|               |               |               |
+---------------+               +---------------+

The idea is to have HA if either one KVM hypervisor or one Glusterfs server stops working (failure, maintenance, etc).

Some points:
- We don't care much about duplicating the network (we're going to have spare materials only).
- Glusterfs servers will use gluster replication to get HA.
- Each Glusterfs server will have SSD disks in a RAID (1 or 10, I guess).
- Most of the time, both KVM hypervisors will have VMs running.

1. Is this a correct/typical infrastructure?

2. We still didn't decide what physical network to choose between FC, FCoE and Infiniband. What would you suggest for both performance and easy configuration?

Is it possible to use FC or FCoE for a HA Glusterfs cluster? If so, how to configure Glusterfs nodes?

3. Would it be better to split the Glusterfs namespace into two gluster volumes (one for each hypervisor), each running on a Glusterfs server (for the normal case where all servers are running)?

Thanks,

--
Nicolas Sebrecht
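For point 3, a single replicated volume shared by both hypervisors is the usual starting point. A minimal sketch of the two-server replica setup (hostnames, volume name, brick paths, and the libvirt mount point are all hypothetical):

```shell
# On either Gluster server, after a 'gluster peer probe' of the other:
gluster volume create vmstore replica 2 \
    serverA:/export/brick1 serverB:/export/brick1
gluster volume start vmstore

# Each KVM hypervisor then mounts the same volume for its VM images;
# the client fails over between replicas if one server goes down.
mount -t glusterfs serverA:/vmstore /var/lib/libvirt/images
```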
Re: [Gluster-users] Gluster 3.3.0 and VMware ESXi 5
On Tue, Jun 26, 2012 at 02:08:42PM -0700, Simon Blackstein wrote:

>    Thanks Brian.
>    Yes, got rid of the .glusterfs and .vSphereHA directory that VMware
>    makes. Rebooted, so yes it was remounted and used a different mount
>    point name. Also got rid of attribute I found set on the root:
>    setfattr -x trusted.gfid / && setfattr -x trusted.glusterfs.dht /
>    Any other tips? :)

Not from me I'm afraid... anyone else?