Re: [Gluster-users] novice kind of question.. replication(raid)
This is basically the config I'm using to replicate a directory between two hosts (RAID 1 if you like ;-) ). You need a server and a client even if both are on the same host:

##
# glusterfsd.vol (server):
##
volume posix
  type storage/posix
  option directory /some_folder
end-volume

volume locks
  type features/locks
  subvolumes posix
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option transport.socket.bind-address ...
  option transport.socket.listen-port 6996
  option auth.addr.locks.allow *
  subvolumes locks
end-volume

#
# glusterfs.vol (client):
#
volume remote1
  type protocol/client
  option transport-type tcp
  option remote-host ip_or_name_of_box_a
  option remote-port 6996
  option remote-subvolume locks
end-volume

volume remote2
  type protocol/client
  option transport-type tcp
  option remote-host ip_or_name_of_box_b
  option remote-port 6996
  option remote-subvolume locks
end-volume

volume replicate
  type cluster/replicate
  # optional, but useful if the workload is mostly reads
  # !!! use different values on box a and box b !!!
  # option read-subvolume remote1
  # option read-subvolume remote2
  subvolumes remote1 remote2
end-volume

#
# /etc/fstab
#
/etc/glusterfs/glusterfs.vol  /some_folder  glusterfs  noatime  0  0

noatime is optional, of course. It depends on your needs.

- Robert

On 04/16/10 14:18, pawel eljasz wrote:
> dear all,
> I just subscribed and started reading the docs, but I'm still not sure if I got the hang of it all.
> Is GlusterFS meant for something simple like:
>
>   a box         -  b box
>   /some_folder     /some_folder
>
> so that /some_folder on both boxes would contain the same data?
> If yes, does setting up only the servers suffice, or is the client side needed too?
> Can someone share a simplistic config that would work for the simple design above?
> cheers

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
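The client volfile above repeats the same protocol/client stanza once per host, so it can be generated rather than typed. The following is only a sketch: make_client_volfile is a made-up helper name, and it assumes every server exports the "locks" volume on port 6996 exactly as in the glusterfsd.vol shown in this mail. It leaves out the optional read-subvolume line.

```shell
#!/bin/sh
# Print a replicate client volfile for the hosts given as arguments.
# Assumption: each host runs the glusterfsd.vol from this mail, i.e. it
# exports the subvolume "locks" on TCP port 6996.
# make_client_volfile is a hypothetical helper, not part of GlusterFS.
make_client_volfile() {
    port=6996
    subvolumes=""
    i=1
    for host in "$@"; do
        # One protocol/client volume per backend host.
        cat <<EOF
volume remote$i
  type protocol/client
  option transport-type tcp
  option remote-host $host
  option remote-port $port
  option remote-subvolume locks
end-volume

EOF
        subvolumes="$subvolumes remote$i"
        i=$((i + 1))
    done
    # The replicate volume ties all client volumes together.
    cat <<EOF
volume replicate
  type cluster/replicate
  subvolumes$subvolumes
end-volume
EOF
}

# Example (hypothetical host names):
#   make_client_volfile box_a box_b > /etc/glusterfs/glusterfs.vol
```

Generating the file keeps the remote stanzas identical apart from the host name, which is where copy-and-paste mistakes tend to creep in.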
Re: [Gluster-users] novice kind of question.. replication(raid)
You would normally always work with /mnt/glusterfs (the GlusterFS mount), because every change there is replicated immediately, and that is normally what you want.

Maybe there is some confusion about what is meant by "locally". Basically, in this setup everything is local ;-)

If you look in the glusterfsd.vol file you will see an "option directory" line. This directory has to exist on both hosts if you want a RAID 1 setup, which means that your data is stored on both backends, which in turn means that the data is duplicated on different hosts. This will save your data if one host explodes ;-) This is what glusterfsd does for you: it stores the data in the directory specified by "option directory" in glusterfsd.vol. This directory really is local to every backend. But you would normally not make any changes in this directory. Strictly speaking: do NOT change anything there.

But something has to do the replication, and this is what the client/mount does for you. If you mount glusterfs.vol on, e.g., /mnt/glusterfs on both hosts, you get a GlusterFS mount. In our case it is a replicated mount. As you can see in "volume replicate", the line "subvolumes remote1 remote2" does the magic. It basically says: if someone copies a file to /mnt/glusterfs, store it on remote1 and remote2 in the directory /opt/glusterfsbackend (to get back to my example below). So in our case it is not glusterfsd that replicates the data; the client/mount does it.

As long as you can live with some inaccuracy, you can do read-only things like find, du, ... in the backend directory /opt/glusterfsbackend. This will be much faster. But don't change anything there (I know someone will bash me for this ... ;-) ).

- Robert

On 04/16/10 16:08, Jenn Fountain wrote:
> Jumping on this thread with a relevant (I think) question. I am new to Gluster as well. Where do you typically work with the files: local or Gluster mount?
>
> IE:
>   /repl/export   - local
>   /mnt/glusterfs - gluster mount
>
> Would you work with the files on /repl/export and then copy them to /mnt/glusterfs so they replicate (automating this via a script, or can Gluster automate this), or work with them on /mnt/glusterfs and have them replicate? Sorry for the novice question, but I am a novice.
>
> -Jenn
>
> On Apr 16, 2010, at 10:01 AM, RW wrote:
>>> many thanks Robert for your quick reply,
>>> I am probably still missing/misunderstanding the big picture here. What about this:
>>>
>>>   box a               --  box b
>>>   /dir_1                  /dir_1
>>>     ^                       ^
>>>   services locally        services locally
>>>   read/write to /dir_1    read/write to /dir_1
>>
>> This is basically the setup I described with my config files. /dir_1 (or /some_folder in your former mail) is the client mount. Everything you copy in there will be replicated to box a and box b. It doesn't matter whether you do the copy on box a or b. But you need a different location for glusterfsd (the GlusterFS daemon) to store the files locally. This could be /opt/glusterfsbackend, for example. You need this on both hosts, and you need the mounts (client) on both hosts.
>>
>>> - can all these local services/processes, whatever these might be, not know about mounting and all this stuff?
>>
>> You need to copy glusterfsd.vol to both hosts, e.g. to /etc/glusterfs/. Then you start glusterfsd (on Gentoo this is /etc/init.d/glusterfsd start). Now you should see a glusterfsd process on both hosts. You also copy glusterfs.vol to both hosts. As you can see in my /etc/fstab, I supply the glusterfs.vol file as the filesystem and glusterfs as the type. You now mount GlusterFS as you would any other filesystem. If you now copy a file to /some_folder on box a, it will automatically be replicated to box b, and after that it will immediately be available on box b. The replication is done by the client (the mountpoint in your case, if this helps you to understand it better). The servers basically only provide the backend services to store the data somewhere on a brick (host). In my example above this was /opt/glusterfsbackend.
>>
>>> - and the servers between themselves make sure (resolve conflicts, etc.) that the content of dir_1 on both boxes is the same?
>>
>> Most of the time ;-) There are situations where conflicts can occur, but in this basic setup they're seldom. You have to monitor the log files. But GlusterFS provides self-healing, which means that if a backend (host) goes down, the files generated on the good host while the bad host is down will be copied to the failed host once it is up again. But this will not happen immediately. This is the magic part of GlusterFS ;-)
>>
>>> - so whatever happens (locally) on box_a is replicated (through the servers) on box_b and vice versa? Is that possible with GlusterFS, or do I need to be looking for something else?
>>
>> As long as you copy the files into the glusterfs mount (in your case /some_folder) the files will be copied to box b
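Since the self-healing mentioned above does not happen immediately, a common way to nudge it along is to walk the client mount so that every file gets looked up through the replicate translator. This is only a sketch under that assumption (as far as I know, this is how replicate behaves in the 3.0 series, but check the documentation for your version). walk_mount is a made-up helper name, and it relies on the -print0 / -0 extensions of find and xargs.

```shell
#!/bin/sh
# Stat every entry below a directory and print how many entries were visited.
# Intended to be pointed at the GlusterFS client mount (never at the backend
# directory), on the assumption that a lookup through the mount lets the
# replicate translator notice and heal stale copies.
# walk_mount is a hypothetical helper, not a GlusterFS command.
walk_mount() {
    mount_point=$1
    # stat only reads metadata; nothing on the mount is modified.
    find "$mount_point" -print0 | xargs -0 stat > /dev/null
    # Report the number of entries that were visited.
    find "$mount_point" | wc -l | tr -d ' '
}

# Example: walk_mount /mnt/glusterfs
```

On a large tree this can take a while and generates load on all backends, so it is probably best run outside peak hours.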
Re: [Gluster-users] novice kind of question.. replication(raid)
See my answers in the text.

> Robert, thank you ever so much for clarifying the picture,
> but I still wonder. Why do I? Because to me that seems like a kind of first-aid functionality in any network distributed fs; it should be there.
> So I wonder whether it is possible with GlusterFS to get the following:
> have the server (backend) working as a daemon on two (or any number of) boxes, and have this server on each box watching over a local tree (folder). Basically these servers (backends) would be syncing with each other, and would be doing it only to ensure that the content of this tree is the same on all boxes.

Phew... I don't know if I get you right, but to me it looks like you're looking for a filesystem which requires a central storage (SAN), like GFS/GFS2 (Redhat) or OCFS (Oracle Cluster File System). GFS or GFS2 can also be used as a local filesystem. GFS/GFS2 is more what you've described above.

>   server_1  -  server_2  -  server_3
>      |            |            |
>      ^            ^            ^
>   /watch_me    /watch_me    /watch_me
>
> So no mounts: a process changes something in this local /watch_me on server_1, and server_1 propagates (obviously working through the logic) the change to the other servers, and vice versa.
> Is it possible, maybe by introducing the client part of the config into glusterfsd.vol, to have it like this, without having a client that has to mount/configure replication?

Well, if I haven't missed something, then the short answer should be: no. Since the glusterfsd daemons (backend) are only responsible for storing the data locally (besides some other things, of course), you need a mount point, because the magic of distribution/replication lies in the client (configuration).

But I can show you a configuration where (almost) no mount is needed, although I doubt that it will help you. We're using GlusterFS where we have a central CMS (content management system). On this CMS host we have a GlusterFS mount which replicates the uploaded pictures to 8 other hosts. On each of these 8 hosts glusterfsd is running, of course. glusterfsd then stores these files locally on each host. The 8 hosts run Apache webservers which deliver these pictures to the web browsers out there. This scenario is very practical if you need to distribute files from a central location to many other hosts. It is important to note here that you really only read the files and do not modify them (except on the host which has the CMS, of course). Changes made on the backends won't be replicated, and you'll probably get strange results over time.

> Other than that GlusterFS feels cool. The last two days I was fiddling with Coda, but in the end it crashes way too often, at least Fedora's rpm is like this.
> Yet there is (was) a problem with GlusterFS for me too, if anybody uses Fedora:
> https://bugzilla.redhat.com/show_bug.cgi?id=555728

I had problems on Gentoo until version 3.0.2. 3.0.2 was the first version which works quite well for us. There are still some issues left, but I haven't tested 3.0.4 yet.

> ps. is it in reality as the docs say, that GlusterFS won't work on slow and flaky networks? 1GbE at least?

I would definitely recommend 1GbE. If you need a filesystem for slow and flaky networks (over WAN), maybe you should have a look at AFS (http://en.wikipedia.org/wiki/Andrew_File_System). But it is more complicated to set up, and I wouldn't compare GlusterFS and AFS directly.

- Robert

> cheers
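Because changes made directly in a backend directory are not replicated, it can be useful to check, read-only, whether a backend still holds the same set of files as the mount. The sketch below compares only file names, not contents, and the result can be slightly off while files are being written. list_files and compare_trees are made-up helper names, and the paths in the example are the ones used in this thread.

```shell
#!/bin/sh
# Print the relative paths of all regular files below a directory, sorted.
# Read-only: nothing in the tree is modified.
list_files() {
    (cd "$1" && find . -type f | sort)
}

# Print the paths that exist in only one of the two trees and return
# non-zero if the listings differ. Both helpers are hypothetical and only
# compare names, not file contents.
compare_trees() {
    a="$(mktemp)"
    b="$(mktemp)"
    list_files "$1" > "$a"
    list_files "$2" > "$b"
    if diff "$a" "$b"; then
        status=0
    else
        status=1
    fi
    rm -f "$a" "$b"
    return $status
}

# Example: compare_trees /some_folder /opt/glusterfsbackend
```

A difference is not necessarily an error (a file may just have been created), but a backend that keeps showing files the mount does not know about is a hint that someone wrote to the backend directly.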
Re: [Gluster-users] Setup for production - which one would you choose?
I definitely agree with Stephan in the case of GlusterFS. I've had some major problems with 2.0.9 and 3.0.0. 3.0.2 now works (almost) fine for me. But even with 3.0.2 I still had to migrate a Drupal cache directory (using the Boost module) to NFS, because I had some weird problems (the host began to swap after a while).

Currently my experience with GlusterFS is that mounts with heavy write activity and small files are not really usable. The same is true if you have a lot of files in a GlusterFS mount and you do a lot of wildcard queries there, e.g. find /glusterfs/ -name 'test*' (just a note here: find was much faster than ls in my tests. For my tests I used the kernel sources). For distributing and replicating files where the clients mostly read those files, GlusterFS is really cool. :-)

But most importantly: before using it, test, test and test your setup again ;-) I made the mistake of starting with performance translators that I thought were good for me. Now I'm running completely without any of them and everything is fine. "option read-subvolume", for example, always caused load on two servers. After removing the translators it worked as expected. writebehind was the most problematic one; it caused pictures converted with ImageMagick to get corrupted (you just saw half of the picture, the other half was black).

Just my 2...

- Robert

On 03/25/10 08:02, Stephan von Krawczynski wrote:
> In fact, the background for my post is very trivial: glusterfs is really in a development stage. So there is a real difference between using 2.0.9, 3.0.2 or 3.0.3. In fact it might be the difference between go and no-go in your very special setup. That's why I judge the comparison to other rpm questions as not valid. This is not fetchmail, where you can use almost any rpm flying around.
> And I did not tell you to compile your whole setup by hand. I am talking about glusterfs, and about using its latest version in favor of some available rpm not containing the latest version.
>
> --
> Regards,
> Stephan
>
> On Wed, 24 Mar 2010 23:19:30 +0100 Steve stev...@gmx.net wrote:
>>> Original Message
>>> Date: Wed, 24 Mar 2010 23:01:55 +0100
>>> From: Oliver Hoffmann o...@dom.de
>>> To: gluster-users@gluster.org
>>> Subject: Re: [Gluster-users] Setup for production - which one would you choose?
>>>
>>> Yep, thanx.
>>>
>>> @Stephan: It is not a matter of knowing how to use tar and make, but if you have a bunch of servers then you want to do an apt-get update/upgrade once in a while without compiling this piece of software on that server and another one on another server, etc.
>>
>> Not only that. On an RPM system (aka Red Hat, SuSE, Mandriva, etc.) where you have a support contract, installing packages that are not made by the vendor voids support. So there is a good reason to use RPMs pre-built by the vendor.
>>
>> A bunch of years ago I helped a big vendor to virtualize the biggest Linux installation in northern Europe for one of their customers. There were over a thousand Red Hat Enterprise Servers installed in total. The customer followed ITIL Release To Production. Now you could jump up and down about a new release of application XYZ and that you could install it from a self-made RPM. The customer does not care. Installing self-made RPMs = no support from Red Hat.
>>
>> Now if your business depends on running systems and every second of downtime can cost you hundreds of € then you don't think twice about installing from source. You just don't do it. It's that easy. Just compare the potential problem (aka: downtime, loss of money, loss of trust from customers, etc.) to the potential benefit of a self-made RPM and you will quickly realize that it is a no-go.
>>
>> Stephan is probably a small shop doing all his stuff by hand. But there are situations where this handicraft stuff is just not the way to go.
>>
>>>> It is hard to fully understand what you just wrote. If you are suggesting that someone else's personal preferences (or company objectives) are incorrect or misguided simply because they don't match your own, I'm trying to understand how your last post pertains to the user forum for Gluster. There are plenty of reasons to prefer packages over source installations, but that academic conversation is also not appropriate for this list.
>>>>
>>>> Cheers,
>>>> Benjamin
>>>>
>>>> -Original Message-
>>>> From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of Stephan von Krawczynski
>>>> Sent: Wednesday, March 24, 2010 4:37 PM
>>>> To: Ian Rogers
>>>> Cc: gluster-users@gluster.org
>>>> Subject: Re: [Gluster-users] Setup for production - which one would you choose?
>>>>
>>>> Ok, guys, honestly: it is allowed to learn (RMS fought for your right to do so) :-)
>>>> Really rarely in the open source universe will you find a piece of software that is as easy to compile and run as glusterfs. All you have to know yourself is how to use tar. Then enter the source directory and do
>>>>
>>>>   ./configure ; make ; make install
>>>>
>>>> What exactly is difficult to do?
Re: [Gluster-users] Fuse 2.8.1
On Gentoo we're using glusterfs 3.0.0 with fuse 2.8.1. So I would say yes ;-)

- Robert

On 02/12/10 15:15, Nick Birkett wrote:
> Currently we are using fuse-2.7.4glfs11.
>
> Does glusterfs 3+ work with fuse 2.8.1 libraries?
>
> Thanks,
>
> Nick