Re: [Gluster-users] novice kind of question.. replication(raid)

2010-04-16 Thread RW
This is basically the config I'm using to replicate
a directory between two hosts (RAID 1 if you like ;-) ).
You need server and client even if both are on the same
host:

##
# glusterfsd.vol (server):
##
volume posix
  type storage/posix
  # backend directory, must be different from the client mount point
  option directory /opt/glusterfsbackend
end-volume

volume locks
  type features/locks
  subvolumes posix
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option transport.socket.bind-address ...
  option transport.socket.listen-port 6996
  option auth.addr.locks.allow *
  subvolumes locks
end-volume

#
# glusterfs.vol (client):
#
volume remote1
  type protocol/client
  option transport-type tcp
  option remote-host ip_or_name_of_box_a
  option remote-port 6996
  option remote-subvolume locks
end-volume

volume remote2
  type protocol/client
  option transport-type tcp
  option remote-host ip_or_name_of_box_b
  option remote-port 6996
  option remote-subvolume locks
end-volume

volume replicate
  type cluster/replicate
  # optional, but useful if the workload is mostly reads
  # !!! use a different value on box a and on box b !!!
  # option read-subvolume remote1
  # option read-subvolume remote2
  subvolumes remote1 remote2
end-volume

#
# /etc/fstab
#
/etc/glusterfs/glusterfs.vol /some_folder  glusterfs  noatime  0  0

noatime is optional of course. Depends on your needs.
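
To make the read-subvolume comment in the client config concrete, here is a
sketch of how the replicate volume could differ between the two hosts. It
assumes that remote1 points at box a and remote2 at box b, as in the client
config above, so that each box prefers its own local backend for reads;
the rest of glusterfs.vol stays identical on both hosts.

```
# glusterfs.vol on box a: prefer the local backend (remote1) for reads
volume replicate
  type cluster/replicate
  option read-subvolume remote1
  subvolumes remote1 remote2
end-volume

# glusterfs.vol on box b: prefer the local backend (remote2) for reads
volume replicate
  type cluster/replicate
  option read-subvolume remote2
  subvolumes remote1 remote2
end-volume
```

The option only expresses a read preference; writes should still go to
both subvolumes in either case.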

- Robert


On 04/16/10 14:18, pawel eljasz wrote:
 Dear all, I just subscribed and started reading the docs,
 but I'm still not sure if I got the hang of it all.
 Is GlusterFS for something simple like:
 
 box a          --   box b
 /some_folder        /some_folder
 
 so /some_folder on both boxes would contain the same data?
 
 If yes, does setting up only the servers suffice, or is the client side
 needed too?
 Can someone share a simple config that would work for the simple
 design above?
 
 cheers
 
 
 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] novice kind of question.. replication(raid)

2010-04-16 Thread RW
You would normally always work with /mnt/glusterfs (the glusterfs mount)
because every change there is replicated immediately,
which is normally what you want.

Maybe there is some confusion about what is meant by "locally". Basically
in this setup everything is local ;-) If you look in the
glusterfsd.vol file you see an option directory. This directory
has to exist on both hosts if you want a RAID 1 setup, which
means that your data will be stored on both backends, which in
turn means that the data will be duplicated on different hosts.
This will save your data if one host explodes ;-) This is
what glusterfsd will do for you. It stores the data in the
directory specified by option directory in glusterfsd.vol.
This directory is really local to every backend. But you would
normally not make any changes in this directory. Strictly speaking:
Do NOT change anything there.

But something has to do the replication. And this is what
the client/mount will do for you. If you mount glusterfs.vol
on e.g. /mnt/glusterfs on both hosts you get a GlusterFS
mount. In our case it is a replicated mount. As you can see in
volume replicate, the option subvolumes remote1 remote2
does the magic. It basically says: If someone copies a
file to /mnt/glusterfs, store it on remote1 and remote2 in
directory /opt/glusterfsbackend (to get back to my example
below). So in our case it is not glusterfsd that replicates the
data but the client/mount.
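
The split between backend directory and client mount described above could
look roughly like this. It is only a sketch and assumes the two paths used
as examples in this thread, /opt/glusterfsbackend for the backend and
/mnt/glusterfs for the mount; adjust both to your own layout.

```
# glusterfsd.vol (server), on both hosts:
# the backend directory where glusterfsd stores the files locally
volume posix
  type storage/posix
  option directory /opt/glusterfsbackend
end-volume

# /etc/fstab, on both hosts:
# the client mount you actually work in; the replicate volume in
# glusterfs.vol sends every write made here to remote1 and remote2
/etc/glusterfs/glusterfs.vol  /mnt/glusterfs  glusterfs  noatime  0  0
```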

As long as you can live with some inaccuracy you can do
read-only things like find, du, ... in the backend directory
/opt/glusterfsbackend. This will be much faster. But don't
change anything there (I know someone will bash me for
this ... ;-) ).

- Robert

On 04/16/10 16:08, Jenn Fountain wrote:
 Jumping on this thread with a relevant (I think) question - I am new to
 gluster as well.
 
 Where do you typically work with the files - local or gluster mount? I.e.:
 /repl/export - local, /mnt/glusterfs - gluster mount
 
 Would you work with the files on /repl/export and then copy them (automate
 this via a script, or can gluster automate this?) to /mnt/glusterfs so they
 replicate, or work with them on /mnt/glusterfs and have them replicate?
 Sorry for the novice question but I am a novice.
 
 -Jenn
 
 
 
 
 
 On Apr 16, 2010, at 10:01 AM, RW wrote:
 

 many thanks Robert for your quick reply,
 I still probably am missing/misunderstanding the big picture here, what
 about this:

 box a                  --   box b
 /dir_1                      /dir_1
   ^                           ^
 services locally            services locally
 read/write to /dir_1        read/write to /dir_1

 This is basically the setup I described with my config files.
 /dir_1 (or /some_folder in your former mail) is the client mount.
 Everything you copy in there will be replicated to box a and
 box b. It doesn't matter if you do the copy on box a or b.
 But you need a different location for glusterfsd (the GlusterFS
 daemon) to store the files locally. This could be /opt/glusterfsbackend
 for example. You need this on both hosts and you need the mounts
 (client) on both hosts.

 - can all these local services/processes, whatever these might be,
 not know about mounting and all this stuff?

 You need to copy glusterfsd.vol to both hosts, e.g. to /etc/glusterfs/.
 Then you start glusterfsd (on Gentoo this is /etc/init.d/glusterfsd
 start). Now you should see a glusterfsd process on both hosts.
 You also copy glusterfs.vol to both hosts. As you can see in my
 /etc/fstab I supply the glusterfs.vol file as the filesystem
 and glusterfs as the type. You now mount GlusterFS as you would
 any other filesystem. If you now copy a file to /some_folder
 on box a it will automatically be replicated to box b and after
 that it will immediately be available on box b. The replication
 is done by the client (the mountpoint in your case, if this
 helps to better understand). The servers basically only provide the
 backend services to store the data somewhere on a brick (host).
 In my example above this was /opt/glusterfsbackend.

 - and the servers between themselves make sure (resolve conflicts, etc.)
 that the content of dir_1 on both boxes is the same?

 Most of the time ;-) There are situations where conflicts can
 occur but in this basic setup they're rare. You have to monitor
 the log files. But GlusterFS provides self healing, which means
 that if a backend (host) goes down, the files created on the
 good host - while the bad host is down - will be copied to the failed
 host once it is up again. But this will not happen immediately.
 This is the magic part of GlusterFS ;-)

 - so whatever happens (locally) on box_a is replicated (through the servers)
 to box_b and vice versa:
 is that possible with GlusterFS or do I need to look for something else?

 As long as you copy the files into the glusterfs mount (in your
 case /some_folder) the files will be copied to box b

Re: [Gluster-users] novice kind of question.. replication(raid)

2010-04-16 Thread RW
See my answers in the text.

 Robert, thank you ever so much for clarifying the picture,
 
 but I still wonder, why I do? because to me that seems like kind of
 first aid functionality
 in any network distributed fs, it should be there..
 so I wonder, is it possible with glusterfs to get the following:
 
 have a server (backend) working as a daemon on two (or any number of) boxes,
 and have the server on each box watch over a local tree (folder);
 basically these servers (backends) would be syncing with each other,
 and would be doing so only to ensure that the content of this tree is
 the same on all boxes

Phew... I don't know if I understand you correctly, but to me it looks like
you're looking for a filesystem which requires central storage
(SAN), like GFS/GFS2 (Red Hat) or OCFS (Oracle Cluster File System).
GFS or GFS2 can also be used as a local filesystem. GFS/GFS2 is closer
to what you've described above.

 server_1   --   server_2   --   server_3
    |               |               |
    ^               ^               ^
 /watch_me       /watch_me       /watch_me
 
 so no mounts: a process changes something in this local /watch_me on
 server_1,
 server_1 propagates (obviously working through the logic) the change to
 the other servers, and vice versa
 
 is it possible, maybe by introducing the client part of the config into
 glusterfsd.vol,
 to have it like this, without a client having to mount/configure
 replication?

Well if I haven't missed something then the short answer should be: no.
Since the glusterfsd daemons (backend) are only responsible for storing
the data locally (besides some other things of course) you need a
mount point because the magic of distribution/replication lies in the
client (configuration).

But I can describe a configuration where (almost) no mount is needed,
although I doubt that it will help you. We're using GlusterFS where we have a
central CMS (content management system). On this CMS host we have a
GlusterFS mount which replicates the uploaded pictures to 8 other
hosts. On each of these 8 hosts glusterfsd is running, of course.
glusterfsd then stores these files locally on each host. The 8 hosts
run Apache webservers which deliver these pictures to the web browsers
out there. This scenario is very practical if you need to distribute
files from a central location to many other hosts. Important to note
here is that you really only read the files on the backends and do not
modify them (except on the host which has the CMS, of course). Changes
made directly on the backends won't be replicated and you'll probably get
strange results over time.
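
A client volfile for such a one-to-many setup might look like the sketch
below. The host names web1 ... web8 are made up for illustration, and the
port and remote-subvolume are simply taken from the two-host config in this
thread, so this is not the actual production configuration. Only the first
two of the eight protocol/client volumes are written out; the remaining six
would follow the same pattern.

```
# glusterfs.vol on the CMS host (hypothetical host names web1 ... web8)
volume web1
  type protocol/client
  option transport-type tcp
  option remote-host web1
  option remote-port 6996
  option remote-subvolume locks
end-volume

volume web2
  type protocol/client
  option transport-type tcp
  option remote-host web2
  option remote-port 6996
  option remote-subvolume locks
end-volume

# web3 to web8 are defined in the same way

volume replicate
  type cluster/replicate
  subvolumes web1 web2 web3 web4 web5 web6 web7 web8
end-volume
```

Only the CMS host mounts this volfile; the eight web hosts just run
glusterfsd and let Apache read from the backend directory.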

 other than that glusterfs feels cool; the last two days I was fiddling with coda
 but in the end it crashes way too often, at least Fedora's rpm does.
 Yet there is (was) a problem with glusterfs for me too, if anybody uses
 fedora:
 https://bugzilla.redhat.com/show_bug.cgi?id=555728

I had problems on Gentoo until version 3.0.2. 3.0.2 was the
first version which works quite well for us. Some issues
remain, but I haven't tested 3.0.4 yet.

 ps. is it in reality as docs say, glusterfs won't work on slow and flaky
 networks? 1GbE at least?

I would definitely recommend 1GbE. If you need a filesystem for
slow and flaky networks (over WAN) maybe you should have a look at AFS
(http://en.wikipedia.org/wiki/Andrew_File_System). It is more
complicated to set up, though, and I wouldn't compare GlusterFS and AFS
directly.

- Robert


 cheers
 
 

Re: [Gluster-users] Setup for production - which one would you choose?

2010-03-25 Thread RW
I definitely agree with Stephan in the case of GlusterFS.
I've had some major problems with 2.0.9 and 3.0.0.
3.0.2 now works (almost) fine for me. But even with 3.0.2
I still had to migrate a Drupal cache directory
(using the Boost module) to NFS because I had some
weird problems (the host began to swap after a while).
Currently my experience with GlusterFS is that mounts
with heavy write activity and small files are not
really usable. The same is true if you have a
lot of files in a GlusterFS mount and you do a lot
of wildcard queries there, e.g. find /glusterfs/ -name 'test*'
(just a note here: find was much faster than ls in
my tests. For my tests I used the kernel sources).
For distributing and replicating files where the
clients mostly read those files GlusterFS is really
cool. :-) But most importantly: Before using it -
test, test and test your setup again ;-) I made
the mistake of starting with performance translators
because I thought they were good for me. Now I'm running
completely without any of them and everything is fine.
option read-subvolume for example always caused
load on two servers. After removing the translators
it worked as expected.
writebehind was the most problematic one; it caused
pictures converted with ImageMagick to get corrupted
(you just saw half of the picture, the other half
was black).
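
For readers unfamiliar with the term: a performance translator is just
another volume stacked on top of the replicate volume in the client
volfile. The sketch below shows what such a stanza typically looks like
for write-behind. The volume name writebehind and the name of the volume
beneath it are assumptions for illustration, not the configuration that
was actually used here.

```
# glusterfs.vol (client), stacked on top of the cluster/replicate volume
volume writebehind
  type performance/write-behind
  subvolumes replicate
end-volume
```

Running without performance translators then means deleting this stanza,
so that the replicate volume is again the last (topmost) volume in the
file.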

Just my 2 cents...
- Robert


On 03/25/10 08:02, Stephan von Krawczynski wrote:
 In fact, the background for my post is very trivial: glusterfs is really still
 in the development stage. So there is a real difference between using 2.0.9,
 3.0.2 or 3.0.3. In fact it might be the difference between go and no-go in your
 very special setup. That's why I judge the comparison to other rpm questions as
 not valid. This is not fetchmail, where you can use almost any rpm flying around.
 And I did not tell you to compile your whole setup by hand. I am talking about
 glusterfs and using its latest version in favor of using some available rpm
 not containing the latest version.
 --
 Regards,
 Stephan
 
 
 On Wed, 24 Mar 2010 23:19:30 +0100
 Steve stev...@gmx.net wrote:
 

  -------- Original Message --------
 Date: Wed, 24 Mar 2010 23:01:55 +0100
 From: Oliver Hoffmann o...@dom.de
 To: gluster-users@gluster.org
 Subject: Re: [Gluster-users] Setup for production - which one would you
 choose?

 Yep, thanx.

 @Stephan: It is not a matter of knowing how to use tar and make, but if you
 have a bunch of servers then you want to do an apt-get update/upgrade
 once in a while without compiling this piece of software on that server
 and another one on another server, etc.

 Not only that. On an RPM system (aka Red Hat, SuSE, Mandriva, etc.) where you
 have a support contract, installing packages that are not made by the vendor
 voids support. So there is a good reason to use the vendor's pre-built RPMs.

 A number of years ago I helped a big vendor to virtualize the biggest
 Linux installation in northern Europe for one of their customers. There
 were over a thousand Red Hat Enterprise Server installations in total. The
 customer followed ITIL Release To Production. Now you could jump up and down
 about a new release of application XYZ and that you could install it from a
 self-made RPM. The customer does not care. Installing self-made RPMs = no
 support from Red Hat. Now if your business depends on running systems
 and every second of downtime can cost you hundreds of € then you don't think
 twice about installing from source. You just don't do it. It's that easy.
 Just compare the potential problem (aka: downtime, loss of money, loss of
 trust from customers, etc.) to the potential benefit of a self-made RPM and
 you will quickly realize that it is a no-go.

 Stephan is probably a small shop doing all his stuff by hand. But there are 
 situations where this handicraft stuff is just not the way to go.


 It is hard to fully understand what you just wrote.  If you are
 suggesting that someone else's personal preferences (or company
 objectives) are incorrect or misguided simply because they don't match
 your own I'm trying to understand how your last post pertains to the
 user forum for Gluster?  There are plenty of reasons to prefer packages
 over source installations but that academic conversation is also not
 appropriate for this list.

 Cheers,
 Benjamin



 -----Original Message-----
 From: gluster-users-boun...@gluster.org
 [mailto:gluster-users-boun...@gluster.org] On Behalf Of Stephan von
 Krawczynski
 Sent: Wednesday, March 24, 2010 4:37 PM
 To: Ian Rogers
 Cc: gluster-users@gluster.org
 Subject: Re: [Gluster-users] Setup for production - which one would you
 choose?

 Ok, guys, honestly: it is allowed to learn (RMS fought for your right to
 do so) :-)
 Really, rarely in the open source universe will you find a piece of
 software that is as easy to compile and run as glusterfs. All you have to
 know yourself is how to use tar. Then enter the source directory and do
 ./configure ; make ; make install
 What exactly is difficult to do?

Re: [Gluster-users] Fuse 2.8.1

2010-02-12 Thread RW
On Gentoo we're using glusterfs 3.0.0 with fuse 2.8.1.
So I would say yes ;-)

- Robert


On 02/12/10 15:15, Nick Birkett wrote:
 Currently we are using fuse-2.7.4glfs11.
 
 Does glusterfs 3+ work with the fuse 2.8.1 libraries?
 
 Thanks,
 
 Nick
 
 
 
 
 