Re: [Gluster-users] [Gluster-devel] Need testers for GlusterFS 3.4.4

2014-06-04 Thread BGM
Ack.
I intend to use the (your ;-) knowledge for a cfengine promise once I get the
time...
best regards
Bernhard

 On 04.06.2014, at 20:51, James purplei...@gmail.com wrote:
 
 On Wed, Jun 4, 2014 at 2:43 PM, BGM bernhard.gl...@ecologic.eu wrote:
 we might get a cfengine/puppet framework to easily
 
 https://github.com/purpleidea/puppet-gluster
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Best Practices for different failure scenarios?

2014-02-24 Thread BGM
Thanks Vijay,
I will wrap my head around it.
Bernhard

Sent from my iPad

 On 24.02.2014, at 17:26, Vijay Bellur vbel...@redhat.com wrote:
 
 On 02/21/2014 10:27 PM, BGM wrote:
 It might be very helpful to have a wiki next to this mailing list,
 where all the good experience, all the proved solutions for situations
 that are brought up here, could be gathered in a more
 permanent and straight way.
 
 +1. It would be very useful to evolve an operations guide for GlusterFS.
 
 .
 To your questions I would add:
 what's best practice in setting options for performance and/or integrity...
 (yeah, well, for which use case under which conditions)
 a mailing list is very helpful for ad-hoc problems and questions,
 but it would be nice to distill the knowledge into a permanent, searchable
 form.
 .
 Sure, anybody could set up a wiki, but...
 it would need the acceptance and participation of an active group
 to get the best results,
 so IMO the appropriate place would be somewhere close to gluster.org?
 .
 
 Would be happy to carry this in doc/ folder of glusterfs.git and 
 collaborate on it if a lightweight documentation format like markdown or 
 asciidoc is used for evolving this guide.
 
 I haven't worked with either of them;
 at first glance asciidoc looks easier to me
 (assuming it is either/or?)
 and (sorry for being blunt, I'm ops, not dev ;-) you suggest everybody sets up
 a git repository from which you
 pull, right?
 
 No need to setup a git on your own. We use the development workflow [1] for 
 submitting patches to documentation too.
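 (For the record, the Gerrit-based flow described in [1] is roughly the
 following; an untested sketch from memory, the clone URL, branch name and
 file paths are just examples:
 
 git clone git://review.gluster.org/glusterfs
 cd glusterfs
 git checkout -b doc-operations-guide
 # edit or add files under doc/, then
 git commit -s -a
 ./rfc.sh   # pushes the change to review.gluster.org for review
 )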
 
 Well, wouldn't a wiki be much easier, both to contribute to and to access
 the information?
 (like wiki.debian.org?)
 The git-based solution might be easier to start off with,
 but would it reach a big enough community?
 
 Documentation in markdown or asciidoc is rendered well by github. One of the 
 chapters in our admin guide does get rendered like this [2].
 
 Wouldn't a wiki also have a better PR/marketing effect (by being easier to 
 access)?
 just a thought...
 
 We can roll out the content from git in various formats (like pdf, html etc.) 
 as both asciidoc/markdown can be converted to various formats. The advantage 
 of a git based workflow is that it becomes easy to review changes through 
 tools like gerrit and can also help in keeping false content/spam out of the 
 way.
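 (For example, assuming pandoc and asciidoc are installed; file names are
 placeholders only, and PDF output additionally needs a LaTeX engine:
 
 pandoc -s -o admin_setting_volumes.html admin_setting_volumes.md
 pandoc -o admin_setting_volumes.pdf admin_setting_volumes.md
 asciidoc admin_guide.txt   # produces admin_guide.html
 )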
 
 Having said that, feel free to use tools of your choice. We can just go ahead 
 and use whatever is easy for most of us :). At the end of the day, evolving 
 this guide is more important than the tools that we choose to use in the 
 process.
 
 Cheers,
 Vijay
 
 [1] 
 http://www.gluster.org/community/documentation/index.php/Development_Work_Flow
 
 [2] 
 https://github.com/gluster/glusterfs/blob/master/doc/admin-guide/en-US/markdown/admin_setting_volumes.md
 
 
 Bernhard
 
 
 -Vijay
 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Problems to work with mounted directory in Gluster 3.2.7 - switch to 3.2.4 ; -)

2014-02-19 Thread BGM
well, note:
- you don't need zfs on the hardware machines, xfs or ext3 or ext4 would do it
too
- for production you wouldn't use a glusterfs on top of a glusterfs, but rather
give the VM access to a real block device, like a whole hard disk or at least a
partition of it, although migration of the VM wouldn't be possible then...
therefore: a VM as a Gluster server might not be the best idea.
- remember to peer probe the Gluster server partner from both sides, as
mentioned below!

for a first setup you should be fine with that.

regards

On 19.02.2014, at 19:32, Targino Silveira targinosilve...@gmail.com wrote:

 Thanks Bernhard I will do this.
 
 Regards, 
 
 
 Targino Silveira
 +55-85-8626-7297
 www.twitter.com/targinosilveira
 
 
 2014-02-19 14:43 GMT-03:00 Bernhard Glomm bernhard.gl...@ecologic.eu:
 I would strongly recommend restarting fresh with gluster 3.4.2 from
 http://download.gluster.org/pub/gluster/glusterfs/3.4/
 It works totally fine for me.
 (Reinstall the VMs as slim as possible if you can.)
 
 As a quick howto consider this:
 
 - We have 2 hardware machines (just desktop machines for a dev env)
 - both running ZoL (ZFS on Linux)
 - create a zpool and zfs filesystem
 - create a gluster replica 2 volume between hostA and hostB
 - install 3 VMs, vmachine0{4,5,6}
 - vmachine0{4,5} each have a 100GB disk image file as /dev/vdb which also
 resides on the gluster volume
 - create an ext3 filesystem on vmachine0{4,5}:/dev/vdb1
 - create a gluster replica 2 volume between vmachine04 and vmachine05 as shown below
 
 (!!!obviously nobody would do that in any serious environment,
 just to show that even a setup like that _would_ be possible!!!)
 
 - run some benchmarks on that volume and compare the results to other 
 
 So:
 
 root@vmachine04[/0]:~ # mkdir -p /srv/vdb1/gf_brick
 root@vmachine04[/0]:~ # mount /dev/vdb1 /srv/vdb1/
 root@vmachine04[/0]:~ # gluster peer probe vmachine05
 peer probe: success
 
 # now switch over to vmachine05 and do
 
 root@vmachine05[/1]:~ # mkdir -p /srv/vdb1/gf_brick
 root@vmachine05[/1]:~ # mount /dev/vdb1 /srv/vdb1/
 root@vmachine05[/1]:~ # gluster peer probe vmachine04
 peer probe: success
 root@vmachine05[/1]:~ # gluster peer probe vmachine04
 peer probe: success: host vmachine04 port 24007 already in peer list
 
 # the peer probe from BOTH sides is often forgotten
 # switch back to vmachine04 and continue with
 
 root@vmachine04[/0]:~ # gluster peer status
 Number of Peers: 1
 
 Hostname: vmachine05
 Port: 24007
 Uuid: 085a1489-dabf-40bb-90c1-fbfe66539953
 State: Peer in Cluster (Connected)
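 # (the creation of the volume itself is not shown in this log;
 #  it was, roughly, something like:
 #  gluster volume create layer_cake_volume replica 2 \
 #      vmachine04:/srv/vdb1/gf_brick vmachine05:/srv/vdb1/gf_brick
 #  gluster volume start layer_cake_volume
 # )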
 root@vmachine04[/0]:~ # gluster volume info layer_cake_volume
 
 Volume Name: layer_cake_volume
 Type: Replicate
 Volume ID: ef5299db-2896-4631-a2a8-d0082c1b25be
 Status: Started
 Number of Bricks: 1 x 2 = 2
 Transport-type: tcp
 Bricks:
 Brick1: vmachine04:/srv/vdb1/gf_brick
 Brick2: vmachine05:/srv/vdb1/gf_brick
 root@vmachine04[/0]:~ # gluster volume status layer_cake_volume
 Status of volume: layer_cake_volume
 Gluster process                             Port    Online  Pid
 ------------------------------------------------------------------------------
 Brick vmachine04:/srv/vdb1/gf_brick         49152   Y       12778
 Brick vmachine05:/srv/vdb1/gf_brick         49152   Y       16307
 NFS Server on localhost                     2049    Y       12790
 Self-heal Daemon on localhost               N/A     Y       12791
 NFS Server on vmachine05                    2049    Y       16320
 Self-heal Daemon on vmachine05              N/A     Y       16319
 
 There are no active volume tasks
 
 # set any option you might like
 
 root@vmachine04[/1]:~ # gluster volume set layer_cake_volume 
 network.remote-dio enable
 volume set: success
 
 # go to vmachine06 and mount the volume
 root@vmachine06[/1]:~ # mkdir /srv/layer_cake
 root@vmachine06[/1]:~ # mount -t glusterfs -o 
 backupvolfile-server=vmachine05 vmachine04:/layer_cake_volume /srv/layer_cake
 root@vmachine06[/1]:~ # mount
 vmachine04:/layer_cake_volume on /srv/layer_cake type fuse.glusterfs 
 (rw,default_permissions,allow_other,max_read=131072)
 root@vmachine06[/1]:~ # df -h
 Filesystem Size  Used Avail Use% Mounted on
 ...
 vmachine04:/layer_cake_volume   97G  188M   92G   1% /srv/layer_cake
 
 All fine and stable
 
 
 
 # now let's see how it tastes
 # note: this is postmark on / NOT on the gluster-mounted layer_cake_volume!
 # those postmark results might be available tomorrow ;-)))
 root@vmachine06[/1]:~ # postmark
 PostMark v1.51 : 8/14/01
 pmset transactions 50
 pmset number 20
 pmset subdirectories 1
 pmrun
 Creating subdirectories...Done
 Creating files...Done
 Performing transactions..Done
 Deleting files...Done
 Deleting subdirectories...Done
 Time:
 2314 seconds total
 2214 seconds of transactions (225 per second)
 Files:
 450096 created (194 per second)
 Creation 

Re: [Gluster-users] Problems to work with mounted directory in Gluster 3.2.7 - switch to 3.2.4 ; -)

2014-02-19 Thread BGM
... keep it simple, make it robust ...
use raid1 (or raidz if you can) for the bricks
hth

On 19.02.2014, at 20:32, Targino Silveira targinosilve...@gmail.com wrote:

 Sure, 
 
 I will use XFS; as I said before, it's for old data, so we don't need great
 performance, we only need to store data.
 
 regards,
 
 Targino Silveira
 +55-85-8626-7297
 www.twitter.com/targinosilveira
 
 
 2014-02-19 16:11 GMT-03:00 BGM bernhard.gl...@ecologic.eu:
 well, note:
 - you don't need zfs on the hardware machines, xfs or ext3 or ext4 would do
 it too
 - for production you wouldn't use a glusterfs on top of a glusterfs, but
 rather give the VM access to a real block device, like a whole hard disk or
 at least a partition of it, although migration of the VM wouldn't be possible
 then...
 therefore: a VM as a Gluster server might not be the best idea.
 - remember to peer probe the Gluster server partner from both sides, as
 mentioned below!
 
 for a first setup you should be fine with that.
 
 regards
 
 On 19.02.2014, at 19:32, Targino Silveira targinosilve...@gmail.com wrote:
 
 Thanks Bernhard I will do this.
 
 Regards, 
 
 
 Targino Silveira
 +55-85-8626-7297
 www.twitter.com/targinosilveira
 
 
 2014-02-19 14:43 GMT-03:00 Bernhard Glomm bernhard.gl...@ecologic.eu:
 I would strongly recommend restarting fresh with gluster 3.4.2 from
 http://download.gluster.org/pub/gluster/glusterfs/3.4/
 It works totally fine for me.
 (Reinstall the VMs as slim as possible if you can.)
 
 As a quick howto consider this:
 
 - We have 2 hardware machines (just desktop machines for a dev env)
 - both running ZoL (ZFS on Linux)
 - create a zpool and zfs filesystem
 - create a gluster replica 2 volume between hostA and hostB
 - install 3 VMs, vmachine0{4,5,6}
 - vmachine0{4,5} each have a 100GB disk image file as /dev/vdb which also
 resides on the gluster volume
 - create an ext3 filesystem on vmachine0{4,5}:/dev/vdb1
 - create a gluster replica 2 volume between vmachine04 and vmachine05 as shown below
 
 (!!!obviously nobody would do that in any serious environment,
 just to show that even a setup like that _would_ be possible!!!)
 
 - run some benchmarks on that volume and compare the results to other 
 
 So:
 
 root@vmachine04[/0]:~ # mkdir -p /srv/vdb1/gf_brick
 root@vmachine04[/0]:~ # mount /dev/vdb1 /srv/vdb1/
 root@vmachine04[/0]:~ # gluster peer probe vmachine05
 peer probe: success
 
 # now switch over to vmachine05 and do
 
 root@vmachine05[/1]:~ # mkdir -p /srv/vdb1/gf_brick
 root@vmachine05[/1]:~ # mount /dev/vdb1 /srv/vdb1/
 root@vmachine05[/1]:~ # gluster peer probe vmachine04
 peer probe: success
 root@vmachine05[/1]:~ # gluster peer probe vmachine04
 peer probe: success: host vmachine04 port 24007 already in peer list
 
 # the peer probe from BOTH sides is often forgotten
 # switch back to vmachine04 and continue with
 
 root@vmachine04[/0]:~ # gluster peer status
 Number of Peers: 1
 
 Hostname: vmachine05
 Port: 24007
 Uuid: 085a1489-dabf-40bb-90c1-fbfe66539953
 State: Peer in Cluster (Connected)
 root@vmachine04[/0]:~ # gluster volume info layer_cake_volume
 
 Volume Name: layer_cake_volume
 Type: Replicate
 Volume ID: ef5299db-2896-4631-a2a8-d0082c1b25be
 Status: Started
 Number of Bricks: 1 x 2 = 2
 Transport-type: tcp
 Bricks:
 Brick1: vmachine04:/srv/vdb1/gf_brick
 Brick2: vmachine05:/srv/vdb1/gf_brick
 root@vmachine04[/0]:~ # gluster volume status layer_cake_volume
 Status of volume: layer_cake_volume
 Gluster process                             Port    Online  Pid
 ------------------------------------------------------------------------------
 Brick vmachine04:/srv/vdb1/gf_brick         49152   Y       12778
 Brick vmachine05:/srv/vdb1/gf_brick         49152   Y       16307
 NFS Server on localhost                     2049    Y       12790
 Self-heal Daemon on localhost               N/A     Y       12791
 NFS Server on vmachine05                    2049    Y       16320
 Self-heal Daemon on vmachine05              N/A     Y       16319
 
 There are no active volume tasks
 
 # set any option you might like
 
 root@vmachine04[/1]:~ # gluster volume set layer_cake_volume 
 network.remote-dio enable
 volume set: success
 
 # go to vmachine06 and mount the volume
 root@vmachine06[/1]:~ # mkdir /srv/layer_cake
 root@vmachine06[/1]:~ # mount -t glusterfs -o 
 backupvolfile-server=vmachine05 vmachine04:/layer_cake_volume 
 /srv/layer_cake
 root@vmachine06[/1]:~ # mount
 vmachine04:/layer_cake_volume on /srv/layer_cake type fuse.glusterfs 
 (rw,default_permissions,allow_other,max_read=131072)
 root@vmachine06[/1]:~ # df -h
 Filesystem Size  Used Avail Use% Mounted on
 ...
 vmachine04:/layer_cake_volume   97G  188M   92G   1% /srv/layer_cake
 
 All fine and stable
 
 
 
 # now let's see how it tastes
 # note: this is postmark on / NOT on the gluster-mounted layer_cake_volume!
 # that postmark results might be available

Re: [Gluster-users] Best Practices for different failure scenarios?

2014-02-19 Thread BGM


On 19.02.2014, at 21:15, James purplei...@gmail.com wrote:

 On Wed, Feb 19, 2014 at 3:07 PM, Michael Peek p...@nimbios.org wrote:
 Is there a best practices document somewhere for how to handle standard
 problems that crop up?
 
 Short answer, it sounds like you'd benefit from playing with a test
 cluster... Would I be correct in guessing that you haven't setup a
 gluster pool yet?
 You might want to look at:
 https://ttboj.wordpress.com/2014/01/08/automatically-deploying-glusterfs-with-puppet-gluster-vagrant/
 This way you can try them out easily...
 For some of those points... solve them with...
 
 Sort of a crib notes for things like:
 
 1) What do you do if you see that a drive is about to fail?
 RAID6
or: ZoL, raidz-x
(open to critical comments)
or: remove-brick + add-brick + volume heal
(it's really just three commands, at least in my experience so far, touch wood)
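(a rough sketch of those three commands for a replica-2 volume; volume and brick
names are hypothetical and the syntax is from memory, so double-check against the
docs for your version:

gluster volume remove-brick myvol replica 1 badhost:/bricks/b1 force
gluster volume add-brick myvol replica 2 newhost:/bricks/b1
gluster volume heal myvol full
)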
.
but Michael, I appreciate your _original_ question:
Is there a best practice document?
Nope, not that I am aware of.
.
It might be very helpful to have a wiki next to this mailing list,
where all the good experience, all the proved solutions for situations
that are brought up here, could be gathered in a more
permanent and straight way.
.
To your questions I would add:
what's best practice in setting options for performance and/or integrity...
(yeah, well, for which use case under which conditions)
a mailing list is very helpful for ad-hoc problems and questions,
but it would be nice to distill the knowledge into a permanent, searchable form.
.
Sure, anybody could set up a wiki, but...
it would need the acceptance and participation of an active group
to get the best results,
so IMO the appropriate place would be somewhere close to gluster.org?
.
regards
Bernhard
 

 
 2) What do you do if a drive has already failed?
 RAID6
 
 3) What do you do if a peer is about to fail?
 Get a new peer ready...
 
 4) What do you do if a peer has failed?
 Replace with new peer...
 
 5) What do you do to reinstall a peer from scratch (i.e. what
 configuration files/directories do you need to restore to get the host
 back up and talking to the rest of the cluster)?
 Bring up a new peer. Add to cluster... Same as failed peer...
 
 6) What do you do with failed-heals?
 7) What do you do with split-brains?
 These are more complex issues and a number of people have written about 
 them...
 Eg: http://joejulian.name/blog/fixing-split-brain-with-glusterfs-33/
 
 Cheers,
 James
 
 
 
 Michael
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://supercolony.gluster.org/mailman/listinfo/gluster-users
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://supercolony.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [Bug 1057645] ownership of diskimage changes during livemigration, livemigration with kvm/libvirt fails

2014-01-27 Thread BGM
Hi Paul & all,
I'm really keen on getting this solved,
right now it's a nasty show stopper.
I could try different gluster versions,
as long as I can get the .debs for them;
I wouldn't want to start compiling
(although: has a config option changed in the package build?)
You reported that 3.4.0 on Ubuntu 13.04 was working, right?
A code diff and the config options for the package build would be interesting.
Another approach: can anyone verify or falsify
https://bugzilla.redhat.com/show_bug.cgi?id=1057645
on a distro other than Ubuntu/Debian?
Thinking of it... could it be apparmor interference?
I had fun with apparmor and mysql on Ubuntu 12.04 once...
I will have a look at that tomorrow.
As mentioned before, a straight drbd/ocfs2 setup works (with only 1/4 of the speed
and the pain of maintenance), so as far as I can tell I have to blame the ownership
change on gluster, not on an issue with my general setup.
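(for the apparmor check, roughly what I plan to try; profile names and paths are
from memory and may differ per distro:

aa-status | grep -i libvirt                      # which libvirt profiles are loaded?
aa-complain /etc/apparmor.d/usr.sbin.libvirtd    # put libvirtd into complain mode for a test

and, apparmor aside, /etc/libvirt/qemu.conf has user/group/dynamic_ownership
settings that control how qemu image files get chowned, which might be worth a
look too)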
best regards
Bernhard
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] gluster and kvm livemigration

2014-01-23 Thread BGM
Hi Paul,
thanks, nice report.
Did you file the bug?
Can you do a
watch tree -pfungiA <path to your VM images pool>
on both hosts,
with some VMs running and some stopped?
Start a machine,
trigger the migration;
at some point, the ownership of the VM image file flips from
libvirtd (running machine) to root (normal permission, but only when stopped).
If the ownership/permission flips that way,
libvirtd on the receiving side
can't write to that file...
Does the group/ACL permission flip likewise?
Regards
Bernhard

On 23.01.2014, at 16:49, Paul Boven bo...@jive.nl wrote:

 Hi Bernhard,
 
 I'm having exactly the same problem on Ubuntu 13.04 with the 3.4.1 packages 
 from semiosis. It worked fine with glusterfs-3.4.0.
 
 We've been trying to debug this on the list, but haven't found the smoking 
 gun yet.
 
 Please have a look at the URL below, and see if it matches what you are 
 experiencing?
 
 http://epboven.home.xs4all.nl/gluster-migrate.html
 
 Regards, Paul Boven.
 
 On 01/23/2014 04:27 PM, Bernhard Glomm wrote:
 
 I had/have problems with live-migrating a virtual machine on a 2sided
 replica volume.
 
 I run ubuntu 13.04 and gluster 3.4.2 from semiosis
 
 
 with network.remote-dio to enable I can use cache mode = none as
 performance option for the virtual disks,
 
 so live migration works without --unsafe
 
 I'm triggering the migration now through the Virtual Machine Manager as an
 
 unprivileged user which is group member of libvirtd.
 
 
 After migration the disks become read-only because
 
 on migration the disk files changes ownership from
 
 libvirt-qemu to root
 
 
 What am I missing?
 
 
 TIA
 
 
 Bernhard
 
 
 
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://supercolony.gluster.org/mailman/listinfo/gluster-users
 
 
 -- 
 Paul Boven bo...@jive.nl +31 (0)521-596547
 Unix/Linux/Networking specialist
 Joint Institute for VLBI in Europe - www.jive.nl
 VLBI - It's a fringe science
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://supercolony.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] GlusterFS share authentication?

2014-01-22 Thread BGM
On 22.01.2014, at 16:43, Peter B. p...@das-werkstatt.com wrote:
 On 01/21/2014 10:31 PM, Dan Mons wrote:
 On 22 January 2014 05:19, Peter B. p...@das-werkstatt.com wrote:
 The clients in fact *do* only access it over Samba. I just figured that
 *if* one user connected a GNU/Linux machine to the LAN, he could simply
 connect with write permissions using the GlusterFS Linux client. All
 he'd have to do for authenticating is to spoof one of the storage-IPs.
 man iptables
 
 I've been working with iptables for many years, but in this particular
 case, I fail to see how they would help.
 Maybe I'm overlooking something very obvious?
 
 Could you please elaborate your suggestion a bit?

I would suggest not connecting the dedicated storage NIC(s) to the LAN,
but to a physically separated network or VLAN, or, if all that is not possible,
going through a VPN.
I could be wrong, but IMHO with ip_forward off you should be fine?
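(what Dan probably had in mind is something along these lines; the addresses and
the brick port range are examples only:

# allow glusterd and the brick ports only from the known storage hosts/clients
iptables -A INPUT -p tcp -s 192.168.242.0/24 --dport 24007:24008 -j ACCEPT
iptables -A INPUT -p tcp -s 192.168.242.0/24 --dport 49152:49251 -j ACCEPT
iptables -A INPUT -p tcp --dport 24007:24008 -j DROP
iptables -A INPUT -p tcp --dport 49152:49251 -j DROP
)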
regards
Bernhard
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Design/HW for cost-efficient NL archive = 0.5PB?

2014-01-02 Thread BGM


Sent from my iPad

On 02.01.2014, at 18:06, Justin Dossey j...@podomatic.com wrote:

 1) It depends on the number of drives per chassis, your tolerance for risk, 
 and the speed of rebuilds.  I'd recommend doing a couple of test rebuilds 
 with different array sizes to see how fast your controller and drives can 
 complete them, and then comparing the rebuild completion times to your SLA-- 
 if a rebuild takes two days to complete, is that good enough for you 
 (especially given the chances of another failure occurring during the 
 rebuild)?  All other things being equal, the smaller the array, the faster 
 the rebuild, but the more wasted space in the array.  Also note that many 
 controllers have tunable rebuild algorithms, so you can divert more resources 
 to completing rebuilds faster at the cost of performance.  One data point 
 from me: my last 16-2T-SATA RAID-6 rebuild took about 58 hours to complete.
 
 2) My understanding is that the way file reads work on GlusterFS, read 
 requests are sent to all nodes and the data is used from the first node to 
 respond to the request.  So if one node is busier than others, it is likely 
 to respond more slowly and thus receive a lower portion of the read activity, 
 as long as the files being read are larger than a single response.
 
 
 On Wed, Jan 1, 2014 at 12:21 PM, Fredrik Häll hall.fred...@gmail.com wrote:
 Thanks for all the input!
 
 It sure sounds like RAID-6 for disk failures and Gluster for the spanning 
 and high level redundancy parts is a good candidate. 
 
 Some final questions: 
 
 1) How big can one comfortably go in terms of RAID-6 array size? Given 4TB 
 SATA/SAS drives. On the one hand much points to keeping as few RAIDs as 
 possible, and disk usage is of course maximized. But there are complications 
 in terms of rebuild times and risk of losing the 2 drives. Hot spares may 
 also be an option. Your reflections?
 
 2) Is there any intelligence or automation in Gluster that makes smart use 
 of dual (or multiple) replicas? Say that I have 2 replicas, and one of them 
 is spending some effort on a RAID rebuild, is there functionality for 
 manually or automatically preferring the other (healthy) replica?
 
 Best regards, 
 
 Fredrik
 
 
 On Tue, Dec 31, 2013 at 10:27 PM, Justin Dossey j...@podomatic.com wrote:
 Yes, RAID-6 is better than RAID-5 in most cases.  I agonized over the 
 decision to deploy 5 for my Gluster cluster, and the reason I went with 5 
 is that the number of drives in the brick was (IMO) acceptably low.  I use 
 6 for my 16-drive arrays, which means I have to lose 3 disks out of the 16 
 to lose my data.  With 2x8-drive arrays in 5, I also have to lose 3 disks 
 to lose data, but if I do lose data, I only lose 50% of the data on the 
 server, and all these bricks are distribute-replicate anyway, so I wouldn't 
 actually lose any data at all.  That consideration, paired with the fact 
 that I keep spares on hand and replace failed drives within a day or two, 
 means that I'm okay with running 2x RAID-5 instead of 1x RAID-6.  (2x 
 RAID-6 would put me below my storage target, forcing additional hardware 
 purchases.)
 
 I suppose the short answer is evaluate your storage needs carefully.
 
 
 On Tue, Dec 31, 2013 at 11:19 AM, James purplei...@gmail.com wrote:
 On Tue, Dec 31, 2013 at 11:33 AM, Justin Dossey j...@podomatic.com wrote:
 
  Yes, I'd recommend sticking with RAID in addition to GlusterFS.  The 
  cluster I'm mid-build on (it's a live migration) is 18x RAID-5 bricks on 
  9 servers.  Each RAID-5 brick is 8 2T drives, so about 13T usable.  It's 
  better to deal with a RAID when a disk fails than to have to pull and 
  replace the brick, and I believe Red Hat's official recommendation is 
  still to minimize the number of bricks per server (which makes me a 
  rebel for having two, I suppose).  9 (slow-ish, SATA RAID) servers 
  easily saturate 1Gbit on a busy day.
 
 
 I think RedHat also recommends RAID6 instead of RAID5. In any case, I
 sure do, at least.
 
 James
 
 
 
 On Mon, Dec 30, 2013 at 5:54 AM, bernhard glomm
 bernhard.gl...@ecologic.eu wrote:
 
  some years ago I had a similar task.
  I did:
  - We had disk arrays with 24 slots, with optionally 4 JBODs (each 24
  slots) stacked on top, dual LWL controllers, 4GB (costs ;-)
  - creating RAIDs (6) with not more than 7 disks each
  - as far as I remember I had one hot spare per 4 raids
  - connecting as many of these RAID bricks together with striped glusterfs
  as needed
  - as for replication, I was planning for an offsite duplicate of this
  architecture and,
  because losing data was REALLY not an option, writing it all off at a
  second offsite location onto LTFS tapes.
  As the original version of the LTFS library edition was far too
  expensive for us,
  I found an alternative solution that does the same thing
  but for a much more reasonable price. LTFS is still a big thing in digital
  archiving.
  Give me a note if you like more details on 

Re: [Gluster-users] Design/HW for cost-efficient NL archive = 0.5PB?

2014-01-02 Thread BGM
Thanks Justin for an accurate example figure on this!
(58 hours rebuild time on 16 2TB SATA HDs, with one HD failing, not two, I suppose
that was)
Again I would emphasize going for maximum (affordable) brick security
(raid6/raidz2 or better) and using gluster to expand the available space and/or
for replication on a higher level (i.e. replicate the whole dataset to a second
site of servers,
either for redundancy or access speed).
There are different aspects that make up the rebuild time of a failed raid:
- raid level
- disk speed
- controller performance
- active read/write usage of the system
Tests (because the above mentioned aspects are difficult to bring into a simple
math formula)
and considering a proper SLA surely help.
Though a "hi boss, all data is lost" isn't really an option, is it?
Build stronger to last longer.
have a good one ;-)
Bernhard

On 02.01.2014, at 18:06, Justin Dossey j...@podomatic.com wrote:

to completing rebuilds faster at the cost of performance.  One data point from 
me: my last 16-2T-SATA RAID-6 rebuild took about 58 hours to complete.
 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Three nodes cluster with 2 replicas

2014-01-02 Thread BGM
What you are asking for is logically impossible:
on a RAID5, if two (you said two) drives fail, ALL data IS lost.
Think harder!
As Justin suggested, go for securing your bricks with (soft) RAID (or maybe raidz)
and extend the space by distributing it onto several servers.
In a disaster,
a hard disk, and 10 hours later a second one, fails,
while you are sick and out of office,
and your apprentice has to handle it:
out with the bad HD, in with the good one,
cross your fingers (that is, that the new HDs are good)
and rock on.
Simple solutions for complex problems!
I don't argue that it is totally impossible to build, with gluster, what you are
asking for, but to my taste it wouldn't be nice and easy (i.e. rock solid).
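(for completeness, a chained replica-2 layout across three nodes is technically
possible, e.g. with hypothetical volume and brick names:

gluster volume create myvol replica 2 \
  node1:/bricks/b1 node2:/bricks/b1 \
  node2:/bricks/b2 node3:/bricks/b2 \
  node3:/bricks/b3 node1:/bricks/b3

but, as said above, I wouldn't call that nice and easy to operate)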
Bernhard

On 01.01.2014, at 22:06, shacky shack...@gmail.com wrote:

 Hi.
 
 I have three servers with 7 hard drives (without HW RAID controller) that I 
 wish to use to create a Gluster cluster.
 
 I am looking for a way to have 2 replicas with 3 nodes, because I need much 
 storage space and 2 nodes are not enough, but I wish to have the same 
 security I'd have using a RAID5 on a node.
 
 So I wish my data to be protected if one (or two) of the 7 hard drives fails
 on the same node, and if an entire node of the three fails.
 
 Is it possible?
 Could you help me to find out the correct way?
 
 Thank you very much!
 Bye.
 
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://supercolony.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Design/HW for cost-efficient NL archive = 0.5PB?

2013-12-31 Thread BGM
Yepp,
and still I would be VERY uncomfortable with raid5 and >= 1TB disks.
Read this
https://prestoprimews.ina.fr/public/deliverables/PP_WP3_ID3.2.1_ThreatsMassStorage_R0_v1.00.pdf
and maybe other deliverables from PrestoPrime to understand why.
In short (I think it was in the above mentioned deliverable):
PrestoPrime was looking into patterns of disk failure.
For that they worked together with Google and others, who run A LOT of
spinning disks.
The clearest pattern they found was not with certain HD types or companies,
but with
batches of HDs produced at the same time at the same facility.
If one disk out of a batch failed, it was more likely that other HDs from the
same batch would fail
soon after that too.
So if you order 50 disks at once to build your storage, something admins like
to do normally,
it turns out to become a kind of Russian roulette.
The paper also points out that it takes more and more time to a)
detect a failure on a given HD and b) recover from a disk failure, the bigger
the disks become.
From that point of view PrestoPrime strongly recommended against raid5 and AT
LEAST for
raid6 for any archiving (long term storage) purpose.
My opinion/advice is:
build VERY strong bricks (raid6 or zfs raidz2/3?)
and use gluster to:
- expand the space
- increase performance
- replicate the whole thing to an offsite vault
- think of LTFS to get your valuable data resting on non-spinning media
hth
Bernhard


On 31.12.2013, at 17:33, Justin Dossey j...@podomatic.com wrote:

 Yes, I'd recommend sticking with RAID in addition to GlusterFS.  The cluster 
 I'm mid-build on (it's a live migration) is 18x RAID-5 bricks on 9 servers.  
 Each RAID-5 brick is 8 2T drives, so about 13T usable.  It's better to deal 
 with a RAID when a disk fails than to have to pull and replace the brick, and 
 I believe Red Hat's official recommendation is still to minimize the number 
 of bricks per server (which makes me a rebel for having two, I suppose).  9 
 (slow-ish, SATA RAID) servers easily saturate 1Gbit on a busy day.
 
  The following is opinion only, so make up your own mind:
 
 If I had a big pile of RAID-5 or RAID-6 bricks, I would not want to spend 
 extra money for replica-3.  Instead, I would go replica-2 and use the 
 leftover money to build in additional redundancy on the hardware (e.g. 
 redundant power, redundant 10gigE).  If money were not an object, of course 
 there's no harm in going replica-3 or more.  But every build I've ever done 
 has a budget that seems slightly small for the desired outcome.
  
 
 
 
 On Mon, Dec 30, 2013 at 5:54 AM, bernhard glomm bernhard.gl...@ecologic.eu 
 wrote:
 some years ago I had a similar task.
 I did:
 - We had disk arrays with 24 slots, with optionally 4 JBODs (each 24 slots)
 stacked on top, dual LWL controllers, 4GB (costs ;-)
 - creating RAIDs (6) with not more than 7 disks each
 - as far as I remember I had one hot spare per 4 raids
 - connecting as many of these RAID bricks together with striped glusterfs as
 needed
 - as for replication, I was planning for an offsite duplicate of this
 architecture and,
 because losing data was REALLY not an option, writing it all off at a second
 offsite location onto LTFS tapes.
 As the original version of the LTFS library edition was far too expensive
 for us,
 I found an alternative solution that does the same thing
 but for a much more reasonable price. LTFS is still a big thing in digital
 archiving.
 Give me a note if you like more details on that.
 
 - This way I could fsck all the (not too big) raids in parallel (sped things up)
 - proper robustness against disk failure
 - space that could grow infinitely in size (add more and bigger disks) and
 keep up with access speed (add more servers) at a pretty foreseeable price
 - LTFS in the vault provided just the finishing touch, having data accessible even
 if two out of three sites are down,
 at a reasonable price (for instance no heat problem at the tape location)
 Nowadays I would go for the same approach except with zfs raidz3 bricks (at least
 do a thorough test on it)
 instead of (small) hardware raid bricks.
 As for simplicity and robustness, I wouldn't like to end up with several
 hundred glusterfs bricks, each on one individual disk,
 but rather leave disk failure prevention either to hardware raid or zfs
 and use gluster to connect these bricks into the
 fs size I need (and for mirroring the whole thing to a second site if
 needed)
 hth
 Bernhard
 
 
 
   Bernhard Glomm
 IT Administration
 
 Phone:+49 (30) 86880 134
 Fax:  +49 (30) 86880 100
 Skype:bernhard.glomm.ecologic

 Ecologic Institut gemeinnützige GmbH | Pfalzburger Str. 43/44 | 10717 Berlin 
 | Germany
 GF: R. Andreas Kraemer | AG: Charlottenburg HRB 57947 | USt/VAT-IdNr.: 
 DE811963464
 Ecologic™ is a Trade Mark (TM) of Ecologic Institut gemeinnützige GmbH
 
 On Dec 25, 2013, at 8:47 PM, Fredrik Häll hall.fred...@gmail.com wrote:
 
 I am new to Gluster, but so far it seems 

Re: [Gluster-users] [Gluster-devel] glusterfs-3.4.2qa4 BUG 987555 not fixed?

2013-12-19 Thread BGM
thanks Niels,
will try that tomorrow,
and let you know of course
Bernhard


On 19.12.2013, at 17:34, Niels de Vos nde...@redhat.com wrote:

 On Thu, Dec 19, 2013 at 03:44:26PM +, Bernhard Glomm wrote:
 
 hi all
 
 I'm testing
 
 SRC: 
 http://bits.gluster.org/pub/gluster/glusterfs/src/glusterfs-3.4.2qa4.tar.gz
 
 on ubuntu 13.04
 
 previously I had gluster 3.2.7 (the one from the Ubuntu 13.04 repository) 
 installed.
 I use a two sided gluster mirror to host the imagefiles of my VM
 With gluster 3.2.7 all worked fine.
 
 I upgraded to gluster 3.4.2qa4 (see above).
 VMs still worked fine, bonnie++ tests from inside the VM instances showing 
 similar results to before,
 but then I hit the 987555 bug again
 
 The change for that bug introduces an option in the 
 /etc/glusterfs/glusterd.vol configuration file. You can now add the 
 following line to that file:
 
  volume management
  ...
  option base-port 50152
  ...
  end-volume
 
 By default this is commented out with the default port (49152). In the 
 line above, 50152 is just an example; you can pick any port you like.  
 GlusterFS tries to detect if a port is in use; if it is, it'll try the 
 next one (and so on).
 
 Also note that QEMU had a fix for this as well. With the right version 
 of QEMU, there should be no need to change this option from the default.
 Details on the fixes for QEMU are referenced in Bug 1019053.
 
 Can you let us know if setting this option and restarting all the 
 glusterfsd processes helps?
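 (On Ubuntu that would be roughly the following, with a hypothetical volume name:
 
 service glusterfs-server restart          # restarts glusterd
 gluster volume stop myvol
 gluster volume start myvol                # restarts the brick (glusterfsd) processes
 )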
 
 Thanks,
 Niels
 
 
 root@ping[/1]:~ # time virsh migrate --verbose --live --unsafe --p2p 
 --domain atom01 --desturi qemu+ssh://192.168.242.93/system
 error: Unable to read from monitor: Connection reset by peer
 
 
 root@ping[/0]:~ # netstat -tulpn|egrep 49152
 tcp   0   0 0.0.0.0:49152   0.0.0.0:*   LISTEN   3924/glusterfsd
 
 or
 
 root@ping[/0]:~ # netstat -tulpn|egrep gluster
 tcp   0   0 0.0.0.0:49155   0.0.0.0:*   LISTEN   4031/glusterfsd
 tcp   0   0 0.0.0.0:38468   0.0.0.0:*   LISTEN   5418/glusterfs
 tcp   0   0 0.0.0.0:49156   0.0.0.0:*   LISTEN   4067/glusterfsd
 tcp   0   0 0.0.0.0:933     0.0.0.0:*   LISTEN   5418/glusterfs
 tcp   0   0 0.0.0.0:38469   0.0.0.0:*   LISTEN   5418/glusterfs
 tcp   0   0 0.0.0.0:49157   0.0.0.0:*   LISTEN   4109/glusterfsd
 tcp   0   0 0.0.0.0:49158   0.0.0.0:*   LISTEN   4155/glusterfsd
 tcp   0   0 0.0.0.0:49159   0.0.0.0:*   LISTEN   4197/glusterfsd
 tcp   0   0 0.0.0.0:24007   0.0.0.0:*   LISTEN   2682/glusterd
 tcp   0   0 0.0.0.0:49160   0.0.0.0:*   LISTEN   4237/glusterfsd
 tcp   0   0 0.0.0.0:49161   0.0.0.0:*   LISTEN   4280/glusterfsd
 tcp   0   0 0.0.0.0:49162   0.0.0.0:*   LISTEN   4319/glusterfsd
 tcp   0   0 0.0.0.0:49163   0.0.0.0:*   LISTEN   4360/glusterfsd
 tcp   0   0 0.0.0.0:49165   0.0.0.0:*   LISTEN   5408/glusterfsd
 tcp   0   0 0.0.0.0:49152   0.0.0.0:*   LISTEN   3924/glusterfsd
 tcp   0   0 0.0.0.0:2049    0.0.0.0:*   LISTEN   5418/glusterfs
 tcp   0   0 0.0.0.0:38465   0.0.0.0:*   LISTEN   5418/glusterfs
 tcp   0   0 0.0.0.0:49153   0.0.0.0:*   LISTEN   3959/glusterfsd
 tcp   0   0 0.0.0.0:38466   0.0.0.0:*   LISTEN   5418/glusterfs
 tcp   0   0 0.0.0.0:49154   0.0.0.0:*   LISTEN   3996/glusterfsd
 udp   0   0 0.0.0.0:931     0.0.0.0:*            5418/glusterfs
 
 
 is there a compile option work_together_with_libvirt ;-)
 Can anyone confirm this or does anyone have a workaround?
 
 best
 
 Bernhard
 
 P.S.: As I learned in the discussion before, libvirt counts up the ports
 when it finds the ones it needs are already blocked.
 So after 12 migration attempts the VM finally WAS migrated.
 IMHO there should/could be an option to configure the start port/port range,
 and yes, that could/should ALSO be done for libvirt;
 fact is gluster 3.2.7 works (for me), 3.4.2 doesn't :-((
 I really would like to try gfapi, but not at the price of no live
 migration.
 
 -- 
 Bernhard Glomm
 IT Administration
 
 Phone: +49 (30) 86880 134
 Fax:   +49 (30) 86880 100
 Skype: bernhard.glomm.ecologic
 
 Ecologic Institut gemeinnützige GmbH | 

Re: [Gluster-users] Mount GlusterFS from localhost

2013-12-15 Thread BGM
hi,
I'm experiencing a similar issue.
I'm testing glusterfs on a two-sided mirror, on ZoL.
I've got several volumes (each on its own zfs filesystem).
mount -a works fine,
but after a reboot
all the zfs filesystems are present but only
an arbitrary subset of the gluster volumes gets mounted
(like 2 random out of 6).
I have so far circumvented that by
abandoning fstab for that purpose and using an rc script
(S99, K20) for the gluster mounts,
allowing a sleep 30 grace time, then mounting the gluster volumes
(on a two-sided mirror I mount my-ip:/volumename on /my-mountpoint,
in contrast to mirror-ip:/volumename /my-mountpoint).
All gluster volumes get mounted fine then.
AFAIK the _netdev option in fstab should have done the job,
but in my case it didn't; I thought maybe due to ZoL.
Anyhow, a 30 sec grace time does the job;
I picked 30 sec arbitrarily, it might even be too much...
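(the rc script is nothing fancy, roughly like this; IP, volume names and mount
points are placeholders:

#!/bin/sh
# /etc/init.d/glustermount -- linked as S99/K20 into the runlevels
case "$1" in
  start)
    sleep 30    # grace time for glusterd and zfs to come up
    mount -t glusterfs 192.168.1.10:/vol1 /mnt/vol1
    mount -t glusterfs 192.168.1.10:/vol2 /mnt/vol2
    ;;
  stop)
    umount /mnt/vol1 /mnt/vol2
    ;;
esac
)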
best
Bernhard

On 13.12.2013, at 17:40, Joel Young j...@cryregarder.com wrote:

 I use a systemd file in /etc/systemd/system such as the attached.  You also 
 want to make sure you've done an
 
 systemctl enable NetworkManager-wait-online.service
 systemctl enable work.mount
 systemctl start work.mount
 
 Joel
 
 
 On Sat, Dec 7, 2013 at 9:33 AM, Vadim Nevorotin mala...@ubuntu.com wrote:
 Hello!
 
 I need to mount glusterfs from localhost. So both server and client are on 
 the same host.
 
 I've add to fstab
 
 localhost:/srv_tftp /srv/tftp glusterfs defaults,_netdev 0 0
 
 Then
 
 mount -a
 
 OK, in this case everything works great. But after a reboot nothing is mounted,
 because the GlusterFS server starts after the network and after remote filesystems.
 
 Is there any solution to fix this problem? I use Debian, but I think the
 same problem exists in all other distros. As I understand it, it's impossible to
 execute an init script after the network is ready but before remote filesystems
 are mounted. That would fix the problem, but maybe there is some different solution?
 
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://supercolony.gluster.org/mailman/listinfo/gluster-users
 
 work.mount
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://supercolony.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Ubuntu GlusterFS in Production

2013-12-05 Thread BGM
Josh,
although it might be more bleeding than cutting edge:
is there, or could you provide, some howto for
getting libgfapi working on Ubuntu 13.04,
assuming it will work at all given the port problem on live migration mentioned
earlier?
Gluster 3.4.2 has, I hope, fixed the port problem?
I'd be willing to do some testing/feedback on that.
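(for reference, once qemu is built with gfapi support, the guest disk definition
in libvirt would look roughly like this; volume, image path and host name are
placeholders:

<disk type='network' device='disk'>
  <driver name='qemu' type='qcow2' cache='none'/>
  <source protocol='gluster' name='myvol/images/vm01.qcow2'>
    <host name='glusterserver01' port='24007'/>
  </source>
  <target dev='vda' bus='virtio'/>
</disk>
)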
best
Bernhard

On 05.12.2013, at 20:07, Josh Boon glus...@joshboon.com wrote:

 We're using it in production too. We're on KVM 1.6, gluster 3.4.1 running on
 Ubuntu 13.04.  No problems, but do be aware that you'll want fast links if you
 actually want to saturate your disk bandwidth. We've bonded 10gbps links and
 we still saturate those before we fully utilize our IO. I've not tested NFS
 specifically, but if performance is something you're looking for I'd strongly
 suggest gfapi, which gets us near 400MBps writes in a replica 2 config across
 the above mentioned interfaces.  You'll have to do some work on the KVM
 sources for gfapi though, as it hasn't even made it into the Debian unstable
 packages.
 
 
 Best,
 Josh
 - Original Message -
 From: Jiri Hoogeveen j.hoogev...@bluebillywig.com
 To: Gerald Brandt g...@majentis.com
 Cc: gluster-users@gluster.org List gluster-users@gluster.org
 Sent: Thursday, December 5, 2013 11:31:34 AM
 Subject: Re: [Gluster-users] Ubuntu GlusterFS in Production
 
 Hi Gerald,
 
 Yes, we are using GlusterFS 3.3.2 with Ubuntu 12.04, KVM and bonding 802.3ad 
 on 2 x 1Gbps nic. This way every tcp session can go over a different nic.
 
 For vmWare vSphere we use the NFS of GlusterFS and for KVM the native 
 glusterfs client.
 
 This setup is working nice.
 
 Grtz, Jiri
 
 On 05 Dec 2013, at 14:49, Gerald Brandt g...@majentis.com wrote:
 
 Hi,
 
 Is anyone using GlusterFS on Ubuntu in production?  Specifically, I'm 
 looking at using the NFS portion of it over a bonded interface.  I believe 
 I'll get better speed than using the gluster client across a single interface.
 
 Setup:
 
 3 servers running KVM (about 24 VM's)
 2 NAS boxes running Ubuntu (13.04 and 13.10)
 
 Since Gluster NFS does server side replication, I'll put replication data 
 over a different nic than user data.
 
 Gerald
 
 ps: I had this setup with 3.2, but it proved unstable under load.
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://supercolony.gluster.org/mailman/listinfo/gluster-users
 
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://supercolony.gluster.org/mailman/listinfo/gluster-users
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://supercolony.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users