Re: [Gluster-users] Gluster-users Digest, Vol 53, Issue 56 -- GlusterFS performance (Steve Thompson)

2012-10-02 Thread Ben England
Steve,

Try glusterfs 3.3 and look at:

http://community.gluster.org/a/linux-kernel-tuning-for-glusterfs/
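
For a flavor of the kind of kernel knobs involved (the values here are
illustrative only - what is right depends on RAM and workload, so treat
this as a sketch, not a recommendation):

  # keep brick servers from swapping aggressively
  sysctl -w vm.swappiness=10
  # start background writeback earlier and cap dirty pages sooner
  sysctl -w vm.dirty_background_ratio=5
  sysctl -w vm.dirty_ratio=20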

There will be more optimizations in the next Gluster release.  Take advantage 
of the translators that Gluster supplies, including the read-ahead and 
quick-read translators.
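
On 3.3 those translators can be toggled per volume from the CLI; a quick
sketch (the volume name is a placeholder, and both options ship enabled
by default as far as I recall):

  # confirm the current settings
  gluster volume info myvol
  # make sure the read-ahead and quick-read translators are enabled
  gluster volume set myvol performance.read-ahead on
  gluster volume set myvol performance.quick-read on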

Red Hat does offer support for Red Hat Storage, which is based on Gluster and 
ships with a pre-packaged tuning profile.  We test with 10-GbE networks, and 
Gluster 3.3 has reasonably good performance for large-file sequential 
workloads (and it scales).


[Gluster-users] Gluster and maildir

2012-10-02 Thread jason
Hello,

I am working on an email solution using Gluster 3.3 as the primary
storage for the actual email.  The system uses dovecot and maildir.  I
will be using 10 servers for this configuration (hardware servers, not
virtual systems), and I have a question about setting this up with
performance in mind.  I've used Gluster in the past with varying
success.  For this problem, I have two specific options and I'm looking
for advice on which one would perform better.

1) 10 servers  --  4 of them would be 2U systems holding ALL of the
email storage.  These 4 systems would also run the email services
(IMAP/POP, SMTP, etc...).  The remaining 6 are 1U systems that would
connect to the 2U systems for email storage while running the same
email services (IMAP/POP, SMTP, etc...).

2) 10 servers  --  all 10 are 1U servers running both storage and all
email services (IMAP/POP, SMTP, etc...).

For this question, assume the servers are identical (with the exception
of either a 1U chassis or a 2U chassis): same HDDs (2T 7200RPM), same
RAM (16G per system), same motherboard, same RAID controller, etc...
The RAID configuration of all 1U systems would be RAID1, while the RAID
configuration of all 2U systems would be RAID5 with one hot spare.
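
For what it's worth, the two layouts map to volume definitions roughly
like this (the volume name, brick paths and the replica-2 layout are
only placeholders for illustration):

  # option 1: four 2U storage servers, distributed-replicated 2x2
  gluster volume create mailvol replica 2 \
      srv1:/bricks/mail srv2:/bricks/mail \
      srv3:/bricks/mail srv4:/bricks/mail

  # option 2: all ten 1U servers contribute a brick, 5x2
  gluster volume create mailvol replica 2 \
      srv1:/bricks/mail srv2:/bricks/mail srv3:/bricks/mail \
      srv4:/bricks/mail srv5:/bricks/mail srv6:/bricks/mail \
      srv7:/bricks/mail srv8:/bricks/mail srv9:/bricks/mail \
      srv10:/bricks/mail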

Basically, I'm trying to figure out whether Gluster will perform better
with more storage nodes in the storage block, or whether I would be
better off consolidating the storage onto a few of the systems and
freeing up resources for the email services on the remaining systems.
I've had mixed results testing this in a KVM virtual environment, but
it's getting down to the time when I need to make some decisions on
ordering hardware.  I do know that RAID1 and RAID5 are not an
apples-to-apples comparison for performance; I'm looking for thoughts
from the community as to which way you would set it up.

Kind Regards,

Jason




Re: [Gluster-users] Gluster and maildir

2012-10-02 Thread Robert Hajime Lanning

On 10/02/12 13:01, ja...@combatyoga.net wrote:

Basically, I'm trying to figure out whether Gluster will perform better
with more storage nodes in the storage block, or whether I would be
better off consolidating the storage onto a few of the systems and
freeing up resources for the email services on the remaining systems.
I've had mixed results testing this in a KVM virtual environment, but
it's getting down to the time when I need to make some decisions on
ordering hardware.  I do know that RAID1 and RAID5 are not an
apples-to-apples comparison for performance; I'm looking for thoughts
from the community as to which way you would set it up.


With maildir, I believe that fewer bricks would perform better.  The
maildir format tends to be readdir() heavy, and since Gluster does not
have a master index of directory entries, it has to hit every brick in
the volume to satisfy each directory listing.
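
A cheap way to see this before committing to hardware is to time a
listing of a maildir-sized directory on a test volume built with
different brick counts; a rough sketch (the mount point and file count
are placeholders):

  # populate a directory with maildir-like numbers of small files
  mkdir -p /mnt/testvol/cur
  for i in $(seq 1 20000); do touch /mnt/testvol/cur/msg.$i; done
  # drop client caches, then time the readdir-heavy operation
  echo 3 > /proc/sys/vm/drop_caches
  time ls -f /mnt/testvol/cur > /dev/null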


--
Mr. Flibble
King of the Potato People


[Gluster-users] Retraction: Protocol stacking: gluster over NFS

2012-10-02 Thread harry mangalam
Hi All,

Well, it (http://goo.gl/hzxyw) was too good to be true.  Under extreme, 
extended IO on a 48-core node, some part of the NFS stack collapses and 
leads to an IO lockup through NFS.  We've replicated it on 48-core and 
64-core nodes, but don't know yet whether it acts similarly on 
lower-core-count nodes.

Though I haven't had time to figure out exactly /how/ it collapses, I owe it 
to those who might be thinking of using this setup to tell them not to.

This is what I wrote, describing the situation to some co-workers:
===
With Joseph's and Kevin's help, I've been able to replicate Kevin's complete 
workflow on BDUC and execute it with a normally mounted gluster fs and with 
my gluster-via-NFS-loopback (on both NFS3 and NFS4 clients).
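
(For anyone wanting to reproduce the setup: the loopback arrangement is
essentially the FUSE-mounted volume re-exported through the kernel NFS
server and mounted back on the same node, along these lines - the paths,
fsid and options are illustrative, not an exact copy of our config:)

  # native gluster mount
  mount -t glusterfs localhost:/gvol /mnt/gluster
  # /etc/exports entry; an explicit fsid is needed because the FUSE
  # mount has no device number of its own
  #   /mnt/gluster  localhost(rw,fsid=1,no_subtree_check,sync)
  exportfs -ra
  # loopback NFS mount of the re-export
  mount -t nfs -o vers=3 localhost:/mnt/gluster /mnt/gluster-nfs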

The good news is that the workflow went to completion on BDUC with the native 
gluster fs mount, doing pretty decent IO on one node - topping out at about 
250MB/s in and 75MB/s out (DDR IB):
ib1
  KB/s in  KB/s out
 268248.1  62278.40
 262835.1  64813.55
 248466.0  61000.24
 250071.3  67770.03
 252924.1  67235.13
 196261.3  56165.20
 255562.3  68524.45
 237479.3  68813.99
 209901.8  73147.73
 217020.4  70855.45

The bad news is that I've been able to replicate the failures that JF has 
seen.  The workflow starts normally but then eats up free RAM as KT's workflow 
saturates the nodes with about 26 instances of samtools, which together do a 
LOT of IO (10s of GB in the ~30m of the run).  This was the case even when I 
increased the number of nfsd threads to 16 and even 32.
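
(For reference, raising the thread count is just the usual nfsd knob; a
sketch, with the count as a placeholder:)

  # raise the kernel NFS server thread count on the fly
  rpc.nfsd 32
  # or persistently on RHEL-style boxes:
  #   set RPCNFSDCOUNT=32 in /etc/sysconfig/nfs, then: service nfs restart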

When using native gluster, the workflow goes to completion in about 23 hrs - 
about the same as when KT executed it on his machine (using NFS, I think..?).
However, when using the loopback mount, on both NFS3 and NFS4, it locks up the 
NFS side (the gluster mount continues to be R/W), requiring a hard reset on 
the node to clear the NFS error.  It is interesting that the samtools 
processes lock up during /reads/, not writes (confirmed by stracing several 
of the processes).
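
(If anyone wants to check the same thing on their own nodes, something
like this shows which syscall a stuck process is sitting in - the PID
below is a placeholder:)

  # watch the file IO syscalls of a stuck samtools process
  strace -f -p 12345 -e trace=read,write
  # or look at its in-kernel stack without attaching
  cat /proc/12345/stack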

I found this entry in a FraunhoferFS discussion:
https://groups.google.com/forum/?fromgroups=#!topic/fhgfs-user/XoGPbv3kfhc
[[
In general, any network file system that uses the standard kernel page 
cache on the client side (including e.g. NFS, just to give another 
example) is not suitable for running client and server on the same 
machine, because that would lead to memory allocation deadlocks under 
high memory pressure - so you might want to watch out for that. 
(fhgfs uses a different caching mechanism on the clients to allow 
running it in such scenarios.) 
]]
but why this would be the case, I'm not sure - the server and client processes 
should be unable to step on each other's data structures, so it's not obvious 
why they would interfere with each other.  Others on this list have mentioned 
similar opinions - I'd be interested in why this is theoretically the case.

The upshot is that under extreme, extended IO, NFS will lock up, so while we 
haven't seen it on BDUC except for KT's workflow, it's repeatable and we can't 
recover from it smoothly.  So we should move away from it.

I haven't been able to test it on a 3.x kernel (but will after this weekend); 
it's possible that it might work better, but I'm not optimistic.

-- 
Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine
[m/c 2225] / 92697 Google Voice Multiplexer: (949) 478-4487
415 South Circle View Dr, Irvine, CA, 92697 [shipping]
MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)
--
Passive-Aggressive Supporter of the The Canada Party:
  http://www.americabutbetter.com/
