Re: [Gluster-users] Performance

Hiren Joshi Thu, 13 Aug 2009 01:31:13 -0700

What are the advantages of XFS over ext3 (which I'm currently using)? My
fear with XFS when selecting a filesystem was that it's not as active or
as well supported as ext3 and if things go wrong, how easy would it be
to recover?
 
I have 6 x 1TB disks in a hardware raid 6 with battery backup and UPS,
it's now just the performance I need to get sorted...



________________________________

        From: Liam Slusser [mailto:lslus...@gmail.com] 
        Sent: 12 August 2009 20:35
        To: Mark Mielke
        Cc: Hiren Joshi; gluster-users@gluster.org
        Subject: Re: [Gluster-users] Performance
        
        

        I had a similar situation.  My larger gluster cluster has two
nodes but each node has 72 1.5TB hard drives.  I ended up creating three
30TB 24-drive raid6 arrays, formatted with xfs and 64-bit inodes, and then
exporting three bricks with gluster.  I would recommend using a hardware
raid controller with battery backup power, UPS power, and a journaled
filesystem and I think you'll be fine.

        I'm exporting the three bricks on each of my two nodes. The
clients use replication to mirror each of the three bricks across the two
servers, and then use distribute to tie it all together.
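
As a rough sketch of that layout: three replica pairs, one brick from each node, with a distribute layer picking a pair per file. The brick names are hypothetical, and CRC32 modulo is only a stand-in here, not Gluster's actual distribute hash.

```python
import zlib

# Hypothetical brick names: three bricks on each of two nodes,
# paired so that every file is stored once on each node.
REPLICA_PAIRS = [
    ("node1:/brick1", "node2:/brick1"),
    ("node1:/brick2", "node2:/brick2"),
    ("node1:/brick3", "node2:/brick3"),
]


def bricks_for(filename):
    """Return the replica pair that would hold a file.

    Stand-in placement rule (an assumption for illustration):
    CRC32 of the file name, modulo the number of replica pairs.
    """
    index = zlib.crc32(filename.encode("utf-8")) % len(REPLICA_PAIRS)
    return REPLICA_PAIRS[index]


if __name__ == "__main__":
    for name in ("a.txt", "b.txt", "c.txt"):
        print(name, "->", bricks_for(name))
```

Each file lands on exactly one pair, so losing a node still leaves one copy of every file on the other node.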

        liam


        On Wed, Aug 12, 2009 at 10:51 AM, Mark Mielke
<m...@mark.mielke.cc> wrote:
        

                On 08/12/2009 01:24 PM, Hiren Joshi wrote:
                

                                36 partitions on each server - the word "partition" is ambiguous.
                                Are they 36 separate drives? Or multiple partitions on the same
                                drive? If multiple partitions on the same drive, this would be a
                                bad idea, as it would require the disk head to move back and forth
                                between the partitions, significantly increasing the latency, and
                                therefore significantly reducing the performance. If each
                                partition is on its own drive, you still won't see benefit unless
                                you have many clients concurrently changing many different files.
                                In your above case, it's touching a single file in sequence, and
                                having a cluster is costing you rather than benefiting you.
                                    
                                


                        We went with 36 partitions (on a single raid 6
                        drive) in case we got file system corruption; it
                        would take less time to fsck a 100G partition than
                        a 3.6TB one. Would a 3.6TB single disk be better?
                        


                Putting 3.6 TB on a single disk sounds like a lot of
eggs in one basket. :-)
                
                If you are worried about fsck, I would definitely do as
the other poster suggested and use a journalled file system. This nearly
eliminates the fsck time for most situations. This would be whether
using 100G partitions or using 3.6T partitions. In fact, there are very
few reasons not to use a journalled file system these days.
                
                As for how to deal with data on this partition - the
file system is going to have a better chance of placing files close to
each other, than setting up 36 partitions and having Gluster scatter the
files across all of them based on a hash. Personally, I would choose 4 x
1 Tbyte drives over 1 x 3.6 Tbyte drive, as this nearly quadruples my
bandwidth and for highly concurrent loads, nearly divides by four the
average latency to access files.
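
A back-of-the-envelope model of the latency claim, under stated assumptions: requests are split evenly across the drives, each drive serves its queue one request at a time, and every request takes the same service time. The 10 ms service time and 100 outstanding requests are made-up numbers for illustration only.

```python
def mean_completion_ms(outstanding, drives, service_ms):
    """Mean time until a request completes, in milliseconds.

    Assumes `outstanding` divides evenly by `drives`. On each drive the
    i-th queued request finishes at i * service_ms, so the mean over a
    queue of n requests is service_ms * (n + 1) / 2.
    """
    per_drive = outstanding / drives
    return service_ms * (per_drive + 1) / 2


if __name__ == "__main__":
    one = mean_completion_ms(outstanding=100, drives=1, service_ms=10.0)
    four = mean_completion_ms(outstanding=100, drives=4, service_ms=10.0)
    print(one)                    # 505.0
    print(four)                   # 130.0
    print(f"{one / four:.2f}")    # 3.88
```

The ratio approaches four only as the queue gets deep, which is why the gain shows up for highly concurrent loads and not for a single client touching one file in sequence.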
                
                But, if you already have the 3.6 Tbyte drive, I think
the only performance-friendly use would be to partition it based upon
access requirements, rather than a hash (random). That is, files that
are accessed frequently should be clustered together at the front of a
disk, files accessed less frequently could be in the middle, and files
accessed infrequently could be at the end. This would be a three
partition disk. Gluster does not have a file system that does this
automatically (that I can tell), so it would probably require a software
solution on your end. For example, I believe dovecot (IMAP server)
allows an "alternative storage" location to be defined, so that
infrequently read files can be moved to another disk, and it knows to
check the primary storage first, and fall back to the alternative
storage after.
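
The "check primary first, fall back to alternative" lookup could be sketched roughly as below. This is not dovecot's implementation; the function names and directory layout are hypothetical, and it assumes both storage locations are plain directories mounted on the client.

```python
import shutil
from pathlib import Path


def locate(relative_name, primary, alternate):
    """Return the path of a file, checking primary storage first.

    Falls back to the alternate (infrequently read) storage, and
    returns None if the file is in neither location.
    """
    for root in (Path(primary), Path(alternate)):
        candidate = root / relative_name
        if candidate.is_file():
            return candidate
    return None


def demote(relative_name, primary, alternate):
    """Move an infrequently read file from primary to alternate storage."""
    source = Path(primary) / relative_name
    target = Path(alternate) / relative_name
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(source), str(target))
    return target
```

Deciding which files count as "infrequently read" (for example by access time) is left out; that policy would have to come from the application.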
                
                If you can't break up your storage by access patterns,
then I think a 3.6 Tbyte file system might still be the next best option
- it's still better than 36 partitions. But, make sure you have a good
file system on it, that scales well to this size. 


                Cheers,
                mark
                
                -- 
                Mark Mielke<m...@mielke.cc>
                
                

                _______________________________________________
                Gluster-users mailing list
                Gluster-users@gluster.org
        
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

