Re: [Gluster-users] Small files

2015-01-29 Thread Liam Slusser
Matan -

We replicate to two nodes.  But since a zfs send | zfs recv communicates
one-way, I'd think you could do as many as you want.  It just might take a
little bit longer - although you should be able to run multiple at a time
as long as you had enough bandwidth over the network.  Ours are connected
via a dedicated 10gigabit network and see around 4-5gbit/sec on a large
commit.  How long the replication job takes depends on how much is changed
between the two snapshots.
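
For anyone unfamiliar with the mechanism, here is a minimal sketch of what one replication pass looks like (pool, dataset, and host names are made up; our actual scripts differ):

  # first full copy to a replica host
  zfs snapshot tank/images@snap1
  zfs send tank/images@snap1 | ssh replica1 zfs recv -F backup/images

  # later passes only ship the delta between two snapshots
  zfs snapshot tank/images@snap2
  zfs send -i tank/images@snap1 tank/images@snap2 | ssh replica1 zfs recv backup/images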

Even though the seek time with a SSD is quick, you'll still get far greater
throughput in sequential read/writing vs small random accesses.

You can test it yourself.  Create a directory with 100 64MB files and
another directory with 64,000 100K files.  Now copy each from one place to
another and see for yourself which is faster.  Sequential reading always
wins.  And this is true with both Gluster and HDFS.
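
If you want to reproduce that test, something along these lines works (paths are placeholders; /mnt/gluster stands in for whatever mount you're testing):

  # 100 x 64MB files (~6.4GB total)
  mkdir big
  for i in $(seq 1 100); do dd if=/dev/zero of=big/file$i bs=1M count=64 2>/dev/null; done

  # 64,000 x 100KB files (~6.4GB total)
  mkdir small
  for i in $(seq 1 64000); do dd if=/dev/zero of=small/file$i bs=100K count=1 2>/dev/null; done

  # compare how long each set takes to copy
  time cp -r big /mnt/gluster/
  time cp -r small /mnt/gluster/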

In HDFS, small files exacerbate the problem because you need to contact the
NameNode to get the block information and then contact the DataNode to get
the block.  Think of it like this.  Reading 1000 64KB files in HDFS means
1000 requests to the NameNode and 1000 requests to the DataNodes, while
reading one 64MB file is one trip to the NameNode and one trip to the
DataNode to get the same amount of data.

You can read more about this issue here:
http://blog.cloudera.com/blog/2009/02/the-small-files-problem/

thanks,
liam

On Thu, Jan 29, 2015 at 12:30 PM, Matan Safriel dev.ma...@gmail.com wrote:

 Hi Liam,

 Thanks for the comprehensive reply (!)
 How many nodes do you safely replicate to with ZFS?
 I don't think seek time is much of a concern with SSD by the way, so it
 does seem that glusterfs is much better for the small-files scenario than
 HDFS, which as you say is very different in key aspects.  I couldn't quite
 follow why rebalancing is slow, or slower than in the case of HDFS,
 unless you just meant that HDFS works at a large block level and no more.

 Perhaps you'd care to comment ;)

 Matan

 On Thu, Jan 29, 2015 at 9:15 PM, Liam Slusser lslus...@gmail.com wrote:

 Matan - I'll do my best to take a shot at answering this...

 They're completely different technologies.  HDFS is not posix compliant
 and is not a mountable filesystem while Gluster is.

 In HDFS land, every file, directory and block is represented as
 an object in the namenode’s memory, each of which occupies roughly 150 bytes.
 So 10 million files (each with at least one block, so two objects apiece)
 would eat up about 3 gigs of memory.  Furthermore, HDFS was
 designed for streaming large files - the default blocksize in HDFS is 64MB.

 Gluster doesn't have a central namenode, so having millions of files
 doesn't put a tax on it in the same way.  But, again, small files cause
 lots of small seeks to handle the replication tasks/checks and generally
 aren't very efficient.  So don't expect blazing performance...  Doing
 rebalancing and rebuilding of Gluster bricks can be extremely painful since
 Gluster isn't a block level filesystem - so it will have to read each file
 one at a time.

 If you want to use HDFS and don't need a mountable filesystem have a look
 at HBASE.

 We tackled the small files problem by using a different technology.  I
 have an image store of about 120 million+ small-file images; I needed a
 mountable filesystem which was posix compliant and ended up doing a ZFS
 setup - using the built-in replication to create a few identical copies on
 different servers for both load balancing and reliability.  So we update
 one server and then have a few read-only copies serving the data.  Changes
 get replicated, at a block level, every few minutes.

 thanks,
 liam


 On Thu, Jan 29, 2015 at 4:29 AM, Matan Safriel dev.ma...@gmail.com
 wrote:

 Hi,

 Is glusterfs much better than hdfs for the many small files scenario?

 Thanks,
 Matan



 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-users




___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Writing is slow when there are 10 million files.

2014-04-15 Thread Liam Slusser
I had about 100 million files in Gluster and it was unbelievably painfully
slow.  We had to ditch it for other technology.


On Mon, Apr 14, 2014 at 11:24 PM, Franco Broi franco.b...@iongeo.comwrote:


 I seriously doubt this is the right filesystem for you, we have problems
 listing directories with a few hundred files, never mind millions.

 On Tue, 2014-04-15 at 10:45 +0900, Terada Michitaka wrote:
  Dear All,
 
 
 
  I have a problem with slow writing when there are 10 million files.
  (Top level directories are 2,500.)
 
 
  I configured GlusterFS distributed cluster(3 nodes).
  Each node's spec is below.
 
 
   CPU: Xeon E5-2620 (2.00GHz 6 Core)
   HDD: SATA 7200rpm 4TB*12 (RAID 6)
   NW: 10GBEth
   GlusterFS : glusterfs 3.4.2 built on Jan  3 2014 12:38:06
 
  This cluster(volume) is mounted on CentOS via FUSE client.
  This volume is the storage for our application and I want to store 3
  hundred million to 5 billion files.
 
 
  I performed a write test, writing 32KByte files × 10 million to this
  volume, and encountered a problem.
 
 
  (1) Writing is very slow and slows down as the number of files increases.
    In a non-clustered situation (one node), this node's writing speed is
  40 MByte/sec at random,
    but writing speed is 3.6 MByte/sec on the cluster.
  (2) The ls command is very slow.
    It takes about 20 seconds. Directory creation takes about 10 seconds at
  best.
 
 
  Question:
 
   1) Is it possible to store 5 billion files in GlusterFS?
     Has someone succeeded in storing a billion files in GlusterFS?
 
   2) Could you give me a link to a tuning guide or some information on
  tuning?
 
  Thanks.
 
 
  -- Michitaka Terada
  ___
  Gluster-users mailing list
  Gluster-users@gluster.org
  http://supercolony.gluster.org/mailman/listinfo/gluster-users


 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://supercolony.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Writing is slow when there are 10 million files.

2014-04-15 Thread Liam Slusser
We consolidated hardware into a single large ZFS server with a redundant
hot slave.

thanks,
liam


On Mon, Apr 14, 2014 at 11:33 PM, Jeffrey 'jf' Lim jfs.wo...@gmail.comwrote:

 On Tue, Apr 15, 2014 at 2:30 PM, Liam Slusser lslus...@gmail.com wrote:
 
  I had about 100 million files in Gluster and it was unbelievably
 painfully
  slow.  We had to ditch it for other technology.
 

 and what is (or was) that other technology?

 -jf

 --
 He who settles on the idea of the intelligent man as a static entity
 only shows himself to be a fool.

 Mensan / Full-Stack Technical Polymath / System Administrator
 12 years over the entire web stack: Performance, Sysadmin, Ruby and
 Frontend

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Writing is slow when there are 10 million files.

2014-04-15 Thread Liam Slusser
Our application also stores the path of the file in a database.  Accessing
a file directly is normally pretty speedy.  However, to get the files into
the database required searching parts of the filesystem which was really
slow.  We also had users using the filesystem fixing things which was all
unix shell ls/cp/mv etc, and again, really slow.

And the biggest problem I had was if one of the nodes went down for a
reboot/patching/whatever, to resync the filesystems took weeks because of
the huge number of files.

thanks,
liam



On Tue, Apr 15, 2014 at 3:15 AM, Terada Michitaka terra...@gmail.comwrote:

  To Liam:

 I had about 100 million files in Gluster and it was unbelievably
 painfully slow.  We had to ditch it for other technology.

 Did the slowdown occur when writing files, listing files, or both?

 In our application, the path of the data is managed in a database.
 ls being slow does not affect my application, but slow file writing is
 critical.

  To All:

 I uploaded statistics from the write test (32KByte x 10 million files, 6 bricks).

   http://gss.iijgio.com/gluster/gfs-profile_d03r2.txt

 At line 15, the average-latency value is about 30 ms.
 I cannot judge whether this value represents normal performance or not.

 Is it slow?

 Thanks,
 --Michika Terada




 2014-04-15 16:05 GMT+09:00 Franco Broi franco.b...@iongeo.com:


 My bug report is here
 https://bugzilla.redhat.com/show_bug.cgi?id=1067256

 On Mon, 2014-04-14 at 23:51 -0700, Joe Julian wrote:
  If you experience pain using any filesystem, you should see your
  doctor.
 
  If you're not actually experiencing pain, perhaps you should avoid
  hyperbole and instead talk about what version you tried, what your
  tests were, how you tried to fix it, and what the results were.
 
  If you're using a current version with a kernel that has readdirplus
  support for fuse it shouldn't be that bad. If it is, file a bug report
  - especially if you have the skills to help diagnose the problem.
 
  On April 14, 2014 11:30:26 PM PDT, Liam Slusser lslus...@gmail.com
  wrote:
 
  I had about 100 million files in Gluster and it was
  unbelievably painfully slow.  We had to ditch it for other
  technology.
 
 
  On Mon, Apr 14, 2014 at 11:24 PM, Franco Broi
  franco.b...@iongeo.com wrote:
 
  I seriously doubt this is the right filesystem for
  you, we have problems
  listing directories with a few hundred files, never
  mind millions.
 
  On Tue, 2014-04-15 at 10:45 +0900, Terada Michitaka
  wrote:
   Dear All,
  
  
  
   I have a problem with slow writing when there are 10
  million files.
   (Top level directories are 2,500.)
  
  
   I configured GlusterFS distributed cluster(3 nodes).
   Each node's spec is below.
  
  
CPU: Xeon E5-2620 (2.00GHz 6 Core)
HDD: SATA 7200rpm 4TB*12 (RAID 6)
NW: 10GBEth
GlusterFS : glusterfs 3.4.2 built on Jan  3 2014
  12:38:06
  
   This cluster(volume) is mounted on CentOS via FUSE
  client.
   This volume is storage of our application and I want
  to store 3
   hundred million to 5 billion files.
  
  
   I performed a writing test, writing 32KByte file ×
  10 million to this
   volume, and encountered a problem.
  
  
   (1) Writing is so slow and slow down as number of
  files increases.
 In non clustering situation(one node), this node's
  writing speed is
   40 MByte/sec at random,
 But writing speed is 3.6MByte/sec on that cluster.
   (2) ls command is very slow.
 About 20 second. Directory creation takes about 10
  seconds at
   lowest.
  
  
   Question:
  
1)5 Billion files are possible to store in
  GlusterFS?
 Has someone succeeded to store billion  files to
  GlusterFS?
  
2) Could you give me a link for a tuning guide or
  some information of
   tuning?
  
   Thanks.
  
  
   -- Michitaka Terada
 
   ___
   Gluster-users mailing list
   Gluster

Re: [Gluster-users] Switch recommendations

2012-03-30 Thread Liam Slusser
Just to put in my two cents.  I have 12 Dell 6248s connected via 10g to
a core 10g Brocade switch and haven't had any problems.  They work very
well, are super reliable, and are easy to manage.  I do recommend
using the latest firmware off of Dell's website though!!

My only complaint is they do not have dual power supplies, though you
can get a 1U Dell power thingy that can act as a second PS for 3 or 4
of them.  But it takes up another U of space; I'd much rather have an
option to put in another PS.

I don't do anything crazy with them, just basic snmp stats, vlan
groups, and some trunk/port-channel groups, and they do all that very
well.  They're great top of rack switches - which is what we use them
for.  For the price they are hard to beat.

liam

On Fri, Jan 27, 2012 at 5:04 AM, Dan Bretherton
d.a.brether...@reading.ac.uk wrote:
 Dear All,
 I need to buy a bigger GigE switch for my GlusterFS cluster and I am trying
 to decide whether or not a much more expensive one would be justified.  I
 have limited experience with networking so I don't know if it would be
 appropriate to spend £500, £1500 or £3500 for a 48-port switch.  Those rough
 costs are based on a comparison of 3 Dell Powerconnect switches: the 5548
 (bigger version of what we have now), the 6248 and the 7048.  The servers in
 the cluster are nothing special - mostly Supermicro with SATA drives and
 1GigE network adapters.  I can only justify spending more than ~£500 if I
 can be sure that users would notice the difference.  Some of the users'
 applications do lots of small reads and writes, and they do run much more
 slowly if all the servers are not connected to the same switch, as is the
 case now while I don't have a big enough switch.  Any advice or comments
 would be much appreciated.

 Regards
 Dan.
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Recommendations for busy static web server replacement

2012-02-08 Thread Liam Slusser
64k stripe, and yes, one big 22+2 raid6 array.

Liam
On Feb 7, 2012 11:41 PM, Brian Candler b.cand...@pobox.com wrote:

 On Tue, Feb 07, 2012 at 05:11:01PM -0800, Liam Slusser wrote:
  We run a similar setup here.  I use gluster 2.x to stripe/replicate 8
  30tb xfs bricks (24x1.5tb raid6) into a single 120tb array.

 What stripe size are you using on the RAID6 array?

 Do you put all 24 drives into a single RAID6 group (22+2), or two groups of
 12+2, or something else?

 Regards,

 Brian.

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] A faster way to to replicate?

2012-01-16 Thread Liam Slusser
All -

Recently one of our bricks lost a hard drive; during the rebuild (3ware
9690 controller) we lost another drive and then had a few ECC errors
on a third.  This was on a ~30tb 24 drive RAID6 array.  I was able to
force the controller to rebuild with an ignore_ECC flag, which has
completed successfully, and the XFS partition appears to be fine.  In
the 3ware device logs I see a dozen alerts about bad sectors/ECC
errors.  The partition is at 97% full so there's a pretty good chance we
have some data corruption.
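
For anyone curious, the 3ware forced rebuild is typically done through the tw_cli tool; something like this (controller/unit/port numbers are examples, and this is a sketch rather than the exact commands I ran):

  tw_cli /c0 show                               # identify the unit and the failed port
  tw_cli /c0/u0 start rebuild disk=5 ignoreECC  # force the rebuild despite ECC errors
  tw_cli /c0/u0 show                            # watch the rebuild progress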

But not to worry, Gluster to the rescue, right?  We currently have two
copies of our data and gluster handles the replication between them -
let's call them the A and B clusters.  Our A cluster is the cluster
having issues.  We've been planning on adding a third C cluster for
extra reliability and mostly for the added performance.  So, since I
have a known-good copy of the troubled brick on our B
cluster, I started a gluster sync of our B cluster to the new C
cluster.

And OMG it's so slow.  I've been running an ls -alR for the last week
and it's only done 3.8% (replicated ~9.9 million files) of our total
space, with an estimated finish date another 223 days out - that's the end
of August!  So my question is how can I get this done quicker?  Can I
rsync one brick to another brick directly?  I know that will not copy
the extended attributes correctly and I believe it will mess up gluster,
right?
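
For reference, rsync can at least be told to carry extended attributes along, which is the part I'm unsure is safe for gluster's internal xattrs (sketch only; brick paths are placeholders):

  # -a archive, -H hardlinks, -A ACLs, -X extended attributes
  rsync -aHAX /bricks/brick1/ newserver:/bricks/brick1/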

Anybody have some great ideas?

I'm running gluster 2.0.9 with 64bit Centos 5.7/6.2.  Each A/B/C
cluster is 4 x 30tb xfs raid6 bricks for a total of ~120tb (84tb in
use).

liam
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] ZFS + Linux + Glusterfs for a production ready 100+ TB NAS on cloud

2011-09-30 Thread Liam Slusser
I've used ZFS in lots of different roles and I've found that out of
the box ZFS performs decently, but to get really great performance out of
the (zfs) filesystem you really need to tune it for the application.
ZFS has tons and tons of somewhat hidden features (edit /etc/system
and reboot type stuff) and, if set correctly, it has outstanding
performance.
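
As one concrete example of the /etc/system style of tuning I mean (on Solaris), capping the ARC size - the value here is purely illustrative and depends entirely on your workload and RAM:

  * /etc/system - limit the ZFS ARC to 16GB (example value only)
  set zfs:zfs_arc_max = 0x400000000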

liam

On Thu, Sep 29, 2011 at 10:58 AM, Joe Landman
land...@scalableinformatics.com wrote:
 This said, please understand that there is a (significant) performance cost
 to all those nice features in ZFS.  And there is a reason why it's not
 generally considered a high performance file system.  So if you start
 building with it, you shouldn't necessarily think that the whole is going to
 be faster than the sum of the parts.  Might be worse.

 This is a caution from someone who has tested/shipped many different file
 systems in the past.  ZFS included, on Solaris and other machines.  There is
 a very significant performance penalty one pays for using some of these
 features.  You have to decide if this penalty is worth it.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] ZFS + Linux + Glusterfs for a production ready 100+ TB NAS on cloud

2011-09-24 Thread Liam Slusser
I have a very large, 500tb, Gluster cluster on Centos Linux, but I use the
XFS filesystem in a production role.  Each xfs filesystem (brick) is around
32tb in size.  No problems; it all runs very well.

ls
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] ZFS + Linux + Glusterfs for a production ready 100+ TB NAS on cloud

2011-09-24 Thread Liam Slusser
I've also heard it can be slower; however, I've never done any performance
tests on the same hardware with ext3/4 vs XFS since my partitions are so big
that ext3/4 is just not an option.  With that said, I've been pleased with the
performance I get and am a happy XFS user.

ls
On Sep 24, 2011 12:31 PM, Craig Carl cr...@gestas.net wrote:
 XFS is a valid alternative to ZFS on Linux. If I remember correctly any
operation that requires modifying a lot of xattr's can be slower than ext*,
have you noticed anything like that? You might see slower rebalances or self
healing?

 Craig

 Sent from a mobile device, please excuse my tpyos.

 On Sep 24, 2011, at 22:14, Liam Slusser lslus...@gmail.com wrote:

 I have a very large, 500tb, Gluster cluster on Centos Linux but I use
the XFS filesystem in a production role. Each xfs filesystem (brick) is
around 32tb in size. No problems all runs very well.

 ls
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] gluster usable as open source solution

2011-08-08 Thread Liam Slusser
Glad I could help, Uwe.

We do use hardware RAID inside the servers.  The reason is that
gluster isn't block level, so doing a remirror/resync with 80+
million files takes weeks - so if we lose a hard drive the rebuild
time would be unacceptable.  With hardware raid a rebuild takes only a
day or two.

We access our clusters mostly via the native fuse gluster client
because performance via the NFS client is somewhat slow.  However, we
do have a few clients that connect via NFS, either mounted read-only or
where a lot of performance isn't required.
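
For completeness, the two access paths look roughly like this on a client (host and volume names are placeholders, and the exact syntax differs a bit between gluster versions):

  # native FUSE client
  mount -t glusterfs server1:/myvolume /mnt/gluster

  # NFS, read-only, for the low-traffic clients
  mount -t nfs -o ro,vers=3 server1:/myvolume /mnt/gluster-ro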

liam

On Thu, Aug 4, 2011 at 11:13 AM, Uwe Kastens kiste...@googlemail.com wrote:
 Hello Liam,

 Thank you for sharing information. This is very kind and helpful.

 Indeed there are some questions:
 - Are you working with hardware raid inside the server?
  - How are you accessing the storage? NFS/gluster native?

 Kind Regards

 Uwe




 2011/8/4 Liam Slusser lslus...@gmail.com

  I run two Gluster clusters in a production role using open source
 Gluster, all supported in-house mostly by myself.

 We have a 4-node 240tb after raid (576tb raw) cluster supporting a
 farm of audio transcoders.  This one was built not so much for speed
 (its not very speedy), but to be reliable and cheap.  Over the last
 two years we've had a few small issues but nothing major.  Very
 reliable.  All built on commodity hardware (Supermicro chassis's,
 Seagate 7.2k desktop harddrives).

 I also run a smaller 6-node 120tb (432tb raw) as storage for a pool of
 public facing apache webservers.  This smaller cluster serves content
 to feed our CDN providers which feeds all our users.  We can saturate
 a gigabit line (with 2-3meg http objects) without issues.  (Same
 Supermicro chassis's and Seagate 7.2k desktop harddrives)  This
 cluster has never gone down in the last two years it has been running.

 Our two homebuilt Gluster clusters replaced nearly 1/4 of a million
  dollars in Isilon hardware for less than the cost of the Isilon annual
 support contract while doubling the space at the same time.  It has
 saved our company hundreds of thousands of dollars and has been hugely
 successful.

 You're welcome to email me offline if you would like more information.

 liam

 On Thu, Aug 4, 2011 at 3:38 AM, Uwe Kastens kiste...@googlemail.com
 wrote:
  Hi,
 
  I looked at gluster over the past year. It looks nice but the commercial
  option is not so interesting, since it is not possible to evaluate a
  storage
  solution within 30 days. More than one any other storage platform its a
  matter of trust, if the scaling is working.
 
  So my questions to this mailinglist are:
  - Anybody using the open source edition in a bigger production
  environment?
  How is the expierence over a longer time?
  - Since gluster seems only to offer support within the enterprise
  version.
  Anybody out there how is supporting the open source edition?
 
  Regards
 
  Uwe
 
 
  ___
  Gluster-users mailing list
  Gluster-users@gluster.org
  http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
 
 


___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] HW raid or not

2011-08-08 Thread Liam Slusser
I'm in the HW raid camp.  Mostly because gluster is not block level,
so with large quantities of files replication can take days or weeks.
In my case a rebuild/resync can take weeks because of how many
files/directories I have in my cluster.

With hardware RAID I can just replace the disk and a rebuild happens
automatically and very quickly.

liam

On Mon, Aug 8, 2011 at 4:12 AM, Gabriel-Adrian Samfira
samfiragabr...@gmail.com wrote:
 We use raw disks with our setup. Gluster takes care of the replication
 part, so RAID would be useless for us. Performance wise, you are
 better off just adding a new brick and let gluster do the rest.

 Best regards,
 Gabriel

 On Mon, Aug 8, 2011 at 9:54 AM, Uwe Kastens kiste...@googlemail.com wrote:
 Hi,

 I know, that there is no general answer to this question :)

 Is it better to use HW Raid or LVM as gluster backend or raw disks?

 Regards

 Uwe


 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] gluster usable as open source solution

2011-08-04 Thread Liam Slusser
I run two Gluster clusters in a production role using open source
Gluster, all supported in-house, mostly by myself.

We have a 4-node, 240tb-after-raid (576tb raw) cluster supporting a
farm of audio transcoders.  This one was built not so much for speed
(it's not very speedy), but to be reliable and cheap.  Over the last
two years we've had a few small issues but nothing major.  Very
reliable.  All built on commodity hardware (Supermicro chassis,
Seagate 7.2k desktop harddrives).

I also run a smaller 6-node 120tb (432tb raw) cluster as storage for a pool of
public facing apache webservers.  This smaller cluster serves content
to feed our CDN providers, which feed all our users.  We can saturate
a gigabit line (with 2-3meg http objects) without issues.  (Same
Supermicro chassis and Seagate 7.2k desktop harddrives.)  This
cluster has never gone down in the two years it has been running.

Our two homebuilt Gluster clusters replaced nearly 1/4 of a million
dollars in Isilon hardware for less than the cost of the Isilon annual
support contract while doubling the space at the same time.  It has
saved our company hundreds of thousands of dollars and has been hugely
successful.

You're welcome to email me offline if you would like more information.

liam

On Thu, Aug 4, 2011 at 3:38 AM, Uwe Kastens kiste...@googlemail.com wrote:
 Hi,

 I looked at gluster over the past year. It looks nice but the commercial
 option is not so interesting, since it is not possible to evaluate a storage
 solution within 30 days. More than with any other storage platform, it's a
 matter of trust whether the scaling works.

 So my questions to this mailing list are:
 - Anybody using the open source edition in a bigger production environment?
 How is the experience over a longer time?
 - Since gluster seems only to offer support with the enterprise version,
 is anybody out there supporting the open source edition?

 Regards

 Uwe


 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] hardware raid controller

2011-07-15 Thread Liam Slusser
That's not a hardware raid controller.  The raid is done in software
via the raid driver.  You can probably find a linux driver however its
a really really crappy raid card.  I'd recommend getting something
else like a 3ware/lsi etc.

liam

On Fri, Jul 15, 2011 at 1:55 AM, Derk Roesink derkroes...@viditech.nl wrote:
 Hello!

 Im trying to install my first Gluster Storage Platform server.

 It has a Jetway JNF99FL-525-LF motherboard with an internal raid
 controller (based on a Intel ICH9R chipset) which has 4x 1tb drives for
 data that i would like to run in a RAID5 configuration

 It seems Gluster doesnt support the raid controller.. Because i still see
 the 4 disks as 'servers' in the WebUI.

 Any ideas?!

 Kind Regards,

 Derk



 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Does gluster make use of a multicore setup? Hardware recs.?

2011-04-27 Thread Liam Slusser
Gluster is threaded and will take advantage of multiple CPU hardware
and memory, however having a fast disk subsystem is far more
important.  Having a lot of memory with huge bricks isn't very
necessary IMO because even with 32g of ram your cache hit ratio across
huge 30+tb bricks is so insanely small that it doesn't really make any
real world difference.

I have a 240tb cluster over 16 bricks (4 physical servers each
exporting 4 30tb bricks) and another 120tb cluster over 8 bricks (4
physical servers each exporting 2 30tb bricks).

Hardware wise both my clusters are basically the same.  Supermicro
SC846E1-R900B 4u 24 drive chassis, dual 2.2ghz quadcore xeons, 8g ram,
3ware 9690sa-4i4e SAS raid controller, 24 x Seagate 1.5tb SATA 7200rpm
hard drives in each chassis.  Each brick is a raid6 over all 24 drives
per chassis.  I daisy chain the chassis together via SAS cables.  So
on my larger cluster I daisy chain 3 more 24 drive chassis off the
back of the head node.  On the smaller cluster I only daisy chain one
chassis off the back.

Lots of people prefer not to do raid and have gluster handle the file
replication (replicate) between bricks.  My problem with that is that, with
a huge number of files (I have nearly 100 million files on my larger
cluster), a rebuild (ls -alR) takes 2-3 weeks.  And since those
Seagate drives are crap (I lose maybe 1-3 drives a month!) I would
constantly be rebuilding almost all the time.  Using hardware raid
makes life much easier for me.  Instead of having 384 bricks I have
16.  When I lose a drive I just hot swap it and let the 3ware
controller rebuild the raid6 array.  The rebuild time on the 3ware
depends on the workload but it's anywhere from 2-5 days normally.
time I lost a drive in the middle of a rebuild (so one failed and one
in a rebuild state) and was able to hotswap the new failed drive and
it correctly rebuilt the array with two failed drives without any
problems or downtime on the cluster.  Win!

So I'm a big fan of hardware raid, especially the 3ware controllers.
They handle the slow non-enterprise Seagate drives very well.  I've
tried LSI, Dell Perc 5e/6e, and Supermicro (LSI) controllers and they
all had issues with drive timeouts.  A few recommendations when using
the 3ware controllers: disable smartd in Linux (it pisses off the
3ware controller, and the controller keeps an eye on the SMART data
on each disk anyway); set the block readahead in linux to 16384
(/sbin/blockdev --setra 16384 /dev/sdX); upgrade the firmware on the
3ware controller to the newest version from 3ware; use the newest
3ware drivers and not the driver bundled with whatever linux
distro you use; spend the $100 and make sure you get the optional
battery backup module for the controller; and use nagios to check your
raid status!  Oh, and if you use desktop commodity hard drives, make
sure you have a bunch of spares on hand.  :)
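
Condensed into commands, those recommendations look roughly like this (device and service names will differ per distro; treat it as a sketch):

  # stop smartd so it doesn't fight with the 3ware controller
  service smartd stop && chkconfig smartd off

  # bump the block-device readahead on each array device
  /sbin/blockdev --setra 16384 /dev/sdb

  # check controller, array and BBU status (also what a nagios check would poll)
  tw_cli /c0 show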

Even with hardware raid I still use gluster's replication to provide
me redundancy so I can do patches and system maintenance without
downtime to my clients.  I mirror bricks between head nodes and then
use distribute to glue all the replicated bricks together.

I have two Dell 1950 1u public facing webservers (Linux/Apache) using
the gluster fuse mount connected via a private backend network to my
smaller cluster.  My average file request size is around 3megs (10-15
requests per second), and I've been able to push 800mbit/sec of http
traffic from those two clients.  Might have been higher but my
firewall only has gigabit ethernet which was basically saturated at
that point.   I only use a 128meg gluster client cache because I'm
feeding my CDN so the requests are very random and I very rarely see
two requests for the same file.  That's pretty awesome random read
performance if you ask me, considering the hardware.  I start getting
uncomfortable with any more than 600mbit/sec of traffic, as the service
read times off the bricks on the gluster servers start getting quite
high.  Those 1.5tb Seagate hard drives are cheap, $80 a drive, but
they're not very fast at random reads.

Hope that helps!

liam

On Wed, Apr 27, 2011 at 1:39 AM, Martin Schenker
martin.schen...@profitbricks.com wrote:
 Hi all!

 I'm new to the Gluster system and tried to find answers to some simple
 questions (and couldn't find the information with Google etc.)

 -does Gluster spread its cpu load across a multicore environment? So does
 it make sense to have 50 core units as Gluster servers? CPU loads seem to go
 up quite high during file system repairs so spreading / multithreading
 should help? What kind of CPUs are working well? How much does memory help
 performance?

 -Are there any recommendations for commodity hardware? We're thinking of 36
 slot 4U servers, what kind of controllers DO work well for IO speed? Any
 real life experiences? Does it dramatically improve the performance to
 increase the number of controllers per disk?

 The aim is for  a ~80-120T file system with 2-3 

Re: [Gluster-users] Server side or client side AFR

2011-04-22 Thread Liam Slusser
The problem with server side AFR is you lose your redundancy in the event
that one of your servers goes down, because the client is only
connecting to one of the two servers.  So if your client is connected
to the server that went down, you're SOL.

liam

On Fri, Apr 22, 2011 at 4:05 PM, Nobodys Home n1sh...@yahoo.com wrote:
 Hello all,
 I prefer server side AFR for my topology and I think it makes overall sense
 for most configurations.  However, I don't think the
 3.1 documentation regarding the creation of a replicated distributed volume
 is server side after a bit of testing.  It seems like I'm in a bit of a
 pickle:

 1.  Can I only expand volumes on the fly if I use the gluster cli to
 initially define my volumes?

 2. Most of the examples regarding server side AFR do not show how to add
  bricks to a predefined server.vol configuration on the fly.  If there is a
 way to do this can I get help on this?

 Best Regards,
 Nobodys Home
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Backup Strategy

2011-03-09 Thread Liam Slusser
Netbackup is great and can probably back up directly from a glusterFS
client mount; however, the license and software cost for a few clients
and one server/media server is nearly $50k.  Not exactly cheap.  I'd
look into Amanda backup if I was on a budget and wanted to back up to
tape.

Another option is to just rsync your gluster cluster to a Sun
Solaris server with a ZFS partition.  Then you can do nightly zfs
snapshots of your data (snapshots only save what changes, so they use
very little space).
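
A minimal sketch of that approach, run nightly from cron (hostnames, pool and dataset names are made up):

  # pull the data from a gluster client mount onto the ZFS box
  rsync -a --delete glusterclient:/mnt/glustervol/ /backup/glustervol/

  # snapshot it; each snapshot only costs the blocks that changed
  zfs snapshot backup/glustervol@$(date +%Y-%m-%d)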

liam

On Wed, Mar 9, 2011 at 11:40 AM, Mohit Anchlia mohitanch...@gmail.com wrote:
 Thanks! have you heard of netbackup? Our co. already has license for
 it. I think it can be used.

 On Wed, Mar 9, 2011 at 11:11 AM, Sabuj Pattanayek sab...@gmail.com wrote:
 I read the docs. But here you go :

 http://lmgtfy.com/?q=backuppc+howto

 On Wed, Mar 9, 2011 at 1:05 PM, Mohit Anchlia mohitanch...@gmail.com wrote:
 Thanks! Is there a short blog or steps that I can look at.
 documentaion looks overwhelming at first look :)

 On Wed, Mar 9, 2011 at 10:53 AM, Sabuj Pattanayek sab...@gmail.com wrote:
 for the amount of features that you get with backuppc, it's worth the
 fairly painless setup. Btw, we've found that it's better/faster to use
 tar via backuppc (it supports rsync as well) to do the backups rather
 than rsync in backuppc. Rsync can be really slow if you have
 thousands/millions of files.

 On Wed, Mar 9, 2011 at 12:50 PM, Mohit Anchlia mohitanch...@gmail.com 
 wrote:
  Is there a problem with using just rsync vs backuppc? I need to read
  about backuppc and how easy it is to set up.


 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster 3.1.1 64-bit only?

2010-12-14 Thread Liam Slusser
You can - it's just not supported or very well tested.

liam

On Mon, Dec 13, 2010 at 10:07 AM, Matt Keating
matt.keating.li...@gmail.com wrote:
 Will it not run at all if I compile on a 32bit system?

 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] glusterfs client waiting on SYN_SENT to connect...

2010-12-14 Thread Liam Slusser
Just wanted to update you all.  Turns out the problem is my Juniper
firewall - sort of.  I've created a service in our Juniper that
describes Gluster and allows the tcp session to never time out.
The problem comes when a server crashes and the TCP connection isn't
cleaned up.  It looks like the gluster client always starts using
the same outbound (source) TCP port, and in our firewall that
source/dest port combination is already in use (it never times out, right?),
so the firewall isn't allowing it to be created again - the connection is
blocked.

So right now if I do a netstat -pan

tcp    0    1  10.10.10.101:996    10.20.10.102:6996   SYN_SENT     23491/glusterfs
tcp    0    1  10.10.10.101:997    10.20.10.102:6996   SYN_SENT     23491/glusterfs
tcp    0    1  10.10.10.101:1000   10.20.10.102:6996   SYN_SENT     23491/glusterfs
tcp    0    0  10.10.10.101:1001   10.20.10.102:6996   ESTABLISHED  23491/glusterfs
tcp    0    0  10.10.10.101:999    10.20.10.101:6996   ESTABLISHED  23491/glusterfs
tcp    0    1  10.10.10.101:998    10.20.10.101:6996   SYN_SENT     23491/glusterfs
tcp    0    1  10.10.10.101:1003   10.20.10.101:6996   SYN_SENT     23491/glusterfs
tcp    0    1  10.10.10.101:1002   10.20.10.101:6996   SYN_SENT     23491/glusterfs

Now if I kill the gluster process and restart it again...notice the
source port doesn't change...

tcp    0    1  10.10.10.101:996    10.20.10.102:6996   SYN_SENT     23687/glusterfs
tcp    0    1  10.10.10.101:997    10.20.10.102:6996   SYN_SENT     23687/glusterfs
tcp    0    1  10.10.10.101:1000   10.20.10.102:6996   SYN_SENT     23687/glusterfs
tcp    0    0  10.10.10.101:1001   10.20.10.102:6996   ESTABLISHED  23687/glusterfs
tcp    0    0  10.10.10.101:999    10.20.10.101:6996   ESTABLISHED  23687/glusterfs
tcp    0    1  10.10.10.101:998    10.20.10.101:6996   SYN_SENT     23687/glusterfs
tcp    0    1  10.10.10.101:1003   10.20.10.101:6996   SYN_SENT     23687/glusterfs
tcp    0    1  10.10.10.101:1002   10.20.10.101:6996   SYN_SENT     23687/glusterfs

Now if I kill and restart a few times...I can get lucky and get a
different source port...but you can see I'm still missing a few
bricks.

tcp    0    0  10.10.10.101:994    10.20.10.102:6996   ESTABLISHED  23745/glusterfs
tcp    0    0  10.10.10.101:995    10.20.10.102:6996   ESTABLISHED  23745/glusterfs
tcp    0    0  10.10.10.101:998    10.20.10.102:6996   ESTABLISHED  23745/glusterfs
tcp    0    1  10.10.10.101:1000   10.20.10.102:6996   SYN_SENT     23745/glusterfs
tcp    0    0  10.10.10.101:997    10.20.10.101:6996   ESTABLISHED  23745/glusterfs
tcp    0    0  10.10.10.101:996    10.20.10.101:6996   ESTABLISHED  23745/glusterfs
tcp    0    1  10.10.10.101:1003   10.20.10.101:6996   SYN_SENT     23745/glusterfs
tcp    0    1  10.10.10.101:1002   10.20.10.101:6996   SYN_SENT     23745/glusterfs

Now telnet always works because it always picks a random source port:

$ telnet 10.20.10.102 6996
Trying 10.20.10.102...
Connected to glusterserver (10.20.10.102).
Escape character is '^]'.

$ netstat -pan|grep telne
tcp    0    0  10.10.10.101:58757  10.20.10.102:6996   ESTABLISHED  23622/telnet

Why does gluster not use a more random source port?  I'm going to
have to dig through the Juniper docs to see if I can manually close an
active session (let's hope), which should fix my immediate problem but
doesn't really fix the long term problem.

Thoughts?

thanks,
liam

On Fri, Dec 3, 2010 at 6:51 PM, Liam Slusser lslus...@gmail.com wrote:
 Ah, the two different IPs are because I was changing my IPs for this mailing
 list and I guess I forgot that one.  :)  Will try adding a static route.
 Also going to snoop traffic and see if the gluster client is actually
 getting to the server or being blocked by the firewall.  I'll letcha all know
 what I find.

 Thanks for the ideas.

 Liam

 On Dec 3, 2010 6:32 PM, mki-gluste...@mozone.net wrote:
 On Fri, Dec 03, 2010 at 04:25:18PM -0800, Liam Slusser wrote:
 [r...@client~]# netstat -pan|grep glus
 tcp 0 1 10.8.10.107:1000 10.8.11.102:6996 SYN_SENT 3385/glusterfs

 from the gluster client log:

 However, the port is obviously open...

 [r...@client~]# telnet 10.8.11.102 6996
 Trying 10.2.56.102...
 Connected to glusterserverb (10.8.11.102).
 Escape character is '^]'.
 ^]
 telnet close
 Connection closed.

 Looking further... why is your telnet trying 10.2.56.102 when you
 clearly specified 10.8.11.102? Also, what happens if you do a
 specific route for the 10.8.11.0/24 block thru the appropriate gw
 without relying on the default gw to route for you

[Gluster-users] glusterfs client waiting on SYN_SENT to connect...

2010-12-03 Thread Liam Slusser
Hey all,

I've run into a weird problem.  I have a few client boxes that
occasionally crash due to a non-gluster related problem.  But once the
box comes back up I cannot get the Gluster client to reconnect to the
bricks.

Centos 5 64bit and Gluster 2.0.9

df shows:

df: `/mnt/mymount': Transport endpoint is not connected

[r...@client~]# netstat -pan|grep glus

tcp    0    1  10.8.10.107:1000   10.8.11.102:6996   SYN_SENT   3385/glusterfs
tcp    0    1  10.8.10.107:1001   10.8.11.102:6996   SYN_SENT   3385/glusterfs
tcp    0    1  10.8.10.107:998    10.8.11.102:6996   SYN_SENT   3385/glusterfs
tcp    0    1  10.8.10.107:996    10.8.11.102:6996   SYN_SENT   3385/glusterfs
tcp    0    1  10.8.10.107:1003   10.8.11.101:6996   SYN_SENT   3385/glusterfs
tcp    0    1  10.8.10.107:1002   10.8.11.101:6996   SYN_SENT   3385/glusterfs
tcp    0    1  10.8.10.107:997    10.8.11.101:6996   SYN_SENT   3385/glusterfs
tcp    0    1  10.8.10.107:999    10.8.11.101:6996   SYN_SENT   3385/glusterfs

from the gluster client log:

+--+
[2010-12-03 15:48:28] W [glusterfsd.c:526:_log_if_option_is_invalid]
readahead: option 'page-size' is not recognized
[2010-12-03 15:48:28] N [glusterfsd.c:1306:main] glusterfs: Successfully started
[2010-12-03 15:48:29] W [fuse-bridge.c:1892:fuse_statfs_cbk]
glusterfs-fuse: 2: ERR = -1 (Transport endpoint is not connected)
[2010-12-03 15:48:30] W [fuse-bridge.c:1892:fuse_statfs_cbk]
glusterfs-fuse: 3: ERR = -1 (Transport endpoint is not connected)
[2010-12-03 15:48:31] W [fuse-bridge.c:1892:fuse_statfs_cbk]
glusterfs-fuse: 4: ERR = -1 (Transport endpoint is not connected)
[2010-12-03 15:48:31] W [fuse-bridge.c:1892:fuse_statfs_cbk]
glusterfs-fuse: 5: ERR = -1 (Transport endpoint is not connected)
[2010-12-03 15:48:32] W [fuse-bridge.c:1892:fuse_statfs_cbk]
glusterfs-fuse: 6: ERR = -1 (Transport endpoint is not connected)
[2010-12-03 15:51:37] E [socket.c:745:socket_connect_finish] brick1a:
connection to  failed (Connection timed out)
[2010-12-03 15:51:37] E [socket.c:745:socket_connect_finish] brick1a:
connection to  failed (Connection timed out)
[2010-12-03 15:51:37] E [socket.c:745:socket_connect_finish] brick2a:
connection to  failed (Connection timed out)
[2010-12-03 15:51:37] E [socket.c:745:socket_connect_finish] brick2a:
connection to  failed (Connection timed out)
[2010-12-03 15:51:37] E [socket.c:745:socket_connect_finish] brick1b:
connection to  failed (Connection timed out)
[2010-12-03 15:51:37] E [socket.c:745:socket_connect_finish] brick1b:
connection to  failed (Connection timed out)
[2010-12-03 15:51:37] E [socket.c:745:socket_connect_finish] brick2b:
connection to  failed (Connection timed out)
[2010-12-03 15:51:37] E [socket.c:745:socket_connect_finish] brick2b:
connection to  failed (Connection timed out)
[2010-12-03 15:59:46] W [fuse-bridge.c:1892:fuse_statfs_cbk]
glusterfs-fuse: 7: ERR = -1 (Transport endpoint is not connected)
[2010-12-03 15:59:47] W [fuse-bridge.c:1892:fuse_statfs_cbk]
glusterfs-fuse: 8: ERR = -1 (Transport endpoint is not connected)
[2010-12-03 15:59:54] W [fuse-bridge.c:1892:fuse_statfs_cbk]
glusterfs-fuse: 9: ERR = -1 (Transport endpoint is not connected)
[2010-12-03 15:59:55] W [fuse-bridge.c:1892:fuse_statfs_cbk]
glusterfs-fuse: 10: ERR = -1 (Transport endpoint is not connected)
[2010-12-03 15:59:55] W [fuse-bridge.c:1892:fuse_statfs_cbk]
glusterfs-fuse: 11: ERR = -1 (Transport endpoint is not connected)
[2010-12-03 15:59:55] W [fuse-bridge.c:1892:fuse_statfs_cbk]
glusterfs-fuse: 12: ERR = -1 (Transport endpoint is not connected)
[2010-12-03 15:59:56] W [fuse-bridge.c:1892:fuse_statfs_cbk]
glusterfs-fuse: 13: ERR = -1 (Transport endpoint is not connected)

However, the port is obviously open...

[r...@client~]# telnet 10.8.11.102 6996
Trying 10.2.56.102...
Connected to glusterserverb (10.8.11.102).
Escape character is '^]'.
^]
telnet close
Connection closed.

The gluster server log doesn't see ANY connection attempts from the
client; however, it DOES see my telnet tcp attempts.  I'm using IP
addresses in all my configuration files - no names.  I do have a
Juniper firewall between the two servers that is doing stateful
firewalling and I've set it up for the connections to never time out -
and I've never had a problem once it finally connects.  And I can
create a new connection with telnet but not with the client...

Anybody seen anything like this before?  Ideas?

thanks,
liam
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] glusterfs client waiting on SYN_SENT to connect...

2010-12-03 Thread Liam Slusser
I thought the exact same thing...but like I said I can telnet to the
host/port without any issue.  And there are no other issues on the
network that would indicate anything not working correctly.  And all the
other clients on the same network/switch are working fine.  It's only
when a client crashes...

liam

On Fri, Dec 3, 2010 at 4:34 PM,  m...@mozone.net wrote:
 I've run into a weird problem.  I have a few client boxes that
 occasionally crash due to a non-gluster related problem.  But once the
 box comes back up i cannot get the Gluster client to reconnect to the
 bricks.

 This almost seems like a networking/firewall issue...  Do you have
 any trunks setup between the switch that the client and/or server
 are on and the router?  Perhaps one of those trunk legs is down
 causing random packets to get blackholed?

 Mohan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] glusterfs client waiting on SYN_SENT to connect...

2010-12-03 Thread Liam Slusser
Telnet never fails.  The Gluster client consistently fails, however.
The server is using bonded NICs but as far as I can tell they're
configured correctly; both links are up and passing traffic.

On Fri, Dec 3, 2010 at 6:15 PM,  mki-gluste...@mozone.net wrote:
 On Fri, Dec 03, 2010 at 06:03:19PM -0800, Liam Slusser wrote:
  This almost seems like a networking/firewall issue... ?Do you have
  any trunks setup between the switch that the client and/or server
  are on and the router? ?Perhaps one of those trunk legs is down
  causing random packets to get blackholed?

 I thought the exact same thing...but like i said i can telnet to the
 host/port without any issue.  And there is no other issues on the
 network that would indicate any not working correctly.  And all the
 other clients on the same network/switch are working fine.  Its only
 when a client crashes...

 Consistently?  If random telnets fail then that would explain your
 random SYN_SENT state stuck sockets.  Is the client or server using
 bonded nics?

 Mohan
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] upgrade 2.0.9 - 3.1.0 ?

2010-11-29 Thread Liam Slusser
Daniel,

I just asked this question earlier this month; see
http://gluster.org/pipermail/gluster-users/2010-November/005801.html
for that thread.  It was also recommended that I wait for 3.1.1, which
was released today, I believe.

thanks,
liam

On Mon, Nov 29, 2010 at 9:37 AM, Daniel Maher dma+glus...@witbe.net wrote:
 Hello all,

 I have a relatively straightforward 4-node gluster setup (2 clients, 2
 servers, client-side replication) running version 2.0.9 across the board.

 We are considering upgrading to 3.1.0 .  The documentation indicates that it
 would be as simple as:
 - shut gluster down
 - uninstall the previous packages (we package install everything)
 - install the new packages
 - generate the new server and client configs
 - start everything up

 Beyond what is indicated above, is there any particular upgrade path or
 specific concerns we will need to address ?

 Thank you.


 --
 Daniel Maher dma+gluster AT witbe DOT net
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] gluster with xfs

2010-11-16 Thread Liam Slusser
Our smaller cluster, 60tb, stores media data and acts as our CDN feed
system.  Its a pretty simple setup, the front end is two Dell 1950 servers
running Apache mounting gluster via the fuse client.  We use bonded gigabit
ethernet on the back side to two supermicro 4u 24 bay servers.  Each server
has another supermicro 4u 24 bay chassis hanging off the back connected via
SAS.  Both servers are mirrors of each other.  Drives are desktop Seagate
1.5tb drives connected to a 3ware 9690 SAS card.  We make one huge 24 drive
raid6 volume (~30tb) as a brick and use gluster to glue it all together.

Performance is decent - we've pushed nearly 800mbit of web traffic with it.
Our Juniper firewall only has gigabit anyway, so I don't know how much more I
could push if I went to 10g.  One weird thing I've noticed is that 500mbit of web
traffic is about double that on the backend, which is why we use bonded
ethernet for the backend.

Another trick we do is that our two frontend webservers only mount one server
each - so webserver A only mounts gluster server A.  We found that gluster
constantly verifying the files were in sync added a ~20%
overhead.  All the clients that actually write the data of course mount both
servers so the files get mirrored correctly.

Email me privately if you want more detail.

Liam
 On Nov 15, 2010 6:57 AM, Rudi Ahlers r...@softdux.com wrote:
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] gluster with xfs

2010-11-15 Thread Liam Slusser
We run two somewhat large gluster clusters in production on xfs with great
success.  I had to go with xfs as ext4 doesn't support large enough file
systems.  Make sure you mount your xfs partitions with 64bit inode support
and use only 64bit OS's.
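
For reference, that just means building and mounting the bricks along these lines (device and mount point are examples):

  mkfs.xfs /dev/sdb1
  mount -o inode64,noatime /dev/sdb1 /bricks/brick1

  # or the equivalent /etc/fstab entry:
  # /dev/sdb1  /bricks/brick1  xfs  inode64,noatime  0 0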

I'm still running 2.0.9 however the performance is pretty good.  We use ours
to store media for our website and with our smaller two server four brick
60tb cluster I can easily push 800mbit of http traffic with an average
object size of 2-3megs.  Not bad for a bunch of slow sata disks!

Liam
On Nov 15, 2010 2:53 AM, David Lloyd david.ll...@v-consultants.co.uk
wrote:
 Hello,

 We're starting to set up a 4 node gluster system. I'm currently trying
 to decide on the low-level options, including what filesystem to use.

 For various reasons I would be more comfortable with XFS over ext4,
 but I read in the 'Introduction to Gluster' that 'XFS (can be slow)'.

 I haven't found any other details about this, and wondered if anyone
 has more information or experience of using gluster with XFS. Or if
 anything has changed with 3.1. We don't want it to be slow, and I'm
 happy enough using ext4 if necessary, but just wanted to see what
 others thought first.

 Thanks
 David

 --
 David Lloyd
 V Consultants
 www.v-consultants.co.uk
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] upgrading from 2.0.9 to 3.1, any gotchas?

2010-11-12 Thread Liam Slusser
Hey Gluster Users,

Been a while since I've posted here.  I'm looking to upgrade our 150tb
10 brick cluster from 2.0.9 to 3.1.  Are there any gotchas that I
should be aware of?  Anybody run into any problems?  Any suggestions
or hints would be most helpful.  I'm hoping the new Gluster will be a
bit more forgiving on split brain issues, and an increase in
performance is always welcome.

thanks,
liam
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] iscsi with gluster

2010-04-22 Thread Liam Slusser
I have a gfs2 cluster and have found the performance to be  
outstanding. It's great with small files.   It's hard to say how it  
compares to my gluster cluster since I designed them to do different  
tasks.  But since the storage is all shared block level it does have  
many advantages.


Liam



On Apr 22, 2010, at 1:29 AM, milo...@gmail.com milo...@gmail.com  
wrote:


On Thu, Apr 22, 2010 at 8:38 AM, Liam Slusser lslus...@gmail.com  
wrote:

You COULD run gluster on top of an iscsi mounted volume...but why
would you want to?  If you already have an iscsi SAN why not use
gfs2

or something like that?



You need full cluster infrastructure for that - Gluster is a much
simpler solution.

GFS2 is also _very_ slow, although I never ran a test to compare it
with Gluster, but my feeling is that Gluster is much faster.

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] How to re-sync

2010-03-07 Thread Liam Slusser
Assuming you used raid1 (replicate), you DO bring up the new machine
and start gluster.  On one of your gluster mounts you run an ls -alR
and it will resync the new node.  The gluster clients are smart enough
to get the files from the first node.

liam

On Sat, Mar 6, 2010 at 11:48 PM, Chad ccolu...@hotmail.com wrote:
 Ok, so assuming you have N glusterfsd servers (say 2 cause it does not
 really matter).
 Now one of the servers dies.
 You repair the machine and bring it back up.

 I think 2 things:
 1. You should not start glusterfsd on boot (you need to sync the HD first)
 2. When it is up how do you re-sync it?

 Do you rsync the underlying mount points?
 If it is a busy gluster cluster it will be getting new files all the time.
 So how do you sync and bring it back up safely so that clients don't connect
 to an incomplete server?

 ^C
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] gluster rebuild time

2010-02-04 Thread Liam Slusser
All,

I've been asked to share some rebuild times on my large gluster
cluster.  I recently added more storage (bricks) and did a
full ls -alR on the whole system.  I estimate we have around 50
million files and directories.

Gluster Server Hardware:

2 x Supermicro 4u chassis with 24 1.5tb SATA drives and another 24
1.5tb SATA drives in an external drive array via SAS (total of 96
drives all together), 8 core 2.5ghz xeon, 8gig ram
3ware raid controllers, 24 drives per raid6 array, 4 arrays total, 2
arrays per server
Centos 5.3 64bit
XFS with inode64 mount option
Gluster 2.0.9
Bonded gigabit ethernet

Clients:

20 or so Dell 1950 clients
Mixture of RedHat ES4 and Centos 5 clients + 20 Windows XP clients via
Samba (theses are VMs and do have to run on windows jobs)
All clients on gigabit ethernet

I must say that the load on our gluster servers is normally very high;
the load average on the box is anywhere from 7-10 at peak (although with
decent service times) - so I'm sure if we had a more idle system the
rebuild time would have been quicker.  The system is at its highest
load while writing a large amount of data at the peak of the day -
so I try to schedule jobs around our peak times.

Anyhow...

I started the job sometime January 16th and it JUST finished...18 days later.

real27229m56.894s
user13m19.833s
sys 56m51.277s

Finish date was Wed Feb  3 23:33:12 PST 2010

Now I know some people have mentioned that Gluster is happier with
many bricks instead of the larger raid arrays I use, however either
way I'd be stuck doing an ls -aglR, which takes forever.  So I'd rather
add a huge amount of space at once and keep the system setup similar -
and let my 3ware controllers deal with drive failures instead of
having to do an ls -aglR each time I lose a drive.  Replacing a drive
with the 3ware controller takes 7 to 8 days in a 24 drive raid6 array, but
that's better than 18 days for Gluster to do an ls -aglR.

By comparison, our old 14 node Isilon 6000 cluster (6tb per node) did a
node rebuild/resync in about a day or two - there's a big difference between
block level and file system level replication!

We're still running Gluster 2.0.9 but I am looking to upgrade to 3.0
once a few more releases are out, and am hoping that the new checksum
based checks will speed up this whole process.  Once I have some
numbers on 3.0 I'll be sure to share.

thanks,
liam
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] booster with apache permission denied

2010-01-11 Thread Liam Slusser
 [client-protocol.c:5733:client_setvolume_cbk]
brick2a: Connected to 192.168.12.35:6996, attached to remote volume
'brick2a'.
[2010-01-11 14:16:02] D
[libglusterfsclient.c:1713:libgf_vmp_map_ghandle] libglusterfsclient:
New Entry: /pub
[2010-01-11 14:16:02] D
[libglusterfsclient.c:1421:libgf_init_vmpentry] libglusterfsclient:
New VMP entry: /pub
[2010-01-11 14:16:02] D
[libglusterfsclient.c:1724:libgf_vmp_map_ghandle] libglusterfsclient:
Empty list
[2010-01-11 14:16:02] D [booster.c:1190:booster_init] booster: booster is inited
[2010-01-11 14:16:02] D [libglusterfsclient.c:5318:glusterfs_chmod]
libglusterfsclient: path
/home/httpd/apps/httpd-2.2.14/logs/cgisock.29127
[2010-01-11 14:16:02] D
[libglusterfsclient.c:1541:_libgf_vmp_search_entry]
libglusterfsclient: VMP Search: path
/home/httpd/apps/httpd-2.2.14/logs/cgisock.29127, type: LongestPrefix
[2010-01-11 14:16:02] D
[libglusterfsclient.c:1631:libgf_vmp_search_entry] libglusterfsclient:
VMP Entry not found: path:
/home/httpd/apps/httpd-2.2.14/logs/cgisock.29127
[2010-01-11 14:16:02] D [libglusterfsclient.c:5443:glusterfs_chown]
libglusterfsclient: path
/home/httpd/apps/httpd-2.2.14/logs/cgisock.29127
[2010-01-11 14:16:02] D
[libglusterfsclient.c:1541:_libgf_vmp_search_entry]
libglusterfsclient: VMP Search: path
/home/httpd/apps/httpd-2.2.14/logs/cgisock.29127, type: LongestPrefix
[2010-01-11 14:16:02] D
[libglusterfsclient.c:1631:libgf_vmp_search_entry] libglusterfsclient:
VMP Entry not found: path:
/home/httpd/apps/httpd-2.2.14/logs/cgisock.29127
[2010-01-11 14:16:02] D
[libglusterfsclient.c:1713:libgf_vmp_map_ghandle] libglusterfsclient:
New Entry: /pub
[2010-01-11 14:16:02] D
[libglusterfsclient.c:1421:libgf_init_vmpentry] libglusterfsclient:
New VMP entry: /pub
[2010-01-11 14:16:02] D
[libglusterfsclient.c:1724:libgf_vmp_map_ghandle] libglusterfsclient:
Empty list
[2010-01-11 14:16:02] D [booster.c:1190:booster_init] booster: booster is inited
[2010-01-11 14:16:02] D
[libglusterfsclient.c:1713:libgf_vmp_map_ghandle] libglusterfsclient:
New Entry: /pub
[2010-01-11 14:16:02] D
[libglusterfsclient.c:1421:libgf_init_vmpentry] libglusterfsclient:
New VMP entry: /pub
[2010-01-11 14:16:02] D
[libglusterfsclient.c:1724:libgf_vmp_map_ghandle] libglusterfsclient:
Empty list
[2010-01-11 14:16:02] D [booster.c:1190:booster_init] booster: booster is inited
[2010-01-11 14:16:12] D [libglusterfsclient.c:4866:glusterfs_stat]
libglusterfsclient: path /pub/data/tnsc/test/test.mp3
[2010-01-11 14:16:12] D
[libglusterfsclient.c:1541:_libgf_vmp_search_entry]
libglusterfsclient: VMP Search: path /pub/data/tnsc/test/test.mp3,
type: LongestPrefix
[2010-01-11 14:16:12] D
[libglusterfsclient.c:1628:libgf_vmp_search_entry] libglusterfsclient:
VMP Entry found: path :/pub/data/tnsc/test/test.mp3 vmp: /pub/
[2010-01-11 14:16:12] D [libglusterfsclient.c:4788:__glusterfs_stat]
libglusterfsclient: path /data/tnsc/test/test.mp3, op: 2
[2010-01-11 14:16:12] D
[libglusterfsclient.c:869:libgf_resolve_path_light]
libglusterfsclient: Path: /data/tnsc/test/test.mp3, Resolved Path:
/data/tnsc/test/test.mp3
[2010-01-11 14:16:12] D
[libglusterfsclient-dentry.c:268:__do_path_resolve]
libglusterfsclient-dentry: resolved path(/data/tnsc/test/test.mp3)
till 1(/). sending lookup for remaining path
[2010-01-11 14:16:12] D [libglusterfsclient.c:4725:libgf_client_stat]
libglusterfsclient: path /data/tnsc/test/test.mp3, status 0, errno 0
[2010-01-11 14:16:12] D [libglusterfsclient.c:3001:glusterfs_open]
libglusterfsclient: path /pub/data/tnsc/test/test.mp3
[2010-01-11 14:16:12] D
[libglusterfsclient.c:1541:_libgf_vmp_search_entry]
libglusterfsclient: VMP Search: path /pub/data/tnsc/test/test.mp3,
type: LongestPrefix
[2010-01-11 14:16:12] D
[libglusterfsclient.c:1628:libgf_vmp_search_entry] libglusterfsclient:
VMP Entry found: path :/pub/data/tnsc/test/test.mp3 vmp: /pub/
[2010-01-11 14:16:12] D
[libglusterfsclient.c:869:libgf_resolve_path_light]
libglusterfsclient: Path: /data/tnsc/test/test.mp3, Resolved Path:
/data/tnsc/test/test.mp3
[2010-01-11 14:16:12] D
[libglusterfsclient-dentry.c:389:libgf_client_path_lookup]
libglusterfsclient: resolved path(/data/tnsc/test/test.mp3) to
1118653312/1118655564
[2010-01-11 14:16:12] D [libglusterfsclient.c:2752:libgf_client_open]
libglusterfsclient: open: path /data/tnsc/test/test.mp3, status: 0,
errno 117


On Mon, Jan 11, 2010 at 1:23 PM, Raghavendra G raghavendra...@gmail.com wrote:
 Hi Liam,

 Can you send glusterfs server logs?

 regards,
 On Sat, Jan 9, 2010 at 1:46 AM, Liam Slusser lslus...@gmail.com wrote:

 I believe I posted this here before but never got any replies.  I'm in
 the middle of upgrading to Gluster 2.0.9 and would like to move away
 from having to use fuse to serve files out of apache, so I'm working
 again on getting booster working correctly.

 Everything appears to load and work fine but I always get permission
 denied, 403, in my apache logs.  It works fine under fuse.  I'm running
 Apache under the user nobody which does have read access

Re: [Gluster-users] booster with apache permission denied

2010-01-11 Thread Liam Slusser
I was able to install lighttpd 1.4.25 and it appears to work just fine
with glusterfs-booster.so.  So I think it's an issue with Apache 2.2.14
(the newest version available).  I suppose I can try an older version
of Apache and see if I have better luck (say 2.0)...

liam

On Mon, Jan 11, 2010 at 2:20 PM, Liam Slusser lslus...@gmail.com wrote:
 Logs are below.  I also noticed this while trying to debug this
 issue... notice the md5sums do not match up below?

 On the fuse mounted system:

 [r...@server test]# ls -al test.mp3
 -rw-r--r-- 1 user group 3692251 Aug 27  2007 test.mp3

 [r...@server test]# md5sum test.mp3
 d480d794882c814ae1a2426b79cf8b3e  test.mp3

 Using glusterfs-boost.so:

 [r...@server tmp]#
 LD_PRELOAD=/home/gluster/apps/glusterfs-2.0.9/lib/glusterfs/glusterfs-booster.so
 ls -al /pub/data/tnsc/test/test.mp3
 ls: /pub/data/tnsc/test/test.mp3: Invalid argument
 -rw-r--r-- 1 tcode tcode 3692251 Aug 27  2007 /pub/data/tnsc/test/test.mp3

 [r...@server tmp]#
 LD_PRELOAD=/home/gluster/apps/glusterfs-2.0.9/lib/glusterfs/glusterfs-booster.so
 cp /pub/data/tnsc/test/test.mp3 /tmp/test.mp3

 [r...@server tmp]# md5sum /tmp/test.mp3
 9bff3bb90b6897fc19b6b4658b83f3f8  /tmp/test.mp3

 [r...@server tmp]# ls -al /tmp/test.mp3
 -rw-r--r-- 1 root root 3690496 Jan 11 14:10 /tmp/test.mp3

 Here are the gluster logs from a clean apache start and one request to
 test.mp3 with wget:

 [2010-01-11 14:16:02] D [xlator.c:634:xlator_set_type] xlator:
 dlsym(notify) on
 /home/gluster/apps/glusterfs-2.0.9/lib/glusterfs/2.0.9/xlator/performance/io-threads.so:
 undefined symbol: notify -- neglecting
 [2010-01-11 14:16:02] D [xlator.c:634:xlator_set_type] xlator:
 dlsym(notify) on
 /home/gluster/apps/glusterfs-2.0.9/lib/glusterfs/2.0.9/xlator/performance/read-ahead.so:
 undefined symbol: notify -- neglecting
 [2010-01-11 14:16:02] D [xlator.c:634:xlator_set_type] xlator:
 dlsym(notify) on
 /home/gluster/apps/glusterfs-2.0.9/lib/glusterfs/2.0.9/xlator/performance/io-cache.so:
 undefined symbol: notify -- neglecting
 [2010-01-11 14:16:02] D [client-protocol.c:6130:init] brick1a:
 defaulting frame-timeout to 30mins
 [2010-01-11 14:16:02] D [client-protocol.c:6141:init] brick1a:
 defaulting ping-timeout to 10
 [2010-01-11 14:16:02] D [transport.c:141:transport_load] transport:
 attempt to load file
 /home/gluster/apps/glusterfs-2.0.9/lib/glusterfs/2.0.9/transport/socket.so
 [2010-01-11 14:16:02] D [transport.c:141:transport_load] transport:
 attempt to load file
 /home/gluster/apps/glusterfs-2.0.9/lib/glusterfs/2.0.9/transport/socket.so
 [2010-01-11 14:16:02] D [client-protocol.c:6130:init] brick2a:
 defaulting frame-timeout to 30mins
 [2010-01-11 14:16:02] D [client-protocol.c:6141:init] brick2a:
 defaulting ping-timeout to 10
 [2010-01-11 14:16:02] D [transport.c:141:transport_load] transport:
 attempt to load file
 /home/gluster/apps/glusterfs-2.0.9/lib/glusterfs/2.0.9/transport/socket.so
 [2010-01-11 14:16:02] D [transport.c:141:transport_load] transport:
 attempt to load file
 /home/gluster/apps/glusterfs-2.0.9/lib/glusterfs/2.0.9/transport/socket.so
 [2010-01-11 14:16:02] D [io-threads.c:2280:init] iothreads:
 io-threads: Autoscaling: off, min_threads: 32, max_threads: 32
 [2010-01-11 14:16:02] D [read-ahead.c:824:init] readahead: Using
 conf-page_count = 16
 [2010-01-11 14:16:02] D [client-protocol.c:6472:notify] brick1a: got
 GF_EVENT_PARENT_UP, attempting connect on transport
 [2010-01-11 14:16:02] D [client-protocol.c:6472:notify] brick1a: got
 GF_EVENT_PARENT_UP, attempting connect on transport
 [2010-01-11 14:16:02] D [client-protocol.c:6472:notify] brick2a: got
 GF_EVENT_PARENT_UP, attempting connect on transport
 [2010-01-11 14:16:02] D [client-protocol.c:6472:notify] brick2a: got
 GF_EVENT_PARENT_UP, attempting connect on transport
 [2010-01-11 14:16:02] D [client-protocol.c:6472:notify] brick1a: got
 GF_EVENT_PARENT_UP, attempting connect on transport
 [2010-01-11 14:16:02] D [client-protocol.c:6472:notify] brick1a: got
 GF_EVENT_PARENT_UP, attempting connect on transport
 [2010-01-11 14:16:02] D [client-protocol.c:6472:notify] brick2a: got
 GF_EVENT_PARENT_UP, attempting connect on transport
 [2010-01-11 14:16:02] D [client-protocol.c:6472:notify] brick2a: got
 GF_EVENT_PARENT_UP, attempting connect on transport
 [2010-01-11 14:16:02] D [client-protocol.c:6486:notify] brick2a: got
 GF_EVENT_CHILD_UP
 [2010-01-11 14:16:02] D [client-protocol.c:6486:notify] brick1a: got
 GF_EVENT_CHILD_UP
 [2010-01-11 14:16:02] D [client-protocol.c:6486:notify] brick1a: got
 GF_EVENT_CHILD_UP
 [2010-01-11 14:16:02] D [client-protocol.c:6486:notify] brick2a: got
 GF_EVENT_CHILD_UP
 [2010-01-11 14:16:02] N [client-protocol.c:5733:client_setvolume_cbk]
 brick1a: Connected to 192.168.12.30:6996, attached to remote volume
 'brick1a'.
 [2010-01-11 14:16:02] N [afr.c:2194:notify] replicate: Subvolume
 'brick1a' came back up; going online.
 [2010-01-11 14:16:02] N [client-protocol.c:5733:client_setvolume_cbk]
 brick1a: Connected

Re: [Gluster-users] booster with apache permission denied

2010-01-11 Thread Liam Slusser
Oh, sorry, here are the glusterfsd server logs - this is all it logged
from the Apache startup and wget.

server1:

[2010-01-11 22:09:39] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 192.168.12.72:1014
[2010-01-11 22:09:39] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 192.168.12.72:1015
[2010-01-11 22:09:39] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 192.168.12.72:1010
[2010-01-11 22:09:39] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 192.168.12.72:1011
[2010-01-11 22:09:39] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 192.168.12.72:1006
[2010-01-11 22:09:39] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 192.168.12.72:1007
[2010-01-11 22:09:39] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 192.168.12.72:1002
[2010-01-11 22:09:39] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 192.168.12.72:1005
[2010-01-11 22:09:39] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 192.168.12.72:1001
[2010-01-11 22:09:39] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 192.168.12.72:1004
[2010-01-11 22:09:49] E [posix.c:270:posix_lookup] server1: lstat on
/data/tnsc/test/test.mp3/.htaccess failed: Not a directory

server2:

[2010-01-11 22:14:12] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 192.168.12.72:1017
[2010-01-11 22:14:12] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 192.168.12.72:1016
[2010-01-11 22:14:12] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 192.168.12.72:993
[2010-01-11 22:14:12] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 192.168.12.72:992
[2010-01-11 22:14:12] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 192.168.12.72:987
[2010-01-11 22:14:12] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 192.168.12.72:984
[2010-01-11 22:14:12] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 192.168.12.72:983
[2010-01-11 22:14:12] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 192.168.12.72:981
[2010-01-11 22:14:12] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 192.168.12.72:982
[2010-01-11 22:14:12] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 192.168.12.72:980
[2010-01-11 22:14:14] E [posix.c:270:posix_lookup] server2: lstat on
/data/tnsc/test/test.mp3/.htaccess failed: Not a directory

thanks,
liam

On Mon, Jan 11, 2010 at 9:42 PM, Raghavendra G raghaven...@gluster.com wrote:
 Hi,

 Can you send the glusterfs server logs? The logs you've sent are of booster
 (which is glusterfs client). Looking at the configuration, there is a
 protocol/client in configuration and hence you need a glusterfs server
 running.

 We'll work on issue of md5sums being different.

 regards,
 On Tue, Jan 12, 2010 at 2:20 AM, Liam Slusser lslus...@gmail.com wrote:

 Logs are below.  I also noticed this while trying to debug this
 issue...Notice the md5sum do not match up below?

 On the fuse mounted system:

 [r...@server test]# ls -al test.mp3
 -rw-r--r-- 1 user group 3692251 Aug 27  2007 test.mp3

 [r...@server test]# md5sum test.mp3
 d480d794882c814ae1a2426b79cf8b3e  test.mp3

 Using glusterfs-boost.so:

 [r...@server tmp]#

 LD_PRELOAD=/home/gluster/apps/glusterfs-2.0.9/lib/glusterfs/glusterfs-booster.so
 ls -al /pub/data/tnsc/test/test.mp3
 ls: /pub/data/tnsc/test/test.mp3: Invalid argument
 -rw-r--r-- 1 tcode tcode 3692251 Aug 27  2007 /pub/data/tnsc/test/test.mp3

 [r...@server tmp]#

 LD_PRELOAD=/home/gluster/apps/glusterfs-2.0.9/lib/glusterfs/glusterfs-booster.so
 cp /pub/data/tnsc/test/test.mp3 /tmp/test.mp3

 [r...@server tmp]# md5sum /tmp/test.mp3
 9bff3bb90b6897fc19b6b4658b83f3f8  /tmp/test.mp3

 [r...@server tmp]# ls -al /tmp/test.mp3
 -rw-r--r-- 1 root root 3690496 Jan 11 14:10 /tmp/test.mp3

 Here are the gluster logs from a clean apache start and one request to
 test.mp3 with wget:

 [2010-01-11 14:16:02] D [xlator.c:634:xlator_set_type] xlator:
 dlsym(notify) on

 /home/gluster/apps/glusterfs-2.0.9/lib/glusterfs/2.0.9/xlator/performance/io-threads.so:
 undefined symbol: notify -- neglecting
 [2010-01-11 14:16:02] D [xlator.c:634:xlator_set_type] xlator:
 dlsym(notify) on

 /home/gluster/apps/glusterfs-2.0.9/lib/glusterfs/2.0.9/xlator/performance/read-ahead.so:
 undefined symbol: notify -- neglecting
 [2010-01-11 14:16:02] D [xlator.c:634:xlator_set_type] xlator:
 dlsym(notify) on

 /home/gluster/apps/glusterfs-2.0.9/lib/glusterfs/2.0.9/xlator/performance/io-cache.so:
 undefined symbol: notify -- neglecting
 [2010-01-11 14:16:02] D [client-protocol.c:6130:init] brick1a:
 defaulting frame-timeout to 30mins
 [2010-01-11 14:16:02] D [client-protocol.c:6141:init] brick1a:
 defaulting ping-timeout to 10
 [2010-01-11 14:16:02] D [transport.c:141

Re: [Gluster-users] Bonded Gigabit

2010-01-06 Thread Liam Slusser
Shoot me an email if you would like to see how I configure my Cisco  
switches.


Let us know how the testing works out!

Liam



On Jan 6, 2010, at 3:36 AM, Adrian Revill  
adrian.rev...@shazamteam.com wrote:



Hi Liam,
Yes, that is a good point, I will have to check for that, as I will
be moving from 3com to Cisco 5500G.
So far I only have 2 elderly test servers; using netperf I have
measured 1600Mbit/s, which seems to be CPU limited. I will look at
the double brick idea as it sounds like a good workaround.


Liam Slusser wrote:
I use balance mode0 on my gluster servers - but it doesn't exactly work
as you would expect it to.  We run Cisco 4948g switches (48 port
gigabit) and our gluster servers have two gigabit links bonded
together using mode0.  Balance mode0 does a great job of balancing
outbound traffic, however the Ciscos always route each single INBOUND
tcp connection down a single trunk.  So the only way to really
gain an advantage is to use multiple tcp connections between the many
hosts - or in the case of gluster using multiple bricks per server
striped together.

liam

On Tue, Jan 5, 2010 at 7:20 AM, Adrian Revill
adrian.rev...@shazamteam.com wrote:


Hi

I am looking at which is the best bonding mode for gigabit links for the
servers. I have a choice of using 802.3ad (mode4) or bonding-rr (mode0).
I would prefer to use mode4, but this will only give a single TCP connection
1Gbit of bandwidth, where mode0 will give multiple Gbit of bandwidth to a
single TCP connection.

My question is: if I have 4 mirrored servers, when AFR replicates data
between servers, does it run multiple TCP connections concurrently to copy
the data to all 4 servers at once, or does it do each server in turn?



___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Bonded Gigabit

2010-01-05 Thread Liam Slusser
I use balance mode0 on my gluster servers - but it doesn't exactly work
as you would expect it to.  We run Cisco 4948g switches (48 port
gigabit) and our gluster servers have two gigabit links bonded
together using mode0.  Balance mode0 does a great job of balancing
outbound traffic, however the Ciscos always route each single INBOUND
tcp connection down a single trunk.  So the only way to really
gain an advantage is to use multiple tcp connections between the many
hosts - or in the case of gluster using multiple bricks per server
striped together.
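For reference, the Linux side of a mode0 bond on a CentOS 5-era box looks
roughly like this - interface names and the address are only examples, and
the switch-side etherchannel config has to match whatever you do here:

# /etc/modprobe.conf
alias bond0 bonding
options bond0 mode=balance-rr miimon=100

# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
IPADDR=192.168.12.30      # example address only
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none

# /etc/sysconfig/network-scripts/ifcfg-eth0 (and the same for eth1)
DEVICE=eth0
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none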

liam

On Tue, Jan 5, 2010 at 7:20 AM, Adrian Revill
adrian.rev...@shazamteam.com wrote:
 Hi

 I am looking at which is the best bonding mode for gigabit links for the
 servers. I have a choice of using 802.3ad (mode4) or bonding-rr (mode0).
 I would prefer to use mode4, but this will only give a single TCP connection
 1Gbit of bandwidth, where mode0 will give multiple Gbit of bandwidth to a
 single TCP connection.

 My question is: if I have 4 mirrored servers, when AFR replicates data
 between servers, does it run multiple TCP connections concurrently to copy
 the data to all 4 servers at once, or does it do each server in turn?

 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster-users Digest, Vol 20, Issue 22

2010-01-05 Thread Liam Slusser
Larry & All,

I would much rather rebuild a bad drive with a raid controller than
have to wait for Gluster to do it.  With a large number of files, doing
an ls -aglR can take weeks.  Also you don't NEED enterprise drives with
a raid controller; I use desktop 1.5tb Seagate drives which are happy as a
clam on a 3ware SAS card under a SAS expander.

liam


On Thu, Dec 17, 2009 at 8:17 AM, Larry Bates larry.ba...@vitalesafe.com wrote:
 Phil,

 I think the real question you need to ask has to do with why we are using
 GlusterFS at all and what happens when something fails.  Normally GlusterFS
 is used to provide scalability, redundancy/recovery, and performance.  For
 many applications performance will be the least of the worries so we
 concentrate on scalability and redundancy/recovery.  Scalability can be
 achieved no matter which way you configure your servers.  Using distribute
 translator (DHT) you can unify all the servers into a single virtual storage
 space.  The problem comes when you look at what happens when you have a
 machine/drive failures and need the redundancy/recovery capabilities of
 GlusterFS.  By putting 36Tb of storage on a single server and exposing it as
 a single volume (using either hardware or software RAID), you will have to
 replicate that to a replacement server after a failure.  Replicating 36Tb
 will take a lot of time and CPU cycles.  If you keep things simple (JBOD)
 and use AFR to replicate drives between servers and use DHT to unify
 everything together, now you only have to move 1.5Tb/2Tb when a drive fails.
  You will also note that you get to use 100% of your disk storage this way
 instead of wasting 1 drive per array with RAID5 or two drives with RAID6.
  Normally with RAID5/6 it is also imperative that you have a hot spare per
 array, which means you waste an additional driver per array.  To make
 RAID5/6 work with no single point of failure you have to do something like
 RAID50/60 across two controllers which gets expensive and much more
 difficult to manage and to grow.  Implementing GlusterFS using more modest
 hardware makes all those issues go away.  Just use GlusterFS to provide
 the RAID-like capabilities (via AFR and DHT).

 Personally I doubt that I would set up my storage the way you describe.  I
 probably would (and have) set it up with more smaller servers.  Something
 like three times as many 2U servers with 8x2Tb drives each (or even 6 times
 as many 1U servers with 4x2Tb drives each) and forget the expensive RAID
 SATA controllers, they aren't necessary and are just a single point of
 failure that you can eliminate.  In addition you will enjoy significant
 performance improvements because you have:

 1) Many parallel paths to storage (36x1U or 18x2U vs 6x5U servers).  Gigabit
 Ethernet is fast, but still will limit bandwidth to a single machine.
 2) Write performance on RAID5/6 is never going to be as fast as JBOD.
 3) You should have much more memory caching available (36x8Gb = 256Gb memory
 or 18x8Gb memory = 128Gb vs maybe 6x16Gb = 96Gb)
 4) Management of the storage is done in one place..GlusterFS.  No messy RAID
 controller setups to document/remember.
 5) You can expand in the future in a much more granular and controlled
 fashion.  Add 2 machines (1 for replication) and you get 8Tb (using 2Tb
 drives) of storage.  When you want to replace a machine, just set up new
 one, fail the old one, and let GlusterFS build the new one for you (AFR will
 do the heavy lifting).  CPUs will get faster, hard drives will get faster
 and bigger in the future, so make it easy to upgrade.  A small number of BIG
 machines makes it a lot harder to do upgrades as new hardware becomes
 available.
 6) Machine failures (motherboard, power supply, etc.) will effect much less
 of your storage network.  Having a spare 1U machine around as a hot spare
 doesn't cost much (maybe $1200).  Having a spare 5U monster around does
 (probably close to $6000).

 IMHO 36 x 1U or 18 x 2U servers shouldn't cost any more (and maybe less)
 than the big boxes you are looking to buy.  They are commodity items.  If
 you go the 1U route you don't need anything but a machine, with memory and 4
 hard drives (all server motherboards come with at least 4 SATA ports).  By
 using 2Tb drives, I think you would find that the cost would be actually
 less.  By NOT using hardware RAID you can also NOT use RAID-class hard
 drives which cost about $100 each more than non-RAID hard drives.  Just that
 change alone could save you 6 x 24 = 144 x $100 = $14,400!  JBOD just
 doesn't need RAID-class hard drives because you don't need the sophisticated
 firmware that the RAID-class hard drives provide.  You still will want
 quality hard drives, but failures will have such a low impact that it is
 much less of a problem.

 By using more smaller machines you also eliminate the need for redundant
 power supplies (which would be a requirement in your large boxes because it
 would be a single point of failure on a large 

Re: [Gluster-users] Gluster-users Digest, Vol 20, Issue 22

2010-01-05 Thread Liam Slusser
Yeah, I'm waiting for Gluster to come out with a 3.0.1 release before I
upgrade.  I'll make sure to do my best to compare 3.0.1 with OneFS's
performance/recovery/etc once I upgrade.  I still have two Isilon
clusters which aren't in production anymore in our lab I can play
around with.

And I've been waiting for btrfs for a while now; it can't come soon enough!

thanks,
liam

On Tue, Jan 5, 2010 at 7:48 PM, Harshavardhana har...@gluster.com wrote:
 Hi Liam,

 GlusterFS does checksum-based self-heal since the 3.0 release; I would
 believe your experiences are from 2.0, which has the issue of doing a
 full-file self-heal, which takes a lot of time.  I would suggest an upgrade
 to the 3.0.1 release, which is due the first week of Feb, for your cluster.
 With the new self-heal in the 3.x releases you should get much lower rebuild
 times.  If it's possible to compare the 3.0.1 rebuild times with OneFS from
 Isilon, that should help us improve it too.

 Thanks

 I would suggest waiting for btrfs.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] adding another brick howto

2010-01-04 Thread Liam Slusser
I just wanted to make sure I have the procedure correct when adding
new bricks into my gluster raid10 configuration.

The current configuration is two servers, each exporting 3 large drive
arrays hanging off the back of each server.  In the gluster client
config file I mirror each partition on server 1 with the corresponding
partition on server 2.  Then I use cluster/distribute to stripe it all
together.  Think raid10.
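A rough sketch of what that client config looks like (brick and volume names
here are made up - the real file has the usual protocol/client volumes
defined above these):

volume mirror-01
  type cluster/replicate
  subvolumes server1-brick01 server2-brick01
end-volume

volume mirror-02
  type cluster/replicate
  subvolumes server1-brick02 server2-brick02
end-volume

volume mirror-03
  type cluster/replicate
  subvolumes server1-brick03 server2-brick03
end-volume

volume dist
  type cluster/distribute
  subvolumes mirror-01 mirror-02 mirror-03
end-volume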

I'm running low on disk space and wanted to add two more drive arrays
(one on each server, same size as the original arrays), which should
be easy as far as the gluster configuration file goes.

My questions are:

a)  Once I add the new arrays (bricks if you will), do I need to run an
ls -aglR on a gluster client so the new arrays will get the directory
tree created?  (See the sketch below for what I mean.)
b)  What would happen if I don't do step a - would it just start
creating new files and the directory tree when a file-create operation
happened?
c)  Is gluster smart enough to know that the original arrays are
running low on space and to write all new files to the new servers?
d)  Anything else I should be aware of?

I'm running Centos Linux 5.4 64bit with XFS inode64 file systems with
Gluster 2.0.6 (although I'm upgrading to 2.0.9 sometime this week
depending on how much free time I get).
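For reference, the walk I mean in (a) is just a forced recursive stat from a
client mount - something along these lines, where /mnt/gluster is only an
example path:

ls -aglR /mnt/gluster > /dev/null
# or, equivalently, stat everything from the mount:
find /mnt/gluster -print0 | xargs -0 stat > /dev/null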

thanks,
liam
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] volume sizes

2009-12-30 Thread Liam Slusser
We have a very similar setup.  We have a 6 x 24 bay gluster cluster
with 36TB per node.  We use 3ware raid cards with raid6 over all
24 drives, making ~32TB usable per node.  We have our gluster cluster
set up like raid 10, so 3 nodes striped together and then mirrored to
the other 3 nodes.  Performance is very good and so is the reliability,
which was more important to us than performance.  I thought about
breaking it into smaller pieces but it gets complicated very quickly, so
I went with the simpler-is-better setup.  We also grow about 1tb a
week of data, so I have to add 1-2 nodes a year, which is a huge pain in
the butt since gluster doesn't make it very easy to do (i.e. building
the directory structure on each new node).  Doing an ls -agl on the root
of our cluster takes well over a week - we have around 50+ million
files in there.

The only downside is the rebuild time whenever we lose a drive.  The
3ware controller with such a large array takes about a week to rebuild
from any one drive failure.  Of course, with raid6, we can lose two
drives without any data loss.  Luckily we've never lost two or more
drives within the same week.  However, if we DID for whatever reason
lose the whole array, we can always pull the data off the other mirror
node.  I do very closely watch the SMART output of each drive and
proactively replace any drive which starts to show any signs of
failing or read/write errors.
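Watching SMART through the 3ware card is roughly this, on a cron job - the
controller device and port number below are only examples for a 9000-series
card, adjust for your setup:

# drive on port 0 behind the 3ware controller (example device/port)
smartctl -a -d 3ware,0 /dev/twa0 | egrep -i 'reallocated|pending|raw_read'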

I have a smaller cluster of 4 x 24 bay 36TB per node.  This array
pushes well over 500mbit of traffic almost 24/7 with almost zero
issues.  I've been very happy with how well it performs.  I do notice
that during an array rebuild after a failed drive the IOwait time on
the server is a bit higher, but overall it does very well.

If you would like more information on my setup or what
hardware/software i run please feel free to contact me privately.

thanks,
liam


On Tue, Dec 29, 2009 at 1:54 PM, Anthony Goddard agodd...@mbl.edu wrote:
 First post!
 We're looking at setting up 6x 24 bay storage servers (36TB of JBOD storage 
 per node) and running glusterFS over this cluster.
 We have RAID cards on these boxes and are trying to decide what the best size 
 of each volume should be, for example if we present the OS's (and gluster) 
 with six 36TB volumes, I imagine rebuilding one node would take a long time, 
 and there may be other performance implications of this. On the other hand, 
 if we present gluster / the OS's with 6x 6TB volumes on each node, we might 
 have more trouble in managing a larger number of volumes.

 My gut tells me a lot of small (if you can call 6TB small) volumes will be 
 lower risk and offer faster rebuilds from a failure, though I don't know what 
 the pros and cons of these two approaches might be.

 Any advice would be much appreciated!


 Cheers,
 Anthony
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] What about maximum number of Folder of GlusterFS?

2009-12-30 Thread Liam Slusser
That's ~32,000 folders per DIRECTORY - not in total.  With that said, I
have over 50 million files and directories on my cluster on XFS.
Gluster doesn't really care - it's the underlying filesystem you need
to worry about.

liam

On Tue, Dec 29, 2009 at 5:39 PM, lesonus leso...@gmail.com wrote:
 I know that EXT3 has a max number of folders of about 32,000, and EXT4
 ~ 64,000 folders.
 And I want to know the max number of folders in GlusterFS?
 Thanks in advance!
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] big IO problem with kvm images while ionice doesn't work on shared storage

2009-12-19 Thread Liam Slusser


Might want to try upgrading to the newest version - the next few  
versions are much more stable.


Liam


On Dec 19, 2009, at 6:00 AM, Ran smtp.tes...@gmail.com wrote:


Hi all,
We ran into a big problem while using gluster (2.0.6) to serve kvm
images and other stuff like mail storage in distributed mode with only
1 server (we will add more in the future).

The problem is basically that when, say, 3 KVM VPSs are running
IO-intensive applications - it doesn't happen all the time, but when it
does it eats all of the gluster server's IO, and other servers that need
access to the mail storage (for example) freeze.

Normally on a local disk the solution is ionice, but as I understand it
that only works on local block devices and not on network mounts.

Does anyone have an idea how to overcome this, e.g. limit IO for clients
or client applications like kvm images? This is a real IO problem which
makes the whole storage crawl.

Many thanks ,
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] booster and apache 2.2.14 permission errors

2009-11-05 Thread Liam Slusser
I'm having a strange booster+apache issue.  I am unable to get apache
to download any of the files through booster.  I get a 403 (Forbidden)
on any file.  If I enable directory indexes I can get directory
listings, but still a 403 on any file.  I can view/list files just
fine by using LD_PRELOAD=...glusterfs-booster.so with ls or cat
/pub/data/path/to/myfile.  So it's just apache that I'm having issues
with.  If I mount the file system (to /pub) with fuse and start httpd
without booster it works fine, so I'm pretty sure I have all the
permissions correct.

Ideas?

thanks,
liam

# wget -S http://x.x.x.x/data/test/test.mp3
--2009-11-05 18:35:07--  http://x.x.x.x/data/test/test.mp3
Connecting to x.x.x.x:80... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 403 Forbidden
  Date: Fri, 06 Nov 2009 02:35:07 GMT
  Server: Apache/2.2.14 (Unix)
  Content-Length: 228
  Keep-Alive: timeout=5, max=100
  Connection: Keep-Alive
  Content-Type: text/html; charset=iso-8859-1
2009-11-05 18:35:07 ERROR 403: Forbidden.


# wget -S http://x.x.x.x/data/test/
--2009-11-05 18:36:13--  http://x.x.x.x/data/test/
Connecting to x.x.x.x:80... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  Date: Fri, 06 Nov 2009 02:36:13 GMT
  Server: Apache/2.2.14 (Unix)
  Content-Length: 919
  Keep-Alive: timeout=5, max=100
  Connection: Keep-Alive
  Content-Type: text/html;charset=ISO-8859-1
Length: 919 [text/html]
Saving to: `index.html'

100%[=]
919 --.-K/s   in 0s

2009-11-05 18:36:13 (87.6 MB/s) - `index.html' saved [919/919]

(inside the index.html will be an apache pretty output of the files in
/data/test)

my booster-pub.log output:

[2009-11-05 18:41:50] D [libglusterfsclient.c:2908:glusterfs_open]
libglusterfsclient: path /pub/data/test/test.mp3
[2009-11-05 18:41:50] D
[libglusterfsclient.c:1517:_libgf_vmp_search_entry]
libglusterfsclient: VMP Search: path /pub/data/test/test.mp3, type:
LongestPrefix
[2009-11-05 18:41:50] D
[libglusterfsclient.c:1604:libgf_vmp_search_entry] libglusterfsclient:
VMP Entry found: path :/pub/data/test/test.mp3 vmp: /pub/
[2009-11-05 18:41:50] D
[libglusterfsclient.c:851:libgf_resolve_path_light]
libglusterfsclient: Path: /data/test/test.mp3, Resolved Path:
/data/test/test.mp3
[2009-11-05 18:41:50] D
[libglusterfsclient-dentry.c:389:libgf_client_path_lookup]
libglusterfsclient: resolved path(/data/test/test.mp3) to
1118653312/1118655564
[2009-11-05 18:41:50] D [libglusterfsclient.c:2659:libgf_client_open]
libglusterfsclient: open: path /data/test/test.mp3, status: 0, errno
117


My httpd.conf is very simple:

Alias /data/ /pub/data
<Directory /pub/data/>
Options All
AllowOverride All
Order allow,deny
Allow from all
</Directory>

booster.fstab:

/home/gluster/apps/glusterfs-2.0.7/etc/glusterfs/glusterfs.vol-pub.booster
/pub/ glusterfs
subvolume=cache,logfile=/home/gluster/apps/glusterfs-2.0.7/var/log/glusterfs/booster-pub.log,loglevel=DEBUG,attr_timeout=0

glusterfs.vol-pub.booster:

volume brick1a
  type protocol/client
  option transport-type tcp
  option remote-host x.x.x.x
  option remote-subvolume brick1a
end-volume

volume brick2a
  type protocol/client
  option transport-type tcp
  option remote-host x.x.x.x
  option remote-subvolume brick2a
end-volume

volume replicate
  type cluster/replicate
  subvolumes brick1a brick2a
end-volume

volume iothreads
  type performance/io-threads
  option thread-count 32
  subvolumes replicate
end-volume

volume readahead
  type performance/read-ahead
  option page-count 16   # cache per file  = (page-count x page-size)
  option force-atime-update off
  subvolumes iothreads
end-volume

volume cache
  type performance/io-cache
  option cache-size 512MB
  subvolumes readahead
end-volume
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Problmes with XFS and Gluster 2.0.6

2009-09-19 Thread Liam Slusser
Have you checked that you have free inodes on your XFS partitions?

xfs_db -r -c sb -c p /dev/sda1 | egrep 'ifree|icount'

If you're running low - you'll have to mount your partition with the
inode64 option.  Note that it requires a 64bit box and all your
gluster clients will also need to be 64bit for everything to work.
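The fstab side of that is just one extra mount option - for example (device
and mount point are made up, substitute your own brick):

/dev/sdb1   /data/brick1   xfs   defaults,inode64   0 0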

There is a thread here from a few months back about inode64 and gluster -
dig through the archives, there's lots of good info in it - but the short
version is that it works fine as long as everything is 64bit.

liam

On Fri, Sep 18, 2009 at 5:44 PM, Nathan Stratton nat...@robotics.net wrote:

 Anyone else running into problems with XFS and Gluster? Things run fine for
 a while, but then I get things like:

 ls: reading directory .: Input/output error

 I initially did not think it was a Gluster issue because I saw the errors on
 the raw XFS exported partition. However when I checked I found that the
 problem happened on all 4 nodes. I just don't know how 4 XFS partions on 4
 different boxes could all become corrupted at one time.

 Whatever happens it is bad wrong because xfs can't even fix it:

 http://share.robotics.net/xfs-crash.txt

 

 Nathan Stratton                                CTO, BlinkMind, Inc.
 nathan at robotics.net                         nathan at blinkmind.com
 http://www.robotics.net                        http://www.blinkmind.com
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Problems with folders not being created on newly added disks

2009-09-09 Thread Liam Slusser
You should really upgrade to gluster 2.0.6; there have been many bug
fixes.


ls



On Sep 9, 2009, at 4:36 AM, Roland Rabben rol...@jotta.no wrote:


Hi
I am using GlusterFS 2.0.2 on Ubuntu 9.04 64 bit. I have 4 data-nodes and 3
clients. See my vol files at the end of this email.

After adding more disks to my data-nodes for more capacity and reconfiguring
GlusterFS to include those drives I am experiencing problems.

I am getting "No such file or directory" if I try to copy a new file into an
existing directory. However, if I copy a new file into a new directory
everything works fine.

It seems that if I create the folder structure from the old data-nodes on the
new disks, everything works fine.

So my questions are:

1. Am I doing something wrong in the upgrade process?
2. Do I need to manually create the existing folders on the new hard drives?
3. Self heal does not fix this. Shouldn't it?
4. Is there a tool that will create the folder structure on the new disks
for me?


Client vol file example:
=
# DN-000
volume dn-000-01
   type protocol/client
   option transport-type tcp
   option remote-host dn-000
   option remote-subvolume brick-01
end-volume

volume dn-000-02
   type protocol/client
   option transport-type tcp
   option remote-host dn-000
   option remote-subvolume brick-02
end-volume

volume dn-000-03
   type protocol/client
   option transport-type tcp
   option remote-host dn-000
   option remote-subvolume brick-03
end-volume

volume dn-000-04
   type protocol/client
   option transport-type tcp
   option remote-host dn-000
   option remote-subvolume brick-04
end-volume


volume dn-000-ns
   type protocol/client
   option transport-type tcp
   option remote-host dn-000
   option remote-subvolume brick-ns
end-volume

# DN-001
volume dn-001-01
   type protocol/client
   option transport-type tcp
   option remote-host dn-001
   option remote-subvolume brick-01
end-volume

volume dn-001-02
   type protocol/client
   option transport-type tcp
   option remote-host dn-001
   option remote-subvolume brick-02
end-volume

volume dn-001-03
   type protocol/client
   option transport-type tcp
   option remote-host dn-001
   option remote-subvolume brick-03
end-volume

volume dn-001-04
   type protocol/client
   option transport-type tcp
   option remote-host dn-001
   option remote-subvolume brick-04
end-volume

volume dn-001-ns
   type protocol/client
   option transport-type tcp
   option remote-host dn-001
   option remote-subvolume brick-ns
end-volume

# DN-002
volume dn-002-01
   type protocol/client
   option transport-type tcp
   option remote-host dn-002
   option remote-subvolume brick-01
end-volume

volume dn-002-02
   type protocol/client
   option transport-type tcp
   option remote-host dn-002
   option remote-subvolume brick-02
end-volume

volume dn-002-03
   type protocol/client
   option transport-type tcp
   option remote-host dn-002
   option remote-subvolume brick-03
end-volume

volume dn-002-04
   type protocol/client
   option transport-type tcp
   option remote-host dn-002
   option remote-subvolume brick-04
end-volume

# DN-003
volume dn-003-01
   type protocol/client
   option transport-type tcp
   option remote-host dn-003
   option remote-subvolume brick-01
end-volume

volume dn-003-02
   type protocol/client
   option transport-type tcp
   option remote-host dn-003
   option remote-subvolume brick-02
end-volume

volume dn-003-03
   type protocol/client
   option transport-type tcp
   option remote-host dn-003
   option remote-subvolume brick-03
end-volume

volume dn-003-04
   type protocol/client
   option transport-type tcp
   option remote-host dn-003
   option remote-subvolume brick-04
end-volume

# Replicate data between the servers
# Use pairs, but swtich the order to distribute read load
volume repl-000-001-01
   type cluster/replicate
   subvolumes dn-000-01 dn-001-01
end-volume

volume repl-000-001-02
   type cluster/replicate
   subvolumes dn-001-02 dn-000-02
end-volume

volume repl-000-001-03
   type cluster/replicate
   subvolumes dn-000-03 dn-001-03
end-volume

volume repl-000-001-04
   type cluster/replicate
   subvolumes dn-001-04 dn-000-04
end-volume


volume repl-002-003-01
   type cluster/replicate
   subvolumes dn-002-01 dn-003-01
end-volume

volume repl-002-003-02
   type cluster/replicate
   subvolumes dn-003-02 dn-002-02
end-volume

volume repl-002-003-03
   type cluster/replicate
   subvolumes dn-002-03 dn-003-03
end-volume

volume repl-002-003-04
   type cluster/replicate
   subvolumes dn-003-04 dn-002-04
end-volume


# Also replicate the namespace
volume repl-ns
   type 

Re: [Gluster-users] double traffic usage since upgrade?

2009-09-08 Thread Liam Slusser
Any other thoughts on why I'm seeing double the inbound traffic?
We've had a large increase in site traffic the last few weeks and my
outbound traffic has increased to almost 400mbit/sec, which has
translated to 800mbit of backend gluster traffic.  I'm basically at
the limit of gigabit ethernet unless I do bonding.

Ideas on how to fix this?

thanks,
liam


On Mon, Aug 17, 2009 at 3:28 PM, Liam Slusser lslus...@gmail.com wrote:
 On Mon, Aug 17, 2009 at 7:42 AM, Mark Mielkem...@mark.mielke.cc wrote:
 On 08/17/2009 08:06 AM, Shehjar Tikoo wrote:

 For a start, we've aimed at getting apache and unfs3 to work with booster.
 The functional support for both in booster is complete in
 2.0.6 release.

 For a list of system calls supported by booster, please see:
 http://www.gluster.org/docs/index.php/BoosterConfiguration

 There can be applications which need un-boosted syscalls also to be
 usable over GlusterFS. For such a scenario we have two ways booster
 can be used. Both approaches are described at the page linked above
 but in short, you're right in thinking that when the un-supported
 syscalls are also needed to go over FUSE, we are, as you said, leaking
 or redirecting calls over the FUSE mount point.


 Hi Shehjar:

 That's fine, I think, as long as it is recognized that trapping system call
 open() as booster is implemented today probably does not trap fopen() on
 Linux. If apache and unfs3 always call open() directly, and you are trapping
 this, then your purpose is being served.

 I was kind of hoping you had found a way around --disable-hidden-plt, so I
 could steal the idea from you. Too bad. :-)

 Cheers,
 mark

 --
 Mark Mielkem...@mielke.cc


 Just a FYI - I am not using booster at all on our feed boxes, this is
 just straight fuse and the glusterfs process [with the box we're
 seeing the traffic doubling on].

 liam

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] glusterfsd initscript default sequence

2009-09-08 Thread Liam Slusser
The init script is also wrong if you used a non-default install path.
It always points to /usr/sbin/glusterfsd and not your --prefix
specified path.
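The quick workaround, assuming a prefix of something like
/home/gluster/apps/glusterfs-2.0.9 and an init script at
/etc/init.d/glusterfsd (both just example paths - adjust to your install),
is to point the script at the right binary:

# example paths only - substitute your own --prefix and init script location
sed -i 's|/usr/sbin/glusterfsd|/home/gluster/apps/glusterfs-2.0.9/sbin/glusterfsd|g' /etc/init.d/glusterfsd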

liam


On Sun, Sep 6, 2009 at 9:18 PM, Jeff Evans je...@tricab.com wrote:

 In the case that the node is both a server and a client, as I
 wish to  use it (3-node cluster, where each is both a client and
 server in  cluster/replicate configuration), I found that using
 /etc/fstab to mount  and the default glusterfsd initscript of S90
 causes the mount to be made  before glusterfsd is up.

 My scenario exactly.

 In a test I
 just ran where I restarted all  three nodes at the same time, for
 the server that came up first, it  seems the client decided
 nothing was up.

 Yes, and this causes anything that depends upon the glusterfs mount to
 wait at startup for the FS to become available.

I too think S90 is off,
 although I'm not sure where it should go, or how to make it start
  glusterfsd before it gets to /etc/fstab mounting?

 I think the only way to ensure glusterfsd comes up before fstab
 mounting (mount -a) is by using the noauto option and then mounting it
 later in rc.local or whenever you are ready.

 In my case, I want glusterfs available ASAP and using S50 was adequate
 as this is before anything like smb/nfs/httpd starts looking for the
 mount.

 Thanks, Jeff.


 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] new user help

2009-08-23 Thread Liam Slusser
answer inline...

On Sun, Aug 23, 2009 at 10:50 AM, Mag Gammagaw...@gmail.com wrote:
 I am trying to setup gluster 2.0 and I am a new user.

Welcome.


 Basically, I been trying to follow the tutorials but I am having no luck.

 My setup is 2 servers and 10 clients for now.

 Couple of questions:

 On the client, do I have to have FUSE module?

If you want to mount the filesystem, you need to use FUSE.  You can
also NFS mount it if you set up an unfs server on another client and
share the gluster mount.
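The mount itself, with 2.0.x, looks something like this - the volfile path
and mount point below are just examples:

glusterfs --volfile=/etc/glusterfs/glusterfs-client.vol /mnt/gluster

# or, if the mount.glusterfs helper is installed:
mount -t glusterfs /etc/glusterfs/glusterfs-client.vol /mnt/gluster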

 Do I have to run all this as root?

yes to mount the filesystem

 How can I check what clients are mounted on the server side?

Not really an easy way to do it, however I use... (in Linux)

netstat -pan | grep EST | grep gluster_listen_port

 How can I check whether I can talk to the server via gluster tools
 (not ping :-)?

Try telnetting to the gluster_listen_port - if it answers, it's up.
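For example, assuming the stock 2.0.x listen port of 6996 and a server
called server1 (substitute your own host and port):

telnet server1 6996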

liam
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Interesting experiment

2009-08-18 Thread Liam Slusser
On Tue, Aug 18, 2009 at 3:05 AM, Hiren Joshij...@moonfruit.com wrote:
 Hi,

 Ok, the basic setup is 6 bricks per server, 2 servers. Mirror the six
 bricks and DHT them.

 I'm running three tests, dd 1G of zeros to the gluster mount, dd 1000
 100k files and dd 1000 1M files.

 With 3M write-behind I get:
 0m35.460s for 1G file
 0m52.427s for 100k files
 1m37.209s for 1M files

 Then I added a 400M external journal to all the bricks, the twist being
 the journals were made on a ram drive

 Running the same tests:
 0m33.614s for 1G file
 0m52.851s for 100k files
 1m31.693s for 1M files


 So why is it that adding an external journal (in the ram!) seems to make
 no difference at all?

I would imagine that most of your bottleneck is the network and
not the disks.  Modern raid disk storage systems are much quicker than
gigabit ethernet.
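An easy sanity check is to write the same amount of data straight to the
brick's local filesystem and then through the gluster mount and compare the
two - the paths here are only examples:

# raw write to the brick's local filesystem (example path)
dd if=/dev/zero of=/data/brick1/ddtest bs=1M count=1024 oflag=direct

# same write through the gluster client mount (example path)
dd if=/dev/zero of=/mnt/gluster/ddtest bs=1M count=1024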

liam
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] double traffic usage since upgrade?

2009-08-17 Thread Liam Slusser
On Mon, Aug 17, 2009 at 7:42 AM, Mark Mielkem...@mark.mielke.cc wrote:
 On 08/17/2009 08:06 AM, Shehjar Tikoo wrote:

 For a start, we've aimed at getting apache and unfs3 to work with booster.
 The functional support for both in booster is complete in
 2.0.6 release.

 For a list of system calls supported by booster, please see:
 http://www.gluster.org/docs/index.php/BoosterConfiguration

 There can be applications which need un-boosted syscalls also to be
 usable over GlusterFS. For such a scenario we have two ways booster
 can be used. Both approaches are described at the page linked above
 but in short, you're right in thinking that when the un-supported
 syscalls are also needed to go over FUSE, we are, as you said, leaking
 or redirecting calls over the FUSE mount point.


 Hi Shehjar:

 That's fine, I think, as long as it is recognized that trapping system call
 open() as booster is implemented today probably does not trap fopen() on
 Linux. If apache and unfs3 always call open() directly, and you are trapping
 this, then your purpose is being served.

 I was kind of hoping you had found a way around --disable-hidden-plt, so I
 could steal the idea from you. Too bad. :-)

 Cheers,
 mark

 --
 Mark Mielkem...@mielke.cc


Just a FYI - I am not using booster at all on our feed boxes, this is
just straight fuse and the glusterfs process [with the box we're
seeing the traffic doubling on].

liam
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] double traffic usage since upgrade?

2009-08-14 Thread Liam Slusser
I've been running 2.0.3 with two backend bricks and a frontend client of
mod_gluster/apache 2.2.11+worker for a few weeks now without much issue.
 Last night I upgraded to 2.0.6 only to find out that mod_gluster has been
removed and the recommendation is to use the booster library - which is fine, but I
didn't have time to test it last night, so I just mounted the whole filesystem
with a fuse mount and figured I'd test the booster config later and then
swap.  I did try running the 2.0.3 mod_gluster module with the 2.0.6 bricks
but apache kept segfaulting (every 10 seconds) and then would spawn another
process which would reconnect and keep going.  I figured it was dropping a
client request every few seconds, which is why I went with the fuse mount
until I could test the booster library.

Well, before with mod_gluster, we would be pushing around 200mbit of web
traffic and it would evenly distribute that 200mbit between our two bricks -
so server1 would be pushing 100mbit and server2 would be pushing another
100mbit.  Both inbound from the backend bricks and outbound from
apache were basically identical.  Except of course if one of the backend
glusterd processes died for whatever reason, the other remaining brick would
take the whole load and its traffic would double, as you would expect.
 Perfect, all was happy.

Now using gluster 2.0.6 and fuse, both server bricks are pushing the full
200mbit of traffic - so I basically have 400mbit of incoming traffic from
the gluster bricks but the same 200mbit of web traffic.  I can deal, but I
only have a shared gigabit link between my client server and backend bricks
and I'm already eating up basically 50% of that pipe.  It is also putting a
much larger load on both bricks since I have basically doubled the disk IO
time and traffic.  Is this a feature?  A bug?

thanks,
liam
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Performance

2009-08-13 Thread Liam Slusser
XFS has been around since 1994 - originally written by SGI, it is one of the
oldest journaling filesystems.  It has been in the Linux source tree since
2.4 and is very stable.  It supports a max volume size of 16 exabytes where
ext3/4 runs out at around 8TB, I believe.  I've never had one of my XFS
filesystems need recovering, and I use it on a bunch of larger arrays that
are too large for ext3.
Just make sure you're using 64bit Linux and mount the filesystem with the
inode64 option so you don't run out of inodes.
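For example (device and mount point are made up - the point is just the
inode64 option and the 64bit check):

uname -m                                   # should say x86_64
mkfs.xfs /dev/sdb1                         # example device
mount -o inode64,noatime /dev/sdb1 /data/brick1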

liam

On Thu, Aug 13, 2009 at 1:31 AM, Hiren Joshi j...@moonfruit.com wrote:

  What are the advantages of XFS over ext3 (which I'm currently using)? My
 fear with XFS when selecting a filesystem was that it's not as active or as
 well supported as ext3 and if things go wrong, how easy would it be to
 recover?

 I have 6 x 1TB disks in a hardware raid 6 with battery backup and UPS, it's
 now just the performance I need to get sorted...

  --
 *From:* Liam Slusser [mailto:lslus...@gmail.com]
 *Sent:* 12 August 2009 20:35
 *To:* Mark Mielke
 *Cc:* Hiren Joshi; gluster-users@gluster.org
 *Subject:* Re: [Gluster-users] Performance


 I had a similar situation.  My larger gluster cluster has two nodes but
 each node has 72 1.5tb hard drives.  I ended up creating three 30TB 24 drive
 raid6 arrays, formated with xfs and 64bit-inodes, and then exporting three
 bricks with gluster.  I would recommend using a hardware raid controller
 with battery backup power, UPS power, and a journaled filesystem and i think
 you'll be fine.

 I'm exporting the three bricks on each of my two nodes, the clients are
 using replication to replicate each of the three bricks on each server and
 then using distribute to tie it all together.

 liam


 On Wed, Aug 12, 2009 at 10:51 AM, Mark Mielke m...@mark.mielke.cc wrote:

 On 08/12/2009 01:24 PM, Hiren Joshi wrote:

 36 partitions on each server - the word partition is ambiguous. Are
 they 36 separate drives? Or multiple partitions on the same drive. If
 multiple partitions on the same drive, this would be a bad
 idea, as it
 would require the disk head to move back and forth between the
 partitions, significantly increasing the latency, and therefore
 significantly reducing the performance. If each partition is
 on its own
 drive, you still won't see benefit unless you have many clients
 concurrently changing many different files. In your above case, it's
 touching a single file in sequence, and having a cluster is
 costing you
 rather than benefitting you.



  We went with 36 partitions (on a single raid 6 drive) in case we got file
  system corruption; it would take less time to fsck a 100G partition than
  a 3.6TB one. Would a 3.6TB single disk be better?


 Putting 3.6 TB on a single disk sounds like a lot of eggs in one basket.
 :-)

 If you are worried about fsck, I would definitely do as the other poster
 suggested and use a journalled file system. This nearly eliminates the fsck
 time for most situations. This would be whether using 100G partitions or
 using 3.6T partitions. In fact, there is very few reasons not to use a
 journalled file system these days.

 As for how to deal with data on this partition - the file system is going
 to have a better chance of placing files close to each other, than setting
 up 36 partitions and having Gluster scatter the files across all of them
 based on a hash. Personally, I would choose 4 x 1 Tbyte drives over 1 x 3.6
 Tbyte drive, as this nearly quadruples my bandwidth and for highly
 concurrent loads, nearly divides by four the average latency to access
 files.

 But, if you already have the 3.6 Tbyte drive, I think the only
 performance-friendly use would be to partition it based upon access
 requirements, rather than a hash (random). That is, files that are accessed
 frequently should be clustered together at the front of a disk, files
 accessed less frequently could be in the middle, and files accessed
 infrequently could be at the end. This would be a three partition disk.
 Gluster does not have a file system that does this automatically (that I can
 tell), so it would probably require a software solution on your end. For
 example, I believe dovecot (IMAP server) allows an alternative storage
 location to be defined, so that infrequently read files can be moved to
 another disk, and it knows to check the primary storage first, and fall back
 to the alternative storage after.

 It you can't break up your storage by access patterns, then I think a 3.6
 Tbyte file system might still be the next best option - it's still better
 than 36 partitions. But, make sure you have a good file system on it, that
 scales well to this size.


 Cheers,
 mark

 --
 Mark Mielkem...@mielke.cc


 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users



___
Gluster-users mailing

Re: [Gluster-users] Performance

2009-08-12 Thread Liam Slusser
On Wed, Aug 12, 2009 at 10:24 AM, Hiren Joshi j...@moonfruit.com wrote:



 We went with 36 partitions (on a single raid 6 drive) in case we got file
 system corruption; it would take less time to fsck a 100G partition than
 a 3.6TB one. Would a 3.6TB single disk be better?


Have you looked at using XFS for a filesystem?  It's a journaling filesystem
and should require almost no rebuild/check after a crash.
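
A sketch of what that looks like (device and mount point are placeholders):

  # one big XFS filesystem instead of 36 small ones
  mkfs.xfs /dev/sdb1
  mount -t xfs /dev/sdb1 /export/brick
  # the journal is replayed automatically at mount time after a crash;
  # a full offline check/repair, if ever needed, is xfs_repair on the
  # unmounted device
  xfs_repair /dev/sdb1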

liam
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Fuse problem

2009-08-11 Thread Liam Slusser
Looks like fuse isn't loaded.  Have you installed fuse?  The debug log below
has the hint: try 'modprobe fuse' as root.
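
Something along these lines, run as root (the mount command is just yours
from below):

  # is the module loaded?
  lsmod | grep fuse
  # load it now
  modprobe fuse
  # then retry the mount
  glusterfs --debug --volfile=/root/gluster/webspace2.vol /home/webspace_glust/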
liam

On Tue, Aug 11, 2009 at 7:27 AM, Hiren Joshi j...@moonfruit.com wrote:

 Hello all,

 I'm running a 64bit Centos5 setup and am trying to mount a gluster
 filesystem (which is exported out of the same box).

 glusterfs --debug --volfile=/root/gluster/webspace2.vol
 /home/webspace_glust/

 Gives me:
 snip
 [2009-08-11 16:26:37] D [client-protocol.c:5963:init] glust1b_36:
 defaulting ping-timeout to 10
 [2009-08-11 16:26:37] D [transport.c:141:transport_load] transport:
 attempt to load file /usr/lib64/glusterfs/2.0.4/transport/socket.so
 [2009-08-11 16:26:37] D [transport.c:141:transport_load] transport:
 attempt to load file /usr/lib64/glusterfs/2.0.4/transport/socket.so
 fuse: device not found, try 'modprobe fuse' first
 [2009-08-11 16:26:37] D [fuse-bridge.c:2740:init] glusterfs-fuse:
 fuse_mount() failed with error No such device on mount point
 /home/webspace_glust/
 [2009-08-11 16:26:37] E [xlator.c:736:xlator_init_rec] xlator:
 Initialization of volume 'fuse' failed, review your volfile again
 [2009-08-11 16:26:37] E [glusterfsd.c:513:_xlator_graph_init] glusterfs:
 initializing translator failed
 [2009-08-11 16:26:37] E [glusterfsd.c:1217:main] glusterfs: translator
 initialization failed.  exiting


 Any thoughts?

 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] how to kill the glusterfsd process

2009-08-03 Thread Liam Slusser
If all else fails, kill -9?
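
Roughly (note that a process stuck in uninterruptible disk sleep - "D" state
in ps - won't die even with -9 until whatever I/O it is waiting on completes):

  # find the server processes
  ps aux | grep '[g]lusterfsd'
  # ask them to exit first
  killall glusterfsd
  sleep 5
  # force it if any are still around
  killall -9 glusterfsd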
ls

On Mon, Aug 3, 2009 at 7:03 AM, Wei Dong wdong@gmail.com wrote:

 Hi All,

 I'm trying to restart the glusterfsd service on my storage nodes and find
 that I'm simply unable to kill the process on some of the nodes.  Any
 suggestion?

 Thanks a lot,

 - Wei Dong
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster 2.0.3 + Apache on CentOS5 performance issue

2009-07-30 Thread Liam Slusser
You might want to wait until 2.0.5 as there are a ton of bug fixes to
booster in that release.


Either way please let us know how it goes.

ls

On Jul 30, 2009, at 12:39 AM, Somsak Sriprayoonsakul  
soms...@gmail.com wrote:



Thank you very much for your reply.

At the time we used 2.0.3, and yes, we used the stock Apache from CentOS.
I will try 2.0.4 very soon to see if it works.

For Booster, it does not seem to be working correctly for me. Booster
complains with a lot of errors on a plain 'ls' command (but gives the
correct output). Also, with booster, the Apache process refuses to start.
I will try 2.0.4 to see if it improves. If not, I will attach the error
log next time.



2009/7/30 Raghavendra G raghaven...@gluster.com
Hi Somsak,

Sorry for the delayed reply. Below you've mentioned that you have
problems with apache and booster. Going forward, Apache over booster
will be the preferred approach. Can you tell us what version of
glusterfs you are using? And as I understand it, you are using
apache 2.2, am I correct?


regards,
- Original Message -
From: Liam Slusser lslus...@gmail.com
To: Somsak Sriprayoonsakul soms...@gmail.com
Cc: gluster-users@gluster.org
Sent: Saturday, July 25, 2009 3:46:14 AM GMT +04:00 Abu Dhabi / Muscat
Subject: Re: [Gluster-users] Gluster 2.0.3 + Apache on CentOS5  
performance  issue


I haven't tried an apples-to-apples comparison of Apache+mod_gluster vs
Apache+fuse+gluster, however I do run both setups.  I load tested both
setups to verify they could handle 4x our normal daily load and left it
at that.  I didn't actually compare the two (although that might be cool
to do someday).

I really like the idea of Apache+mod_gluster as I don't have to deal with
the whole fuse and mounting-the-filesystem business.  It always scares me
having a public-facing webserver with your whole backend fileshare mounted
locally.  It's very slick for serving content such as media files.  We
serve audio content to our CDN with a pair of Apache/mod_gluster servers -
pushing 200-300mbit on average daily and everything works very well.

We run an apache+fuse+gluster setup because we need to run some mod_perl
before serving the actual content.  However, performance is still very
good.  We do around 50-100 requests (all jpeg images) per second off of a
fuse mount and everything works great.  We also have a java
tomcat+fuse+gluster service which does image manipulation on the fly off
of a gluster mount.

We have two backend gluster servers using replication which serve all this
content.

If you would like more information on our setup I'd be happy to share it
offline.  Just email me privately.

thanks,
liam

On Fri, Jul 24, 2009 at 8:08 AM, Somsak Sriprayoonsakul
soms...@gmail.com wrote:

 Oh thank you, thought noone will reply me :)

 Have you tried Apache + Fuse over GlusterFS? How is the performance?

 Also, anyone in this mailing-list have tried Apache with booster?  
I tried

 it but Apache refuse to start (just hang and freeze).

 2009/7/23 Liam Slusser lslus...@gmail.com


 We use mod_gluster and Apache
 2.2 with good results.  We also ran into the same issue as you  
that we ran out of memory past 150 threads even on a 8gig machine.   
We got around this by compiling Apache using mpm-worker
 (threads) instead of prefork - it uses 1/4 as much ram with the  
same number
 of connections (150-200) and everything has been running  
smoothly.  I cannot

 see any performance difference except it using way less memory.
 liam


 On Sun, Jul 12, 2009 at 5:11 AM, Somsak Sriprayoonsakul 
 soms...@gmail.com wrote:

 Hello,

 We have been evaluating the choice for the new platform for a  
webboard

 system.
 The webboard is PHP scripts that generate/modify HTML page when  
user
 posting/add comment to the page, resulting topic is actually  
stored as a
 HTML file with all related file (file attach to the topic,  
etc.. )stored in
 its own directory for each topic. In general, the web site  
mostly serve a
 lot of small static files using Apache while using PHP to do  
other dynamic
 contents. This system has been working very well in the past,  
with the
 increasing page view rate, it is very likely that we will need  
some kind of

 Cluster file system as backend very soon.

 We have set up a test system using Grinder as stress test tool.  
The test
 system is 11 machines of Intel Dual Core x86_64 CentOS5 with  
stock Apache
 (prefork, since the goal is to use this with PHP), linked  
together with
 Gigabit Ethernet. We try to compare the performance of either  
using single
 NFS server in sync mode against using 4 Gluster nodes  
(distribute of 2
 replicated nodes) through Fuse. However, the transaction per  
second (TPS)

 result is not good.

 NFS (single server, sync mode)
  - 100 thread of client - Peak TPS = 1716.67, Avg. TPS = 1066,  
mean

 response time = 61.63 ms
  - 200 threads - Peak TPS = 2790, Avg. TPS = 1716, mean rt =  
87.33 ms

  - 400 threads - Peak TPS = 3810, Avg

Re: [Gluster-users] Gluster 2.0.3 + Apache on CentOS5 performance issue

2009-07-30 Thread Liam Slusser
It's not released yet, but it is in QA.  You can download it here:
http://ftp.gluster.com/pub/gluster/glusterfs/qa-releases/glusterfs-2.0.5.tar.gz
or grab the newest git, which has all the changes in it.
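
Something like this (building the tarball is the usual configure/make,
assuming you have the build tools installed):

  wget http://ftp.gluster.com/pub/gluster/glusterfs/qa-releases/glusterfs-2.0.5.tar.gz
  tar xzf glusterfs-2.0.5.tar.gz && cd glusterfs-2.0.5
  ./configure --prefix=/usr && make && make install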
liam

On Thu, Jul 30, 2009 at 8:45 PM, Somsak Sriprayoonsakul
soms...@gmail.com wrote:

 Could you let me know when this will be (approximately)? I can wait until
 2.0.5 and test it out again.

 2009/7/30 Liam Slusser lslus...@gmail.com

 You might want to wait until 2.0.5 as there is a ton of bug fixes to
 booster in that release.

 Either way please let us know how it goes.

 ls

 On Jul 30, 2009, at 12:39 AM, Somsak Sriprayoonsakul soms...@gmail.com
 wrote:

 Thank you very much for you reply

 At the time we used 2.0.3, and yes we used stock Apache from CentOS. I
 will try 2.0.4 very soon to see if it's work.

 For Booster, it seems not working correctly for me. Booster complains a
 lots of error with plain 'ls' command (but giving the correct output). Also,
 with booster, Apache process refuse to start. I will try 2.0.4 to see if it
 improves. If not, I will attach error log next time.


 2009/7/30 Raghavendra G  raghaven...@gluster.com

 Hi Somsak,

 Sorry for the delayed reply. Below you've mentioned that you've problems
 with apache and booster. Going forward, Apache over booster will be the
 preferred approach. Can you tell us what version of glusterfs you are using?
 And as I can understand you are using apache 2.2, am I correct?

 regards,
 - Original Message -
 From: Liam Slusser  lslus...@gmail.com
 To: Somsak Sriprayoonsakul  soms...@gmail.com
 Cc: gluster-users@gluster.org
 Sent: Saturday, July 25, 2009 3:46:14 AM GMT +04:00 Abu Dhabi / Muscat
 Subject: Re: [Gluster-users] Gluster 2.0.3 + Apache on CentOS5
 performance  issue

 I haven't tried an apples-to-apples comparison of Apache+mod_gluster vs
 Apache+fuse+gluster, however I do run both setups.  I load tested both
 setups to verify they could handle 4x our normal daily load and left it at
 that.  I didn't actually compare the two (although that might be cool to do
 someday).
 I really like the idea of Apache+mod_gluster as I don't have to deal with
 the whole fuse and mounting the filesystem.  It always scares me having a
 public facing webserver with your whole backend fileshare mounted
 locally.
  Its very slick for serving content such as media files.  We serve audio
 content to our CDN with a pair of Apache/mod_gluster servers - pushing
 200-300mbit on average daily and everything works very well.

 We run an apache+fuse+gluster setup because we need to run some mod_perl
 before serving the actual content.  However performance is still very
 good.
  We do around 50-100 requests (all jpeg images) per second off of a fuse
 mount and everything works great.  We also have a java
 tomcat+fuse+gluster
 service which does image manipulation on the fly off of a gluster mount.

 We have two backend gluster servers using replication which serve all
 this
 content.

 If you would like more information on our setup id be happy to share
 offline.  Just email me privately.

 thanks,
 liam

 On Fri, Jul 24, 2009 at 8:08 AM, Somsak Sriprayoonsakul
  soms...@gmail.com wrote:

  Oh thank you, thought noone will reply me :)
 
  Have you tried Apache + Fuse over GlusterFS? How is the performance?
 
  Also, anyone in this mailing-list have tried Apache with booster? I
 tried
  it but Apache refuse to start (just hang and freeze).
 
  2009/7/23 Liam Slusser  lslus...@gmail.com
 
 
  We use mod_gluster and Apache
  2.2 with good results.  We also ran into the same issue as you that we
 ran out of memory past 150 threads even on a 8gig machine.  We got around
 this by compiling Apache using mpm-worker
  (threads) instead of prefork - it uses 1/4 as much ram with the same
 number
  of connections (150-200) and everything has been running smoothly.  I
 cannot
  see any performance difference except it using way less memory.
  liam
 
 
  On Sun, Jul 12, 2009 at 5:11 AM, Somsak Sriprayoonsakul 
  soms...@gmail.com wrote:
 
  Hello,
 
  We have been evaluating the choice for the new platform for a
 webboard
  system.
  The webboard is PHP scripts that generate/modify HTML page when user
  posting/add comment to the page, resulting topic is actually stored
 as a
  HTML file with all related file (file attach to the topic, etc..
 )stored in
  its own directory for each topic. In general, the web site mostly
 serve a
  lot of small static files using Apache while using PHP to do other
 dynamic
  contents. This system has been working very well in the past, with
 the
  increasing page view rate, it is very likely that we will need some
 kind of
  Cluster file system as backend very soon.
 
  We have set up a test system using Grinder as stress test tool. The
 test
  system is 11 machines of Intel Dual Core x86_64 CentOS5

Re: [Gluster-users] any configuration guidelines?

2009-07-29 Thread Liam Slusser
On Wed, Jul 29, 2009 at 1:22 PM, Nathan Stratton nat...@robotics.net wrote:

 On Tue, 28 Jul 2009, Wei Dong wrote:

  Hi All,

 We've been using GlusterFS 2.0.1 on our lab cluster to host a large number
 of small images for distributed processing with Hadoop and it has been
 working fine without human intervention for a couple of months.  Thanks for
 the wonderful project -- it's the only freely available cluster filesystem
 that fits our needs.

 What keeps bothering me is the extremely high flexibility of GlusterFS.
 There are simply so many ways to achieve the same goal that I don't know
 which is the best.  So I'm writing to ask if there are some general
 configuration guidelines to improve both data safety and performance.


 Totally understand - I am facing many of the same issues. I am not sure if I
 should be doing replicate / distribute in the frontend client config or the
 backend server configs.


 -Nathan


The preferred way is to do it on the client and not the backend server.  There
is some documentation somewhere about it - I'll see if I can dig it up.

ls
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster 2.0.3 + Apache on CentOS5 performance issue

2009-07-24 Thread Liam Slusser
I haven't tried an apples-to-apples comparison of Apache+mod_gluster vs
Apache+fuse+gluster, however I do run both setups.  I load tested both setups
to verify they could handle 4x our normal daily load and left it at that.
I didn't actually compare the two (although that might be cool to do
someday).
I really like the idea of Apache+mod_gluster as I don't have to deal with
the whole fuse and mounting the filesystem.  It always scares me having a
public facing webserver with your whole backend fileshare mounted locally.
 It's very slick for serving content such as media files.  We serve audio
content to our CDN with a pair of Apache/mod_gluster servers - pushing
200-300mbit on average daily and everything works very well.

We run an apache+fuse+gluster setup because we need to run some mod_perl
before serving the actual content.  However performance is still very good.
 We do around 50-100 requests (all jpeg images) per second off of a fuse
mount and everything works great.  We also have a java tomcat+fuse+gluster
service which does image manipulation on the fly off of a gluster mount.

We have two backend gluster servers using replication which serve all this
content.

If you would like more information on our setup I'd be happy to share
offline.  Just email me privately.

thanks,
liam

On Fri, Jul 24, 2009 at 8:08 AM, Somsak Sriprayoonsakul
soms...@gmail.com wrote:

 Oh thank you, I thought no one would reply to me :)

 Have you tried Apache + Fuse over GlusterFS? How is the performance?

 Also, has anyone on this mailing list tried Apache with booster? I tried
 it but Apache refuses to start (it just hangs and freezes).

 2009/7/23 Liam Slusser lslus...@gmail.com


 We use mod_gluster and Apache
 2.2 with good results.  We also ran into the same issue as you that we ran 
 out of memory past 150 threads even on a 8gig machine.  We got around this 
 by compiling Apache using mpm-worker
 (threads) instead of prefork - it uses 1/4 as much ram with the same number
 of connections (150-200) and everything has been running smoothly.  I cannot
 see any performance difference except it using way less memory.
 liam


 On Sun, Jul 12, 2009 at 5:11 AM, Somsak Sriprayoonsakul 
 soms...@gmail.com wrote:

 Hello,

 We have been evaluating the choice for the new platform for a webboard
 system.
 The webboard is PHP scripts that generate/modify HTML page when user
 posting/add comment to the page, resulting topic is actually stored as a
 HTML file with all related file (file attach to the topic, etc.. )stored in
 its own directory for each topic. In general, the web site mostly serve a
 lot of small static files using Apache while using PHP to do other dynamic
 contents. This system has been working very well in the past, with the
 increasing page view rate, it is very likely that we will need some kind of
 Cluster file system as backend very soon.

 We have set up a test system using Grinder as stress test tool. The test
 system is 11 machines of Intel Dual Core x86_64 CentOS5 with stock Apache
 (prefork, since the goal is to use this with PHP), linked together with
 Gigabit Ethernet. We try to compare the performance of either using single
 NFS server in sync mode against using 4 Gluster nodes (distribute of 2
 replicated nodes) through Fuse. However, the transaction per second (TPS)
 result is not good.

 NFS (single server, sync mode)
  - 100 thread of client - Peak TPS = 1716.67, Avg. TPS = 1066, mean
 response time = 61.63 ms
  - 200 threads - Peak TPS = 2790, Avg. TPS = 1716, mean rt = 87.33 ms
  - 400 threads - Peak TPS = 3810, Avg. TPS = 1800, mean rt = 165ms
  - 600 threads - Peak TPS = 4506.67, Avg. TPS = 1676.67, mean rt =
 287.33ms

 4 nodes Gluster (2 distribute of replicated 2 node)
 - 100 thread - peak TPS = 1293.33, Avg. TPS = 430, mean rt = 207.33ms
 - 200 threads - Peak TPS = 974.67, Avg. TPS = 245.33, mean rt = 672.67ms
 - 300 threads - Peak TPS = 861.33, Avg. TPS = 210, mean rt = 931.33
 (no 400-600 threads since we run out of client machine, sorry).

 gfsd is configured to use 32 thread of iothread as brick. gfs-client is
 configured to use io-cache-write-behind-readahead-distribute-replicate.
 io-cache cache-size is 256MB. I used patched Fuse downloaded from Gluster
 web-site (build through DKMS).

 As the result yield, it seems that Gluster performance worse with
 increasing no. of client. One observation is that the glusterfs process on
 client is taking about 100% of CPU during all the tests. glusterfsd is
 utilizing only 70-80% of CPUs during the test time. Note that system is Dual
 core.

 I also tried using modglusterfs and not using fuse at all to serve all
 the static files and conduct another test with Grinder. The result is about
 the same, 1000+ peak TPS with 2-400 avg. TPS. A problem arise in this test
 that each Apache prefork process used more about twice more memory and we
 need to lower number of httpd processes by about half.

 I tried disable EnableMMAP and it didn't help much. Adjusting

Re: [Gluster-users] Gluster 2.0.3 + Apache on CentOS5 performance issue

2009-07-23 Thread Liam Slusser
We use mod_gluster and Apache 2.2 with good results.  We also ran into the
same issue as you: we ran out of memory past 150 threads even on an 8-gig
machine.  We got around this by compiling Apache with the worker MPM
(threads) instead of prefork - it uses 1/4 as much RAM with the same number
of connections (150-200) and everything has been running smoothly.  I cannot
see any performance difference except that it uses way less memory.
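
For reference, the stock CentOS 5 httpd package ships a worker binary too, so
a rebuild isn't strictly necessary - a sketch, with the file locations being
the distro defaults and the numbers only illustrative:

  # /etc/sysconfig/httpd - use the worker binary instead of prefork
  HTTPD=/usr/sbin/httpd.worker

  # httpd.conf - worker MPM tuning
  <IfModule worker.c>
      StartServers         4
      MaxClients         200
      MinSpareThreads     25
      MaxSpareThreads     75
      ThreadsPerChild     25
      MaxRequestsPerChild  0
  </IfModule>

Keep in mind the distro mod_php generally isn't built thread-safe, so under
worker you'd serve PHP some other way (e.g. FastCGI) - part of why this works
best for mostly-static content.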
liam


On Sun, Jul 12, 2009 at 5:11 AM, Somsak Sriprayoonsakul
soms...@gmail.com wrote:

 Hello,

 We have been evaluating the choice of a new platform for a webboard
 system.
 The webboard is PHP scripts that generate/modify an HTML page when a user
 posts or adds a comment to the page; the resulting topic is actually stored
 as an HTML file, with all related files (files attached to the topic, etc.)
 stored in its own directory for each topic. In general, the web site mostly
 serves a lot of small static files using Apache while using PHP for other
 dynamic content. This system has been working very well in the past, but
 with the increasing page view rate it is very likely that we will need some
 kind of cluster file system as a backend very soon.

 We have set up a test system using Grinder as the stress test tool. The test
 system is 11 Intel Dual Core x86_64 CentOS5 machines with stock Apache
 (prefork, since the goal is to use this with PHP), linked together with
 Gigabit Ethernet. We tried to compare the performance of a single NFS server
 in sync mode against 4 Gluster nodes (distribute over 2 replicated pairs)
 through Fuse. However, the transactions-per-second (TPS) results are not
 good.

 NFS (single server, sync mode)
  - 100 thread of client - Peak TPS = 1716.67, Avg. TPS = 1066, mean
 response time = 61.63 ms
  - 200 threads - Peak TPS = 2790, Avg. TPS = 1716, mean rt = 87.33 ms
  - 400 threads - Peak TPS = 3810, Avg. TPS = 1800, mean rt = 165ms
  - 600 threads - Peak TPS = 4506.67, Avg. TPS = 1676.67, mean rt = 287.33ms

 4-node Gluster (distribute over 2 replicated pairs)
 - 100 thread - peak TPS = 1293.33, Avg. TPS = 430, mean rt = 207.33ms
 - 200 threads - Peak TPS = 974.67, Avg. TPS = 245.33, mean rt = 672.67ms
 - 300 threads - Peak TPS = 861.33, Avg. TPS = 210, mean rt = 931.33
 (no 400-600 threads since we run out of client machine, sorry).

 glusterfsd is configured with 32 io-threads per brick. The client is
 configured with the io-cache/write-behind/read-ahead/distribute/replicate
 stack. The io-cache cache-size is 256MB. I used the patched Fuse downloaded
 from the Gluster web site (built through DKMS).

 As the results show, it seems that Gluster performs worse with an
 increasing number of clients. One observation is that the glusterfs process
 on the client is taking about 100% of a CPU during all the tests, while
 glusterfsd is utilizing only 70-80% of its CPUs during the test. Note that
 the systems are dual core.

 I also tried using mod_glusterfs, not using fuse at all, to serve all the
 static files and ran another test with Grinder. The result is about the
 same, 1000+ peak TPS with 200-400 avg. TPS. A problem arose in this test:
 each Apache prefork process used about twice as much memory and we needed
 to lower the number of httpd processes by about half.

 I tried disabling EnableMMAP and it didn't help much. Adjusting read-ahead
 and write-behind according to the GlusterOptimization page didn't help much
 either.

 My question is: there seems to be a bottleneck in this setup, but how can I
 track it down? Note that I didn't do any optimization other than what is
 said above. Is there any best-practice configuration for using Apache to
 serve a bunch of small static files like this?

 Regards,

 Somsak




 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://zresearch.com/cgi-bin/mailman/listinfo/gluster-users


___
Gluster-users mailing list
Gluster-users@gluster.org
http://zresearch.com/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] GlusterFS Preformance

2009-07-08 Thread Liam Slusser
You have to remember that when you are writing with NFS you're writing to
one node, whereas your gluster setup below is copying the same data to two
nodes, so you're doubling the bandwidth.  Don't expect NFS-like performance
on writes with multiple storage bricks.  However, read performance should be
quite good.
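
A rough way to see both sides of it with the same dd approach (paths are
placeholders):

  # write: every block is sent to both replicas over the wire
  time dd if=/dev/zero of=/mnt/glust/write_test bs=65536 count=15625
  # read back: reads are served from one replica, so this should be much
  # closer to single-server speed
  time dd if=/mnt/glust/write_test of=/dev/null bs=65536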
liam

On Wed, Jul 8, 2009 at 5:22 AM, Hiren Joshi j...@moonfruit.com wrote:

 Hi,

 I'm currently evaluating gluster with the intention of replacing our
 current setup and have a few questions:

 At the moment, we have a large SAN which is split into 10 partitions and
 served out via NFS. For gluster, I was thinking 12 nodes to make up
 about 6TB (mirrored so that's 1TB per node) and served out using
 gluster. What sort of filesystem should I be using for the nodes
 (currently on ext3) to give me the best performance and recoverability?

 Also, I setup a test with a simple mirrored pair with a client that
 looks like:
 volume glust3
  type protocol/client
  option transport-type tcp/client
  option remote-host glust3
  option remote-port 6996
  option remote-subvolume brick
 end-volume
 volume glust4
  type protocol/client
  option transport-type tcp/client
  option remote-host glust4
  option remote-port 6996
  option remote-subvolume brick
 end-volume
 volume mirror1
  type cluster/replicate
  subvolumes glust3 glust4
 end-volume
 volume writebehind
  type performance/write-behind
  option window-size 1MB
  subvolumes mirror1
 end-volume
 volume cache
  type performance/io-cache
  option cache-size 512MB
  subvolumes writebehind
 end-volume


 I ran a basic test by writing 1G to an NFS server and this gluster pair:
 [r...@glust1 ~]# time dd if=/dev/zero of=/mnt/glust2_nfs/nfs_test
 bs=65536 count=15625
 15625+0 records in
 15625+0 records out
 1024000000 bytes (1.0 GB) copied, 1718.16 seconds, 596 kB/s

 real28m38.278s
 user0m0.010s
 sys 0m0.650s
 [r...@glust1 ~]# time dd if=/dev/zero of=/mnt/glust/glust_test bs=65536
 count=15625
 15625+0 records in
 15625+0 records out
 1024000000 bytes (1.0 GB) copied, 3572.31 seconds, 287 kB/s

 real59m32.745s
 user0m0.010s
 sys 0m0.010s


 With it taking almost twice as long, can I expect this sort of
 performance degradation on 'real' servers? Also, what sort of setup
 would you recommend for us?

 Can anyone help?
 Thanks,
 Josh.

 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://zresearch.com/cgi-bin/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://zresearch.com/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Need a quick answer on Distributed Replicated Storage questions

2009-06-18 Thread Liam Slusser
Jonathan,

You can export a Gluster mount via a client with an NFS server, however the
performance is pretty poor.  As far as I know there is no way to export it
with iSCSI.
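
If you do try the NFS re-export route, the rough shape is below (a sketch
only - the volfile path, export network and fsid are placeholders; the kernel
NFS server wants an explicit fsid when exporting a FUSE mount):

  # on the gluster client acting as the NFS gateway
  glusterfs --volfile=/etc/glusterfs/client.vol /data

  # /etc/exports
  /data 192.168.1.0/24(rw,sync,fsid=10,no_subtree_check)

  exportfs -ra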

Your best option is to use a single/dual Linux/Solaris iSCSI server to
bootstrap all your systems in xenServer and then use Gluster and fuse to
mount your /data drive once the system is up and running.

liam

On Mon, Jun 15, 2009 at 5:15 PM, Jonathan Bayles jbay...@readytechs.com wrote:

 Hi all,

 I am attempting to prevent my company from having to buy a SAN to backend
 our virtualization platform (xenServer). Right now we have a light workload
 and 4 Dell 2950's (6 disks, 1 controller each) to leverage against the
 storage side. I like what I see in regard to the Distributed Replicated
 Storage where you essentially create a RAID 10 of bricks. This would work
 very well for me. The question is, how do I serve this storage paradigm to a
 front end that's expecting an NFS share or an iSCSI target? Does gluster
 enable me to access the entire cluster from a single IP? Or is it something
 I could run on a centos cluster (luci and ricci) and use the cluster suite
 to present the glustered file system in the form of an NFS share?

 Let me back up and state my needs/assumptions:

 * A storage cluster with the capacity equal to at least 1 node(assuming all
 nodes are the same).

 * I need to be able to lose/take down any one brick in the cluster at any
 time without a loss of data.

 * I need more than the throughput of a single server, if not in overall
 speed, then in width.

 * I need to be able to add more bricks in and have the expectation of
 increased storage capacity and throughput.

 * I need to present the storage as a single entity as an NFS share or a
 iSCSI target.

 If there are any existing models out there please point me to them; I
 don't mind doing the work, I just don't want to re-invent the wheel. Thanks
 in advance for your time and effort - I know what it's like to have to
 answer newbie questions!
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://zresearch.com/cgi-bin/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://zresearch.com/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Timestamp on replicated files and dirs

2009-06-08 Thread Liam Slusser
Stephan,

You can get the newest Gluster snapshot by using git; you can download git
itself here: http://git-scm.com/download

Once you have git do:

git clone git://git.sv.gnu.org/gluster.git glusterfs
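
From there the build is the usual autotools sequence (a sketch, assuming the
standard layout of the source tree):

  cd glusterfs
  ./autogen.sh
  ./configure --prefix=/usr
  make && make install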

liam

On Mon, Jun 8, 2009 at 3:20 AM, Stephan von Krawczynski sk...@ithnet.com wrote:

 Hello Liam,

 I have no idea where to download the git release (I am really looking for a
 tgz source archive to download). Nevertheless I found something called
 glusterfs-2.0.2 in the qa-releases dir and tried that. It sets the
 file-timestamps correctly, but not the dir-timestamps.
 Is there some place where one can download a daily or weekly source
 snapshot?

 Regards,
 Stephan



 On Sat, 6 Jun 2009 10:33:12 -0700
 Liam Slusser lslus...@gmail.com wrote:

  This has already been fixed in the newest git release, so grab that
  version if you need it today.  I believe it is included in version
  2.0.2.
 
  Liam
 


___
Gluster-users mailing list
Gluster-users@gluster.org
http://zresearch.com/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] raid5 or raid6 level cluster

2009-05-25 Thread Liam Slusser

Currently no, but it's in the roadmap for a future release.

ls



On May 25, 2009, at 1:57 AM, Vahriç Muhtaryan vah...@doruk.net.tr wrote:



Hello,



Is there any way to create a raid6 or raid5 level glusterfs installation?




From the docs I understood that I can do a raid1-based glusterfs
installation or raid0 (striping data to all servers) and a raid10-based
solution, but the raid10-based solution is not cost effective
because it needs too many servers.




Do you have a plan to keep one or two servers as parity for the whole
glusterfs system?




Regards

Vahric

___
Gluster-users mailing list
Gluster-users@gluster.org
http://zresearch.com/cgi-bin/mailman/listinfo/gluster-users


___
Gluster-users mailing list
Gluster-users@gluster.org
http://zresearch.com/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] Fwd: [Gluster-devel] rc8

2009-04-20 Thread Liam Slusser
I can't post to the devel list so I'll post here - I'm still seeing a memory
leak in rc8.  In my two-node server cluster, server1's memory footprint gets
larger, as does the load average, while write performance decreases.
Server2 (with the identical configuration file) does not have this issue.
I had this same problem with rc1, rc4, rc7, a git build from last week, and
now rc8.  1.3.12 works fine, however.
liam

-- Forwarded message --
From: Gordan Bobic gor...@bobich.net
Date: Mon, Apr 20, 2009 at 2:01 PM
Subject: Re: [Gluster-devel] rc8
To: gluster-de...@nongnu.org


Gordan Bobic wrote:

 First-access failing bug still seems to be present.
 But other than that, it seems to be distinctly better than rc4. :)
 Good work! :)


And that massive memory leak is gone, too! The process hasn't grown by a KB
after a kernel compile! :D

s/Good work/Awesome work/


:)


Gordan


___
Gluster-devel mailing list
gluster-de...@nongnu.org
http://lists.nongnu.org/mailman/listinfo/gluster-devel
___
Gluster-users mailing list
Gluster-users@gluster.org
http://zresearch.com/cgi-bin/mailman/listinfo/gluster-users