Re: [Gluster-users] Inviting comments on my plans

2012-11-19 Thread Fernando Frediani (Qube)
To grow the RAID is relatively simple, and you can control how fast it rebuilds so that it 
doesn't impact your performance. You can even put some lines in your crontab to raise or 
lower the rebuild priority via the RAID controller CLI depending on the time of day (see 
the sketch below). Yes, with RAID 5 you 'lose' the capacity of one disk, but you have to 
compromise somewhere. One disk isn't a big deal out of 8 or 12.
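For example, a rough sketch assuming an LSI MegaRAID controller and the MegaCli tool 
(install path, adapter numbers and the exact -AdpSetProp value syntax vary by MegaCli 
version, so treat this as illustrative only):

    # /etc/crontab -- lower the rebuild rate during the day, raise it at night
    0 8  * * * root /opt/MegaRAID/MegaCli/MegaCli64 -AdpSetProp RebuildRate 10 -aALL
    0 20 * * * root /opt/MegaRAID/MegaCli/MegaCli64 -AdpSetProp RebuildRate 60 -aALL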
If you still decide to go with single disks and no RAID, make sure that when a disk fails 
nothing freezes or locks up until the disk is completely removed and declared dead by the 
kernel. I have never done that myself, so I can't say how it will behave.

When you grow the RAID by adding more disks, you grow the logical disk on the RAID 
controller first, then the partition(s), then the file system.
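A minimal sketch of that sequence, assuming the expanded logical disk is /dev/sdb, used as 
an LVM physical volume with an XFS filesystem on the logical volume (all names are 
illustrative):

    echo 1 > /sys/block/sdb/device/rescan        # let the kernel see the new capacity
    pvresize /dev/sdb                            # grow the LVM physical volume
    lvextend -l +100%FREE /dev/vg_bricks/lv_brick1
    xfs_growfs /bricks/brick1                    # XFS grows online while mounted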
To be honest, given the amount of data you are talking about, I wouldn't even consider 
adding a half-populated server to grow later. As you are using small servers (12 slots), 
just add them fully populated and keep things simple: no growing the RAID to worry about, 
just replacing disks when necessary.

Fernando

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Shawn Heisey
Sent: 19 November 2012 16:36
To: 'gluster-users@gluster.org'
Subject: Re: [Gluster-users] Inviting comments on my plans

On 11/19/2012 3:18 AM, Fernando Frediani (Qube) wrote:
> Hi,
>
> I agree with the comment about Fedora and wouldn't choose it as a distribution, but if 
> you are comfortable with it, go ahead, as I don't think this will be the major pain.
>
> RAID: I see where you are coming from in choosing not to have any RAID, and I have 
> thought about doing the same myself, mainly for performance reasons, but as mentioned, 
> how are you going to handle the drive swap? If you think you can somehow automate it, 
> please share with us, as I believe running the disks independently is a major 
> performance gain.

There will be no automation.  I'll have to do everything myself -- telling the 
RAID controller to make the disk available to the OS, putting a filesystem on 
it, re-adding it to gluster, etc.  Although drive failure is inevitable, I do 
not expect it to be a common occurrence.

> What you are planning to do with XFS+BTRFS, I am not quite sure will work as you expect. 
> Ideally you need to use snapshots from the distributed filesystem, otherwise you might 
> think you are getting a consistent copy of the data when you are not, since you are not 
> supposed to be reading/writing anywhere other than on the Gluster mount.

The filesystem will be mounted on /bricks/fsname, but gluster will be pointed 
at /bricks/fsname/volname.  I would put snapshots in /bricks/fsname/snapshots.  
Gluster would never see the snapshot data.
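In other words, something along these lines (illustrative names, replica 2 across a server 
pair):

    mount /dev/vg0/brick1 /bricks/brick1        # the brick filesystem
    mkdir -p /bricks/brick1/myvol               # the directory gluster is pointed at
    mkdir -p /bricks/brick1/snapshots           # snapshots live outside the brick directory
    gluster volume create myvol replica 2 \
        serverA:/bricks/brick1/myvol serverB:/bricks/brick1/myvol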

> Performance: Simple and short - if you can compromise one disk per host AND choose not 
> to go with independent disks (no RAID), go with RAID 5.
> As your system grows, the reads and writes should (in theory) be distributed across all 
> bricks. If a disk fails you can easily replace it, and even in the unlikely event that 
> you lose two disks in a server and lose its data entirely, you still have a copy of it 
> in another place and can rebuild it with a bit of patience, so no data loss.
> Also, we have had more than enough reports of bad performance in Gluster for all kinds 
> of configurations (including RAID 10), so I don't think anyone should expect Gluster to 
> perform that well; using RAID 5, 6 or 10 underneath shouldn't make much difference, and 
> RAID 10 would only waste space. If you are storing bulk data (multimedia, images, big 
> files), great: it will be streamed, sequential data and should be acceptable. But if you 
> are storing things that do a lot of small IO, or virtual machines, I'm not sure Gluster 
> is the best choice for you and you should think carefully about it.

A big problem that I would be facing if I went with RAID5 is that I won't 
initially have all drive bays populated.  The server has 12 drive bays.  If I 
populate 8 bays per server to start out, what happens when I need to fill in 
the other 4 bays?

If I make a new RAID5, then I have lost the capacity of another disk, and I 
have no option other than adding at least three drives at a time.  
I would not have the option of growing one disk at a time.  I can probably grow 
the existing RAID array, but that is a process that will literally take days, 
during which the entire array is in a fragile state with horrible performance. 
If others have experience with doing this on Dell hardware and have had 
consistently good luck with it, then my objection may be unfounded.

With individual disks instead of RAID, I can add one disk at a time to a server 
pair.
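For example (hypothetical device, path and volume names; the new disk becomes one brick on 
each server of the pair):

    # on each server of the pair: put a filesystem on the new disk and mount it as a brick
    mkfs.xfs -i size=512 /dev/sdi
    mkdir -p /bricks/disk9 && mount /dev/sdi /bricks/disk9
    mkdir -p /bricks/disk9/myvol

    # then, from either server, add the new replica pair and rebalance
    gluster volume add-brick myvol serverA:/bricks/disk9/myvol serverB:/bricks/disk9/myvol
    gluster volume rebalance myvol start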

We will be 

Re: [Gluster-users] Inviting comments on my plans

2012-11-19 Thread Fernando Frediani (Qube)
Hi,

I agree with the comment about Fedora and wouldn't choose it as a distribution, but if you 
are comfortable with it, go ahead, as I don't think this will be the major pain.

RAID: I see where you are coming from in choosing not to have any RAID, and I have thought 
about doing the same myself, mainly for performance reasons, but as mentioned, how are you 
going to handle the drive swap? If you think you can somehow automate it, please share with 
us, as I believe running the disks independently is a major performance gain.

What you are planning to do with XFS+BTRFS, I am not quite sure will work as you expect. 
Ideally you need to use snapshots from the distributed filesystem, otherwise you might 
think you are getting a consistent copy of the data when you are not, since you are not 
supposed to be reading/writing anywhere other than on the Gluster mount.

Performance: Simple and short - if you can compromise one disk per host AND choose not to 
go with independent disks (no RAID), go with RAID 5.
As your system grows, the reads and writes should (in theory) be distributed across all 
bricks. If a disk fails you can easily replace it, and even in the unlikely event that you 
lose two disks in a server and lose its data entirely, you still have a copy of it in 
another place and can rebuild it with a bit of patience, so no data loss.
Also, we have had more than enough reports of bad performance in Gluster for all kinds of 
configurations (including RAID 10), so I don't think anyone should expect Gluster to 
perform that well; using RAID 5, 6 or 10 underneath shouldn't make much difference, and 
RAID 10 would only waste space. If you are storing bulk data (multimedia, images, big 
files), great: it will be streamed, sequential data and should be acceptable. But if you 
are storing things that do a lot of small IO, or virtual machines, I'm not sure Gluster is 
the best choice for you and you should think carefully about it.

Fernando

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Brian Candler
Sent: 18 November 2012 12:19
To: Shawn Heisey
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] Inviting comments on my plans

On Sat, Nov 17, 2012 at 11:04:33AM -0700, Shawn Heisey wrote:
> Dell R720xd servers with two internal OS drives and 12 hot-swap 
> external 3.5 inch bays.  Fedora 18 alpha, to be upgraded to Fedora
> 18 when it is released.

I would strongly recommend *against* Fedora in any production environment, 
simply because there are new releases every 6 months, and releases are only 
supported for 18 months from release.  You are therefore locked into a complete 
OS reinstall every 6 months (or at best, three upgrades every 18 months).

If you want something that's free and RPM-based for production, I suggest you 
use CentOS or Scientific Linux.

> 2TB simple LVM volumes for bricks.
> A combination of 4TB disks (two bricks per drive) and 2TB disks.

With no RAID, 100% reliant on gluster replication? You discussed this later but 
I would still advise against this.  If you go this route, you will need to be 
very sure about your procedures for (a) detecting failed drives, and
(b) replacing failed drives.  It's certainly not a simple pull-out/push-in (or 
rebuild-on-hot-spare) as it would be with RAID.  You'll have to introduce a new 
drive, create the filesystem (or two filesystems on a 4TB drive), and 
reintroduce those filesystems as bricks into gluster: but not using 
replace-brick because the failed brick will have gone.  So you need to be 
confident in the abilities of your operational staff to do this.

If you do it this way, please test and document it for the rest of us.
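Roughly, the recovery might look something like this (a sketch only, untested; the exact 
steps, in particular how gluster accepts a re-created brick path, depend on your gluster 
version, and all names are illustrative):

    # the replacement drive shows up as /dev/sdf
    mkfs.xfs -i size=512 /dev/sdf
    mount /dev/sdf /bricks/disk6
    mkdir -p /bricks/disk6/myvol

    # then let gluster repopulate the empty brick from its replica
    gluster volume heal myvol full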

> Now for the really controversial part of my plans: Left-hand brick 
> filesystems (listed first in each replica set) will be XFS, right-hand 
> bricks will be BTRFS.  The idea here is that we will have one copy of 
> the volume on a fully battle-tested and reliable filesystem, and 
> another copy of the filesystem stored in a way that we can create 
> periodic snapshots for last-ditch "oops" recovery.
> Because of the distributed nature of the filesystem, using those 
> snapshots will not be straightforward, but it will be POSSIBLE.

Of course it depends on your HA requirements, but another approach would be to have a 
non-replicated volume (XFS) and then geo-replicate to another server with BTRFS, and do 
your snapshotting there (sketched below). Then your "live" data is not dependent on BTRFS 
issues.

This also has the bonus that your BTRFS server could be network-remote.
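A rough sketch of that layout (gluster 3.3 syntax; host, volume and path names are made up, 
and I haven't tested this exact combination):

    # on the primary cluster: replicate the live volume to a directory on the BTRFS box
    gluster volume geo-replication myvol ssh://root@backuphost:/data/myvol-copy start
    gluster volume geo-replication myvol ssh://root@backuphost:/data/myvol-copy status

    # on backuphost, assuming /data/myvol-copy is a BTRFS subvolume: periodic read-only
    # snapshots, e.g. from cron
    btrfs subvolume snapshot -r /data/myvol-copy /data/snapshots/myvol-$(date +%F)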

> * Performance.
> RAID 5/6 comes with a severe penalty on performance during sustained 
> writes -- writing more data than will fit in your RAID controller's 
> cache memory.  Also, if you have a failed disk, all performance is 
> greatly impacted during the entire rebuild process, which for a 4TB 
> disk is likely to take a few days.

Actually, sustained sequential writes are the

Re: [Gluster-users] Very slow directory listing and high CPU usage on replicated volume

2012-11-06 Thread Fernando Frediani (Qube)
Actually you raised a very good point.

Why does it need to rely on FUSE? Why can't it be something that runs in the kernel and has 
no reliance on FUSE? I imagine that would require a lot of engineering, but the benefits 
need no mention.
Does anyone know a bit about the architecture of Isilon and of other POSIX-compliant 
distributed filesystems?

Fernando

-Original Message-
From: Joe Landman [mailto:land...@scalableinformatics.com] 
Sent: 06 November 2012 12:39
To: Fernando Frediani (Qube)
Cc: 'gluster-users@gluster.org'
Subject: Re: [Gluster-users] Very slow directory listing and high CPU usage on 
replicated volume

On 11/06/2012 04:35 AM, Fernando Frediani (Qube) wrote:
> Joe,
>
> I don't think we have to accept this, as it is not an acceptable thing.

I understand your unhappiness with it.  But it's "free", and you sometimes have to accept 
what you get for "free".

> I have seen countless people complaining about this problem for a while, and it seems no 
> improvements have been made. The ramdisk idea, although it might help, looks more like a 
> chewing-gum fix. I have seen other distributed filesystems that don't suffer from the 
> same problem, so why does Gluster have to?

This goes to some aspect of the implementation.  FUSE makes metadata ops (and 
other very small IOs) problematic (as in time consuming).  There are no easy 
fixes for this, without engineering a new kernel subsystem
(unlikely) to incorporate Gluster, or redesigning FUSE so this is not an issue. 
 I am not sure either is likely.

Red Hat may be willing to talk to you about these if you give them money for 
subscriptions.  They eventually relented on xfs.


-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: land...@scalableinformatics.com
web  : http://scalableinformatics.com
http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Very slow directory listing and high CPU usage on replicated volume

2012-11-06 Thread Fernando Frediani (Qube)
Joe,

I don't think we have to accept this, as it is not an acceptable thing. I have seen 
countless people complaining about this problem for a while, and it seems no improvements 
have been made.
The ramdisk idea, although it might help, looks more like a chewing-gum fix. I have seen 
other distributed filesystems that don't suffer from the same problem, so why does Gluster 
have to?

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Joe Landman
Sent: 05 November 2012 15:07
To: gluster-users@gluster.org
Subject: Re: [Gluster-users] Very slow directory listing and high CPU usage on 
replicated volume

On 11/05/2012 09:57 AM, harry mangalam wrote:
> Jeff Darcy wrote a nice piece in his hekafs blog about 'the importance 
> of keeping things sequential' which is essentially about the 
> contention for heads between data io and journal io.
> (also congrats on the Linux Journal article on the glupy 
> python/gluster approach).
>
> We've been experimenting with SSDs on ZFS (using the SSDs for the ZIL
> (journal)) and while it's provided a little bit of a boost, it has not 
> been dramatic.  Ditto XFS.  However, we did not stress it at all with 
> heavy loads

An issue you have to worry about is if the SSD streaming read/write path is 
around the same speed as the spinning rust performance.  If so, this design 
would be a wash at best.

Also, if this is under Linux, the ZFS pathways may not be terribly well 
optimized.

> in a gluster env and I'm now thinking that there is where you would 
> see the improvement. (see Jeff's graph about how the diff in 
> threads/load affects IOPS).
>
> Is anyone running a gluster system with the underlying XFS writing the 
> journal to SSDs?  If so, any improvement?  I would have expected to 
> hear about this as a recommended architecture for gluster if it had 
> performed MUCH better, but

Yes, we've done this, and do this on occasion.  No, there's no dramatic speed 
boost for most use cases.

Unfortunately, heavy metadata ops on GlusterFS are going to be slow, and we 
simply have to accept that for the near term.  This appears to be independent 
of the particular file system, or even storage technology. 
If you aren't doing metadata heavy ops, then you should be in good shape.  It 
appears that mirroring magnifies the metadata heavy ops significantly.

For laughs, about a year ago, we set up large ram disks (tmpfs) in a cluster, 
put a loopback device on them, then a file system, then GlusterFS atop this.  
Should have been very fast for metadata ops.  But it wasn't.  Gave some 
improvement, but not significant enough that we'd recommend doing "heroic" 
designs like this.
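Roughly what that experiment looked like, for anyone who wants to repeat it (sizes and 
paths are illustrative):

    mount -t tmpfs -o size=8g tmpfs /mnt/ram           # large RAM disk
    dd if=/dev/zero of=/mnt/ram/brick.img bs=1M count=7000
    losetup /dev/loop0 /mnt/ram/brick.img              # loopback device on top of it
    mkfs.xfs /dev/loop0
    mount /dev/loop0 /bricks/ramdisk                   # then use this as a normal brick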

If your workloads are metadata heavy, we'd recommend local IO, and if you are 
mostly small IO, an SSD.




-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: land...@scalableinformatics.com
web  : http://scalableinformatics.com
http://scalableinformatics.com/siflash
phone: +1 734 786 8423 x121
cell : +1 734 612 4615
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] performance in 3.3

2012-10-19 Thread Fernando Frediani (Qube)
Hi Doug,

Try to make the change suggested by Anand and let us know how you get on. I am interested 
to hear about the performance of 3.3, because bad performance has been the subject of many 
emails here for a while.

Regards,

Fernando

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Doug Schouten
Sent: 19 October 2012 02:45
To: gluster-users@gluster.org
Subject: [Gluster-users] performance in 3.3

Hi,

I am noticing a rather slow read performance using GlusterFS 3.3 with the 
following configuration:

Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: server1:/srv/data
Brick2: server2:/srv/data
Brick3: server3:/srv/data
Brick4: server4:/srv/data
Options Reconfigured:
features.quota: off
features.quota-timeout: 1800
performance.flush-behind: on
performance.io-thread-count: 64
performance.quick-read: on
performance.stat-prefetch: on
performance.io-cache: on
performance.write-behind: on
performance.read-ahead: on
performance.write-behind-window-size: 4MB
performance.cache-refresh-timeout: 1
performance.cache-size: 4GB
nfs.rpc-auth-allow: none
network.frame-timeout: 60
nfs.disable: on
performance.cache-max-file-size: 1GB


The servers are connected with bonded 1Gb ethernet, and have LSI MegaRAID arrays with 
12x1TB disks in a RAID-6 array, using an XFS file system mounted like:

xfs logbufs=8,logbsize=32k,noatime,nodiratime  00

and we use the FUSE client

localhost:/global /global glusterfs
defaults,direct-io-mode=enable,log-level=WARNING,log-file=/var/log/gluster.log
0 0

Our files are all >= 2MB. When rsync-ing we see about 50MB/s read performance 
which improves to 250MB/s after the first copy. This indicates to me that the 
disk caching is working as expected. However I am rather surprised by the low 
50MB/s read speed; this is too low to be limited by network, and the native 
disk read performance is way better. 
Is there some configuration that can improve this situation?

thanks,


-- 


  Doug Schouten
  Research Associate
  TRIUMF
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] self-heal impact on performance

2012-10-04 Thread Fernando Frediani (Qube)
Joao, Gluster is not yet good for running virtual machines. That's a statement from Red Hat 
sales. So although it works, you will find issues like this.

Fernando

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of João Pagaime
Sent: 03 October 2012 17:40
To: Gluster-users@gluster.org
Subject: [Gluster-users] self-heal impact on performance

Hello all

I'm testing glusterfs 3.3.0-1 on a couple of CentOS 6.3 servers that run KVM

after inserting a new empty brick due to a simulated failure, the self-healing process 
kicked in, as expected

after that, however, the VMs became mostly unusable due to IO delay

it looks like the self-healing process doesn't let anything else run normally

I believe glusterfs 3.3 has some improvements to avoid this problem

is there some performance tuning that has to be done?

is there some specific command to start a special self-healing process for systems that 
have large files (like virtualization systems)?

thanks, best regards,
João

PS: this probably isn't a new problem: I've picked up the email
subject from a message dating Mar 24   2011
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] infiniband bonding

2012-09-21 Thread Fernando Frediani (Qube)
Well, it actually says it is a limitation of the InfiniBand driver, so nothing to do with 
Gluster, I guess. If the driver allows it, then in theory it should not be a problem for 
Gluster.

Fernando

From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of samuel
Sent: 21 September 2012 10:56
To: gluster-users@gluster.org
Subject: [Gluster-users] infiniband bonding

Hi folks,

Reading this post: 
http://community.gluster.org/q/port-bonding-link-aggregation-transport-rdma-ib-verbs/

It says that gluster 3.2 does not support bonding of infiniband ports.

Does anyone know whether 3.3 has changed this limitation? Is there any other place to find 
information about this subject?

Thanks in advance!

Samuel.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Throughout over infiniband

2012-09-10 Thread Fernando Frediani (Qube)
Well, I would say there is a reason, if the Gluster client performed as expected.
Using the Gluster client, it should in theory access the file(s) directly from the nodes 
where they reside, rather than having to go through a single node exporting the NFS folder, 
which would then have to gather the file.
Yes, NFS has all the caching stuff, but if the Gluster client behaved similarly it should 
be able to get similar performance, which doesn't seem to be what has been reported.
I did tests myself using the Gluster client and NFS, and NFS got better performance too; I 
believe this is due to the caching.
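For what it's worth, the two mounts I compared were along these lines (server and volume 
names are illustrative):

    # native FUSE client
    mount -t glusterfs server1:/myvol /mnt/gluster

    # gluster's built-in NFS server (NFSv3 over TCP, so the kernel NFS client cache applies)
    mount -t nfs -o vers=3,tcp server1:/myvol /mnt/nfs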

Fernando

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Stephan von Krawczynski
Sent: 10 September 2012 13:57
To: Whit Blauvelt
Cc: gluster-users@gluster.org; Brian Candler
Subject: Re: [Gluster-users] Throughout over infiniband

On Mon, 10 Sep 2012 08:06:51 -0400
Whit Blauvelt  wrote:

> On Mon, Sep 10, 2012 at 11:13:11AM +0200, Stephan von Krawczynski wrote:
> > [...]
> > If you're lucky you reach something like 1/3 of the NFS performance.
> [Gluster NFS Client]
> Whit

There is a reason why one would switch from NFS to GlusterFS, and mostly it is redundancy. 
If you start using an NFS-type client you cut yourself off from the "complete solution". As 
said elsewhere, you can also export GlusterFS via the kernel NFS server. But honestly, it 
is a patch. It would be far better if things were done right, with a native glusterfs 
client in kernel space.
And remember, generally there should be no big difference between NFS and GlusterFS with 
bricks spread over several networks - if it is done as it should be, without userspace.

--
MfG,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] QEMU-GlusterFS native integration demo video

2012-09-03 Thread Fernando Frediani (Qube)
Ok, thanks for clarifying, Bharata.
Looking forward to seeing the next version's results.

Regards,

Fernando

-Original Message-
From: Bharata B Rao [mailto:bharata@gmail.com] 
Sent: 03 September 2012 10:19
To: Fernando Frediani (Qube)
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] QEMU-GlusterFS native integration demo video

On Mon, Sep 3, 2012 at 2:42 PM, Fernando Frediani (Qube) 
 wrote:
> Hi Bharata,
>
> I am interested to see the performance for reads and writes separately. 
> Aggregated results will probably be much influenced by the read performance.

Sorry if it was not explicit from those URLs, but the numbers you see are 
separate for reads and writes. Aggregated means that the result is aggregated 
from 4 write or read requests that run simultaneously as part of either write 
or read test.

>
> Interesting, the last figures seem like quite a significant performance gain using 
> QEMU-GlusterFS.

Right.

>
> See if you can get the results for reads and writes separately.

I am working on next version of the patches that will work with the latest 
libgfapi. I will try to re-generate these numbers for the latest version.

Regards,
Bharata.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] QEMU-GlusterFS native integration demo video

2012-09-03 Thread Fernando Frediani (Qube)
Hi Bharata,

I am interested to see the performance for reads and writes separately. Aggregated results 
will probably be much influenced by the read performance.

Interesting, the last figures seem like quite a significant performance gain using 
QEMU-GlusterFS.

See if you can get the results for reads and writes separately.

Best regards,

Fernando

-Original Message-
From: Bharata B Rao [mailto:bharata@gmail.com] 
Sent: 03 September 2012 10:01
To: Fernando Frediani (Qube)
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] QEMU-GlusterFS native integration demo video

On Mon, Sep 3, 2012 at 2:14 PM, Fernando Frediani (Qube) 
 wrote:
> Hi Bharata
> Thanks for this, very useful.
>
> Would you be able to run tests with mainly reads and mainly writes separately? As far as 
> I know there is a big hit and poor write performance in normal FUSE mounts.

Ok, here are the aggregated bandwidth numbers from a quick FIO (4 seq writes 
using libaio) write test:

Base: 189667KB/s
QEMU-GlusterFS native: 150635KB/s
QEMU-GlusterFS FUSE: 43028KB/s

So as you can see, native is much better than FUSE, but still doesn't match the 
base numbers. When I say base, it means that the guest is run directly from 
glusterfs brick (w/o gluster FUSE mount or using QEMU-GlusterFS native driver).
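The write test was along these lines (an illustrative FIO invocation, not the exact job 
file used):

    fio --name=seqwrite --directory=/mnt/test --ioengine=libaio --direct=1 \
        --rw=write --bs=1M --size=1g --numjobs=4 --group_reporting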

> Are you using IOmeter or bonnie ?

FIO at the moment. Plan to include more benchmark numbers in future.
Any help here would be appreciated :)

>
> Seems the results with fuse and the native qemu-glusterfs are pretty similar, 
> am I right ?

No, QEMU-GlusterFS native numbers are way better than FUSE. Let me quote the 
numbers from the URL I gave earlier:

Base: 63076KB/s
QEMU-GlusterFS native: 53609KB/s
QEMU-GlusterFS FUSE: 29392KB/s

Regards,
Bharata.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] QEMU-GlusterFS native integration demo video

2012-09-03 Thread Fernando Frediani (Qube)
Hi Bharata
Thanks for this, very useful.

Would you be able to run tests with mainly reads and mainly writes separately? As far as I 
know there is a big hit and poor write performance in normal FUSE mounts.
Are you using IOmeter or bonnie ?

Seems the results with fuse and the native qemu-glusterfs are pretty similar, 
am I right ?

Regards,

Fernando

-Original Message-
From: Bharata B Rao [mailto:bharata@gmail.com] 
Sent: 03 September 2012 06:54
To: Fernando Frediani (Qube)
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] QEMU-GlusterFS native integration demo video

On Tue, Aug 28, 2012 at 3:04 PM, Fernando Frediani (Qube) 
 wrote:
> Thanks for sharing it with us Bharata.
>
> I saw you have two nodes. Have you done any performance tests, and if so, how do they 
> compare with creating normal .qcow2 or .raw files on the filesystem, especially for the 
> writes?

Fernando,

In the video I was using a single node system (local brick). However I have 
tested QEMU-GlusterFS with 2 node scenario too.

I have some performance numbers that compare the QEMU-GlusterFS native 
integration with QEMU-GlusterFS FUSE mount. Were you looking for anything 
different ?

I don't have numbers for qcow2 or for create, but have numbers for reads for 
raw files.

FIO numbers for read can be found here:
http://lists.nongnu.org/archive/html/qemu-devel/2012-07/msg02718.html
http://lists.gnu.org/archive/html/gluster-devel/2012-08/msg00063.html

I am planning to publish more numbers for other scenarios (qcow2 and writes 
etc) in future.

Regards,
Bharata.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] QEMU-GlusterFS native integration demo video

2012-08-28 Thread Fernando Frediani (Qube)
Thanks for sharing it with us Bharata.

I saw you have two nodes. Have you done any performance tests, and if so, how do they 
compare with creating normal .qcow2 or .raw files on the filesystem, especially for the 
writes?

Thanks

Fernando

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Bharata B Rao
Sent: 28 August 2012 05:14
To: gluster-users@gluster.org; gluster-de...@nongnu.org
Subject: [Gluster-users] QEMU-GlusterFS native integration demo video

Hi,

If you are interested and/or curious to know how QEMU can be used to create and 
boot VM's from GlusterFS volume, take a look at the demo video I have created 
at:

www.youtube.com/watch?v=JG3kF_djclg

Regards,
Bharata.
--
http://bharata.sulekha.com/blog/posts.htm, http://raobharata.wordpress.com/ 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Query for web GUI for gluster configuration

2012-08-24 Thread Fernando Frediani (Qube)
Joe,
Gluster is a great concept and I quite like it; however, I also think it has a long way to 
go yet. Yes, it solves people's problems for certain scenarios, but not yet for most of 
them, which I am sure they are working to solve, and with the Red Hat acquisition it will 
perhaps speed up.
It would be unfair to say that 'they' (he doesn't do it alone) are doing it wrong. In many 
respects they are doing it well, and there have been many improvements over the past couple 
of months, but on the performance front they aren't there yet; that's just a fact whether 
you like it or not, and they have already acknowledged it.

Fernando

-Original Message-
From: Joe Julian [mailto:j...@julianfamily.org] 
Sent: 24 August 2012 10:35
To: Fernando Frediani (Qube)
Cc: 'gluster-users@gluster.org'
Subject: Re: [Gluster-users] Query for web GUI for gluster configuration

My images are, indeed, on the volume. I saw that image-based performance wasn't 
sufficient so I found a better way. That's my point. Not everyone is restricted 
from being able to take advantage of a better way. So yes, I mount a GlusterFS 
volume within a VM whose image resides on a GlusterFS volume.

By the way, you see "9 of 10 people" reporting poor performance in part because 
the people that don't have poor performance don't complain about it. Your 
metric is skewed.

I recognize you're upset because your expectations are not being met. 
You're not alone. I'm not alone either. I see about a dozen new users each day 
that are excited that this software solves a problem for them.

Maybe you should consider posting your use case and seeing if anyone has any 
suggestions on how you could satisfy that rather than trying to tell Vijay that 
he's doing it wrong.

On 08/24/2012 02:15 AM, Fernando Frediani (Qube) wrote:
> Joe,
> 9 out of 10 people I have seen here reported very poor performance when running VMs. 
> Obviously they run, but all the performance tests gave results that no one would expect 
> to get. So yes, it works for low-profile virtual servers, like quiet web servers, etc., 
> but not in general. It can't handle any peak you need, and it does need improvements in 
> the Gluster code.
> You can't add it to oVirt and tell people, "Look, it's there, but it's only for 
> low-profile servers that don't require any significant performance". People wouldn't 
> even consider that.
> Also, you seem to be mounting Gluster inside the VM, which is not the case here. On 
> oVirt it would be for hosting the VM images.
>
> Regards,
>
> Fernando
>
> -Original Message-
> From: gluster-users-boun...@gluster.org 
> [mailto:gluster-users-boun...@gluster.org] On Behalf Of Joe Julian
> Sent: 24 August 2012 10:08
> To: gluster-users@gluster.org
> Subject: Re: [Gluster-users] Query for web GUI for gluster 
> configuration
>
> It's only "far from good" for certain use cases, not all. I'm actually quite 
> pleased that GlusterFS emphasizes C&  P over A and none of my users complain 
> ( I use raw images for 14 kvm vms on one of my GlusterFS volumes ). Those VMs 
> mount GlusterFS volumes for their application data.
> Very little is done on the raw image.
>
> Not everybody that uses virtualization is going to be marketing those VMs to 
> 3rd parties that expect it to pretend to be an isolated system.
>
> On 08/24/2012 01:58 AM, Fernando Frediani (Qube) wrote:
>> Vijay, how are you planning to integrate Gluster with oVirt (a fantastic 
>> idea in my opinion) if the performance when running .qcow2 or even .raw 
>> files is far from good at the moment ?
>>
>> Fernando
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Query for web GUI for gluster configuration

2012-08-24 Thread Fernando Frediani (Qube)
Joe,
9 out of 10 people I have seen here reported very poor performance when running VMs. 
Obviously they run, but all the performance tests gave results that no one would expect to 
get. So yes, it works for low-profile virtual servers, like quiet web servers, etc., but 
not in general. It can't handle any peak you need, and it does need improvements in the 
Gluster code.
You can't add it to oVirt and tell people, "Look, it's there, but it's only for low-profile 
servers that don't require any significant performance". People wouldn't even consider that.
Also, you seem to be mounting Gluster inside the VM, which is not the case here. On oVirt 
it would be for hosting the VM images.

Regards,

Fernando

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Joe Julian
Sent: 24 August 2012 10:08
To: gluster-users@gluster.org
Subject: Re: [Gluster-users] Query for web GUI for gluster configuration

It's only "far from good" for certain use cases, not all. I'm actually quite 
pleased that GlusterFS emphasizes C & P over A and none of my users complain ( 
I use raw images for 14 kvm vms on one of my GlusterFS volumes ). Those VMs 
mount GlusterFS volumes for their application data. 
Very little is done on the raw image.

Not everybody that uses virtualization is going to be marketing those VMs to 
3rd parties that expect it to pretend to be an isolated system.

On 08/24/2012 01:58 AM, Fernando Frediani (Qube) wrote:
> Vijay, how are you planning to integrate Gluster with oVirt (a fantastic idea 
> in my opinion) if the performance when running .qcow2 or even .raw files is 
> far from good at the moment ?
>
> Fernando
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Query for web GUI for gluster configuration

2012-08-24 Thread Fernando Frediani (Qube)
Vijay, how are you planning to integrate Gluster with oVirt (a fantastic idea 
in my opinion) if the performance when running .qcow2 or even .raw files is far 
from good at the moment ?

Fernando

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Vijay Bellur
Sent: 24 August 2012 09:43
To: Jules Wang
Cc: gluster-users
Subject: Re: [Gluster-users] Query for web GUI for gluster configuration

On 08/24/2012 02:07 PM, Jules Wang wrote:
> hi, jack
>  As far as I know, the gmc project is deprecated. And redhat plans 
> to use its own product to cover this area.

Yes, gmc is deprecated. We are working with oVirt to provide GUI based gluster 
configuration capabilities. oVirt 3.1 which was recently released has some 
volume management features. If you give that a spin, we would love to hear your 
feedback :)

-Vijay
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] CTDB with Gluster

2012-08-21 Thread Fernando Frediani (Qube)
Hi James,
Well, uCarp does VRRP in a very simple way, but it's not the solution for multiple nodes. I 
don't know keepalived. Will it manage, with a single instance, the placement of IP 
addresses across all nodes, and take-over by any other node at random if a node goes down?

Fernando

From: James [purplei...@gmail.com]
Sent: 21 August 2012 18:30
To: Fernando Frediani (Qube)
Cc: 'gluster-users@gluster.org'
Subject: Re: [Gluster-users] CTDB with Gluster

On Tue, 2012-08-21 at 16:33 +0000, Fernando Frediani (Qube) wrote:
> Just have a simple and easy IP distribution/take-over system in the case any 
> nodes fail.
So what you want is VRRP.
Why not use keepalived? It's very easy to set up.
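A minimal keepalived.conf sketch (the interface, router id, priority and virtual IP are 
placeholders to adapt):

    vrrp_instance GLUSTER_VIP {
        state MASTER            # use BACKUP and a lower priority on the other node(s)
        interface eth0
        virtual_router_id 51
        priority 100
        advert_int 1
        virtual_ipaddress {
            192.168.1.100/24
        }
    }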

James

>
> Fernando
>
> -Original Message-
> From: James [mailto:purplei...@gmail.com]
> Sent: 21 August 2012 15:54
> To: Fernando Frediani (Qube)
> Cc: 'gluster-users@gluster.org'
> Subject: Re: [Gluster-users] CTDB with Gluster
>
> On Tue, 2012-08-21 at 08:40 +, Fernando Frediani (Qube) wrote:
> > Has anyone used CTDB(http://ctdb.samba.org/) for IP failover/balance
> > between nodes using Gluster ?
> I use keepalived at the moment, to provide a virtual ip for the gluster 
> cluster (to use as the mount ip) but in the future I'll probably switch to 
> cman/corosync based cluster management which will manage a virtual ip 
> "resource".
>
> > I guess that’s what was used on the old commercial version of Gluster
> > Storage Platform.
> >
> > Initially I had thoughts on uCarp, but CTDB seems a much better fit
> > for this type of environments.
> I don't know anything about CTDB, but my one minute viewing of their webpage 
> makes me think otherwise. What are you trying to do ?
>
>
> James
>
> >
> >
> >
> > Does it do the job well and fast ?
> >
> >
> >
> > Thanks
> >
> >
> >
> > Fernando
> >
> >
> >
> >
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] CTDB with Gluster

2012-08-21 Thread Fernando Frediani (Qube)
Just have a simple and easy IP distribution/take-over system in the case any 
nodes fail.

Fernando

-Original Message-
From: James [mailto:purplei...@gmail.com] 
Sent: 21 August 2012 15:54
To: Fernando Frediani (Qube)
Cc: 'gluster-users@gluster.org'
Subject: Re: [Gluster-users] CTDB with Gluster

On Tue, 2012-08-21 at 08:40 +, Fernando Frediani (Qube) wrote:
> Has anyone used CTDB(http://ctdb.samba.org/) for IP failover/balance 
> between nodes using Gluster ?
I use keepalived at the moment, to provide a virtual ip for the gluster cluster 
(to use as the mount ip) but in the future I'll probably switch to 
cman/corosync based cluster management which will manage a virtual ip 
"resource".

> I guess that’s what was used on the old commercial version of Gluster 
> Storage Platform.
> 
> Initially I had thoughts on uCarp, but CTDB seems a much better fit 
> for this type of environments.
I don't know anything about CTDB, but my one minute viewing of their webpage 
makes me think otherwise. What are you trying to do ?


James

> 
>  
> 
> Does it do the job well and fast ?
> 
>  
> 
> Thanks
> 
>  
> 
> Fernando
> 
>  
> 
> 
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] CTDB with Gluster

2012-08-21 Thread Fernando Frediani (Qube)
Has anyone used CTDB(http://ctdb.samba.org/) for IP failover/balance between 
nodes using Gluster ? I guess that's what was used on the old commercial 
version of Gluster Storage Platform.
Initially I had thoughts on uCarp, but CTDB seems a much better fit for this 
type of environments.

Does it do the job well and fast ?

Thanks

Fernando

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster speed sooo slow

2012-08-13 Thread Fernando Frediani (Qube)
3.2 Ivan.

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Ivan Dimitrov
Sent: 13 August 2012 12:33
To: 'gluster-users@gluster.org'
Subject: Re: [Gluster-users] Gluster speed sooo slow

I have a low-traffic free hosting service and I converted some x,000 users to glusterfs a 
few months ago. I'm not impressed at all and would probably not convert any more users. It 
works OK for now, but with 88GB used of a 2TB volume it's kind of pointless for now... :( 
I'm researching a way to convert my paid hosting users, but I can't find any system 
suitable for the job.

Fernando, what gluster structure are you talking about?

Best Regards
Ivan Dimitrov

On 8/13/12 2:16 PM, Fernando Frediani (Qube) wrote:
> I heard from a large ISP, talking to someone who works there, that they were trying to 
> use GlusterFS for Maildir and they had a hell of a time because of the many small files, 
> with customers complaining all the time.
> Latency is acceptable on a networked filesystem, but the results people are reporting 
> are beyond any latency problem; they are due to the way Gluster is structured, and that 
> was already confirmed by some people on this list, so changes are indeed needed in the 
> code. Even on a Gigabit network the round trip isn't that much really (not more than a 
> quarter of a ms), so it shouldn't be a big thing.
> Yes, FUSE might also contribute to decreased performance, but still the performance 
> problems are in the architecture of the filesystem.
> One thing that is new to Gluster and that in my opinion could help increase performance 
> is Distributed-Striped volumes, but that still doesn't work for all environments.
> So as it stands, it's fine for multimedia or archive files; for other usages I wouldn't 
> bet my chips and would rather test thoroughly first.
>
> -Original Message-
> From: Brian Candler [mailto:b.cand...@pobox.com]
> Sent: 13 August 2012 11:00
> To: Fernando Frediani (Qube)
> Cc: 'Ivan Dimitrov'; 'gluster-users@gluster.org'
> Subject: Re: [Gluster-users] Gluster speed sooo slow
>
> On Mon, Aug 13, 2012 at 09:40:49AM +, Fernando Frediani (Qube) wrote:
>> I think Gluster as it stands now and current level of development is
>> more for Multimedia and Archival files, not for small files nor for
>> running Virtual Machines. It requires still a fair amount of
>> development which hopefully RedHat will put in place.
> I know a large ISP is using gluster successfully for Maildir storage - or at 
> least was a couple of years ago when I last spoke to them about it - which 
> means very large numbers of small files.
>
> I think you need to be clear on the difference between throughput and latency.
>
> Any networked filesystem is going to have latency, and gluster maybe suffers 
> more than most because of the FUSE layer at the client.  This will show as 
> poor throughput if a single client is sequentially reading or writing lots of 
> small files, because it has to wait a round trip for each request.
>
> However, if you have multiple clients accessing at the same time, you can 
> still have high total throughput.  This is because the "wasted" time between 
> requests from one client is used to service other clients.
>
> If gluster were to do aggressive client-side caching then it might be able to 
> make responses appear faster to a single client, but this would be at the 
> risk of data loss (e.g.  responding that a file has been committed to disk, 
> when in fact it hasn't).  But this would make no difference to total 
> throughput with multiple clients, which depends on the available bandwidth 
> into the disk drives and across the network.
>
> So it all depends on your overall usage pattern. Only make your judgement 
> based on a single-threaded benchmark if that's what your usage pattern is 
> really going to be like: i.e.  are you really going to have a single user 
> accessing the filesystem, and their application reads or writes one file 
> after the other rather than multiple files concurrently.
>
> Regards,
>
> Brian.
>

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster speed sooo slow

2012-08-13 Thread Fernando Frediani (Qube)
I heard from a large ISP, talking to someone who works there, that they were trying to use 
GlusterFS for Maildir and they had a hell of a time because of the many small files, with 
customers complaining all the time.
Latency is acceptable on a networked filesystem, but the results people are reporting are 
beyond any latency problem; they are due to the way Gluster is structured, and that was 
already confirmed by some people on this list, so changes are indeed needed in the code. 
Even on a Gigabit network the round trip isn't that much really (not more than a quarter of 
a ms), so it shouldn't be a big thing.
Yes, FUSE might also contribute to decreased performance, but still the performance 
problems are in the architecture of the filesystem.
One thing that is new to Gluster and that in my opinion could help increase performance is 
Distributed-Striped volumes (see the sketch below), but that still doesn't work for all 
environments.
So as it stands, it's fine for multimedia or archive files; for other usages I wouldn't bet 
my chips and would rather test thoroughly first.
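For reference, a distributed-striped volume is created along these lines (illustrative 
names; stripe count 2 over four bricks gives a 2 x 2 distribute-stripe layout):

    gluster volume create stripevol stripe 2 transport tcp \
        server1:/bricks/b1 server2:/bricks/b1 server1:/bricks/b2 server2:/bricks/b2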

-Original Message-
From: Brian Candler [mailto:b.cand...@pobox.com] 
Sent: 13 August 2012 11:00
To: Fernando Frediani (Qube)
Cc: 'Ivan Dimitrov'; 'gluster-users@gluster.org'
Subject: Re: [Gluster-users] Gluster speed sooo slow

On Mon, Aug 13, 2012 at 09:40:49AM +, Fernando Frediani (Qube) wrote:
>I think Gluster as it stands now and current level of development is
>more for Multimedia and Archival files, not for small files nor for
>running Virtual Machines. It requires still a fair amount of
>development which hopefully RedHat will put in place.

I know a large ISP is using gluster successfully for Maildir storage - or at 
least was a couple of years ago when I last spoke to them about it - which 
means very large numbers of small files.

I think you need to be clear on the difference between throughput and latency.

Any networked filesystem is going to have latency, and gluster maybe suffers 
more than most because of the FUSE layer at the client.  This will show as poor 
throughput if a single client is sequentially reading or writing lots of small 
files, because it has to wait a round trip for each request.

However, if you have multiple clients accessing at the same time, you can still 
have high total throughput.  This is because the "wasted" time between requests 
from one client is used to service other clients.

If gluster were to do aggressive client-side caching then it might be able to 
make responses appear faster to a single client, but this would be at the risk 
of data loss (e.g.  responding that a file has been committed to disk, when in 
fact it hasn't).  But this would make no difference to total throughput with 
multiple clients, which depends on the available bandwidth into the disk drives 
and across the network.

So it all depends on your overall usage pattern. Only make your judgement based 
on a single-threaded benchmark if that's what your usage pattern is really 
going to be like: i.e.  are you really going to have a single user accessing 
the filesystem, and their application reads or writes one file after the other 
rather than multiple files concurrently.

Regards,

Brian.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] Problem with too many small files

2012-08-13 Thread Fernando Frediani (Qube)
I am not sure how it works in Gluster, but to mitigate the problem of listing a lot of 
small files, wouldn't it be suitable to keep a copy of the directory tree on every node? I 
think Isilon does that, and there is probably a lot to be learned from them, as it seems 
quite a mature technology. Another interesting thing could be added in the future: local 
SSDs to keep the filesystem metadata for faster access.

Regards,
Fernando Frediani
Lead Systems Engineer

260-266 Goswell Road, London, EC1V 7EB, United Kingdom
sales: +44 (0) 20 7150 3800
ddi: +44 (0) 20 7150 3803
fax:+44 (0) 20 7336 8420
web:   http://www.qubenet.net/

Qube Managed Services Limited
Company Number: 6215769 (Registered in England and Wales)
VAT Registration Number: GB 933 8400 27

This e-mail and the information it contains are confidential.
If you have received this e-mail in error please notify the sender immediately.
You should not copy it for any purpose or disclose its contents to any other 
person.
The contents of this e-mail do not necessarily reflect the views of the company.
E&OE

Please consider the environment before printing this email

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster speed sooo slow

2012-08-13 Thread Fernando Frediani (Qube)
I think Gluster, at its current level of development, is more suited to multimedia and 
archival files, not to small files nor to running virtual machines. It still requires a 
fair amount of development, which hopefully Red Hat will put in place.

Fernando

From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Ivan Dimitrov
Sent: 13 August 2012 08:33
To: gluster-users@gluster.org
Subject: Re: [Gluster-users] Gluster speed sooo slow

There is a big difference between working with small files (around 16kb) and big files 
(2mb). Performance is much better with big files. Which is too bad for me ;(

On 8/11/12 2:15 AM, Gandalf Corvotempesta wrote:
What do you mean by "small files"? 16k? 160k? 16mb?
Do you know any workaround or any other software for this?

Me too, I'm trying to create clustered storage for many small files.
2012/8/10 Philip Poten <philip.po...@gmail.com>:
Hi Ivan,

that's because Gluster has really bad "many small files" performance
due to its architecture.

On all stat() calls (which rsync is doing plenty of), all replicas are
being checked for integrity.

regards,
Philip

2012/8/10 Ivan Dimitrov <dob...@amln.net>:
> So I stopped a node to check the BIOS and after it went up, the rebalance
> kicked in. I was looking for those kind of speeds on a normal write. The
> rebalance is much faster than my rsync/cp.
>
> https://dl.dropbox.com/u/282332/Screen%20Shot%202012-08-10%20at%202.04.09%20PM.png
>
> Best Regards
> Ivan Dimitrov
>
>
> On 8/10/12 1:23 PM, Ivan Dimitrov wrote:
>>
>> Hello
>> What am I doing wrong?!?
>>
>> I have a test setup with 4 identical servers with 2 disks each in
>> distribute-replicate 2. All servers are connected to a GB switch.
>>
>> I am experiencing really slow speeds at anything I do. Slow write, slow
>> read, not to mention random write/reads.
>>
>> Here is an example:
>> random-files is a directory with 32768 files with average size 16kb.
>> [root@gltclient]:~# rsync -a /root/speedtest/random-files/
>> /home/gltvolume/
>> ^^ This will take more than 3 hours.
>>
>> On any of the servers if I do "iostat" the disks are not loaded at all:
>>
>> https://dl.dropbox.com/u/282332/Screen%20Shot%202012-08-10%20at%201.08.54%20PM.png
>>
>> This is similar result for all servers.
>>
>> Here is an example of simple "ls" command on the content.
>> [root@gltclient]:~# unalias ls
>> [root@gltclient]:~# /usr/bin/time -f "%e seconds" ls /home/gltvolume/ | wc
>> -l
>> 2.81 seconds
>> 5393
>>
>> almost 3 seconds to display 5000 files?!?! When they are 32,000, the ls
>> will take around 35-45 seconds.
>>
>> This directory is on local disk:
>> [root@gltclient]:~# /usr/bin/time -f "%e seconds" ls
>> /root/speedtest/random-files/ | wc -l
>> 1.45 seconds
>> 32768
>>
>> [root@gltclient]:~# /usr/bin/time -f "%e seconds" cat /home/gltvolume/*
>> >/dev/null
>> 190.50 seconds
>>
>> [root@gltclient]:~# /usr/bin/time -f "%e seconds" du -sh /home/gltvolume/
>> 126M/home/gltvolume/
>> 75.23 seconds
>>
>>
>> Here is the volume information.
>>
>> [root@glt1]:~# gluster volume info
>>
>> Volume Name: gltvolume
>> Type: Distributed-Replicate
>> Volume ID: 16edd852-8d23-41da-924d-710b753bb374
>> Status: Started
>> Number of Bricks: 4 x 2 = 8
>> Transport-type: tcp
>> Bricks:
>> Brick1: 1.1.74.246:/home/sda3
>> Brick2: glt2.network.net:/home/sda3
>> Brick3: 1.1.74.246:/home/sdb1
>> Brick4: glt2.network.net:/home/sdb1
>> Brick5: glt3.network.net:/home/sda3
>> Brick6: gltclient.network.net:/home/sda3
>> Brick7: glt3.network.net:/home/sdb1
>> Brick8: gltclient.network.net:/home/sdb1
>> Options Reconfigured:
>> performance.io-thread-count: 32
>> performance.cache-size: 256MB
>> cluster.self-heal-daemon: on
>>
>>
>> [root@glt1]:~# gluster volume status all detail
>> Status of volume: gltvolume
>>
>> --
>> Brick: Brick 1.1.74.246:/home/sda3
>> Port : 24009
>> Online   : Y
>> Pid  : 1479
>> File System  : ext4
>> Device   : /dev/sda3
>> Mount Options: rw,noatime
>> Inode Size   : 256
>> Disk Space Free  : 179.3GB
>> Total Disk Space : 179.7GB
>> Inode Count  : 11968512
>> Free Inodes  : 11901550
>>
>> --
>> Brick: Brick glt2.network.net:/home/sda3
>> Port : 24009
>> Online   : Y
>> Pid  : 1589
>> File System  : ext4
>> Device   : /dev/sda3
>> Mount Options: rw,noatime
>> Inode Size   : 256
>> Disk Space Free  : 179.3GB
>> Total Disk Space : 179.7GB
>> Inode Count  : 11968512
>> Free Inodes  : 11901550
>>
>> --
>> Brick: Brick 1.1.74.246:/home/sdb1
>> Port   

Re: [Gluster-users] gluster taking a lot of CPU and crashes database at times

2012-08-08 Thread Fernando Frediani (Qube)
I also don't think a database and GlusterFS are a good combination, because of the related 
performance issues, unless performance is the last thing to be concerned about or the 
database is so small that it doesn't make a difference.

From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Joe Julian
Sent: 08 August 2012 17:55
To: gluster-users@gluster.org
Subject: Re: [Gluster-users] gluster taking a lot of CPU and crashes database 
at times

On Wed, Aug 8, 2012 at 4:30 PM, Philip Poten <philip.po...@gmail.com> wrote:
Hey,

running postgres (or any database) on a gluster share is an extremely
bad idea. This cannot and will not end well, no matter what you do.

I still don't see why people keep saying this. I've been running mysql on a 
GlusterFS volume since the 2.0 days. I know Avati  agrees with you though 
(though I keep trying to convince him otherwise).

The only problem I've ever had was with creation or alteration of MyISAM files 
as they create a temporary filename, then rename it. This often causes an error 
as the rename (apparently) hasn't completed before it tries to open again (a 
bug that still seems to exist in 3.3.0).

InnoDB files can actually be quite efficient on a distributed volume if you 
create sufficient file segments to be distributed across subvolumes.
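As a sketch of what I mean by file segments (my.cnf fragment; the sizes are arbitrary):

    [mysqld]
    # several fixed-size ibdata segments, so DHT can place them on different bricks
    innodb_data_file_path = ibdata1:2G;ibdata2:2G;ibdata3:2G;ibdata4:2G:autoextend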

The only real problem would be if someone thought they could run multiple 
instances of the database server. Regardless of what filesystem they're on, 
relational database engines are not built to be able to do that.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster v 3.3 with KVM and High Availability

2012-07-12 Thread Fernando Frediani (Qube)
Folks,
Gluster is not ready to run virtual machines at all. Yes, you can build a 2-node cluster 
and live-migrate machines, but the performance is poor and they still need to do a lot of 
work on it.
I wouldn't put even a cluster of low-performance web server VMs into production until this 
is solved. For archive or general multimedia storage maybe, but not to run VMs.
Perhaps someone is intending to integrate with RHEV (it seems they are, as it's going to be 
in oVirt 3.1 now), so they will put more effort into solving this problem, which 10 out of 
10 of those who tested it are reporting.

Fernando

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Mark Nipper
Sent: 12 July 2012 09:40
To: Brian Candler
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] Gluster v 3.3 with KVM and High Availability

On 12 Jul 2012, Brian Candler wrote:
> And I forgot to add: since a KVM VM is a userland process anyway, I'd 
> expect a big performance gain when KVM gets the ability to talk to 
> libglusterfs to send its disk I/O directly, without going through a 
> kernel mount (and hence bypassing the kernel cache). It looks like this is 
> being developed now:
> http://lists.gnu.org/archive/html/qemu-devel/2012-06/msg01745.html
> You can see the performance figures at the bottom of that post.

Something concerns me about those performance figures.
If I'm reading them correctly, the normal FUSE mount performance is about what
I was seeing, 2-3MB/s.  And now, bypassing everything, libglusterfs is still
topping out at a little under 20MB/s.

So am I kidding myself that approaching 45-50MB/s with a FUSE based 
Gluster mount and using cache=writethrough is actually a safe thing to do 
really?  I know the performance is abysmal without setting the cache mode, but 
is using writethrough really safe, or is it a recipe for disaster waiting to 
happen?

--
Mark Nipper
ni...@bitgnome.net (XMPP)
+1 979 575 3193
-
I cannot tolerate intolerant people.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] "Granular locking" - does this need to be enabled in 3.3.0 ?

2012-07-09 Thread Fernando Frediani (Qube)
Jake,

I haven't had a chance to test with my KVM cluster yet, but it should be a
default from 3.3 onwards.
Just bear in mind that running Virtual Machines is NOT a supported workload for
Red Hat Storage Server according to Red Hat sales people; they said support is
expected towards the end of the year. As you might have observed, performance,
especially for writes, isn't anywhere near fantastic.

Fernando

From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Christian Wittwer
Sent: 09 July 2012 15:51
To: Jake Grimmett
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] "Granular locking" - does this need to be enabled 
in 3.3.0 ?

Hi Jake
I can confirm exactly the same behaviour with gluster 3.3.0 on Ubuntu 12.04.
During the self-heal process the VM hits 100% I/O wait and is locked up.
After the self-heal the root filesystem was read-only, which forced me to do a
reboot and fsck.

Cheers,
Christian
2012/7/9 Jake Grimmett <j...@mrc-lmb.cam.ac.uk>
Dear All,

I have a pair of Scientific Linux 6.2 servers, acting as KVM virtualisation
hosts for ~30 VMs. The VM images are stored in a replicated gluster volume
shared between the two servers. Live migration works fine, and the sanlock
prevents me from (stupidly) starting the same VM on both machines. Each server
has 10Gb Ethernet and a 10-disk RAID5 array.

If I migrate all the VMs to server #1 and shut down server #2, all works
perfectly with no interruption. When I restart server #2, the VMs freeze while
the self-heal process is running - and this healing can take a long time.

I'm not sure if "Granular Locking" is on. It's listed as a "technology preview" 
in the Redhat Storage server 2 notes - do I need to do anything to enable it?

i.e. set "cluster.data-self-heal-algorithm" to diff ?
or edit "cluster.self-heal-window-size" ?

any tips from other people doing similar much appreciated!

Many thanks,

Jake

jog <---at---> mrc-lmb.cam.ac.uk
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users



Re: [Gluster-users] Gluster 3.3 and FS

2012-07-02 Thread Fernando Frediani (Qube)
Hi Mitsue,

I have a report from a colleague whose company was using Gluster for storing
mailboxes, and they ended up with many issues. I think it was something related
to the number of small files, a workload which is known to not perform well.
Needless to say, test the performance, especially the write performance, before
putting it into production.

Fernando

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Mitsue Acosta Murakami
Sent: 29 June 2012 21:43
To: gluster-users@gluster.org
Subject: [Gluster-users] Gluster 3.3 and FS

Hello,

We are planning to use Gluster 3.3 and NFS with Geo-replication. We wonder what 
would be the best FS : ext3, ext4 or xfs?

We are going to build a storage for mailboxes files.

Any advice would be very much appreciated.


Regards,

--
Mitsue Acosta Murakami


___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] about HA infrastructure for hypervisors

2012-06-28 Thread Fernando Frediani (Qube)
It would be interesting if it could read in a round-robin manner from wherever
the data is held. If the local storage is too busy (and therefore showing
higher latency), it would be good to read some of the data from another, quiet
node which, even over the network, could provide better latency. In short,
distribute the I/O load across the whole cluster when it holds multiple copies
of the data.
What are others' opinions on that?

Regards,

Fernando

-Original Message-
From: Tim Bell [mailto:tim.b...@cern.ch] 
Sent: 28 June 2012 11:55
To: Fernando Frediani (Qube); 'Nicolas Sebrecht'; 'Thomas Jackson'
Cc: 'gluster-users'
Subject: RE: [Gluster-users] about HA infrastructure for hypervisors


Assuming that we use a 3 copy approach across the hypervisors, does Gluster
favour the local copy on the hypervisor if the data is on
distributed/replicated ? 

It would be good to avoid the network hop when the data is on the local
disk.

Tim

> -Original Message-
> From: gluster-users-boun...@gluster.org [mailto:gluster-users-
> boun...@gluster.org] On Behalf Of Fernando Frediani (Qube)
> Sent: 28 June 2012 11:43
> To: 'Nicolas Sebrecht'; 'Thomas Jackson'
> Cc: 'gluster-users'
> Subject: Re: [Gluster-users] about HA infrastructure for hypervisors
> 
> You should indeed use the same server that runs as a storage brick as a
> KVM host to maximize hardware and power usage. The only thing I am not sure
> about is whether you can limit the amount of host memory Gluster can eat, so
> that most of it stays reserved for the Virtual Machines.
> 
> Fernando
> 
> -Original Message-
> From: gluster-users-boun...@gluster.org [mailto:gluster-users-
> boun...@gluster.org] On Behalf Of Nicolas Sebrecht
> Sent: 28 June 2012 10:31
> To: Thomas Jackson
> Cc: 'gluster-users'
> Subject: [Gluster-users] Re: about HA infrastructure for hypervisors
> 
> The 28/06/12, Thomas Jackson wrote:
> 
> > Why don't you have KVM running on the Gluster bricks as well?
> 
> Good point. While abstracting we decided to separate KVM & Gluster but I
> can't remember why.
> We'll think about that again.
> 
> > We have a 4 node cluster (each with 4x 300GB 15k SAS drives in
> > RAID10), 10 gigabit SFP+ Ethernet (with redundant switching). Each
> > node participates in a distribute+replicate Gluster namespace and runs
> > KVM. We found this to be the most efficient (and fastest) way to run the
> cluster.
> >
> > This works well for us, although (due to Gluster using fuse) it isn't
> > as fast as we would like. Currently waiting for the KVM driver that
> > has been discussed a few times recently, that should make a huge
> > difference to the performance for us.
> 
> Ok! Thanks.
> 
> --
> Nicolas Sebrecht
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] about HA infrastructure for hypervisors

2012-06-28 Thread Fernando Frediani (Qube)
Interesting info Brian,
I am surprised by this actually. I would always have expected 10Gig to have
very good, low latency. Obviously I wouldn't expect copper to be exactly the
same as fibre due to the losses, but not far behind either.

Please share any future results you get, as it's quite valuable information for
people designing their systems.

Regards,

Fernando

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Brian Candler
Sent: 28 June 2012 11:33
To: Nicolas Sebrecht
Cc: gluster-users
Subject: Re: [Gluster-users] about HA infrastructure for hypervisors

On Thu, Jun 28, 2012 at 11:25:20AM +0200, Nicolas Sebrecht wrote:
> We excluded ethernet due to searches on the web. It appeared that 
> ethernet has bad latency.

I read it on the web, it must be true :-)

As I said: don't use 10GE over CAT6/RJ45 (10Gbase-T). That does indeed have 
poor latency.  As I understand it: in order to get such high data rates over 
copper, it has to employ mechanisms similar to DSL lines, like interleaving, 
which means 10G has comparable or even higher latency than 1G.  Switches with 
all 10Gbase-T ports are expensive, only available from a couple of vendors, and 
consume a lot of power.

However switches with SFP+ ports don't have these problems. For short reach you 
can use SFP+ direct attach cables, and for long reach use fibre.

http://en.wikipedia.org/wiki/10-gigabit_Ethernet

I have been working with Intel X520-DA2 NICs and a Netgear XSM7224S switch, and 
direct-attach cables (3m Netgear AXC763, 5m Intel)

This all works fine, although with older versions of Linux I had to build the 
latest Intel drivers from their website to fix problems with the links going 
down every day or two.

Regards,

Brian.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] about HA infrastructure for hypervisors

2012-06-28 Thread Fernando Frediani (Qube)
You should indeed use the same server that runs as a storage brick as a KVM
host to maximize hardware and power usage. The only thing I am not sure about
is whether you can limit the amount of host memory Gluster can eat, so that
most of it stays reserved for the Virtual Machines.
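
One knob that may help cap at least the client-side read cache is the io-cache
size; a sketch, assuming a volume called myvol (whether this is enough to bound
total memory use is not something I can confirm):

gluster volume set myvol performance.cache-size 256MB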

Fernando

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Nicolas Sebrecht
Sent: 28 June 2012 10:31
To: Thomas Jackson
Cc: 'gluster-users'
Subject: [Gluster-users] Re: about HA infrastructure for hypervisors

The 28/06/12, Thomas Jackson wrote:

> Why don't you have KVM running on the Gluster bricks as well?

Good point. While abstracting we decided to separate KVM & Gluster, but I
can't remember why.
We'll think about that again.

> We have a 4 node cluster (each with 4x 300GB 15k SAS drives in 
> RAID10), 10 gigabit SFP+ Ethernet (with redundant switching). Each 
> node participates in a distribute+replicate Gluster namespace and runs 
> KVM. We found this to be the most efficient (and fastest) way to run the 
> cluster.
> 
> This works well for us, although (due to Gluster using fuse) it isn't 
> as fast as we would like. Currently waiting for the KVM driver that 
> has been discussed a few times recently, that should make a huge 
> difference to the performance for us.

Ok! Thanks.

--
Nicolas Sebrecht
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster 3.3.0 and VMware ESXi 5

2012-06-26 Thread Fernando Frediani (Qube)
Yeah, that's quite annoying in 3.3.
I've run into the same problem when trying to re-create a volume made of the
same disks; I couldn't get rid of it even using these commands, so I ended up
replacing the physical disks and re-creating the partition.
Even when you do a delete on the volume it seems to run into the same trouble.

Fernando

From: Simon Blackstein [mailto:si...@blackstein.com]
Sent: 26 June 2012 18:01
To: Anand Avati
Cc: Fernando Frediani (Qube); gluster-users@gluster.org
Subject: Re: [Gluster-users] Gluster 3.3.0 and VMware ESXi 5

Honestly, I've been trying to reset this volume completely to see if the error
was transient, but I'm now getting the '/gfs or a prefix of it is already part
of a volume' message even after removing the attributes from the directory...
lots of changes in this version to watch out for :(


setfattr -x trusted.gfid /gfs

setfattr -x trusted.glusterfs.volume-id /gfs

setfattr -x trusted.afr.gfs-vdi-client-0 /gfs

setfattr -x trusted.afr.gfs-vdi-client-1 /gfs

setfattr -x trusted.glusterfs.dht /gfs
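
In 3.3 clearing the xattrs alone is often not enough, because the brick also
carries a .glusterfs metadata directory. A hedged sketch of the usual cleanup,
destructive and with /gfs used purely as an example path:

setfattr -x trusted.glusterfs.volume-id /gfs
setfattr -x trusted.gfid /gfs
rm -rf /gfs/.glusterfs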

Many Rgds,

Simon

On Tue, Jun 26, 2012 at 8:28 AM, Anand Avati <anand.av...@gmail.com> wrote:
Is this at the same 'time' as  before (at the time for VM boot), or does it 
actually progress a little more (i.e, "start booting") and then throw up? It 
will be helpful if we move this discussion to bugzilla and you provide trace 
logs.

Avati
On Tue, Jun 26, 2012 at 7:59 AM, Simon Blackstein <si...@blackstein.com> wrote:
Hi Avati,

Thanks. I just tried a recompile (I'd installed from RPM before) and brought up 
the volume again. I now get a similar but different message:

An unexpected error was received from the ESX host while powering on VM 
vm-26944.
Failed to power on VM.
Could not power on VM : Not found.
Failed to create swap file '/gfs/gfs-test1/./gfs-test1-d27c6ac2.vswp' : Not 
found

I only modified the first brick node which is where I'm pointing the ESXi 
server as NFS client. Do I need to modify all nodes like this?

Many Thanks!

Simon

On Tue, Jun 26, 2012 at 4:14 AM, Anand Avati <anand.av...@gmail.com> wrote:
Fernando,
  Yes, to try the patch you need to install from source. We will include the 
patch in the next release if you need RPMs.

Avati
On Tue, Jun 26, 2012 at 3:02 AM, Fernando Frediani (Qube) <fernando.fredi...@qubenet.net> wrote:
Hi Avati,


How am I supposed to apply the patch if I have installed the RPM version?
Should I have an installation compiled from source instead?



Regards,

Fernando

From: gluster-users-boun...@gluster.org
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Anand Avati
Sent: 26 June 2012 04:00
To: Simon
Cc: gluster-users@gluster.org

Subject: Re: [Gluster-users] Gluster 3.3.0 and VMware ESXi 5

Simon - can you please try this patch: http://review.gluster.com/3617

Thanks,
Avati
On Mon, Jun 25, 2012 at 7:13 PM, Simon <si...@blackstein.com> wrote:
I'm having the same error deploying a green field ESXi 5.0 farm against
GlusterFS 3.3. Can provision a VM but can't start it with the identical error:

An unexpected error was received from the ESX host while powering on VM 
vm-26941.
Failed to power on VM.
Unable to retrieve the current working directory: 0 (No such file or directory).
Check if the directory has been deleted or unmounted.
Unable to retrieve the current working directory: 0 (No such file or directory).
Check if the directory has been deleted or unmounted.
Is GlusterFS supported against VMware or should I be looking somewhere else?

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users





___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster 3.3.0 and VMware ESXi 5

2012-06-26 Thread Fernando Frediani (Qube)
Hi Avati,


How am I supposed to apply the patch if I have installed the RPM version?
Should I have an installation compiled from source instead?



Regards,

Fernando

From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Anand Avati
Sent: 26 June 2012 04:00
To: Simon
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] Gluster 3.3.0 and VMware ESXi 5

Simon - can you please try this patch: http://review.gluster.com/3617

Thanks,
Avati
On Mon, Jun 25, 2012 at 7:13 PM, Simon <si...@blackstein.com> wrote:
I'm having the same error deploying a green field ESXi 5.0 farm against
GlusterFS 3.3. Can provision a VM but can't start it with the identical error:

An unexpected error was received from the ESX host while powering on VM 
vm-26941.
Failed to power on VM.
Unable to retrieve the current working directory: 0 (No such file or directory).
Check if the directory has been deleted or unmounted.
Unable to retrieve the current working directory: 0 (No such file or directory).
Check if the directory has been deleted or unmounted.
Is GlusterFS supported against VMware or should I be looking somewhere else?

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Can't run KVM Virtual Machines on a Gluster volume

2012-06-25 Thread Fernando Frediani (Qube)
Hi,
Thanks for the reply. It was actually the bloody SELinux.
Disabling it allowed the VM to be powered on.

I am still testing the write performance with both qcow2 and raw, with
cache=none and writeback, and as far as I can tell it's not good at all.
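
A crude way to compare the cache modes is a sync-bounded dd run both on the
mounted volume and inside the guest; a sketch, with the path as an example only:

dd if=/dev/zero of=/virtual-machines/ddtest.bin bs=1M count=500 conv=fdatasync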

Fernando

-Original Message-
From: Brian Candler [mailto:b.cand...@pobox.com] 
Sent: 25 June 2012 08:57
To: Fernando Frediani (Qube)
Cc: 'gluster-users@gluster.org'
Subject: Re: [Gluster-users] Can't run KVM Virtual Machines on a Gluster volume

On Sat, Jun 23, 2012 at 08:16:47PM +0000, Fernando Frediani (Qube) wrote:
>I just built a 2 node(4 bricks), Distributed-Replicated and everything
>mounts fine.
> 
>Each node mounts using GlusterFS client on its hostname (mount –t
>glusterfs hostname:VOLUME /virtual-machines)
> 
>When creating a new Virtual Machine using virt-manager it creates the
>file on the storage, but when trying to power it On, it doesn’t work
>and gives back an error message.(See below. Yes the folder has full
>permission to All to write.)

Failing to access something, when the something has permissions, usually 
implies a problem with apparmor / SElinux.

If this is an Ubuntu platform, then use libvirt to start the VM (i.e. define it 
using an XML file), and it should automatically create an apparmor policy 
dynamically.  Otherwise, try turning off apparmor / SElinux temporarily to see 
whether or that's the problem.  You could also try mounting your volume 
directly onto /var/lib/libvirt/images.
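
If SELinux does turn out to be the culprit, a quick check/workaround sketch,
assuming a RHEL/CentOS host; the virt_use_fusefs boolean may or may not exist
in your policy version:

getenforce                        # is SELinux enforcing?
setenforce 0                      # temporarily switch to permissive for a test
setsebool -P virt_use_fusefs on   # if present, lets qemu use FUSE-backed images without disabling SELinux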

>Has anyone actually was able to run it fine  on Gluster 3.3

I have. Ubuntu 12.04.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Can't run KVM Virtual Machines on a Gluster volume

2012-06-25 Thread Fernando Frediani (Qube)
Which logs are you talking about?
I've posted the logs from virt-manager running on one of the nodes. It runs as
root.
Also, I've tried the same thing on local storage and it works fine.

Fernando

-Original Message-
From: Vijay Bellur [mailto:vi...@gluster.com] 
Sent: 24 June 2012 05:29
To: Fernando Frediani (Qube)
Cc: 'gluster-users@gluster.org'
Subject: Re: [Gluster-users] Can't run KVM Virtual Machines on a Gluster volume

On 06/23/2012 04:16 PM, Fernando Frediani (Qube) wrote:
> I just built a 2 node(4 bricks), Distributed-Replicated and everything 
> mounts fine.
>
> Each node mounts using GlusterFS client on its hostname (mount -t 
> glusterfs hostname:VOLUME /virtual-machines)
>
> When creating a new Virtual Machine using virt-manager it creates the 
> file on the storage, but when trying to power it On, it doesn't work 
> and gives back an error message.(See below. Yes the folder has full 
> permission to All to write.)
>

Can you please post the complete client and server log files? What user is the 
virt-manager running as?

-Vijay
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Can't run KVM Virtual Machines on a Gluster volume

2012-06-23 Thread Fernando Frediani (Qube)
Thanks for the reply Adam, but I've also tried raw and it had the exact same
error when trying to power on the machine. Does it make any difference whether
it's IDE or VirtIO, and whether cache is none or writethrough?

Regards,

Fernando

From: Adam Tygart [mo...@k-state.edu]
Sent: 24 June 2012 00:14
To: Fernando Frediani (Qube)
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] Can't run KVM Virtual Machines on a Gluster volume

Fernando,

The qcow2 disk format requires the file to be opened with O_DIRECT.
This is unsupported in FUSE with a linux kernel below 3.4. You can use
the raw disk format, or use https://github.com/avati/liboindirect to
catch opens with O_DIRECT and change them to O_SYNC.
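
For the raw-format route, a sketch of converting an existing image; the paths
are examples only:

qemu-img convert -f qcow2 -O raw /virtual-machines/Ubuntu_12.img /virtual-machines/Ubuntu_12.raw
# then point the VM's disk at the .raw file and set the driver type to raw in virt-manager/libvirt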

--
Adam Tygart

On Sat, Jun 23, 2012 at 3:16 PM, Fernando Frediani (Qube)
 wrote:
> I just built a 2 node(4 bricks), Distributed-Replicated and everything
> mounts fine.
>
> Each node mounts using GlusterFS client on its hostname (mount –t glusterfs
> hostname:VOLUME /virtual-machines)
>
> When creating a new Virtual Machine using virt-manager it creates the file
> on the storage, but when trying to power it On, it doesn’t work and gives
> back an error message.(See below. Yes the folder has full permission to All
> to write.)
>
>
>
> Has anyone actually was able to run it fine  on Gluster 3.3 ? I get the same
> results on VMware ESXi, but I thought that as for KVM using GlusterFS client
> that would work fine and better. If I mount the volume as NFS the exact same
> thing happens.
>
> Quiet frustrating !!!
>
>
>
> Fernando
>
>
>
> Unable to complete install: 'internal error Process exited while reading
> console log output: char device redirected to /dev/pts/1
>
> qemu-kvm: -drive
> file=/virtual-machines/Ubuntu_12.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none:
> could not open disk image /virtual-machines/Ubuntu_12.img: Permission denied
>
> '
>
>
>
> Traceback (most recent call last):
>
>   File "/usr/share/virt-manager/virtManager/asyncjob.py", line 44, in
> cb_wrapper
>
> callback(asyncjob, *args, **kwargs)
>
>   File "/usr/share/virt-manager/virtManager/create.py", line 1903, in
> do_install
>
> guest.start_install(False, meter=meter)
>
>   File "/usr/lib/python2.6/site-packages/virtinst/Guest.py", line 1223, in
> start_install
>
> noboot)
>
>   File "/usr/lib/python2.6/site-packages/virtinst/Guest.py", line 1291, in
> _create_guest
>
> dom = self.conn.createLinux(start_xml or final_xml, 0)
>
>   File "/usr/lib64/python2.6/site-packages/libvirt.py", line 2064, in
> createLinux
>
> if ret is None:raise libvirtError('virDomainCreateLinux() failed',
> conn=self)
>
> libvirtError: internal error Process exited while reading console log
> output: char device redirected to /dev/pts/1
>
> qemu-kvm: -drive
> file=/virtual-machines/Ubuntu_12.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none:
> could not open disk image /virtual-machines/Ubuntu_12.img: Permission denied
>
>
>
>
>
> Regards,
>
>
>
> Fernando Frediani
>
>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] Can't run KVM Virtual Machines on a Gluster volume

2012-06-23 Thread Fernando Frediani (Qube)
I just built a 2-node (4-brick) Distributed-Replicated setup and everything
mounts fine.
Each node mounts using the GlusterFS client on its own hostname (mount -t
glusterfs hostname:VOLUME /virtual-machines).
When creating a new Virtual Machine using virt-manager it creates the file on
the storage, but when trying to power it on, it doesn't work and gives back an
error message. (See below. Yes, the folder has full write permission for all.)

Has anyone actually been able to run it fine on Gluster 3.3? I get the same
results on VMware ESXi, but I thought that KVM using the GlusterFS client would
work fine, or better. If I mount the volume as NFS the exact same thing
happens.
Quite frustrating!

Fernando

Unable to complete install: 'internal error Process exited while reading 
console log output: char device redirected to /dev/pts/1
qemu-kvm: -drive 
file=/virtual-machines/Ubuntu_12.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none:
 could not open disk image /virtual-machines/Ubuntu_12.img: Permission denied
'

Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 44, in cb_wrapper
callback(asyncjob, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/create.py", line 1903, in do_install
guest.start_install(False, meter=meter)
  File "/usr/lib/python2.6/site-packages/virtinst/Guest.py", line 1223, in 
start_install
noboot)
  File "/usr/lib/python2.6/site-packages/virtinst/Guest.py", line 1291, in 
_create_guest
dom = self.conn.createLinux(start_xml or final_xml, 0)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 2064, in 
createLinux
if ret is None:raise libvirtError('virDomainCreateLinux() failed', 
conn=self)
libvirtError: internal error Process exited while reading console log output: 
char device redirected to /dev/pts/1
qemu-kvm: -drive 
file=/virtual-machines/Ubuntu_12.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none:
 could not open disk image /virtual-machines/Ubuntu_12.img: Permission denied


Regards,

Fernando Frediani


___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Switching clients to NFS

2012-06-22 Thread Fernando Frediani (Qube)
I have seen a few people recently saying they are using NFS instead of the
native Gluster client. I would imagine that the Gluster client would always be
better and faster, besides offering automatic failover, but it makes me wonder
what sort of problems they are experiencing with the Gluster client.
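
For what it's worth, both access methods can usually be mounted side by side on
a test client; a sketch, with server and volume names as placeholders:

mount -t glusterfs server1:/vmvol /mnt/vmvol-native
mount -t nfs -o vers=3,nolock server1:/vmvol /mnt/vmvol-nfs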

Fernando

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Sean Fulton
Sent: 22 June 2012 11:04
To: gluster-users@gluster.org
Subject: Re: [Gluster-users] Switching clients to NFS

I think he is talking about 1 set of servers, having some clients mount via NFS 
and some mount via native gluster.

I did this in a test environment with 3.2.5 and it seemed to work OK. We had 
four nodes initially set up as gluster native and we changed over one by one to 
nfs. The world did not end. But it was in a test environment.

sean

On 06/22/2012 05:59 AM, Rajesh Amaravathi wrote:
> if you are using version 3.3, then you will need a separate client. you 
> cannot mount any other nfs exports/volumes on glusterfs servers.
> if you are using 3.2.x, then you can mount it on the same servers
>
> Regards,
> Rajesh Amaravathi,
> Software Engineer, GlusterFS
> RedHat Inc.
>
> - Original Message -
> From: "Marcus Bointon" 
> To: "gluster-users Discussion List" 
> Sent: Friday, June 22, 2012 3:23:43 PM
> Subject: [Gluster-users] Switching clients to NFS
>
> I'm looking at switching some clients from native gluster to NFS. Any advice 
> on how to do this as transparently as possible? Can both mounts be used at 
> the same time (so I can test NFS before switching)? I'm on a vanilla 2-way 
> AFR config where both clients are also servers.
>
> Marcus

--
Sean Fulton
GCN Publishing, Inc.
Internet Design, Development and Consulting For Today's Media Companies 
http://www.gcnpublishing.com
(203) 665-6211, x203



___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] RAID options for Gluster

2012-06-18 Thread Fernando Frediani (Qube)
Disks are generally cheap, but there are other things that people normally
don't think about, like the power to run them, the space they consume in an
array, etc. Using RAID 10 in a Gluster environment is, I think, a total waste
of space, as it will give you only 1/4 of the raw capacity. I'm of the opinion
that if you need better performance and total space is less of a concern, you
would use another solution with RAID 10 (pure XFS, XFS+DRBD, etc.).
Given that Gluster is not for high-performance (low-latency) applications,
RAID 5 seems to be a good option: you still have the ability to replace a disk
if it fails, and even in the very unlikely event that you lose an entire array,
you will still have a copy of the data somewhere else. Hopefully for reads
Gluster should be able to read round-robin from both copies to improve
throughput.
Unfortunately Gluster doesn't yet have the ability to be aware of bricks on the
same node and avoid putting replicas there. That would allow creating more than
one RAID set on the same server and avoid going over 12-16 disks in a RAID 5
array.

Fernando

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Arnold Krille
Sent: 15 June 2012 23:10
To: gluster-users@gluster.org
Subject: Re: [Gluster-users] RAID options for Gluster

Gotta wear my BAARF-hat:

On 15.06.2012 12:14, Fernando Frediani (Qube) wrote:
> Going to the idea of using RAID controllers would you think that for say 16 
> disks(or 12) Raid 5 would be fine  given the data is already replicated 
> somewhere in another node in a very unlikely event you loose a node.
> Now in a node with more number of disk slots could create multiple Raid 5 
> logical volumes, but will Gluster be smart enough to not put replicated data 
> on two logical volumes residing on the same node ?

Using raid5 will just leave you reading from at least two disks and then
writing to two disks instead of just writing your data to disk.
Unless write performance is of no interest to you, you should re-think raid5...

If disks are pricey for you, just use all of them individually and deal with
failed bricks. If disks are cheap, just always put two together in a raid1.

Have fun,

Arnold

(*) http://www.miracleas.com/BAARF/
--
Dieses Email wurde elektronisch erstellt und ist ohne handschriftliche 
Unterschrift gültig.

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster 3.3.0 and VMware ESXi 5

2012-06-15 Thread Fernando Frediani (Qube)
Was there any clue in that amount of logs as to why a Virtual Machine can't be
powered on using VMware?
Is it an NFS-related problem?

Fernando

-Original Message-
From: Fernando Frediani (Qube) 
Sent: 14 June 2012 10:02
To: 'Tomoaki Sato'; 'gluster-users@gluster.org'
Subject: RE: [Gluster-users] Gluster 3.3.0 and VMware ESXi 5

Hi,

This logs way too much data too quickly, so I have cut out the part of nfs.log
covering the time I tried to power on the VM. Find it attached.

Regards,

Fernando

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Tomoaki Sato
Sent: 14 June 2012 01:13
To: gluster-users@gluster.org
Subject: Re: [Gluster-users] Gluster 3.3.0 and VMware ESXi 5

Could you try "gluster volume set  
diagnostics.client-log-level TRACE" ?

Tomo Sato

(2012/06/14 0:19), Fernando Frediani (Qube) wrote:
> Hi,
>
> I don't see anything on the nfs log files when watching 
> /var/log/glusterfs/nfs.log and trying to power on the machine at the same 
> time. On the Glusterd logs I don't see anything as well.
> Anywhere else to check that it should be logging to ?
>
> Fernando
>
> -Original Message-
> From: Vijay Bellur [mailto:vbel...@redhat.com]
> Sent: 11 June 2012 17:54
> To: Fernando Frediani (Qube)
> Cc: 'Atha Kouroussis'; 'gluster-users@gluster.org'; Rajesh Amaravathi; 
> Krishna Srinivas
> Subject: Re: [Gluster-users] Gluster 3.3.0 and VMware ESXi 5
>
> On 06/11/2012 05:52 PM, Fernando Frediani (Qube) wrote:
>> Was doing some read on RedHat website and found this URL which I wonder if 
>> the problem would have anything to do with this:
>> http://docs.redhat.com/docs/en-US/Red_Hat_Storage_Software_Appliance/3
>> .2/html/User_Guide/ch14s04s08.html
>>
>> Although both servers and client are 64 I wonder if somehow this could be 
>> related as it seems the closest thing I could think of.
>>
>> The error I get when trying to power up a VM is:
>>
>> An unexpected error was received from the ESX host while powering on VM 
>> vm-21112.
>> Failed to power on VM.
>> Unable to retrieve the current working directory: 0 (No such file or 
>> directory). Check if the directory has been deleted or unmounted.
>> Unable to retrieve the current working directory: 0 (No such file or 
>> directory). Check if the directory has been deleted or unmounted.
>> Unable to retrieve the current working directory: 0 (No such file or 
>> directory). Check if the directory has been deleted or unmounted.
>>
>>
>
> Can you please post nfs log file from the Gluster server that you are trying 
> to mount from?
>
> Thanks,
> Vijay
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] RAID options for Gluster

2012-06-15 Thread Fernando Frediani (Qube)
Right, it seems that using individual disks without RAID, although possible,
isn't a good idea because disk replacement cannot be automated. There would
also be a problem with the maximum file size.

Going back to the idea of using RAID controllers: do you think that for, say,
16 disks (or 12) RAID 5 would be fine, given that the data is already
replicated somewhere on another node in the very unlikely event you lose a
node?
On a node with more disk slots you could create multiple RAID 5 logical
volumes, but will Gluster be smart enough not to put replicated data on two
logical volumes residing on the same node?

I don't even consider using RAID 10, as that would be a big waste of space:
because the data is already replicated between nodes, replicating it on the
disks as well would drop the usable space to 1/4 of the raw capacity. For
latency-sensitive applications I probably wouldn't use Gluster anyway, but
something else. For hosting non-performance-intensive applications I think
Gluster is fine. Also, in a medium-sized cluster it would give good throughput
when running backups, for example.
But bottom line, the maximum performance you get from a single file is what the
single RAID logical volume where the file resides can do.

Regards,

Fernando

-Original Message-
From: Brian Candler [mailto:b.cand...@pobox.com] 
Sent: 14 June 2012 14:55
To: Fernando Frediani (Qube)
Cc: 'gluster-users@gluster.org'
Subject: Re: [Gluster-users] RAID options for Gluster

On Thu, Jun 14, 2012 at 11:06:32AM +, Fernando Frediani (Qube) wrote:
>No RAID (individual hot swappable disks):
> 
>Each disk is a brick individually (server:/disk1, server:/disk2, etc)
>so no RAID controller is required. As the data is replicated if one
>fail the data must exist in another disk on another node.
> 
>Pros:
> 
>Cheaper to build as there is no cost for a expensive RAID controller.

Except that software (md) RAID is free and works with a HBA.

>Improved performance as writes have to be done only on a single disk
>not in the entire RAID5/6 Array.
> 
>Make better usage of the Raw space as there is no disk for parity on a
>RAID 5/6
> 
> 
>Cons:
> 
>If a failed disk gets replaced the data need to be replicated over the
>network (not a big deal if using Infiniband or 1Gbps+ Network)
> 
>The biggest file size is the size of one disk if using a volume type
>Distributed.

Additional Cons:

* You will probably need to write your own tools to monitor and notify you when 
a disk fails in the array (wherease there are easily-available existing tools 
for md RAID, including E-mail notifications and SNMP integration)

* The process of swapping a disk is not a simple hot-swap: you need to replace 
the failed drive, mkfs a new filesystem, and re-introduce it into the gluster 
volume.  This is something you will need to document procedures for and test 
carefully, whereas RAID swaps are relatively no-brainer.

* For a large configuration with hundreds of drives, it can become ungainly to 
have a gluster volume with hundreds of bricks.

>RAID doesn’t scale well beyond ~16 disks

But you can groups your disks into multiple RAID volumes.

>Attaching a JBOD to a node and creating multiple RAID Arrays(or a
>single server with more disk slots) instead of adding a new node can
>save power(no need CPU, Memory, Motherboard), but having multiple
>bricks on the same node might happen the data is replicated inside the
>same node making the downtime of a node something critical, or does
>Gluster is smart to replicate data to a brick in a different node ?

It's not automatic, you configure it explicitly. If your replica count is 2 
then you give it pairs of bricks, and data will be replicated onto each brick 
in the pair. It's your responsibility to ensure that those two bricks are on 
different servers, if high availability is your concern.
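
A sketch of a replica-2 distributed-replicated layout where each pair spans the
two servers; names and paths are placeholders:

gluster volume create vmvol replica 2 \
    server1:/bricks/b1 server2:/bricks/b1 \
    server1:/bricks/b2 server2:/bricks/b2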

Another alternative to consider: RAID10 on each node. Eliminates the 
performance penalty of RAID5/6, indeed will give you improved read performance 
compared to single disks, but halves your available storage capacity.

You can of course mix-and-match. e.g. RAID5 for backup volumes; RAID10 for 
highly active read/write volumes; some gluster volumes are replicated and some 
are not, etc.  This can become a management headache if it gets too complex 
though.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] RAID options for Gluster

2012-06-14 Thread Fernando Frediani (Qube)
Not really; the discussion in the original email is about how to implement the
storage underneath Gluster, whether or not to use RAID controllers, and how to
make the best use of the resources.
Performance is not mission-critical here, but if certain choices described in
the email give you some extra performance (like running individual disks), that
is always a bonus. The main purpose in that case was to make better use of the
raw space while keeping some level of data resilience.

Fernando

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Marcus Bointon
Sent: 14 June 2012 14:34
To: gluster-users@gluster.org
Subject: Re: [Gluster-users] RAID options for Gluster

On 14 Jun 2012, at 15:22, "Fernando Frediani (Qube)" wrote:

> Well, as far as I know the amount of IOPS you can get from a RAID 5/6 is the 
> same that you get from a single disk. The write can not be acknowledged until 
> it is written to all the data and parity disks.

It can exceed that with battery back-up on the controller. With battery 
back-up, writes are often faster than reads (in all of IOPS, latency and 
throughput), at least until you hit the cache size limit. Sustained writes will 
not get such good performance because of the limit you mention, but random 
writes can still do pretty well, YMMV.

If you want to scale writes properly, you need some variant of RAID-10. I've 
got one server with RAID-10 across 6 SSDs, works well.

Marcus
--
Marcus Bointon
Synchromedia Limited: Creators of http://www.smartmessages.net/ UK info@hand 
CRM solutions mar...@synchromedia.co.uk | http://www.synchromedia.co.uk/



___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] RAID options for Gluster

2012-06-14 Thread Fernando Frediani (Qube)
Well, as far as I know the number of IOPS you can get from a RAID 5/6 is the
same as you get from a single disk: a write cannot be acknowledged until it has
been written to all the data and parity disks.

With regard to scaling beyond 16 disks, yes, it might be possible to create
arrays with more disks; however, that would increase the rebuild time when a
disk gets replaced, and in theory it should decrease performance as there will
be more disks that have to acknowledge each write.

Fernando

From: George Machitidze [mailto:gio...@gmail.com]
Sent: 14 June 2012 14:01
To: Fernando Frediani (Qube)
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] RAID options for Gluster

Hi,

Some corrections...
Cons:
Extra cost of the RAID controller.
Performance of the array is equivalent a single disk + RAID controller caching 
features.
RAID doesn’t scale well beyond ~16 disks

Performance of the array is not equivalent to a single disk and doesn't depend
only on cache size or special features - it depends on the total IOPS, block
sizes, access type, etc.

RAID scales well beyond 16 disks, e.g. for Adaptec. Yes, it will scale, but
whether it is software or hardware, array reconfiguration and growth pose the
same kind of problem - data needs to be reallocated.

Maximum Number of Arrays that can be created on the same set of drives: 4
Maximum Logical Drive Size: 512TB
Maximum Number of Drives in Striped Array (such as RAID 0): 128
Maximum Number of Drives in RAID 5 Array: 32
Maximum Number of Drives in RAID 50 Array: 32
Maximum Number of Drives in RAID 6 Array: 32
Maximum Number of Drives in RAID 60 Array: 32
Available Stripe Sizes for Arrays are 16, 32, 64, 128, 256, 512, or 1024 KB. 
Striped RAID configurations have a default stripe size of 256 KB.
Note: A RAID 10, RAID 50, or RAID 60 array cannot have more than 32 legs when 
created using the Build method. Maximum disk drive count is only limited by 
RAID level. For instance:
a RAID 10 array built with 32 RAID 1 legs (64 disk drives) is supported
a RAID 50 array built with 32 RAID 5 legs (number of drives will vary) is also 
supported


Best regards,
George Machitidze


On Thu, Jun 14, 2012 at 3:06 PM, Fernando Frediani (Qube) <fernando.fredi...@qubenet.net> wrote:
> Cons:
>
> Extra cost of the RAID controller.
>
> Performance of the array is equivalent a single disk + RAID controller
> caching features.
>
> RAID doesn’t scale well beyond ~16 disks
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] RAID options for Gluster

2012-06-14 Thread Fernando Frediani (Qube)
I think this discussion has probably come up here already, but I couldn't find
much in the archives. Would you be able to comment on, or correct, whatever
looks wrong?

What options do people think are most adequate to use with Gluster in terms of
the RAID underneath, balancing cost, usable space and performance? I have
thought about two main options, with their pros and cons:

No RAID (individual hot-swappable disks):
Each disk is a brick individually (server:/disk1, server:/disk2, etc.) so no
RAID controller is required. As the data is replicated, if one disk fails the
data must exist on another disk on another node.
Pros:
Cheaper to build as there is no cost for an expensive RAID controller.
Improved performance as writes have to be done only to a single disk, not to an
entire RAID 5/6 array.
Better usage of the raw space as there is no disk used for parity as on a
RAID 5/6.

Cons:
If a failed disk gets replaced, the data needs to be replicated over the
network (not a big deal if using InfiniBand or a 1Gbps+ network).
The biggest file size is the size of one disk if using a Distributed volume
type.

In this case, does anyone know whether a replacement disk needs to be manually
formatted and mounted?
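
For a standalone-disk brick the replacement is indeed manual; a hedged sketch,
with device and paths as examples, and noting that the exact way to
re-introduce the brick into the volume varies by Gluster version:

mkfs.xfs -i size=512 /dev/sdb1    # new filesystem on the replacement disk
mount /dev/sdb1 /bricks/sdb1      # mount it where the old brick lived
# then re-add it to the volume (e.g. via replace-brick) and let self-heal repopulate the data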

RAID Controller:
Using a RAID controller with battery backup can improve performance, especially
by caching the writes in the controller's memory, but in the end a single array
means the equivalent performance of one disk for each brick. RAID also requires
either 1 or 2 disks for parity. If using very cheap disks it is probably better
to use RAID 6; with better-quality ones RAID 5 should be fine since, again, the
data is replicated to another RAID 5 on another node.
Pros:
Can create a larger array as a single brick, in order to fit bigger files when
using a Distributed volume type.
Disk rebuild should be quicker (and more automated?).
Cons:
Extra cost of the RAID controller.
Performance of the array is equivalent to a single disk plus the RAID
controller's caching features.
RAID doesn't scale well beyond ~16 disks.

Attaching a JBOD to a node and creating multiple RAID arrays (or using a
single server with more disk slots) instead of adding a new node can save power
(no extra CPU, memory or motherboard needed), but with multiple bricks on the
same node the data might end up replicated inside the same node, making the
downtime of a node critical. Or is Gluster smart enough to replicate data to a
brick on a different node?

Regards,

Fernando
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster 3.3.0 and VMware ESXi 5

2012-06-13 Thread Fernando Frediani (Qube)
Hi,

I don't see anything in the NFS log when watching /var/log/glusterfs/nfs.log
while trying to power on the machine. In the glusterd logs I don't see anything
either.
Is there anywhere else I should check that it might be logging to?

Fernando

-Original Message-
From: Vijay Bellur [mailto:vbel...@redhat.com] 
Sent: 11 June 2012 17:54
To: Fernando Frediani (Qube)
Cc: 'Atha Kouroussis'; 'gluster-users@gluster.org'; Rajesh Amaravathi; Krishna 
Srinivas
Subject: Re: [Gluster-users] Gluster 3.3.0 and VMware ESXi 5

On 06/11/2012 05:52 PM, Fernando Frediani (Qube) wrote:
> Was doing some read on RedHat website and found this URL which I wonder if 
> the problem would have anything to do with this:
> http://docs.redhat.com/docs/en-US/Red_Hat_Storage_Software_Appliance/3
> .2/html/User_Guide/ch14s04s08.html
>
> Although both servers and client are 64 I wonder if somehow this could be 
> related as it seems the closest thing I could think of.
>
> The error I get when trying to power up a VM is:
>
> An unexpected error was received from the ESX host while powering on VM 
> vm-21112.
> Failed to power on VM.
> Unable to retrieve the current working directory: 0 (No such file or 
> directory). Check if the directory has been deleted or unmounted.
> Unable to retrieve the current working directory: 0 (No such file or 
> directory). Check if the directory has been deleted or unmounted.
> Unable to retrieve the current working directory: 0 (No such file or 
> directory). Check if the directory has been deleted or unmounted.
>
>

Can you please post nfs log file from the Gluster server that you are trying to 
mount from?

Thanks,
Vijay
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] glusterfs volume as a massively shared storage for VM images

2012-06-11 Thread Fernando Frediani (Qube)
Hi Christian,

In theory it should work, but the ability to properly run VMs on Gluster is
something relatively new, owing to the improvements in granular healing, so I
don't think it has been extensively tested.
I wasn't able to find anyone using it in production, and those I heard from are
using it for testing. I tried to use it with VMware myself and could never get
it working; some problem with NFS on the Gluster side.

With regards to performance, I'm not sure how long you have been on this
mailing list, but have a look at the last emails Brian Candler sent with his
results for KVM VMs; they don't seem very promising as it stands.

Fernando

From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Christian Parpart
Sent: 11 June 2012 01:11
To: gluster-users@gluster.org
Subject: [Gluster-users] glusterfs volume as a massively shared storage for VM 
images

Hi all,

I am looking for a solution, like, I hope Glusterfs can fit into it, that is,
something that allows me to do live migrations of virtual machines from one 
compute node to another (KVM, OpenStack).

And so I found an article about the Glusterfs-OpenStack-Connector, which was 
just a basic Glusterfs setup that shared the /var/lib/nova/instances directory 
across all compute nodes. This is necessary to allow virtual machines to 
migrate within milliseconds from one compute node to another, as all nodes 
already share every data.

Now my question is, how well does GlusterFS scale when you have about 50+
compute nodes (on which you're about to run virtual machines)? What kind of
setup would you recommend, so as not to suffer in runtime performance, network
IOPS, or availability?

Many thanks in advance,
Christian Parpart.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster 3.3.0 and VMware ESXi 5

2012-06-11 Thread Fernando Frediani (Qube)
I was doing some reading on the Red Hat website and found this URL; I wonder
if the problem has anything to do with it:
http://docs.redhat.com/docs/en-US/Red_Hat_Storage_Software_Appliance/3.2/html/User_Guide/ch14s04s08.html

Although both servers and client are 64-bit, I wonder if somehow this could be
related, as it seems the closest thing I could think of.

The error I get when trying to power up a VM is:

An unexpected error was received from the ESX host while powering on VM 
vm-21112.
Failed to power on VM.
Unable to retrieve the current working directory: 0 (No such file or 
directory). Check if the directory has been deleted or unmounted. 
Unable to retrieve the current working directory: 0 (No such file or 
directory). Check if the directory has been deleted or unmounted. 
Unable to retrieve the current working directory: 0 (No such file or 
directory). Check if the directory has been deleted or unmounted.

Fernando

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Fernando Frediani (Qube)
Sent: 07 June 2012 16:53
To: 'Atha Kouroussis'; 'gluster-users@gluster.org'
Subject: Re: [Gluster-users] Gluster 3.3.0 and VMware ESXi 5

Hi Atha,

I have a very similar setup and behaviour here.
I have two bricks with replication and I am able to mount over NFS and deploy
a machine there, but when I try to power it on it simply doesn't work and gives
a different message saying that it couldn't find some files.

I wonder if anyone actually got it working with VMware ESXi and can share their
setup with us. Here I have two CentOS 6.2 servers and Gluster 3.3.0.

Fernando

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Atha Kouroussis
Sent: 07 June 2012 15:29
To: gluster-users@gluster.org
Subject: [Gluster-users] Gluster 3.3.0 and VMware ESXi 5

Hi everybody,
we are testing Gluster 3.3 as an alternative to our current Nexenta based 
storage. With the introduction of granular based locking gluster seems like a 
viable alternative for VM storage.

Regrettably we cannot get it to work even for the most rudimentary tests. We 
have a two brick setup with two ESXi 5 servers. We created both distributed and 
replicated volumes. We can mount the volumes via NFS on the ESXi servers 
without any issues but that is as far as we can go.

When we try to migrate a VM to the gluster backed datastore there is no 
activity on the bricks and eventually the operation times out on the ESXi side. 
The nfs.log shows messages like these (distributed volume):

[2012-06-07 00:00:16.992649] E [nfs3.c:3551:nfs3_rmdir_resume] 0-nfs-nfsv3: 
Unable to resolve FH: (192.168.11.11:646) vmvol : 
7d25cb9a-b9c8-440d-bbd8-973694ccad17
[2012-06-07 00:00:17.027559] W [nfs3.c:3525:nfs3svc_rmdir_cbk] 0-nfs: 3bb48d69: 
/TEST => -1 (Directory not empty)
[2012-06-07 00:00:17.066276] W [nfs3.c:3525:nfs3svc_rmdir_cbk] 0-nfs: 3bb48d90: 
/TEST => -1 (Directory not empty)
[2012-06-07 00:00:17.097118] E [nfs3.c:3551:nfs3_rmdir_resume] 0-nfs-nfsv3: 
Unable to resolve FH: (192.168.11.11:646) vmvol : 
----0001


When the volume is mounted on the ESXi servers, we get messages like these in 
nfs.log:

[2012-06-06 23:57:34.697460] W [socket.c:195:__socket_rwv] 0-socket.nfs-server: 
readv failed (Connection reset by peer)


The same volumes mounted via NFS on a linux box work fine and we did a couple 
of benchmarks with bonnie++ with very promising results.
Curiously, if we ssh into the ESXi boxes and go to the mount point of the 
volume, we can see it contents and write.

Any clues of what might be going on? Thanks in advance.

Cheers,
Atha


___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Performance optimization tips Gluster 3.3? (small files / directory listings)

2012-06-08 Thread Fernando Frediani (Qube)
Thanks for sharing that Brian,

I wonder if the problem when trying to power up VMware ESXi VMs has the same
cause.

Fernando

-Original Message-
From: Brian Candler [mailto:b.cand...@pobox.com] 
Sent: 08 June 2012 17:47
To: Pranith Kumar Karampuri
Cc: olav johansen; gluster-users@gluster.org; Fernando Frediani (Qube)
Subject: Re: [Gluster-users] Performance optimization tips Gluster 3.3? (small 
files / directory listings)

On Thu, Jun 07, 2012 at 02:36:26PM +0100, Brian Candler wrote:
> I'm interested in understanding this, especially the split-brain 
> scenarios (better to understand them *before* you're stuck in a 
> problem :-)
> 
> BTW I'm in the process of building a 2-node 3.3 test cluster right now.

FYI, I have got KVM working with a glusterfs 3.3.0 replicated volume as the 
image store.

There are two nodes, both running as glusterfs storage and as KVM hosts.

I build a 10.04 ubuntu image using vmbuilder, stored on the replicated 
glusterfs volume:

vmbuilder kvm ubuntu --hostname lucidtest --mem 512 --debug --rootsize 
20480 --dest /gluster/safe/images/lucidtest

I was able to fire it up (virsh start lucidtest), ssh into it, and then 
live-migrate it to another host:

brian@dev-storage1:~$ virsh migrate --live lucidtest 
qemu+ssh://dev-storage2/system
brian@dev-storage2's password: 

brian@dev-storage1:~$ virsh list
 Id Name State
--

brian@dev-storage1:~$ 

And I live-migrated it back again, all without the ssh session being 
interrupted.

I then rebooted the second storage server. While it was rebooting I did some 
work in the VM which grew its image. When the second storage server came back, 
it resynchronised the image immediately and automatically. Here is the relevant 
entry from /var/log/glusterfs/glustershd.log on the first
(non-rebooted) machine:

[2012-06-08 17:08:40.817893] E [socket.c:1715:socket_connect_finish] 
0-safe-client-1: connection to 10.0.1.2:24009 failed (Connection timed out)
[2012-06-08 17:09:10.698272] I 
[client-handshake.c:1636:select_server_supported_programs] 0-safe-client-1: 
Using Program GlusterFS 3.3.0, Num (1298437), Version (330)
[2012-06-08 17:09:10.700197] I 
[client-handshake.c:1433:client_setvolume_cbk] 0-safe-client-1: Connected to 
10.0.1.2:24009, attached to remote volume '/disk/storage2/safe'.
[2012-06-08 17:09:10.700234] I 
[client-handshake.c:1445:client_setvolume_cbk] 0-safe-client-1: Server and 
Client lk-version numbers are not same, reopening the fds
[2012-06-08 17:09:10.701901] I 
[client-handshake.c:453:client_set_lk_version_cbk] 0-safe-client-1: Server lk 
version = 1
[2012-06-08 17:09:14.699571] I 
[afr-common.c:1189:afr_detect_self_heal_by_iatt] 0-safe-replicate-0: size 
differs for  
[2012-06-08 17:09:14.699616] I [afr-common.c:1340:afr_launch_self_heal] 
0-safe-replicate-0: background  data self-heal triggered. path: 
, reason: lookup detected pending 
operations
[2012-06-08 17:09:18.230855] I 
[afr-self-heal-algorithm.c:122:sh_loop_driver_done] 0-safe-replicate-0: diff 
self-heal on : completed. (19 blocks 
of 3299 were different (0.58%))
[2012-06-08 17:09:18.232520] I 
[afr-self-heal-common.c:2159:afr_self_heal_completion_cbk] 0-safe-replicate-0: 
background  data self-heal completed on 


So at first glance this is extremely impressive. It's also very new and shiny,
and I wonder how many edge cases remain to be debugged in live use, but I can't
deny that it's very neat indeed!

Performance-wise:

(1) on the storage/VM host, which has the replicated volume mounted via FUSE:

root@dev-storage1:~# dd if=/dev/zero of=/gluster/safe/test.zeros bs=1024k 
count=500
500+0 records in
500+0 records out
524288000 bytes (524 MB) copied, 2.7086 s, 194 MB/s

(The bricks have a 12-disk md RAID10 array, far-2 layout, and there's probably 
scope for some performance tweaking here)

(2) however from within the VM guest, performance was very poor (2.2MB/s).

I tried my usual tuning options:


...



but glusterfs objected to the cache='none' option (possibly this opens the file 
with O_DIRECT?)

# virsh start lucidtest
error: Failed to start domain lucidtest
error: internal error process exited while connecting to monitor: char 
device redirected to /dev/pts/0
kvm: -drive 
file=/gluster/safe/images/lucidtest/tmpaJqTD9.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,aio=native:
 could not open disk image /gluster/safe/images/lucidtest/tmpaJqTD9.qcow2: 
Invalid argument
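
(If the O_DIRECT theory is right, it should be reproducible without libvirt at
all -- a hedged sketch against the same FUSE mount, the test file name is made
up:)

# try an O_DIRECT write straight against the FUSE mount
dd if=/dev/zero of=/gluster/safe/images/odirect-test bs=1024k count=100 oflag=direct

# and an O_DIRECT read back
dd if=/gluster/safe/images/odirect-test of=/dev/null bs=1024k iflag=direct

# the glusterfs mount option direct-io-mode=enable|disable may also be relevant here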

The VM boots with io='native' and bus='virtio', but performance is still very 
poor:

ubuntu@lucidtest:~$ dd if=/dev/zero of=/var/tmp/test.zeros bs=1024k 
count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 17.4095 s, 6.0 MB/s

This will need some further investigation.
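
(A few volume options that might be worth experimenting with for this kind of
workload -- a sketch only; the option names are from the 3.3 admin guide, the
values are guesses, and 'safe' is the volume used above:)

gluster volume set safe performance.write-behind-window-size 4MB
gluster volume set safe performance.cache-size 256MB
gluster volume set safe performance.io-thread-count 32
gluster volume set safe performance.flush-behind on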

Re: [Gluster-users] Gluster 3.3.0 and VMware ESXi 5

2012-06-08 Thread Fernando Frediani (Qube)
I don't think many people are using it with VMware specifically, and the people
who develop it have probably not tested it much in that scenario.
I also suspect it is some problem with the NFS settings, and I wonder whether
they can be changed if that environment is used only for running virtual
machines. I know that, unlike a normal Linux mount, ESXi mounts NFS in a
particular way, and if the server side is not configured for it, things won't
work.
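
(A few NFS-related volume options that might be worth checking on the Gluster
side for ESXi -- a sketch only; option names are from the admin guide, 'vmvol'
is taken from the logs below, and I haven't verified this against ESXi myself:)

# allow mounts from non-privileged source ports and skip reverse DNS lookups
gluster volume set vmvol nfs.ports-insecure on
gluster volume set vmvol nfs.addr-namelookup off
# if locking is the suspect, NLM can be switched off (where the build exposes nfs.nlm)
gluster volume set vmvol nfs.nlm off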

Fernando

-Original Message-
From: Atha Kouroussis [mailto:akourous...@gmail.com] 
Sent: 08 June 2012 05:46
To: Fernando Frediani (Qube)
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] Gluster 3.3.0 and VMware ESXi 5

Hi Fernando, 
thanks for the reply. I'm seeing exactly the same behavior. I'm wondering if it 
somehow has to do with locking. I read here 
(http://community.gluster.org/q/can-not-mount-nfs-share-without-nolock-option/) 
that locking on NFS was not implemented in 3.2.x and it is now in 3.3. I tested 
3.2.x with ESXi a few months ago and it seemed to work fine but the lack of 
granular locking made it a no-go back then.

Anybody care to chime in with any suggestions? Is there a way to revert NFS to 
3.2.x behavior to test?

Cheers,
Atha

On Thursday, June 7, 2012 at 11:52 AM, Fernando Frediani (Qube) wrote:

> Hi Atha,
> 
> I have a very similar setup and behaviour here.
> I have two bricks with replication and I am able to mount the NFS, deploy a 
> machine there, but when I try to Power it On it simply doesn't work and gives 
> a different message saying that it couldn't find some files.
> 
> I wonder if anyone actually got it working with VMware ESXi and can share 
> with us their scenario setup. Here I have two CentOS 6.2 and Gluster 3.3.0.
> 
> Fernando
> 
> -Original Message-
> From: gluster-users-boun...@gluster.org 
> [mailto:gluster-users-boun...@gluster.org] On Behalf Of Atha Kouroussis
> Sent: 07 June 2012 15:29
> To: gluster-users@gluster.org (mailto:gluster-users@gluster.org)
> Subject: [Gluster-users] Gluster 3.3.0 and VMware ESXi 5
> 
> Hi everybody,
> we are testing Gluster 3.3 as an alternative to our current Nexenta based 
> storage. With the introduction of granular based locking gluster seems like a 
> viable alternative for VM storage.
> 
> Regrettably we cannot get it to work even for the most rudimentary tests. We 
> have a two brick setup with two ESXi 5 servers. We created both distributed 
> and replicated volumes. We can mount the volumes via NFS on the ESXi servers 
> without any issues but that is as far as we can go.
> 
> When we try to migrate a VM to the gluster backed datastore there is no 
> activity on the bricks and eventually the operation times out on the ESXi 
> side. The nfs.log shows messages like these (distributed volume):
> 
> [2012-06-07 00:00:16.992649] E [nfs3.c:3551:nfs3_rmdir_resume] 0-nfs-nfsv3: 
> Unable to resolve FH: (192.168.11.11:646) vmvol : 
> 7d25cb9a-b9c8-440d-bbd8-973694ccad17
> [2012-06-07 00:00:17.027559] W [nfs3.c:3525:nfs3svc_rmdir_cbk] 0-nfs: 
> 3bb48d69: /TEST => -1 (Directory not empty)
> [2012-06-07 00:00:17.066276] W [nfs3.c:3525:nfs3svc_rmdir_cbk] 0-nfs: 
> 3bb48d90: /TEST => -1 (Directory not empty)
> [2012-06-07 00:00:17.097118] E [nfs3.c:3551:nfs3_rmdir_resume] 0-nfs-nfsv3: 
> Unable to resolve FH: (192.168.11.11:646) vmvol : 
> ----0001
> 
> 
> When the volume is mounted on the ESXi servers, we get messages like these in 
> nfs.log:
> 
> [2012-06-06 23:57:34.697460] W [socket.c:195:__socket_rwv] 
> 0-socket.nfs-server: readv failed (Connection reset by peer)
> 
> 
> The same volumes mounted via NFS on a linux box work fine and we did a couple 
> of benchmarks with bonnie++ with very promising results.
> Curiously, if we ssh into the ESXi boxes and go to the mount point of the 
> volume, we can see it contents and write.
> 
> Any clues of what might be going on? Thanks in advance.
> 
> Cheers,
> Atha
> 
> 
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org (mailto:Gluster-users@gluster.org)
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users



___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] Suggestions for Gluster 3.4

2012-06-07 Thread Fernando Frediani (Qube)
It was suggested in previous emails that we gather ideas on how to improve
Gluster in the development of the next version, 3.4.
Well, I guess we can all put up a list, see which items are most popular and
useful to most people, and then send it to the developers for consideration.

My list starts with:

RAID 1E-style clustering (the number of nodes would not need to be a multiple
of the 'replica' or 'stripe' count, so the cluster could grow by a single node
at a time).
Server vs. brick awareness (avoid replicating data onto two bricks running on
the same server; very useful when a server has multiple logical drives under
the same RAID controller for improved performance -- a manual workaround is
sketched below).
Rack awareness (for very large clusters: avoid replicating data onto servers in
the same rack).
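
(Until something like this exists, a partial workaround for the server-vs-brick
point is to order the bricks so that each replica pair spans two servers -- a
sketch with made-up hostnames; in a 'replica 2' volume, consecutive bricks on
the command line form a replica pair:)

gluster volume create myvol replica 2 transport tcp \
    server1:/bricks/lv1 server2:/bricks/lv1 \
    server1:/bricks/lv2 server2:/bricks/lv2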

Regards,

Fernando
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster 3.3.0 and VMware ESXi 5

2012-06-07 Thread Fernando Frediani (Qube)
Hi Atha,

I have a very similar setup and behaviour here.
I have two bricks with replication, and I am able to mount the volume over NFS
and deploy a machine there, but when I try to power it on it simply doesn't
work, giving a different error message saying that it couldn't find some files.

I wonder if anyone has actually got it working with VMware ESXi and can share
their setup with us. Here I have two CentOS 6.2 servers and Gluster 3.3.0.

Fernando

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Atha Kouroussis
Sent: 07 June 2012 15:29
To: gluster-users@gluster.org
Subject: [Gluster-users] Gluster 3.3.0 and VMware ESXi 5

Hi everybody,
we are testing Gluster 3.3 as an alternative to our current Nexenta based 
storage. With the introduction of granular based locking gluster seems like a 
viable alternative for VM storage.

Regrettably we cannot get it to work even for the most rudimentary tests. We 
have a two brick setup with two ESXi 5 servers. We created both distributed and 
replicated volumes. We can mount the volumes via NFS on the ESXi servers 
without any issues but that is as far as we can go.

When we try to migrate a VM to the gluster backed datastore there is no 
activity on the bricks and eventually the operation times out on the ESXi side. 
The nfs.log shows messages like these (distributed volume):

[2012-06-07 00:00:16.992649] E [nfs3.c:3551:nfs3_rmdir_resume] 0-nfs-nfsv3: 
Unable to resolve FH: (192.168.11.11:646) vmvol : 
7d25cb9a-b9c8-440d-bbd8-973694ccad17
[2012-06-07 00:00:17.027559] W [nfs3.c:3525:nfs3svc_rmdir_cbk] 0-nfs: 3bb48d69: 
/TEST => -1 (Directory not empty)
[2012-06-07 00:00:17.066276] W [nfs3.c:3525:nfs3svc_rmdir_cbk] 0-nfs: 3bb48d90: 
/TEST => -1 (Directory not empty)
[2012-06-07 00:00:17.097118] E [nfs3.c:3551:nfs3_rmdir_resume] 0-nfs-nfsv3: 
Unable to resolve FH: (192.168.11.11:646) vmvol : 
----0001


When the volume is mounted on the ESXi servers, we get messages like these in 
nfs.log:

[2012-06-06 23:57:34.697460] W [socket.c:195:__socket_rwv] 0-socket.nfs-server: 
readv failed (Connection reset by peer)


The same volumes mounted via NFS on a linux box work fine and we did a couple 
of benchmarks with bonnie++ with very promising results.
Curiously, if we ssh into the ESXi boxes and go to the mount point of the
volume, we can see its contents and write to it.

Any clues of what might be going on? Thanks in advance.

Cheers,
Atha


___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Performance optimization tips Gluster 3.3? (small files / directory listings)

2012-06-07 Thread Fernando Frediani (Qube)
Hi,
Sorry, this reply won't be of any help with your problem, but I am curious to
understand how it can be even slower when mounting with the Gluster client,
which I would expect to always be quicker than NFS or anything else.
If you find the reason, please report it back to the list and share it with us.
I think this directory-listing issue has already been reported for systems with
many files.

Regards,

Fernando

From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of olav johansen
Sent: 07 June 2012 03:32
To: gluster-users@gluster.org
Subject: [Gluster-users] Performance optimization tips Gluster 3.3? (small 
files / directory listings)

Hi,

I'm using Gluster 3.3.0-1.el6.x86_64, on two storage nodes, replicated mode 
(fs1, fs2)
Node specs: CentOS 6.2 Intel Quad Core 2.8Ghz, 4Gb ram, 3ware raid, 2x500GB 
sata 7200rpm (RAID1 for os), 6x1TB sata 7200rpm (RAID10 for /data), 1Gbit 
network

I've mounted the data partition on web1, a dual quad-core 2.8GHz with 8GB RAM,
using glusterfs. (I also tried an NFS -> Gluster mount.)

We have 50Gb of files, ~800'000 files in 3 levels of directories (max 2000 
directories in one folder)

My main problem is the speed of directory listings: "ls -alR" on the gluster
mount takes 23 minutes every time.

There doesn't seem to be any caching of directory-listing information; with
regular NFS (not gluster) between web1 <-> fs1, this takes 6m13s the first time
and 5m13s thereafter.

The Gluster mount is 4+ times slower for directory listings than plain NFS to a
single server; is this expected?
I understand there are a lot more calls involved in checking both nodes, but
I'm just looking for a reality check on this.
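
(For a more apples-to-apples comparison it may be worth timing the same listing
over Gluster's own NFS server as well as over the FUSE mount -- a sketch only,
assuming the volume is called 'data'; the mount points are made up:)

# FUSE mount
mount -t glusterfs fs1:/data /mnt/gluster-fuse
time ls -alR /mnt/gluster-fuse > /dev/null

# Gluster NFS server (NFSv3 over TCP)
mount -t nfs -o vers=3,tcp fs1:/data /mnt/gluster-nfs
time ls -alR /mnt/gluster-nfs > /dev/null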

Any suggestions of how I can speed this up?

Thanks,

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Striped replicated volumes in Gluster 3.3.0

2012-06-05 Thread Fernando Frediani (Qube)
John,

I just hope that 3.4 will be backwards compatible with 3.3 at least.
Fortunately I wasn't affected by this incompatible upgrade, but I do think
developers should really care about making new versions backwards compatible
with previous ones. Sometimes it seems that people are more concerned with
getting things out of the door than with these details, which can look
secondary next to the "improvements".
I can imagine how frustrating it will be for people who have a big cluster
running and want to upgrade to version 3.3 but can't without some extra work
and downtime.

Count me in for 3.4 development; this is my first contribution, suggesting it
for the plan.

Fernando

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of John Mark Walker
Sent: 05 June 2012 23:35
To: Whit Blauvelt
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] Striped replicated volumes in Gluster 3.3.0

- Original Message -
> > I'm sorry if the release didn't address the specific features you 
> > need, and I'm sorry if we gave the impression that it would. Our 
> > additional features for 3.3 were always pretty clear, or so I 
> > thought. If you can find any statements from the past year that were 
> > misleading, I would be happy to address them directly, but your 
> > statement above was a bit vague.
> 
> John Mark,
> 
> Please consider my comments in the context of Amar's comment, which I


Also, I invite you to help us with 3.4 planning. We welcome those with a stake 
in Gluster to help improve it.

We're going to get 3.4 planning underway very shortly.

-JM
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] GlusterFS 3.3 not yet quite ready for Virtual Machines storage

2012-06-05 Thread Fernando Frediani (Qube)
Well Brian, first of all I think it's a bit of a waste to use RAID10 in a
Gluster environment, given that the data is already replicated across other
nodes. It limits the usable space to 1/4 of the raw capacity from the start,
which is quite a lot. Consider not only the price of the disks, but also the
cost of maintaining them and the extra power each one consumes, besides the
extra CPU and memory.
I would much rather have multiple nodes made of either RAID 5 or 6 and spread
the IOPS across them, as if each brick were a disk in a large RAID10
environment (that was my point about spreading IOPS across the whole cluster).
Yes, some VMs could mount things directly from the Gluster volume (using the
Gluster client if it is a Linux machine, which is even better), but that is not
always an option, especially if you are a service provider and cannot give that
access to customers, so they must store their data on the local vdisks of their
VMs.

Fernando

From: Brian Candler [b.cand...@pobox.com]
Sent: 05 June 2012 17:28
To: Fernando Frediani (Qube)
Cc: 'gluster-users@gluster.org'
Subject: Re: [Gluster-users] GlusterFS 3.3 not yet quite ready for Virtual
Machines storage

On Mon, Jun 04, 2012 at 10:28:50PM +0000, Fernando Frediani (Qube) wrote:
>I have been reading and trying to test(without much success) Gluster
>3.3 for Virtual Machines storage and from what I could see it isn’t yet
>quiet ready for running virtual machines.
>
>One great improvement about the granular locking which was essential
>for these types of environments was achieved, but the other one is
>still not, which is the ability to use
>striped+(distributed)+replicated.

I think you would have to have a very specialised requirement for this to be
"essential".

Suppose you have a host with 12 disks in a RAID10, and you make a replicated
volume with another similar host for resilience.  That gives you a pretty huge
amount of I/O ops for a VM to use, and also a pretty huge VM size (depending on
how big the disks are, of course).
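
(That scenario is just a plain two-brick replicated volume, one RAID10 array
per host -- a minimal sketch with made-up hostnames and brick paths:)

gluster volume create vmstore replica 2 transport tcp \
    host1:/bricks/raid10/vmstore host2:/bricks/raid10/vmstore
gluster volume start vmstore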

Also: if you are handling terabytes of data, the natural approach in many
cases would be to have a relatively small VM image, and store the data in
glusterfs, mounting it from within the VM.  This means that the same dataset
can be shared by multiple VMs, and is easier to backup and replicate.
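
(Mounting the volume from inside the guest is just the normal client mount -- a
sketch with made-up names; the /etc/fstab line keeps it persistent across
reboots:)

# one-off mount from inside the VM
mount -t glusterfs storage1:/datavol /srv/data

# or persistently, via /etc/fstab
storage1:/datavol  /srv/data  glusterfs  defaults,_netdev  0 0
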
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Striped replicated volumes in Gluster 3.3.0

2012-06-04 Thread Fernando Frediani (Qube)
I tried it for hosting virtual machine images and it didn't work at all. I was
hoping to be able to spread the IOPS more widely across the cluster.
That's part of what I was trying to say in the email I sent earlier today.

Fernando

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Amar Tumballi
Sent: 05 June 2012 03:51
To: gluster-users@gluster.org
Subject: Re: [Gluster-users] Striped replicated volumes in Gluster 3.3.0

On 06/04/2012 11:21 AM, Amar Tumballi wrote:
> On 06/01/2012 10:18 PM, Travis Rhoden wrote:
>> Did an answer to Christian's question pop up? I was going to write in 
>> with the exact same one.
>>
>> If I created a replicated striped volume, what would keep it from 
>> working in a non-Hadoop environment? Does the NFS server refuse to 
>> export such a volume? Does the FUSE client refuse to mount such a volume?
>>
>> I read through the docs and found the following phrase: "In this 
>> release, configuration of this volume type is supported only for Map 
>> Reduce workloads."
>> What does that mean exactly? Hopefully not, that I'm unable to store 
>> my KVM images on it?
>>
>>
>

Hi,

Striped-replicated volumes can be created like any other volume type with
GlusterFS 3.3.0. They are not restricted in how they are exported (NFS or FUSE
both work), and they will still work in a non-Hadoop environment.
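
(For reference, the create syntax is the same as for any other volume type -- a
minimal sketch with made-up hostnames and brick paths; with 'stripe 2 replica
2', the four bricks below form one striped-replicated set:)

gluster volume create test-volume stripe 2 replica 2 transport tcp \
    server1:/exp1 server2:/exp2 server3:/exp3 server4:/exp4
gluster volume start test-volume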

The statement in the documentation reflects what we have tested this volume
type with: compared with the other volume types, striped-replicated volumes
have so far been tested only with Hadoop workloads.

You can use a striped-replicated volume, but run it in a test setup for some
time before putting it into production. If it works for you, we can say the
feature is good; if there are issues, we are glad to fix them and make it more
stable.

Regards,
Amar

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] GlusterFS 3.3 not yet quite ready for Virtual Machines storage

2012-06-04 Thread Fernando Frediani (Qube)
Hi,

I have been reading about and trying to test (without much success) Gluster 3.3
for virtual machine storage, and from what I could see it isn't yet quite ready
for running virtual machines.

One great improvement, granular locking, which is essential for these types of
environments, was achieved, but the other is still missing: the ability to use
striped+(distributed)+replicated volumes.
As it stands, the natural choice would be distributed + replicated, but a
virtual machine image then resides on a single brick (replicated, of course),
so the maximum write IOPS is the equivalent of a single brick's RAID controller
and the disks underneath it. If striped+(distributed)+replicated were
available, the IOPS would be spread across all bricks containing the large
virtual machine image, and therefore across multiple bricks and RAID
controllers.
Also, if I understand correctly, the maximum file size would then not be
limited to the size of a brick as, again, the file would be spread across
multiple bricks.

This type of volume is said to be available in version 3.3, but the
documentation says it is only for MapReduce workloads.

What is everybody's opinion about this, and has it been thought about or
considered?

Regards,

Fernando
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users