[Gluster-users] libvirt and libgfapi in RHEL 6.5 beta

2013-10-11 Thread Jake Grimmett

Dear All,

Very pleased to see that the RHEL 6.5 beta promises "Native Support 
for GlusterFS in QEMU allows native access to GlusterFS volumes using 
the libgfapi library".


Can I ask whether virt-manager & libvirt can manage libgfapi-backed disks? :)

Or do I need to use oVirt? :(
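For context, this is roughly the sort of disk definition I am hoping libvirt
will accept - the volume, image and host names below are just placeholders:

    <disk type='network' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <!-- 'gluster-vol', 'wiki1.qcow2' and 'server1' are placeholders -->
      <source protocol='gluster' name='gluster-vol/wiki1.qcow2'>
        <host name='server1' port='24007' transport='tcp'/>
      </source>
      <target dev='vda' bus='virtio'/>
    </disk>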

many thanks

Jake



[Gluster-users] Gluster + QEMU + live Migration + cache != none

2012-08-09 Thread Jake Grimmett
Dear All,

I've just updated my VM hosts to Scientific Linux 6.3. After setting the disk 
cache mode to "default" (writethrough?) rather than "none", I get the following 
error when I try to migrate a VM:

migrating wiki1
error: Unsafe migration: Migration may lead to data corruption if disks use 
cache != none

Is gluster "coherent" across nodes? If so, can I just use the --unsafe flag to 
force the move?
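In other words, something along these lines (the VM and host names are placeholders):

    # force the live migration despite the cache warning
    virsh migrate --live --unsafe wiki1 qemu+ssh://host2/system

The alternative, I guess, is to put cache='none' back in each disk's <driver>
element and migrate without --unsafe.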

Many thanks,

Jake


Re: [Gluster-users] "Granular locking" - does this need to be enabled in 3.3.0 ?

2012-07-19 Thread Jake Grimmett

Dear Pranith / Anand,

An update on our progress using KVM & Gluster:

We built a two-server (Dell R710) cluster; each box has...
 a 5 x 500 GB SATA RAID5 array (software RAID)
 an Intel 10Gb Ethernet HBA
 one box has 8GB RAM, the other 48GB
 both have 2 x Xeon E5520
 CentOS 6.3 installed
 Gluster 3.3 installed from the RPMs on the gluster site


1) create a replicated gluster volume on top of xfs (see the sketch after this list)
2) set up qemu/kvm with the gluster volume (mounts localhost:/gluster-vol)
3) sanlock configured (this is evil!)
4) build a virtual machine with a 30GB qcow2 image and 1GB RAM
5) clone this VM into 4 machines
6) check that live migration works (OK)
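For steps 1 and 2, the setup was roughly as follows (hostnames and brick paths
are placeholders rather than our exact ones):

    # on one node, after peering the two servers
    gluster peer probe server2
    gluster volume create gluster-vol replica 2 \
        server1:/export/brick1 server2:/export/brick1
    gluster volume start gluster-vol

    # on each KVM host, mount the volume where libvirt keeps its images
    mount -t glusterfs localhost:/gluster-vol /var/lib/libvirt/images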

Start basic test cycle:
a) migrate all VMs to host #1, then reboot host #2
b) watch the logs for the self-heal to complete
c) migrate the VMs to host #2, then reboot host #1
d) check the logs for the self-heal
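In shell terms, one iteration of the cycle looks roughly like this (VM and host
names are placeholders):

    # migrate everything to host #1, then bounce host #2
    for vm in vm1 vm2 vm3 vm4; do
        virsh migrate --live "$vm" qemu+ssh://host1/system
    done
    ssh host2 reboot

    # once host #2 is back, watch the heal finish before swapping roles
    gluster volume heal gluster-vol info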

The above cycle can be repeated numerous times and completes without 
error, provided there is little or no load on the VMs.



If I give the VMs a workload, such as running "bonnie++" on each VM, 
things start to break.
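Roughly what I ran inside each guest (the size and target directory are from
memory, so treat them as placeholders):

    # write ~2GB of data to the guest's filesystem, skipping the file-creation tests
    bonnie++ -d /tmp -s 2048 -n 0 -u root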

1) it becomes almost impossible to log in to each VM
2) the kernel in each VM starts reporting hung-task timeouts
(the messages that suggest "echo 0 > /proc/sys/kernel/hung_task_timeout_secs")
3) top / uptime on the hosts shows a load average of up to 24
4) dd write speed (1K block size) to gluster is around 3MB/s on the host
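The dd test was something along these lines (the output path is a placeholder
for a file on the gluster mount):

    # small-block write test onto the gluster mount, flushed at the end
    dd if=/dev/zero of=/var/lib/libvirt/images/ddtest bs=1K count=100000 conv=fdatasync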


While I agree that running bonnie++ on four VMs is possibly unfair, 
there are load spikes even on quiet machines (yum updates etc.). I suspect 
that the I/O of one VM starts blocking that of another VM, and the 
pressure builds up rapidly on gluster - which does not seem to cope well 
under load. Possibly this is down to the access pattern / block size of 
qcow2 disks?


I'm (slightly) disappointed.

Though it doesn't corrupt data, the I/O performance is < 1% of my 
hardware's capability. Hopefully work on buffering and other tuning will 
fix this? Or maybe the work mentioned on getting qemu to talk directly to 
gluster will fix this?


best wishes

Jake

--
Dr Jake Grimmett
Head Of Scientific Computing
MRC Laboratory of Molecular Biology
Hills Road, Cambridge, CB2 0QH, UK.
Phone 01223 402219
Mobile 0776 9886539


Re: [Gluster-users] "Granular locking" - does this need to be enabled in 3.3.0 ?

2012-07-10 Thread Jake Grimmett

Dear Pranith,

I've reduced the number of VMs on the cluster to 16; most have qcow2 
image files of between 2GB and 8GB. The heaviest load comes from 
three bigger VMs:


1) 8.5G - a lightly loaded LDAP server
2) 24G - a lightly loaded Confluence server
3) 30G - a Grid Engine master server

Most I/O is read, but there are database writes going on here.

Typical CPU usage on the host server (Dell R720XD, 2 x E5-2643) is 5%;
memory use is 20GB out of 47GB.

I'm keen to help work the bugs out, but rather than risk upsetting 16 
live machines (...and their owners), I'll build a new VM cluster on our 
dev Dell R710s. CentOS 6.3 is out, and this is a good opportunity to 
see how the latest RHEL / KVM / sanlock stack interacts with gluster 3.3.0.


I'll update the thread in a couple of days when the test servers are 
working...


regards,

Jake


On 07/10/2012 04:44 AM, Pranith Kumar Karampuri wrote:

Jake,
 Granular locking is the only way data self-heal is performed at the 
moment. Could you give us the steps to re-create this issue, so that we can 
test this scenario locally? I will raise a bug with the info you provide.
This is roughly the info I am looking for:
1) What is the size of each VM? (Number of VMs: 30, as per your mail.)
2) What kind of load is in the VMs? You said small web servers with low 
traffic; what kind of traffic is it? Writes (uploads of files), reads, etc.?
3) Steps leading to the hang.
4) If you think you can re-create the issue, can you post the statedumps of the 
brick processes and the mount process when the hangs appear?
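A rough way to capture these would be something like the following (the volume
name and the pgrep pattern are placeholders; the dump location depends on the
statedump settings of your volume):

    # statedump of all the brick processes serving the volume
    gluster volume statedump gluster-vol

    # statedump of the fuse mount (client) process: send it SIGUSR1
    kill -USR1 $(pgrep -f 'glusterfs.*gluster-vol')

    # the *.dump files appear under the statedump path
    # (server.statedump-path; typically /tmp on 3.3)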

Pranith.
- Original Message -
From: "Jake Grimmett"
To: "Anand Avati"
Cc: "Jake Grimmett", gluster-users@gluster.org
Sent: Monday, July 9, 2012 11:51:19 PM
Subject: Re: [Gluster-users] "Granular locking" - does this need to be enabled 
in 3.3.0 ?

Hi Anand,

This is one entry (of many) in the client log when bringing my second node
of the cluster back up; the glustershd.log is completely silent at this
point.

If you're interested in seeing the nodes split & reconnect, the relevant
glustershd.log section is at http://pastebin.com/0Va3RxDD

many thanks!

Jake


Was this the client log or the glustershd log?

Thanks,
Avati

On Mon, Jul 9, 2012 at 8:23 AM, Jake Grimmett
wrote:


Hi Fernando / Christian,

Many thanks for getting back to me.

Slow writes are acceptable; most of our VM's are small web servers with
low traffic. My aim is to have a fully self-contained two server KVM
cluster with live migration, no external storage and the ability to
reboot
either node with zero VM downtime.  We seem to be "almost there", bar a
hiccup when the self-heal is in progress and some minor grumbles from
sanlock (which might be fixed by the new sanlock in RHEL 6.3)

Incidentally, the logs show a "diff" self-heal on a node reboot:

[2012-07-09 16:04:06.743512] I
[afr-self-heal-algorithm.c:122:sh_loop_driver_done]
0-gluster-rep-replicate-0: diff self-heal on /box1-clone2.img:
completed.
(16 blocks of 16974 were different (0.09%))

So, does this log show "Granular locking" occurring, or does it just
happen transparently when a file exceeds a certain size?

many thanks

Jake



On 07/09/2012 04:01 PM, Fernando Frediani (Qube) wrote:


Jake,

I haven't had a chance to test with my KVM cluster yet, but it should be
a default thing from 3.3.

Just bear in mind that running virtual machines is NOT a supported thing
for Red Hat Storage Server, according to Red Hat sales people; they said
support is expected towards the end of the year. As you might have observed,
performance, especially for writes, isn't anywhere near fantastic.


Fernando

*From:* gluster-users-bounces@gluster.org
[mailto:gluster-users-bounces@gluster.org]
*On Behalf Of *Christian Wittwer
*Sent:* 09 July 2012 15:51
*To:* Jake Grimmett
*Cc:* gluster-users@gluster.org
*Subject:* Re: [Gluster-users] "Granular locking" - does this need to
be

enabled in 3.3.0 ?

Hi Jake

I can confirm exactly the same behaviour with gluster 3.3.0 on Ubuntu
12.04. During the self-heal process the VM gets 100% I/O wait and is
locked.

After the self-heal, the root filesystem was read-only, which forced me to
do a reboot and fsck.

Cheers,

Christian

2012/7/9 Jake Grimmett <mailto:j...@mrc-lmb.cam.ac.uk>


Dear All,

I have a pair of Scientific Linux 6.2 servers, acting as KVM
virtualisation hosts for ~30 VM's. The VM images are stored in a
replicated gluster volume shared between the two servers. Live
migration
works fine, and the sanlock prevents me from (stupidly) starting the
same VM on both machines. Each server has 10GB ethernet and a 10 disk
RAID5 array.

If I migrate all the VM's to server #1 and shutdown server #2, all
works
perfectly with no interruption. When I restart server #2, the VM's
freeze while the self-heal process is running - and this healing can
take a long time.

I'm not sure if "Granular Locking" is on...

Re: [Gluster-users] "Granular locking" - does this need to be enabled in 3.3.0 ?

2012-07-09 Thread Jake Grimmett
Hi Anand,

This is one entry (of many) in the client log when bringing my second node
of the cluster back up; the glustershd.log is completely silent at this
point.

If you're interested in seeing the nodes split & reconnect, the relevant
glustershd.log section is at http://pastebin.com/0Va3RxDD

many thanks!

Jake

> Was this the client log or the glustershd log?
>
> Thanks,
> Avati
>
> On Mon, Jul 9, 2012 at 8:23 AM, Jake Grimmett 
> wrote:
>
>> Hi Fernando / Christian,
>>
>> Many thanks for getting back to me.
>>
>> Slow writes are acceptable; most of our VM's are small web servers with
>> low traffic. My aim is to have a fully self-contained two server KVM
>> cluster with live migration, no external storage and the ability to
>> reboot
>> either node with zero VM downtime.  We seem to be "almost there", bar a
>> hiccup when the self-heal is in progress and some minor grumbles from
>> sanlock (which might be fixed by the new sanlock in RHEL 6.3)
>>
>> Incidentally, the logs show a "diff" self-heal on a node reboot:
>>
>> [2012-07-09 16:04:06.743512] I
>> [afr-self-heal-algorithm.c:122:sh_loop_driver_done]
>> 0-gluster-rep-replicate-0: diff self-heal on /box1-clone2.img:
>> completed.
>> (16 blocks of 16974 were different (0.09%))
>>
>> So, does this log show "Granular locking" occurring, or does it just
>> happen transparently when a file exceeds a certain size?
>>
>> many thanks
>>
>> Jake
>>
>>
>>
>> On 07/09/2012 04:01 PM, Fernando Frediani (Qube) wrote:
>>
>>> Jake,
>>>
>>> I haven't had a chance to test with my KVM cluster yet, but it should be
>>> a default thing from 3.3.
>>>
>>> Just bear in mind that running virtual machines is NOT a supported thing
>>> for Red Hat Storage Server, according to Red Hat sales people; they said
>>> support is expected towards the end of the year. As you might have observed,
>>> performance, especially for writes, isn't anywhere near fantastic.
>>>
>>>
>>> Fernando
>>>
>>> *From:* gluster-users-bounces@gluster.org
>>> [mailto:gluster-users-bounces@gluster.org]
>>> *On Behalf Of *Christian Wittwer
>>> *Sent:* 09 July 2012 15:51
>>> *To:* Jake Grimmett
>>> *Cc:* gluster-users@gluster.org
>>> *Subject:* Re: [Gluster-users] "Granular locking" - does this need to
>>> be
>>>
>>> enabled in 3.3.0 ?
>>>
>>> Hi Jake
>>>
>>> I can confirm exactly the same behaviour with gluster 3.3.0 on Ubuntu
>>> 12.04. During the self-heal process the VM gets 100% I/O wait and is
>>> locked.
>>>
>>> After the self-heal, the root filesystem was read-only, which forced me to
>>> do a reboot and fsck.
>>>
>>> Cheers,
>>>
>>> Christian
>>>
>>> 2012/7/9 Jake Grimmett <mailto:j...@mrc-lmb.cam.ac.uk>
>>>
>>>
>>> Dear All,
>>>
>>> I have a pair of Scientific Linux 6.2 servers, acting as KVM
>>> virtualisation hosts for ~30 VM's. The VM images are stored in a
>>> replicated gluster volume shared between the two servers. Live
>>> migration
>>> works fine, and the sanlock prevents me from (stupidly) starting the
>>> same VM on both machines. Each server has 10GB ethernet and a 10 disk
>>> RAID5 array.
>>>
>>> If I migrate all the VM's to server #1 and shutdown server #2, all
>>> works
>>> perfectly with no interruption. When I restart server #2, the VM's
>>> freeze while the self-heal process is running - and this healing can
>>> take a long time.
>>>
>>> I'm not sure if "Granular Locking" is on. It's listed as a "technology
>>> preview" in the Redhat Storage server 2 notes - do I need to do
>>> anything
>>> to enable it?
>>>
>>> i.e. set "cluster.data-self-heal-algorithm" to diff ?
>>> or edit "cluster.self-heal-window-size" ?
>>>
>>> any tips from other people doing similar much appreciated!
>>>
>>> Many thanks,
>>>
>>> Jake
>>>
>>> jog <---at---> mrc-lmb.cam.ac.uk
>>>
>>>
>>
>> --
>> Dr Jake Grimmett
>> Head Of Scientific Computing
>> MRC Laboratory of Molecular Biology
>> Hills Road, Cambridge, CB2 0QH, UK.
>> Phone 01223 402219
>> Mobile 0776 9886539
>>
>>
>


-- 
Dr Jake Grimmett
Head Of Scientific Computing
MRC Laboratory of Molecular Biology
Hills Road, Cambridge, CB2 0QH, UK.
Phone 01223 402219
Mobile 0776 9886539




Re: [Gluster-users] "Granular locking" - does this need to be enabled in 3.3.0 ?

2012-07-09 Thread Jake Grimmett

Hi Fernando / Christian,

Many thanks for getting back to me.

Slow writes are acceptable; most of our VM's are small web servers with 
low traffic. My aim is to have a fully self-contained two server KVM 
cluster with live migration, no external storage and the ability to 
reboot either node with zero VM downtime.  We seem to be "almost there", 
bar a hiccup when the self-heal is in progress and some minor grumbles 
from sanlock (which might be fixed by the new sanlock in RHEL 6.3)


Incidentally, the logs show a "diff" self-heal on a node reboot:

[2012-07-09 16:04:06.743512] I 
[afr-self-heal-algorithm.c:122:sh_loop_driver_done] 
0-gluster-rep-replicate-0: diff self-heal on /box1-clone2.img: 
completed. (16 blocks of 16974 were different (0.09%))


So, does this log show "Granular locking" occurring, or does it just 
happen transparently when a file exceeds a certain size?


many thanks

Jake


On 07/09/2012 04:01 PM, Fernando Frediani (Qube) wrote:

Jake,

I haven't had a chance to test with my KVM cluster yet, but it should be
a default thing from 3.3.

Just bear in mind that running virtual machines is NOT a supported thing
for Red Hat Storage Server, according to Red Hat sales people; they said
support is expected towards the end of the year. As you might have observed,
performance, especially for writes, isn't anywhere near fantastic.


Fernando

*From:* gluster-users-boun...@gluster.org
[mailto:gluster-users-boun...@gluster.org] *On Behalf Of *Christian Wittwer
*Sent:* 09 July 2012 15:51
*To:* Jake Grimmett
*Cc:* gluster-users@gluster.org
*Subject:* Re: [Gluster-users] "Granular locking" - does this need to be
enabled in 3.3.0 ?

Hi Jake

I can confirm exactly the same behaviour with gluster 3.3.0 on Ubuntu
12.04. During the self-heal process the VM gets 100% I/O wait and is locked.

After the self-heal, the root filesystem was read-only, which forced me to
do a reboot and fsck.

Cheers,

Christian

2012/7/9 Jake Grimmett <mailto:j...@mrc-lmb.cam.ac.uk>

Dear All,

I have a pair of Scientific Linux 6.2 servers, acting as KVM
virtualisation hosts for ~30 VM's. The VM images are stored in a
replicated gluster volume shared between the two servers. Live migration
works fine, and the sanlock prevents me from (stupidly) starting the
same VM on both machines. Each server has 10GB ethernet and a 10 disk
RAID5 array.

If I migrate all the VM's to server #1 and shutdown server #2, all works
perfectly with no interruption. When I restart server #2, the VM's
freeze while the self-heal process is running - and this healing can
take a long time.

I'm not sure if "Granular Locking" is on. It's listed as a "technology
preview" in the Redhat Storage server 2 notes - do I need to do anything
to enable it?

i.e. set "cluster.data-self-heal-algorithm" to diff ?
or edit "cluster.self-heal-window-size" ?

any tips from other people doing similar much appreciated!

Many thanks,

Jake

jog <---at---> mrc-lmb.cam.ac.uk




--
Dr Jake Grimmett
Head Of Scientific Computing
MRC Laboratory of Molecular Biology
Hills Road, Cambridge, CB2 0QH, UK.
Phone 01223 402219
Mobile 0776 9886539


[Gluster-users] "Granular locking" - does this need to be enabled in 3.3.0 ?

2012-07-09 Thread Jake Grimmett

Dear All,

I have a pair of Scientific Linux 6.2 servers acting as KVM 
virtualisation hosts for ~30 VMs. The VM images are stored in a 
replicated gluster volume shared between the two servers. Live migration 
works fine, and sanlock prevents me from (stupidly) starting the 
same VM on both machines. Each server has 10Gb Ethernet and a 10-disk 
RAID5 array.


If I migrate all the VMs to server #1 and shut down server #2, all works 
perfectly with no interruption. When I restart server #2, the VMs 
freeze while the self-heal process is running - and this healing can 
take a long time.


I'm not sure if "Granular Locking" is on. It's listed as a "technology 
preview" in the Red Hat Storage Server 2 notes - do I need to do anything 
to enable it?


i.e. set "cluster.data-self-heal-algorithm" to diff?
or edit "cluster.self-heal-window-size"?
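If it is just a matter of setting volume options, I assume it would be something
like the following (the volume name is a placeholder, and I'm not sure these are
even the right knobs):

    gluster volume set gluster-vol cluster.data-self-heal-algorithm diff
    gluster volume set gluster-vol cluster.self-heal-window-size 16

    # check which options are currently set
    gluster volume info gluster-vol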

Any tips from other people doing something similar would be much appreciated!

Many thanks,

Jake

jog <---at---> mrc-lmb.cam.ac.uk