Re: [CentOS] DRBD on a xen host: crash on high I/O

2009-07-29 Thread Ross Walker

On Jul 29, 2009, at 2:30 PM, "Andrea Dell'Amico" wrote:

> On Wed, 2009-07-29 at 16:16 +0200, Andrea Dell'Amico wrote:
>> On Wed, 2009-07-29 at 09:55 -0400, Ross Walker wrote:
>
>> I'm pretty sure the crash is DRBD related: as long as the secondary
>> drbd server is detached, everything works well. There are 23 guests
>> running right now, some of them paravirtualized, others fully
>> virtualized. Some of them use file images, others logical volumes
>> (all of them over a drbd device).
>> And I don't have resource starvation, but a kernel crash and an
>> immediate reboot.
>
> It looks like this one:
> http://thread.gmane.org/gmane.linux.network.drbd/17537 but I didn't
> lose the link between primary and secondary

The OP with the iSCSI problem saw no resource starvation either, yet  
the hypervisor was rate limiting his dom0 CPU usage to the point where  
he was missing interrupts.

-Ross

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] DRBD on a xen host: crash on high I/O

2009-07-29 Thread Andrea Dell'Amico
On Wed, 2009-07-29 at 16:16 +0200, Andrea Dell'Amico wrote:
> On Wed, 2009-07-29 at 09:55 -0400, Ross Walker wrote:

> I'm pretty sure the crash is DRBD related: as long as the secondary
> drbd server is detached, everything works well. There are 23 guests
> running right now, some of them paravirtualized, others fully
> virtualized. Some of them use file images, others logical volumes
> (all of them over a drbd device).
> And I don't have resource starvation, but a kernel crash and an
> immediate reboot.

It looks like this one:
http://thread.gmane.org/gmane.linux.network.drbd/17537 but I didn't
lose the link between primary and secondary.

Andrea
-- 
"Fortune does not change men, it unmasks them."
- Suzanne Necker




Re: [CentOS] DRBD on a xen host: crash on high I/O

2009-07-29 Thread Andrea Dell'Amico
On Wed, 2009-07-29 at 09:55 -0400, Ross Walker wrote:

> I read on another forum how a user using iSCSI for domUs was  
> experiencing network hangs due to the fact that dom0 didn't have  
> enough scheduler credits to handle the network throughput. That might  
> be related.
> 
> http://lists.centos.org/pipermail/centos-virt/2009-June/001021.html

I'm pretty sure the crash is DRBD related: as long as the secondary
drbd server is detached, everything works well. There are 23 guests
running right now, some of them paravirtualized, others fully
virtualized. Some of them use file images, others logical volumes
(all of them over a drbd device).
And I don't have resource starvation, but a kernel crash and an
immediate reboot.

> -Ross

ciao
andrea
-- 
"In six days God created the heaven and the earth. On the seventh day,
Stanley Kubrick sent everything back for modifications."
- http://www.jonhs.com/freemovies/dark_side_of_the_moon.htm





Re: [CentOS] DRBD on a xen host: crash on high I/O

2009-07-29 Thread Ross Walker

On Jul 29, 2009, at 7:52 AM, "Andrea Dell'Amico" wrote:

> On Tue, 2009-07-28 at 14:31 -0400, William L. Maltby wrote:
>
>>> When the two hosts are in sync, if I activate more than a few (six or
>>> seven) xen guests, the master server crashes spectacularly and reboots.
>>>
>>> I've seen a kernel dump over the serial console, but the machine
>>> restarts immediately so I didn't write it down.
>>
>> If you have an available pc, hook it up in place of the serial console
>> and start a terminal emulator, e.g. minicom or whatever you prefer, and
>> turn on full logging. This should save everything in a file that you can
>> then review.
>
> Uhm. The console is on the DRAC5 card. I think I would need to activate
> some network kernel crash dump feature.
>
>> If it's Windows-based, just remember to get rid of the ^M with
>> dos2unix, or equivalent, after you send it to a *IX box.
>>
>> I don't know anything about the rest of your problem, sorry.
>
> As I wrote, it's a production server. I cannot stop it whenever I want;
> I need to reserve a weekend session.
> In the meantime, I was asking if there's a known problem with a setup
> like mine.

I read on another forum how a user using iSCSI for domUs was  
experiencing network hangs due to the fact that dom0 didn't have  
enough scheduler credits to handle the network throughput. That might  
be related.

http://lists.centos.org/pipermail/centos-virt/2009-June/001021.html
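
To make the scheduler-credit point concrete, here is a rough sketch of the arithmetic, not a model of the real scheduler. It assumes Xen's credit scheduler with its default weight of 256 per domain, and it assumes every domain is busy at once, so CPU time is split in proportion to weight. Caps, VCPU counts and idle domains are ignored. The 23-guest figure is borrowed from Andrea's description elsewhere in this thread; the names `cpu_shares` and `build_weights` are made up for illustration.

```python
DEFAULT_WEIGHT = 256  # default per-domain weight in Xen's credit scheduler


def cpu_shares(weights):
    """Return each domain's fraction of CPU time under full contention.

    Simplification: share is proportional to weight; caps are ignored.
    """
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


def build_weights(guest_count, dom0_weight=DEFAULT_WEIGHT):
    """Build a weight table for dom0 plus guest_count default-weight guests."""
    weights = {"Domain-0": dom0_weight}
    for i in range(1, guest_count + 1):
        weights[f"guest{i}"] = DEFAULT_WEIGHT  # hypothetical guest names
    return weights


if __name__ == "__main__":
    for dom0_weight in (256, 512):
        share = cpu_shares(build_weights(23, dom0_weight))["Domain-0"]
        print(f"dom0 weight {dom0_weight}: {share:.1%} of CPU under contention")
```

Under these assumptions dom0 is entitled to 1/24 (about 4.2%) of the CPU when everything is busy, and doubling its weight to 512 raises that to 8%. On a real host the weight would be changed with something like `xm sched-credit -d Domain-0 -w 512`; whether that helps in this particular case is not established by the thread.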

-Ross



Re: [CentOS] DRBD on a xen host: crash on high I/O

2009-07-29 Thread Andrea Dell'Amico
On Tue, 2009-07-28 at 14:31 -0400, William L. Maltby wrote:

> > When the two hosts are in sync, if I activate more than a few (six or
> > seven) xen guests, the master server crashes spectacularly and reboots.
> > 
> > I've seen a kernel dump over the serial console, but the machine
> > restarts immediately so I didn't write it down.
> 
> If you have an available pc, hook it up in place of the serial console
> and start a terminal emulator, e.g. minicom or whatever you prefer, and
> turn on full logging. This should save everything in a file that you can
> then review.

Uhm. The console is on the DRAC5 card. I think I would need to activate
some network kernel crash dump feature.
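
One candidate for such a feature is the kernel's netconsole module, which sends kernel log messages as UDP datagrams to another machine. The following is a minimal sketch of the receiving side only, assuming netconsole is pointed at its default target port 6666 and that it works with the Xen kernel in use here, which this thread does not confirm. The function names and the log file name are hypothetical.

```python
import socket
import time


def open_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket; 6666 is netconsole's default target port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def log_datagrams(sock, log_file, max_packets=None):
    """Append each received datagram to log_file with a timestamp and sender.

    Runs until max_packets datagrams have been logged, or forever if
    max_packets is None. Returns the number of datagrams logged.
    """
    received = 0
    while max_packets is None or received < max_packets:
        data, (sender_ip, _sender_port) = sock.recvfrom(65535)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        text = data.decode("utf-8", errors="replace").rstrip("\n")
        log_file.write(f"{stamp} {sender_ip} {text}\n")
        log_file.flush()  # keep the file current while waiting for more
        received += 1
    return received


# Typical use on the machine that collects the messages:
#     with open("crash-console.log", "a") as f:
#         log_datagrams(open_listener(), f)
```

Because the messages are written out as they arrive, whatever the crashing host manages to send before it reboots stays in the file on the receiving machine.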

> If it's Windows-based, just remember to get rid of the ^M with
> dos2unix, or equivalent, after you send it to a *IX box.
> 
> I don't know anything about the rest of your problem, sorry.

As I wrote, it's a production server. I cannot stop it whenever I want;
I need to reserve a weekend session.
In the meantime, I was asking if there's a known problem with a setup
like mine.

Thanks, anyway

> HTH

ciao
andrea
-- 
Officina Metropolis Pub
http://www.officinametropolis.com/




Re: [CentOS] DRBD on a xen host: crash on high I/O

2009-07-28 Thread William L. Maltby

On Tue, 2009-07-28 at 20:11 +0200, Andrea Dell'Amico wrote:
> Hello,
> I have a couple of Dell 2950 III, both of them with CentOS 5.3, Xen,
> drbd 8.2 and cluster suite.
> Hardware: 32GB RAM, RAID 5 with 6 SAS disks (one hot spare) on a PERC/6
> controller.
> 
> I configured DRBD to use the main network interfaces (bnx2 driver), with
> bonding and crossover cables to have a direct link.
> The normal network traffic uses two different network cards.
> There are two DRBD resources for a total of a little less than 1TB.
> 
> When the two hosts are in sync, if I activate more than a few (six or
> seven) xen guests, the master server crashes spectacularly and reboots.
> 
> I've seen a kernel dump over the serial console, but the machine
> restarts immediately so I didn't write it down.

If you have an available pc, hook it up in place of the serial console
and start a terminal emulator, e.g. minicom or whatever you prefer, and
turn on full logging. This should save everything in a file that you can
then review.

If it's Windows-based, just remember to get rid of the ^M with
dos2unix, or equivalent, after you send it to a *IX box.
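
If dos2unix is not at hand, the ^M cleanup can be approximated in a few lines of Python. This is only a sketch of the CRLF-to-LF conversion for a captured console log. The file names are hypothetical, and a carriage return that is not followed by a newline is deliberately left alone.

```python
def strip_carriage_returns(data):
    """Convert Windows CRLF line endings to Unix LF in a bytes object."""
    return data.replace(b"\r\n", b"\n")


def convert_file(src_path, dst_path):
    """Read src_path as raw bytes and write an LF-only copy to dst_path."""
    with open(src_path, "rb") as src:
        cleaned = strip_carriage_returns(src.read())
    with open(dst_path, "wb") as dst:
        dst.write(cleaned)


# Example with hypothetical file names:
#     convert_file("console-capture.txt", "console-capture.unix.txt")
```

Working on raw bytes avoids guessing the encoding of the capture, which may contain stray control characters from the terminal session.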

I don't know anything about the rest of your problem, sorry.

> 
> Unfortunately I cannot experiment because I have production services on
> those machines (and they are working fine until I start drbd on the
> slave).
> 
> drbd configuration is attached.
> 
> Does anybody have an idea what the problem is? The crash is perfectly
> reproducible, and drbd seems to be the problem (maybe the Xen kernel helps?).
> 
> Thanks in advance,
> Andrea
> 

HTH
-- 
Bill
