>>> Lars Marowsky-Bree <l...@suse.com> wrote on 10.07.2013 at 13:41 in
>>> message
<20130710114131.gb18...@suse.de>:
> On 2013-07-10T08:31:17, Ulrich Windl <ulrich.wi...@rz.uni-regensburg.de> 
> wrote:
> 
>> I had reported terrible performance of cLVM (maybe also related to using
>> OCFS) when used in SLES11 SP2. I guessed cLVM (or OCFS2) is
>> "communicating to death" on activity. Now I have some interesting news:
> 
> No, the performance issue with cLVM2 mirroring is not at all related to
> OCFS2; that's just cLVM2's algorithm being, well, suboptimal.
> 
>> On top of cLVM/OCFS I have image files for Xen VMs. I set up an OpenLDAP
>> server in one of the VMs. Now just about every time the LDAP server gets an
>> update (meaning it does some flushed disk writes), corosync reports a faulty
>> ring. It's like:
> 
> That, though, clearly shouldn't happen. And I've never seen this,
> despite hosting a "few" VMs on my OCFS2 cluster (even with cLVM2
> mirroring).
> 
> Network problems in hypervisors though also have a tendency to be, well,
> due to the hypervisor, or some network cards (broadcom?).

Yes:
driver: bnx2
version: 2.1.11
firmware-version: bc 5.2.3 NCSI 2.0.12


> 
>>  # grep FAULTY /var/log/messages |wc -l
>> 1546
>> 
>> However the "FAULT" never lasts longer than one second.
> 
> That's weird. Multicast or unicast?

Multicast.

> 
> 
>> OTOH our network guy says it's impossible to use the full network
>> bandwidth. This makes me wonder: Is there a protocol implementation
>> bug in TOTEM that is triggered when lots of packets arrive or when
>> packets are delayed slightly, or is there a kernel bug that loses
>> packets?
> 
> My guess would be the latter here.

Does not sound good.

> 
> Can this be reproduced with another high network load pattern? Packet
> loss etc?

No, but TCP handles packet loss more gracefully than the cluster, it seems.

> 
>> Is there any prospect of seeing the light at the end of the tunnel? The
>> problems should be easily reproducible.
> 
> Bugs that get reported have a chance of being fixed ;-)

One more bug and my support engineer kills me ;-)

Regards,
Ulrich


_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
