Re: [Lustre-discuss] Large Corosync/Pacemaker clusters

2012-11-07 Thread Adrian Ulrich
I will try the renice solution you proposed. Renicing corosync should not be required, as the process is supposed to run with RT priority anyway. I have been thinking that I could increase the token timeout value in /etc/corosync/corosync.conf to prevent short hiccups. Did you

Re: [Lustre-discuss] Large Corosync/Pacemaker clusters

2012-11-06 Thread Marco Passerini
Message- From: Charles Taylor [mailto:tay...@hpc.ufl.edu] Sent: Wednesday, October 24, 2012 3:33 PM To: Hall, Shawn Cc: lustre-discuss@lists.lustre.org Subject: Re: [Lustre-discuss] Large Corosync/Pacemaker clusters FWIW, we are running HA Lustre using corosync/pacemaker. We broke our

Re: [Lustre-discuss] Large Corosync/Pacemaker clusters

2012-11-06 Thread Hall, Shawn
[mailto:marco.passer...@csc.fi] Sent: Tuesday, November 06, 2012 7:13 AM To: lustre-discuss@lists.lustre.org Cc: Hall, Shawn Subject: Re: [Lustre-discuss] Large Corosync/Pacemaker clusters Hi, I'm also setting up a highly available Lustre system. I configured pairs for the OSSes and MDSes, redundant

[Lustre-discuss] Large Corosync/Pacemaker clusters

2012-10-24 Thread Hall, Shawn
Hi, We're setting up fairly large Lustre 2.1.2 filesystems, each with 18 nodes and 159 resources all in one Corosync/Pacemaker cluster, as suggested by our vendor. We're getting mixed messages from our vendor and others on how large a Corosync/Pacemaker cluster will work well. 1.

Re: [Lustre-discuss] Large Corosync/Pacemaker clusters

2012-10-24 Thread Jeff Johnson
Shawn, In my opinion you shouldn't be running corosync on any more than two machines. They should be configured in self-contained pairs (MDS pair, OSS pairs). Anything beyond that would be chaos to manage, even if it worked. Don't forget the STONITH portion. Not every block storage