Re: [Lustre-discuss] Large Corosync/Pacemaker clusters

2012-11-06 Thread Hall, Shawn
Hi, Our vendor actually has several of the parameters in corosync.conf increased by default, and we have not touched them. These are: token: 1, retransmits_before_loss: 25, consensus: 12000, join: 1000, merge: 400, downcheck: 2000. We also have secauth turned off, as this can consume 75% of your
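For reference, parameters like these live in the totem section of corosync.conf. A minimal sketch using the values quoted above (the canonical option names are lowercase, and the retransmit setting is usually spelled token_retransmits_before_loss_const; the token value appears truncated in the archive, so treat it as a placeholder):

```
totem {
    version: 2

    # Disable message authentication/encryption; the poster notes
    # secauth can consume a large share of CPU on busy clusters.
    secauth: off

    # Timings from the message above (milliseconds).
    # token value as quoted; likely truncated in the archive.
    token: 1
    token_retransmits_before_loss_const: 25
    consensus: 12000
    join: 1000
    merge: 400
    downcheck: 2000
}
```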

[Lustre-discuss] The latest updates around Lustre and open source file systems from OpenSFS and EOFS

2012-11-06 Thread Norman Morse
You headed to Salt Lake City? SC is always the main HPC conference of the year. OpenSFS and EOFS will be at SC12 Nov 12 - 15th talking open source and file systems at booth 2101. We've had a really busy year with some great progress around Lustre development in particular! Also some new important p

Re: [Lustre-discuss] Large Corosync/Pacemaker clusters

2012-11-06 Thread Marco Passerini
Hi, I'm also setting up a highly available Lustre system. I configured failover pairs for the OSSes and MDSes, redundant Corosync rings (two separate rings: IB and Eth), and STONITH is enabled. The current configuration seems to work fine; however, yesterday we experienced a problem because 4 OSSes got
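A setup like the one described, with two redundant rings over separate networks, can be sketched in the totem section of corosync.conf roughly as follows (network addresses are hypothetical; rrp_mode: passive is the common choice for redundant-ring protocol in Corosync of this era):

```
totem {
    version: 2

    # Redundant Ring Protocol: use the second ring only when
    # the first fails ("passive"), rather than both at once.
    rrp_mode: passive

    # Ring 0: InfiniBand (IPoIB) network -- example subnet.
    interface {
        ringnumber: 0
        bindnetaddr: 10.0.0.0
        mcastport: 5405
    }

    # Ring 1: Ethernet network -- example subnet.
    interface {
        ringnumber: 1
        bindnetaddr: 192.168.1.0
        mcastport: 5407
    }
}
```

STONITH itself is configured on the Pacemaker side as fencing resources, separately from the Corosync rings.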