> I will try the "renice" solution you proposed.

Renicing corosync should not be required, as the process is supposed to run 
with real-time (RT) priority anyway.


> I have been thinking that I could increase the "token" timeout value in 
> /etc/corosync/corosync.conf , to prevent short "hiccups". Did you 
> specify a value to this parameter or did you leave the default 1000ms value?

We configured the token timeout to 17 seconds:

 totem {
        [....]
        transport: udpu
        rrp_mode: passive
        token:     17000
 }
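
If you want to double-check which value a node's config file actually 
carries, something like the following should do. It is only a minimal sketch: 
token_timeout is a helper name I made up (it is not part of corosync), and it 
assumes the setting is written as "token:" followed by whitespace and the 
value, as in the snippet above. It reads the file only; it does not tell you 
what the running corosync has loaded.

```shell
# token_timeout: hypothetical helper, not part of corosync.
# Prints the value of the first "token:" line in the given file
# (default: /etc/corosync/corosync.conf). The exact match on the first
# field means settings such as "token_retransmit:" are not picked up.
token_timeout() {
    awk '$1 == "token:" { print $2; exit }' "${1:-/etc/corosync/corosync.conf}"
}

# Demonstration against a throwaway copy of the totem block shown above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
totem {
        transport: udpu
        rrp_mode: passive
        token:     17000
}
EOF
token_timeout "$sample"   # prints 17000
rm -f "$sample"
```

On a real node you would call it without an argument, or pass the path to 
the config file explicitly.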


This configuration has worked fine for us for months: we have not seen a single 
'false positive STONITH' with it.


Regards,
 Adrian







_______________________________________________
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
