Hi all,

we are currently running a three-node Moodle/Apache cluster with OCFS2 as the upload directory. Everything works fine most of the time, but occasionally some nodes lose their connections.
I get the following error on Node 2:

kernel: [555631.411454] o2net: connection to node node-03 (num 2) at xxx.196.20.20:7777 has been idle for 7.0 seconds, shutting it down.
kernel: [555631.411482] (19959,0):o2net_idle_timer:1495 here are some times that might help debug the situation: (tmr 1301847991.990535 now 1301847998.990086 dr 1301847991.990489 adv 1301847991.990536:1301847991.990537 func (d672c340:502) 1301847983.930438:1301847983.930444)

After that, Apache goes down and causes some kernel errors.

And on Node 3:

kernel: [555392.301334] o2net: no longer connected to node node-02 (num 1) at xxx.196.20.9:7777

Node 3 then keeps trying to reconnect FOR HOURS. Apache goes down here as well, which causes the cluster to hang. I am not able to stop either ocfs2 or o2cb.

All nodes are running Debian Squeeze, 2.6.32-5-amd64, as virtual machines on VMware ESX.

If you need any further information, please let me know.

Thanks for any help,

regards
Marc

_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users