Which I/O scheduler are you using? On EL4 it is best to use deadline; cfq is the default. Check the FAQ for details on switching to deadline.
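One quick way to check is to look for an elevator= parameter on the kernel command line. The sketch below is only an illustration: the helper name elevator_from_cmdline is made up, and the command line is copied from the Bootdata line in your serial console trace. On a running box you would read /proc/cmdline instead.

```python
def elevator_from_cmdline(cmdline):
    """Return the value of the elevator= boot parameter, or None if it is absent."""
    for token in cmdline.split():
        if token.startswith("elevator="):
            return token.split("=", 1)[1]
    return None


# Command line as printed in the serial console trace (rejoined from two wrapped lines).
cmdline = "ro root=/dev/VolGroup_ID_12182/LogVol1 console=ttyS0,9600n8"

selected = elevator_from_cmdline(cmdline)
print(selected or "no elevator= parameter, so the kernel default applies")
```

If nothing is set there, the node is most likely running with the default scheduler.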
Derek Hazell wrote:
>
> Hi Ocfs2 users
> We got some relevant log messages (via a serial console) and via a
> putty session logged in as root.
> I suspect we need to set up a private network between the ocfs2
> cluster members, is this right? Anything else we might need to do?
>
> regards, I appreciate your help
>
> Derek
> ########################################################
> CURRENT O2CB CONFIG
> [EMAIL PROTECTED] fs]# /etc/init.d/o2cb configure
> Configuring the O2CB driver.
> This will configure the on-boot properties of the O2CB driver.
> The following questions will determine whether the driver is loaded on
> boot. The current values will be shown in brackets ('[]'). Hitting
> <ENTER> without typing an answer will keep that current value. Ctrl-C
> will abort.
> Load O2CB driver on boot (y/n) [y]:
> Cluster to start on boot (Enter "none" to clear) [ocfs2]:
> Specify heartbeat dead threshold (>=7) [61]:
> Specify network idle timeout in ms (>=5000) [60000]: 120000
> Specify network keepalive delay in ms (>=1000) [2000]:
> Specify network reconnect delay in ms (>=2000) [2000]:
> Writing O2CB configuration: OK
> O2CB cluster ocfs2 already online
> [EMAIL PROTECTED] fs]#
> ##################
> TRACE OF ROOT PUTTY LOGIN
>
> [EMAIL PROTECTED] ~]#
> Message from [EMAIL PROTECTED] at Fri Aug 22 23:12:03 2008 ...
> sysname kernel: Heartbeat thread (11) printing last 24 blocking operations (cur = 8):
>
> Message from [EMAIL PROTECTED] at Fri Aug 22 23:12:03 2008 ...
> sysname kernel: Heartbeat thread stuck at waiting for read completion, stuffing current time into that blocker (index 8)
>
> Message from [EMAIL PROTECTED] at Fri Aug 22 23:12:03 2008 ...
> sysname kernel: Index 9: took 0 ms to do bio alloc read
>
> .
> .
> .
>
> Message from [EMAIL PROTECTED] at Fri Aug 22 23:12:04 2008 ...
> sysname kernel: Index 3: took 5240 ms to do waiting for write completion
>
> Message from [EMAIL PROTECTED] at Fri Aug 22 23:12:04 2008 ...
> sysname kernel: Index 4: took 0 ms to do allocating bios for read
>
> Message from [EMAIL PROTECTED] at Fri Aug 22 23:12:04 2008 ...
> sysname kernel: Index 5: took 0 ms to do bio alloc read
>
> Message from [EMAIL PROTECTED] at Fri Aug 22 23:12:04 2008 ...
> sysname kernel: Index 6: took 0 ms to do bio add page read
>
> Message from [EMAIL PROTECTED] at Fri Aug 22 23:12:04 2008 ...
> sysname kernel: Index 7: took 0 ms to do submit_bio for read
>
> Message from [EMAIL PROTECTED] at Fri Aug 22 23:12:04 2008 ...
> sysname kernel: Index 8: took 120303 ms to do waiting for read completion
>
> #############
> TRACE OF SERIAL CONSOLE:
> (11,1):o2hb_write_timeout:269 ERROR: Heartbeat write timeout to device emcpowerb1 after 120000 milliseconds
> Heartbeat thread (11) printing last 24 blocking operations (cur = 8):
> Heartbeat thread stuck at waiting for read completion, stuffing current time into that blocker (index 8)
> Index 9: took 0 ms to do bio alloc read
> Index 10: took 0 ms to do bio add page read
> Index 11: took 0 ms to do submit_bio for read
> Index 12: took 3025 ms to do waiting for read completion
> Index 13: took 0 ms to do bio alloc write
> Index 14: took 0 ms to do bio add page write
> Index 15: took 0 ms to do submit_bio for write
> Index 16: took 0 ms to do checking slots
> Index 17: took 7221 ms to do waiting for write completion
> Index 18: took 0 ms to do allocating bios for read
> Index 19: took 0 ms to do bio alloc read
> Index 20: took 0 ms to do bio add page read
> Index 21: took 0 ms to do submit_bio for read
> Index 22: took 3892 ms to do waiting for read completion
> Index 23: took 0 ms to do bio alloc write
> Index 0: took 0 ms to do bio add page write
> Index 1: took 0 ms to do submit_bio for write
> Index 2: took 0 ms to do checking slots
> Index 3: took 5240 ms to do waiting for write completion
> Index 4: took 0 ms to do allocating bios for read
> Index 5: took 0 ms to do bio alloc read
> Index 6: took 0 ms to do bio add page read
> Index 7: took 0 ms to do submit_bio for read
> Index 8: took 120303 ms to do waiting for read completion
> *** ocfs2 is very sorry to be fencing this system by restarting ***
> Bootdata ok (command line is ro root=/dev/VolGroup_ID_12182/LogVol1 console=ttyS0,9600n8)
>
>
> ################################################################################
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Sunil Mushran
> Sent: Tuesday, 19 August 2008 3:56 AM
> To: _Derek Hazell (Internet)
> Cc: ocfs2-users@oss.oracle.com
> Subject: Re: [Ocfs2-users] ocfs2 issue? : unexplained reboots of RHEL 4 server (kernel:2.6.9-42.0.2.ELs)
>
> > Configure a netdump or netconsole server. It will catch the relevant messages.
>
> ################################################################################
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users@oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-users
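
For reference, the 120000 milliseconds in the o2hb_write_timeout line is consistent with the heartbeat dead threshold of 61 shown in the o2cb configure output. A minimal sketch of the arithmetic, assuming the relation given in the OCFS2 FAQ (timeout = (threshold - 1) * 2 seconds); the function name and the interval_s parameter are made up for illustration:

```python
def heartbeat_timeout_ms(dead_threshold, interval_s=2):
    """Disk heartbeat timeout implied by an O2CB heartbeat dead threshold.

    Assumes one heartbeat every interval_s seconds and that the node is
    fenced after (dead_threshold - 1) intervals pass without a completed write.
    """
    return (dead_threshold - 1) * interval_s * 1000


# Threshold 61 from the o2cb configure output in the quoted message.
print(heartbeat_timeout_ms(61))
```

So the node fenced itself because a single heartbeat read took longer than the configured limit (Index 8, 120303 ms), which points at slow I/O to the heartbeat device rather than at the network settings.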