Ronald Moesbergen wrote:
> 2010/10/29 Ronald Moesbergen <intercom...@gmail.com>:
>> 2010/10/29 Tao Ma <tao...@oracle.com>:
>>> Hi Ronald,
>>
>> Hi Tao,
>>
>> Thanks for looking into this.
>>
>>> On 10/29/2010 05:12 PM, Ronald Moesbergen wrote:
>>>> Hello,
>>>>
>>>> I was testing kernel 2.6.36 (vanilla mainline) and encountered the
>>>> following BUG():
>>>>
>>>> [157756.266000] o2net: no longer connected to node app01 (num 0) at 10.2.25.13:7777
>>>> [157756.266077] (o2hb-5FA56B1D0A,2908,0):o2dlm_eviction_cb:267 o2dlm has evicted node 0 from group 5FA56B1D0A9249099CE58C82CFEC873A
>>>> [157756.274443] (ocfs2rec,14060,0):dlm_get_lock_resource:836 5FA56B1D0A9249099CE58C82CFEC873A:M00000000000000000000186ba2b09b: at least one node (0) to recover before lock mastery can begin
>>>> [157757.275776] (ocfs2rec,14060,0):dlm_get_lock_resource:890 5FA56B1D0A9249099CE58C82CFEC873A:M00000000000000000000186ba2b09b: at least one node (0) to recover before lock mastery can begin
>>>> [157760.774045] (dlm_reco_thread,2920,2):dlm_get_lock_resource:836 5FA56B1D0A9249099CE58C82CFEC873A:$RECOVERY: at least one node (0) to recover before lock mastery can begin
>>>> [157760.774124] (dlm_reco_thread,2920,2):dlm_get_lock_resource:870 5FA56B1D0A9249099CE58C82CFEC873A: recovery map is not empty, but must master $RECOVERY lock now
>>>> [157760.774205] (dlm_reco_thread,2920,2):dlm_do_recovery:523 (2920) Node 1 is the Recovery Master for the Dead Node 0 for Domain 5FA56B1D0A9249099CE58C82CFEC873A
>>>> [157768.261818] (ocfs2rec,14060,0):ocfs2_replay_journal:1605 Recovering node 0 from slot 0 on device (8,32)
>>>> [157772.850182] ------------[ cut here ]------------
>>>> [157772.850211] kernel BUG at fs/ocfs2/journal.c:1700!
>>>
>>> Strange. The bug line is
>>>     BUG_ON(osb->node_num == node_num);
>>> and it detects the same node number in the cluster.
>
> I just tried to reproduce it and succeeded.
> Here's what I did:
> - unmount the filesystem on node app02
> - shut down the o2cb services on app02
> - do a halt -f on app01, which still has the OCFS2 volume mounted
> - start the o2cb services on app02
> - mount the OCFS2 filesystem -> BUG

Thanks for the test. I will look at it.

Thanks.
Regards,
Tao

_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users