Package: redhat-cluster
Version: 1.03.00-7
Severity: critical
Justification: breaks the whole system

Hi,

I'm trying to set up GFS on a dual-processor 64-bit Intel Xeon machine.
I've created a two-node cluster configuration with manual fencing; the
nodes are nfs1 and nfs2.
For now I'm working on nfs2 alone.

nfs2:~# uname -a
Linux nfs2 2.6.18-4-amd64 #1 SMP Wed Feb 21 14:29:38 UTC 2007 x86_64 GNU/Linux

Here is what I did:

gfs_mkfs -j 2 -p lock_dlm -t cluster1:dsk1 /dev/sda1
modprobe cman
/etc/init.d/ccs start
/etc/init.d/cman start
fence_tool join -D

fence_tool: wait for quorum 1
fence_tool: get our node name
fence_tool: connect to ccs
fence_tool: start fenced
fenced: 1172670491 our name from cman "nfs2"
fenced: 1172670491 delay post_join 6s post_fail 0s
fenced: 1172670491 added 2 nodes from ccs
fenced: 1172670491 start:
fenced: 1172670491   event_id    = 1
fenced: 1172670491   last_stop   = 0
fenced: 1172670491   last_start  = 1
fenced: 1172670491   last_finish = 0
fenced: 1172670491   node_count  = 1
fenced: 1172670491   start_type  = join
fenced: 1172670491 members:
fenced: 1172670491   nodeid = 1 "nfs2"
fenced: 1172670491 do_recovery stop 0 start 1 finish 0
fenced: 1172670491 our nodeid 1
fenced: 1172670491 add first victim nfs1
fenced: 1172670497 delay of 6s leaves 1 victims
fenced: 1172670497 fencing node nfs1

Then I manually fenced out nfs1, which is not present:
fence_ack_manual -n nfs1

fenced: 1172670595 finish:
fenced: 1172670595   event_id    = 1
fenced: 1172670595   last_stop   = 0
fenced: 1172670595   last_start  = 1
fenced: 1172670595   last_finish = 1
fenced: 1172670595   node_count  = 0
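
To make the timing in the fenced output easier to read, here is a small sketch that only subtracts the timestamps copied from the lines above. The variable names are mine and purely illustrative; nothing here talks to the cluster.

```shell
#!/bin/sh
# Timestamps (seconds since the epoch) copied from the fenced output above.
t_start=1172670491    # "start:" event, nfs1 added as first victim
t_fence=1172670497    # "fencing node nfs1"
t_finish=1172670595   # "finish:" event, logged after fence_ack_manual

# The first difference should match the "delay post_join 6s" line.
echo "post_join delay: $(($t_fence - $t_start))s"
# The second is how long fenced waited until the "finish:" event.
echo "waited for ack:  $(($t_finish - $t_fence))s"
```

This prints 6s and 98s, so the fencing part seems to behave as configured and the problem only shows up later, at mount time.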

Then I ran the mount:
mount /dev/sda1 /mnt/

Syslog output:

CMAN 1.03.00 (built Feb 23 2007 09:21:03) installed
NET: Registered protocol family 30
CMAN: Waiting to join or form a Linux-cluster
CMAN: forming a new cluster
CMAN: quorum regained, resuming activity
Lock_Harness 1.03.00 (built Feb 23 2007 09:21:42) installed
GFS 1.03.00 (built Feb 23 2007 09:21:30) installed
GFS: Trying to join cluster "lock_dlm", "cluster1:dsk1"
DLM 1.03.00 (built Feb 23 2007 09:21:14) installed
Lock_DLM (built Feb 23 2007 09:21:19) installed
GFS: fsid=cluster1:dsk1.0: Joined cluster. Now mounting FS...
GFS: fsid=cluster1:dsk1.0: jid=0: Trying to acquire journal lock...
GFS: fsid=cluster1:dsk1.0: jid=0: Looking at journal...
GFS: fsid=cluster1:dsk1.0: jid=0: Done
GFS: fsid=cluster1:dsk1.0: jid=1: Trying to acquire journal lock...
GFS: fsid=cluster1:dsk1.0: jid=1: Looking at journal...
GFS: fsid=cluster1:dsk1.0: jid=1: Done
Unable to handle kernel NULL pointer dereference at 0000000000000074 RIP:
 [<ffffffff8025e804>] _spin_lock_irqsave+0x3/0xd
PGD 74499067 PUD 71383067 PMD 0
Oops: 0002 [1] SMP
CPU 0
Modules linked in: lock_dlm dlm gfs lock_harness cman nfs nfsd exportfs lockd 
nfs_acl sunrpc button ac battery autofs4 ipv6 xfs loop tsdev i2c_i801 i2c_core 
serio_raw intel_rng evdev shpchp pci_hotplug psmouse pcspkr ext3 jbd mbcache 
dm_mirror dm_snapshot dm_mod raid456 xor raid1 md_mod ide_generic ide_cd cdrom 
piix sd_mod generic ide_core ahci libata e1000 qla2xxx firmware_class 
scsi_transport_fc scsi_mod thermal processor fan
Pid: 6787, comm: mount Not tainted 2.6.18-4-amd64 #1
RIP: 0010:[<ffffffff8025e804>]  [<ffffffff8025e804>] _spin_lock_irqsave+0x3/0xd
RSP: 0000:ffff810076105be0  EFLAGS: 00010096
RAX: 0000000000000296 RBX: ffff81007e356180 RCX: ffff810076104000
RDX: 0000000000000080 RSI: 0000000000000000 RDI: 0000000000000074
RBP: 0000000000000070 R08: ffff81007425aef8 R09: ffff8100788ae680
R10: ffffffff8027d27d R11: 0000000000000058 R12: 0000000000000000
R13: 0000000000000078 R14: 0000000000000000 R15: 0000000000000074
FS:  00002b84941341d0(0000) GS:ffffffff80521000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000074 CR3: 0000000077fd0000 CR4: 00000000000006e0
Process mount (pid: 6787, threadinfo ffff810076104000, task ffff8100796277b0)
Stack:  ffffffff8022e03a ffff81007e356180 ffff81007e356180 ffffffff883eb700
 0000000000000000 0000000000000000 0000000000000000 ffff81007249c000
 ffffffff802bcc67 0000000000000000 ffffffff883eb700 0000000000000000
Call Trace:
 [<ffffffff8022e03a>] __up_write+0x21/0x10d
 [<ffffffff802bcc67>] vfs_kern_mount+0xce/0x11a
 [<ffffffff802bccf5>] do_kern_mount+0x36/0x4d
 [<ffffffff802c52db>] do_mount+0x68c/0x6ff
 [<ffffffff8022ae52>] mntput_no_expire+0x19/0x8b
 [<ffffffff8020dd5f>] link_path_walk+0xd3/0xe5
 [<ffffffff802200e5>] __up_read+0x13/0x8a
 [<ffffffff8020a6d0>] do_page_fault+0x3d1/0x706
 [<ffffffff802bdba3>] __blkdev_put+0x149/0x159
 [<ffffffff802290ad>] iput+0x4b/0x84
 [<ffffffff802aacb1>] zone_statistics+0x3e/0x6d
 [<ffffffff802265d5>] vfs_stat_fd+0x1b/0x4a
 [<ffffffff8020de4a>] __alloc_pages+0x5c/0x2a9
 [<ffffffff802290ad>] iput+0x4b/0x84
 [<ffffffff802482f5>] sys_mount+0x8a/0xd7
 [<ffffffff802584d6>] system_call+0x7e/0x83


Code: f0 ff 0f 0f 88 17 01 00 00 c3 fa f0 ff 0f 0f 88 18 01 00 00
RIP  [<ffffffff8025e804>] _spin_lock_irqsave+0x3/0xd
 RSP <ffff810076105be0>
CR2: 0000000000000074

The disk was not mounted, and the system can no longer be shut down or halted.
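
One observation on the register dump, in case it helps: the faulting address (CR2) is the same value as RDI, and it lies just past the value in RBP. The sketch below only does that arithmetic on values copied from the oops. My reading is that such small values look like a field offset applied to a NULL pointer somewhere in the mount path, but that is an assumption on my part and I have not checked it against the source.

```shell
#!/bin/sh
# Register values copied from the oops above.
cr2=0x74   # faulting address
rdi=0x74   # RDI at the time of the fault, inside _spin_lock_irqsave
rbp=0x70   # RBP at the time of the fault

# Distance between the address being locked and the value in RBP.
echo "RDI - RBP = $(($rdi - $rbp))"

# The fault address and RDI are the same value.
if [ "$(($cr2))" -eq "$(($rdi))" ]; then
    echo "CR2 == RDI"
fi
```

This prints "RDI - RBP = 4" and "CR2 == RDI". The call trace shows _spin_lock_irqsave being reached from __up_write, which in turn was called from vfs_kern_mount.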

nfs2:~# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 6
model name      :                   Intel(R) Xeon(TM) CPU 3.00GHz
stepping        : 4
cpu MHz         : 3000.119
cache size      : 2048 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 6
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall
lm constant_tsc pni monitor ds_cpl vmx est cid cx16 xtpr lahf_lm
bogomips        : 6005.69
clflush size    : 64
cache_alignment : 128
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 15
model           : 6
model name      :                   Intel(R) Xeon(TM) CPU 3.00GHz
stepping        : 4
cpu MHz         : 3000.119
cache size      : 2048 KB
physical id     : 1
siblings        : 2
core id         : 0
cpu cores       : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 6
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall
lm constant_tsc pni monitor ds_cpl vmx est cid cx16 xtpr lahf_lm
bogomips        : 6000.70
clflush size    : 64
cache_alignment : 128
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor       : 2
vendor_id       : GenuineIntel
cpu family      : 15
model           : 6
model name      :                   Intel(R) Xeon(TM) CPU 3.00GHz
stepping        : 4
cpu MHz         : 3000.119
cache size      : 2048 KB
physical id     : 0
siblings        : 2
core id         : 1
cpu cores       : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 6
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall
lm constant_tsc pni monitor ds_cpl vmx est cid cx16 xtpr lahf_lm
bogomips        : 6000.71
clflush size    : 64
cache_alignment : 128
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor       : 3
vendor_id       : GenuineIntel
cpu family      : 15
model           : 6
model name      :                   Intel(R) Xeon(TM) CPU 3.00GHz
stepping        : 4
cpu MHz         : 3000.119
cache size      : 2048 KB
physical id     : 1
siblings        : 2
core id         : 1
cpu cores       : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 6
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall
lm constant_tsc pni monitor ds_cpl vmx est cid cx16 xtpr lahf_lm
bogomips        : 6000.80
clflush size    : 64
cache_alignment : 128
address sizes   : 36 bits physical, 48 bits virtual
power management:

Any help?
Thanks

-- System Information:
Debian Release: 4.0
  APT prefers testing
  APT policy: (990, 'testing')
Architecture: i386 (i686)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.18-3-686
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)

