Re: [Ocfs2-users] Unable to start cluster with one node
Tao Ma wrote:
> Are you sure you use the right device in your fstab?

I started from scratch and rebuilt a new drbd device on a new LV. mkfs.ocfs2 works fine, but it won't mount. I can build an ext3 filesystem on it and mount it without problems.

> If yes, could you please strace the mount process to see the arguments mount.ocfs2 gives to ocfs2_hb_ctl and why it fails?

[EMAIL PROTECTED] ~]# strace mount.ocfs2 /dev/drbd2 /mnt/mirror2
execve(/sbin/mount.ocfs2, [mount.ocfs2, /dev/drbd2, /mnt/mirror2], [/* 23 vars */]) = 0
uname({sys=Linux, node=rhedgetest01, ...}) = 0
brk(0) = 0x8c4
access(/etc/ld.so.preload, R_OK) = -1 ENOENT (No such file or directory)
open(/etc/ld.so.cache, O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=70865, ...}) = 0
old_mmap(NULL, 70865, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7f03000
close(3) = 0
open(/lib/libcom_err.so.2, O_RDONLY) = 3
read(3, \177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\354\310?\0004\0\0\0..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=7004, ...}) = 0
old_mmap(0x3fc000, 8636, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3fc000
old_mmap(0x3fe000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1000) = 0x3fe000
close(3) = 0
open(/lib/tls/libc.so.6, O_RDONLY) = 3
read(3, \177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\320\36\271\0004\0\0\0..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1529136, ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f02000
old_mmap(0xb7d000, 1227964, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7d000
old_mmap(0xca3000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x125000) = 0xca3000
old_mmap(0xca7000, 7356, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xca7000
close(3) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f01000
mprotect(0xca3000, 8192, PROT_READ) = 0
mprotect(0xb74000, 4096, PROT_READ) = 0
set_thread_area({entry_number:-1 -> 6, base_addr:0xb7f016c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
munmap(0xb7f03000, 70865) = 0
rt_sigaction(SIGTERM, {0x8049f48, [TERM], SA_RESTORER|SA_RESTART, 0xba4908}, {SIG_DFL}, 8) = 0
rt_sigaction(SIGINT, {0x8049f48, [INT], SA_RESTORER|SA_RESTART, 0xba4908}, {SIG_DFL}, 8) = 0
brk(0) = 0x8c4
brk(0x8c61000) = 0x8c61000
open(/dev/drbd2, O_RDONLY|O_DIRECT|O_LARGEFILE) = 3
pread64(3, \2\2\2\2\2\2\2\2this is an ocfs2 volume\0..., 512, 0) = 512
pread64(3, \2\2\2\2\2\2\2\2this is an ocfs2 volume\0..., 512, 0) = 512
pread64(3, \2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2..., 512, 1024) = 512
pread64(3, \2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2..., 1024, 2048) = 1024
pread64(3, \2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2\2..., 2048, 4096) = 2048
pread64(3, OCFSV2\0\0\241\217_v\377\377\377\377\0\0\0\0\347\377\7\0\0\0\0\0\0\0\0\0..., 4096, 8192) = 4096
close(3) = 0
open(/sys/o2cb/interface_revision, O_RDONLY) = -1 ENOENT (No such file or directory)
open(/proc/fs/ocfs2_nodemanager/interface_revision, O_RDONLY) = 3
read(3, 5\n, 15) = 2
read(3, , 13) = 0
close(3) = 0
stat64(/sys/kernel/config, 0xbfe8d500) = -1 ENOENT (No such file or directory)
stat64(/config, {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
statfs64(/config, 84, {f_type=0x62656570, f_bsize=4096, f_blocks=0, f_bfree=0, f_bavail=0, f_files=0, f_ffree=0, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
open(/proc/sys/fs/ocfs2/nm/hb_ctl_path, O_RDONLY) = 3
read(3, /sbin/ocfs2_hb_ctl\n, 4096) = 19
read(3, , 4077) = 0
close(3) = 0
rt_sigprocmask(SIG_BLOCK, ~[TRAP SEGV RTMIN RT_1], NULL, 8) = 0
access(/sbin/ocfs2_hb_ctl, X_OK) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7f01708) = 4689
waitpid(4689, ocfs2_hb_ctl: I/O error on channel while starting heartbeat [{WIFEXITED(s) && WEXITSTATUS(s) == 1}], 0) = 4689
rt_sigprocmask(SIG_UNBLOCK, ~[TRAP SEGV RTMIN RT_1], NULL, 8) = 0
--- SIGCHLD (Child exited) @ 0 (0) ---
write(2, mount.ocfs2, 11mount.ocfs2) = 11
write(2, : , 2: ) = 2
write(2, Error when attempting to run /sb..., 74Error when attempting to run /sbin/ocfs2_hb_ctl: Operation not permitted) = 74
write(2, \r\n, 2 ) = 2
exit_group(1) = ?

> So you mean you can mount an old ocfs2 filesystem on your single node, right? If yes, have you updated the ocfs2-tools recently (If you
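A trace this long is easier to read once the failing calls and the child's exit status are pulled out. The following is only a minimal sketch in Python, not part of ocfs2-tools; `summarize_strace` is a hypothetical helper, and it assumes the output has one syscall per line, as strace normally prints it.

```python
import re

# A syscall line whose return value is -1 followed by an errno name, e.g.
# "access(/etc/ld.so.preload, R_OK) = -1 ENOENT (No such file or directory)".
FAILED_CALL = re.compile(r"^(\w+)\(.*\)\s*=\s*-1\s+(E[A-Z0-9]+)")
# The exit status strace reports for a child reaped by waitpid().
CHILD_EXIT = re.compile(r"WEXITSTATUS\(s\)\s*==\s*(\d+)")


def summarize_strace(text):
    """Return (failed_calls, child_exit_codes) found in strace output.

    Hypothetical helper for illustration only.
    """
    failed = []
    exits = []
    for line in text.splitlines():
        line = line.strip()
        match = FAILED_CALL.match(line)
        if match:
            failed.append((match.group(1), match.group(2)))
        for code in CHILD_EXIT.findall(line):
            exits.append(int(code))
    return failed, exits


# A few lines taken from the trace in this message.
sample = """\
access(/etc/ld.so.preload, R_OK) = -1 ENOENT (No such file or directory)
open(/proc/sys/fs/ocfs2/nm/hb_ctl_path, O_RDONLY) = 3
access(/sbin/ocfs2_hb_ctl, X_OK) = 0
waitpid(4689, ocfs2_hb_ctl: I/O error on channel while starting heartbeat [{WIFEXITED(s) && WEXITSTATUS(s) == 1}], 0) = 4689
"""

failed, exits = summarize_strace(sample)
print(failed)
print(exits)
```

In this trace the ENOENT results are ordinary probing for optional paths; the informative part is the child exit status of 1, which is ocfs2_hb_ctl itself failing.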
Re: [Ocfs2-users] Unable to start cluster with one node
http://oss.oracle.com/bugzilla/show_bug.cgi?id=944

You will need to apply the following patch.
http://oss.oracle.com/bugzilla/attachment.cgi?id=565

David Coulson wrote:
> Tao Ma wrote:
>> Are you sure you use the right device in your fstab?
>
> I started from scratch and rebuilt a new drbd device on a new LV. mkfs.ocfs2 works fine, but it won't mount. I can build an ext3 filesystem on it and mount it without problems.
>
>> If yes, could you please strace the mount process to see the arguments mount.ocfs2 gives to ocfs2_hb_ctl and why it fails?
>
> [EMAIL PROTECTED] ~]# strace mount.ocfs2 /dev/drbd2 /mnt/mirror2
Re: [Ocfs2-users] Unable to start cluster with one node
Sunil Mushran wrote:
> You will need to apply the following patch.
> http://oss.oracle.com/bugzilla/attachment.cgi?id=565

Thanks - that seems to have solved the problem. Will that patch be making it into any distribution binary packages soon? I have been running the RHEL4 packages, and it'd be nice to have 'official' RPMs containing this fix.

___
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users
Re: [Ocfs2-users] Unable to start cluster with one node
We'll have to push this into mainline first. Update the bug with your request.

David Coulson wrote:
> Sunil Mushran wrote:
>> You will need to apply the following patch.
>> http://oss.oracle.com/bugzilla/attachment.cgi?id=565
>
> Thanks - that seems to have solved the problem. Will that patch be making it into any distribution binary packages soon? I have been running the RHEL4 packages, and it'd be nice to have 'official' RPMs containing this fix.
Re: [Ocfs2-users] Unable to start cluster with one node
Hi David,

David Coulson wrote:
> This is probably a stupid question, but here we go. I have two boxes running RHEL4U6 with DRBD mirroring disk between them. DRBD is set up in active/active mode, and seems to be working nicely. I have OCFS2 filesystems built on the DRBD devices, and normally I am able to mount them on both nodes and life is good.
>
> Now, I have one node down. Not good, but that is why we have two... DRBD is fine, but OCFS2 won't start up correctly.
>
> [EMAIL PROTECTED] network-scripts]# !/etc/init.d/o2cb status
> Module configfs: Loaded
> Filesystem configfs: Mounted
> Module ocfs2_nodemanager: Loaded
> Module ocfs2_dlm: Loaded
> Module ocfs2_dlmfs: Loaded
> Filesystem ocfs2_dlmfs: Mounted
> Checking O2CB cluster ocfs2: Online
> Heartbeat dead threshold: 31
> Network idle timeout: 3
> Network keepalive delay: 2000
> Network reconnect delay: 2000
> Checking O2CB heartbeat: Not active

There is no problem with the status. "Not active" just means that no device is heartbeating, so as long as no ocfs2 volume is mounted, it will simply show "Not active".

> I take it this has something to do with establishing a quorum, which probably isn't happy with a single node. Is there a configuration change or workaround that will allow a single OCFS2 node to mount a filesystem?

A single OCFS2 node can mount a file system without any change in the configuration, so you may try to mount it. If there is any problem, please paste the error message here. Thanks.

Regards,
Tao
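To make the distinction between the cluster being online and the heartbeat being active concrete, here is a minimal Python sketch that reads status text in the form pasted above. `parse_o2cb_status` and `heartbeat_active` are hypothetical helpers, and the value "Active" for a running heartbeat is an assumption; only "Not active" appears in this thread.

```python
def parse_o2cb_status(text):
    """Split 'key: value' status lines into a dict.

    Hypothetical helper; it assumes one 'key: value' pair per line,
    as in the status output quoted in this message.
    """
    status = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep:
            status[key.strip()] = value.strip()
    return status


def heartbeat_active(status):
    # Assumption: an active heartbeat is reported as "Active".
    return status.get("Checking O2CB heartbeat") == "Active"


sample = """\
Module configfs: Loaded
Filesystem configfs: Mounted
Checking O2CB cluster ocfs2: Online
Heartbeat dead threshold: 31
Checking O2CB heartbeat: Not active
"""

status = parse_o2cb_status(sample)
print(status["Checking O2CB cluster ocfs2"])
print(heartbeat_active(status))
```

With the status from this message, the cluster reads as "Online" while the heartbeat check is false, which matches the explanation that nothing is mounted yet rather than that something is broken.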
Re: [Ocfs2-users] Unable to start cluster with one node
Hi David,

David Coulson wrote:
> Hi Tao,
>
>> A single OCFS2 node can mount a file system without any change in the configuration. So you may try to mount it. If there is any problem, please paste the error message here. Thanks.
>
> I tried to create a filesystem on an unused DRBD block device... mkfs.ocfs2 seemed to work okay, but it won't mount.
>
> # mount /mnt/mirror2
> ocfs2_hb_ctl: I/O error on channel while starting heartbeat
> mount.ocfs2: Error when attempting to run /sbin/ocfs2_hb_ctl: Operation not permitted

Are you sure you use the right device in your fstab? If yes, could you please strace the mount process to see the arguments mount.ocfs2 gives to ocfs2_hb_ctl and why it fails?

> Another ocfs2 filesystem which was built earlier seems happy, however the two nodes were working together then.

So you mean you can mount an old ocfs2 filesystem on your single node, right? If yes, have you updated ocfs2-tools recently? (If you use a new mkfs.ocfs2 (like 1.3.9) with an old ocfs2_hb_ctl (like 1.2.x), it will fail.) By the way, is there any error message in dmesg?

Regards,
Tao
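The version mismatch described above can be checked mechanically once the two tool versions are known. The sketch below is only an illustration in Python: `same_release_series` is a hypothetical helper, the rule "major.minor must match" is an assumption drawn from the 1.3.9 versus 1.2.x example rather than a documented compatibility policy, and 1.2.7 is a made-up stand-in for "1.2.x".

```python
def release_series(version):
    """Return (major, minor) from a dotted version string such as '1.3.9'."""
    parts = version.strip().split(".")
    return int(parts[0]), int(parts[1])


def same_release_series(mkfs_version, hb_ctl_version):
    """Hypothetical check: do both tools come from the same major.minor series?

    Assumption: tools from different series (e.g. 1.3.x and 1.2.x)
    should not be mixed.
    """
    return release_series(mkfs_version) == release_series(hb_ctl_version)


# 1.3.9 is the mkfs.ocfs2 version mentioned in the message;
# 1.2.7 is an illustrative stand-in for an old 1.2.x ocfs2_hb_ctl.
print(same_release_series("1.3.9", "1.2.7"))
print(same_release_series("1.2.7", "1.2.3"))
```

The version strings have to be obtained separately, for example from the installed package versions, and passed in as plain strings.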