Re: Problem with multiple mounts
On Wed, Nov 08, 2006 at 04:06:23PM -0700, Andreas Dilger wrote:
> I would suggest that even though this is not supported, it would be
> prudent to fix such a bug. It might be possible to hit a similar
> problem if the on-disk data in the journal is corrupt, and oopsing the
> kernel isn't a graceful way to deal with bad data on disk.

On the other hand, corrupt data at least doesn't change underneath you
while you are trying to figure out the filesystem. This particular use
has metadata changing while it is being read, so the on-disk structures
are not consistent with each other from one moment to the next. There
may be nothing that can be done about that.

--
Len Sorensen
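Len's point about metadata changing while it is being read can be sketched with a toy model (illustrative Python only, not ReiserFS code; every name here is invented for the example). A journal record lands on disk in two steps, header then payload, and a scanner that samples the device between the two steps sees a header whose payload has not arrived yet:

```python
# Toy sketch: why scanning a journal that another host is still writing
# yields inconsistent metadata. A "record" is written as header + payload;
# between those two writes the on-disk state is internally inconsistent.

disk = []  # shared block device, one entry per block (toy model)

def write_record(seq):
    disk.append(("header", seq, 1))  # header promises 1 payload block
    # -- a reader scanning at this instant sees a header with no payload --
    disk.append(("payload", seq))

def scan(snapshot):
    """Replay-style scan: every header must be followed by its payload."""
    torn = []
    i = 0
    while i < len(snapshot):
        kind, seq = snapshot[i][0], snapshot[i][1]
        if kind == "header":
            if i + 1 >= len(snapshot) or snapshot[i + 1] != ("payload", seq):
                torn.append(seq)  # header without payload: torn record
            i += 2
        else:
            i += 1
    return torn

write_record(1)                # a complete record
disk.append(("header", 2, 1))  # writer "caught" mid-record
print(scan(list(disk)))        # -> [2]: record 2 is torn
```

A real journal replay that trusts such a torn header (e.g. follows a block count or pointer that was never completed) can wander into garbage, which is consistent with the oops landing in the journal-reading path.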
Re: Problem with multiple mounts
On Wed, Nov 08, 2006 at 11:22:15AM -0800, Suzuki wrote:
> I exported a disk partition using the nbd protocol. On the nbd client,
> I make a reiserfs filesystem and run the fsstress test case on this
> partition. At the same time, I mount this partition on the nbd server.
> Then an Oops appears, as follows:
>
> ReiserFS: sda10: found reiserfs format 3.6 with standard journal
> ReiserFS: sda10: using ordered data mode
> ReiserFS: sda10: journal params: device sda10, size 8192, journal
> first block 18, max trans len 1024, max batch 900, max commit age 30,
> max trans age 30
> ReiserFS: sda10: checking transaction log (sda10)
>
> Oops: Kernel access of bad area, sig: 11 [#1]
> Call Trace:
> [C00011333090] [C01EDB70] .journal_read+0x165c/0x1b6c (unreliable)
> [C00011333410] [C01EF280] .journal_init+0xdc0/0xee8
> [C00011333530] [C01CDBD8] .reiserfs_fill_super+0xa90/0x1e40
> [C00011333790] [C011E988] .get_sb_bdev+0x208/0x31c
> [C00011333870] [C01CA00C] .get_super_block+0x38/0x60
> [C00011333900] [C011E260] .vfs_kern_mount+0xec/0x198
> [C000113339B0] [C011E3E0] .do_kern_mount+0x88/0xdc
> [C00011333A50] [C01532CC] .do_mount+0xd50/0xe08
> [C00011333D60] [C0175090] .compat_sys_mount+0x368/0x448
> [C00011333E30] [C000861C] syscall_exit+0x0/0x40
>
> But if we try the steps in the reverse order, mounting the partition
> on the nbd server first and then running the fsstress tests on the
> client side (just to ensure that the server does not see an incomplete
> journal created by the client-side runs), things work fine! I suspect
> the Oops is due to the mount finding an incomplete journal created by
> the client-side fsstress runs in the first scenario.
>
> My question is: is this supported? Mounting a filesystem which is
> already mounted, and replaying its (possibly incomplete) journal?

Absolutely not supported. Unless you have a filesystem that is
specifically designed for simultaneous read-write mounts from multiple
places, you can't do this.
For performance reasons, most systems cache writes and updates, so the
data read by one system may be out of date because another system has an
update waiting to go to disk. You need a filesystem that lets multiple
systems talk to each other about updates, locking, and so on. Look for a
cluster filesystem, or whatever term is used for a filesystem that
supports multiple hosts mounting it to provide redundant access. No
normal filesystem can do this unless every host has it mounted
read-only. If you want to share it, use NFS; that's what it's for.

--
Len Sorensen
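The write-caching hazard Len describes can be sketched with a toy model (illustrative Python, not real kernel code; the `Host` class and block numbers are invented for the example). Two hosts keep independent write-back caches over the same shared device, so one host's unflushed update is invisible to the other until it is synced:

```python
# Toy model: two hosts with independent write-back caches over one shared
# block device. Host A dirties a block; the write sits in A's cache, so
# host B still reads the old block from disk and the two hosts disagree.

class Host:
    def __init__(self, disk):
        self.disk = disk
        self.cache = {}           # write-back cache: block -> data

    def write(self, block, data):
        self.cache[block] = data  # dirty data stays in this host's cache

    def read(self, block):
        # Each host trusts its own cache first; it has no way of knowing
        # that another host has dirtied the block.
        return self.cache.get(block, self.disk.get(block))

    def sync(self):
        self.disk.update(self.cache)  # flush dirty blocks to shared disk
        self.cache.clear()

disk = {7: "old"}
a, b = Host(disk), Host(disk)

a.write(7, "new")             # A's update is cached, not yet on disk
print(a.read(7), b.read(7))   # -> new old   (B reads stale data)
a.sync()
print(b.read(7))              # -> new       (only visible after A flushes)
```

A cluster filesystem closes exactly this gap: the hosts exchange lock and invalidation messages so that no host serves a block another host has dirtied.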
Re: Problem with multiple mounts
Lennart Sorensen wrote:
> On Wed, Nov 08, 2006 at 11:22:15AM -0800, Suzuki wrote:
> > I exported a disk partition using the nbd protocol. On the nbd
> > client, I make a reiserfs filesystem and run the fsstress test case
> > on this partition. At the same time, I mount this partition on the
> > nbd server. Then an Oops appears, as follows:
> >
> > ReiserFS: sda10: found reiserfs format 3.6 with standard journal
> > ReiserFS: sda10: using ordered data mode
> > ReiserFS: sda10: journal params: device sda10, size 8192, journal
> > first block 18, max trans len 1024, max batch 900, max commit age
> > 30, max trans age 30
> > ReiserFS: sda10: checking transaction log (sda10)
> >
> > Oops: Kernel access of bad area, sig: 11 [#1]
> > Call Trace:
> > [C00011333090] [C01EDB70] .journal_read+0x165c/0x1b6c (unreliable)
> > [C00011333410] [C01EF280] .journal_init+0xdc0/0xee8
> > [C00011333530] [C01CDBD8] .reiserfs_fill_super+0xa90/0x1e40
> > [C00011333790] [C011E988] .get_sb_bdev+0x208/0x31c
> > [C00011333870] [C01CA00C] .get_super_block+0x38/0x60
> > [C00011333900] [C011E260] .vfs_kern_mount+0xec/0x198
> > [C000113339B0] [C011E3E0] .do_kern_mount+0x88/0xdc
> > [C00011333A50] [C01532CC] .do_mount+0xd50/0xe08
> > [C00011333D60] [C0175090] .compat_sys_mount+0x368/0x448
> > [C00011333E30] [C000861C] syscall_exit+0x0/0x40
> >
> > But if we try the steps in the reverse order, mounting the partition
> > on the nbd server first and then running the fsstress tests on the
> > client side (just to ensure that the server does not see an
> > incomplete journal created by the client-side runs), things work
> > fine! I suspect the Oops is due to the mount finding an incomplete
> > journal created by the client-side fsstress runs in the first
> > scenario.
> >
> > My question is: is this supported? Mounting a filesystem which is
> > already mounted, and replaying its (possibly incomplete) journal?
>
> Absolutely not supported. Unless you have a filesystem that is
> specifically designed for simultaneous read-write mounts from multiple
> places, you can't do this.
> For performance reasons, most systems cache writes and updates, so the
> data read by one system may be out of date because another system has
> an update waiting to go to disk. You need a filesystem that lets
> multiple systems talk to each other about updates, locking, and so on.
> Look for a cluster filesystem, or whatever term is used for a
> filesystem that supports multiple hosts mounting it to provide
> redundant access. No normal filesystem can do this unless every host
> has it mounted read-only. If you want to share it, use NFS; that's
> what it's for.
>
> --
> Len Sorensen

Thanks for the response. This problem was reported by one of our test
teams on 2.6.19, so I wanted to confirm that what they are doing is not
supported!

Thanks,
Suzuki
Re: Problem with multiple mounts
On Nov 08, 2006 14:38 -0800, Suzuki wrote:
> Lennart Sorensen wrote:
> > > ReiserFS: sda10: checking transaction log (sda10)
> > >
> > > Oops: Kernel access of bad area, sig: 11 [#1]
> > > Call Trace:
> > > [C00011333090] [C01EDB70] .journal_read+0x165c/0x1b6c (unreliable)
> > > [C00011333410] [C01EF280] .journal_init+0xdc0/0xee8
> > > [C00011333530] [C01CDBD8] .reiserfs_fill_super+0xa90/0x1e40
> > > [C00011333790] [C011E988] .get_sb_bdev+0x208/0x31c
> > > [C00011333870] [C01CA00C] .get_super_block+0x38/0x60
> > > [C00011333900] [C011E260] .vfs_kern_mount+0xec/0x198
> > > [C000113339B0] [C011E3E0] .do_kern_mount+0x88/0xdc
> > > [C00011333A50] [C01532CC] .do_mount+0xd50/0xe08
> > > [C00011333D60] [C0175090] .compat_sys_mount+0x368/0x448
> > > [C00011333E30] [C000861C] syscall_exit+0x0/0x40
> > >
> > > My question is: is this supported? Mounting a filesystem which is
> > > already mounted, and replaying its (possibly incomplete) journal?
>
> Thanks for the response. This problem was reported by one of our test
> teams on 2.6.19, so I wanted to confirm that what they are doing is
> not supported!

I would suggest that even though this is not supported, it would be
prudent to fix such a bug. It might be possible to hit a similar problem
if the on-disk data in the journal is corrupt, and oopsing the kernel
isn't a graceful way to deal with bad data on disk.

Cheers, Andreas

--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.