Re: [PATCH] do_mounts: try all available filesystems before panicking
2014-05-26 7:19 GMT+03:00 Dave Chinner :
> On Mon, May 26, 2014 at 11:19:04AM +1000, Dave Chinner wrote:
>> On Mon, May 26, 2014 at 10:08:13AM +1000, Dave Chinner wrote:
>> > On Sun, May 25, 2014 at 01:04:09PM -0700, Linus Torvalds wrote:
>> > > On Mon, May 5, 2014 at 11:34 AM, Plamen Petrov wrote:
>> > > >
>> > > > The story short: on systems with btrfs root I have a kernel .config with ext4,
>> > > > xfs and btrfs built-in which works fine with 3.13.x, but 3.14.x panics. After
>> > > > inserting some debug printks, I got this info from mount_block_root:
>> > > >
>> > > > ---> EACCESS=13, EINVAL=22, Available filesystems: ext3 ext2 ext4 fuseblk xfs btrfs
>> > > > -> Tried ext3, error code is -22.
>> > > > -> Tried ext2, error code is -22.
>> > > > -> Tried ext4, error code is -22.
>> > > > -> Tried fuseblk, error code is -22.
>> > > > -> Tried xfs, error code is -38.
>> > > > VFS: Cannot open root device "sda2" or unknown-block(8,2): error -38
>> > > > Please append a correct "root=" boot option; here are the available partitions:
>>
>> BTW, This is the original thread with lots of triage in it:
>>
>> http://www.spinics.net/lists/linux-btrfs/msg33455.html
>>
>> But that doesn't reach any conclusion. I suspect that the
>> change of btrfs init (from very early (@~1.8s into the boot) until a
>> few milliseconds before the root mount is changing the order in
>> which the filesystem type list is traversed by the mount, resulting
>> in XFS being used to probe the device before btrfs.
>
> On that point, on 3.15-rc6:
>
> $ tail -1 /proc/filesystems
>         btrfs
> $
>
>> Why XFS is seeing /dev/sda2 as containing an XFS filesystem is not
>> yet clear, but perhaps once you've dumped the first sector of
>> the btrfs partition all will become clear
>
> No need, I found the regression. Plamen, can you please try the
> patch below?
>

Yes, Dave. I applied your patch on top of linux 3.15-rc5.
I tried the patch you sent in a VM I used for tests.
With only your patch applied - the system boots up normally.

Thanks for the different perspectives, guys! Always a pleasure
communicating with you.

After a month or so waiting - definitely worth it!
Thanks a lot!
-- 
Plamen Petrov
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] do_mounts: try all available filesystems before panicking
On Mon, May 26, 2014 at 11:19:04AM +1000, Dave Chinner wrote:
> On Mon, May 26, 2014 at 10:08:13AM +1000, Dave Chinner wrote:
> > On Sun, May 25, 2014 at 01:04:09PM -0700, Linus Torvalds wrote:
> > > On Mon, May 5, 2014 at 11:34 AM, Plamen Petrov wrote:
> > > >
> > > > The story short: on systems with btrfs root I have a kernel .config with ext4,
> > > > xfs and btrfs built-in which works fine with 3.13.x, but 3.14.x panics. After
> > > > inserting some debug printks, I got this info from mount_block_root:
> > > >
> > > > ---> EACCESS=13, EINVAL=22, Available filesystems: ext3 ext2 ext4 fuseblk xfs btrfs
> > > > -> Tried ext3, error code is -22.
> > > > -> Tried ext2, error code is -22.
> > > > -> Tried ext4, error code is -22.
> > > > -> Tried fuseblk, error code is -22.
> > > > -> Tried xfs, error code is -38.
> > > > VFS: Cannot open root device "sda2" or unknown-block(8,2): error -38
> > > > Please append a correct "root=" boot option; here are the available partitions:
>
> BTW, This is the original thread with lots of triage in it:
>
> http://www.spinics.net/lists/linux-btrfs/msg33455.html
>
> But that doesn't reach any conclusion. I suspect that the
> change of btrfs init (from very early (@~1.8s into the boot) until a
> few milliseconds before the root mount is changing the order in
> which the filesystem type list is traversed by the mount, resulting
> in XFS being used to probe the device before btrfs.

On that point, on 3.15-rc6:

$ tail -1 /proc/filesystems
        btrfs
$

> Why XFS is seeing /dev/sda2 as containing an XFS filesystem is not
> yet clear, but perhaps once you've dumped the first sector of
> the btrfs partition all will become clear

No need, I found the regression. Plamen, can you please try the
patch below?

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com

xfs: xfs_readsb needs to check for magic numbers

From: Dave Chinner

Commit daba542 ("xfs: skip verification on initial "guess"
superblock read") dropped the use of a verifier for the initial
superblock read so we can probe the sector size of the filesystem
stored in the superblock. It, however, now fails to validate that
what was read initially is actually an XFS superblock and hence will
fail the sector size check and return ENOSYS.

This causes probe-based mounts to fail because it expects XFS to
return EINVAL when it doesn't recognise the superblock format.

cc:
Reported-by: Plamen Petrov
Signed-off-by: Dave Chinner
---
 fs/xfs/xfs_mount.c | 23 +--
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 8d1afb8..2409224 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -327,8 +327,19 @@ reread:
 	/*
 	 * Initialize the mount structure from the superblock.
 	 */
-	xfs_sb_from_disk(&mp->m_sb, XFS_BUF_TO_SBP(bp));
-	xfs_sb_quota_from_disk(&mp->m_sb);
+	xfs_sb_from_disk(sbp, XFS_BUF_TO_SBP(bp));
+	xfs_sb_quota_from_disk(sbp);
+
+	/*
+	 * If we haven't validated the superblock, do so now before we try
+	 * to check the sector size and reread the superblock appropriately.
+	 */
+	if (sbp->sb_magicnum != XFS_SB_MAGIC) {
+		if (loud)
+			xfs_warn(mp, "Invalid superblock magic number");
+		error = EINVAL;
+		goto release_buf;
+	}
 
 	/*
 	 * We must be able to do sector-sized and sector-aligned IO.
@@ -341,11 +352,11 @@ reread:
 		goto release_buf;
 	}
 
-	/*
-	 * Re-read the superblock so the buffer is correctly sized,
-	 * and properly verified.
-	 */
 	if (buf_ops == NULL) {
+		/*
+		 * Re-read the superblock so the buffer is correctly sized,
+		 * and properly verified.
+		 */
 		xfs_buf_relse(bp);
 		sector_size = sbp->sb_sectsize;
 		buf_ops = loud ? &xfs_sb_buf_ops : &xfs_sb_quiet_buf_ops;
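For readers following the patch, the check ordering it describes can be sketched in standalone C. This is only an illustration under stated assumptions, not the kernel code: struct sketch_sb, sketch_check_sb() and SKETCH_XFS_SB_MAGIC are hypothetical names invented here, and the superblock fields are assumed to be already decoded into host byte order. The magic value itself is the ASCII bytes "XFSB". Like the patch, the sketch returns positive errno values.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* XFS superblock magic: the ASCII bytes "XFSB" read as a 32-bit value. */
#define SKETCH_XFS_SB_MAGIC 0x58465342u

/* Hypothetical, already-decoded view of the two fields the check needs. */
struct sketch_sb {
	uint32_t magicnum;	/* stand-in for sb_magicnum */
	uint16_t sectsize;	/* stand-in for sb_sectsize */
};

/*
 * Validate the "guess" superblock read.
 * Returns 0 if the block looks usable, EINVAL if it is not XFS at all,
 * ENOSYS if it is XFS but the sector size cannot be supported.
 */
int sketch_check_sb(const struct sketch_sb *sbp, unsigned int dev_sector_size)
{
	assert(sbp != NULL);

	/* Not an XFS superblock: tell the caller to try another type. */
	if (sbp->magicnum != SKETCH_XFS_SB_MAGIC)
		return EINVAL;

	/* It is XFS, but the device cannot do IO in units this small. */
	if (dev_sector_size > sbp->sectsize)
		return ENOSYS;

	return 0;
}
```

With the magic check first, a block of non-XFS data comes back as EINVAL. Without it, the same block can reach the sector size comparison with a meaningless sectsize value and come back as ENOSYS, which matches the error -38 reported at the start of the thread.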
Re: [PATCH] do_mounts: try all available filesystems before panicking
On Sun, May 25, 2014 at 05:11:30PM -0400, Theodore Ts'o wrote:
> On Sun, May 25, 2014 at 01:04:09PM -0700, Linus Torvalds wrote:
> >
> > The fact is, I think xfs is just buggy. Returning 38 (ENOSYS) is
> > totally insane. "No such system call"? Somebody is on some bad bad
> > drugs. Not that the mount_block_root() loop and error handling might
> > not be a good thing to perhaps tweak _too_, but at the very least your
> > patch means that now it no longer prints out the error number at all.
>
> There's only a single instance of ENOSYS in fs/xfs/xfs_mount.c:
>
> 	/*
> 	 * We must be able to do sector-sized and sector-aligned IO.
> 	 */
> 	if (sector_size > sbp->sb_sectsize) {
> 		if (loud)
> 			xfs_warn(mp, "device supports %u byte sectors (not %u)",
> 				sector_size, sbp->sb_sectsize);
> 		error = ENOSYS;
> 		goto release_buf;
> 	}
>
> Plamen, does changing the ENOSYS to EINVAL above fix things for you?
>
> > Anyway, I'm also not seeing why that xfs error would be new to 3.14,
> > though.. Adding the XFS people to the cc.
>
> If I had to guess, commit daba5427d is new to 3.14, and it might
> explain the change in behavior.

Yup, it's buggy, though not in an obvious way. I'll have a patch for
it soon.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com
Re: [PATCH] do_mounts: try all available filesystems before panicking
On Mon, May 26, 2014 at 10:08:13AM +1000, Dave Chinner wrote:
> On Sun, May 25, 2014 at 01:04:09PM -0700, Linus Torvalds wrote:
> > On Mon, May 5, 2014 at 11:34 AM, Plamen Petrov wrote:
> > >
> > > The story short: on systems with btrfs root I have a kernel .config with ext4,
> > > xfs and btrfs built-in which works fine with 3.13.x, but 3.14.x panics. After
> > > inserting some debug printks, I got this info from mount_block_root:
> > >
> > > ---> EACCESS=13, EINVAL=22, Available filesystems: ext3 ext2 ext4 fuseblk xfs btrfs
> > > -> Tried ext3, error code is -22.
> > > -> Tried ext2, error code is -22.
> > > -> Tried ext4, error code is -22.
> > > -> Tried fuseblk, error code is -22.
> > > -> Tried xfs, error code is -38.
> > > VFS: Cannot open root device "sda2" or unknown-block(8,2): error -38
> > > Please append a correct "root=" boot option; here are the available partitions:

BTW, This is the original thread with lots of triage in it:

http://www.spinics.net/lists/linux-btrfs/msg33455.html

But that doesn't reach any conclusion. I suspect that the
change of btrfs init (from very early (@~1.8s into the boot) until a
few milliseconds before the root mount is changing the order in
which the filesystem type list is traversed by the mount, resulting
in XFS being used to probe the device before btrfs.

Why XFS is seeing /dev/sda2 as containing an XFS filesystem is not
yet clear, but perhaps once you've dumped the first sector of
the btrfs partition all will become clear

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com
Re: [PATCH] do_mounts: try all available filesystems before panicking
On Sun, May 25, 2014 at 01:04:09PM -0700, Linus Torvalds wrote:
> On Mon, May 5, 2014 at 11:34 AM, Plamen Petrov wrote:
> >
> > The story short: on systems with btrfs root I have a kernel .config with ext4,
> > xfs and btrfs built-in which works fine with 3.13.x, but 3.14.x panics. After
> > inserting some debug printks, I got this info from mount_block_root:
> >
> > ---> EACCESS=13, EINVAL=22, Available filesystems: ext3 ext2 ext4 fuseblk xfs btrfs
> > -> Tried ext3, error code is -22.
> > -> Tried ext2, error code is -22.
> > -> Tried ext4, error code is -22.
> > -> Tried fuseblk, error code is -22.
> > -> Tried xfs, error code is -38.
> > VFS: Cannot open root device "sda2" or unknown-block(8,2): error -38
> > Please append a correct "root=" boot option; here are the available partitions:

So, XFS returned ENOSYS to the mount attempt. That means it found
what appears to be a valid XFS superblock at block zero. That is,
the magic number matched, the version was valid, all of the sanity
checks of the values are within supported ranges, and the reason the
mount failed was either a block size larger than page size or an
unsupported inode size. There would have been an error in dmesg to
tell you which.

Can you please send the dmesg output of the failed mount attempt, as
well as the output of:

# dd if=/dev/sda2 bs=512 count=1 | hexdump -C

So we can determine exactly why XFS thought it should be mounting
that block device?

> > Last one tried is xfs, the needed btrfs in this case never gets a chance.
> > Looking at the code in init/do_mounts.c we can see that it "continue"s only if
> > the return code it got is EINVAL, yet xfs clearly does not fit - so the kernel
> > panics. Maybe there are other filesystems like xfs - I did not check. This
> > patch fixes mount_block_root to try all available filesystems first, and then
> > panic. The patched 3.14.x works for me.
>
> Hmm. I don't really dislike your patch, but it makes all the code
> _after_ the switch-statement dead, since there is now no way to ever
> fall through the switch statement.

I don't think the patch addresses the cause of the problem. The code
is trying to "mount the first filesystem type it finds that
matches", but that match has resulted in a "filesystem cannot be
mounted" error. The cause of the problem is that there's a
difference between "don't understand what is on disk" and
"understand exactly what is on disk and we don't support it".

If we find a superblock match, then there is no other filesystem
type that should be checked regardless of the error that is returned
to the upper loop. The loop should only continue if the filesystem
doesn't recognise what is on disk (i.e. EINVAL is returned). If it
matches, then the filesystem must try to mount the filesystem.

What do you expect a filesystem to do if it has an error during
mount? It's going to return something other than EINVAL. It could be
ENOMEM, EIO, etc, and those cases should terminate the "search until
we find a match" loop. So, from that perspective the change is
simply wrong.

> The fact is, I think xfs is just buggy. Returning 38 (ENOSYS) is
> totally insane. "No such system call"? Somebody is on some bad bad
> drugs. Not that the mount_block_root() loop and error handling might
> not be a good thing to perhaps tweak _too_, but at the very least your
> patch means that now it no longer prints out the error number at all.

Sure, the error might be silly, but it's irrelevant to the patch
being discussed because it could have been one of several different
errors that a failed mount could return. And, besides, XFS has
returned that error for this condition for, well, more than 10
years:

http://oss.sgi.com/cgi-bin/gitweb.cgi?p=archive/xfs-import.git;a=commitdiff;h=ba4331892d608b4e816b52a4de29693af0dd5c13

IOWs, ENOSYS in this case effectively means "system does not support
filesystem configuration". It's not an invalid superblock (EINVAL)
nor is it a corrupted superblock (EFSCORRUPTED), so it's something
else...

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com
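The probing rule argued for in this message (move on to the next filesystem type only on EINVAL, stop on anything else) can be sketched as a small userspace program. Everything named sketch_* is hypothetical and exists only for this illustration. The two callbacks model the error codes from the debug output quoted at the top of the thread, and they assume that btrfs would mount the device successfully if it were ever tried.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: try one filesystem type, return 0 or -errno. */
typedef int (*sketch_mount_fn)(const char *fstype);

/* Probe order taken from the "Available filesystems" line in the report. */
const char *const sketch_types[] = {
	"ext3", "ext2", "ext4", "fuseblk", "xfs", "btrfs"
};
#define SKETCH_NTYPES (sizeof(sketch_types) / sizeof(sketch_types[0]))

/*
 * Walk the type list. -EINVAL means "this type does not recognise what
 * is on disk", so keep searching. Any other result, success or failure,
 * ends the search.
 */
int sketch_probe_root(const char *const *types, size_t n,
		      sketch_mount_fn try_mount)
{
	int err = -EINVAL;
	size_t i;

	assert(types != NULL && try_mount != NULL);
	for (i = 0; i < n; i++) {
		err = try_mount(types[i]);
		if (err != -EINVAL)
			break;
	}
	return err;
}

/* Models the reported behaviour: xfs answers -ENOSYS for a btrfs device. */
int sketch_mount_reported(const char *fstype)
{
	if (strcmp(fstype, "btrfs") == 0)
		return 0;
	if (strcmp(fstype, "xfs") == 0)
		return -ENOSYS;
	return -EINVAL;
}

/* Models xfs answering -EINVAL for a device it does not recognise. */
int sketch_mount_einval(const char *fstype)
{
	if (strcmp(fstype, "btrfs") == 0)
		return 0;
	return -EINVAL;
}
```

Under these assumptions the first callback stops the search at xfs with -ENOSYS and btrfs is never tried, while the second lets the loop reach btrfs. The loop itself is the same in both cases; only the error code returned by the non-matching filesystem changes the outcome.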
Re: [PATCH] do_mounts: try all available filesystems before panicking
On Sun, May 25, 2014 at 01:04:09PM -0700, Linus Torvalds wrote:
>
> The fact is, I think xfs is just buggy. Returning 38 (ENOSYS) is
> totally insane. "No such system call"? Somebody is on some bad bad
> drugs. Not that the mount_block_root() loop and error handling might
> not be a good thing to perhaps tweak _too_, but at the very least your
> patch means that now it no longer prints out the error number at all.

There's only a single instance of ENOSYS in fs/xfs/xfs_mount.c:

	/*
	 * We must be able to do sector-sized and sector-aligned IO.
	 */
	if (sector_size > sbp->sb_sectsize) {
		if (loud)
			xfs_warn(mp, "device supports %u byte sectors (not %u)",
				sector_size, sbp->sb_sectsize);
		error = ENOSYS;
		goto release_buf;
	}

Plamen, does changing the ENOSYS to EINVAL above fix things for you?

> Anyway, I'm also not seeing why that xfs error would be new to 3.14,
> though.. Adding the XFS people to the cc.

If I had to guess, commit daba5427d is new to 3.14, and it might
explain the change in behavior.

Cheers,

					- Ted
Re: [PATCH] do_mounts: try all available filesystems before panicking
On Mon, May 5, 2014 at 11:34 AM, Plamen Petrov wrote:
>
> The story short: on systems with btrfs root I have a kernel .config with ext4,
> xfs and btrfs built-in which works fine with 3.13.x, but 3.14.x panics. After
> inserting some debug printks, I got this info from mount_block_root:
>
> ---> EACCESS=13, EINVAL=22, Available filesystems: ext3 ext2 ext4 fuseblk xfs btrfs
> -> Tried ext3, error code is -22.
> -> Tried ext2, error code is -22.
> -> Tried ext4, error code is -22.
> -> Tried fuseblk, error code is -22.
> -> Tried xfs, error code is -38.
> VFS: Cannot open root device "sda2" or unknown-block(8,2): error -38
> Please append a correct "root=" boot option; here are the available partitions:
>
> Last one tried is xfs, the needed btrfs in this case never gets a chance.
> Looking at the code in init/do_mounts.c we can see that it "continue"s only if
> the return code it got is EINVAL, yet xfs clearly does not fit - so the kernel
> panics. Maybe there are other filesystems like xfs - I did not check. This
> patch fixes mount_block_root to try all available filesystems first, and then
> panic. The patched 3.14.x works for me.

Hmm. I don't really dislike your patch, but it makes all the code
_after_ the switch-statement dead, since there is now no way to ever
fall through the switch statement.

So now that

        /*
         * Allow the user to distinguish between failed sys_open
         * and bad superblock on root device.
         * and give them a list of the available devices
         */

comment ends up being entirely stale, and the code after it is
pointless and it all looks very misleading. And I'm assuming somebody
cared about that difference at some point.

The fact is, I think xfs is just buggy. Returning 38 (ENOSYS) is
totally insane. "No such system call"? Somebody is on some bad bad
drugs. Not that the mount_block_root() loop and error handling might
not be a good thing to perhaps tweak _too_, but at the very least your
patch means that now it no longer prints out the error number at all.

Maybe just making it do something like the attached patch instead? It
doesn't panic on unrecognized errors, just prints them out (just once,
if it repeats). It also doesn't do the "goto repeat" if we already
have the RDONLY bit set, because if somebody is returning insane error
numbers, that could otherwise result in an endless loop.

Anyway, I'm also not seeing why that xfs error would be new to 3.14,
though.. Adding the XFS people to the cc.

Comments (patch obviously TOTALLY UNTESTED)

              Linus

 init/do_mounts.c | 24 ++--
 1 file changed, 6 insertions(+), 18 deletions(-)

diff --git a/init/do_mounts.c b/init/do_mounts.c
index 82f22885c87e..a6a725f46f18 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -385,6 +385,7 @@ void __init mount_block_root(char *name, int flags)
 #else
 	const char *b = name;
 #endif
+	int last_err = 0;
 
 	get_fs_names(fs_names);
 retry:
@@ -394,29 +395,16 @@ retry:
 			case 0:
 				goto out;
 			case -EACCES:
+				if (flags & MS_RDONLY)
+					break;
 				flags |= MS_RDONLY;
 				goto retry;
 			case -EINVAL:
 				continue;
 		}
-	        /*
-		 * Allow the user to distinguish between failed sys_open
-		 * and bad superblock on root device.
-		 * and give them a list of the available devices
-		 */
-#ifdef CONFIG_BLOCK
-		__bdevname(ROOT_DEV, b);
-#endif
-		printk("VFS: Cannot open root device \"%s\" or %s: error %d\n",
-				root_device_name, b, err);
-		printk("Please append a correct \"root=\" boot option; here are the available partitions:\n");
-
-		printk_all_partitions();
-#ifdef CONFIG_DEBUG_BLOCK_EXT_DEVT
-		printk("DEBUG_BLOCK_EXT_DEVT is enabled, you need to specify "
-		       "explicit textual name for \"root=\" boot option.\n");
-#endif
-		panic("VFS: Unable to mount root fs on %s", b);
+		if (err != last_err)
+			printk("VFS: Cannot open root device \"%s\" or %s: error %d\n",
+				root_device_name, b, err);
 	}
 
 	printk("List of all partitions:\n");
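The control flow this patch describes (keep trying on unrecognized errors, print each distinct error once, and retry read-only at most once) can be sketched in userspace C as follows. This is an illustration under assumptions, not the kernel function: all sketch_* names are hypothetical, SKETCH_RDONLY stands in for MS_RDONLY, and the assignment to last_err after reporting is an assumption of this sketch, since the diff as quoted does not show where last_err is updated.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_RDONLY 1		/* stand-in for MS_RDONLY */

/* Hypothetical callback: try one type with the given flags. */
typedef int (*sketch_mount_fn)(const char *fstype, int flags);

const char *const sketch_fs[] = {
	"ext3", "ext2", "ext4", "fuseblk", "xfs", "btrfs"
};
#define SKETCH_NFS (sizeof(sketch_fs) / sizeof(sketch_fs[0]))

int sketch_reports;		/* number of error lines printed */

int sketch_mount_root(const char *const *types, size_t n,
		      sketch_mount_fn try_mount, int flags)
{
	int last_err = 0;
	size_t i;

	assert(types != NULL && try_mount != NULL);
retry:
	for (i = 0; i < n; i++) {
		int err = try_mount(types[i], flags);

		switch (err) {
		case 0:
			return 0;
		case -EACCES:
			/* Already read-only: report it instead of looping. */
			if (flags & SKETCH_RDONLY)
				break;
			flags |= SKETCH_RDONLY;
			goto retry;
		case -EINVAL:
			continue;
		}
		/* Unrecognized error: print it once, then keep trying. */
		if (err != last_err) {
			fprintf(stderr, "cannot mount root as %s: error %d\n",
				types[i], err);
			sketch_reports++;
		}
		last_err = err;	/* assumption: not shown in the quoted diff */
	}
	return last_err ? last_err : -EINVAL;
}

/* A misbehaving filesystem that always answers -EACCES. */
int sketch_always_eacces(const char *fstype, int flags)
{
	(void)fstype;
	(void)flags;
	return -EACCES;
}

/* The case from this thread: xfs answers -ENOSYS, btrfs would succeed. */
int sketch_thread_case(const char *fstype, int flags)
{
	(void)flags;
	if (strcmp(fstype, "btrfs") == 0)
		return 0;
	if (strcmp(fstype, "xfs") == 0)
		return -ENOSYS;
	return -EINVAL;
}
```

In this sketch the -ENOSYS from xfs is printed and the loop still reaches btrfs, and a filesystem that keeps returning -EACCES produces a single report rather than an endless retry.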