Re: [PATCH RFC] btrfs: self heal from SB fail
If the old btrfs is over written with newer btrfs SB, and if mkfs.btrfs is not overwriting all the copies of SB then its a mkfs.btrfs bug. Nope. If the new btrfs is a smaller one than original btrfs, the 2nd super can still be there. Oh right, the mkfs.btrfs -b option, where we don't use entire device. Here the source of the problem is having the stale SB which is created by previous btrfs, and at the time of mkfs we do notice that there is a stale SB, so we already know how many copy of SB were there, but we are bit conservative not to clean all SBs. Its important to fix the problem end (mkfs), not the victim end (sb recovery). But looks like not everyone agrees to this view, and sb self heal will go wrong if mkfs does not clean all copy of SB, so now the only way to recover it is by user intervention, so I am ok to add mount -o sb_copy_num=<0|1|2>. Suggestions for any better name is welcome. Side check, any idea what's the use case around mkfs.btrfs -b option other than for testing ? Thanks, Anand -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] btrfs: self heal from SB fail
On 2017年12月08日 22:02, Anand Jain wrote: > > > On 12/08/2017 08:51 PM, Austin S. Hemmelgarn wrote: >> On 2017-12-08 02:57, Anand Jain wrote: >>> -EXPERIMENTAL- >>> As of now when primary SB fails we won't self heal and would fail mount, >>> this is an experimental patch which thinks why not go and read backup >>> copy. >> I like the concept, and actually think this should be default behavior >> on a filesystem that's already mounted (we fix other errors, why not >> SB's), > > >> but I don't think it should be default behavior at mount time for the >> reasons Qu has outlined (picking up old BTRFS SB's after reformatting >> is bad). > > If the primary SB is not btrfs, then unless forced with -t btrfs > option, this mount won't be in btrfs at all. That means it goes to > the respective FS by default. > > If the old btrfs is over written with newer btrfs SB, and if mkfs.btrfs > is not overwriting all the copies of SB then its a mkfs.btrfs bug. Nope. If the new btrfs is a smaller one than original btrfs, the 2nd super can still be there. (AFAIK all these examples have been given by David for several times) Thanks, Qu > > Which means there is already mount -t option to redirect to the FS > module that is needed by the user, when primary SB is corrupted. > We should use this feature. > > Thanks, Anand > > >> However, I do think it's useful to be able to ask for this behavior on >> mount, so that you don't need to fight with the programs to get a >> filesystem to mount when the first SB is missing (perhaps add a >> 'usebackupsb' option to mirror 'usebackuproot'?). >>> >>> Signed-off-by: Anand Jain >>> --- >>> fs/btrfs/disk-io.c | 8 +++- >>> fs/btrfs/volumes.c | 10 +++--- >>> 2 files changed, 14 insertions(+), 4 deletions(-) >>> >>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c >>> index 9b20c1f3563b..a791b8dfe8a8 100644 >>> --- a/fs/btrfs/disk-io.c >>> +++ b/fs/btrfs/disk-io.c >>> @@ -3190,7 +3190,7 @@ struct buffer_head *btrfs_read_dev_super(struct >>> block_device *bdev) >>> * So, we need to add a special mount option to scan for >>> * later supers, using BTRFS_SUPER_MIRROR_MAX instead >>> */ >>> - for (i = 0; i < 1; i++) { >>> + for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) { >>> ret = btrfs_read_dev_one_super(bdev, i, &bh); >>> if (ret) >>> continue; >>> @@ -4015,11 +4015,17 @@ static int btrfs_check_super_valid(struct >>> btrfs_fs_info *fs_info) >>> ret = -EINVAL; >>> } >>> +#if 0 >>> + /* >>> + * Need a way to check for any copy of SB, as its not a >>> + * strong check, just ignore this for now. >>> + */ >>> if (btrfs_super_bytenr(sb) != BTRFS_SUPER_INFO_OFFSET) { >>> btrfs_err(fs_info, "super offset mismatch %llu != %u", >>> btrfs_super_bytenr(sb), BTRFS_SUPER_INFO_OFFSET); >>> ret = -EINVAL; >>> } >>> +#endif >>> /* >>> * Obvious sys_chunk_array corruptions, it must hold at least >>> one key >>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c >>> index 9fa2539a8493..f368db94d62b 100644 >>> --- a/fs/btrfs/volumes.c >>> +++ b/fs/btrfs/volumes.c >>> @@ -1369,7 +1369,7 @@ int btrfs_scan_one_device(const char *path, >>> fmode_t flags, void *holder, >>> { >>> struct btrfs_super_block *disk_super; >>> struct block_device *bdev; >>> - struct page *page; >>> + struct buffer_head *sb_bh; >>> int ret = -EINVAL; >>> u64 devid; >>> u64 transid; >>> @@ -1392,8 +1392,12 @@ int btrfs_scan_one_device(const char *path, >>> fmode_t flags, void *holder, >>> goto error; >>> } >>> - if (btrfs_read_disk_super(bdev, bytenr, &page, &disk_super)) >>> + sb_bh = btrfs_read_dev_super(bdev); >>> + if (IS_ERR(sb_bh)) { >>> + ret = PTR_ERR(sb_bh); >>> goto error_bdev_put; >>> + } >>> + disk_super = (struct btrfs_super_block *) sb_bh->b_data; >>> devid = btrfs_stack_device_id(&disk_super->dev_item); >>> transid = btrfs_super_generation(disk_super); >>> @@ -1413,7 +1417,7 @@ int btrfs_scan_one_device(const char *path, >>> fmode_t flags, void *holder, >>> if (!ret && fs_devices_ret) >>> (*fs_devices_ret)->total_devices = total_devices; >>> - btrfs_release_disk_super(page); >>> + brelse(sb_bh); >>> error_bdev_put: >>> blkdev_put(bdev, flags); >>> >> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >> the body of a message to majord...@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html signature.asc Description: OpenPGP digital signature
RE: [PATCH RFC] btrfs: self heal from SB fail
> -Original Message- > From: linux-btrfs-ow...@vger.kernel.org [mailto:linux-btrfs- > ow...@vger.kernel.org] On Behalf Of Austin S. Hemmelgarn > Sent: Friday, 8 December 2017 11:51 PM > To: Anand Jain ; linux-btrfs@vger.kernel.org > Subject: Re: [PATCH RFC] btrfs: self heal from SB fail > > On 2017-12-08 02:57, Anand Jain wrote: > > -EXPERIMENTAL- > > As of now when primary SB fails we won't self heal and would fail > > mount, this is an experimental patch which thinks why not go and read > > backup copy. > I like the concept, and actually think this should be default behavior on a > filesystem that's already mounted (we fix other errors, why not SB's), but I > don't think it should be default behavior at mount time for the reasons Qu > has outlined (picking up old BTRFS SB's after reformatting is bad). However, > I > do think it's useful to be able to ask for this behavior on mount, so that you > don't need to fight with the programs to get a filesystem to mount when the > first SB is missing (perhaps add a 'usebackupsb' option to mirror > 'usebackuproot'?). I agree with this. The behaviour I'd like to see would be refusal to mount (without additional mount options) but also: print the needed info to the kernel log so the user can add the required mount option or read the wiki for more information, and print some diagnostic info on the primary + secondary super blocks. Paul.
Re: [PATCH RFC] btrfs: self heal from SB fail
On 12/08/2017 08:51 PM, Austin S. Hemmelgarn wrote: On 2017-12-08 02:57, Anand Jain wrote: -EXPERIMENTAL- As of now when primary SB fails we won't self heal and would fail mount, this is an experimental patch which thinks why not go and read backup copy. I like the concept, and actually think this should be default behavior on a filesystem that's already mounted (we fix other errors, why not SB's), but I don't think it should be default behavior at mount time for the reasons Qu has outlined (picking up old BTRFS SB's after reformatting is bad). If the primary SB is not btrfs, then unless forced with -t btrfs option, this mount won't be in btrfs at all. That means it goes to the respective FS by default. If the old btrfs is over written with newer btrfs SB, and if mkfs.btrfs is not overwriting all the copies of SB then its a mkfs.btrfs bug. Which means there is already mount -t option to redirect to the FS module that is needed by the user, when primary SB is corrupted. We should use this feature. Thanks, Anand However, I do think it's useful to be able to ask for this behavior on mount, so that you don't need to fight with the programs to get a filesystem to mount when the first SB is missing (perhaps add a 'usebackupsb' option to mirror 'usebackuproot'?). Signed-off-by: Anand Jain --- fs/btrfs/disk-io.c | 8 +++- fs/btrfs/volumes.c | 10 +++--- 2 files changed, 14 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 9b20c1f3563b..a791b8dfe8a8 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3190,7 +3190,7 @@ struct buffer_head *btrfs_read_dev_super(struct block_device *bdev) * So, we need to add a special mount option to scan for * later supers, using BTRFS_SUPER_MIRROR_MAX instead */ - for (i = 0; i < 1; i++) { + for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) { ret = btrfs_read_dev_one_super(bdev, i, &bh); if (ret) continue; @@ -4015,11 +4015,17 @@ static int btrfs_check_super_valid(struct btrfs_fs_info *fs_info) ret = -EINVAL; } +#if 0 + /* + * Need a way to check for any copy of SB, as its not a + * strong check, just ignore this for now. + */ if (btrfs_super_bytenr(sb) != BTRFS_SUPER_INFO_OFFSET) { btrfs_err(fs_info, "super offset mismatch %llu != %u", btrfs_super_bytenr(sb), BTRFS_SUPER_INFO_OFFSET); ret = -EINVAL; } +#endif /* * Obvious sys_chunk_array corruptions, it must hold at least one key diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 9fa2539a8493..f368db94d62b 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1369,7 +1369,7 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, { struct btrfs_super_block *disk_super; struct block_device *bdev; - struct page *page; + struct buffer_head *sb_bh; int ret = -EINVAL; u64 devid; u64 transid; @@ -1392,8 +1392,12 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, goto error; } - if (btrfs_read_disk_super(bdev, bytenr, &page, &disk_super)) + sb_bh = btrfs_read_dev_super(bdev); + if (IS_ERR(sb_bh)) { + ret = PTR_ERR(sb_bh); goto error_bdev_put; + } + disk_super = (struct btrfs_super_block *) sb_bh->b_data; devid = btrfs_stack_device_id(&disk_super->dev_item); transid = btrfs_super_generation(disk_super); @@ -1413,7 +1417,7 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, if (!ret && fs_devices_ret) (*fs_devices_ret)->total_devices = total_devices; - btrfs_release_disk_super(page); + brelse(sb_bh); error_bdev_put: blkdev_put(bdev, flags); -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] btrfs: self heal from SB fail
On 2017年12月08日 21:05, Austin S. Hemmelgarn wrote: > On 2017-12-08 07:59, Qu Wenruo wrote: >> >> >> On 2017年12月08日 20:51, Austin S. Hemmelgarn wrote: >>> On 2017-12-08 02:57, Anand Jain wrote: -EXPERIMENTAL- As of now when primary SB fails we won't self heal and would fail mount, this is an experimental patch which thinks why not go and read backup copy. >>> I like the concept, and actually think this should be default behavior >>> on a filesystem that's already mounted (we fix other errors, why not >>> SB's), but I don't think it should be default behavior at mount time for >>> the reasons Qu has outlined (picking up old BTRFS SB's after >>> reformatting is bad). However, I do think it's useful to be able to ask >>> for this behavior on mount, so that you don't need to fight with the >>> programs to get a filesystem to mount when the first SB is missing >>> (perhaps add a 'usebackupsb' option to mirror 'usebackuproot'?). >> >> Yeah, I also like the idea of 'usebackupsuper'/'usebackupsuper=n' mount >> option than do it automatically. > I still think there should be an option to do automatic detection (it's > not particularly hard, and it's not likely to do the wrong thing in most > cases), but being able to explicitly specify a particular superblock for > a mount is definitely a step in the right direction. usebackupsuper <- Auto usebackupsuper=n <- Manual usebackupsuper=n,m <- Manual multi, in given order (The last one seems a little overkilled though) Thanks, Qu >> >> Thanks, >> Qu >> Signed-off-by: Anand Jain --- fs/btrfs/disk-io.c | 8 +++- fs/btrfs/volumes.c | 10 +++--- 2 files changed, 14 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 9b20c1f3563b..a791b8dfe8a8 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3190,7 +3190,7 @@ struct buffer_head *btrfs_read_dev_super(struct block_device *bdev) * So, we need to add a special mount option to scan for * later supers, using BTRFS_SUPER_MIRROR_MAX instead */ - for (i = 0; i < 1; i++) { + for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) { ret = btrfs_read_dev_one_super(bdev, i, &bh); if (ret) continue; @@ -4015,11 +4015,17 @@ static int btrfs_check_super_valid(struct btrfs_fs_info *fs_info) ret = -EINVAL; } +#if 0 + /* + * Need a way to check for any copy of SB, as its not a + * strong check, just ignore this for now. + */ if (btrfs_super_bytenr(sb) != BTRFS_SUPER_INFO_OFFSET) { btrfs_err(fs_info, "super offset mismatch %llu != %u", btrfs_super_bytenr(sb), BTRFS_SUPER_INFO_OFFSET); ret = -EINVAL; } +#endif /* * Obvious sys_chunk_array corruptions, it must hold at least one key diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 9fa2539a8493..f368db94d62b 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1369,7 +1369,7 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, { struct btrfs_super_block *disk_super; struct block_device *bdev; - struct page *page; + struct buffer_head *sb_bh; int ret = -EINVAL; u64 devid; u64 transid; @@ -1392,8 +1392,12 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, goto error; } - if (btrfs_read_disk_super(bdev, bytenr, &page, &disk_super)) + sb_bh = btrfs_read_dev_super(bdev); + if (IS_ERR(sb_bh)) { + ret = PTR_ERR(sb_bh); goto error_bdev_put; + } + disk_super = (struct btrfs_super_block *) sb_bh->b_data; devid = btrfs_stack_device_id(&disk_super->dev_item); transid = btrfs_super_generation(disk_super); @@ -1413,7 +1417,7 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, if (!ret && fs_devices_ret) (*fs_devices_ret)->total_devices = total_devices; - btrfs_release_disk_super(page); + brelse(sb_bh); error_bdev_put: blkdev_put(bdev, flags); > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html signature.asc Description: OpenPGP digital signature
Re: [PATCH RFC] btrfs: self heal from SB fail
On 2017-12-08 07:59, Qu Wenruo wrote: On 2017年12月08日 20:51, Austin S. Hemmelgarn wrote: On 2017-12-08 02:57, Anand Jain wrote: -EXPERIMENTAL- As of now when primary SB fails we won't self heal and would fail mount, this is an experimental patch which thinks why not go and read backup copy. I like the concept, and actually think this should be default behavior on a filesystem that's already mounted (we fix other errors, why not SB's), but I don't think it should be default behavior at mount time for the reasons Qu has outlined (picking up old BTRFS SB's after reformatting is bad). However, I do think it's useful to be able to ask for this behavior on mount, so that you don't need to fight with the programs to get a filesystem to mount when the first SB is missing (perhaps add a 'usebackupsb' option to mirror 'usebackuproot'?). Yeah, I also like the idea of 'usebackupsuper'/'usebackupsuper=n' mount option than do it automatically. I still think there should be an option to do automatic detection (it's not particularly hard, and it's not likely to do the wrong thing in most cases), but being able to explicitly specify a particular superblock for a mount is definitely a step in the right direction. Thanks, Qu Signed-off-by: Anand Jain --- fs/btrfs/disk-io.c | 8 +++- fs/btrfs/volumes.c | 10 +++--- 2 files changed, 14 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 9b20c1f3563b..a791b8dfe8a8 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3190,7 +3190,7 @@ struct buffer_head *btrfs_read_dev_super(struct block_device *bdev) * So, we need to add a special mount option to scan for * later supers, using BTRFS_SUPER_MIRROR_MAX instead */ - for (i = 0; i < 1; i++) { + for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) { ret = btrfs_read_dev_one_super(bdev, i, &bh); if (ret) continue; @@ -4015,11 +4015,17 @@ static int btrfs_check_super_valid(struct btrfs_fs_info *fs_info) ret = -EINVAL; } +#if 0 + /* + * Need a way to check for any copy of SB, as its not a + * strong check, just ignore this for now. + */ if (btrfs_super_bytenr(sb) != BTRFS_SUPER_INFO_OFFSET) { btrfs_err(fs_info, "super offset mismatch %llu != %u", btrfs_super_bytenr(sb), BTRFS_SUPER_INFO_OFFSET); ret = -EINVAL; } +#endif /* * Obvious sys_chunk_array corruptions, it must hold at least one key diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 9fa2539a8493..f368db94d62b 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1369,7 +1369,7 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, { struct btrfs_super_block *disk_super; struct block_device *bdev; - struct page *page; + struct buffer_head *sb_bh; int ret = -EINVAL; u64 devid; u64 transid; @@ -1392,8 +1392,12 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, goto error; } - if (btrfs_read_disk_super(bdev, bytenr, &page, &disk_super)) + sb_bh = btrfs_read_dev_super(bdev); + if (IS_ERR(sb_bh)) { + ret = PTR_ERR(sb_bh); goto error_bdev_put; + } + disk_super = (struct btrfs_super_block *) sb_bh->b_data; devid = btrfs_stack_device_id(&disk_super->dev_item); transid = btrfs_super_generation(disk_super); @@ -1413,7 +1417,7 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, if (!ret && fs_devices_ret) (*fs_devices_ret)->total_devices = total_devices; - btrfs_release_disk_super(page); + brelse(sb_bh); error_bdev_put: blkdev_put(bdev, flags); -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] btrfs: self heal from SB fail
On 12/08/2017 08:12 PM, Nikolay Borisov wrote: On 8.12.2017 12:33, Anand Jain wrote: On 12/08/2017 04:40 PM, Nikolay Borisov wrote: On 8.12.2017 09:57, Anand Jain wrote: -EXPERIMENTAL- As of now when primary SB fails we won't self heal and would fail mount, this is an experimental patch which thinks why not go and read backup copy. Signed-off-by: Anand Jain --- fs/btrfs/disk-io.c | 8 +++- fs/btrfs/volumes.c | 10 +++--- 2 files changed, 14 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 9b20c1f3563b..a791b8dfe8a8 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3190,7 +3190,7 @@ struct buffer_head *btrfs_read_dev_super(struct block_device *bdev) * So, we need to add a special mount option to scan for * later supers, using BTRFS_SUPER_MIRROR_MAX instead */ - for (i = 0; i < 1; i++) { + for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) { ret = btrfs_read_dev_one_super(bdev, i, &bh); if (ret) continue; @@ -4015,11 +4015,17 @@ static int btrfs_check_super_valid(struct btrfs_fs_info *fs_info) ret = -EINVAL; } +#if 0 + /* + * Need a way to check for any copy of SB, as its not a + * strong check, just ignore this for now. + */ if (btrfs_super_bytenr(sb) != BTRFS_SUPER_INFO_OFFSET) { btrfs_err(fs_info, "super offset mismatch %llu != %u", btrfs_super_bytenr(sb), BTRFS_SUPER_INFO_OFFSET); ret = -EINVAL; } +#endif /* * Obvious sys_chunk_array corruptions, it must hold at least one key diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 9fa2539a8493..f368db94d62b 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1369,7 +1369,7 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, { struct btrfs_super_block *disk_super; struct block_device *bdev; - struct page *page; + struct buffer_head *sb_bh; int ret = -EINVAL; u64 devid; u64 transid; @@ -1392,8 +1392,12 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, goto error; } - if (btrfs_read_disk_super(bdev, bytenr, &page, &disk_super)) + sb_bh = btrfs_read_dev_super(bdev); This patch prompts another question: why do we have a page-based and a bufferhead-based interface to reading the super block? Right. we need to know that. Sorry I just saw this. FWIW unless we explicitly need a per-block state tracking (which I don't think we do) we should ideally switch to page-based mechanism. Buffer heads are considered deprecated. Also the only reason why we do have btrfsic_submit_bh is for the superblock bufer head. So potentially by removing the only requirement in the kernel where bh are used we can simplify the btrfsic code as well . Just something to keep in mind. Thanks. These are related to the other email thread [1] [1] [Question] Two variants of SB reads can we move into bio read instead ? We can follow up there. ? Could we ? That cleanup is due for a long time now. Thanks, Anand I have a very old patch to converge them and clean this up, but haven't sent it because there are some missing information on why it ended up like that in the first place. Thanks, Anand I did prototype switching the bufferheads to page based but the resulting code wasn't any cleaner. I believe there is also open the question what happens when btrfs is run on a 64k page machine. I.e. we are going to read a single page and the sb is going to be in the first 4k but what about the rest 60, they could potentially contain other metadata. The page will have to be freed asap so as not to peg the neighboring metadata? + if (IS_ERR(sb_bh)) { + ret = PTR_ERR(sb_bh); goto error_bdev_put; + } + disk_super = (struct btrfs_super_block *) sb_bh->b_data; devid = btrfs_stack_device_id(&disk_super->dev_item); transid = btrfs_super_generation(disk_super); @@ -1413,7 +1417,7 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, if (!ret && fs_devices_ret) (*fs_devices_ret)->total_devices = total_devices; - btrfs_release_disk_super(page); + brelse(sb_bh); error_bdev_put: blkdev_put(bdev, flags); -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] btrfs: self heal from SB fail
On 2017年12月08日 20:51, Austin S. Hemmelgarn wrote: > On 2017-12-08 02:57, Anand Jain wrote: >> -EXPERIMENTAL- >> As of now when primary SB fails we won't self heal and would fail mount, >> this is an experimental patch which thinks why not go and read backup >> copy. > I like the concept, and actually think this should be default behavior > on a filesystem that's already mounted (we fix other errors, why not > SB's), but I don't think it should be default behavior at mount time for > the reasons Qu has outlined (picking up old BTRFS SB's after > reformatting is bad). However, I do think it's useful to be able to ask > for this behavior on mount, so that you don't need to fight with the > programs to get a filesystem to mount when the first SB is missing > (perhaps add a 'usebackupsb' option to mirror 'usebackuproot'?). Yeah, I also like the idea of 'usebackupsuper'/'usebackupsuper=n' mount option than do it automatically. Thanks, Qu >> >> Signed-off-by: Anand Jain >> --- >> fs/btrfs/disk-io.c | 8 +++- >> fs/btrfs/volumes.c | 10 +++--- >> 2 files changed, 14 insertions(+), 4 deletions(-) >> >> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c >> index 9b20c1f3563b..a791b8dfe8a8 100644 >> --- a/fs/btrfs/disk-io.c >> +++ b/fs/btrfs/disk-io.c >> @@ -3190,7 +3190,7 @@ struct buffer_head *btrfs_read_dev_super(struct >> block_device *bdev) >> * So, we need to add a special mount option to scan for >> * later supers, using BTRFS_SUPER_MIRROR_MAX instead >> */ >> - for (i = 0; i < 1; i++) { >> + for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) { >> ret = btrfs_read_dev_one_super(bdev, i, &bh); >> if (ret) >> continue; >> @@ -4015,11 +4015,17 @@ static int btrfs_check_super_valid(struct >> btrfs_fs_info *fs_info) >> ret = -EINVAL; >> } >> +#if 0 >> + /* >> + * Need a way to check for any copy of SB, as its not a >> + * strong check, just ignore this for now. >> + */ >> if (btrfs_super_bytenr(sb) != BTRFS_SUPER_INFO_OFFSET) { >> btrfs_err(fs_info, "super offset mismatch %llu != %u", >> btrfs_super_bytenr(sb), BTRFS_SUPER_INFO_OFFSET); >> ret = -EINVAL; >> } >> +#endif >> /* >> * Obvious sys_chunk_array corruptions, it must hold at least >> one key >> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c >> index 9fa2539a8493..f368db94d62b 100644 >> --- a/fs/btrfs/volumes.c >> +++ b/fs/btrfs/volumes.c >> @@ -1369,7 +1369,7 @@ int btrfs_scan_one_device(const char *path, >> fmode_t flags, void *holder, >> { >> struct btrfs_super_block *disk_super; >> struct block_device *bdev; >> - struct page *page; >> + struct buffer_head *sb_bh; >> int ret = -EINVAL; >> u64 devid; >> u64 transid; >> @@ -1392,8 +1392,12 @@ int btrfs_scan_one_device(const char *path, >> fmode_t flags, void *holder, >> goto error; >> } >> - if (btrfs_read_disk_super(bdev, bytenr, &page, &disk_super)) >> + sb_bh = btrfs_read_dev_super(bdev); >> + if (IS_ERR(sb_bh)) { >> + ret = PTR_ERR(sb_bh); >> goto error_bdev_put; >> + } >> + disk_super = (struct btrfs_super_block *) sb_bh->b_data; >> devid = btrfs_stack_device_id(&disk_super->dev_item); >> transid = btrfs_super_generation(disk_super); >> @@ -1413,7 +1417,7 @@ int btrfs_scan_one_device(const char *path, >> fmode_t flags, void *holder, >> if (!ret && fs_devices_ret) >> (*fs_devices_ret)->total_devices = total_devices; >> - btrfs_release_disk_super(page); >> + brelse(sb_bh); >> error_bdev_put: >> blkdev_put(bdev, flags); >> > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html signature.asc Description: OpenPGP digital signature
Re: [PATCH RFC] btrfs: self heal from SB fail
On 2017年12月08日 20:41, Anand Jain wrote: > > > On 12/08/2017 08:02 PM, Qu Wenruo wrote: >> >> >> On 2017年12月08日 19:48, Anand Jain wrote: >>> >>> >>> On 12/08/2017 07:01 PM, Qu Wenruo wrote: On 2017年12月08日 18:39, Anand Jain wrote: > > > On 12/08/2017 04:17 PM, Qu Wenruo wrote: >> >> >> On 2017年12月08日 15:57, Anand Jain wrote: >>> -EXPERIMENTAL- >>> As of now when primary SB fails we won't self heal and would fail >>> mount, >>> this is an experimental patch which thinks why not go and read >>> backup >>> copy. >> >> Just curious about in which real world case that backup super >> block can >> help. >> At least from what I see in mail list, only few cases where backup >> super >> helps. > > Theoretical design helps. I ended up in this situation though. And > ext4 has -o sb flag to manage this part. When we can expect EIO on > any part of the disk block why not on the LBA which contains > primary > SB. And should we fail the mount for that reason ? No. And how do you ensure it's a btrfs? >>> >>> Hmm. You mean outside of btrfs ? I did experiment with wipe and then >>> using /etc/fstab to mount, and it did lead to btrfs, is that your >>> concern that it shouldn't have been. That looked surprising to me as >>> well, but then problem points at wipefs instead. >> >> It's closer, but still doesn't reach the point. >> It can be a mkfs, other than wipefs. >> >> It's can be another case like: >> >> One user used btrfs for a while >> But bugs made him/her unhappy, and he/she turned to use xfs (whatever >> the fs is) instead. >> >> While he/she forgot to change its fstab, and rebooted the system. >> >> And suddenly, it's btrfs again! >> What a surprise. >> > > I have worked quite a lot on defining the problem statements before > btrfs. Here the user has to be blamed to have forgotten to update > the fstab. I don't understand why should we workaround in btrfs for > the reason that user may miss to update fstab. We don't design > a stuff that like that, we design for the purpose, here backup SB > is for the purpose that we all know, if we don't use backup SB, then > its an incomplete design. Then implement something like ext*, using explicit mount option sb=n. And since it's already called "backup", it's something used for recovery, not for normal operation. We already have rescue tool to recover sb from backup, so it's not a incomplete design. Personally speaking, I prefer the current way and leave backup just as backup. You can my example of a user fault, but the real world needs us to handle user's fault. Or there will be no need for mkfs -f options or rm --no-preserve-root options at all. Thanks, Qu > >>> > >> Despite that self super heal seems good, although I agree with >> David, we >> need a weaker but necessary check (magic and fsid from primary >> super?) >> to ensure it's a valid btrfs before we use the backup supers. > > Of course, we already have btrfs_check_super_valid() to verify > the SB, > I don't understand why - how do we verify the SB should be the > point of > concern here, at all. The point here is, to distinguish an old and invalid btrfs (some other valid fs mkfs beyond the old fs) from a valid btrfs with corrupted primary fs. >>> >>> Ok. When you check all the SBs you would pick the one which has passed >>> btrfs_check_super_valid() and has highest generation. Did I ans your >>> concern ? If a SB does not pass btrfs_check_super_valid() its not a >>> valid btrfs SB at all. >> >> No, not really. >> >> What if the first SB is a XFS one or even a fs you didn't ever hear? >> >> Skip it and use the 2nd one? >> This filesystem is not even btrfs anymore. > > There is no problem is user does the correct thing that is to > update the fstab. OR using a complete mount command. If user > using -t btrfs OR unupdated btrfs then it indicates his intentions. > > Thanks, Anand > >> Thanks, >> Qu >> >>> >>> This the main concern here. The difference between recovery and recognizing a bad btrfs is quite important. >>> >>> btrfs_check_super_valid() is already doing that right ? The point here >>> is, should we use the backup SB when btrfs_check_super_valid() >>> fails on >>> primary SB. >>> >>> Thanks, Anand >>> Thanks, Qu > > Thanks, Anand > >> Thanks, >> Qu >> >> >>> >>> Signed-off-by: Anand Jain >>> --- >>> fs/btrfs/disk-io.c | 8 +++- >>> fs/btrfs/volumes.c | 10 +++--- >>> 2 files changed, 14 insertions(+), 4 deletions(-) >>> >>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c >>> index 9b20c1f3563b..a791b8dfe8a8 100644 >>> --- a/fs/btrfs/disk-io.c >>> +++ b/fs/btrfs/disk-io.c >>> @@ -3190,7 +3190,7 @@ struct buffer_head >>
Re: [PATCH RFC] btrfs: self heal from SB fail
On 2017-12-08 02:57, Anand Jain wrote: -EXPERIMENTAL- As of now when primary SB fails we won't self heal and would fail mount, this is an experimental patch which thinks why not go and read backup copy. I like the concept, and actually think this should be default behavior on a filesystem that's already mounted (we fix other errors, why not SB's), but I don't think it should be default behavior at mount time for the reasons Qu has outlined (picking up old BTRFS SB's after reformatting is bad). However, I do think it's useful to be able to ask for this behavior on mount, so that you don't need to fight with the programs to get a filesystem to mount when the first SB is missing (perhaps add a 'usebackupsb' option to mirror 'usebackuproot'?). Signed-off-by: Anand Jain --- fs/btrfs/disk-io.c | 8 +++- fs/btrfs/volumes.c | 10 +++--- 2 files changed, 14 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 9b20c1f3563b..a791b8dfe8a8 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3190,7 +3190,7 @@ struct buffer_head *btrfs_read_dev_super(struct block_device *bdev) * So, we need to add a special mount option to scan for * later supers, using BTRFS_SUPER_MIRROR_MAX instead */ - for (i = 0; i < 1; i++) { + for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) { ret = btrfs_read_dev_one_super(bdev, i, &bh); if (ret) continue; @@ -4015,11 +4015,17 @@ static int btrfs_check_super_valid(struct btrfs_fs_info *fs_info) ret = -EINVAL; } +#if 0 + /* +* Need a way to check for any copy of SB, as its not a +* strong check, just ignore this for now. +*/ if (btrfs_super_bytenr(sb) != BTRFS_SUPER_INFO_OFFSET) { btrfs_err(fs_info, "super offset mismatch %llu != %u", btrfs_super_bytenr(sb), BTRFS_SUPER_INFO_OFFSET); ret = -EINVAL; } +#endif /* * Obvious sys_chunk_array corruptions, it must hold at least one key diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 9fa2539a8493..f368db94d62b 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1369,7 +1369,7 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, { struct btrfs_super_block *disk_super; struct block_device *bdev; - struct page *page; + struct buffer_head *sb_bh; int ret = -EINVAL; u64 devid; u64 transid; @@ -1392,8 +1392,12 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, goto error; } - if (btrfs_read_disk_super(bdev, bytenr, &page, &disk_super)) + sb_bh = btrfs_read_dev_super(bdev); + if (IS_ERR(sb_bh)) { + ret = PTR_ERR(sb_bh); goto error_bdev_put; + } + disk_super = (struct btrfs_super_block *) sb_bh->b_data; devid = btrfs_stack_device_id(&disk_super->dev_item); transid = btrfs_super_generation(disk_super); @@ -1413,7 +1417,7 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, if (!ret && fs_devices_ret) (*fs_devices_ret)->total_devices = total_devices; - btrfs_release_disk_super(page); + brelse(sb_bh); error_bdev_put: blkdev_put(bdev, flags); -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] btrfs: self heal from SB fail
On 12/08/2017 08:02 PM, Qu Wenruo wrote: On 2017年12月08日 19:48, Anand Jain wrote: On 12/08/2017 07:01 PM, Qu Wenruo wrote: On 2017年12月08日 18:39, Anand Jain wrote: On 12/08/2017 04:17 PM, Qu Wenruo wrote: On 2017年12月08日 15:57, Anand Jain wrote: -EXPERIMENTAL- As of now when primary SB fails we won't self heal and would fail mount, this is an experimental patch which thinks why not go and read backup copy. Just curious about in which real world case that backup super block can help. At least from what I see in mail list, only few cases where backup super helps. Theoretical design helps. I ended up in this situation though. And ext4 has -o sb flag to manage this part. When we can expect EIO on any part of the disk block why not on the LBA which contains primary SB. And should we fail the mount for that reason ? No. And how do you ensure it's a btrfs? Hmm. You mean outside of btrfs ? I did experiment with wipe and then using /etc/fstab to mount, and it did lead to btrfs, is that your concern that it shouldn't have been. That looked surprising to me as well, but then problem points at wipefs instead. It's closer, but still doesn't reach the point. It can be a mkfs, other than wipefs. It's can be another case like: One user used btrfs for a while But bugs made him/her unhappy, and he/she turned to use xfs (whatever the fs is) instead. While he/she forgot to change its fstab, and rebooted the system. And suddenly, it's btrfs again! What a surprise. I have worked quite a lot on defining the problem statements before btrfs. Here the user has to be blamed to have forgotten to update the fstab. I don't understand why should we workaround in btrfs for the reason that user may miss to update fstab. We don't design a stuff that like that, we design for the purpose, here backup SB is for the purpose that we all know, if we don't use backup SB, then its an incomplete design. Despite that self super heal seems good, although I agree with David, we need a weaker but necessary check (magic and fsid from primary super?) to ensure it's a valid btrfs before we use the backup supers. Of course, we already have btrfs_check_super_valid() to verify the SB, I don't understand why - how do we verify the SB should be the point of concern here, at all. The point here is, to distinguish an old and invalid btrfs (some other valid fs mkfs beyond the old fs) from a valid btrfs with corrupted primary fs. Ok. When you check all the SBs you would pick the one which has passed btrfs_check_super_valid() and has highest generation. Did I ans your concern ? If a SB does not pass btrfs_check_super_valid() its not a valid btrfs SB at all. No, not really. What if the first SB is a XFS one or even a fs you didn't ever hear? Skip it and use the 2nd one? This filesystem is not even btrfs anymore. There is no problem is user does the correct thing that is to update the fstab. OR using a complete mount command. If user using -t btrfs OR unupdated btrfs then it indicates his intentions. Thanks, Anand Thanks, Qu This the main concern here. The difference between recovery and recognizing a bad btrfs is quite important. btrfs_check_super_valid() is already doing that right ? The point here is, should we use the backup SB when btrfs_check_super_valid() fails on primary SB. Thanks, Anand Thanks, Qu Thanks, Anand Thanks, Qu Signed-off-by: Anand Jain --- fs/btrfs/disk-io.c | 8 +++- fs/btrfs/volumes.c | 10 +++--- 2 files changed, 14 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 9b20c1f3563b..a791b8dfe8a8 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3190,7 +3190,7 @@ struct buffer_head *btrfs_read_dev_super(struct block_device *bdev) * So, we need to add a special mount option to scan for * later supers, using BTRFS_SUPER_MIRROR_MAX instead */ - for (i = 0; i < 1; i++) { + for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) { ret = btrfs_read_dev_one_super(bdev, i, &bh); if (ret) continue; @@ -4015,11 +4015,17 @@ static int btrfs_check_super_valid(struct btrfs_fs_info *fs_info) ret = -EINVAL; } +#if 0 + /* + * Need a way to check for any copy of SB, as its not a + * strong check, just ignore this for now. + */ if (btrfs_super_bytenr(sb) != BTRFS_SUPER_INFO_OFFSET) { btrfs_err(fs_info, "super offset mismatch %llu != %u", btrfs_super_bytenr(sb), BTRFS_SUPER_INFO_OFFSET); ret = -EINVAL; } +#endif /* * Obvious sys_chunk_array corruptions, it must hold at least one key diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 9fa2539a8493..f368db94d62b 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1369,7 +1369,7 @@ int btrfs_scan_one_device(const char *path, fmode_t
Re: [PATCH RFC] btrfs: self heal from SB fail
On 12/08/2017 09:17 AM, Qu Wenruo wrote: > > > On 2017年12月08日 15:57, Anand Jain wrote: >> -EXPERIMENTAL- >> As of now when primary SB fails we won't self heal and would fail mount, >> this is an experimental patch which thinks why not go and read backup >> copy. > > Just curious about in which real world case that backup super block can > help. > At least from what I see in mail list, only few cases where backup super > helps. That's about using 'backup roots' with old info about previous generations than the current one on disk, which is a different thing than using one of the other copies of the super block itself. > Despite that self super heal seems good, although I agree with David, we > need a weaker but necessary check (magic and fsid from primary super?) > to ensure it's a valid btrfs before we use the backup supers. > > Thanks, > Qu > > >> >> Signed-off-by: Anand Jain >> --- >> fs/btrfs/disk-io.c | 8 +++- >> fs/btrfs/volumes.c | 10 +++--- >> 2 files changed, 14 insertions(+), 4 deletions(-) >> >> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c >> index 9b20c1f3563b..a791b8dfe8a8 100644 >> --- a/fs/btrfs/disk-io.c >> +++ b/fs/btrfs/disk-io.c >> @@ -3190,7 +3190,7 @@ struct buffer_head *btrfs_read_dev_super(struct >> block_device *bdev) >> * So, we need to add a special mount option to scan for >> * later supers, using BTRFS_SUPER_MIRROR_MAX instead >> */ >> -for (i = 0; i < 1; i++) { >> +for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) { >> ret = btrfs_read_dev_one_super(bdev, i, &bh); >> if (ret) >> continue; >> @@ -4015,11 +4015,17 @@ static int btrfs_check_super_valid(struct >> btrfs_fs_info *fs_info) >> ret = -EINVAL; >> } >> >> +#if 0 >> +/* >> + * Need a way to check for any copy of SB, as its not a >> + * strong check, just ignore this for now. >> + */ >> if (btrfs_super_bytenr(sb) != BTRFS_SUPER_INFO_OFFSET) { >> btrfs_err(fs_info, "super offset mismatch %llu != %u", >>btrfs_super_bytenr(sb), BTRFS_SUPER_INFO_OFFSET); >> ret = -EINVAL; >> } >> +#endif >> >> /* >> * Obvious sys_chunk_array corruptions, it must hold at least one key >> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c >> index 9fa2539a8493..f368db94d62b 100644 >> --- a/fs/btrfs/volumes.c >> +++ b/fs/btrfs/volumes.c >> @@ -1369,7 +1369,7 @@ int btrfs_scan_one_device(const char *path, fmode_t >> flags, void *holder, >> { >> struct btrfs_super_block *disk_super; >> struct block_device *bdev; >> -struct page *page; >> +struct buffer_head *sb_bh; >> int ret = -EINVAL; >> u64 devid; >> u64 transid; >> @@ -1392,8 +1392,12 @@ int btrfs_scan_one_device(const char *path, fmode_t >> flags, void *holder, >> goto error; >> } >> >> -if (btrfs_read_disk_super(bdev, bytenr, &page, &disk_super)) >> +sb_bh = btrfs_read_dev_super(bdev); >> +if (IS_ERR(sb_bh)) { >> +ret = PTR_ERR(sb_bh); >> goto error_bdev_put; >> +} >> +disk_super = (struct btrfs_super_block *) sb_bh->b_data; >> >> devid = btrfs_stack_device_id(&disk_super->dev_item); >> transid = btrfs_super_generation(disk_super); >> @@ -1413,7 +1417,7 @@ int btrfs_scan_one_device(const char *path, fmode_t >> flags, void *holder, >> if (!ret && fs_devices_ret) >> (*fs_devices_ret)->total_devices = total_devices; >> >> -btrfs_release_disk_super(page); >> +brelse(sb_bh); >> >> error_bdev_put: >> blkdev_put(bdev, flags); >> > -- Hans van Kranenburg -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] btrfs: self heal from SB fail
On 8.12.2017 12:33, Anand Jain wrote: > > > On 12/08/2017 04:40 PM, Nikolay Borisov wrote: >> >> >> On 8.12.2017 09:57, Anand Jain wrote: >>> -EXPERIMENTAL- >>> As of now when primary SB fails we won't self heal and would fail mount, >>> this is an experimental patch which thinks why not go and read backup >>> copy. >>> >>> Signed-off-by: Anand Jain >>> --- >>> fs/btrfs/disk-io.c | 8 +++- >>> fs/btrfs/volumes.c | 10 +++--- >>> 2 files changed, 14 insertions(+), 4 deletions(-) >>> >>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c >>> index 9b20c1f3563b..a791b8dfe8a8 100644 >>> --- a/fs/btrfs/disk-io.c >>> +++ b/fs/btrfs/disk-io.c >>> @@ -3190,7 +3190,7 @@ struct buffer_head *btrfs_read_dev_super(struct >>> block_device *bdev) >>> * So, we need to add a special mount option to scan for >>> * later supers, using BTRFS_SUPER_MIRROR_MAX instead >>> */ >>> - for (i = 0; i < 1; i++) { >>> + for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) { >>> ret = btrfs_read_dev_one_super(bdev, i, &bh); >>> if (ret) >>> continue; >>> @@ -4015,11 +4015,17 @@ static int btrfs_check_super_valid(struct >>> btrfs_fs_info *fs_info) >>> ret = -EINVAL; >>> } >>> +#if 0 >>> + /* >>> + * Need a way to check for any copy of SB, as its not a >>> + * strong check, just ignore this for now. >>> + */ >>> if (btrfs_super_bytenr(sb) != BTRFS_SUPER_INFO_OFFSET) { >>> btrfs_err(fs_info, "super offset mismatch %llu != %u", >>> btrfs_super_bytenr(sb), BTRFS_SUPER_INFO_OFFSET); >>> ret = -EINVAL; >>> } >>> +#endif >>> /* >>> * Obvious sys_chunk_array corruptions, it must hold at least >>> one key >>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c >>> index 9fa2539a8493..f368db94d62b 100644 >>> --- a/fs/btrfs/volumes.c >>> +++ b/fs/btrfs/volumes.c >>> @@ -1369,7 +1369,7 @@ int btrfs_scan_one_device(const char *path, >>> fmode_t flags, void *holder, >>> { >>> struct btrfs_super_block *disk_super; >>> struct block_device *bdev; >>> - struct page *page; >>> + struct buffer_head *sb_bh; >>> int ret = -EINVAL; >>> u64 devid; >>> u64 transid; >>> @@ -1392,8 +1392,12 @@ int btrfs_scan_one_device(const char *path, >>> fmode_t flags, void *holder, >>> goto error; >>> } >>> - if (btrfs_read_disk_super(bdev, bytenr, &page, &disk_super)) >>> + sb_bh = btrfs_read_dev_super(bdev); >> >> This patch prompts another question: why do we have a page-based and a >> bufferhead-based interface to reading the super block? > > Right. we need to know that. Sorry I just saw this. FWIW unless we explicitly need a per-block state tracking (which I don't think we do) we should ideally switch to page-based mechanism. Buffer heads are considered deprecated. Also the only reason why we do have btrfsic_submit_bh is for the superblock bufer head. So potentially by removing the only requirement in the kernel where bh are used we can simplify the btrfsic code as well . Just something to keep in mind. > > I have a very old patch to converge them and clean this up, but haven't > sent it because there are some missing information on why it ended up > like that in the first place. > > Thanks, Anand > > >> I did prototype >> switching the bufferheads to page based but the resulting code wasn't >> any cleaner. I believe there is also open the question what happens when >> btrfs is run on a 64k page machine. I.e. we are going to read a single >> page and the sb is going to be in the first 4k but what about the rest >> 60, they could potentially contain other metadata. The page will have to >> be freed asap so as not to peg the neighboring metadata? > > >> >>> + if (IS_ERR(sb_bh)) { >>> + ret = PTR_ERR(sb_bh); >>> goto error_bdev_put; >>> + } >>> + disk_super = (struct btrfs_super_block *) sb_bh->b_data; >>> devid = btrfs_stack_device_id(&disk_super->dev_item); >>> transid = btrfs_super_generation(disk_super); >>> @@ -1413,7 +1417,7 @@ int btrfs_scan_one_device(const char *path, >>> fmode_t flags, void *holder, >>> if (!ret && fs_devices_ret) >>> (*fs_devices_ret)->total_devices = total_devices; >>> - btrfs_release_disk_super(page); >>> + brelse(sb_bh); >>> error_bdev_put: >>> blkdev_put(bdev, flags); >>> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >> the body of a message to majord...@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info a
Re: [PATCH RFC] btrfs: self heal from SB fail
On 2017年12月08日 19:48, Anand Jain wrote: > > > On 12/08/2017 07:01 PM, Qu Wenruo wrote: >> >> >> On 2017年12月08日 18:39, Anand Jain wrote: >>> >>> >>> On 12/08/2017 04:17 PM, Qu Wenruo wrote: On 2017年12月08日 15:57, Anand Jain wrote: > -EXPERIMENTAL- > As of now when primary SB fails we won't self heal and would fail > mount, > this is an experimental patch which thinks why not go and read backup > copy. Just curious about in which real world case that backup super block can help. At least from what I see in mail list, only few cases where backup super helps. >>> >>> Theoretical design helps. I ended up in this situation though. And >>> ext4 has -o sb flag to manage this part. When we can expect EIO on >>> any part of the disk block why not on the LBA which contains primary >>> SB. And should we fail the mount for that reason ? No. >> >> And how do you ensure it's a btrfs? > > Hmm. You mean outside of btrfs ? I did experiment with wipe and then > using /etc/fstab to mount, and it did lead to btrfs, is that your > concern that it shouldn't have been. That looked surprising to me as > well, but then problem points at wipefs instead. It's closer, but still doesn't reach the point. It can be a mkfs, other than wipefs. It's can be another case like: One user used btrfs for a while But bugs made him/her unhappy, and he/she turned to use xfs (whatever the fs is) instead. While he/she forgot to change its fstab, and rebooted the system. And suddenly, it's btrfs again! What a surprise. > >> >>> Despite that self super heal seems good, although I agree with David, we need a weaker but necessary check (magic and fsid from primary super?) to ensure it's a valid btrfs before we use the backup supers. >>> >>> Of course, we already have btrfs_check_super_valid() to verify the SB, >>> I don't understand why - how do we verify the SB should be the >>> point of >>> concern here, at all. >> >> The point here is, to distinguish an old and invalid btrfs (some other >> valid fs mkfs beyond the old fs) from a valid btrfs with corrupted >> primary fs. > > Ok. When you check all the SBs you would pick the one which has passed > btrfs_check_super_valid() and has highest generation. Did I ans your > concern ? If a SB does not pass btrfs_check_super_valid() its not a > valid btrfs SB at all. No, not really. What if the first SB is a XFS one or even a fs you didn't ever hear? Skip it and use the 2nd one? This filesystem is not even btrfs anymore. Thanks, Qu > > >> This the main concern here. >> The difference between recovery and recognizing a bad btrfs is quite >> important. > > btrfs_check_super_valid() is already doing that right ? The point here > is, should we use the backup SB when btrfs_check_super_valid() fails on > primary SB. > > Thanks, Anand > >> Thanks, >> Qu >> >>> >>> Thanks, Anand >>> Thanks, Qu > > Signed-off-by: Anand Jain > --- > fs/btrfs/disk-io.c | 8 +++- > fs/btrfs/volumes.c | 10 +++--- > 2 files changed, 14 insertions(+), 4 deletions(-) > > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c > index 9b20c1f3563b..a791b8dfe8a8 100644 > --- a/fs/btrfs/disk-io.c > +++ b/fs/btrfs/disk-io.c > @@ -3190,7 +3190,7 @@ struct buffer_head *btrfs_read_dev_super(struct > block_device *bdev) > * So, we need to add a special mount option to scan for > * later supers, using BTRFS_SUPER_MIRROR_MAX instead > */ > - for (i = 0; i < 1; i++) { > + for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) { > ret = btrfs_read_dev_one_super(bdev, i, &bh); > if (ret) > continue; > @@ -4015,11 +4015,17 @@ static int btrfs_check_super_valid(struct > btrfs_fs_info *fs_info) > ret = -EINVAL; > } > +#if 0 > + /* > + * Need a way to check for any copy of SB, as its not a > + * strong check, just ignore this for now. > + */ > if (btrfs_super_bytenr(sb) != BTRFS_SUPER_INFO_OFFSET) { > btrfs_err(fs_info, "super offset mismatch %llu != %u", > btrfs_super_bytenr(sb), BTRFS_SUPER_INFO_OFFSET); > ret = -EINVAL; > } > +#endif > /* > * Obvious sys_chunk_array corruptions, it must hold at least > one key > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index 9fa2539a8493..f368db94d62b 100644 > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -1369,7 +1369,7 @@ int btrfs_scan_one_device(const char *path, > fmode_t flags, void *holder, > { > struct btrfs_super_block *disk_super; > struct block_device *bdev; > - struct page *page; > + struct buffer_head *sb_bh; > int ret = -EINVAL;
Re: [PATCH RFC] btrfs: self heal from SB fail
On 12/08/2017 07:01 PM, Qu Wenruo wrote: On 2017年12月08日 18:39, Anand Jain wrote: On 12/08/2017 04:17 PM, Qu Wenruo wrote: On 2017年12月08日 15:57, Anand Jain wrote: -EXPERIMENTAL- As of now when primary SB fails we won't self heal and would fail mount, this is an experimental patch which thinks why not go and read backup copy. Just curious about in which real world case that backup super block can help. At least from what I see in mail list, only few cases where backup super helps. Theoretical design helps. I ended up in this situation though. And ext4 has -o sb flag to manage this part. When we can expect EIO on any part of the disk block why not on the LBA which contains primary SB. And should we fail the mount for that reason ? No. And how do you ensure it's a btrfs? Hmm. You mean outside of btrfs ? I did experiment with wipe and then using /etc/fstab to mount, and it did lead to btrfs, is that your concern that it shouldn't have been. That looked surprising to me as well, but then problem points at wipefs instead. Despite that self super heal seems good, although I agree with David, we need a weaker but necessary check (magic and fsid from primary super?) to ensure it's a valid btrfs before we use the backup supers. Of course, we already have btrfs_check_super_valid() to verify the SB, I don't understand why - how do we verify the SB should be the point of concern here, at all. The point here is, to distinguish an old and invalid btrfs (some other valid fs mkfs beyond the old fs) from a valid btrfs with corrupted primary fs. Ok. When you check all the SBs you would pick the one which has passed btrfs_check_super_valid() and has highest generation. Did I ans your concern ? If a SB does not pass btrfs_check_super_valid() its not a valid btrfs SB at all. This the main concern here. The difference between recovery and recognizing a bad btrfs is quite important. btrfs_check_super_valid() is already doing that right ? The point here is, should we use the backup SB when btrfs_check_super_valid() fails on primary SB. Thanks, Anand Thanks, Qu Thanks, Anand Thanks, Qu Signed-off-by: Anand Jain --- fs/btrfs/disk-io.c | 8 +++- fs/btrfs/volumes.c | 10 +++--- 2 files changed, 14 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 9b20c1f3563b..a791b8dfe8a8 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3190,7 +3190,7 @@ struct buffer_head *btrfs_read_dev_super(struct block_device *bdev) * So, we need to add a special mount option to scan for * later supers, using BTRFS_SUPER_MIRROR_MAX instead */ - for (i = 0; i < 1; i++) { + for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) { ret = btrfs_read_dev_one_super(bdev, i, &bh); if (ret) continue; @@ -4015,11 +4015,17 @@ static int btrfs_check_super_valid(struct btrfs_fs_info *fs_info) ret = -EINVAL; } +#if 0 + /* + * Need a way to check for any copy of SB, as its not a + * strong check, just ignore this for now. + */ if (btrfs_super_bytenr(sb) != BTRFS_SUPER_INFO_OFFSET) { btrfs_err(fs_info, "super offset mismatch %llu != %u", btrfs_super_bytenr(sb), BTRFS_SUPER_INFO_OFFSET); ret = -EINVAL; } +#endif /* * Obvious sys_chunk_array corruptions, it must hold at least one key diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 9fa2539a8493..f368db94d62b 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1369,7 +1369,7 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, { struct btrfs_super_block *disk_super; struct block_device *bdev; - struct page *page; + struct buffer_head *sb_bh; int ret = -EINVAL; u64 devid; u64 transid; @@ -1392,8 +1392,12 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, goto error; } - if (btrfs_read_disk_super(bdev, bytenr, &page, &disk_super)) + sb_bh = btrfs_read_dev_super(bdev); + if (IS_ERR(sb_bh)) { + ret = PTR_ERR(sb_bh); goto error_bdev_put; + } + disk_super = (struct btrfs_super_block *) sb_bh->b_data; devid = btrfs_stack_device_id(&disk_super->dev_item); transid = btrfs_super_generation(disk_super); @@ -1413,7 +1417,7 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, if (!ret && fs_devices_ret) (*fs_devices_ret)->total_devices = total_devices; - btrfs_release_disk_super(page); + brelse(sb_bh); error_bdev_put: blkdev_put(bdev, flags); -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linu
Re: [PATCH RFC] btrfs: self heal from SB fail
On 2017年12月08日 18:39, Anand Jain wrote: > > > On 12/08/2017 04:17 PM, Qu Wenruo wrote: >> >> >> On 2017年12月08日 15:57, Anand Jain wrote: >>> -EXPERIMENTAL- >>> As of now when primary SB fails we won't self heal and would fail mount, >>> this is an experimental patch which thinks why not go and read backup >>> copy. >> >> Just curious about in which real world case that backup super block can >> help. >> At least from what I see in mail list, only few cases where backup super >> helps. > > Theoretical design helps. I ended up in this situation though. And > ext4 has -o sb flag to manage this part. When we can expect EIO on > any part of the disk block why not on the LBA which contains primary > SB. And should we fail the mount for that reason ? No. And how do you ensure it's a btrfs? > >> Despite that self super heal seems good, although I agree with David, we >> need a weaker but necessary check (magic and fsid from primary super?) >> to ensure it's a valid btrfs before we use the backup supers. > > Of course, we already have btrfs_check_super_valid() to verify the SB, > I don't understand why - how do we verify the SB should be the point of > concern here, at all. The point here is, to distinguish an old and invalid btrfs (some other valid fs mkfs beyond the old fs) from a valid btrfs with corrupted primary fs. This the main concern here. The difference between recovery and recognizing a bad btrfs is quite important. Thanks, Qu > > Thanks, Anand > >> Thanks, >> Qu >> >> >>> >>> Signed-off-by: Anand Jain >>> --- >>> fs/btrfs/disk-io.c | 8 +++- >>> fs/btrfs/volumes.c | 10 +++--- >>> 2 files changed, 14 insertions(+), 4 deletions(-) >>> >>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c >>> index 9b20c1f3563b..a791b8dfe8a8 100644 >>> --- a/fs/btrfs/disk-io.c >>> +++ b/fs/btrfs/disk-io.c >>> @@ -3190,7 +3190,7 @@ struct buffer_head *btrfs_read_dev_super(struct >>> block_device *bdev) >>> * So, we need to add a special mount option to scan for >>> * later supers, using BTRFS_SUPER_MIRROR_MAX instead >>> */ >>> - for (i = 0; i < 1; i++) { >>> + for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) { >>> ret = btrfs_read_dev_one_super(bdev, i, &bh); >>> if (ret) >>> continue; >>> @@ -4015,11 +4015,17 @@ static int btrfs_check_super_valid(struct >>> btrfs_fs_info *fs_info) >>> ret = -EINVAL; >>> } >>> +#if 0 >>> + /* >>> + * Need a way to check for any copy of SB, as its not a >>> + * strong check, just ignore this for now. >>> + */ >>> if (btrfs_super_bytenr(sb) != BTRFS_SUPER_INFO_OFFSET) { >>> btrfs_err(fs_info, "super offset mismatch %llu != %u", >>> btrfs_super_bytenr(sb), BTRFS_SUPER_INFO_OFFSET); >>> ret = -EINVAL; >>> } >>> +#endif >>> /* >>> * Obvious sys_chunk_array corruptions, it must hold at least >>> one key >>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c >>> index 9fa2539a8493..f368db94d62b 100644 >>> --- a/fs/btrfs/volumes.c >>> +++ b/fs/btrfs/volumes.c >>> @@ -1369,7 +1369,7 @@ int btrfs_scan_one_device(const char *path, >>> fmode_t flags, void *holder, >>> { >>> struct btrfs_super_block *disk_super; >>> struct block_device *bdev; >>> - struct page *page; >>> + struct buffer_head *sb_bh; >>> int ret = -EINVAL; >>> u64 devid; >>> u64 transid; >>> @@ -1392,8 +1392,12 @@ int btrfs_scan_one_device(const char *path, >>> fmode_t flags, void *holder, >>> goto error; >>> } >>> - if (btrfs_read_disk_super(bdev, bytenr, &page, &disk_super)) >>> + sb_bh = btrfs_read_dev_super(bdev); >>> + if (IS_ERR(sb_bh)) { >>> + ret = PTR_ERR(sb_bh); >>> goto error_bdev_put; >>> + } >>> + disk_super = (struct btrfs_super_block *) sb_bh->b_data; >>> devid = btrfs_stack_device_id(&disk_super->dev_item); >>> transid = btrfs_super_generation(disk_super); >>> @@ -1413,7 +1417,7 @@ int btrfs_scan_one_device(const char *path, >>> fmode_t flags, void *holder, >>> if (!ret && fs_devices_ret) >>> (*fs_devices_ret)->total_devices = total_devices; >>> - btrfs_release_disk_super(page); >>> + brelse(sb_bh); >>> error_bdev_put: >>> blkdev_put(bdev, flags); >>> >> > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html signature.asc Description: OpenPGP digital signature
Re: [PATCH RFC] btrfs: self heal from SB fail
On 12/08/2017 04:17 PM, Qu Wenruo wrote: On 2017年12月08日 15:57, Anand Jain wrote: -EXPERIMENTAL- As of now when primary SB fails we won't self heal and would fail mount, this is an experimental patch which thinks why not go and read backup copy. Just curious about in which real world case that backup super block can help. At least from what I see in mail list, only few cases where backup super helps. Theoretical design helps. I ended up in this situation though. And ext4 has -o sb flag to manage this part. When we can expect EIO on any part of the disk block why not on the LBA which contains primary SB. And should we fail the mount for that reason ? No. Despite that self super heal seems good, although I agree with David, we need a weaker but necessary check (magic and fsid from primary super?) to ensure it's a valid btrfs before we use the backup supers. Of course, we already have btrfs_check_super_valid() to verify the SB, I don't understand why - how do we verify the SB should be the point of concern here, at all. Thanks, Anand Thanks, Qu Signed-off-by: Anand Jain --- fs/btrfs/disk-io.c | 8 +++- fs/btrfs/volumes.c | 10 +++--- 2 files changed, 14 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 9b20c1f3563b..a791b8dfe8a8 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3190,7 +3190,7 @@ struct buffer_head *btrfs_read_dev_super(struct block_device *bdev) * So, we need to add a special mount option to scan for * later supers, using BTRFS_SUPER_MIRROR_MAX instead */ - for (i = 0; i < 1; i++) { + for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) { ret = btrfs_read_dev_one_super(bdev, i, &bh); if (ret) continue; @@ -4015,11 +4015,17 @@ static int btrfs_check_super_valid(struct btrfs_fs_info *fs_info) ret = -EINVAL; } +#if 0 + /* +* Need a way to check for any copy of SB, as its not a +* strong check, just ignore this for now. +*/ if (btrfs_super_bytenr(sb) != BTRFS_SUPER_INFO_OFFSET) { btrfs_err(fs_info, "super offset mismatch %llu != %u", btrfs_super_bytenr(sb), BTRFS_SUPER_INFO_OFFSET); ret = -EINVAL; } +#endif /* * Obvious sys_chunk_array corruptions, it must hold at least one key diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 9fa2539a8493..f368db94d62b 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1369,7 +1369,7 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, { struct btrfs_super_block *disk_super; struct block_device *bdev; - struct page *page; + struct buffer_head *sb_bh; int ret = -EINVAL; u64 devid; u64 transid; @@ -1392,8 +1392,12 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, goto error; } - if (btrfs_read_disk_super(bdev, bytenr, &page, &disk_super)) + sb_bh = btrfs_read_dev_super(bdev); + if (IS_ERR(sb_bh)) { + ret = PTR_ERR(sb_bh); goto error_bdev_put; + } + disk_super = (struct btrfs_super_block *) sb_bh->b_data; devid = btrfs_stack_device_id(&disk_super->dev_item); transid = btrfs_super_generation(disk_super); @@ -1413,7 +1417,7 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, if (!ret && fs_devices_ret) (*fs_devices_ret)->total_devices = total_devices; - btrfs_release_disk_super(page); + brelse(sb_bh); error_bdev_put: blkdev_put(bdev, flags); -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] btrfs: self heal from SB fail
On 12/08/2017 04:40 PM, Nikolay Borisov wrote: On 8.12.2017 09:57, Anand Jain wrote: -EXPERIMENTAL- As of now when primary SB fails we won't self heal and would fail mount, this is an experimental patch which thinks why not go and read backup copy. Signed-off-by: Anand Jain --- fs/btrfs/disk-io.c | 8 +++- fs/btrfs/volumes.c | 10 +++--- 2 files changed, 14 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 9b20c1f3563b..a791b8dfe8a8 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3190,7 +3190,7 @@ struct buffer_head *btrfs_read_dev_super(struct block_device *bdev) * So, we need to add a special mount option to scan for * later supers, using BTRFS_SUPER_MIRROR_MAX instead */ - for (i = 0; i < 1; i++) { + for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) { ret = btrfs_read_dev_one_super(bdev, i, &bh); if (ret) continue; @@ -4015,11 +4015,17 @@ static int btrfs_check_super_valid(struct btrfs_fs_info *fs_info) ret = -EINVAL; } +#if 0 + /* +* Need a way to check for any copy of SB, as its not a +* strong check, just ignore this for now. +*/ if (btrfs_super_bytenr(sb) != BTRFS_SUPER_INFO_OFFSET) { btrfs_err(fs_info, "super offset mismatch %llu != %u", btrfs_super_bytenr(sb), BTRFS_SUPER_INFO_OFFSET); ret = -EINVAL; } +#endif /* * Obvious sys_chunk_array corruptions, it must hold at least one key diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 9fa2539a8493..f368db94d62b 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1369,7 +1369,7 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, { struct btrfs_super_block *disk_super; struct block_device *bdev; - struct page *page; + struct buffer_head *sb_bh; int ret = -EINVAL; u64 devid; u64 transid; @@ -1392,8 +1392,12 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, goto error; } - if (btrfs_read_disk_super(bdev, bytenr, &page, &disk_super)) + sb_bh = btrfs_read_dev_super(bdev); This patch prompts another question: why do we have a page-based and a bufferhead-based interface to reading the super block? Right. we need to know that. Sorry I just saw this. I have a very old patch to converge them and clean this up, but haven't sent it because there are some missing information on why it ended up like that in the first place. Thanks, Anand I did prototype switching the bufferheads to page based but the resulting code wasn't any cleaner. I believe there is also open the question what happens when btrfs is run on a 64k page machine. I.e. we are going to read a single page and the sb is going to be in the first 4k but what about the rest 60, they could potentially contain other metadata. The page will have to be freed asap so as not to peg the neighboring metadata? + if (IS_ERR(sb_bh)) { + ret = PTR_ERR(sb_bh); goto error_bdev_put; + } + disk_super = (struct btrfs_super_block *) sb_bh->b_data; devid = btrfs_stack_device_id(&disk_super->dev_item); transid = btrfs_super_generation(disk_super); @@ -1413,7 +1417,7 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, if (!ret && fs_devices_ret) (*fs_devices_ret)->total_devices = total_devices; - btrfs_release_disk_super(page); + brelse(sb_bh); error_bdev_put: blkdev_put(bdev, flags); -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] btrfs: self heal from SB fail
On 8.12.2017 09:57, Anand Jain wrote: > -EXPERIMENTAL- > As of now when primary SB fails we won't self heal and would fail mount, > this is an experimental patch which thinks why not go and read backup > copy. > > Signed-off-by: Anand Jain > --- > fs/btrfs/disk-io.c | 8 +++- > fs/btrfs/volumes.c | 10 +++--- > 2 files changed, 14 insertions(+), 4 deletions(-) > > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c > index 9b20c1f3563b..a791b8dfe8a8 100644 > --- a/fs/btrfs/disk-io.c > +++ b/fs/btrfs/disk-io.c > @@ -3190,7 +3190,7 @@ struct buffer_head *btrfs_read_dev_super(struct > block_device *bdev) >* So, we need to add a special mount option to scan for >* later supers, using BTRFS_SUPER_MIRROR_MAX instead >*/ > - for (i = 0; i < 1; i++) { > + for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) { > ret = btrfs_read_dev_one_super(bdev, i, &bh); > if (ret) > continue; > @@ -4015,11 +4015,17 @@ static int btrfs_check_super_valid(struct > btrfs_fs_info *fs_info) > ret = -EINVAL; > } > > +#if 0 > + /* > + * Need a way to check for any copy of SB, as its not a > + * strong check, just ignore this for now. > + */ > if (btrfs_super_bytenr(sb) != BTRFS_SUPER_INFO_OFFSET) { > btrfs_err(fs_info, "super offset mismatch %llu != %u", > btrfs_super_bytenr(sb), BTRFS_SUPER_INFO_OFFSET); > ret = -EINVAL; > } > +#endif > > /* >* Obvious sys_chunk_array corruptions, it must hold at least one key > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index 9fa2539a8493..f368db94d62b 100644 > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -1369,7 +1369,7 @@ int btrfs_scan_one_device(const char *path, fmode_t > flags, void *holder, > { > struct btrfs_super_block *disk_super; > struct block_device *bdev; > - struct page *page; > + struct buffer_head *sb_bh; > int ret = -EINVAL; > u64 devid; > u64 transid; > @@ -1392,8 +1392,12 @@ int btrfs_scan_one_device(const char *path, fmode_t > flags, void *holder, > goto error; > } > > - if (btrfs_read_disk_super(bdev, bytenr, &page, &disk_super)) > + sb_bh = btrfs_read_dev_super(bdev); This patch prompts another question: why do we have a page-based and a bufferhead-based interface to reading the super block? I did prototype switching the bufferheads to page based but the resulting code wasn't any cleaner. I believe there is also open the question what happens when btrfs is run on a 64k page machine. I.e. we are going to read a single page and the sb is going to be in the first 4k but what about the rest 60, they could potentially contain other metadata. The page will have to be freed asap so as not to peg the neighboring metadata? > + if (IS_ERR(sb_bh)) { > + ret = PTR_ERR(sb_bh); > goto error_bdev_put; > + } > + disk_super = (struct btrfs_super_block *) sb_bh->b_data; > > devid = btrfs_stack_device_id(&disk_super->dev_item); > transid = btrfs_super_generation(disk_super); > @@ -1413,7 +1417,7 @@ int btrfs_scan_one_device(const char *path, fmode_t > flags, void *holder, > if (!ret && fs_devices_ret) > (*fs_devices_ret)->total_devices = total_devices; > > - btrfs_release_disk_super(page); > + brelse(sb_bh); > > error_bdev_put: > blkdev_put(bdev, flags); > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] btrfs: self heal from SB fail
On 2017年12月08日 15:57, Anand Jain wrote: > -EXPERIMENTAL- > As of now when primary SB fails we won't self heal and would fail mount, > this is an experimental patch which thinks why not go and read backup > copy. Just curious about in which real world case that backup super block can help. At least from what I see in mail list, only few cases where backup super helps. Despite that self super heal seems good, although I agree with David, we need a weaker but necessary check (magic and fsid from primary super?) to ensure it's a valid btrfs before we use the backup supers. Thanks, Qu > > Signed-off-by: Anand Jain > --- > fs/btrfs/disk-io.c | 8 +++- > fs/btrfs/volumes.c | 10 +++--- > 2 files changed, 14 insertions(+), 4 deletions(-) > > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c > index 9b20c1f3563b..a791b8dfe8a8 100644 > --- a/fs/btrfs/disk-io.c > +++ b/fs/btrfs/disk-io.c > @@ -3190,7 +3190,7 @@ struct buffer_head *btrfs_read_dev_super(struct > block_device *bdev) >* So, we need to add a special mount option to scan for >* later supers, using BTRFS_SUPER_MIRROR_MAX instead >*/ > - for (i = 0; i < 1; i++) { > + for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) { > ret = btrfs_read_dev_one_super(bdev, i, &bh); > if (ret) > continue; > @@ -4015,11 +4015,17 @@ static int btrfs_check_super_valid(struct > btrfs_fs_info *fs_info) > ret = -EINVAL; > } > > +#if 0 > + /* > + * Need a way to check for any copy of SB, as its not a > + * strong check, just ignore this for now. > + */ > if (btrfs_super_bytenr(sb) != BTRFS_SUPER_INFO_OFFSET) { > btrfs_err(fs_info, "super offset mismatch %llu != %u", > btrfs_super_bytenr(sb), BTRFS_SUPER_INFO_OFFSET); > ret = -EINVAL; > } > +#endif > > /* >* Obvious sys_chunk_array corruptions, it must hold at least one key > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index 9fa2539a8493..f368db94d62b 100644 > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -1369,7 +1369,7 @@ int btrfs_scan_one_device(const char *path, fmode_t > flags, void *holder, > { > struct btrfs_super_block *disk_super; > struct block_device *bdev; > - struct page *page; > + struct buffer_head *sb_bh; > int ret = -EINVAL; > u64 devid; > u64 transid; > @@ -1392,8 +1392,12 @@ int btrfs_scan_one_device(const char *path, fmode_t > flags, void *holder, > goto error; > } > > - if (btrfs_read_disk_super(bdev, bytenr, &page, &disk_super)) > + sb_bh = btrfs_read_dev_super(bdev); > + if (IS_ERR(sb_bh)) { > + ret = PTR_ERR(sb_bh); > goto error_bdev_put; > + } > + disk_super = (struct btrfs_super_block *) sb_bh->b_data; > > devid = btrfs_stack_device_id(&disk_super->dev_item); > transid = btrfs_super_generation(disk_super); > @@ -1413,7 +1417,7 @@ int btrfs_scan_one_device(const char *path, fmode_t > flags, void *holder, > if (!ret && fs_devices_ret) > (*fs_devices_ret)->total_devices = total_devices; > > - btrfs_release_disk_super(page); > + brelse(sb_bh); > > error_bdev_put: > blkdev_put(bdev, flags); > signature.asc Description: OpenPGP digital signature