Re: btrfs scrub status reports not running when it is
On Thu, Jan 15, 2015 at 10:02:37AM -0800, Zach Brown wrote:
> > > It says that scrub isn't running if any devices have completed.
> > >
> > > If you drop all those ret < 0 conditional branches that are either
> > > noops or wrong, does it work like you'd expect?
> >
> > Why wrong? The ioctl callback returns -ENODEV or -ENOTCONN that get
> > translated to the errno values and ioctl(...) returns -1 in both cases.
>
> Wrong because returning 0 on the first ENOTCONN, instead of continuing
> to find more devices which might still be scrubbing, leads to this
> confusing status message.

You're right, fix on the way.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
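For reference, a minimal sketch of the loop behaviour argued for above: treat ENOTCONN like ENODEV and keep checking the remaining devices, reporting "running" as soon as one device answers. This is not the actual btrfs-progs patch. The names any_scrub_running, scrub_query_fn, fake_query and fake_state are hypothetical, and the callback stands in for the BTRFS_IOC_SCRUB_PROGRESS ioctl so the logic can be exercised without a filesystem.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the BTRFS_IOC_SCRUB_PROGRESS ioctl: returns 0
 * if a scrub is running on the device, or -1 with errno set (ENODEV for a
 * missing device, ENOTCONN for a device with no scrub running).
 */
typedef int (*scrub_query_fn)(unsigned long long devid);

/* Returns 1 if any device is still scrubbing, 0 if none is, -errno on error. */
static int any_scrub_running(const unsigned long long *devids, size_t n,
                             scrub_query_fn query)
{
    size_t i;

    for (i = 0; i < n; i++) {
        int ret = query(devids[i]);

        /* Device is gone or idle: keep looking instead of returning 0. */
        if (ret < 0 && (errno == ENODEV || errno == ENOTCONN))
            continue;
        if (ret < 0)
            return -errno;
        return 1; /* this device answered, so a scrub is in progress */
    }
    return 0;
}

/* Simulated per-device state: 0 means scrubbing, otherwise the errno to report. */
#define FAKE_NDEV 4
static int fake_state[FAKE_NDEV];

static int fake_query(unsigned long long devid)
{
    assert(devid < FAKE_NDEV);
    if (fake_state[devid] == 0)
        return 0;
    errno = fake_state[devid];
    return -1;
}
```

With simulated device states of ENOTCONN, ENODEV and "running", any_scrub_running() reports 1, whereas returning 0 on the first ENOTCONN would report that nothing is running.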
Re: btrfs scrub status reports not running when it is
On Thu, Jan 15, 2015 at 12:24:41PM +0100, David Sterba wrote:
> On Wed, Jan 14, 2015 at 02:27:17PM -0800, Zach Brown wrote:
> > On Wed, Jan 14, 2015 at 04:06:02PM -0500, Sandy McArthur Jr wrote:
> > > Sometimes btrfs scrub status reports that it is not running when it
> > > still is.
> > >
> > > I think this is a cosmetic bug. And I believe this is related to the
> > > scrub completing on some drives before others in a multi-drive btrfs
> > > filesystem that is not well balanced.
> >
> > Boy, I don't really know this code, but it looks like:
> >
> >     if (ss->in_progress)
> >         printf(", running for %llu seconds\n", ss->duration);
> >     else
> >         printf(", interrupted after %llu seconds, not running\n",
> >                ss->duration);
> >
> >     in_progress = is_scrub_running_in_kernel(fdmnt, di_args,
> >                                              fi_args.num_devices);
> >
> >     static int is_scrub_running_in_kernel(int fd,
> >             struct btrfs_ioctl_dev_info_args *di_args, u64 max_devices)
> >     {
> >         struct scrub_progress sp;
> >         int i;
> >         int ret;
> >
> >         for (i = 0; i < max_devices; i++) {
> >             memset(&sp, 0, sizeof(sp));
> >             sp.scrub_args.devid = di_args[i].devid;
> >             ret = ioctl(fd, BTRFS_IOC_SCRUB_PROGRESS, &sp.scrub_args);
> >             if (ret < 0 && errno == ENODEV)
> >                 continue;
> >             if (ret < 0 && errno == ENOTCONN)
> >                 return 0;
> >
> > It says that scrub isn't running if any devices have completed.
> >
> > If you drop all those ret < 0 conditional branches that are either
> > noops or wrong, does it work like you'd expect?
>
> Why wrong? The ioctl callback returns -ENODEV or -ENOTCONN that get
> translated to the errno values and ioctl(...) returns -1 in both cases.

Wrong because returning 0 on the first ENOTCONN, instead of continuing to
find more devices which might still be scrubbing, leads to this confusing
status message.

That's my working theory having spent 15 seconds reading code. I would not
be surprised at all if I'm missing something here.

- z
Re: btrfs scrub status reports not running when it is
On Wed, Jan 14, 2015 at 02:27:17PM -0800, Zach Brown wrote:
> On Wed, Jan 14, 2015 at 04:06:02PM -0500, Sandy McArthur Jr wrote:
> > Sometimes btrfs scrub status reports that it is not running when it
> > still is.
> >
> > I think this is a cosmetic bug. And I believe this is related to the
> > scrub completing on some drives before others in a multi-drive btrfs
> > filesystem that is not well balanced.
>
> Boy, I don't really know this code, but it looks like:
>
>     if (ss->in_progress)
>         printf(", running for %llu seconds\n", ss->duration);
>     else
>         printf(", interrupted after %llu seconds, not running\n",
>                ss->duration);
>
>     in_progress = is_scrub_running_in_kernel(fdmnt, di_args,
>                                              fi_args.num_devices);
>
>     static int is_scrub_running_in_kernel(int fd,
>             struct btrfs_ioctl_dev_info_args *di_args, u64 max_devices)
>     {
>         struct scrub_progress sp;
>         int i;
>         int ret;
>
>         for (i = 0; i < max_devices; i++) {
>             memset(&sp, 0, sizeof(sp));
>             sp.scrub_args.devid = di_args[i].devid;
>             ret = ioctl(fd, BTRFS_IOC_SCRUB_PROGRESS, &sp.scrub_args);
>             if (ret < 0 && errno == ENODEV)
>                 continue;
>             if (ret < 0 && errno == ENOTCONN)
>                 return 0;
>
> It says that scrub isn't running if any devices have completed.
>
> If you drop all those ret < 0 conditional branches that are either
> noops or wrong, does it work like you'd expect?

Why wrong? The ioctl callback returns -ENODEV or -ENOTCONN that get
translated to the errno values and ioctl(...) returns -1 in both cases.
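The translation described above can be modelled in a few lines. This is a simplified illustration of the Linux convention, not libc or kernel source: the handler returns 0 or a negative error code, and the caller of ioctl() sees -1 with errno holding the positive value. The helper name userspace_view is hypothetical.

```c
#include <assert.h>
#include <errno.h>

/*
 * Simplified model of the syscall return path: map a kernel-side return
 * value (0 or -Exxx) to what the userspace caller observes.
 */
static int userspace_view(long kernel_ret)
{
    /* Linux reserves -4095..-1 for error codes. */
    assert(kernel_ret >= -4095);

    if (kernel_ret < 0) {
        errno = (int)-kernel_ret; /* e.g. -ENOTCONN becomes errno == ENOTCONN */
        return -1;
    }
    return (int)kernel_ret;
}
```

Under this model both -ENODEV and -ENOTCONN reach the caller as a return value of -1, so the two cases can only be told apart by inspecting errno.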
Re: btrfs scrub status reports not running when it is
On Wed, 14 Jan 2015 16:06:02 -0500, Sandy McArthur Jr sandy...@gmail.com wrote:
> Sometimes btrfs scrub status reports that it is not running when it
> still is.
[...]

FWIW, I (and one other person) reported this in the thread titled 'btrfs
scrub status misreports as interrupted' (starting on 22.11.2014).

> # uname -a
> Linux mcplex 3.18.2-gentoo #1 SMP Mon Jan 12 10:24:25 EST 2015 x86_64 Intel(R) Core(TM) i7-2600S CPU @ 2.80GHz GenuineIntel GNU/Linux
>
> # btrfs --version
> Btrfs v3.18.1

Too bad it's still there; I'm on kernel 3.17.8 and userspace 3.18.1,
respectively, and didn't see this issue the last time I ran a scrub, so I
was hoping it was gone by now.

(On the upside, though, this isn't exactly the worst bug btrfs has ever
had ;) .)

Greetings
--
Marc Joliet
--
"People who think they know everything really annoy those of us who know
we don't" - Bjarne Stroustrup
Re: btrfs scrub status reports not running when it is
Okay, the output is different when the scrub is actually complete.

Completed status:

scrub status for 94b3345e-2589-423c-a228-d569bf94ab58
    scrub started at Tue Jan 13 01:18:22 2015 and finished after 139459 seconds
    total bytes scrubbed: 23.30TiB with 513 errors
    error details: verify=19 csum=494
    corrected errors: 512, uncorrectable errors: 1, unverified errors: 0

Still, the output while the scrub is wrapping up is not intuitive to me:

    scrub started at Tue Jan 13 01:18:22 2015, interrupted after 136982 seconds, not running

On Wed, Jan 14, 2015 at 4:06 PM, Sandy McArthur Jr sandy...@gmail.com wrote:
> Sometimes btrfs scrub status reports that it is not running when it
> still is.
>
> I think this is a cosmetic bug. And I believe this is related to the
> scrub completing on some drives before others in a multi-drive btrfs
> filesystem that is not well balanced.
>
> Based on `iostat 1` activity, the last drive in the btrfs filesystem was
> still being scrubbed at the time I copied the output below; you can see
> the total bytes scrubbed is increasing despite the status showing as not
> running. The last drive being scrubbed was not the device identified
> when you list mount points with `mount`:
>
> # date ; echo ; btrfs scrub status /mcmedia/
> Wed Jan 14 15:20:18 EST 2015
>
> scrub status for 94b3345e-2589-423c-a228-d569bf94ab58
>     scrub started at Tue Jan 13 01:18:22 2015, interrupted after 136912 seconds, not running
>     total bytes scrubbed: 23.05TiB with 513 errors
>     error details: verify=19 csum=494
>     corrected errors: 512, uncorrectable errors: 1, unverified errors: 0
>
> # date ; echo ; btrfs scrub status /mcmedia/
> Wed Jan 14 15:21:25 EST 2015
>
> scrub status for 94b3345e-2589-423c-a228-d569bf94ab58
>     scrub started at Tue Jan 13 01:18:22 2015, interrupted after 136982 seconds, not running
>     total bytes scrubbed: 23.06TiB with 513 errors
>     error details: verify=19 csum=494
>     corrected errors: 512, uncorrectable errors: 1, unverified errors: 0
>
> # uname -a
> Linux mcplex 3.18.2-gentoo #1 SMP Mon Jan 12 10:24:25 EST 2015 x86_64 Intel(R) Core(TM) i7-2600S CPU @ 2.80GHz GenuineIntel GNU/Linux
>
> # btrfs --version
> Btrfs v3.18.1
>
> --
> Sandy McArthur Jr
>
> He who dares not offend cannot be honest.
> - Thomas Paine

--
Sandy McArthur

He who dares not offend cannot be honest.
- Thomas Paine
btrfs scrub status reports not running when it is
Sometimes btrfs scrub status reports that it is not running when it still
is.

I think this is a cosmetic bug. And I believe this is related to the scrub
completing on some drives before others in a multi-drive btrfs filesystem
that is not well balanced.

Based on `iostat 1` activity, the last drive in the btrfs filesystem was
still being scrubbed at the time I copied the output below; you can see
the total bytes scrubbed is increasing despite the status showing as not
running. The last drive being scrubbed was not the device identified when
you list mount points with `mount`:

# date ; echo ; btrfs scrub status /mcmedia/
Wed Jan 14 15:20:18 EST 2015

scrub status for 94b3345e-2589-423c-a228-d569bf94ab58
    scrub started at Tue Jan 13 01:18:22 2015, interrupted after 136912 seconds, not running
    total bytes scrubbed: 23.05TiB with 513 errors
    error details: verify=19 csum=494
    corrected errors: 512, uncorrectable errors: 1, unverified errors: 0

# date ; echo ; btrfs scrub status /mcmedia/
Wed Jan 14 15:21:25 EST 2015

scrub status for 94b3345e-2589-423c-a228-d569bf94ab58
    scrub started at Tue Jan 13 01:18:22 2015, interrupted after 136982 seconds, not running
    total bytes scrubbed: 23.06TiB with 513 errors
    error details: verify=19 csum=494
    corrected errors: 512, uncorrectable errors: 1, unverified errors: 0

# uname -a
Linux mcplex 3.18.2-gentoo #1 SMP Mon Jan 12 10:24:25 EST 2015 x86_64 Intel(R) Core(TM) i7-2600S CPU @ 2.80GHz GenuineIntel GNU/Linux

# btrfs --version
Btrfs v3.18.1

--
Sandy McArthur Jr

He who dares not offend cannot be honest.
- Thomas Paine
Re: btrfs scrub status reports not running when it is
On Wed, Jan 14, 2015 at 04:06:02PM -0500, Sandy McArthur Jr wrote:
> Sometimes btrfs scrub status reports that it is not running when it
> still is.
>
> I think this is a cosmetic bug. And I believe this is related to the
> scrub completing on some drives before others in a multi-drive btrfs
> filesystem that is not well balanced.

Boy, I don't really know this code, but it looks like:

    if (ss->in_progress)
        printf(", running for %llu seconds\n", ss->duration);
    else
        printf(", interrupted after %llu seconds, not running\n",
               ss->duration);

    in_progress = is_scrub_running_in_kernel(fdmnt, di_args,
                                             fi_args.num_devices);

    static int is_scrub_running_in_kernel(int fd,
            struct btrfs_ioctl_dev_info_args *di_args, u64 max_devices)
    {
        struct scrub_progress sp;
        int i;
        int ret;

        for (i = 0; i < max_devices; i++) {
            memset(&sp, 0, sizeof(sp));
            sp.scrub_args.devid = di_args[i].devid;
            ret = ioctl(fd, BTRFS_IOC_SCRUB_PROGRESS, &sp.scrub_args);
            if (ret < 0 && errno == ENODEV)
                continue;
            if (ret < 0 && errno == ENOTCONN)
                return 0;

It says that scrub isn't running if any devices have completed.

If you drop all those ret < 0 conditional branches that are either noops
or wrong, does it work like you'd expect?

- z