[Kernel-packages] [Bug 1548009] Re: ZFS pools should be automatically scrubbed
Would it not make more sense to have /usr/lib/zfs-linux/scrub set the PATH that it requires in order to find the zpool command?

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1548009

Title:
  ZFS pools should be automatically scrubbed

Status in zfs-linux package in Ubuntu:
  Fix Released
Status in zfs-linux source package in Xenial:
  Incomplete

Bug description:
  [Impact]

  Xenial shipped with a cron job to automatically scrub ZFS pools, as desired by many users and as implemented by mdadm for traditional Linux software RAID. Unfortunately, this cron job does not work, because it needs a PATH line for /sbin, where the zpool utility lives.

  Given the existence of the cron job and various discussions on IRC, etc., users expect that scrubs are happening, when they are not. This means ZFS is not pre-emptively checking for (and correcting) corruption. The odds of disk corruption are admittedly very low, but violating users' expectations of data safety, especially when they've gone out of their way to use a filesystem which touts data safety, is bad.

  [Test Case]

  $ truncate -s 1G test.img
  $ sudo zpool create test `pwd`/test.img
  $ sudo zpool status test
  $ sudo vi /etc/cron.d/zfsutils-linux

  Modify /etc/cron.d/zfsutils-linux to run the cron job in a few minutes (modifying the date range if it's not currently the 8th through the 14th and the "-eq 0" check if it's not currently a Sunday).

  $ grep zfs /var/log/cron.log

  Verify in /var/log/cron.log that the job ran.

  $ sudo zpool status test

  Expected results:
    scan: scrub repaired 0 in ... on
  Actual results:
    scan: none requested

  Then, add the PATH line, update the time rules in the cron job, and repeat the test. Now it will work.

  - OR -

  The best test case is to leave the cron job file untouched, install the patched package, wait for the second Sunday of the month, and verify with zpool status that a scrub ran.
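  The test case above boils down to two edits to the cron.d file: adding the missing PATH line and adjusting the schedule fields. A sketch of what the corrected entry could look like; the minute/hour values and comment text here are illustrative, not necessarily what the package ships:

```
# /etc/cron.d/zfsutils-linux (sketch; exact fields are illustrative)

# The missing line: zpool lives in /sbin, which cron's default PATH omits.
PATH=/usr/bin:/bin:/usr/sbin:/sbin

# Scrub on the second Sunday of the month: restrict day-of-month to 8-14,
# then test date(1) for Sunday (%w prints 0 on Sunday; % must be escaped
# with a backslash inside a crontab command field).
24 0 8-14 * * root [ $(date +\%w) -eq 0 ] && [ -x /usr/lib/zfs-linux/scrub ] && /usr/lib/zfs-linux/scrub
```

  To run the job "in a few minutes" as the test case describes, one would change `24 0 8-14` to a minute/hour/day matching the current time and relax the `-eq 0` weekday check.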
I did this, on Xenial, with the package I built. The debdiff is in comment #11 and was accepted to Yakkety. If someone can get this into -proposed before the 14th, I'll gladly install the actual package from -proposed and make sure it runs correctly on the 14th.

[Regression Potential]

The patch only touches the cron.d file, which contains only one cron job. That cron job is completely broken (inoperative) at the moment, so the regression potential is very low.

ORIGINAL, PRE-SRU, DESCRIPTION:

mdadm automatically checks MD arrays; ZFS should automatically scrub pools too. Scrubbing a pool allows ZFS to detect on-disk corruption and (when the pool has redundancy) correct it. Note that ZFS does not blindly assume the other copy is correct; it only overwrites bad data with data that is known to be good (i.e. data that passes the checksum).

I've attached a debdiff which accomplishes this. It builds and installs cleanly. The meat of it is the scrub script I've been using on production systems (both servers and laptops), and recommending in my Ubuntu root-on-ZFS HOWTO, for years; it scrubs all *healthy* pools. If a pool is not healthy, scrubbing it is bad for two reasons:

1) It adds a lot of disk load, which could theoretically lead to another failure. We should save that disk load for resilvering.

2) Performance is already reduced on a degraded pool, and scrubbing can make that worse, even though scrubs are throttled.

Arguably, I may be too conservative here, but the marginal benefit of scrubbing a *degraded* pool is minimal, as pools should not be left degraded for very long.

The cron.d in this patch scrubs on the second Sunday of the month; mdadm scrubs on the first Sunday of the month. This way, if a system has both MD arrays and ZFS pools, the load doesn't all happen at the same time. If the system doesn't have both types, it shouldn't really matter which week. If you'd rather make it the same week as mdadm, I see no problem with that.
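The "scrub all healthy pools" logic described above can be sketched as follows. This is a hypothetical reimplementation for illustration, not the shipped script; the `healthy_pools` function name is invented here:

```shell
#!/bin/sh
# Sketch of the "scrub only healthy pools" idea (not the shipped script).
# healthy_pools reads "HEALTH<tab>NAME" lines on stdin, in the shape
# produced by `zpool list -H -o health,name`, and prints only the names
# of pools whose health is ONLINE, skipping DEGRADED/FAULTED pools.
healthy_pools() {
    awk -F'\t' '$1 == "ONLINE" { print $2 }'
}

# On a real system this would be driven by zpool itself, e.g.:
#   zpool list -H -o health,name | healthy_pools |
#       while read -r pool; do zpool scrub "$pool"; done
```

Filtering on the reported health field is what implements the conservatism argued for above: a DEGRADED pool never reaches `zpool scrub`, so its remaining disks are spared the extra load until a resilver completes.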
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1548009/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1376245] Re: remove_proc_entry+0x139/0x1b0() -- name 'fs/nfsfs'
I was getting this error regularly in my system log yesterday. Since installing the kernel images from #18 I'm not seeing the error any more, so it looks like that's the fix.

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1376245

Title:
  remove_proc_entry+0x139/0x1b0() -- name 'fs/nfsfs'

Status in “linux” package in Ubuntu:
  Confirmed

Bug description:
  Ubuntu 14.04 amd64

  ii  linux-image-3.13.0-37-generic  3.13.0-37.64  amd64  Linux kernel image for version 3.13.0 on 64 bit x86 SMP

  sloeuillet@SLoeuillet-DE107:/sfr$ uname -a
  Linux SLoeuillet-DE107 3.13.0-37-generic #64-Ubuntu SMP Mon Sep 22 21:28:38 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

  Today, I encountered 2 quite identical stack traces in my dmesg, related to nfs:

  [368973.412120] WARNING: CPU: 1 PID: 15607 at /build/buildd/linux-3.13.0/fs/proc/generic.c:511 remove_proc_entry+0x139/0x1b0()
  [368973.412122] name 'fs/nfsfs'
  [368973.412123] Modules linked in: cdc_acm nfsv3 rpcsec_gss_krb5 8021q garp stp mrp llc cuse bnep rfcomm bluetooth nfsv4 nfsd auth_rpcgss nfs_acl nfs lockd sunrpc binfmt_misc fscache hp_wmi sparse_keymap snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_pcm radeon snd_page_alloc snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq sp5100_tco snd_seq_device serio_raw snd_timer ttm drm_kms_helper snd kvm edac_core soundcore drm edac_mce_amd k8temp i2c_algo_bit i2c_piix4 tpm_infineon shpchp wmi mac_hid parport_pc ppdev lp parport hid_generic usbhid hid tg3 psmouse ptp pps_core floppy ahci libahci
  [368973.412164] CPU: 1 PID: 15607 Comm: kworker/u8:1 Tainted: G W 3.13.0-37-generic #64-Ubuntu
  [368973.412166] Hardware name: Hewlett-Packard HP Compaq dc5850 Small Form Factor/3029h, BIOS 786F6 v03.14 11/15/2011
  [368973.412171] Workqueue: netns cleanup_net
  [368973.412172] 0009 88006a875c80 8171ed09 88006a875cc8
  [368973.412176] 88006a875cb8 8106773d 0005
  [368973.412179] a05168a8 88011b418b30 0100 88006a875d18
  [368973.412182] Call Trace:
  [368973.412188] [] dump_stack+0x45/0x56
  [368973.412192] [] warn_slowpath_common+0x7d/0xa0
  [368973.412195] [] warn_slowpath_fmt+0x4c/0x50
  [368973.412198] [] remove_proc_entry+0x139/0x1b0
  [368973.412217] [] nfs_fs_proc_net_exit+0x62/0x70 [nfs]
  [368973.412225] [] nfs_net_exit+0x12/0x20 [nfs]
  [368973.412228] [] ops_exit_list.isra.1+0x39/0x60
  [368973.412231] [] cleanup_net+0x110/0x250
  [368973.412235] [] process_one_work+0x182/0x450
  [368973.412237] [] worker_thread+0x121/0x410
  [368973.412240] [] ? rescuer_thread+0x430/0x430
  [368973.412243] [] kthread+0xd2/0xf0
  [368973.412246] [] ? kthread_create_on_node+0x1c0/0x1c0
  [368973.412249] [] ret_from_fork+0x7c/0xb0
  [368973.412251] [] ? kthread_create_on_node+0x1c0/0x1c0
  [368973.412253] ---[ end trace bb78fe56a3dcb678 ]---

  [368973.448199] WARNING: CPU: 1 PID: 15607 at /build/buildd/linux-3.13.0/fs/proc/generic.c:511 remove_proc_entry+0x139/0x1b0()
  [368973.448201] name 'fs/nfsfs'
  [368973.448203] Modules linked in: cdc_acm nfsv3 rpcsec_gss_krb5 8021q garp stp mrp llc cuse bnep rfcomm bluetooth nfsv4 nfsd auth_rpcgss nfs_acl nfs lockd sunrpc binfmt_misc fscache hp_wmi sparse_keymap snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_pcm radeon snd_page_alloc snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq sp5100_tco snd_seq_device serio_raw snd_timer ttm drm_kms_helper snd kvm edac_core soundcore drm edac_mce_amd k8temp i2c_algo_bit i2c_piix4 tpm_infineon shpchp wmi mac_hid parport_pc ppdev lp parport hid_generic usbhid hid tg3 psmouse ptp pps_core floppy ahci libahci
  [368973.448250] CPU: 1 PID: 15607 Comm: kworker/u8:1 Tainted: G W 3.13.0-37-generic #64-Ubuntu
  [368973.448252] Hardware name: Hewlett-Packard HP Compaq dc5850 Small Form Factor/3029h, BIOS 786F6 v03.14 11/15/2011
  [368973.448257] Workqueue: netns cleanup_net
  [368973.448259] 0009 88006a875c80 8171ed09 88006a875cc8
  [368973.448263] 88006a875cb8 8106773d 0005
  [368973.448265] a05168a8 88011b418b30 0100 88006a875d18
  [368973.448268] Call Trace:
  [368973.448275] [] dump_stack+0x45/0x56
  [368973.448279] [] warn_slowpath_common+0x7d/0xa0
  [368973.448283] [] warn_slowpath_fmt+0x4c/0x50
  [368973.448286] [] remove_proc_entry+0x139/0x1b0
  [368973.448313] [] nfs_fs_proc_net_exit+0x62/0x70 [nfs]
  [368973.448322] [] nfs_net_exit+0x12/0x20 [nfs]
  [368973.448325] [] ops_exit_list.isra.1+0x39/0x60
  [368973.448327] [] cleanup_net+0x110/0x250
  [368973.448331] [] process_one_work+0x182/0x450
  [368973.448334] [] worker_thread+0x121/0x410
  [368973.448337] [] ? rescuer_thread+0x430/0x430
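One way to confirm, as the comment above does, that the warning has stopped appearing after a kernel update is simply to count its occurrences in the kernel log. A small sketch; the function name is invented here, and the fixed string is the literal tag from the traces above:

```shell
#!/bin/sh
# Count log lines mentioning the proc-entry warning. On a real system,
# pipe in `dmesg` output or /var/log/kern.log; zero means the warning
# has not reappeared since boot (or since the log was rotated).
count_nfsfs_warnings() {
    grep -c "name 'fs/nfsfs'"
}

# Example usage (hypothetical):
#   dmesg | count_nfsfs_warnings
```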