** Description changed:

[Impact]

The mdadm package is missing the mdcheck script. This has two consequences:

In the immediate term, that means that we get failed systemd units on all of our physical machines (because they have mirrored disks) as we upgrade them to 20.04. This raises alarms in our monitoring system as we monitor systemd unit failures.

In the longer-term, this means that the arrays are not being checked. If a drive develops a bad sector, this would normally be caught by the checking and a good copy would be rewritten from the other side of the mirror. Without the checking, that will not happen. If the other drive (the one with the good version of the sector) dies, then that sector's data is lost permanently. The consequences of that depend on what that sector was storing, but it's not good, obviously.

[Test Case]

* systemctl start mdcheck_start.service

* journalctl -u mdcheck_start
-- Logs begin at Wed 2020-09-23 18:33:35 UTC, end at Wed 2020-09-23 18:40:27 UTC. --
Sep 23 18:40:27 mdadmgroovy systemd[1]: Starting MD array scrubbing...
Sep 23 18:40:27 mdadmgroovy systemd[1515]: mdcheck_start.service: Failed to execute command: No such file or directory
Sep 23 18:40:27 mdadmgroovy systemd[1515]: mdcheck_start.service: Failed at step EXEC spawning /usr/share/mdadm/mdcheck: No such file or directory
Sep 23 18:40:27 mdadmgroovy systemd[1]: mdcheck_start.service: Main process exited, code=exited, status=203/EXEC
Sep 23 18:40:27 mdadmgroovy systemd[1]: mdcheck_start.service: Failed with result 'exit-code'.
Sep 23 18:40:27 mdadmgroovy systemd[1]: Failed to start MD array scrubbing.

* ls -altr /usr/share/mdadm/mdcheck
ls: cannot access '/usr/share/mdadm/mdcheck': No such file or directory

* dpkg -l mdadm
ii mdadm 4.1-5ubuntu1 amd64 tool to administer Linux MD arrays (software RAID)

* dpkg -L mdadm | grep -i mdcheck
/lib/systemd/system/mdcheck_continue.service
/lib/systemd/system/mdcheck_continue.timer
/lib/systemd/system/mdcheck_start.service
/lib/systemd/system/mdcheck_start.timer

* Also, we'd like to see if the mdcheck is performed under the 'natural' scheduled execution (so on the nearest Sunday) and have affected users report feedback supported by logs.

* We found a regression fixed upstream:
https://git.kernel.org/pub/scm/utils/mdadm/mdadm.git/commit/?id=6636788aaf4ec0cacaefb6e77592e4a68e70a957
+
+ * We then found a regression fix for the above regression fix, pushed it into groovy, and then submitted it upstream to the linux-raid ML:
+ https://marc.info/?l=linux-raid&m=160130979927617&w=2

* We'd like to see if, when mdcheck_start is enabled, mdcheck_continue is enabled too.

[Regression Potential]

* 'misc/mdcheck' will be introduced in Ubuntu for the first time, and is pretty young in the Debian mdadm story too (introduced on Sept 12 2020). No known fix has been added on top of it since Debian introduced it about two weeks ago.

$ git log --oneline --grep="mdcheck"
5a3db0f Install misc/mdcheck; turn on hardening; enable dh_lintian. (Closes: #960132)
f258a5e mdcheck: improve cleanup
ea83549 mdcheck: add some logging.
979b1fe mdcheck: be careful when sourcing the output of "mdadm --detail --export"
36dab45 mdcheck: don't git error if not /dev/md?* devices exist.
868ab80 mdcheck: don't pass the '+' to "date".
df881f7 mdcheck: new script to help with regular checks of md arrays.

No new bugs related to the introduction of mdcheck have been opened either.

On code inspection, the 'mdcheck' script seems to be harmless (at least at first glance). Of course, real-case testing across RAID types will be needed during the verification phase to conclude, and, if possible, running the script in debug mode (set -xv) might be a good idea to see the script workflow in action.

This change will permit 'mdcheck' to be run on the first Sunday of each month for 6 hours (mdcheck_start.timer: OnCalendar=Sun *-*-1..7 1:00:00), then on every subsequent morning until the check is finished (mdcheck_continue.timer: OnCalendar=daily).

It's not a script that one would typically run manually on a regular basis.

The script uses 'logger' to enter messages into the system log, so we will have a trace of its execution (in addition to the usual systemd unit and timer logs) when it begins, pauses, and continues. I also added a patch to my upload in which mdcheck logs the completion as well, giving users the opportunity to know how long the RAID check took, which I think is paramount information to include with the introduction of this script in Ubuntu.

I would suggest we don't release the package in focal-updates before having at least one sample of a 'natural' scheduled execution on the first Sunday of the month (Next should be October 4th ?), and have affected users report feedback supported by logs.

I think running it on Sunday is reasonable (just like fstrim, zfs scrub, ...). Typically, Sunday is a day when cron jobs and timers run this kind of task.

One thing I would like to confirm, though it may not be a blocker for this case, is that 'mdcheck_continue' starts fine when its conditions are met, since it has never been tested: 'mdcheck_start' always failed because of the missing 'mdcheck' script.
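
To double-check which date the 'natural' run falls on, the calendar expression Sun *-*-1..7 can be worked out by hand. The following is only a rough sketch, not part of the mdadm package: first_sunday is a hypothetical helper name, and it assumes GNU date (the -d option).

```shell
#!/bin/sh
# Print the first Sunday of a month given as YYYY-MM, i.e. the only date
# in that month matching "OnCalendar=Sun *-*-1..7".
# Hypothetical helper for illustration; assumes GNU date (-d option).
first_sunday() {
    month=$1                      # e.g. 2020-10
    for day in 01 02 03 04 05 06 07; do
        # %u prints the ISO weekday number: 1 = Monday ... 7 = Sunday
        if [ "$(date -u -d "$month-$day" +%u)" = 7 ]; then
            echo "$month-$day"
            return 0
        fi
    done
    return 1
}

first_sunday 2020-10    # prints 2020-10-04
```

On a machine with systemd, systemd-analyze calendar 'Sun *-*-1..7 1:00:00' should give the same answer as the next elapse time, and is the more direct check there.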

[Other Info]

Debian bug:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=960132

salsa commit:
https://salsa.debian.org/lechner/mdadm/-/commit/5a3db0f5429fc81e0f53cbf9aa473059b74fe057

[Original Description]

mdcheck_start.service trying to start unexisting file

root@d:~# cat /lib/systemd/system/mdcheck_start.service | grep Exec
ExecStart=/usr/share/mdadm/mdcheck --duration $MDADM_CHECK_DURATION

root@d:~# ls -la /usr/share/mdadm/mdcheck
ls: cannot access '/usr/share/mdadm/mdcheck': No such file or directory

ProblemType: Bug
DistroRelease: Ubuntu 19.10
Package: mdadm 4.1-2ubuntu3
ProcVersionSignature: Ubuntu 5.3.0-19.20-generic 5.3.1
Uname: Linux 5.3.0-19-generic x86_64
ApportVersion: 2.20.11-0ubuntu8.2
Architecture: amd64
Date: Fri Nov 15 13:13:17 2019
Lspci: Error: [Errno 2] No such file or directory: 'lspci': 'lspci'
Lsusb: Error: [Errno 2] No such file or directory: 'lsusb': 'lsusb'
MachineType: HP HP EliteBook x360 1030 G3
ProcEnviron:
 LANG=C
 TERM=screen
 PATH=(custom, no user)
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.3.0-19-generic root=/dev/mapper/system-root ro cryptdevice=UUID=95c107ea-73d0-4206-a31c-fb0ed6d7d6a9:cryptlvm mem_sleep_default=deep
ProcMDstat:
 Personalities :
 unused devices: <none>
SourcePackage: mdadm
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 08/07/2019
dmi.bios.vendor: HP
dmi.bios.version: Q90 Ver. 01.08.01
dmi.board.name: 8438
dmi.board.vendor: HP
dmi.board.version: KBC Version 14.3F.00
dmi.chassis.asset.tag: 5CD9296RDC
dmi.chassis.type: 31
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrQ90Ver.01.08.01:bd08/07/2019:svnHP:pnHPEliteBookx3601030G3:pvr:rvnHP:rn8438:rvrKBCVersion14.3F.00:cvnHP:ct31:cvr:
dmi.product.family: 103C_5336AN HP EliteBook x360
dmi.product.name: HP EliteBook x360 1030 G3
dmi.product.sku: 5SR46ES#ACB
dmi.sys.vendor: HP
etc.blkid.tab: Error: [Errno 2] No such file or directory: '/etc/blkid.tab'
initrd.files: Error: [Errno 2] No such file or directory: '/boot/initrd.img-5.3.0-19-generic'
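
The manual check in the original description (grep the unit for its ExecStart line, then ls the path) can be scripted so it covers every mdcheck unit at once. This is a minimal sketch under stated assumptions: missing_exec is a hypothetical helper, not something shipped by mdadm or systemd, and it only handles the simple "ExecStart=/path args..." form, optionally with systemd's prefix characters.

```shell
#!/bin/sh
# Report ExecStart= programs in a unit file that are missing or not executable.
# Hypothetical helper for illustration; only parses the simple
# "ExecStart=/path args..." form (with optional -, @, :, +, ! prefixes).
missing_exec() {
    unit=$1
    rc=0
    # Keep only the program path: strip the key, any prefix characters,
    # and everything after the first space.
    for prog in $(sed -n 's/^ExecStart=[-@:+!]*\([^ ]*\).*/\1/p' "$unit"); do
        if [ ! -x "$prog" ]; then
            echo "missing: $prog"
            rc=1
        fi
    done
    return $rc
}

# Check the mdcheck service units, skipping the pattern if nothing matches.
for u in /lib/systemd/system/mdcheck_*.service; do
    [ -e "$u" ] || continue
    missing_exec "$u" || true
done
```

On an affected system this would be expected to print "missing: /usr/share/mdadm/mdcheck" for mdcheck_start.service, matching the status=203/EXEC failure seen in the journal.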

-- 
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1852747

Title:
  mdcheck_start.service trying to start unexisting file

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/1852747/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs