Package: smartmontools
Version: 6.3+svn4002-2

When booting my Raspberry Pi (running Raspbian, an unofficial Debian Jessie port) I *sometimes* find that smartd fails to pick up an external USB-attached drive. Looking at syslog, the reason appears to be that smartd is often started in the boot sequence before the kernel has detected the drive, so the /dev/disk/by-id/ reference doesn't (yet) exist:

Jul 20 18:59:17 backup systemd[1]: Starting Self Monitoring and Reporting Technology (SMART) Daemon...
Jul 20 18:59:17 backup systemd[1]: Started Self Monitoring and Reporting Technology (SMART) Daemon.
Jul 20 18:59:17 backup smartd[403]: smartd 6.4 2014-10-07 r4002 [armv7l-linux-4.4.13-v7+] (local build)
Jul 20 18:59:17 backup smartd[403]: Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org
Jul 20 18:59:17 backup smartd[403]: Opened configuration file /etc/smartd.conf
Jul 20 18:59:17 backup smartd[403]: Configuration file /etc/smartd.conf parsed.
Jul 20 18:59:17 backup smartd[403]: Device: /dev/disk/by-id/ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N6FVL47R [SAT], open() failed: No such device
Jul 20 18:59:17 backup smartd[403]: Unable to monitor any SMART enabled devices. Try debug (-d) option. Exiting...
Jul 20 18:59:17 backup systemd[1]: smartd.service: main process exited, code=exited, status=17/n/a
Jul 20 18:59:17 backup systemd[1]: Unit smartd.service entered failed state.
 |
<unrelated logging snipped>
 |
Jul 20 18:59:21 backup kernel: [ 9.092426] usb 1-1.2: New USB device found, idVendor=174c, idProduct=1053
Jul 20 18:59:21 backup kernel: [ 9.092439] usb 1-1.2: New USB device strings: Mfr=2, Product=3, SerialNumber=1
Jul 20 18:59:21 backup kernel: [ 9.092446] usb 1-1.2: Product: USB3.0 Device
Jul 20 18:59:21 backup kernel: [ 9.092452] usb 1-1.2: Manufacturer: Generic
Jul 20 18:59:21 backup kernel: [ 9.092458] usb 1-1.2: SerialNumber: AC0000000001
Jul 20 18:59:21 backup kernel: [ 9.093049] usb-storage 1-1.2:1.0: USB Mass Storage device detected
Jul 20 18:59:21 backup kernel: [ 9.094618] scsi host0: usb-storage 1-1.2:1.0
Jul 20 18:59:22 backup kernel: [ 10.091572] scsi 0:0:0:0: Direct-Access ASMT 2105 0 PQ: 0 ANSI: 6
Jul 20 18:59:22 backup kernel: [ 10.096115] sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16).
Jul 20 18:59:22 backup kernel: [ 10.099979] sd 0:0:0:0: [sda] 5860533168 512-byte logical blocks: (3.00 TB/2.73 TiB)
Jul 20 18:59:22 backup kernel: [ 10.099995] sd 0:0:0:0: [sda] 4096-byte physical blocks
Jul 20 18:59:22 backup kernel: [ 10.104503] sd 0:0:0:0: Attached scsi generic sg0 type 0
Jul 20 18:59:22 backup kernel: [ 10.104857] sd 0:0:0:0: [sda] Write Protect is off
Jul 20 18:59:22 backup kernel: [ 10.104873] sd 0:0:0:0: [sda] Mode Sense: 43 00 00 00
Jul 20 18:59:22 backup kernel: [ 10.106730] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jul 20 18:59:22 backup kernel: [ 10.110461] sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16).
Jul 20 18:59:22 backup kernel: [ 10.157535]  sda: sda1 sda2 sda3
Jul 20 18:59:22 backup kernel: [ 10.161101] sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16).
Jul 20 18:59:22 backup kernel: [ 10.163033] sd 0:0:0:0: [sda] Attached SCSI disk

In case it is relevant, here is my smartd.conf:

/dev/disk/by-id/ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N6FVL47R -d sat -n standby -a -o on -S on -s (S/../.././03|L/../../7/04) -r 194 -I 194 -W 5,40,45

I am using the /dev/disk/by-id/ reference as I sometimes swap disks around and each has different smartd configuration parameters. Following boot completion the symlink does indeed exist:

#  ls -l /dev/disk/by-id/ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N6FVL47R
lrwxrwxrwx 1 root root 9 Jul 20 18:59 /dev/disk/by-id/ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N6FVL47R -> ../../sda

...and if I then start smartmontools manually everything is fine.

More often than not it works without intervention, i.e. the disk is detected before smartd starts. But as seen above, the order can be reversed by a few seconds, which ends in a non-obvious failure (I only spot it if I check the logs). Should/could there be anything to prevent this race condition?
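
For illustration, one conceivable workaround would be to delay smartd until the symlink appears. The following is only a minimal sketch, not something shipped by smartmontools or Debian: the function name wait_for_path and the 30-second default timeout are assumptions made up for this example.

```shell
#!/bin/sh
# Hypothetical helper (name and defaults are assumptions, not part of smartmontools):
# poll once per second until PATH exists, or give up after TIMEOUT seconds.
# Usage: wait_for_path PATH [TIMEOUT_SECONDS]
wait_for_path() {
    path=$1
    timeout=${2:-30}   # assumed default; tune to the slowest observed boot
    elapsed=0
    # [ -e ] follows symlinks, so this succeeds only once the link
    # and the device node it points to both exist.
    while [ ! -e "$path" ]; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "timed out after ${timeout}s waiting for $path" >&2
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

It could be called with the by-id path from my smartd.conf, e.g. `wait_for_path /dev/disk/by-id/ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N6FVL47R 30`, before smartd is launched (for instance from a wrapper script or a systemd ExecStartPre= line). This is untested on the affected system, and polling only narrows the race; it does not help if the drive is attached after the timeout.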

Perhaps it is a Raspbian-specific issue? If so, apologies for this misdirected report.

Regards,

Mathew
