On 2016-10-17 23:23, Anand Jain wrote:


I would like to monitor my btrfs-filesystem for missing drives.

This is actually correct behavior, the filesystem reports that it should
have 6 devices, which is how it knows a device is missing.

 Missing - means missing at the time of mount. So how are you planning
to monitor a disk which is failed while in production ?
No, in `btrfs fi show` it means that the device can't be found. All `fi show` does is print whatever info it can find about the filesystem, nothing more, nothing less. It's trivial to see from the output (in two different ways, I might add) that you're missing devices and how many you still have. Without poking at the FS directly, the only way to figure out how many devices the FS is supposed to have (or at least, how many it thinks it should have) is the device count printed by `btrfs fi show`.
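As a rough illustration of reading both indicators out of the `fi show` output, something like the following might do. It is only a sketch: the function name `summarize_fi_show` is made up, and it assumes the text covers a single filesystem (for example, the output of `btrfs fi show` for one mount point) and contains the usual `Total devices N` line, one `devid` line per device found, and a `Some devices missing` marker.

```python
import re

def summarize_fi_show(text):
    """Compare the expected device count with the devices actually listed.

    Assumes `text` is `btrfs fi show` output for ONE filesystem.
    """
    # "Total devices N" is how many devices the filesystem thinks it has.
    total = re.search(r"Total devices\s+(\d+)", text)
    expected = int(total.group(1)) if total else None
    # Each device that was actually found gets its own "devid" line.
    present = len(re.findall(r"^[ \t]*devid[ \t]+\d+[ \t]", text, flags=re.MULTILINE))
    # The explicit warning line is the second indicator.
    flagged = "Some devices missing" in text
    missing = expected - present if expected is not None else None
    return {"expected": expected, "present": present,
            "missing": missing, "flagged": flagged}
```

A monitoring script could alert whenever `missing` is greater than zero or `flagged` is true.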

Now, for production usage, you have three things you should be monitoring:
1. Output from `btrfs dev stats`. This reports per-device error counters, and is one of the best ways to see if something is wrong, and also gives you a decent indicator of exactly what is wrong.
2. Status from regular scrub operations. Pretty self-explanatory.
3. SMART status of the underlying devices themselves. This will catch pre-failure conditions, and the direct access from smartctl will error out when the drive has failed to the point of not being present.
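For the first item, a minimal sketch of a check on the error counters could look like this. The helper names (`nonzero_counters`, `check_mount`) are hypothetical, and the parser assumes the usual `[device].counter_name  value` line format printed by `btrfs device stats`.

```python
import re
import subprocess

# One counter per line, e.g. "[/dev/sda1].write_io_errs    0"
STAT_LINE = re.compile(r"^\[(?P<dev>.+)\]\.(?P<counter>\w+)\s+(?P<value>\d+)$")

def nonzero_counters(stats_text):
    """Return (device, counter, value) for every counter above zero."""
    problems = []
    for line in stats_text.splitlines():
        m = STAT_LINE.match(line.strip())
        if m and int(m.group("value")) > 0:
            problems.append((m.group("dev"), m.group("counter"),
                             int(m.group("value"))))
    return problems

def check_mount(mountpoint):
    """Run `btrfs device stats` on a mount point and report nonzero counters."""
    out = subprocess.run(["btrfs", "device", "stats", mountpoint],
                         capture_output=True, text=True, check=True).stdout
    return nonzero_counters(out)
```

Calling `check_mount` on a mounted filesystem from cron and alerting on any non-empty result would cover the basic case. Scrub status and SMART checks would need their own handling.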

You can additionally monitor:
1. Filesystem flags. These will change when the filesystem goes degraded, and monitoring them is actually good practice for any filesystem, not just BTRFS.
2. Total filesystem size. If this changes without manual intervention, something is seriously wrong.
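Both of these additional checks can be approximated by recording a baseline and comparing against it later. The sketch below is one possible approach, not the only one: the function names are hypothetical, it assumes the flags of interest are visible as mount options in `/proc/mounts` (`ro` and `degraded` are examples one might watch for), and it uses `statvfs` as a stand-in for the total filesystem size.

```python
import os

def mount_options(mounts_text, mountpoint):
    """Return the set of mount options for `mountpoint`, or None if not mounted.

    `mounts_text` is /proc/mounts-style text: device, mount point, type, options, ...
    """
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mountpoint:
            # A later entry for the same mount point shadows earlier ones.
            options = set(fields[3].split(","))
    return options

def total_bytes(mountpoint):
    """Total size of the filesystem as reported by statvfs."""
    st = os.statvfs(mountpoint)
    return st.f_frsize * st.f_blocks

def snapshot(mountpoint):
    """Capture the current options and size for later comparison."""
    with open("/proc/mounts") as f:
        return mount_options(f.read(), mountpoint), total_bytes(mountpoint)
```

Saving the result of `snapshot` once and comparing it with a fresh one on each run would flag any change in either value.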