On 2015-11-22 16:59, Nils Steinger wrote:
Hi,

I recently ran into a problem while trying to back up some of my btrfs
subvolumes over the network:
`btrfs send` works flawlessly on snapshots of most subvolumes, but keeps
failing on snapshots of a certain subvolume — always after sending 15 GiB:

btrfs send /btrfs/snapshots/home/2015-11-17_03:28:14_BOOT-AUTOSNAPSHOT |
pv | ssh kappa "btrfs receive /mnt/300gb/backups/snapshots/zeta/home/"
At subvol /btrfs/snapshots/home/2015-11-17_03:28:14_BOOT-AUTOSNAPSHOT
At subvol 2015-11-17_03:28:14_BOOT-AUTOSNAPSHOT
ERROR: send ioctl failed with -2: No such file or directory
   15GB 0:34:34 [7,41MB/s]

I've tried piping the output to /dev/null instead of ssh and got the
same error (again after sending 15 GiB), so this seems to be on the
sending side.

This issue comes up from time to time with send. It isn't well understood or documented, but something in the source FS can occasionally get into a state that send chokes on, at which point send fails. I've been trying to reproduce it on a small filesystem so that it's easier to debug, but so far without success. Since I have yet to find a reliable way to reproduce it, I have no reliable way to prevent it from happening either.

However, btrfs scrub reports no errors and I don't get any messages in
dmesg when the btrfs send fails.

Scrub is intended to fix corruption caused by hardware failures. In almost all the cases I've seen of what you are hitting, there was no provable hardware issue, and scrub returned no errors.

What could cause this kind of error?
And is there a way to fix it, preferably without recreating the FS?

In general (assuming you are seeing the same issue I run into from time to time), there are two options other than recreating the filesystem:

1. Recreate the file that send is choking on. You can see which file it is by adding -vv to the receive command line, although be ready for lots of output. It's important to note that mv won't work for this unless you're moving the data to a different filesystem (if it's a directory, copy everything out, recreate the directory, then copy everything back in). The downside to this option is that you will usually run into multiple files that send chokes on, and the only way to find them all is to keep repeating the process until send completes successfully.

2. Run a full balance on the FS. This doesn't work anywhere near as reliably as the first option, but it is the only way to fix some issues caused by doing batch deduplication on some older kernels.
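
For a single regular file, option 1 could look roughly like the sketch below. It is only an illustration, not a tested recipe: recreate_file is a made-up helper name, and the scratch directory is assumed to be on a different filesystem with enough free space. The file has to be recreated in the live subvolume (existing snapshots are read-only), so a fresh snapshot would be needed before retrying send.

```shell
#!/bin/sh
# Hypothetical helper (not part of btrfs-progs): recreate a file by copying
# it off the filesystem, deleting the original, and copying it back.
# Assumption: $2 is a scratch directory on a DIFFERENT filesystem.
recreate_file() {
    target=$1
    scratch_dir=$2
    tmp_copy="$scratch_dir/$(basename "$target").recreate.$$"

    # Copy the data out, preserving mode, ownership and timestamps.
    cp -a -- "$target" "$tmp_copy" || return 1
    # Remove the original so that the copy back writes a new file
    # instead of reusing the existing one.
    rm -f -- "$target" || return 1
    # Copy the data back in. If this fails, the scratch copy is left
    # in place so the data is not lost.
    cp -a -- "$tmp_copy" "$target" || return 1
    rm -f -- "$tmp_copy"
}
```

It would be called as recreate_file /path/to/problem/file /path/on/other/fs, once for each file that send reports trouble with, and then the snapshot and send are repeated.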
