I spent some time investigating our issue further. As far as I can tell, the main issue is that ioctl(FIFREEZE) can take a long time when running VSS backups, and the default timeout is 10s. This is very noticeable under load, with rare peaks of >5s seen, so 10s seem plausible
If the timeout is hit in the kernel module, the hv_vss_daemon doesnt recover and quits, with the FS still frozen. This fixes some HV VSS daemon behavior where it doesnt recover on a write failed if the previous request timed out (e.g. THAW takes too long) We are currently running this patch including @AlexNg 's patch (1 of 2) in the usual backup loop We already hit the bug at least 5 times, which causes the VSS backup to fail, but subsequent backups work without problems, and the guest systems continue to work normally ** Patch added: "Experimental fix/workaround for hv_vss_daemon (based on 4.4.0-34)" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1470250/+attachment/4730147/+files/vss_timeout_fix.patch -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1470250 Title: [Hyper-V] Ubuntu 14.04.2 LTS Generation 2 SCSI Errors on VSS Based Backups To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1470250/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs