I spent a while trying to reproduce this:

- I tried many VM configurations running 'serial-shell-looper', both manually and started by cloud-init, but it never broke.
- I wrote a similar script using util.meta_log to see whether the difference in implementation between Python's open() and shell piping would matter. I wasn't able to find anything useful, though. http://paste.ubuntu.com/15843875/
- I was able to reproduce the failure about 9 times out of 10 using XenialTestBasic with no modifications. After removing most of the functionality of the test other than basic booting (no curtin cmd, no curtin archive, no extra disks), it still failed just as reliably. There were still occasional runs with no failure, though.
- Since these failures have been occurring much more often recently, I reverted the net.ifnames=0 removal and ran vmtests several times, and did not see any failures. I enabled and disabled this parameter many times to make sure, and the issue appears almost always with ifnames enabled and never with them disabled. This suggests that, in some way I haven't figured out yet, the naming of network devices shifts timing enough to toggle this error on and off.
- I was able to reproduce it just as often using a modified version of the cloud-init vmtests, using both a cloud-init deb built from the current revision of cloud-init and a deb built from cloud-init at revision 1188, before the new networking code was merged in. In both versions, this error almost always occurred when running with ifnames enabled and never occurred when running with ifnames disabled.
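For anyone who wants to try a write loop against the console outside of vmtest, a minimal sketch might look like the following. This is not the script from the paste above and does not use util.meta_log; the function name write_loop and its parameters are hypothetical. Opening with O_NONBLOCK is also an assumption on my part, chosen so that a console that would block shows up as EAGAIN rather than hanging the loop.

```python
import errno
import os
from collections import Counter


def write_loop(path, count=100, message="console write test\n"):
    """Open `path` and write `message` to it `count` times.

    Returns a Counter keyed by "ok" or by the symbolic errno name
    (for example "EIO" or "EAGAIN") of each failed attempt.
    """
    results = Counter()
    data = message.encode()
    for _ in range(count):
        try:
            # O_NOCTTY: do not make the console our controlling terminal.
            # O_NONBLOCK: report a would-block condition instead of hanging.
            fd = os.open(path, os.O_WRONLY | os.O_NOCTTY | os.O_NONBLOCK)
            try:
                os.write(fd, data)
            finally:
                os.close(fd)
            results["ok"] += 1
        except OSError as exc:
            # Tally failures by errno name so EIO stands out from the rest.
            results[errno.errorcode.get(exc.errno, str(exc.errno))] += 1
    return results
```

Running something like write_loop("/dev/console", count=1000) as root during boot, with ifnames enabled and then disabled, would at least show whether the failures are EIO on open or write, or a blocked write.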
I'm not really sure how to reproduce this error on a small scale yet. I am going to try to figure out what could be running concurrently with cc_ssh_authkey_fingerprints and see if I can figure anything else out from there. I haven't yet tried disabling StandardOutput=journal+console in cloud-final.service, but I will give that a try as well, although it is already present in wily and wily does not seem to have this issue.

The only idea I have so far for the underlying cause is flow control on /dev/console. Since the serial console is being forwarded to a file over ipmi by qemu and is write only, it may be possible that something expects to read from there (maybe agetty?) and flow control is causing a block. I'm not sure if that makes sense, though. The main thing that suggests it is a series of bug reports on several different mailing lists about syslog-ng writing directly to /dev/console and hanging in some situations, such as when traffic from /dev/console is being forwarded to a device that temporarily goes offline, causing write to block. http://comments.gmane.org/gmane.comp.syslog-ng/10561

--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1473527

Title:
  module ssh-authkey-fingerprints fails Input/output error: /dev/console

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1473527/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs