By the way, the task blocks on the following:

==> /proc/54255/task/54255/stack <==
[<c000000000103a1c>] wakeup_preempt_entity.isra.6+0x7c/0x90
[<c000000000015df8>] __switch_to+0x1f8/0x350
[<c00000000054cd9c>] get_request+0x29c/0x910
[<c000000000550d64>] blk_queue_bio+0x164/0x500
[<c00000000054e274>] generic_make_request+0x154/0x310
[<c00000000054e504>] submit_bio+0xd4/0x1f0
[<c0000000003b805c>] ext4_io_submit+0x7c/0xb0
[<c0000000003b2938>] ext4_writepages+0x4a8/0xdd0
[<c000000000243090>] do_writepages+0x60/0xc0
[<c000000000230988>] __filemap_fdatawrite_range+0xf8/0x170
[<c000000000416378>] jbd2_journal_begin_ordered_truncate+0xe8/0x130
[<c0000000003b5c70>] ext4_evict_inode+0x530/0x5e0
[<c00000000030d488>] evict+0xf8/0x2a0
[<c0000000002fe668>] do_unlinkat+0x1a8/0x3a0
[<c000000000009284>] system_call+0x38/0xe4
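In case it helps anyone comparing traces from several hosts, here is a minimal sketch that pulls the symbol names out of a /proc/<pid>/task/<tid>/stack dump like the one above. The regular expression and the parse_frames helper are my own illustration (not an existing tool), and the sample holds only three of the frames shown above.

```python
import re

# Matches one frame such as "[<c00000000054cd9c>] get_request+0x29c/0x910".
# The pattern is an assumption based on the format seen in this trace.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<symbol>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
)

def parse_frames(text):
    """Return (symbol, offset, size) tuples in the order they appear."""
    return [
        (m["symbol"], int(m["offset"], 16), int(m["size"], 16))
        for m in FRAME_RE.finditer(text)
    ]

# Three frames copied from the trace above.
sample = """
[<c000000000103a1c>] wakeup_preempt_entity.isra.6+0x7c/0x90
[<c00000000054cd9c>] get_request+0x29c/0x910
[<c0000000002fe668>] do_unlinkat+0x1a8/0x3a0
"""

frames = parse_frames(sample)
symbols = [symbol for symbol, _offset, _size in frames]
print(" <- ".join(symbols))
```

Read from the bottom up, the full trace goes from the unlink system call (do_unlinkat) through ext4_evict_inode and the writeback path down to get_request in the block layer.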
--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1766543

Title: instance deletion takes a while and blocks nova-compute

Status in linux package in Ubuntu: Incomplete
Status in nova package in Ubuntu: New

Bug description:

Hi,

I have a cloud running xenial/mitaka (with 18.02 charms).

Sometimes, an instance will take minutes to delete. I tracked the time down to file deletion:

  Apr 23 07:23:00 hostname nova-compute[54255]: 2018-04-23 07:23:00.920 54255 INFO nova.virt.libvirt.driver [req-35ccfe64-9280-4de6-ae88-045ca91bf90f bc0ab055427645aca4ed09266e85b1db 1cb457a8302543fea067e5f14b5241e7 - - -] [instance: 97731f51-63be-4056-869f-084b38580b9a] Deleting instance files /srv/nova/instances/97731f51-63be-4056-869f-084b38580b9a_del
  Apr 23 07:27:33 hostname nova-compute[54255]: 2018-04-23 07:27:33.767 54255 INFO nova.virt.libvirt.driver [req-35ccfe64-9280-4de6-ae88-045ca91bf90f bc0ab055427645aca4ed09266e85b1db 1cb457a8302543fea067e5f14b5241e7 - - -] [instance: 97731f51-63be-4056-869f-084b38580b9a] Deletion of /srv/nova/instances/97731f51-63be-4056-869f-084b38580b9a_del complete

As you can see, 4 minutes and 33 seconds elapsed between the 2 lines. nova-compute logs absolutely _nothing_ during this time. Periodic tasks are not run, etc. Generally, a deletion takes a few seconds at most.
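As a quick check of the gap between the two log lines quoted above, the elapsed time can be computed from the timestamps nova itself logs. This is only a sketch; the format string is my assumption about how those timestamps are laid out.

```python
from datetime import datetime

# Timestamp layout assumed from the two nova-compute log lines above.
FMT = "%Y-%m-%d %H:%M:%S.%f"

start = datetime.strptime("2018-04-23 07:23:00.920", FMT)  # "Deleting instance files"
end = datetime.strptime("2018-04-23 07:27:33.767", FMT)    # "Deletion of ... complete"

elapsed = end - start
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} min {seconds:.3f} s")
```

That gives a little under 4 minutes 33 seconds, which is the figure quoted above after rounding.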
The logs above are generally immediately followed by:

  Apr 23 07:27:33 hostname nova-compute[54255]: 2018-04-23 07:27:33.771 54255 DEBUG oslo.messaging._drivers.impl_rabbit [req-35ccfe64-9280-4de6-ae88-045ca91bf90f bc0ab055427645aca4ed09266e85b1db 1cb457a8302543fea067e5f14b5241e7 - - -] Received recoverable error from kombu: on_error /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/impl_rabbit.py:683

(which is error: [Errno 104] Connection reset by peer)

because nova-compute doesn't even maintain the rabbitmq connection (on the rabbitmq server I can see errors about "Missed heartbeats from client, timeout: 60s").

So nova-compute appears to be "frozen" for several minutes. This can cause problems because events can be missed, etc.

We have telegraf on this host, and there's little to no CPU, disk, network or memory activity at that time. Nothing relevant in kern.log either. And this is happening on 3 different architectures, so this is all very puzzling.

Is nova-compute supposed to be totally stuck while deleting instance files? Have you ever seen something similar?

I'm going to try to reproduce this on queens.

Thanks
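To illustrate the kind of freeze described above, here is a small sketch of how one blocking call can starve a heartbeat when everything shares a single cooperative loop. It uses Python 3 asyncio purely as a stand-in; whether nova-compute's own threading model behaves the same way here is an assumption on my part, not something the logs prove. The names heartbeat, slow_delete and beats_during are hypothetical, and time.sleep() stands in for a file deletion that blocks in the kernel.

```python
import asyncio
import time

async def heartbeat(counter, interval=0.01):
    """Tick periodically, like a connection heartbeat."""
    while True:
        counter["beats"] += 1
        await asyncio.sleep(interval)

def slow_delete(duration=0.2):
    """Stand-in for a deletion that blocks in the kernel."""
    time.sleep(duration)

async def beats_during(offload):
    """Count heartbeats that fire while slow_delete() runs."""
    counter = {"beats": 0}
    task = asyncio.ensure_future(heartbeat(counter))
    await asyncio.sleep(0.05)  # let the heartbeat start ticking
    before = counter["beats"]
    if offload:
        # Run the blocking call in a worker thread; the loop stays free.
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, slow_delete)
    else:
        # Run the blocking call on the loop itself; nothing else can run.
        slow_delete()
    after = counter["beats"]
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return after - before

blocked = asyncio.run(beats_during(offload=False))
offloaded = asyncio.run(beats_during(offload=True))
print("beats while blocking on the loop:", blocked)
print("beats while offloaded to a thread:", offloaded)
```

In the first run no heartbeat fires until the blocking call returns, which would match the "Missed heartbeats from client" errors if the same mechanism is at work here. In the second run the heartbeat keeps ticking because the blocking call runs in a separate thread.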