On 02/03/2011 07:08 AM, Trent W. Buck wrote:
> t...@cybersource.com.au (Trent W. Buck) writes:
>
>> I'm being a bit more patient than last time, and I think they ARE
>> proceeding, just REALLY slowly.  Meanwhile aptitude consumes a 100% of a
>> core busy-waiting for a response from dpkg :-/
>>
>> They look like this:
>>
>>     $ ssh omega cat /proc/7713/stack
>>     Warning: Permanently added 'omega,192.168.155.22' (RSA) to the list of known hosts.
>>     [<ffffffff811669b7>] sync_inodes_sb+0x87/0xb0
>>     [<ffffffff8116b292>] __sync_filesystem+0x82/0x90
>>     [<ffffffff8116b379>] sync_filesystems+0xd9/0x130
>>     [<ffffffff8116b431>] sys_sync+0x21/0x40
>>     [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
>>     [<ffffffffffffffff>] 0xffffffffffffffff
>>
>>     $ ssh omega cat /proc/5619/stack
>>     Warning: Permanently added 'omega,192.168.155.22' (RSA) to the list of known hosts.
>>     [<ffffffff81222865>] jbd2_log_wait_commit+0xc5/0x150
>>     [<ffffffff811d7a2c>] ext4_sync_file+0x13c/0x2e0
>>     [<ffffffff8116b051>] vfs_fsync_range+0xa1/0xe0
>>     [<ffffffff8116b0fd>] vfs_fsync+0x1d/0x20
>>     [<ffffffff8116b13e>] do_fsync+0x3e/0x60
>>     [<ffffffff8116b190>] sys_fsync+0x10/0x20
>>     [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
>>     [<ffffffffffffffff>] 0xffffffffffffffff
>
> And here's one that is well and truly wedged:
>
>     root@omega:~# cat /proc/31430/stack
>     [<ffffffff811669b7>] sync_inodes_sb+0x87/0xb0
>     [<ffffffff8116b292>] __sync_filesystem+0x82/0x90
>     [<ffffffff8116b379>] sync_filesystems+0xd9/0x130
>     [<ffffffff8116b431>] sys_sync+0x21/0x40
>     [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
>     [<ffffffffffffffff>] 0xffffffffffffffff
>
> In that case, even kill -SEGV'ing upstart won't stop it.  I got that
> with only a single dpkg run (i.e. no concurrency), after switching the
> container's rootfs from ext4 to ext3, and forcing dpkg[0] to be upgraded
> before anything else.  Sigh...
>
> I'm THIS CLOSE to giving up and wrapping apt-get in libeatmydata.
>
> [0] I did this because I noticed that lucid's dpkg still suffers from
>
>       http://bugs.debian.org/578635
>       http://bugs.debian.org/605009
>       https://launchpad.net/bugs/570805
>
>     But lucid-updates & lucid-security both contain a version that
>     contains CLAIMS to address the first of those.

Ouch!

Assuming your host runs an Ubuntu kernel, I think it is compiled with
DETECT_HUNG_TASK, which makes the kernel print a stack trace when a task
stays stuck in the 'D' state. Do you have any such stack traces in your
logs?
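
In case it helps, here is a rough sketch of how one might look for those
reports. It assumes the kernel's usual "blocked for more than N seconds"
wording and that kernel messages end up in /var/log/kern.log; both are
assumptions and may not match your host. The hung_tasks function name is
made up for illustration.

```shell
# Sketch: print hung-task reports found in a kernel log file.
# hung_tasks is a hypothetical helper, not an existing tool.
# The default path /var/log/kern.log is an assumption; pass another
# file as the first argument if your syslog writes elsewhere.
hung_tasks() {
    log="${1:-/var/log/kern.log}"
    # The hung-task detector normally logs a line of the form
    # "INFO: task NAME:PID blocked for more than N seconds."
    grep -E 'task .* blocked for more than [0-9]+ seconds' "$log"
}
```

Running "hung_tasks" with no argument reads the assumed default log, and
"hung_tasks /path/to/file" reads another one. If the detector is built in,
/proc/sys/kernel/hung_task_timeout_secs should exist and hold the current
timeout.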