(attempting to cc Matthias. If I have the wrong one, please fix it up) (please generally cc reporters when forwarding their bug reports)
On Wed, 01 Aug 2007 18:39:51 -0400 Chuck Ebbert <[EMAIL PROTECTED]> wrote: > Looks like the same problem with spinlock unfairness we've seen > elsewhere: it seems to be looping here? Or is everyone stuck > just waiting for writeout? > > lock_timer_base(): > for (;;) { > tvec_base_t *prelock_base = timer->base; > base = tbase_get_base(prelock_base); > if (likely(base != NULL)) { > spin_lock_irqsave(&base->lock, *flags); > if (likely(prelock_base == timer->base)) > return base; > /* The timer has migrated to another CPU */ > spin_unlock_irqrestore(&base->lock, *flags); > } > cpu_relax(); > } > > The problem goes away completely if filesystem are mounted > *without* noatime. Has happened in 2.6.20 through 2.6.22... > > https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=249563 > > Part of sysrq-t listing: > > mysqld D 000017c0 2196 23162 1562 > e383fcb8 00000082 61650954 000017c0 e383fc9c 00000000 c0407208 > e383f000 > a12b0434 00004d1d c6ed2c00 c6ed2d9c c200fa80 00000000 c0724640 > f6c60540 > c4ff3c70 00000508 00000286 c042ffcb e383fcc8 00014926 00000000 > 00000286 > Call Trace: > [<c0407208>] do_IRQ+0xbd/0xd1 > [<c042ffcb>] lock_timer_base+0x19/0x35 > [<c04300df>] __mod_timer+0x9a/0xa4 > [<c060bb55>] schedule_timeout+0x70/0x8f > [<c042fd37>] process_timeout+0x0/0x5 > [<c060bb50>] schedule_timeout+0x6b/0x8f > [<c060b67c>] io_schedule_timeout+0x39/0x5d > [<c0465eea>] congestion_wait+0x50/0x64 > [<c0438539>] autoremove_wake_function+0x0/0x35 > [<c04620e2>] balance_dirty_pages_ratelimited_nr+0x148/0x193 > [<c045e7fd>] generic_file_buffered_write+0x4c7/0x5d3 I expect the lock_timer_base() this is just stack gunk. Matthias's trace also includes mysqld S 000017c0 2524 1623 1562 f6ce3b44 00000082 60ca34b2 000017c0 f6ce3b28 00000000 f6ce3b54 f6ce3000 57c63d9c 00004d1d f6c90000 f6c9019c c200fa80 00000000 c0724640 f6c60540 000007d0 c07e1f00 00000286 c042ffcb f6ce3b54 000290ef 00000000 00000286 Call Trace: [<c042ffcb>] lock_timer_base+0x19/0x35 [<c04300df>] __mod_timer+0x9a/0xa4 [<c060bb55>] schedule_timeout+0x70/0x8f [<c042fd37>] process_timeout+0x0/0x5 [<c060bb50>] schedule_timeout+0x6b/0x8f [<c04857c7>] do_select+0x36d/0x3c4 [<c0485da0>] __pollwait+0x0/0xac [<c04e934d>] __next_cpu+0x12/0x1e [<c0420130>] find_busiest_group+0x1c4/0x553 [<c0420ada>] update_curr+0x23b/0x25c [<c04eb700>] rb_insert_color+0x8c/0xad [<c042128f>] enqueue_entity+0x276/0x294 and it appears that schedule_timeout() always leaves a copy of lock_timer_base+0x19 on the stack. Enabling CONFIG_FRAME_POINTER might help sort that out. I think. Or perhaps lock_timer_base() really has gone and got stuck. One possibility is that gcc has decided to cache timer->base in a register rather than rereading it around that loop, which would be bad. Do: gdb vmlinux (gdb) x/100i lock_timer_base Is the machine really completely dead? Or are some tasks running? If the latter, it might be dirty-memory windup - perhaps some device driver has died and we're not getting writes out to disk. Are all the CPUs running flat-out? If so, yup, maybe it's lock_timer_base(). Hit sysrq-P ten times, see where things are stuck. Please leave `vmstat 1' running in an ssh seesion next time, let's see the output just prior to the hang. And do this: while true do echo cat /proc/meminfo sleep 1 done in another ssh session so we can see what the memory looked like when it died too. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/