Your message dated Fri, 05 Dec 2014 02:09:18 +0000
with message-id <[email protected]>
and subject line Re: Bug#772050: linux-image-3.16-0.bpo.3-amd64-dbg: vmlinux
points to wrong source code
has caused the Debian Bug report #772050,
regarding linux-image-3.16-0.bpo.3-amd64-dbg: vmlinux points to wrong source
code
to be marked as done.
This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.
(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact [email protected]
immediately.)
--
772050: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=772050
Debian Bug Tracking System
Contact [email protected] with problems
--- Begin Message ---
Package: linux-image-3.16-0.bpo.3-amd64-dbg
Version: 3.16.5-1~bpo70+1
Severity: normal
Dear Maintainer,
we've been analyzing a kernel bug in blk-mq with the same kernel version
where we triggered an Oops by hot-unplugging a qcow2 Qemu/KVM virtio-blk
storage device during active I/O to that device within the virtual machine
running this kernel.
So we've installed linux-image-3.16-0.bpo.3-amd64-dbg (version 3.16.5-1~bpo70+1)
and gdb (version 7.4.1+dfsg-0.1), together with the related source code. But
when trying to list the functions from the call trace, wrong code locations are
displayed.
# apt-get update
# apt-get install gdb ctags vim apt-src linux-image-3.16-0.bpo.3-amd64-dbg
# cd /usr/src
# apt-src update
# apt-src install linux-image-3.16-0.bpo.3-amd64
# dpkg -l | grep linux-image
ii  linux-image-3.16-0.bpo.3-amd64      3.16.5-1~bpo70+1  amd64  Linux 3.16 for 64-bit PCs
ii  linux-image-3.16-0.bpo.3-amd64-dbg  3.16.5-1~bpo70+1  amd64  Debugging symbols for Linux 3.16-0.bpo.3-amd64
ii  linux-image-3.2.0-4-amd64           3.2.63-2+deb7u1   amd64  Linux 3.2 for 64-bit PCs
ii  linux-image-amd64                   3.2+46            amd64  Linux for 64-bit PCs (meta-package)
# apt-src list linux-image-3.16-0.bpo.3-amd64
i linux 3.16.5-1~bpo70 /usr/src/linux-3.16.5
Oops call trace:
[ 81.248004] Call Trace:
[ 81.248004] [<ffffffff81545f7b>] ? mutex_lock+0x1b/0x2a
[ 81.248004] [<ffffffff812a75c4>] ? blk_mq_free_queue+0x24/0x150
[ 81.248004] [<ffffffff8129e7c8>] ? blk_release_queue+0x88/0xd0
[ 81.248004] [<ffffffff812ca160>] ? kobject_cleanup+0x80/0x1d0
[ 81.248004] [<ffffffff812abba2>] ? disk_release+0x92/0xd0
[ 81.248004] [<ffffffff813c4f3b>] ? device_release+0x3b/0xb0
[ 81.248004] [<ffffffff812ca160>] ? kobject_cleanup+0x80/0x1d0
[ 81.248004] [<ffffffff811f2095>] ? __blkdev_put+0x115/0x1a0
[ 81.248004] [<ffffffff811f2285>] ? blkdev_close+0x25/0x30
[ 81.248004] [<ffffffff811bd323>] ? __fput+0xb3/0x210
[ 81.257437] [<ffffffff8108c164>] ? task_work_run+0xc4/0xe0
[ 81.257437] [<ffffffff8106f310>] ? do_exit+0x2c0/0xa80
[ 81.257437] [<ffffffff8106fb56>] ? do_group_exit+0x46/0xb0
[ 81.257437] [<ffffffff8106fbd7>] ? SyS_exit_group+0x17/0x20
[ 81.257437] [<ffffffff8154792d>] ? system_call_fast_compare_end+0x10/0x15
[ 81.257437] Code: 55 53 48 89 fb 48 83 ec 20 65 48 8b 04 25 48 c8 00 00 48 8b 80 38 c0 ff ff a8 08 75 29 48 8b 57 18 b8 01 00 00 00 48 85 d2 74 03 <8b> 42 28 85 c0 74 14 4c 8d 6b 20 4c 89 ef e8 0e eb b$
[ 81.258715] RIP [<ffffffff81545ddf>] __mutex_lock_slowpath+0x3f/0x1c0
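For reference, trace entries of this shape can be split into address, symbol, offset and function size mechanically. The following is a minimal Python sketch, not part of the original report; the names TRACE_RE and parse_trace are made up for illustration, and it assumes the usual x86 convention that a leading "?" marks an address the kernel found on the stack but could not verify as a real caller.

```python
import re

# Matches entries such as "[<ffffffff812a75c4>] ? blk_mq_free_queue+0x24/0x150".
# TRACE_RE and parse_trace are hypothetical helper names for this sketch.
TRACE_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{16})>\]\s+(?P<guess>\?\s+)?"
    r"(?P<sym>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)

def parse_trace(text):
    """Return one dict per trace entry found in an Oops excerpt."""
    entries = []
    for m in TRACE_RE.finditer(text):
        entries.append({
            "addr": int(m.group("addr"), 16),
            "symbol": m.group("sym"),
            "offset": int(m.group("off"), 16),
            "size": int(m.group("size"), 16),
            # Assumption: "?" means the entry was not verified by the unwinder.
            "reliable": m.group("guess") is None,
        })
    return entries

SAMPLE = """\
[ 81.248004] [<ffffffff81545f7b>] ? mutex_lock+0x1b/0x2a
[ 81.248004] [<ffffffff812a75c4>] ? blk_mq_free_queue+0x24/0x150
[ 81.258715] RIP [<ffffffff81545ddf>] __mutex_lock_slowpath+0x3f/0x1c0
"""

for e in parse_trace(SAMPLE):
    flag = "" if e["reliable"] else " (unverified)"
    print(f"{e['symbol']}+{e['offset']:#x} at {e['addr']:#x}{flag}")
```

Subtracting the offset from the address gives the start of the function, which is handy for cross-checking against a disassembly.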
Let's run gdb:
# gdb /usr/lib/debug/vmlinux-3.16-0.bpo.3-amd64
(gdb) list *blk_mq_free_queue+0x24
96 /build/linux-LrLd2z/linux-3.16.5/include/linux/list.h: No such file or directory.
(gdb) quit
# mkdir -p /build/linux-LrLd2z
# ln -sT /usr/src/linux-3.16.5/ /build/linux-LrLd2z/linux-3.16.5
# gdb /usr/lib/debug/vmlinux-3.16-0.bpo.3-amd64
(gdb) list *blk_mq_free_queue+0x24
0xffffffff812a75c4 is in blk_mq_free_queue (/build/linux-LrLd2z/linux-3.16.5/include/linux/list.h:101).
96 * in an undefined state.
97 */
98 #ifndef CONFIG_DEBUG_LIST
99 static inline void __list_del_entry(struct list_head *entry)
100 {
101 __list_del(entry->prev, entry->next);
102 }
103
104 static inline void list_del(struct list_head *entry)
105 {
That can't be right! There is no mutex_lock() here!
(gdb) list *blk_release_queue+0x88
0xffffffff8129e7c8 is in blk_release_queue (/build/linux-LrLd2z/linux-3.16.5/block/blk-sysfs.c:523).
518 __blk_queue_free_tags(q);
519
520 if (q->mq_ops)
521 blk_mq_free_queue(q);
522
523 kfree(q->flush_rq);
524
525 blk_trace_shutdown(q);
526
527 bdi_destroy(&q->backing_dev_info);
This points to kfree() - also wrong!
Let's check the disassembly!
# objdump -D /usr/lib/debug/vmlinux-3.16-0.bpo.3-amd64 | less
(less) /<blk_mq_free_queue>:
ffffffff812a75a0 <blk_mq_free_queue>:
ffffffff812a75a0: e8 1b 28 2a 00          callq ffffffff81549dc0 <__fentry__>
ffffffff812a75a5: 41 54                   push %r12
ffffffff812a75a7: 55                      push %rbp
ffffffff812a75a8: 53                      push %rbx
ffffffff812a75a9: 48 8b af a8 07 00 00    mov 0x7a8(%rdi),%rbp
ffffffff812a75b0: 48 89 fb                mov %rdi,%rbx
ffffffff812a75b3: e8 78 f1 ff ff          callq ffffffff812a6730 <blk_mq_freeze_queue>
ffffffff812a75b8: 4c 8d 65 38             lea 0x38(%rbp),%r12
ffffffff812a75bc: 4c 89 e7                mov %r12,%rdi
ffffffff812a75bf: e8 9c e9 29 00          callq ffffffff81545f60 <mutex_lock>
ffffffff812a75c4: 48 8b 8b b0 07 00 00    mov 0x7b0(%rbx),%rcx
0xa0 + 0x24 = 0xc4
There is definitely a call to mutex_lock() at blk_mq_free_queue+0x24!
There is no call to any list code! So only the source code information in
vmlinux is wrong!
It's the same with the other code locations.
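The offset arithmetic can be double-checked from the bytes in the objdump listing alone. The following is a minimal Python sketch, not part of the original report; decode_call_rel32 is a made-up helper name, and the sketch assumes the instruction is the 5-byte x86-64 "call rel32" form (opcode 0xe8), as the listed bytes suggest. It indicates that the call instruction itself begins at +0x1f and that +0x24 is the address execution returns to once mutex_lock() finishes.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF

def decode_call_rel32(insn_addr, insn_bytes):
    """Return (return_address, call_target) for an x86-64 `call rel32` (opcode 0xe8)."""
    if len(insn_bytes) != 5 or insn_bytes[0] != 0xE8:
        raise ValueError("not a 5-byte call rel32 instruction")
    # The displacement is a signed little-endian 32-bit value relative to
    # the address of the next instruction.
    (rel32,) = struct.unpack("<i", insn_bytes[1:])
    return_addr = (insn_addr + 5) & MASK64
    target = (return_addr + rel32) & MASK64
    return return_addr, target

# Values copied from the disassembly of blk_mq_free_queue.
FUNC_START = 0xFFFFFFFF812A75A0
CALL_ADDR = 0xFFFFFFFF812A75BF
CALL_BYTES = bytes.fromhex("e89ce92900")

ret, target = decode_call_rel32(CALL_ADDR, CALL_BYTES)
print(f"call starts at  +{CALL_ADDR - FUNC_START:#x}")
print(f"return address  +{ret - FUNC_START:#x} = {ret:#x}")
print(f"call target     {target:#x}")
```

The decoded target matches the address objdump prints for mutex_lock, and the return address matches the value recorded in the call trace.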
Please fix that in your package build!
Cheers,
Sebastian Parschauer
Senior Linux Kernel Developer - Storage
Linux Kernel Maintainer at ProfitBricks
-- System Information:
Debian Release: 7.7
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)
Kernel: Linux 3.16-0.bpo.3-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
-- no debconf information
--- End Message ---
--- Begin Message ---
Control: notfound -1 3.16.5-1~bpo70+1
On Thu, 2014-12-04 at 12:42 -0500, Sebastian Parschauer wrote:
> Package: linux-image-3.16-0.bpo.3-amd64-dbg
> Version: 3.16.5-1~bpo70+1
> Severity: normal
>
> Dear Maintainer,
>
> we've been analyzing a kernel bug in blk-mq with the same kernel version
> where we triggered an Oops by hot-unplugging a qcow2 Qemu/KVM virtio-blk
> storage device during active I/O to that device within the virtual machine
> running this kernel.
> So we've installed linux-image-3.16-0.bpo.3-amd64-dbg (version
> 3.16.5-1~bpo70+1),
> gdb (version 7.4.1+dfsg-0.1) and installed the related source code. But when
> trying to list the functions from the call trace, wrong code locations are
> displayed.
[...]
The kernel is compiled without frame pointers, so the kernel cannot
follow the call stack. Instead, it lists all addresses in the stack
that point to kernel code. Functions don't generally initialise their
entire stack frame on entry, so you may see return addresses left over
from earlier calls, such as for the call to __list_del() from
__list_del_entry() inlined in blk_mq_free_queue().
Uwe's point about return addresses is also worth bearing in mind.
Finally, make sure to use an exactly matching version of src:linux or
linux-source-3.16, not the upstream source.
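The point about return addresses can be sketched as follows. A trace entry names the instruction after the call, so a common workaround when resolving it to a source line is to look up an address one byte earlier, which still falls inside the call instruction. This is a minimal Python sketch under that assumption; gdb_list_commands is a made-up helper name, and the generated commands reuse the "list *symbol+offset" form shown earlier in this report.

```python
def gdb_list_commands(symbol, offset):
    """Build gdb `list` commands for a return address and for the call before it."""
    if offset <= 0:
        raise ValueError("offset must be positive to step back into the caller")
    return {
        # The address exactly as printed in the call trace.
        "return_address": f"list *{symbol}+{offset:#x}",
        # One byte earlier: still inside the preceding call instruction.
        "call_site": f"list *{symbol}+{offset - 1:#x}",
    }

for sym, off in [("blk_mq_free_queue", 0x24), ("blk_release_queue", 0x88)]:
    cmds = gdb_list_commands(sym, off)
    print(cmds["return_address"])
    print(cmds["call_site"])
```

Whether the adjusted lookup lands on the expected source line still depends on using debug symbols and sources that match the running kernel exactly.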
I don't believe you've found a bug in this package.
Ben.
--
Ben Hutchings
Beware of programmers who carry screwdrivers. - Leonard Brandwein
--- End Message ---