On 2015-09-23 09:03, Qu Wenruo wrote:
Stéphane Lesimple wrote on 2015/09/22 16:31 +0200:
On 2015-09-22 10:51, Qu Wenruo wrote:
[92098.842261] Call Trace:
[92098.842277] [<ffffffffc035a5d8>] ? read_extent_buffer+0xb8/0x110 [btrfs]
[92098.842304] [<ffffffffc0396d00>] ? btrfs_find_all_roots+0x60/0x70 [btrfs]
[92098.842329] [<ffffffffc039af3d>] btrfs_qgroup_rescan_worker+0x28d/0x5a0 [btrfs]
Would you please show the code of it?
This one seems to be another stupid bug I made when rewriting the
framework.
Maybe I forgot to reinit some variables or I'm screwing up memory...
(gdb) list *(btrfs_qgroup_rescan_worker+0x28d)
0x97f6d is in btrfs_qgroup_rescan_worker (fs/btrfs/ctree.h:2760).
2755
2756 static inline void btrfs_disk_key_to_cpu(struct btrfs_key *cpu,
2757                                          struct btrfs_disk_key *disk)
2758 {
2759         cpu->offset = le64_to_cpu(disk->offset);
2760         cpu->type = disk->type;
2761         cpu->objectid = le64_to_cpu(disk->objectid);
2762 }
2763
2764 static inline void btrfs_cpu_key_to_disk(struct btrfs_disk_key *disk,
(gdb)
Does it make sense?
So it seems that the memory of the cpu key is being screwed up...
That code is just a thin inline function, so what about the rest of
the stack?
Like btrfs_qgroup_rescan_helper+0x12?
Thanks,
Qu
Oh, I forgot that you can just change the offset in
btrfs_qgroup_rescan_worker+0x28d to a smaller value.
Try +0x280 for example, which steps back 13 bytes of asm code,
which may jump out of the inline function range, and may give you a
good hint.
Or gdb may have a better mode for inline function, but I don't
know...
Actually, "list -" is our friend here (it shows the 10 lines before the
last source output)
No, that's not the case.
"list -" will only show lines around the source code.
What I need is to get the higher caller stack.
If debugging a running program, it's quite easy to just use the frame
command.
But in this situation, we don't have a call stack, so I'd like to move
the +0x28d offset back by several bytes, until we jump out of the inline
function call and see the meaningful code.
Ah, you're right.
I had a hard time finding a value where I wouldn't end up in another
inline function or entirely somewhere else in the kernel code, but
here it is:
(gdb) list *(btrfs_qgroup_rescan_worker+0x26e)
0x97f4e is in btrfs_qgroup_rescan_worker (fs/btrfs/qgroup.c:2237).
2232         memcpy(scratch_leaf, path->nodes[0], sizeof(*scratch_leaf));
2233         slot = path->slots[0];
2234         btrfs_release_path(path);
2235         mutex_unlock(&fs_info->qgroup_rescan_lock);
2236
2237         for (; slot < btrfs_header_nritems(scratch_leaf); ++slot) {
2238                 btrfs_item_key_to_cpu(scratch_leaf, &found, slot); <== here
2239                 if (found.type != BTRFS_EXTENT_ITEM_KEY &&
2240                     found.type != BTRFS_METADATA_ITEM_KEY)
2241                         continue;
The btrfs_item_key_to_cpu() inline func calls 2 other inline funcs:

static inline void btrfs_item_key_to_cpu(struct extent_buffer *eb,
                                         struct btrfs_key *key, int nr)
{
        struct btrfs_disk_key disk_key;
        btrfs_item_key(eb, &disk_key, nr);
        btrfs_disk_key_to_cpu(key, &disk_key); <== this is 0x28d
}

btrfs_disk_key_to_cpu() is the inline referenced by 0x28d and this is
where the GPF happens.
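
For anyone following along, here is a rough userspace sketch of that
inline chain (copy the on-disk key out of the leaf, then convert it to
CPU byte order). It is only an illustration under stated assumptions:
struct leaf_sketch, le64_decode() and every *_sketch name are
hypothetical stand-ins, not the kernel's definitions, and the real
btrfs_item_key() reads from an extent_buffer rather than a flat array.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* On-disk key layout: little-endian objectid, type byte, little-endian
 * offset, packed into 17 bytes. Hypothetical stand-in for btrfs_disk_key. */
struct disk_key_sketch {
	uint8_t objectid[8];
	uint8_t type;
	uint8_t offset[8];
};
static_assert(sizeof(struct disk_key_sketch) == 17, "disk key must be packed");

/* CPU-order key. Hypothetical stand-in for btrfs_key. */
struct cpu_key_sketch {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/* Assumption: the leaf is modelled as a flat array of on-disk keys. */
struct leaf_sketch {
	uint32_t nritems;
	struct disk_key_sketch keys[4];
};

/* Portable little-endian decode, standing in for le64_to_cpu(). */
static uint64_t le64_decode(const uint8_t b[8])
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | b[i];
	return v;
}

/* Same three assignments as the listing at ctree.h:2759-2761. */
static void disk_key_to_cpu_sketch(struct cpu_key_sketch *cpu,
				   const struct disk_key_sketch *disk)
{
	cpu->offset = le64_decode(disk->offset);
	cpu->type = disk->type;
	cpu->objectid = le64_decode(disk->objectid);
}

/* Copy the key of item nr out of the leaf, then convert it.
 * Returns 0 on success, -1 if nr is out of range (a check added for
 * this sketch only). */
static int item_key_to_cpu_sketch(const struct leaf_sketch *leaf,
				  struct cpu_key_sketch *key, uint32_t nr)
{
	struct disk_key_sketch disk_key;

	if (nr >= leaf->nritems)
		return -1;
	memcpy(&disk_key, &leaf->keys[nr], sizeof(disk_key));
	disk_key_to_cpu_sketch(key, &disk_key);
	return 0;
}
```

The point of the sketch is only that the conversion itself touches two
stack objects (the temporary disk key and the caller's cpu key), so
everything it reads ultimately comes from the leaf copy made by the
caller.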
BTW, did you try the following patch?
https://patchwork.kernel.org/patch/7114321/
btrfs: qgroup: exit the rescan worker during umount
That problem seems somewhat related to the bug you encountered, so I'd
recommend giving it a try.
Not yet, but I've come across this bug too during my tests: starting a
rescan and umounting gets you a crash. I didn't mention it because I
was sure this was an already known bug. Nice to see it has been fixed
though!
I'll certainly give it a try but I'm not really sure it'll fix the
specific bug we're talking about.
However, the group of patches posted by Mark should fix the qgroup count
discrepancies as I understand it, right? It might be of interest to try
them all at once for sure.
Thanks,
--
Stéphane.