Re: [BUG REPORT] Kernel panic on 3.9.0-rc7-4-gbb33db7

2013-04-19 Thread Jan Schmidt
On Fri, April 19, 2013 at 07:57 (+0200), Tejun Heo wrote:
> (cc'ing btrfs people)
> 
> On Fri, Apr 19, 2013 at 11:33:20AM +0800, Wanlong Gao wrote:
>> RIP: 0010:[]  [] ftrace_raw_event_block_bio_complete+0x73/0xf0
> ...
>>  [] bio_endio+0x80/0x90
>>  [] btrfs_end_bio+0xf6/0x190 [btrfs]
>>  [] bio_endio+0x3d/0x90
>>  [] req_bio_endio+0xa3/0xe0
> 
> Ugh
> 
> In fs/btrfs/volumes.c
> 
>   static void bbio_error(struct btrfs_bio *bbio, struct bio *bio, u64 logical)
>   {
>   ...
>   bio->bi_bdev = (struct block_device *)
>  (unsigned long)bbio->mirror_num;
>   ...
>   }
> 
>   static void btrfs_end_bio(struct bio *bio, int err)
>   {
>   ...
>   bio->bi_bdev = (struct block_device *)
>   (unsigned long)bbio->mirror_num;
>   
>   ...
>   }
> 
> In fs/btrfs/extent_io.c
> 
>   static void end_bio_extent_readpage(struct bio *bio, int err)
>   {
>   int mirror;
>   ...
>   mirror = (int)(unsigned long)bio->bi_bdev;
>   ...
>   }
> 
> Ewweehh
> 
> No wonder this thing crashes.  Chris, can't the original bio carry
> bbio in bi_private and let end_bio_extent_readpage() free the bbio
> instead of abusing bi_bdev like this?

Oops.

That was my patch back in 2011 (commit 2774b2ca3). It was sent as an RFC patch
and just slipped in without further discussion of that exact change. Hackish,
yes. My reasoning was that, because the block layer changes bio->bi_bdev anyway,
no one would want to look at it after the bio returned (and in fact it didn't
hurt for about two years). I admit, though, that when the block layer changes
bi_bdev, it stays a valid bdev pointer.

One way around this would be what you suggest, however that would mean the
caller of (btrfs|btree)_submit_bio_hook gets its completion called in the end,
but must know that the private is in fact a bbio which in turn carries the
caller's private. Doesn't sound clean to me, either.

The best idea I currently have is to add a dispatcher function that does the
freeing of bbio and calls the actual completion with mirror_num as a separate
parameter. That would make all the btrfs completions incompatible with
bio_end_io_t, but it shouldn't hurt.
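
A rough sketch of that dispatcher idea, written as a userspace mock rather than
kernel code. The struct layouts are simplified stand-ins, and the names
btrfs_end_io_t, btrfs_end_bio_dispatch(), submit_and_complete() and
example_end_io() are hypothetical, chosen only for illustration:

```c
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel structures (assumption). */
struct bio;

/* Hypothetical completion type: like bio_end_io_t plus a mirror_num argument. */
typedef void (*btrfs_end_io_t)(struct bio *bio, int err, int mirror_num);

struct bio {
	void *bi_private;
	void (*bi_end_io)(struct bio *bio, int err);
};

struct btrfs_bio {
	int mirror_num;
	void *private;         /* the caller's original bi_private */
	btrfs_end_io_t end_io; /* the caller's real completion */
};

/*
 * Dispatcher: the only function that knows bi_private is a bbio. It restores
 * the caller's private pointer, frees the bbio and hands mirror_num to the
 * real completion as a plain parameter, so bi_bdev is never touched.
 */
static void btrfs_end_bio_dispatch(struct bio *bio, int err)
{
	struct btrfs_bio *bbio = bio->bi_private;
	btrfs_end_io_t end_io = bbio->end_io;
	int mirror_num = bbio->mirror_num;

	bio->bi_private = bbio->private;
	free(bbio);
	end_io(bio, err, mirror_num);
}

/* Example completion that only records which mirror served the bio. */
static int last_mirror = -1;

static void example_end_io(struct bio *bio, int err, int mirror_num)
{
	(void)bio;
	(void)err;
	last_mirror = mirror_num;
}

/* Wire up the dispatcher and complete the bio immediately (no real I/O). */
static int submit_and_complete(struct bio *bio, btrfs_end_io_t end_io,
			       int mirror_num)
{
	struct btrfs_bio *bbio = malloc(sizeof(*bbio));

	if (!bbio)
		return -1;
	bbio->mirror_num = mirror_num;
	bbio->private = bio->bi_private;
	bbio->end_io = end_io;
	bio->bi_private = bbio;
	bio->bi_end_io = btrfs_end_bio_dispatch;
	bio->bi_end_io(bio, 0);
	return 0;
}
```

In this sketch the caller's completion sees its own private pointer again and
never learns that a bbio existed, which is the point of the extra indirection.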

At least now I know I wasn't invited to LSF for a good reason :-)

-Jan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: GPF in read_extent_buffer while scrubbing on 3.7.0-rc8-00014-g27d7c2a

2013-01-03 Thread Jan Schmidt
Hi Mathieu,

Sorry for the late reply. I had quite a good reproducer once for what I suspect
may be your problem here - but it suddenly stopped reproducing the problem and I
still haven't figured out why. (see https://patchwork.kernel.org/patch/1773611/
if you're interested)

Can you please give the following patch a try and report back if it helps you
(apply on top of cmason/for-linus, 57ba86c)? If it doesn't, you've got three
choices:

a) forget about the problem
b) send me a dump of your whole file system (space can be provided)
c) receive debug patches, apply them and send me the output

Thanks,
-Jan

-->8-
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 65f0367..d51185e 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3324,8 +3324,6 @@ int close_ctree(struct btrfs_root *root)
 
 	btrfs_dev_replace_suspend_for_unmount(fs_info);
 
-	btrfs_scrub_cancel(fs_info);
-
 	/* wait for any defraggers to finish */
 	wait_event(fs_info->transaction_wait,
 		   (atomic_read(&fs_info->defrag_running) == 0));
@@ -3392,6 +3390,7 @@ int close_ctree(struct btrfs_root *root)
 	btrfs_stop_workers(&fs_info->caching_workers);
 	btrfs_stop_workers(&fs_info->readahead_workers);
 	btrfs_stop_workers(&fs_info->flush_workers);
+	btrfs_scrub_cancel(fs_info);
 
 #ifdef CONFIG_BTRFS_FS_CHECK_INTEGRITY
 	if (btrfs_test_opt(root, CHECK_INTEGRITY))


Re: /proc/kmsg giving eof on blocking read

2012-11-22 Thread Jan Schmidt
On Thu, November 22, 2012 at 13:29 (+0100), Kay Sievers wrote:
> On Thu, Nov 22, 2012 at 11:44 AM, Jan Schmidt  wrote:
>> I'm currently debugging something in btrfs in good old printk style, generating
>> around 10MB/min. I'm seeing /proc/kmsg returning eof on a blocking read (and,
>> side note, syslog-ng won't reopen it, effectively stopping logging kernel
>> messages silently).
> 
> Are you sure there is not something else that opens the same file?

Those errors never happened on that machine before, and there should have been
no userland changes to it for quite a long time.

I'm tempted to say there is no other kmsg reader, but just to be entirely sure,
how would I trace this? From a quick look at ftrace, I don't see that it would
output file names when tracing sys_open, unfortunately.
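
Short of tracing, one thing that could be checked is which processes currently
hold the file open, by walking /proc/<pid>/fd and comparing link targets. A
minimal sketch, assuming a Linux /proc and sufficient privileges to read other
processes' fd directories; count_fds_pointing_to() and list_openers() are
hypothetical helper names. It only shows openers that exist at the moment of
the scan, so a short-lived reader would be missed:

```c
#define _POSIX_C_SOURCE 200809L

#include <ctype.h>
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Count the open file descriptors of process `pid` ("1234" or "self") whose
 * /proc/<pid>/fd link resolves to exactly `target`.
 * Returns -1 if the fd directory cannot be opened.
 */
static int count_fds_pointing_to(const char *pid, const char *target)
{
	char dirpath[PATH_MAX];
	char linkpath[PATH_MAX + 300];
	char dest[PATH_MAX];
	struct dirent *de;
	int count = 0;
	DIR *dir;

	snprintf(dirpath, sizeof(dirpath), "/proc/%s/fd", pid);
	dir = opendir(dirpath);
	if (!dir)
		return -1;

	while ((de = readdir(dir)) != NULL) {
		ssize_t n;

		if (de->d_name[0] == '.')
			continue; /* skip "." and ".." */
		snprintf(linkpath, sizeof(linkpath), "%s/%s", dirpath,
			 de->d_name);
		n = readlink(linkpath, dest, sizeof(dest) - 1);
		if (n < 0)
			continue; /* fd vanished or is not readable */
		dest[n] = '\0';
		if (strcmp(dest, target) == 0)
			count++;
	}
	closedir(dir);
	return count;
}

/* Print every pid that has `target` open; returns how many were found. */
static int list_openers(const char *target)
{
	struct dirent *de;
	int found = 0;
	DIR *proc = opendir("/proc");

	if (!proc)
		return -1;
	while ((de = readdir(proc)) != NULL) {
		if (!isdigit((unsigned char)de->d_name[0]))
			continue; /* only numeric entries are processes */
		if (count_fds_pointing_to(de->d_name, target) > 0) {
			printf("pid %s has %s open\n", de->d_name, target);
			found++;
		}
	}
	closedir(proc);
	return found;
}
```

Calling list_openers("/proc/kmsg") as root would then list the pids to look at.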

Thanks!
-Jan


/proc/kmsg giving eof on blocking read

2012-11-22 Thread Jan Schmidt
Hi,

I'm currently debugging something in btrfs in good old printk style, generating
around 10MB/min. I'm seeing /proc/kmsg returning eof on a blocking read (and,
side note, syslog-ng won't reopen it, effectively stopping logging kernel
messages silently).

I'm using kernel 3.6.0+ from cmason's tree, which is Linux 3.6.0 (commit
a0d271cbfed1dd50278c6b06bead3d00ba0a88f9) plus the Btrfs code for 3.7 (commit
c37b2b6269ee4637fb7cdb5da0d1e47215d57ce2).

I suspect it has something to do with the data I'm passing to printk. It happens
anywhere from several times per second to once every twenty minutes.

As a workaround (and proof), I'm currently using:

# perl -we 'use Fcntl; sysopen(K, "/proc/kmsg", O_RDONLY) or die "open $!";
while (1) {while ($ret = sysread(K, $buf, 8192)) {print $buf} print STDERR
scalar(localtime(time)), " oops $ret\n";}' >> /var/tmp/kern

-Jan