Re: Boot failure on Arndale with next-20131105

2013-11-08 Thread Stephen Rothwell
Hi Jens,

On Tue, 05 Nov 2013 14:25:00 -0700 Jens Axboe  wrote:
>
> On 11/05/2013 10:38 AM, Stephen Warren wrote:
> > I note that compiling next-20131105 generates quite a few warnings re:
> > uninitialized variables. Reverting the commit doesn't solve those.
> > 
> >> block/blk-merge.c: In function ‘blk_bio_map_sg’:
> >> block/blk-merge.c:133:8: warning: ‘bvprv.bv_len’ may be used uninitialized 
> >> in this function [-Wmaybe-uninitialized]
> >> block/blk-merge.c:233:23: note: ‘bvprv.bv_len’ was declared here
> >> block/blk-merge.c:133:8: warning: ‘bvprv.bv_offset’ may be used 
> >> uninitialized in this function [-Wmaybe-uninitialized]
> >> block/blk-merge.c:233:23: note: ‘bvprv.bv_offset’ was declared here
> >> block/blk-merge.c:133:8: warning: ‘bvprv.bv_page’ may be used 
> >> uninitialized in this function [-Wmaybe-uninitialized]
> >> block/blk-merge.c:233:23: note: ‘bvprv.bv_page’ was declared here
> >> block/blk-merge.c: In function ‘blk_rq_map_sg’:
> >> block/blk-merge.c:133:8: warning: ‘bvprv.bv_page’ may be used 
> >> uninitialized in this function [-Wmaybe-uninitialized]
> >> block/blk-merge.c:171:23: note: ‘bvprv.bv_page’ was declared here
> >> block/blk-merge.c:133:8: warning: ‘bvprv.bv_offset’ may be used 
> >> uninitialized in this function [-Wmaybe-uninitialized]
> >> block/blk-merge.c:171:23: note: ‘bvprv.bv_offset’ was declared here
> >> block/blk-merge.c:133:8: warning: ‘bvprv.bv_len’ may be used uninitialized 
> >> in this function [-Wmaybe-uninitialized]
> >> block/blk-merge.c:171:23: note: ‘bvprv.bv_len’ was declared here
> >> block/blk-merge.c: In function ‘attempt_merge’:
> >> block/blk-merge.c:108:7: warning: ‘end_bv.bv_offset’ may be used 
> >> uninitialized in this function [-Wmaybe-uninitialized]
> >> block/blk-merge.c:89:17: note: ‘end_bv.bv_offset’ was declared here
> >> block/blk-merge.c:108:7: warning: ‘end_bv.bv_page’ may be used 
> >> uninitialized in this function [-Wmaybe-uninitialized]
> >> block/blk-merge.c:89:17: note: ‘end_bv.bv_page’ was declared here
> >> block/blk-merge.c:108:7: warning: ‘end_bv.bv_len’ may be used 
> >> uninitialized in this function [-Wmaybe-uninitialized]
> >> block/blk-merge.c:89:17: note: ‘end_bv.bv_len’ was declared here
> 
> Looks like an incomplete merge. The patch to silence those warnings
> (which aren't bugs, BTW) is definitely in my for-next branch.

I am still getting those warnings in linux-next for various builds
(including i386 defconfig).  Any hints would be good.

-- 
Cheers,
Stephen Rothwell s...@canb.auug.org.au




Re: Boot failure on Arndale with next-20131105

2013-11-05 Thread Tushar Behera
On 6 November 2013 02:26, Chris Mason  wrote:
> Quoting Olof Johansson (2013-11-05 15:38:33)
>> On Tue, Nov 5, 2013 at 12:33 PM, Chris Mason  
>> wrote:
>> > Quoting Olof Johansson (2013-11-05 15:23:51)
>> >> On Tue, Nov 5, 2013 at 11:33 AM, Jens Axboe  wrote:

> This patch is only compile tested, but I think it'll fix it.
>
> diff --git a/fs/bio.c b/fs/bio.c
> index be93de1..3595456 100644
> --- a/fs/bio.c
> +++ b/fs/bio.c
> @@ -612,6 +612,7 @@ int bio_clone_biovec(struct bio *bio, gfp_t gfp_mask)
>  	unsigned nr_iovecs = 0;
>  	struct bio_vec bv, *bvl = NULL;
>  	struct bvec_iter iter;
> +	int i;
>
>  	BUG_ON(!bio->bi_pool);
>  	BUG_ON(BIO_POOL_IDX(bio) != BIO_POOL_NONE);
> @@ -628,8 +629,9 @@ int bio_clone_biovec(struct bio *bio, gfp_t gfp_mask)
>  		bvl = bio->bi_inline_vecs;
>  	}
>
> +	i = 0;
>  	bio_for_each_segment(bv, bio, iter)
> -		bvl[bio->bi_vcnt++] = bv;
> +		bvl[i++] = bv;
>
>  	bio->bi_io_vec = bvl;
>  	bio->bi_iter.bi_idx = 0;


Tested-by: Tushar Behera 
(Fixes boot failure on Exynos5250-based Arndale board.)

-- 
Tushar Behera
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Boot failure on Arndale with next-20131105

2013-11-05 Thread Chris Mason
Quoting Olof Johansson (2013-11-05 17:41:42)
> On Tue, Nov 5, 2013 at 2:06 PM, Stephen Warren  wrote:
> > On 11/05/2013 01:56 PM, Chris Mason wrote:
> >> Quoting Olof Johansson (2013-11-05 15:38:33)
> >>> On Tue, Nov 5, 2013 at 12:33 PM, Chris Mason  
> >>> wrote:
> >>>> Quoting Olof Johansson (2013-11-05 15:23:51)
> >>>>> On Tue, Nov 5, 2013 at 11:33 AM, Jens Axboe wrote:
> >>>>
> >>>> [ horrible crashes fixed by removing my patch ]
> >>>>
> > ...
> >> Ok, I think I see it.  My guess is that you're hitting bounce buffers.
> >> __blk_queue_bounce is the only caller of the bio splitting code I can
> >> see that you might be hitting.
> >>
> >> My first patch exposed a lurking bug in bio_clone_biovec.  Basically
> >> bio->bi_vcnt is being doubled instead of initialized.
> >>
> >> This patch is only compile tested, but I think it'll fix it.
> >
> > Tested-by: Stephen Warren 
> > (this fixes the issue on Tegra30/Beaver at least)
> 
> Tested-by: Olof Johansson 
> 
> This resolves boot failures on:
> * Tegra30/beaver
> * OMAP4/panda
> * i.MX6/wandboard
> 
> Thanks Chris!

Perfect, thanks for bisecting and trying the patches.  Kent, if things
get rebased, could you please fold this patch and my bi_vcnt patch in?

-chris



Re: Boot failure on Arndale with next-20131105

2013-11-05 Thread Olof Johansson
On Tue, Nov 5, 2013 at 2:06 PM, Stephen Warren  wrote:
> On 11/05/2013 01:56 PM, Chris Mason wrote:
>> Quoting Olof Johansson (2013-11-05 15:38:33)
>>> On Tue, Nov 5, 2013 at 12:33 PM, Chris Mason  
>>> wrote:
>>>> Quoting Olof Johansson (2013-11-05 15:23:51)
>>>>> On Tue, Nov 5, 2013 at 11:33 AM, Jens Axboe wrote:
>>>>
>>>> [ horrible crashes fixed by removing my patch ]
>>>>
> ...
>> Ok, I think I see it.  My guess is that you're hitting bounce buffers.
>> __blk_queue_bounce is the only caller of the bio splitting code I can
>> see that you might be hitting.
>>
>> My first patch exposed a lurking bug in bio_clone_biovec.  Basically
>> bio->bi_vcnt is being doubled instead of initialized.
>>
>> This patch is only compile tested, but I think it'll fix it.
>
> Tested-by: Stephen Warren 
> (this fixes the issue on Tegra30/Beaver at least)

Tested-by: Olof Johansson 

This resolves boot failures on:
* Tegra30/beaver
* OMAP4/panda
* i.MX6/wandboard

Thanks Chris!

-Olof


Re: Boot failure on Arndale with next-20131105

2013-11-05 Thread Stephen Warren
On 11/05/2013 01:56 PM, Chris Mason wrote:
> Quoting Olof Johansson (2013-11-05 15:38:33)
>> On Tue, Nov 5, 2013 at 12:33 PM, Chris Mason  
>> wrote:
>>> Quoting Olof Johansson (2013-11-05 15:23:51)
>>>> On Tue, Nov 5, 2013 at 11:33 AM, Jens Axboe wrote:
>>>
>>> [ horrible crashes fixed by removing my patch ]
>>>
...
> Ok, I think I see it.  My guess is that you're hitting bounce buffers.
> __blk_queue_bounce is the only caller of the bio splitting code I can
> see that you might be hitting.
> 
> My first patch exposed a lurking bug in bio_clone_biovec.  Basically
> bio->bi_vcnt is being doubled instead of initialized.
> 
> This patch is only compile tested, but I think it'll fix it.

Tested-by: Stephen Warren 
(this fixes the issue on Tegra30/Beaver at least)


Re: Boot failure on Arndale with next-20131105

2013-11-05 Thread Olof Johansson
On Tue, Nov 5, 2013 at 12:56 PM, Chris Mason  wrote:
> Quoting Olof Johansson (2013-11-05 15:38:33)
>> On Tue, Nov 5, 2013 at 12:33 PM, Chris Mason  
>> wrote:
>> > Quoting Olof Johansson (2013-11-05 15:23:51)
>> >> On Tue, Nov 5, 2013 at 11:33 AM, Jens Axboe  wrote:
>> >
>> > [ horrible crashes fixed by removing my patch ]
>> >
>> >> > Very weird! What file system is being used?
>> >>
>> >> Most of my failures have happened on regular MMC cards with ext4
>> >> filesystems on them.
>> >>
>> >> Note that the panic happens during device probe / partition table
>> >> scanning, not after mounting the filesystem.
>> >>
>> >> Giving your patch a go now across the board. I'm very concerned about
>> >> the reports of bisectability, build failures and heaps of warnings
>> >> though. Did the 0-day builder pick up any of those? :-/
>> >>
>> >
>> > Hmmm, is bcache in your config?
>>
>> Doesn't look that way -- no ARM defconfigs enable it (it's what I
>> build and boot), and the option defaults to off and nothing selects
>> it.
>
> Ok, I think I see it.  My guess is that you're hitting bounce buffers.
> __blk_queue_bounce is the only caller of the bio splitting code I can
> see that you might be hitting.
>
> My first patch exposed a lurking bug in bio_clone_biovec.  Basically
> bio->bi_vcnt is being doubled instead of initialized.
>
> This patch is only compile tested, but I think it'll fix it.

Thanks, giving it a go now (will have results in 30+ minutes). Jens'
patch didn't make a difference, which makes sense given lack of dm
usage.


-Olof


Re: Boot failure on Arndale with next-20131105

2013-11-05 Thread Jens Axboe
On 11/05/2013 10:38 AM, Stephen Warren wrote:
> I note that compiling next-20131105 generates quite a few warnings re:
> uninitialized variables. Reverting the commit doesn't solve those.
> 
>> block/blk-merge.c: In function ‘blk_bio_map_sg’:
>> block/blk-merge.c:133:8: warning: ‘bvprv.bv_len’ may be used uninitialized 
>> in this function [-Wmaybe-uninitialized]
>> block/blk-merge.c:233:23: note: ‘bvprv.bv_len’ was declared here
>> block/blk-merge.c:133:8: warning: ‘bvprv.bv_offset’ may be used 
>> uninitialized in this function [-Wmaybe-uninitialized]
>> block/blk-merge.c:233:23: note: ‘bvprv.bv_offset’ was declared here
>> block/blk-merge.c:133:8: warning: ‘bvprv.bv_page’ may be used uninitialized 
>> in this function [-Wmaybe-uninitialized]
>> block/blk-merge.c:233:23: note: ‘bvprv.bv_page’ was declared here
>> block/blk-merge.c: In function ‘blk_rq_map_sg’:
>> block/blk-merge.c:133:8: warning: ‘bvprv.bv_page’ may be used uninitialized 
>> in this function [-Wmaybe-uninitialized]
>> block/blk-merge.c:171:23: note: ‘bvprv.bv_page’ was declared here
>> block/blk-merge.c:133:8: warning: ‘bvprv.bv_offset’ may be used 
>> uninitialized in this function [-Wmaybe-uninitialized]
>> block/blk-merge.c:171:23: note: ‘bvprv.bv_offset’ was declared here
>> block/blk-merge.c:133:8: warning: ‘bvprv.bv_len’ may be used uninitialized 
>> in this function [-Wmaybe-uninitialized]
>> block/blk-merge.c:171:23: note: ‘bvprv.bv_len’ was declared here
>> block/blk-merge.c: In function ‘attempt_merge’:
>> block/blk-merge.c:108:7: warning: ‘end_bv.bv_offset’ may be used 
>> uninitialized in this function [-Wmaybe-uninitialized]
>> block/blk-merge.c:89:17: note: ‘end_bv.bv_offset’ was declared here
>> block/blk-merge.c:108:7: warning: ‘end_bv.bv_page’ may be used uninitialized 
>> in this function [-Wmaybe-uninitialized]
>> block/blk-merge.c:89:17: note: ‘end_bv.bv_page’ was declared here
>> block/blk-merge.c:108:7: warning: ‘end_bv.bv_len’ may be used uninitialized 
>> in this function [-Wmaybe-uninitialized]
>> block/blk-merge.c:89:17: note: ‘end_bv.bv_len’ was declared here

Looks like an incomplete merge. The patch to silence those warnings
(which aren't bugs, BTW) is definitely in my for-next branch.
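To illustrate how a -Wmaybe-uninitialized report can be a false positive, here is a minimal userspace sketch. It is not the blk-merge.c code: the struct `warn_vec`, the function `warn_count_mergeable` and the flag `have_prev` are hypothetical names made up for this example. The shape is the general one where a "previous element" struct is only read after a flag says it has been assigned, which some compilers cannot prove. Zero-initializing the struct is one common way to quiet the warning; whether the for-next patch does exactly that is an assumption, not something stated in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bio_vec-like segment; not a kernel type. */
struct warn_vec {
	const void *page;
	unsigned len;
	unsigned offset;
};

/* Counts adjacent pairs that sit back to back on the same page. */
unsigned warn_count_mergeable(const struct warn_vec *v, size_t n)
{
	/*
	 * Without this initializer a compiler may warn that prev.page,
	 * prev.len and prev.offset "may be used uninitialized", even though
	 * the have_prev check below means they are never read before the
	 * first assignment.
	 */
	struct warn_vec prev = { NULL, 0, 0 };
	int have_prev = 0;
	unsigned merges = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (have_prev && prev.page == v[i].page &&
		    prev.offset + prev.len == v[i].offset)
			merges++;
		prev = v[i];   /* prev is valid from here on */
		have_prev = 1;
	}
	return merges;
}
```

The initializer does not change behavior in this sketch, since the guarded read never sees the initial value; it only makes the data flow obvious to the compiler.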

-- 
Jens Axboe



Re: Boot failure on Arndale with next-20131105

2013-11-05 Thread Chris Mason
Quoting Olof Johansson (2013-11-05 15:38:33)
> On Tue, Nov 5, 2013 at 12:33 PM, Chris Mason  wrote:
> > Quoting Olof Johansson (2013-11-05 15:23:51)
> >> On Tue, Nov 5, 2013 at 11:33 AM, Jens Axboe  wrote:
> >
> > [ horrible crashes fixed by removing my patch ]
> >
> >> > Very weird! What file system is being used?
> >>
> >> Most of my failures have happened on regular MMC cards with ext4
> >> filesystems on them.
> >>
> >> Note that the panic happens during device probe / partition table
> >> scanning, not after mounting the filesystem.
> >>
> >> Giving your patch a go now across the board. I'm very concerned about
> >> the reports of bisectability, build failures and heaps of warnings
> >> though. Did the 0-day builder pick up any of those? :-/
> >>
> >
> > Hmmm, is bcache in your config?
> 
> Doesn't look that way -- no ARM defconfigs enable it (it's what I
> build and boot), and the option defaults to off and nothing selects
> it.

Ok, I think I see it.  My guess is that you're hitting bounce buffers.
__blk_queue_bounce is the only caller of the bio splitting code I can
see that you might be hitting.

My first patch exposed a lurking bug in bio_clone_biovec.  Basically
bio->bi_vcnt is being doubled instead of initialized.

This patch is only compile tested, but I think it'll fix it.

diff --git a/fs/bio.c b/fs/bio.c
index be93de1..3595456 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -612,6 +612,7 @@ int bio_clone_biovec(struct bio *bio, gfp_t gfp_mask)
 	unsigned nr_iovecs = 0;
 	struct bio_vec bv, *bvl = NULL;
 	struct bvec_iter iter;
+	int i;
 
 	BUG_ON(!bio->bi_pool);
 	BUG_ON(BIO_POOL_IDX(bio) != BIO_POOL_NONE);
@@ -628,8 +629,9 @@ int bio_clone_biovec(struct bio *bio, gfp_t gfp_mask)
 		bvl = bio->bi_inline_vecs;
 	}
 
+	i = 0;
 	bio_for_each_segment(bv, bio, iter)
-		bvl[bio->bi_vcnt++] = bv;
+		bvl[i++] = bv;
 
 	bio->bi_io_vec = bvl;
 	bio->bi_iter.bi_idx = 0;
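
The "doubled instead of initialized" effect can be reproduced in a small userspace sketch. Everything here is a stand-in: `demo_bio`, `demo_vec` and the two copy helpers are hypothetical names, not kernel code. The one assumption carried over from the thread is that the count field is already set from the original before the copy loop runs.

```c
#include <assert.h>

#define DEMO_MAX_VECS 8

/* Hypothetical stand-ins for struct bio_vec and struct bio; not kernel types. */
struct demo_vec { int page; };
struct demo_bio {
	unsigned short vcnt;                 /* plays the role of bi_vcnt */
	struct demo_vec vecs[DEMO_MAX_VECS]; /* segments to be copied */
};

/* Old loop shape: the already-set count doubles as the write index. */
unsigned short demo_copy_buggy(struct demo_bio *bio, struct demo_vec *dst,
			       unsigned dst_len)
{
	unsigned short n = bio->vcnt; /* number of segments to copy */
	unsigned short k;

	for (k = 0; k < n; k++) {
		assert(bio->vcnt < dst_len);
		/* first write lands at index n instead of 0 */
		dst[bio->vcnt++] = bio->vecs[k];
	}
	return bio->vcnt; /* count has grown from n to 2 * n */
}

/* Patched loop shape: a separate index starts at zero, the count is untouched. */
unsigned short demo_copy_fixed(struct demo_bio *bio, struct demo_vec *dst,
			       unsigned dst_len)
{
	unsigned short n = bio->vcnt;
	unsigned short k;
	unsigned i = 0;

	for (k = 0; k < n; k++) {
		assert(i < dst_len);
		dst[i++] = bio->vecs[k];
	}
	return bio->vcnt; /* still n */
}
```

With a count of 3 going in, the first helper returns 6 and leaves the first three destination slots unwritten, while the second returns 3 with slots 0 to 2 filled.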


Re: Boot failure on Arndale with next-20131105

2013-11-05 Thread Olof Johansson
On Tue, Nov 5, 2013 at 12:33 PM, Chris Mason  wrote:
> Quoting Olof Johansson (2013-11-05 15:23:51)
>> On Tue, Nov 5, 2013 at 11:33 AM, Jens Axboe  wrote:
>
> [ horrible crashes fixed by removing my patch ]
>
>> > Very weird! What file system is being used?
>>
>> Most of my failures have happened on regular MMC cards with ext4
>> filesystems on them.
>>
>> Note that the panic happens during device probe / partition table
>> scanning, not after mounting the filesystem.
>>
>> Giving your patch a go now across the board. I'm very concerned about
>> the reports of bisectability, build failures and heaps of warnings
>> though. Did the 0-day builder pick up any of those? :-/
>>
>
> Hmmm, is bcache in your config?

Doesn't look that way -- no ARM defconfigs enable it (it's what I
build and boot), and the option defaults to off and nothing selects
it.


-Olof


Re: Boot failure on Arndale with next-20131105

2013-11-05 Thread Jens Axboe
On 11/05/2013 01:23 PM, Olof Johansson wrote:
>> Very weird! What file system is being used?
> 
> Most of my failures have happened on regular MMC cards with ext4
> filesystems on them.
> 
> Note that the panic happens during device probe / partition table
> scanning, not after mounting the filesystem.

Hmm ok.

> Giving your patch a go now across the board. I'm very concerned about
> the reports of bisectability, build failures and heaps of warnings
> though. Did the 0-day builder pick up any of those? :-/

Yeah, unfortunately the immutable conversion has turned out to be quite
messy :-(

-- 
Jens Axboe



Re: Boot failure on Arndale with next-20131105

2013-11-05 Thread Chris Mason
Quoting Olof Johansson (2013-11-05 15:23:51)
> On Tue, Nov 5, 2013 at 11:33 AM, Jens Axboe  wrote:

[ horrible crashes fixed by removing my patch ]

> > Very weird! What file system is being used?
> 
> Most of my failures have happened on regular MMC cards with ext4
> filesystems on them.
> 
> Note that the panic happens during device probe / partition table
> scanning, not after mounting the filesystem.
> 
> Giving your patch a go now across the board. I'm very concerned about
> the reports of bisectability, build failures and heaps of warnings
> though. Did the 0-day builder pick up any of those? :-/
> 

Hmmm, is bcache in your config?

-chris



Re: Boot failure on Arndale with next-20131105

2013-11-05 Thread Olof Johansson
On Tue, Nov 5, 2013 at 11:33 AM, Jens Axboe  wrote:
> On 11/05/2013 04:49 AM, Tushar Behera wrote:
>> Hi,
>>
>> We are having a boot-time kernel panic on Samsung's Exynos5250-based
>> Arndale board with next-20131105. Bisect points to following commit.
>>
>> <<<
>> commit febca1baea1cfe2d7a0271385d89b03d5fb34f94
>> Author: Chris Mason 
>> Date:   Thu Oct 31 13:32:42 2013 -0600
>>
>> block: setup bi_vcnt on clones
>>
>> commit 9fc6286f347d changed the cloning code to make clones cheaper for
>> the case where we don't need to clone the iovec array.  But,
>> the new clone needs the bi_vnct from the original.
>>
>> Signed-off-by: Chris Mason 
>> Signed-off-by: Jens Axboe 
>> >>>
>>
>> Reverting above commit, Arndale is able to boot again.
>>
>> Excerpts from the boot log (just in case, it helps in debugging).
>>
>> [1.972062] Unable to handle kernel paging request at virtual
>> address 025e63a0
>> [1.981164] pgd = c0004000
>> [1.982375] [025e63a0] *pgd=
>> [1.985875] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
>> [1.991086] Modules linked in:
>> [1.994076] CPU: 0 PID: 1178 Comm: mmcqd/0 Not tainted
>> 3.12.0-rc5-00051-gfebca1b #21
>> [2.001683] task: ef3530c0 ti: ee82e000 task.ti: ee82e000
>> [2.006981] PC is at dma_cache_maint_page+0x84/0x174
>> [2.011842] LR is at 0x6
>>
>> [2.043532] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM
>> Segment kernel
>> [2.050708] Control: 10c5387d  Table: 4000406a  DAC: 0015
>> [2.056342] Process mmcqd/0 (pid: 1178, stack limit = 0xee82e240)
>> [2.062321] Stack: (0xee82fd58 to 0xee83)
>>
>> [ ... ]
>>
>> [2.275352] [] (dma_cache_maint_page+0x84/0x174) from
>> [] (__dma_page_cpu_to_dev+0x28/0xa0)
>> [2.285170] [] (__dma_page_cpu_to_dev+0x28/0xa0) from
>> [] (arm_dma_map_page+0x6c/0x70)
>> [2.294565] [] (arm_dma_map_page+0x6c/0x70) from
>> [] (arm_dma_map_sg+0x74/0xec)
>> [2.303366] [] (arm_dma_map_sg+0x74/0xec) from
>> [] (dw_mci_pre_dma_transfer.isra.16+0x124/0x15c)
>> [2.313614] []
>> (dw_mci_pre_dma_transfer.isra.16+0x124/0x15c) from []
>> (dw_mci_pre_req+0x44/0x50)
>> [2.323863] [] (dw_mci_pre_req+0x44/0x50) from
>> [] (mmc_start_req+0x3c/0x39c)
>> [2.332486] [] (mmc_start_req+0x3c/0x39c) from
>> [] (mmc_blk_issue_rw_rq+0xbc/0xa9c)
>> [2.341625] [] (mmc_blk_issue_rw_rq+0xbc/0xa9c) from
>> [] (mmc_blk_issue_rq+0x1c8/0x498)
>> [2.351106] [] (mmc_blk_issue_rq+0x1c8/0x498) from
>> [] (mmc_queue_thread+0xa4/0x144)
>> [2.360331] [] (mmc_queue_thread+0xa4/0x144) from
>> [] (kthread+0xb4/0xb8)
>> [2.368616] [] (kthread+0xb4/0xb8) from []
>> (ret_from_fork+0x14/0x3c)
>> [2.376556] Code: 17e81051 10822181 e592c000 e3ccc003 (e79c2007)
>> [2.382570] ---[ end trace df06b64b1b7fa443 ]---
>>
>> [ ... ]
>>
>> Begin: Mounting root file system ... Begin: Running /scripts/local-top ... 
>> done.
>> Gave up waiting for root device.  Common problems:
>>  - Boot args (cat /proc/cmdline)
>>- Check rootdelay= (did the system wait long enough?)
>>- Check root= (did the system wait for the right device?)
>>  - Missing modules (cat /proc/modules; ls /dev)
>> ALERT!  /dev/mmcblk1p3 does not exist.  Dropping to a shell!
>> FATAL: Could not load
>> /lib/modules/3.12.0-rc5-00051-gfebca1b/modules.dep: No such file or
>> directory
>> FATAL: Could not load
>> /lib/modules/3.12.0-rc5-00051-gfebca1b/modules.dep: No such file or
>> directory
>
> Very weird! What file system is being used?

Most of my failures have happened on regular MMC cards with ext4
filesystems on them.

Note that the panic happens during device probe / partition table
scanning, not after mounting the filesystem.

Giving your patch a go now across the board. I'm very concerned about
the reports of bisectability, build failures and heaps of warnings
though. Did the 0-day builder pick up any of those? :-/


-Olof


Re: Boot failure on Arndale with next-20131105

2013-11-05 Thread Jens Axboe
On 11/05/2013 09:42 AM, Tomasz Figa wrote:
> Hi,
> 
> On Tuesday 05 of November 2013 17:19:00 Tushar Behera wrote:
>> Hi,
>>
>> We are having a boot-time kernel panic on Samsung's Exynos5250-based
>> Arndale board with next-20131105. Bisect points to following commit.
>>
>> <<<
>> commit febca1baea1cfe2d7a0271385d89b03d5fb34f94
>> Author: Chris Mason 
>> Date:   Thu Oct 31 13:32:42 2013 -0600
>>
>> block: setup bi_vcnt on clones
>>
>> commit 9fc6286f347d changed the cloning code to make clones cheaper for
>> the case where we don't need to clone the iovec array.  But,
>> the new clone needs the bi_vnct from the original.
>>
>> Signed-off-by: Chris Mason 
>> Signed-off-by: Jens Axboe 
>> >>>
>>
>> Reverting above commit, Arndale is able to boot again.
> 
> I can confirm exactly the same behavior on Exynos 4210-based Trats board,
> with exactly the same bisection results.

Can either (or both) of you try this?

-- 
Jens Axboe

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 8e6174c..a1177e1 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1123,8 +1123,13 @@ struct clone_info {
 
 static void bio_setup_sector(struct bio *bio, sector_t sector, sector_t len)
 {
-	bio->bi_iter.bi_sector = sector;
-	bio->bi_iter.bi_size = to_bytes(len);
+	if (len) {
+		bio->bi_iter.bi_sector = sector;
+		bio->bi_iter.bi_size = to_bytes(len);
+	} else {
+		bio->bi_iter.bi_size = 0;
+		bio->bi_vcnt = 0;
+	}
 }
 
 /*
@@ -1178,8 +1183,7 @@ static void __clone_and_map_simple_bio(struct clone_info *ci,
 	 * and discard, so no need for concern about wasted bvec allocations.
 	 */
 	 __bio_clone(clone, ci->bio);
-	if (len)
-		bio_setup_sector(clone, ci->sector, len);
+	bio_setup_sector(clone, ci->sector, len);
 
 	__map_bio(tio);
 }
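
The effect of the reworked helper on a zero-length clone can be sketched in userspace as follows. The types `sketch_bio` and `sketch_iter` and the helper `sketch_to_bytes` are hypothetical stand-ins, and the 512-byte sector size is an assumption of this sketch; only the if/else shape is taken from the diff above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct bvec_iter and struct bio; not kernel types. */
struct sketch_iter {
	uint64_t sector; /* plays the role of bi_iter.bi_sector */
	uint32_t size;   /* plays the role of bi_iter.bi_size */
};
struct sketch_bio {
	struct sketch_iter iter;
	unsigned short vcnt; /* plays the role of bi_vcnt */
};

/* Assumes 512-byte sectors for the sector-to-byte conversion. */
static uint32_t sketch_to_bytes(uint64_t sectors)
{
	return (uint32_t)(sectors << 9);
}

/* Same branch structure as the patched bio_setup_sector() in the diff. */
void sketch_setup_sector(struct sketch_bio *bio, uint64_t sector, uint64_t len)
{
	if (len) {
		bio->iter.sector = sector;
		bio->iter.size = sketch_to_bytes(len);
	} else {
		/* zero-length clone: carry no data and no segments */
		bio->iter.size = 0;
		bio->vcnt = 0;
	}
}
```

In this sketch a clone set up with len == 0 ends with both its size and its segment count cleared, while its sector is left as it was; a non-zero len only updates sector and size.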


Re: Boot failure on Arndale with next-20131105

2013-11-05 Thread Jens Axboe
On 11/05/2013 04:49 AM, Tushar Behera wrote:
> Hi,
> 
> We are having a boot-time kernel panic on Samsung's Exynos5250-based
> Arndale board with next-20131105. Bisect points to following commit.
> 
> <<<
> commit febca1baea1cfe2d7a0271385d89b03d5fb34f94
> Author: Chris Mason 
> Date:   Thu Oct 31 13:32:42 2013 -0600
> 
> block: setup bi_vcnt on clones
> 
> commit 9fc6286f347d changed the cloning code to make clones cheaper for
> the case where we don't need to clone the iovec array.  But,
> the new clone needs the bi_vnct from the original.
> 
> Signed-off-by: Chris Mason 
> Signed-off-by: Jens Axboe
> >>>
> 
> Reverting above commit, Arndale is able to boot again.
> 
> Excerpts from the boot log (just in case, it helps in debugging).
> 
> [1.972062] Unable to handle kernel paging request at virtual
> address 025e63a0
> [1.981164] pgd = c0004000
> [1.982375] [025e63a0] *pgd=
> [1.985875] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
> [1.991086] Modules linked in:
> [1.994076] CPU: 0 PID: 1178 Comm: mmcqd/0 Not tainted
> 3.12.0-rc5-00051-gfebca1b #21
> [2.001683] task: ef3530c0 ti: ee82e000 task.ti: ee82e000
> [2.006981] PC is at dma_cache_maint_page+0x84/0x174
> [2.011842] LR is at 0x6
> 
> [2.043532] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM
> Segment kernel
> [2.050708] Control: 10c5387d  Table: 4000406a  DAC: 0015
> [2.056342] Process mmcqd/0 (pid: 1178, stack limit = 0xee82e240)
> [2.062321] Stack: (0xee82fd58 to 0xee83)
> 
> [ ... ]
> 
> [2.275352] [] (dma_cache_maint_page+0x84/0x174) from
> [] (__dma_page_cpu_to_dev+0x28/0xa0)
> [2.285170] [] (__dma_page_cpu_to_dev+0x28/0xa0) from
> [] (arm_dma_map_page+0x6c/0x70)
> [2.294565] [] (arm_dma_map_page+0x6c/0x70) from
> [] (arm_dma_map_sg+0x74/0xec)
> [2.303366] [] (arm_dma_map_sg+0x74/0xec) from
> [] (dw_mci_pre_dma_transfer.isra.16+0x124/0x15c)
> [2.313614] []
> (dw_mci_pre_dma_transfer.isra.16+0x124/0x15c) from []
> (dw_mci_pre_req+0x44/0x50)
> [2.323863] [] (dw_mci_pre_req+0x44/0x50) from
> [] (mmc_start_req+0x3c/0x39c)
> [2.332486] [] (mmc_start_req+0x3c/0x39c) from
> [] (mmc_blk_issue_rw_rq+0xbc/0xa9c)
> [2.341625] [] (mmc_blk_issue_rw_rq+0xbc/0xa9c) from
> [] (mmc_blk_issue_rq+0x1c8/0x498)
> [2.351106] [] (mmc_blk_issue_rq+0x1c8/0x498) from
> [] (mmc_queue_thread+0xa4/0x144)
> [2.360331] [] (mmc_queue_thread+0xa4/0x144) from
> [] (kthread+0xb4/0xb8)
> [2.368616] [] (kthread+0xb4/0xb8) from []
> (ret_from_fork+0x14/0x3c)
> [2.376556] Code: 17e81051 10822181 e592c000 e3ccc003 (e79c2007)
> [2.382570] ---[ end trace df06b64b1b7fa443 ]---
> 
> [ ... ]
> 
> Begin: Mounting root file system ... Begin: Running /scripts/local-top ... 
> done.
> Gave up waiting for root device.  Common problems:
>  - Boot args (cat /proc/cmdline)
>- Check rootdelay= (did the system wait long enough?)
>- Check root= (did the system wait for the right device?)
>  - Missing modules (cat /proc/modules; ls /dev)
> ALERT!  /dev/mmcblk1p3 does not exist.  Dropping to a shell!
> FATAL: Could not load
> /lib/modules/3.12.0-rc5-00051-gfebca1b/modules.dep: No such file or
> directory
> FATAL: Could not load
> /lib/modules/3.12.0-rc5-00051-gfebca1b/modules.dep: No such file or
> directory

Very weird! What file system is being used?

-- 
Jens Axboe



Re: Boot failure on Arndale with next-20131105

2013-11-05 Thread Stephen Warren
On 11/05/2013 09:42 AM, Tomasz Figa wrote:
> Hi,
> 
> On Tuesday 05 of November 2013 17:19:00 Tushar Behera wrote:
>> Hi,
>>
>> We are having a boot-time kernel panic on Samsung's Exynos5250-based
>> Arndale board with next-20131105. Bisect points to following commit.
>>
>> <<<
>> commit febca1baea1cfe2d7a0271385d89b03d5fb34f94
>> Author: Chris Mason 
>> Date:   Thu Oct 31 13:32:42 2013 -0600
>>
>> block: setup bi_vcnt on clones
>>
>> commit 9fc6286f347d changed the cloning code to make clones cheaper for
>> the case where we don't need to clone the iovec array.  But,
>> the new clone needs the bi_vnct from the original.
>>
>> Signed-off-by: Chris Mason 
>> Signed-off-by: Jens Axboe 
>> >>>
>>
>> Reverting above commit, Arndale is able to boot again.
> 
> I can confirm exactly the same behavior on Exynos 4210-based Trats board,
> with exactly the same bisection results.

Despite the backtrace looking different, reverting that commit also
solves the boot failures on the Tegra-based "Beaver" board.

> Also note that I spotted multiple build failures in block layer during
> the bisection.

I note that compiling next-20131105 generates quite a few warnings re:
uninitialized variables. Reverting the commit doesn't solve those.

> block/blk-merge.c: In function ‘blk_bio_map_sg’:
> block/blk-merge.c:133:8: warning: ‘bvprv.bv_len’ may be used uninitialized in 
> this function [-Wmaybe-uninitialized]
> block/blk-merge.c:233:23: note: ‘bvprv.bv_len’ was declared here
> block/blk-merge.c:133:8: warning: ‘bvprv.bv_offset’ may be used uninitialized 
> in this function [-Wmaybe-uninitialized]
> block/blk-merge.c:233:23: note: ‘bvprv.bv_offset’ was declared here
> block/blk-merge.c:133:8: warning: ‘bvprv.bv_page’ may be used uninitialized 
> in this function [-Wmaybe-uninitialized]
> block/blk-merge.c:233:23: note: ‘bvprv.bv_page’ was declared here
> block/blk-merge.c: In function ‘blk_rq_map_sg’:
> block/blk-merge.c:133:8: warning: ‘bvprv.bv_page’ may be used uninitialized 
> in this function [-Wmaybe-uninitialized]
> block/blk-merge.c:171:23: note: ‘bvprv.bv_page’ was declared here
> block/blk-merge.c:133:8: warning: ‘bvprv.bv_offset’ may be used uninitialized 
> in this function [-Wmaybe-uninitialized]
> block/blk-merge.c:171:23: note: ‘bvprv.bv_offset’ was declared here
> block/blk-merge.c:133:8: warning: ‘bvprv.bv_len’ may be used uninitialized in 
> this function [-Wmaybe-uninitialized]
> block/blk-merge.c:171:23: note: ‘bvprv.bv_len’ was declared here
> block/blk-merge.c: In function ‘attempt_merge’:
> block/blk-merge.c:108:7: warning: ‘end_bv.bv_offset’ may be used 
> uninitialized in this function [-Wmaybe-uninitialized]
> block/blk-merge.c:89:17: note: ‘end_bv.bv_offset’ was declared here
> block/blk-merge.c:108:7: warning: ‘end_bv.bv_page’ may be used uninitialized 
> in this function [-Wmaybe-uninitialized]
> block/blk-merge.c:89:17: note: ‘end_bv.bv_page’ was declared here
> block/blk-merge.c:108:7: warning: ‘end_bv.bv_len’ may be used uninitialized 
> in this function [-Wmaybe-uninitialized]
> block/blk-merge.c:89:17: note: ‘end_bv.bv_len’ was declared here



Re: Boot failure on Arndale with next-20131105

2013-11-05 Thread Tomasz Figa
Hi,

On Tuesday 05 of November 2013 17:19:00 Tushar Behera wrote:
> Hi,
> 
> We are having a boot-time kernel panic on Samsung's Exynos5250-based
> Arndale board with next-20131105. Bisect points to following commit.
> 
> <<<
> commit febca1baea1cfe2d7a0271385d89b03d5fb34f94
> Author: Chris Mason 
> Date:   Thu Oct 31 13:32:42 2013 -0600
> 
> block: setup bi_vcnt on clones
> 
> commit 9fc6286f347d changed the cloning code to make clones cheaper for
> the case where we don't need to clone the iovec array.  But,
> the new clone needs the bi_vnct from the original.
> 
> Signed-off-by: Chris Mason 
> Signed-off-by: Jens Axboe 
> >>>
> 
> Reverting above commit, Arndale is able to boot again.

I can confirm exactly the same behavior on Exynos 4210-based Trats board,
with exactly the same bisection results.

Also note that I spotted multiple build failures in block layer during
the bisection.

# skip: [c198aee7e9a801d32cee4607453871cff7c43e6c] ceph: Convert to immutable biovecs
git bisect skip c198aee7e9a801d32cee4607453871cff7c43e6c
# skip: [3d75d579a04be023552b45f791cd95f5b6a45ba6] block: Kill bio_segments()/bi_vcnt usage
git bisect skip 3d75d579a04be023552b45f791cd95f5b6a45ba6
# skip: [f2da8e013088387e5e61930b715ff0defea9aa58] aoe: Convert to immutable biovecs
git bisect skip f2da8e013088387e5e61930b715ff0defea9aa58
# skip: [44931ee84c6362ec8d9b97b02432760035a2b639] block: Kill bio_pair_split()
git bisect skip 44931ee84c6362ec8d9b97b02432760035a2b639
# skip: [d4fbf2c24290f237cf5989d8e4c8507969ae2299] rbd: Refactor bio cloning, don't clone biovecs
git bisect skip d4fbf2c24290f237cf5989d8e4c8507969ae2299
# skip: [cc4067bd8adeb5507829b7ae8f17211aab5d1e9d] block: Kill bio_iovec_idx(), __bio_iovec()
git bisect skip cc4067bd8adeb5507829b7ae8f17211aab5d1e9d
# skip: [a040a44b1c2b56fed3ebef3734681b6fe473fd33] dm: Refactor for new bio cloning/splitting
git bisect skip a040a44b1c2b56fed3ebef3734681b6fe473fd33
# skip: [948809ba161cce4060977970e1133a66fffc3449] block: Introduce new bio_split()
git bisect skip 948809ba161cce4060977970e1133a66fffc3449
# skip: [7e814b148e1127be7c32bb438ceaadb0b6e33042] block: Remove bi_idx hacks
git bisect skip 7e814b148e1127be7c32bb438ceaadb0b6e33042
# skip: [3dbdffcc4c1ffb7d7ac631be55cd5aab3b258614] block: Immutable bio vecs
git bisect skip 3dbdffcc4c1ffb7d7ac631be55cd5aab3b258614
# skip: [919b8823a6ef27103fe3abd05026f87ad85ed1ad] block: Convert drivers to immutable biovecs
git bisect skip 919b8823a6ef27103fe3abd05026f87ad85ed1ad
# skip: [5fbc9c23b291ac8d8ffe73cbc55cd7cb9c57fd04] block: Convert bio_copy_data() to bvec_iter
git bisect skip 5fbc9c23b291ac8d8ffe73cbc55cd7cb9c57fd04
# skip: [2771aecc0cc33d70747c8335239c20c9ff87ac67] block: Generic bio chaining
git bisect skip 2771aecc0cc33d70747c8335239c20c9ff87ac67
# skip: [9fc6286f347d00528adcdcf12396d220f47492ed] block: Don't save/copy bvec array anymore, share when cloning
git bisect skip 9fc6286f347d00528adcdcf12396d220f47492ed
# skip: [85bf1bd38f53e93712a149a8c31abe6936494d64] block: Rename bio_split() -> bio_pair_split()
git bisect skip 85bf1bd38f53e93712a149a8c31abe6936494d64
# skip: [5d1f127c3e0c57d64ce75ee04a0db2b40a3e21df] block: Convert bio_for_each_segment() to bvec_iter
git bisect skip 5d1f127c3e0c57d64ce75ee04a0db2b40a3e21df
# skip: [eb225c28a0b3f730b50f096946aee5eef2cb9969] bio-integrity: Convert to bvec_iter
git bisect skip eb225c28a0b3f730b50f096946aee5eef2cb9969

Full bisect log:

# bad: [98dd2f31c585ddcfb78ce14f8d0efcb52e5ed2e9] Add linux-next specific files for 20131105
# good: [355e62f5ad12b005c862838156262eb2df2f8dff] of/irq: Fix potential buffer overflow
git bisect start 'v3.13-sdhci-fail' '355e62f'
# good: [c2e1895eb0564667394c28e1ecd772ee6a27ea54] Merge remote-tracking branch 'crypto/master'
git bisect good c2e1895eb0564667394c28e1ecd772ee6a27ea54
# bad: [7f1546329db7b573f3a640d75cc1af40dc5ee9ed] Merge remote-tracking branch 'tip/auto-latest'
git bisect bad 7f1546329db7b573f3a640d75cc1af40dc5ee9ed
# bad: [07d76d00209c960eb8bcce9dfdf36e7edd458da3] Merge remote-tracking branch 'md/for-next'
git bisect bad 07d76d00209c960eb8bcce9dfdf36e7edd458da3
# good: [36753aaf7758b2089a55b3e67e6f1a9242462bb4] Merge remote-tracking branch 'drm-tegra/drm/for-next'
git bisect good 36753aaf7758b2089a55b3e67e6f1a9242462bb4
# good: [ca5f026efedeb01287863a9c7e1d5fdaf82d196d] Merge remote-tracking branch 'virtio/virtio-next'
git bisect good ca5f026efedeb01287863a9c7e1d5fdaf82d196d
# bad: [67b89a119b28377ced0ea844aed51f74976db36b] Merge remote-tracking branch 'block/for-next'
git bisect bad 67b89a119b28377ced0ea844aed51f74976db36b
# bad: [26f584573c613d2a7292d8c66dc063ae2bece90a] Merge branch 'for-3.13/core' into for-next
git bisect bad 26f584573c613d2a7292d8c66dc063ae2bece90a
# bad: [0023432f72015803e050e381f12a724e59eded74] dm: fix missing bi_remaining accounting
git bisect bad 0023432f72015803e050e381f12a724e59eded74
# good: [8b6df54182c8c775f346a0703ccb4c531c18a8f0] block: Use 
rw_copy_check_