Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-27 Theodore Y. Ts'o
On Fri, Sep 25, 2020 at 02:18:48PM -0700, Shakeel Butt wrote: > > Yes, you are right. Let's first get this patch tested and after > confirmation we can update the commit message. Thanks Shakeel! I've tested your patch, as well as reverting the three commits that Linus had suggested, and both

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-26 Roman Gushchin
On Sat, Sep 26, 2020 at 09:43:25AM +0800, Ming Lei wrote: > On Fri, Sep 25, 2020 at 12:19:02PM -0700, Shakeel Butt wrote: > > On Fri, Sep 25, 2020 at 10:58 AM Shakeel Butt > > wrote: > > > > > [snip] > > > > > > I don't think you can ignore the flushing. The __free_once() in > > > ___cache_free()

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-25 Ming Lei
On Fri, Sep 25, 2020 at 12:19:02PM -0700, Shakeel Butt wrote: > On Fri, Sep 25, 2020 at 10:58 AM Shakeel Butt > wrote: > > > [snip] > > > > I don't think you can ignore the flushing. The __free_once() in > > ___cache_free() assumes there is a space available. > > > > BTW do_drain() also have the

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-25 Shakeel Butt
On Fri, Sep 25, 2020 at 1:56 PM Roman Gushchin wrote: > > On Fri, Sep 25, 2020 at 12:19:02PM -0700, Shakeel Butt wrote: > > On Fri, Sep 25, 2020 at 10:58 AM Shakeel Butt > > wrote: > > > > > [snip] > > > > > > I don't think you can ignore the flushing. The __free_once() in > > > ___cache_free()

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-25 Roman Gushchin
On Fri, Sep 25, 2020 at 12:19:02PM -0700, Shakeel Butt wrote: > On Fri, Sep 25, 2020 at 10:58 AM Shakeel Butt > wrote: > > > [snip] > > > > I don't think you can ignore the flushing. The __free_once() in > > ___cache_free() assumes there is a space available. > > > > BTW do_drain() also have the

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-25 Shakeel Butt
On Fri, Sep 25, 2020 at 10:58 AM Shakeel Butt wrote: > [snip] > > I don't think you can ignore the flushing. The __free_once() in > ___cache_free() assumes there is a space available. > > BTW do_drain() also have the same issue. > > Why not move slabs_destroy() after we update ac->avail and

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-25 Shakeel Butt
On Fri, Sep 25, 2020 at 10:48 AM Roman Gushchin wrote: > > On Fri, Sep 25, 2020 at 10:35:03AM -0700, Shakeel Butt wrote: > > On Fri, Sep 25, 2020 at 10:22 AM Shakeel Butt wrote: > > > > > > On Fri, Sep 25, 2020 at 10:17 AM Linus Torvalds > > > wrote: > > > > > > > > On Fri, Sep 25, 2020 at 9:19

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-25 Roman Gushchin
On Fri, Sep 25, 2020 at 10:35:03AM -0700, Shakeel Butt wrote: > On Fri, Sep 25, 2020 at 10:22 AM Shakeel Butt wrote: > > > > On Fri, Sep 25, 2020 at 10:17 AM Linus Torvalds > > wrote: > > > > > > On Fri, Sep 25, 2020 at 9:19 AM Ming Lei wrote: > > > > > > > > git bisect shows the first bad

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-25 Shakeel Butt
On Fri, Sep 25, 2020 at 10:22 AM Shakeel Butt wrote: > > On Fri, Sep 25, 2020 at 10:17 AM Linus Torvalds > wrote: > > > > On Fri, Sep 25, 2020 at 9:19 AM Ming Lei wrote: > > > > > > git bisect shows the first bad commit: > > > > > > [10befea91b61c4e2c2d1df06a2e978d182fcf792] mm:

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-25 Roman Gushchin
On Fri, Sep 25, 2020 at 09:47:43AM -0700, Shakeel Butt wrote: > On Fri, Sep 25, 2020 at 9:32 AM Shakeel Butt wrote: > > > > On Fri, Sep 25, 2020 at 9:19 AM Ming Lei wrote: > > > > > > On Fri, Sep 25, 2020 at 03:31:45PM +0800, Ming Lei wrote: > > > > On Thu, Sep 24, 2020 at 09:13:11PM -0400,

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-25 Shakeel Butt
On Fri, Sep 25, 2020 at 10:17 AM Linus Torvalds wrote: > > On Fri, Sep 25, 2020 at 9:19 AM Ming Lei wrote: > > > > git bisect shows the first bad commit: > > > > [10befea91b61c4e2c2d1df06a2e978d182fcf792] mm: memcg/slab: use a > > single set of > > kmem_caches for all

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-25 Linus Torvalds
On Fri, Sep 25, 2020 at 9:19 AM Ming Lei wrote: > > git bisect shows the first bad commit: > > [10befea91b61c4e2c2d1df06a2e978d182fcf792] mm: memcg/slab: use a > single set of > kmem_caches for all allocations > > And I have double checked that the above commit is really

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-25 Shakeel Butt
On Fri, Sep 25, 2020 at 9:32 AM Shakeel Butt wrote: > > On Fri, Sep 25, 2020 at 9:19 AM Ming Lei wrote: > > > > On Fri, Sep 25, 2020 at 03:31:45PM +0800, Ming Lei wrote: > > > On Thu, Sep 24, 2020 at 09:13:11PM -0400, Theodore Y. Ts'o wrote: > > > > On Thu, Sep 24, 2020 at 10:33:45AM -0400,

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-25 Shakeel Butt
On Fri, Sep 25, 2020 at 9:19 AM Ming Lei wrote: > > On Fri, Sep 25, 2020 at 03:31:45PM +0800, Ming Lei wrote: > > On Thu, Sep 24, 2020 at 09:13:11PM -0400, Theodore Y. Ts'o wrote: > > > On Thu, Sep 24, 2020 at 10:33:45AM -0400, Theodore Y. Ts'o wrote: > > > > HOWEVER, thanks to a hint from a

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-25 Ming Lei
On Fri, Sep 25, 2020 at 03:31:45PM +0800, Ming Lei wrote: > On Thu, Sep 24, 2020 at 09:13:11PM -0400, Theodore Y. Ts'o wrote: > > On Thu, Sep 24, 2020 at 10:33:45AM -0400, Theodore Y. Ts'o wrote: > > > HOWEVER, thanks to a hint from a colleague at $WORK, and realizing > > > that one of the stack

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-25 Ming Lei
On Thu, Sep 24, 2020 at 09:13:11PM -0400, Theodore Y. Ts'o wrote: > On Thu, Sep 24, 2020 at 10:33:45AM -0400, Theodore Y. Ts'o wrote: > > HOWEVER, thanks to a hint from a colleague at $WORK, and realizing > > that one of the stack traces had virtio balloon in the trace, I > > realized that when I

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-24 Ming Lei
On Fri, Sep 25, 2020 at 09:14:16AM +0800, Ming Lei wrote: > On Thu, Sep 24, 2020 at 10:33:45AM -0400, Theodore Y. Ts'o wrote: > > On Thu, Sep 24, 2020 at 08:59:01AM +0800, Ming Lei wrote: > > > > > > The list corruption issue can be reproduced on kvm/qemu guest too when > > > running

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-24 Ming Lei
On Thu, Sep 24, 2020 at 10:33:45AM -0400, Theodore Y. Ts'o wrote: > On Thu, Sep 24, 2020 at 08:59:01AM +0800, Ming Lei wrote: > > > > The list corruption issue can be reproduced on kvm/qemu guest too when > > running xfstests(ext4) generic/038. > > > > However, the issue may become not

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-24 Theodore Y. Ts'o
On Thu, Sep 24, 2020 at 10:33:45AM -0400, Theodore Y. Ts'o wrote: > HOWEVER, thanks to a hint from a colleague at $WORK, and realizing > that one of the stack traces had virtio balloon in the trace, I > realized that when I switched the GCE VM type from e1-standard-2 to > n1-standard-2 (where e1

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-24 Theodore Y. Ts'o
On Thu, Sep 24, 2020 at 08:59:01AM +0800, Ming Lei wrote: > > The list corruption issue can be reproduced on kvm/qemu guest too when > running xfstests(ext4) generic/038. > > However, the issue may become not reproduced when adding or removing memory > debug options, such as adding KASAN. Can

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-23 Ming Lei
On Thu, Sep 17, 2020 at 10:30:12AM -0400, Theodore Y. Ts'o wrote: > On Thu, Sep 17, 2020 at 10:20:51AM +0800, Ming Lei wrote: > > > > Obviously there is other more serious issue, since 568f27006577 is > > completely reverted in your test, and you still see list corruption > > issue. > > > > So

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-17 Ming Lei
On Thu, Sep 17, 2020 at 10:30:12AM -0400, Theodore Y. Ts'o wrote: > On Thu, Sep 17, 2020 at 10:20:51AM +0800, Ming Lei wrote: > > > > Obviously there is other more serious issue, since 568f27006577 is > > completely reverted in your test, and you still see list corruption > > issue. > > > > So

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-17 Theodore Y. Ts'o
On Thu, Sep 17, 2020 at 10:20:51AM +0800, Ming Lei wrote: > > Obviously there is other more serious issue, since 568f27006577 is > completely reverted in your test, and you still see list corruption > issue. > > So I'd suggest to find the big issue first. Once it is fixed, maybe > everything

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-16 Ming Lei
On Wed, Sep 16, 2020 at 04:20:26PM -0400, Theodore Y. Ts'o wrote: > On Wed, Sep 16, 2020 at 07:09:41AM +0800, Ming Lei wrote: > > > The problem is it's a bit tricky to revert 568f27006577, since there > > > is a merge conflict in blk_kick_flush(). I attempted to do the bisect > > > manually here,

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-16 Theodore Y. Ts'o
On Wed, Sep 16, 2020 at 07:09:41AM +0800, Ming Lei wrote: > > The problem is it's a bit tricky to revert 568f27006577, since there > > is a merge conflict in blk_kick_flush(). I attempted to do the bisect > > manually here, but it's clearly not right since the kernel is not > > booting after the

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-15 Ming Lei
On Tue, Sep 15, 2020 at 06:45:41PM -0400, Theodore Y. Ts'o wrote: > On Tue, Sep 15, 2020 at 03:33:03PM +0800, Ming Lei wrote: > > Hi Theodore, > > > > On Tue, Sep 15, 2020 at 12:45:19AM -0400, Theodore Y. Ts'o wrote: > > > On Thu, Sep 03, 2020 at 11:55:28PM -0400, Theodore Y. Ts'o wrote: > > > >

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-15 Theodore Y. Ts'o
On Tue, Sep 15, 2020 at 03:33:03PM +0800, Ming Lei wrote: > Hi Theodore, > > On Tue, Sep 15, 2020 at 12:45:19AM -0400, Theodore Y. Ts'o wrote: > > On Thu, Sep 03, 2020 at 11:55:28PM -0400, Theodore Y. Ts'o wrote: > > > Worse, right now, -rc1 and -rc2 is causing random crashes in my > > >

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-15 Ming Lei
Hi Theodore, On Tue, Sep 15, 2020 at 12:45:19AM -0400, Theodore Y. Ts'o wrote: > On Thu, Sep 03, 2020 at 11:55:28PM -0400, Theodore Y. Ts'o wrote: > > Worse, right now, -rc1 and -rc2 is causing random crashes in my > > gce-xfstests framework. Sometimes it happens before we've run even a > >

REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-14 Theodore Y. Ts'o
On Thu, Sep 03, 2020 at 11:55:28PM -0400, Theodore Y. Ts'o wrote: > Worse, right now, -rc1 and -rc2 is causing random crashes in my > gce-xfstests framework. Sometimes it happens before we've run even a > single xfstests; sometimes it happens after we have successfully > completed all of the