Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2018-01-11 Thread Ming Lei
On Thu, Jan 11, 2018 at 06:46:54PM +0100, Christoph Hellwig wrote:
> Thanks for looking into this Ming, I had missed it in my current
> work overload.  Can you send the updated series to Jens?

OK, I will post it out soon.

Thanks,
Ming


Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2018-01-11 Thread Christoph Hellwig
Thanks for looking into this Ming, I had missed it in my current
work overload.  Can you send the updated series to Jens?


Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2018-01-11 Thread Stefan Haberland

On 11.01.2018 12:44, Christian Borntraeger wrote:


On 01/11/2018 10:13 AM, Ming Lei wrote:

On Wed, Dec 20, 2017 at 04:47:21PM +0100, Christian Borntraeger wrote:

On 12/18/2017 02:56 PM, Stefan Haberland wrote:

On 07.12.2017 00:29, Christoph Hellwig wrote:

On Wed, Dec 06, 2017 at 01:25:11PM +0100, Christian Borntraeger wrote:
commit 11b2025c3326f7096ceb588c3117c7883850c068    -> bad

  blk-mq: create a blk_mq_ctx for each possible CPU
does not boot on DASD and
commit 9c6ae239e01ae9a9f8657f05c55c4372e9fc8bcc    -> good
     genirq/affinity: assign vectors to all possible CPUs
does boot with DASD disks.

Also adding Stefan Haberland if he has an idea why this fails on DASD and 
adding Martin (for the
s390 irq handling code).

That is interesting as it really isn't related to interrupts at all,
it just ensures that possible CPUs are set in ->cpumask.

I guess we'd really want:

e005655c389e3d25bf3e43f71611ec12f3012de0
"blk-mq: only select online CPUs in blk_mq_hctx_next_cpu"

before this commit, but it seems like the whole stack didn't work for
you either.

I wonder if there is some weird thing about nr_cpu_ids in s390?


I tried this on my system and the blk-mq-hotplug-fix branch does not boot for
me either.
The disks get up and running and I/O works fine. At least the partition
detection and EXT4-fs mount work.

But at some point in time the disks do not get any requests.

I currently have no clue why.
I took a dump and had a look at the disk states and they are fine. No error in
the logs or in our debug entries. Just empty DASD devices waiting to be called
for I/O requests.

Do you have anything I could have a look at?

Jens, Christoph, so what do we do about this?
To summarize:
- commit 4b855ad37194f7 ("blk-mq: Create hctx for each present CPU") broke CPU 
hotplug.
- Jens' quick revert did fix the issue and did not break DASD support, but it has
some issues with interrupt affinity.
- Christoph's patch set fixes the hotplug issue for virtio-blk but causes I/O
hangs on DASDs (even without hotplug).

Hello,

This one is a valid use case for VM, I think we need to fix that.

Looks like there is an issue in the fourth patch ("blk-mq: only select online
CPUs in blk_mq_hctx_next_cpu"). I fixed it in the following tree, and
the other 3 patches are the same as Christoph's:

https://github.com/ming1/linux.git  v4.15-rc-block-for-next-cpuhot-fix

gitweb:

https://github.com/ming1/linux/commits/v4.15-rc-block-for-next-cpuhot-fix

Could you test it and provide feedback?

BTW, if it doesn't help with this issue, could you boot from a normal disk first
and dump the blk-mq debugfs state of the DASD afterwards?

That kernel seems to boot fine on my system with DASD disks.

--
To unsubscribe from this list: send the line "unsubscribe linux-s390" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



I did some regression testing and it works quite well. Boot works, 
attaching CPUs during runtime on z/VM and enabling them in Linux works 
as well.

I also ran some DASD online/offline and CPU enable/disable loops.

Regards,
Stefan



Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2018-01-11 Thread Christian Borntraeger


On 01/11/2018 10:13 AM, Ming Lei wrote:
> On Wed, Dec 20, 2017 at 04:47:21PM +0100, Christian Borntraeger wrote:
>> On 12/18/2017 02:56 PM, Stefan Haberland wrote:
>>> On 07.12.2017 00:29, Christoph Hellwig wrote:
 On Wed, Dec 06, 2017 at 01:25:11PM +0100, Christian Borntraeger wrote:
commit 11b2025c3326f7096ceb588c3117c7883850c068    -> bad
>  blk-mq: create a blk_mq_ctx for each possible CPU
> does not boot on DASD and
> commit 9c6ae239e01ae9a9f8657f05c55c4372e9fc8bcc    -> good
>     genirq/affinity: assign vectors to all possible CPUs
> does boot with DASD disks.
>
> Also adding Stefan Haberland if he has an idea why this fails on DASD and 
> adding Martin (for the
> s390 irq handling code).
 That is interesting as it really isn't related to interrupts at all,
 it just ensures that possible CPUs are set in ->cpumask.

 I guess we'd really want:

 e005655c389e3d25bf3e43f71611ec12f3012de0
 "blk-mq: only select online CPUs in blk_mq_hctx_next_cpu"

 before this commit, but it seems like the whole stack didn't work for
you either.

 I wonder if there is some weird thing about nr_cpu_ids in s390?

>>>
>>> I tried this on my system and the blk-mq-hotplug-fix branch does not boot
>>> for me either.
>>> The disks get up and running and I/O works fine. At least the partition
>>> detection and EXT4-fs mount work.
>>>
>>> But at some point in time the disks do not get any requests.
>>>
>>> I currently have no clue why.
>>> I took a dump and had a look at the disk states and they are fine. No error
>>> in the logs or in our debug entries. Just empty DASD devices waiting to be
>>> called for I/O requests.
>>>
>>> Do you have anything I could have a look at?
>>
>> Jens, Christoph, so what do we do about this?
>> To summarize:
>> - commit 4b855ad37194f7 ("blk-mq: Create hctx for each present CPU") broke 
>> CPU hotplug.
>> - Jens' quick revert did fix the issue and did not break DASD support, but it
>> has some issues with interrupt affinity.
>> - Christoph's patch set fixes the hotplug issue for virtio-blk but causes I/O
>> hangs on DASDs (even without hotplug).
> 
> Hello,
> 
> This one is a valid use case for VM, I think we need to fix that.
> 
> Looks like there is an issue in the fourth patch ("blk-mq: only select online
> CPUs in blk_mq_hctx_next_cpu"). I fixed it in the following tree, and
> the other 3 patches are the same as Christoph's:
> 
>   https://github.com/ming1/linux.git  v4.15-rc-block-for-next-cpuhot-fix
> 
> gitweb:
>   
> https://github.com/ming1/linux/commits/v4.15-rc-block-for-next-cpuhot-fix
> 
> Could you test it and provide feedback?
> 
> BTW, if it doesn't help with this issue, could you boot from a normal disk first
> and dump the blk-mq debugfs state of the DASD afterwards?

That kernel seems to boot fine on my system with DASD disks.



Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2018-01-11 Thread Stefan Haberland

On 11.01.2018 10:13, Ming Lei wrote:

On Wed, Dec 20, 2017 at 04:47:21PM +0100, Christian Borntraeger wrote:

On 12/18/2017 02:56 PM, Stefan Haberland wrote:

On 07.12.2017 00:29, Christoph Hellwig wrote:

On Wed, Dec 06, 2017 at 01:25:11PM +0100, Christian Borntraeger wrote:
commit 11b2025c3326f7096ceb588c3117c7883850c068    -> bad

  blk-mq: create a blk_mq_ctx for each possible CPU
does not boot on DASD and
commit 9c6ae239e01ae9a9f8657f05c55c4372e9fc8bcc    -> good
     genirq/affinity: assign vectors to all possible CPUs
does boot with DASD disks.

Also adding Stefan Haberland if he has an idea why this fails on DASD and 
adding Martin (for the
s390 irq handling code).

That is interesting as it really isn't related to interrupts at all,
it just ensures that possible CPUs are set in ->cpumask.

I guess we'd really want:

e005655c389e3d25bf3e43f71611ec12f3012de0
"blk-mq: only select online CPUs in blk_mq_hctx_next_cpu"

before this commit, but it seems like the whole stack didn't work for
you either.

I wonder if there is some weird thing about nr_cpu_ids in s390?


I tried this on my system and the blk-mq-hotplug-fix branch does not boot for
me either.
The disks get up and running and I/O works fine. At least the partition
detection and EXT4-fs mount work.

But at some point in time the disks do not get any requests.

I currently have no clue why.
I took a dump and had a look at the disk states and they are fine. No error in
the logs or in our debug entries. Just empty DASD devices waiting to be called
for I/O requests.

Do you have anything I could have a look at?

Jens, Christoph, so what do we do about this?
To summarize:
- commit 4b855ad37194f7 ("blk-mq: Create hctx for each present CPU") broke CPU 
hotplug.
- Jens' quick revert did fix the issue and did not break DASD support, but it has
some issues with interrupt affinity.
- Christoph's patch set fixes the hotplug issue for virtio-blk but causes I/O
hangs on DASDs (even without hotplug).

Hello,

This one is a valid use case for VM, I think we need to fix that.

Looks like there is an issue in the fourth patch ("blk-mq: only select online
CPUs in blk_mq_hctx_next_cpu"). I fixed it in the following tree, and
the other 3 patches are the same as Christoph's:

https://github.com/ming1/linux.git  v4.15-rc-block-for-next-cpuhot-fix

gitweb:

https://github.com/ming1/linux/commits/v4.15-rc-block-for-next-cpuhot-fix

Could you test it and provide feedback?

BTW, if it doesn't help with this issue, could you boot from a normal disk first
and dump the blk-mq debugfs state of the DASD afterwards?

Thanks,
Ming



Hi,

thanks for the patch. I had pretty much the same place under suspicion.
I will test it ASAP.

Regards,
Stefan



Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2018-01-11 Thread Ming Lei
On Wed, Dec 20, 2017 at 04:47:21PM +0100, Christian Borntraeger wrote:
> On 12/18/2017 02:56 PM, Stefan Haberland wrote:
> > On 07.12.2017 00:29, Christoph Hellwig wrote:
> >> On Wed, Dec 06, 2017 at 01:25:11PM +0100, Christian Borntraeger wrote:
> >> commit 11b2025c3326f7096ceb588c3117c7883850c068    -> bad
> >>>  blk-mq: create a blk_mq_ctx for each possible CPU
> >>> does not boot on DASD and
> >>> commit 9c6ae239e01ae9a9f8657f05c55c4372e9fc8bcc    -> good
> >>>     genirq/affinity: assign vectors to all possible CPUs
> >>> does boot with DASD disks.
> >>>
> >>> Also adding Stefan Haberland if he has an idea why this fails on DASD and 
> >>> adding Martin (for the
> >>> s390 irq handling code).
> >> That is interesting as it really isn't related to interrupts at all,
> >> it just ensures that possible CPUs are set in ->cpumask.
> >>
> >> I guess we'd really want:
> >>
> >> e005655c389e3d25bf3e43f71611ec12f3012de0
> >> "blk-mq: only select online CPUs in blk_mq_hctx_next_cpu"
> >>
> >> before this commit, but it seems like the whole stack didn't work for
> >> you either.
> >>
> >> I wonder if there is some weird thing about nr_cpu_ids in s390?
> >>
> > 
> > I tried this on my system and the blk-mq-hotplug-fix branch does not boot
> > for me either.
> > The disks get up and running and I/O works fine. At least the partition
> > detection and EXT4-fs mount work.
> > 
> > But at some point in time the disks do not get any requests.
> > 
> > I currently have no clue why.
> > I took a dump and had a look at the disk states and they are fine. No error
> > in the logs or in our debug entries. Just empty DASD devices waiting to be
> > called for I/O requests.
> > 
> > Do you have anything I could have a look at?
> 
> Jens, Christoph, so what do we do about this?
> To summarize:
> - commit 4b855ad37194f7 ("blk-mq: Create hctx for each present CPU") broke 
> CPU hotplug.
> - Jens' quick revert did fix the issue and did not break DASD support, but it
> has some issues with interrupt affinity.
> - Christoph's patch set fixes the hotplug issue for virtio-blk but causes I/O
> hangs on DASDs (even without hotplug).

Hello,

This one is a valid use case for VM, I think we need to fix that.

Looks like there is an issue in the fourth patch ("blk-mq: only select online
CPUs in blk_mq_hctx_next_cpu"). I fixed it in the following tree, and
the other 3 patches are the same as Christoph's:

https://github.com/ming1/linux.git  v4.15-rc-block-for-next-cpuhot-fix

gitweb:

https://github.com/ming1/linux/commits/v4.15-rc-block-for-next-cpuhot-fix

Could you test it and provide feedback?

BTW, if it doesn't help with this issue, could you boot from a normal disk first
and dump the blk-mq debugfs state of the DASD afterwards?

Thanks, 
Ming


Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-12-20 Thread Christian Borntraeger
On 12/18/2017 02:56 PM, Stefan Haberland wrote:
> On 07.12.2017 00:29, Christoph Hellwig wrote:
>> On Wed, Dec 06, 2017 at 01:25:11PM +0100, Christian Borntraeger wrote:
>> commit 11b2025c3326f7096ceb588c3117c7883850c068    -> bad
>>>  blk-mq: create a blk_mq_ctx for each possible CPU
>>> does not boot on DASD and
>>> commit 9c6ae239e01ae9a9f8657f05c55c4372e9fc8bcc    -> good
>>>     genirq/affinity: assign vectors to all possible CPUs
>>> does boot with DASD disks.
>>>
>>> Also adding Stefan Haberland if he has an idea why this fails on DASD and 
>>> adding Martin (for the
>>> s390 irq handling code).
>> That is interesting as it really isn't related to interrupts at all,
>> it just ensures that possible CPUs are set in ->cpumask.
>>
>> I guess we'd really want:
>>
>> e005655c389e3d25bf3e43f71611ec12f3012de0
>> "blk-mq: only select online CPUs in blk_mq_hctx_next_cpu"
>>
>> before this commit, but it seems like the whole stack didn't work for
>> you either.
>>
>> I wonder if there is some weird thing about nr_cpu_ids in s390?
>>
> 
> I tried this on my system and the blk-mq-hotplug-fix branch does not boot for
> me either.
> The disks get up and running and I/O works fine. At least the partition
> detection and EXT4-fs mount work.
> 
> But at some point in time the disks do not get any requests.
> 
> I currently have no clue why.
> I took a dump and had a look at the disk states and they are fine. No error
> in the logs or in our debug entries. Just empty DASD devices waiting to be
> called for I/O requests.
> 
> Do you have anything I could have a look at?

Jens, Christoph, so what do we do about this?
To summarize:
- commit 4b855ad37194f7 ("blk-mq: Create hctx for each present CPU") broke CPU 
hotplug.
- Jens' quick revert did fix the issue and did not break DASD support, but it has
some issues with interrupt affinity.
- Christoph's patch set fixes the hotplug issue for virtio-blk but causes I/O
hangs on DASDs (even without hotplug).

Christian



Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-12-18 Thread Stefan Haberland

On 07.12.2017 00:29, Christoph Hellwig wrote:

On Wed, Dec 06, 2017 at 01:25:11PM +0100, Christian Borntraeger wrote:
commit 11b2025c3326f7096ceb588c3117c7883850c068 -> bad

 blk-mq: create a blk_mq_ctx for each possible CPU
does not boot on DASD and
commit 9c6ae239e01ae9a9f8657f05c55c4372e9fc8bcc-> good
genirq/affinity: assign vectors to all possible CPUs
does boot with DASD disks.

Also adding Stefan Haberland if he has an idea why this fails on DASD and 
adding Martin (for the
s390 irq handling code).

That is interesting as it really isn't related to interrupts at all,
it just ensures that possible CPUs are set in ->cpumask.

I guess we'd really want:

e005655c389e3d25bf3e43f71611ec12f3012de0
"blk-mq: only select online CPUs in blk_mq_hctx_next_cpu"

before this commit, but it seems like the whole stack didn't work for
you either.

I wonder if there is some weird thing about nr_cpu_ids in s390?



I tried this on my system and the blk-mq-hotplug-fix branch does not
boot for me either.
The disks get up and running and I/O works fine. At least the partition
detection and EXT4-fs mount work.


But at some point in time the disks do not get any requests.

I currently have no clue why.
I took a dump and had a look at the disk states and they are fine. No
error in the logs or in our debug entries. Just empty DASD devices
waiting to be called for I/O requests.


Do you have anything I could have a look at?



Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-12-14 Thread Christian Borntraeger
Independent of the issues with the DASD disks, this also seems to not enable
additional hardware queues.

With CPUs 0,1 (and 248 CPUs max)
I get CPUs 0 and 2-247 attached to hardware context 0 and I get
CPU 1 for hardware context 1.

If I now add a CPU this does not change anything. Hardware contexts 2, 3, 4,
etc. all have no CPU, and hardware context 0 keeps sitting on all CPUs (except 1).
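For illustration only, here is a minimal, hypothetical userspace C sketch of a
possible-CPU based mapping. It is not the kernel's actual blk_mq_map_queues()
logic, just a toy round-robin assignment, but it reproduces the same kind of
symptom: with only CPUs 0 and 1 online out of 248 possible, most hardware
contexts end up with no online CPU in their mask.

#include <stdio.h>
#include <stdbool.h>

#define NR_POSSIBLE_CPUS 248    /* as in the report: 248 possible CPUs */
#define NR_HW_QUEUES       4    /* hypothetical number of hardware contexts */

int main(void)
{
    bool online[NR_POSSIBLE_CPUS] = { [0] = true, [1] = true }; /* only CPUs 0 and 1 online */
    int online_cpus_per_hctx[NR_HW_QUEUES] = { 0 };

    /* Toy mapping: spread all *possible* CPUs round-robin over the hardware
     * contexts, whether or not they are (or ever will be) online. */
    for (int cpu = 0; cpu < NR_POSSIBLE_CPUS; cpu++) {
        int hctx = cpu % NR_HW_QUEUES;

        if (online[cpu])
            online_cpus_per_hctx[hctx]++;
    }

    /* A context with 0 online CPUs has nobody left to run its queue work. */
    for (int hctx = 0; hctx < NR_HW_QUEUES; hctx++)
        printf("hctx %d: %d online CPU(s)\n", hctx, online_cpus_per_hctx[hctx]);

    return 0;
}

The real mapping differs in detail, but the failure mode being chased here is
of the same kind: hardware contexts whose cpumask does not intersect the set
of online CPUs.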




On 12/07/2017 10:20 AM, Christian Borntraeger wrote:
> 
> 
> On 12/07/2017 12:29 AM, Christoph Hellwig wrote:
>> On Wed, Dec 06, 2017 at 01:25:11PM +0100, Christian Borntraeger wrote:
>> commit 11b2025c3326f7096ceb588c3117c7883850c068 -> bad
>>> blk-mq: create a blk_mq_ctx for each possible CPU
>>> does not boot on DASD and 
>>> commit 9c6ae239e01ae9a9f8657f05c55c4372e9fc8bcc-> good
>>>genirq/affinity: assign vectors to all possible CPUs
>>> does boot with DASD disks.
>>>
>>> Also adding Stefan Haberland if he has an idea why this fails on DASD and 
>>> adding Martin (for the
>>> s390 irq handling code).
>>
>> That is interesting as it really isn't related to interrupts at all,
>> it just ensures that possible CPUs are set in ->cpumask.
>>
>> I guess we'd really want:
>>
>> e005655c389e3d25bf3e43f71611ec12f3012de0
>> "blk-mq: only select online CPUs in blk_mq_hctx_next_cpu"
>>
>> before this commit, but it seems like the whole stack didn't work for
>> you either.
>>
>> I wonder if there is some weird thing about nr_cpu_ids in s390?
> 
> The problem starts as soon as NR_CPUS is larger than the number
> of real CPUs.
> 
> A question: wouldn't your change in blk_mq_hctx_next_cpu fail if there is more
> than 1 non-online CPU:
> 
> e.g. don't we need something like (whitespace and indent damaged)
> 
> @@ -1241,11 +1241,11 @@ static int blk_mq_hctx_next_cpu(struct blk_mq_hw_ctx *hctx)
>  	if (--hctx->next_cpu_batch <= 0) {
>  		int next_cpu;
>  
> +		do {
>  		next_cpu = cpumask_next(hctx->next_cpu, hctx->cpumask);
> -		if (!cpu_online(next_cpu))
> -			next_cpu = cpumask_next(next_cpu, hctx->cpumask);
>  		if (next_cpu >= nr_cpu_ids)
>  			next_cpu = cpumask_first(hctx->cpumask);
> +		} while (!cpu_online(next_cpu));
>  
>  		hctx->next_cpu = next_cpu;
>  		hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
> 
> it does not fix the issue, though (and it would be pretty inefficient for 
> large NR_CPUS)
> 
> 



Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-12-07 Thread Christian Borntraeger


On 12/07/2017 12:29 AM, Christoph Hellwig wrote:
> On Wed, Dec 06, 2017 at 01:25:11PM +0100, Christian Borntraeger wrote:
> commit 11b2025c3326f7096ceb588c3117c7883850c068 -> bad
>> blk-mq: create a blk_mq_ctx for each possible CPU
>> does not boot on DASD and 
>> commit 9c6ae239e01ae9a9f8657f05c55c4372e9fc8bcc-> good
>>genirq/affinity: assign vectors to all possible CPUs
>> does boot with DASD disks.
>>
>> Also adding Stefan Haberland if he has an idea why this fails on DASD and 
>> adding Martin (for the
>> s390 irq handling code).
> 
> That is interesting as it really isn't related to interrupts at all,
> it just ensures that possible CPUs are set in ->cpumask.
> 
> I guess we'd really want:
> 
> e005655c389e3d25bf3e43f71611ec12f3012de0
> "blk-mq: only select online CPUs in blk_mq_hctx_next_cpu"
> 
> before this commit, but it seems like the whole stack didn't work for
> you either.
> 
> I wonder if there is some weird thing about nr_cpu_ids in s390?

The problem starts as soon as NR_CPUS is larger than the number
of real CPUs.

A question: wouldn't your change in blk_mq_hctx_next_cpu fail if there is more
than 1 non-online CPU:

e.g. don't we need something like (whitespace and indent damaged)

@@ -1241,11 +1241,11 @@ static int blk_mq_hctx_next_cpu(struct blk_mq_hw_ctx *hctx)
 	if (--hctx->next_cpu_batch <= 0) {
 		int next_cpu;
 
+		do {
 		next_cpu = cpumask_next(hctx->next_cpu, hctx->cpumask);
-		if (!cpu_online(next_cpu))
-			next_cpu = cpumask_next(next_cpu, hctx->cpumask);
 		if (next_cpu >= nr_cpu_ids)
 			next_cpu = cpumask_first(hctx->cpumask);
+		} while (!cpu_online(next_cpu));
 
 		hctx->next_cpu = next_cpu;
 		hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;

it does not fix the issue, though (and it would be pretty inefficient for large 
NR_CPUS)
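As an aside, the two concerns above can be made concrete with a small,
self-contained userspace sketch (hypothetical names, plain arrays instead of
struct cpumask, not the kernel code): a round-robin pick of the next online CPU
has to bail out after one full pass when nothing in the mask is online,
otherwise a do/while like the one above never terminates, and each pick can
scan up to NR_CPUS entries.

#include <stdio.h>
#include <stdbool.h>

#define NR_CPUS 248

/* Toy stand-ins for hctx->cpumask and cpu_online_mask. */
static bool in_hctx_mask[NR_CPUS];
static bool cpu_is_online[NR_CPUS];

/* Round-robin to the next online CPU in the mask, starting after 'prev' and
 * wrapping around, but give up after one full pass instead of spinning when
 * no CPU in the mask is online. */
static int next_online_cpu(int prev)
{
    for (int step = 1; step <= NR_CPUS; step++) {
        int cpu = (prev + step) % NR_CPUS;

        if (in_hctx_mask[cpu] && cpu_is_online[cpu])
            return cpu;
    }
    return -1; /* no online CPU in this hctx's mask */
}

int main(void)
{
    /* Mirror the report: all 248 possible CPUs in the mask, only 0 and 1 online. */
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        in_hctx_mask[cpu] = true;
    cpu_is_online[0] = cpu_is_online[1] = true;

    int cpu = 0;
    for (int i = 0; i < 4; i++) {
        cpu = next_online_cpu(cpu);
        printf("dispatch %d would run on CPU %d\n", i, cpu);
    }
    return 0;
}

With only CPUs 0 and 1 online, every other pick walks almost the entire mask
before wrapping around, which is exactly the inefficiency noted above for
large NR_CPUS.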




Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-12-06 Thread Christoph Hellwig
On Wed, Dec 06, 2017 at 01:25:11PM +0100, Christian Borntraeger wrote:
> commit 11b2025c3326f7096ceb588c3117c7883850c068 -> bad
> blk-mq: create a blk_mq_ctx for each possible CPU
> does not boot on DASD and 
> commit 9c6ae239e01ae9a9f8657f05c55c4372e9fc8bcc -> good
>    genirq/affinity: assign vectors to all possible CPUs
> does boot with DASD disks.
> 
> Also adding Stefan Haberland if he has an idea why this fails on DASD and 
> adding Martin (for the
> s390 irq handling code).

That is interesting as it really isn't related to interrupts at all,
it just ensures that possible CPUs are set in ->cpumask.

I guess we'd really want:

e005655c389e3d25bf3e43f71611ec12f3012de0
"blk-mq: only select online CPUs in blk_mq_hctx_next_cpu"

before this commit, but it seems like the whole stack didn't work for
you either.

I wonder if there is some weird thing about nr_cpu_ids in s390?


Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-12-06 Thread Christian Borntraeger
On 12/04/2017 05:21 PM, Christoph Hellwig wrote:
> On Wed, Nov 29, 2017 at 08:18:09PM +0100, Christian Borntraeger wrote:
>> Works fine under KVM with virtio-blk, but still hangs during boot in an LPAR.
>> FWIW, the system not only has scsi disks via fcp but also DASDs as a boot 
>> disk.
>> Seems that this is the place where the system stops. (see the sysrq-t output
>> at the bottom).
> 
> Can you check which of the patches in the tree is the culprit?


From this branch

git://git.infradead.org/users/hch/block.git blk-mq-hotplug-fix

commit 11b2025c3326f7096ceb588c3117c7883850c068 -> bad
blk-mq: create a blk_mq_ctx for each possible CPU
does not boot on DASD and 
commit 9c6ae239e01ae9a9f8657f05c55c4372e9fc8bcc -> good
   genirq/affinity: assign vectors to all possible CPUs
does boot with DASD disks.

Also adding Stefan Haberland if he has an idea why this fails on DASD and 
adding Martin (for the
s390 irq handling code).


Some history:
I got this warning
"WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 
stable)"
since 4.13 (and also in 4.12 stable)
on CPU hotplug of previously unavailable CPUs (real hotplug, no offline/online)

This was introduced with 

 blk-mq: Create hctx for each present CPU
commit 4b855ad37194f7bdbb200ce7a1c7051fecb56a08 

And Christoph is currently working on a fix. The fixed kernel does boot with
virtio-blk and it fixes the warning, but it hangs (outstanding I/O) with DASD disks.



Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-12-04 Thread Christoph Hellwig
On Wed, Nov 29, 2017 at 08:18:09PM +0100, Christian Borntraeger wrote:
> Works fine under KVM with virtio-blk, but still hangs during boot in an LPAR.
> FWIW, the system not only has scsi disks via fcp but also DASDs as a boot 
> disk.
> Seems that this is the place where the system stops. (see the sysrq-t output
> at the bottom).

Can you check which of the patches in the tree is the culprit?


Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-29 Thread Christian Borntraeger

On 11/29/2017 08:18 PM, Christian Borntraeger wrote:
> Works fine under KVM with virtio-blk, but still hangs during boot in an LPAR.
> FWIW, the system not only has scsi disks via fcp but also DASDs as a boot 
> disk.
> Seems that this is the place where the system stops. (see the sysrq-t output
> at the bottom).

FWIW, the failing kernel had CONFIG_NR_CPUS=256 and 32 CPUs (with SMT2) == 64
threads.
With CONFIG_NR_CPUS=16 the system booted fine.



Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-29 Thread Christian Borntraeger
Works fine under KVM with virtio-blk, but still hangs during boot in an LPAR.
FWIW, the system not only has scsi disks via fcp but also DASDs as a boot disk.
Seems that this is the place where the system stops. (see the sysrq-t output
at the bottom).


Message
[    0.247484] Linux version 4.15.0-rc1+ (cborntra@s38lp08) (gcc version 6.3.1 20161221 (Red Hat 6.3.1-1.0.ibm) (GCC)) #229 SMP Wed Nov 29 20:05:35 CET 2017
[    0.247489] setup: Linux is running natively in 64-bit mode
[    0.247661] setup: The maximum memory size is 1048576MB
[    0.247670] setup: Reserving 1024MB of memory at 1047552MB for crashkernel (System RAM: 1047552MB)
[    0.247688] numa: NUMA mode: plain
[    0.247794] cpu: 64 configured CPUs, 0 standby CPUs
[    0.247834] cpu: The CPU configuration topology of the machine is: 0 0 4 2 3 8 / 4
[    0.248279] Write protected kernel read-only data: 12456k
[    0.265131] Zone ranges:
[    0.265134]   DMA  [mem 0x-0x7fff]
[    0.265136]   Normal   [mem 0x8000-0x00ff]
[    0.265137] Movable zone start for each node
[    0.265138] Early memory node ranges
[    0.265139]   node   0: [mem 0x-0x00ff]
[    0.265141] Initmem setup node 0 [mem 0x-0x00ff]
[    7.445561] random: fast init done
[    7.449194] percpu: Embedded 23 pages/cpu @00fbbe60 s56064 r8192 d29952 u94208
[    7.449380] Built 1 zonelists, mobility grouping on.  Total pages: 264241152
[    7.449381] Policy zone: Normal
[    7.449384] Kernel command line: elevator=deadline audit_enable=0 audit=0 audit_debug=0 selinux=0 crashkernel=1024M printk.time=1 zfcp.dbfsize=100 dasd=241c,241d,241e,241f root=/dev/dasda1 kvm.nested=1  BOOT_IMAGE=0
[    7.449420] audit: disabled (until reboot)
[    7.450513] log_buf_len individual max cpu contribution: 4096 bytes
[    7.450514] log_buf_len total cpu_extra contributions: 1044480 bytes
[    7.450515] log_buf_len min size: 131072 bytes
[    7.450788] log_buf_len: 2097152 bytes
[    7.450789] early log buf free: 125076(95%)
[   11.040620] Memory: 1055873868K/1073741824K available (8248K kernel code, 1078K rwdata, 4204K rodata, 812K init, 700K bss, 17867956K reserved, 0K cma-reserved)
[   11.040938] SLUB: HWalign=256, Order=0-3, MinObjects=0, CPUs=256, Nodes=1
[   11.040969] ftrace: allocating 26506 entries in 104 pages
[   11.051476] Hierarchical RCU implementation.
[   11.051476]  RCU event tracing is enabled.
[   11.051478]  RCU debug extended QS entry/exit.
[   11.053263] NR_IRQS: 3, nr_irqs: 3, preallocated irqs: 3
[   11.053444] clocksource: tod: mask: 0x max_cycles: 0x3b0a9be803b0a9, max_idle_ns: 1805497147909793 ns
[   11.160192] console [ttyS0] enabled
[   11.308228] pid_max: default: 262144 minimum: 2048
[   11.308298] Security Framework initialized
[   11.308300] SELinux:  Disabled at boot.
[   11.354028] Dentry cache hash table entries: 33554432 (order: 16, 268435456 bytes)
[   11.376945] Inode-cache hash table entries: 16777216 (order: 15, 134217728 bytes)
[   11.377685] Mount-cache hash table entries: 524288 (order: 10, 4194304 bytes)
[   11.378401] Mountpoint-cache hash table entries: 524288 (order: 10, 4194304 bytes)
[   11.378984] Hierarchical SRCU implementation.
[   11.380032] smp: Bringing up secondary CPUs ...
[   11.393634] smp: Brought up 1 node, 64 CPUs
[   11.585458] devtmpfs: initialized
[   11.588589] clocksource: jiffies: mask: 0x max_cycles: 0x, max_idle_ns: 1911260446275 ns
[   11.588998] futex hash table entries: 65536 (order: 12, 16777216 bytes)
[   11.591926] NET: Registered protocol family 16
[   11.596413] HugeTLB registered 1.00 MiB page size, pre-allocated 0 pages
[   11.597604] SCSI subsystem initialized
[   11.597611] pps_core: LinuxPPS API ver. 1 registered
[   11.597612] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti
[   11.597614] PTP clock support registered
[   11.599088] NetLabel: Initializing
[   11.599089] NetLabel:  domain hash size = 128
[   11.599090] NetLabel:  protocols = UNLABELED CIPSOv4 CALIPSO
[   11.599101] NetLabel:  unlabeled traffic allowed by default
[   11.612542] PCI host bridge to bus :00
[   11.612546] pci_bus :00: root bus resource [mem 0x8000-0x8000007f 64bit pref]
[   11.612548] pci_bus :00: No busn resource found for root bus, will use [bus 00-ff]
[   11.616458] iommu: Adding device :00:00.0 to group 0
[   12.291894] VFS: Disk quotas dquot_6.6.0
[   12.291942] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[   12.292226] NET: Registered protocol family 2
[   12.292662] TCP established hash table entries: 524288 (order:

Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-29 Thread Christian Borntraeger
Works fine under KVM with virtio-blk, but still hangs during boot in an LPAR.
FWIW, the system not only has scsi disks via fcp but also DASDs as a boot disk.
Seems that this is the place where the system stops. (see the sysrq-t output
at the bottom).


Message
[0.247484] Linux version 4.15.0-rc1+ (cborntra@s38lp08) (gcc version 6.3.1 20161221 (Red Hat 6.3.1-1.0.ibm) (GCC)) #229 SMP Wed Nov 29 20:05:35 CET 2017
[0.247489] setup: Linux is running natively in 64-bit mode
[0.247661] setup: The maximum memory size is 1048576MB
[0.247670] setup: Reserving 1024MB of memory at 1047552MB for crashkernel (System RAM: 1047552MB)
[0.247688] numa: NUMA mode: plain
[0.247794] cpu: 64 configured CPUs, 0 standby CPUs
[0.247834] cpu: The CPU configuration topology of the machine is: 0 0 4 2 3 8 / 4
[0.248279] Write protected kernel read-only data: 12456k
[0.265131] Zone ranges:
[0.265134]   DMA  [mem 0x-0x7fff]
[0.265136]   Normal   [mem 0x8000-0x00ff]
[0.265137] Movable zone start for each node
[0.265138] Early memory node ranges
[0.265139]   node   0: [mem 0x-0x00ff]
[0.265141] Initmem setup node 0 [mem 0x-0x00ff]
[7.445561] random: fast init done
[7.449194] percpu: Embedded 23 pages/cpu @00fbbe60 s56064 r8192 d29952 u94208
[7.449380] Built 1 zonelists, mobility grouping on.  Total pages: 264241152
[7.449381] Policy zone: Normal
[7.449384] Kernel command line: elevator=deadline audit_enable=0 audit=0 audit_debug=0 selinux=0 crashkernel=1024M printk.time=1 zfcp.dbfsize=100 dasd=241c,241d,241e,241f root=/dev/dasda1 kvm.nested=1  BOOT_IMAGE=0
[7.449420] audit: disabled (until reboot)
[7.450513] log_buf_len individual max cpu contribution: 4096 bytes
[7.450514] log_buf_len total cpu_extra contributions: 1044480 bytes
[7.450515] log_buf_len min size: 131072 bytes
[7.450788] log_buf_len: 2097152 bytes
[7.450789] early log buf free: 125076(95%)
[   11.040620] Memory: 1055873868K/1073741824K available (8248K kernel code, 1078K rwdata, 4204K rodata, 812K init, 700K bss, 17867956K reserved, 0K cma-reserved)
[   11.040938] SLUB: HWalign=256, Order=0-3, MinObjects=0, CPUs=256, Nodes=1
[   11.040969] ftrace: allocating 26506 entries in 104 pages
[   11.051476] Hierarchical RCU implementation.
[   11.051476]  RCU event tracing is enabled.
[   11.051478]  RCU debug extended QS entry/exit.
[   11.053263] NR_IRQS: 3, nr_irqs: 3, preallocated irqs: 3
[   11.053444] clocksource: tod: mask: 0x max_cycles: 0x3b0a9be803b0a9, max_idle_ns: 1805497147909793 ns
[   11.160192] console [ttyS0] enabled
[   11.308228] pid_max: default: 262144 minimum: 2048
[   11.308298] Security Framework initialized
[   11.308300] SELinux:  Disabled at boot.
[   11.354028] Dentry cache hash table entries: 33554432 (order: 16, 268435456 bytes)
[   11.376945] Inode-cache hash table entries: 16777216 (order: 15, 134217728 bytes)
[   11.377685] Mount-cache hash table entries: 524288 (order: 10, 4194304 bytes)
[   11.378401] Mountpoint-cache hash table entries: 524288 (order: 10, 4194304 bytes)
[   11.378984] Hierarchical SRCU implementation.
[   11.380032] smp: Bringing up secondary CPUs ...
[   11.393634] smp: Brought up 1 node, 64 CPUs
[   11.585458] devtmpfs: initialized
[   11.588589] clocksource: jiffies: mask: 0x max_cycles: 0x, max_idle_ns: 1911260446275 ns
[   11.588998] futex hash table entries: 65536 (order: 12, 16777216 bytes)
[   11.591926] NET: Registered protocol family 16
[   11.596413] HugeTLB registered 1.00 MiB page size, pre-allocated 0 pages
[   11.597604] SCSI subsystem initialized
[   11.597611] pps_core: LinuxPPS API ver. 1 registered
[   11.597612] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti
[   11.597614] PTP clock support registered
[   11.599088] NetLabel: Initializing
[   11.599089] NetLabel:  domain hash size = 128
[   11.599090] NetLabel:  protocols = UNLABELED CIPSOv4 CALIPSO
[   11.599101] NetLabel:  unlabeled traffic allowed by default
[   11.612542] PCI host bridge to bus :00
[   11.612546] pci_bus :00: root bus resource [mem 0x8000-0x8000007f 64bit pref]
[   11.612548] pci_bus :00: No busn resource found for root bus, will use [bus 00-ff]
[   11.616458] iommu: Adding device :00:00.0 to group 0
[   12.291894] VFS: Disk quotas dquot_6.6.0
[   12.291942] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[   12.292226] NET: Registered protocol family 2
[   12.292662] TCP established hash table entries: 524288 (order: 10, 4194304 bytes)

Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-27 Thread Christoph Hellwig
Can you try this git branch:

git://git.infradead.org/users/hch/block.git blk-mq-hotplug-fix

Gitweb:

 
http://git.infradead.org/users/hch/block.git/shortlog/refs/heads/blk-mq-hotplug-fix


Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-24 Thread Christian Borntraeger


On 11/23/2017 07:59 PM, Christian Borntraeger wrote:
> 
> 
> On 11/23/2017 07:32 PM, Christoph Hellwig wrote:
>> On Thu, Nov 23, 2017 at 07:28:31PM +0100, Christian Borntraeger wrote:
>>> zfcp on s390.
>>
>> Ok, so it can't be the interrupt code, but probably is the blk-mq-cpumap.c
>> changes.  Can you try to revert just those for a quick test?
> 
> 
> Hmm, I get further in boot, but the system seems very sluggish and it does not
> seem to be able to access the scsi disks (get data from them)
> 

FWIW, just having the changes in irq_affinity.c is indeed fine.



Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-23 Thread Christian Borntraeger


On 11/23/2017 07:32 PM, Christoph Hellwig wrote:
> On Thu, Nov 23, 2017 at 07:28:31PM +0100, Christian Borntraeger wrote:
>> zfcp on s390.
> 
> Ok, so it can't be the interrupt code, but probably is the blk-mq-cpumap.c
> changes.  Can you try to revert just those for a quick test?


Hmm, I get further in the boot, but the system seems very sluggish and it does
not seem to be able to access the SCSI disks (read data from them).

 



Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-23 Thread Christoph Hellwig
On Thu, Nov 23, 2017 at 07:28:31PM +0100, Christian Borntraeger wrote:
> zfcp on s390.

Ok, so it can't be the interrupt code, but probably is the blk-mq-cpumap.c
changes.  Can you try to revert just those for a quick test?


Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-23 Thread Christian Borntraeger
zfcp on s390.

On 11/23/2017 07:25 PM, Christoph Hellwig wrote:
> What HBA driver do you use in the host?
> 



Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-23 Thread Christoph Hellwig
What HBA driver do you use in the host?


Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-23 Thread Christian Borntraeger


On 11/23/2017 03:34 PM, Christoph Hellwig wrote:
> FYI, the patch below changes both the irq and block mappings to
> always use the cpu possible map (should be split in two in due time).
> 
> I think this is the right way forward.  For every normal machine
> those two are the same, but for VMs with maxcpus above their normal
> count or some big iron that can grow more cpus it means we waster
> a few more resources for the not present but reserved cpus.  It
> fixes the reported issue for me:


While it fixes the hotplug issue under KVM, the same kernel no longer boots in
the host; it seems stuck early in boot, just before detecting the SCSI disks.
I have not yet looked into that.

Christian
> 
> diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c
> index 9f8cffc8a701..3eb169f15842 100644
> --- a/block/blk-mq-cpumap.c
> +++ b/block/blk-mq-cpumap.c
> @@ -16,11 +16,6 @@
> 
>  static int cpu_to_queue_index(unsigned int nr_queues, const int cpu)
>  {
> - /*
> -  * Non present CPU will be mapped to queue index 0.
> -  */
> - if (!cpu_present(cpu))
> - return 0;
>   return cpu % nr_queues;
>  }
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 11097477eeab..612ce1fb7c4e 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2114,16 +2114,11 @@ static void blk_mq_init_cpu_queues(struct 
> request_queue *q,
>   INIT_LIST_HEAD(&__ctx->rq_list);
>   __ctx->queue = q;
> 
> - /* If the cpu isn't present, the cpu is mapped to first hctx */
> - if (!cpu_present(i))
> - continue;
> -
> - hctx = blk_mq_map_queue(q, i);
> -
>   /*
>* Set local node, IFF we have more than one hw queue. If
>* not, we remain on the home node of the device
>*/
> + hctx = blk_mq_map_queue(q, i);
>   if (nr_hw_queues > 1 && hctx->numa_node == NUMA_NO_NODE)
>   hctx->numa_node = local_memory_node(cpu_to_node(i));
>   }
> @@ -2180,7 +2175,7 @@ static void blk_mq_map_swqueue(struct request_queue *q)
>*
>* If the cpu isn't present, the cpu is mapped to first hctx.
>*/
> - for_each_present_cpu(i) {
> + for_each_possible_cpu(i) {
>   hctx_idx = q->mq_map[i];
>   /* unmapped hw queue can be remapped after CPU topo changed */
>   if (!set->tags[hctx_idx] &&
> diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
> index e12d35108225..a37a3b4b6342 100644
> --- a/kernel/irq/affinity.c
> +++ b/kernel/irq/affinity.c
> @@ -39,7 +39,7 @@ static void irq_spread_init_one(struct cpumask *irqmsk, 
> struct cpumask *nmsk,
>   }
>  }
> 
> -static cpumask_var_t *alloc_node_to_present_cpumask(void)
> +static cpumask_var_t *alloc_node_to_possible_cpumask(void)
>  {
>   cpumask_var_t *masks;
>   int node;
> @@ -62,7 +62,7 @@ static cpumask_var_t *alloc_node_to_present_cpumask(void)
>   return NULL;
>  }
> 
> -static void free_node_to_present_cpumask(cpumask_var_t *masks)
> +static void free_node_to_possible_cpumask(cpumask_var_t *masks)
>  {
>   int node;
> 
> @@ -71,22 +71,22 @@ static void free_node_to_present_cpumask(cpumask_var_t 
> *masks)
>   kfree(masks);
>  }
> 
> -static void build_node_to_present_cpumask(cpumask_var_t *masks)
> +static void build_node_to_possible_cpumask(cpumask_var_t *masks)
>  {
>   int cpu;
> 
> - for_each_present_cpu(cpu)
> + for_each_possible_cpu(cpu)
>   cpumask_set_cpu(cpu, masks[cpu_to_node(cpu)]);
>  }
> 
> -static int get_nodes_in_cpumask(cpumask_var_t *node_to_present_cpumask,
> +static int get_nodes_in_cpumask(cpumask_var_t *node_to_possible_cpumask,
>   const struct cpumask *mask, nodemask_t *nodemsk)
>  {
>   int n, nodes = 0;
> 
>   /* Calculate the number of nodes in the supplied affinity mask */
>   for_each_node(n) {
> - if (cpumask_intersects(mask, node_to_present_cpumask[n])) {
> + if (cpumask_intersects(mask, node_to_possible_cpumask[n])) {
>   node_set(n, *nodemsk);
>   nodes++;
>   }
> @@ -109,7 +109,7 @@ irq_create_affinity_masks(int nvecs, const struct 
> irq_affinity *affd)
>   int last_affv = affv + affd->pre_vectors;
>   nodemask_t nodemsk = NODE_MASK_NONE;
>   struct cpumask *masks;
> - cpumask_var_t nmsk, *node_to_present_cpumask;
> + cpumask_var_t nmsk, *node_to_possible_cpumask;
> 
>   /*
>* If there aren't any vectors left after applying the pre/post
> @@ -125,8 +125,8 @@ irq_create_affinity_masks(int nvecs, const struct 
> irq_affinity *affd)
>   if (!masks)
>   goto out;
> 
> - node_to_present_cpumask = alloc_node_to_present_cpumask();
> - if (!node_to_present_cpumask)
> + node_to_possible_cpumask = alloc_node_to_possible_cpumask();
> + if (!node_to_possible_cpumask)
>   

Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-23 Thread Christian Borntraeger
Yes it seems to fix the bug.

On 11/23/2017 03:34 PM, Christoph Hellwig wrote:
> FYI, the patch below changes both the irq and block mappings to
> always use the cpu possible map (should be split in two in due time).
> 
> I think this is the right way forward.  For every normal machine
> those two are the same, but for VMs with maxcpus above their normal
> count or some big iron that can grow more cpus it means we waste
> a few more resources for the not present but reserved cpus.  It
> fixes the reported issue for me:
> 
> diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c
> index 9f8cffc8a701..3eb169f15842 100644
> --- a/block/blk-mq-cpumap.c
> +++ b/block/blk-mq-cpumap.c
> @@ -16,11 +16,6 @@
> 
>  static int cpu_to_queue_index(unsigned int nr_queues, const int cpu)
>  {
> - /*
> -  * Non present CPU will be mapped to queue index 0.
> -  */
> - if (!cpu_present(cpu))
> - return 0;
>   return cpu % nr_queues;
>  }
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 11097477eeab..612ce1fb7c4e 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2114,16 +2114,11 @@ static void blk_mq_init_cpu_queues(struct 
> request_queue *q,
>   INIT_LIST_HEAD(&__ctx->rq_list);
>   __ctx->queue = q;
> 
> - /* If the cpu isn't present, the cpu is mapped to first hctx */
> - if (!cpu_present(i))
> - continue;
> -
> - hctx = blk_mq_map_queue(q, i);
> -
>   /*
>* Set local node, IFF we have more than one hw queue. If
>* not, we remain on the home node of the device
>*/
> + hctx = blk_mq_map_queue(q, i);
>   if (nr_hw_queues > 1 && hctx->numa_node == NUMA_NO_NODE)
>   hctx->numa_node = local_memory_node(cpu_to_node(i));
>   }
> @@ -2180,7 +2175,7 @@ static void blk_mq_map_swqueue(struct request_queue *q)
>*
>* If the cpu isn't present, the cpu is mapped to first hctx.
>*/
> - for_each_present_cpu(i) {
> + for_each_possible_cpu(i) {
>   hctx_idx = q->mq_map[i];
>   /* unmapped hw queue can be remapped after CPU topo changed */
>   if (!set->tags[hctx_idx] &&
> diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
> index e12d35108225..a37a3b4b6342 100644
> --- a/kernel/irq/affinity.c
> +++ b/kernel/irq/affinity.c
> @@ -39,7 +39,7 @@ static void irq_spread_init_one(struct cpumask *irqmsk, 
> struct cpumask *nmsk,
>   }
>  }
> 
> -static cpumask_var_t *alloc_node_to_present_cpumask(void)
> +static cpumask_var_t *alloc_node_to_possible_cpumask(void)
>  {
>   cpumask_var_t *masks;
>   int node;
> @@ -62,7 +62,7 @@ static cpumask_var_t *alloc_node_to_present_cpumask(void)
>   return NULL;
>  }
> 
> -static void free_node_to_present_cpumask(cpumask_var_t *masks)
> +static void free_node_to_possible_cpumask(cpumask_var_t *masks)
>  {
>   int node;
> 
> @@ -71,22 +71,22 @@ static void free_node_to_present_cpumask(cpumask_var_t 
> *masks)
>   kfree(masks);
>  }
> 
> -static void build_node_to_present_cpumask(cpumask_var_t *masks)
> +static void build_node_to_possible_cpumask(cpumask_var_t *masks)
>  {
>   int cpu;
> 
> - for_each_present_cpu(cpu)
> + for_each_possible_cpu(cpu)
>   cpumask_set_cpu(cpu, masks[cpu_to_node(cpu)]);
>  }
> 
> -static int get_nodes_in_cpumask(cpumask_var_t *node_to_present_cpumask,
> +static int get_nodes_in_cpumask(cpumask_var_t *node_to_possible_cpumask,
>   const struct cpumask *mask, nodemask_t *nodemsk)
>  {
>   int n, nodes = 0;
> 
>   /* Calculate the number of nodes in the supplied affinity mask */
>   for_each_node(n) {
> - if (cpumask_intersects(mask, node_to_present_cpumask[n])) {
> + if (cpumask_intersects(mask, node_to_possible_cpumask[n])) {
>   node_set(n, *nodemsk);
>   nodes++;
>   }
> @@ -109,7 +109,7 @@ irq_create_affinity_masks(int nvecs, const struct 
> irq_affinity *affd)
>   int last_affv = affv + affd->pre_vectors;
>   nodemask_t nodemsk = NODE_MASK_NONE;
>   struct cpumask *masks;
> - cpumask_var_t nmsk, *node_to_present_cpumask;
> + cpumask_var_t nmsk, *node_to_possible_cpumask;
> 
>   /*
>* If there aren't any vectors left after applying the pre/post
> @@ -125,8 +125,8 @@ irq_create_affinity_masks(int nvecs, const struct 
> irq_affinity *affd)
>   if (!masks)
>   goto out;
> 
> - node_to_present_cpumask = alloc_node_to_present_cpumask();
> - if (!node_to_present_cpumask)
> + node_to_possible_cpumask = alloc_node_to_possible_cpumask();
> + if (!node_to_possible_cpumask)
>   goto out;
> 
>   /* Fill out vectors at the beginning that don't need affinity */
> @@ -135,8 +135,8 @@ irq_create_affinity_masks(int nvecs, const struct 
> 

Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-23 Thread Christoph Hellwig
[fullquote deleted]

> What will happen for the CPU hotplug case?
> Wouldn't we route I/O to a disabled CPU with this patch?

Why would we route I/O to a disabled CPU? (We generally route
I/O to devices to start with.)  How would including possible
but not present CPUs change anything?
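
To make that concrete: the hardware-queue cpumask may contain possible-but-offline
CPUs, but the CPU a queue is actually run on is picked from the intersection of
that mask and the online mask, so nothing is dispatched to a disabled CPU.  A
minimal runnable userspace sketch of that selection (toy masks and made-up names,
not the kernel's own code):

#include <stdio.h>

#define NR_CPUS 8

/* stand-in for cpumask_first_and(): first CPU set in both masks */
static int first_online_mapped_cpu(unsigned int hctx_mask, unsigned int online_mask)
{
	unsigned int both = hctx_mask & online_mask;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (both & (1u << cpu))
			return cpu;
	return -1;	/* no online CPU is mapped to this queue */
}

int main(void)
{
	/* hctx is mapped to CPUs 2-5, but only CPUs 0-3 are online */
	unsigned int hctx_mask   = 0x3c;
	unsigned int online_mask = 0x0f;
	int cpu = first_online_mapped_cpu(hctx_mask, online_mask);

	if (cpu < 0)
		printf("no online CPU for this hctx, run it unbound\n");
	else
		printf("run the hw queue on CPU %d\n", cpu);	/* prints CPU 2 */
	return 0;
}
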


Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-23 Thread Hannes Reinecke
On 11/23/2017 03:34 PM, Christoph Hellwig wrote:
> FYI, the patch below changes both the irq and block mappings to
> always use the cpu possible map (should be split in two in due time).
> 
> I think this is the right way forward.  For every normal machine
> those two are the same, but for VMs with maxcpus above their normal
> count or some big iron that can grow more cpus it means we waste
> a few more resources for the not present but reserved cpus.  It
> fixes the reported issue for me:
> 
> diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c
> index 9f8cffc8a701..3eb169f15842 100644
> --- a/block/blk-mq-cpumap.c
> +++ b/block/blk-mq-cpumap.c
> @@ -16,11 +16,6 @@
>  
>  static int cpu_to_queue_index(unsigned int nr_queues, const int cpu)
>  {
> - /*
> -  * Non present CPU will be mapped to queue index 0.
> -  */
> - if (!cpu_present(cpu))
> - return 0;
>   return cpu % nr_queues;
>  }
>  
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 11097477eeab..612ce1fb7c4e 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2114,16 +2114,11 @@ static void blk_mq_init_cpu_queues(struct 
> request_queue *q,
>   INIT_LIST_HEAD(&__ctx->rq_list);
>   __ctx->queue = q;
>  
> - /* If the cpu isn't present, the cpu is mapped to first hctx */
> - if (!cpu_present(i))
> - continue;
> -
> - hctx = blk_mq_map_queue(q, i);
> -
>   /*
>* Set local node, IFF we have more than one hw queue. If
>* not, we remain on the home node of the device
>*/
> + hctx = blk_mq_map_queue(q, i);
>   if (nr_hw_queues > 1 && hctx->numa_node == NUMA_NO_NODE)
>   hctx->numa_node = local_memory_node(cpu_to_node(i));
>   }
> @@ -2180,7 +2175,7 @@ static void blk_mq_map_swqueue(struct request_queue *q)
>*
>* If the cpu isn't present, the cpu is mapped to first hctx.
>*/
> - for_each_present_cpu(i) {
> + for_each_possible_cpu(i) {
>   hctx_idx = q->mq_map[i];
>   /* unmapped hw queue can be remapped after CPU topo changed */
>   if (!set->tags[hctx_idx] &&
> diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
> index e12d35108225..a37a3b4b6342 100644
> --- a/kernel/irq/affinity.c
> +++ b/kernel/irq/affinity.c
> @@ -39,7 +39,7 @@ static void irq_spread_init_one(struct cpumask *irqmsk, 
> struct cpumask *nmsk,
>   }
>  }
>  
> -static cpumask_var_t *alloc_node_to_present_cpumask(void)
> +static cpumask_var_t *alloc_node_to_possible_cpumask(void)
>  {
>   cpumask_var_t *masks;
>   int node;
> @@ -62,7 +62,7 @@ static cpumask_var_t *alloc_node_to_present_cpumask(void)
>   return NULL;
>  }
>  
> -static void free_node_to_present_cpumask(cpumask_var_t *masks)
> +static void free_node_to_possible_cpumask(cpumask_var_t *masks)
>  {
>   int node;
>  
> @@ -71,22 +71,22 @@ static void free_node_to_present_cpumask(cpumask_var_t 
> *masks)
>   kfree(masks);
>  }
>  
> -static void build_node_to_present_cpumask(cpumask_var_t *masks)
> +static void build_node_to_possible_cpumask(cpumask_var_t *masks)
>  {
>   int cpu;
>  
> - for_each_present_cpu(cpu)
> + for_each_possible_cpu(cpu)
>   cpumask_set_cpu(cpu, masks[cpu_to_node(cpu)]);
>  }
>  
> -static int get_nodes_in_cpumask(cpumask_var_t *node_to_present_cpumask,
> +static int get_nodes_in_cpumask(cpumask_var_t *node_to_possible_cpumask,
>   const struct cpumask *mask, nodemask_t *nodemsk)
>  {
>   int n, nodes = 0;
>  
>   /* Calculate the number of nodes in the supplied affinity mask */
>   for_each_node(n) {
> - if (cpumask_intersects(mask, node_to_present_cpumask[n])) {
> + if (cpumask_intersects(mask, node_to_possible_cpumask[n])) {
>   node_set(n, *nodemsk);
>   nodes++;
>   }
> @@ -109,7 +109,7 @@ irq_create_affinity_masks(int nvecs, const struct 
> irq_affinity *affd)
>   int last_affv = affv + affd->pre_vectors;
>   nodemask_t nodemsk = NODE_MASK_NONE;
>   struct cpumask *masks;
> - cpumask_var_t nmsk, *node_to_present_cpumask;
> + cpumask_var_t nmsk, *node_to_possible_cpumask;
>  
>   /*
>* If there aren't any vectors left after applying the pre/post
> @@ -125,8 +125,8 @@ irq_create_affinity_masks(int nvecs, const struct 
> irq_affinity *affd)
>   if (!masks)
>   goto out;
>  
> - node_to_present_cpumask = alloc_node_to_present_cpumask();
> - if (!node_to_present_cpumask)
> + node_to_possible_cpumask = alloc_node_to_possible_cpumask();
> + if (!node_to_possible_cpumask)
>   goto out;
>  
>   /* Fill out vectors at the beginning that don't need affinity */
> @@ -135,8 +135,8 @@ irq_create_affinity_masks(int nvecs, const struct 
> irq_affinity *affd)
>  
> 

Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-23 Thread Christoph Hellwig
FYI, the patch below changes both the irq and block mappings to
always use the cpu possible map (should be split in two in due time).

I think this is the right way forward.  For every normal machine
those two are the same, but for VMs with maxcpus above their normal
count or some big iron that can grow more cpus it means we waste
a few more resources for the not present but reserved cpus.  It
fixes the reported issue for me:

diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c
index 9f8cffc8a701..3eb169f15842 100644
--- a/block/blk-mq-cpumap.c
+++ b/block/blk-mq-cpumap.c
@@ -16,11 +16,6 @@
 
 static int cpu_to_queue_index(unsigned int nr_queues, const int cpu)
 {
-   /*
-* Non present CPU will be mapped to queue index 0.
-*/
-   if (!cpu_present(cpu))
-   return 0;
return cpu % nr_queues;
 }
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 11097477eeab..612ce1fb7c4e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2114,16 +2114,11 @@ static void blk_mq_init_cpu_queues(struct request_queue 
*q,
INIT_LIST_HEAD(&__ctx->rq_list);
__ctx->queue = q;
 
-   /* If the cpu isn't present, the cpu is mapped to first hctx */
-   if (!cpu_present(i))
-   continue;
-
-   hctx = blk_mq_map_queue(q, i);
-
/*
 * Set local node, IFF we have more than one hw queue. If
 * not, we remain on the home node of the device
 */
+   hctx = blk_mq_map_queue(q, i);
if (nr_hw_queues > 1 && hctx->numa_node == NUMA_NO_NODE)
hctx->numa_node = local_memory_node(cpu_to_node(i));
}
@@ -2180,7 +2175,7 @@ static void blk_mq_map_swqueue(struct request_queue *q)
 *
 * If the cpu isn't present, the cpu is mapped to first hctx.
 */
-   for_each_present_cpu(i) {
+   for_each_possible_cpu(i) {
hctx_idx = q->mq_map[i];
/* unmapped hw queue can be remapped after CPU topo changed */
if (!set->tags[hctx_idx] &&
diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index e12d35108225..a37a3b4b6342 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -39,7 +39,7 @@ static void irq_spread_init_one(struct cpumask *irqmsk, 
struct cpumask *nmsk,
}
 }
 
-static cpumask_var_t *alloc_node_to_present_cpumask(void)
+static cpumask_var_t *alloc_node_to_possible_cpumask(void)
 {
cpumask_var_t *masks;
int node;
@@ -62,7 +62,7 @@ static cpumask_var_t *alloc_node_to_present_cpumask(void)
return NULL;
 }
 
-static void free_node_to_present_cpumask(cpumask_var_t *masks)
+static void free_node_to_possible_cpumask(cpumask_var_t *masks)
 {
int node;
 
@@ -71,22 +71,22 @@ static void free_node_to_present_cpumask(cpumask_var_t 
*masks)
kfree(masks);
 }
 
-static void build_node_to_present_cpumask(cpumask_var_t *masks)
+static void build_node_to_possible_cpumask(cpumask_var_t *masks)
 {
int cpu;
 
-   for_each_present_cpu(cpu)
+   for_each_possible_cpu(cpu)
cpumask_set_cpu(cpu, masks[cpu_to_node(cpu)]);
 }
 
-static int get_nodes_in_cpumask(cpumask_var_t *node_to_present_cpumask,
+static int get_nodes_in_cpumask(cpumask_var_t *node_to_possible_cpumask,
const struct cpumask *mask, nodemask_t *nodemsk)
 {
int n, nodes = 0;
 
/* Calculate the number of nodes in the supplied affinity mask */
for_each_node(n) {
-   if (cpumask_intersects(mask, node_to_present_cpumask[n])) {
+   if (cpumask_intersects(mask, node_to_possible_cpumask[n])) {
node_set(n, *nodemsk);
nodes++;
}
@@ -109,7 +109,7 @@ irq_create_affinity_masks(int nvecs, const struct 
irq_affinity *affd)
int last_affv = affv + affd->pre_vectors;
nodemask_t nodemsk = NODE_MASK_NONE;
struct cpumask *masks;
-   cpumask_var_t nmsk, *node_to_present_cpumask;
+   cpumask_var_t nmsk, *node_to_possible_cpumask;
 
/*
 * If there aren't any vectors left after applying the pre/post
@@ -125,8 +125,8 @@ irq_create_affinity_masks(int nvecs, const struct 
irq_affinity *affd)
if (!masks)
goto out;
 
-   node_to_present_cpumask = alloc_node_to_present_cpumask();
-   if (!node_to_present_cpumask)
+   node_to_possible_cpumask = alloc_node_to_possible_cpumask();
+   if (!node_to_possible_cpumask)
goto out;
 
/* Fill out vectors at the beginning that don't need affinity */
@@ -135,8 +135,8 @@ irq_create_affinity_masks(int nvecs, const struct 
irq_affinity *affd)
 
/* Stabilize the cpumasks */
get_online_cpus();
-   build_node_to_present_cpumask(node_to_present_cpumask);
-   nodes = get_nodes_in_cpumask(node_to_present_cpumask, 
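
For illustration, a tiny runnable userspace sketch of the mapping this ends up
with (toy numbers; only cpu_to_queue_index() mirrors the shape of the patched
helper above, the rest is made up): every possible CPU gets a queue via
cpu % nr_queues, so a CPU that is only onlined later already has a stable
assignment and nothing needs to be remapped at hotplug time.

#include <stdio.h>

#define NR_POSSIBLE_CPUS 8	/* e.g. maxcpus of the VM */
#define NR_QUEUES        2	/* hw queues of the device */

/* same shape as the patched cpu_to_queue_index() above */
static int cpu_to_queue_index(int nr_queues, int cpu)
{
	return cpu % nr_queues;
}

int main(void)
{
	int cpu;

	/* the map now covers every possible CPU, present or not */
	for (cpu = 0; cpu < NR_POSSIBLE_CPUS; cpu++)
		printf("cpu %d -> hw queue %d\n",
		       cpu, cpu_to_queue_index(NR_QUEUES, cpu));

	/*
	 * A CPU hotplugged later (say CPU 4 in the virtio-blk report) is
	 * already covered by the table printed above, so its submissions
	 * land on a queue that was set up at probe time.
	 */
	return 0;
}
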

Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-23 Thread Christoph Hellwig
OK, it helps to make sure we're actually doing I/O from the CPU;
I've reproduced it now.


Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-23 Thread Christoph Hellwig
I can't reproduce it in my VM by adding a new CPU.  Do you have
any interesting blk-mq setup, like actually using multiple queues?  I'll
give that a spin next.


Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-22 Thread Jens Axboe
On 11/22/2017 12:28 AM, Christoph Hellwig wrote:
> Jens, please don't just revert the commit in your for-linus tree.
> 
> On its own this will totally mess up the interrupt assignments.  Give
> me a bit of time to sort this out properly.

I wasn't going to push it until I heard otherwise. I'll just pop it
off; for-linus isn't a stable branch.

-- 
Jens Axboe



Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-21 Thread Christoph Hellwig
Jens, please don't just revert the commit in your for-linus tree.

On its own this will totally mess up the interrupt assignments.  Give
me a bit of time to sort this out properly.
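
The concern, roughly: with managed interrupts each hardware queue's vector is
affinity-pinned to the CPUs that map to that queue, so the block-side
cpu-to-queue map and the irq-side vector spreading have to be derived from the
same CPU set.  A small runnable userspace sketch of what happens when the two
sides use different sets (here possible vs. present; the exact sets involved in
the revert differ, this only shows the shape of the mismatch, and all names and
the toy topology are made up):

#include <stdio.h>

#define NR_CPUS   8	/* possible CPUs; 4-7 not present at boot */
#define NR_QUEUES 4	/* hw queues == managed irq vectors */

int main(void)
{
	int present[NR_CPUS] = { 1, 1, 1, 1, 0, 0, 0, 0 };
	int queue_of_cpu[NR_CPUS];	/* block side: cpu -> queue  */
	int vector_cpu[NR_QUEUES];	/* irq side: vector -> a CPU */
	int cpu, q, nr_present = 0;

	/* block side maps every possible CPU to a queue */
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		queue_of_cpu[cpu] = cpu % NR_QUEUES;

	/* irq side spreads its vectors across present CPUs only */
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		nr_present += present[cpu];
	for (q = 0; q < NR_QUEUES; q++)
		vector_cpu[q] = q % nr_present;

	/* hotplug CPU 5: it submits on a queue whose vector sits elsewhere */
	cpu = 5;
	q = queue_of_cpu[cpu];
	printf("CPU %d -> queue %d, vector affine to CPU %d\n",
	       cpu, q, vector_cpu[q]);
	return 0;
}
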


Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-21 Thread Jens Axboe
On 11/21/2017 01:31 PM, Christian Borntraeger wrote:
> 
> 
> On 11/21/2017 09:21 PM, Jens Axboe wrote:
>> On 11/21/2017 01:19 PM, Christian Borntraeger wrote:
>>>
>>> On 11/21/2017 09:14 PM, Jens Axboe wrote:
 On 11/21/2017 01:12 PM, Christian Borntraeger wrote:
>
>
> On 11/21/2017 08:30 PM, Jens Axboe wrote:
>> On 11/21/2017 12:15 PM, Christian Borntraeger wrote:
>>>
>>>
>>> On 11/21/2017 07:39 PM, Jens Axboe wrote:
 On 11/21/2017 11:27 AM, Jens Axboe wrote:
> On 11/21/2017 11:12 AM, Christian Borntraeger wrote:
>>
>>
>> On 11/21/2017 07:09 PM, Jens Axboe wrote:
>>> On 11/21/2017 10:27 AM, Jens Axboe wrote:
 On 11/21/2017 03:14 AM, Christian Borntraeger wrote:
> Bisect points to
>
> 1b5a7455d345b223d3a4658a9e5fce985b7998c1 is the first bad commit
> commit 1b5a7455d345b223d3a4658a9e5fce985b7998c1
> Author: Christoph Hellwig 
> Date:   Mon Jun 26 12:20:57 2017 +0200
>
> blk-mq: Create hctx for each present CPU
> 
> commit 4b855ad37194f7bdbb200ce7a1c7051fecb56a08 upstream.
> 
> Currently we only create hctx for online CPUs, which can lead 
> to a lot
> of churn due to frequent soft offline / online operations.  
> Instead
> allocate one for each present CPU to avoid this and 
> dramatically simplify
> the code.
> 
> Signed-off-by: Christoph Hellwig 
> Reviewed-by: Jens Axboe 
> Cc: Keith Busch 
> Cc: linux-bl...@vger.kernel.org
> Cc: linux-n...@lists.infradead.org
> Link: 
> http://lkml.kernel.org/r/20170626102058.10200-3-...@lst.de
> Signed-off-by: Thomas Gleixner 
> Cc: Oleksandr Natalenko 
> Cc: Mike Galbraith 
> Signed-off-by: Greg Kroah-Hartman 

 I wonder if we're simply not getting the masks updated correctly. 
 I'll
 take a look.
>>>
>>> Can't make it trigger here. We do init for each present CPU, which 
>>> means
>>> that if I offline a few CPUs here and register a queue, those still 
>>> show
>>> up as present (just offline) and get mapped accordingly.
>>>
>>> From the looks of it, your setup is different. If the CPU doesn't 
>>> show
>>> up as present and it gets hotplugged, then I can see how this 
>>> condition
>>> would trigger. What environment are you running this in? We might 
>>> have
>>> to re-introduce the cpu hotplug notifier, right now we just monitor
>>> for a dead cpu and handle that.
>>
>> I am not doing a hot unplug and the replug, I use KVM and add a 
>> previously
>> not available CPU.
>>
>> in libvirt/virsh speak:
>>   4
>
> So that's why we run into problems. It's not present when we load the 
> device,
> but becomes present and online afterwards.
>
> Christoph, we used to handle this just fine, your patch broke it.
>
> I'll see if I can come up with an appropriate fix.

 Can you try the below?
>>>
>>>
>>> It does prevent the crash but it seems that the new CPU is not "used " 
>>> after the hotplug for mq:
>>>
>>>
>>> output with 2 cpus:
>>> /sys/kernel/debug/block/vda
>>> /sys/kernel/debug/block/vda/hctx0
>>> /sys/kernel/debug/block/vda/hctx0/cpu0
>>> /sys/kernel/debug/block/vda/hctx0/cpu0/completed
>>> /sys/kernel/debug/block/vda/hctx0/cpu0/merged
>>> /sys/kernel/debug/block/vda/hctx0/cpu0/dispatched
>>> /sys/kernel/debug/block/vda/hctx0/cpu0/rq_list
>>> /sys/kernel/debug/block/vda/hctx0/active
>>> /sys/kernel/debug/block/vda/hctx0/run
>>> /sys/kernel/debug/block/vda/hctx0/queued
>>> /sys/kernel/debug/block/vda/hctx0/dispatched
>>> /sys/kernel/debug/block/vda/hctx0/io_poll
>>> /sys/kernel/debug/block/vda/hctx0/sched_tags_bitmap
>>> /sys/kernel/debug/block/vda/hctx0/sched_tags
>>> /sys/kernel/debug/block/vda/hctx0/tags_bitmap
>>> /sys/kernel/debug/block/vda/hctx0/tags
>>> /sys/kernel/debug/block/vda/hctx0/ctx_map
>>> /sys/kernel/debug/block/vda/hctx0/busy
>>> /sys/kernel/debug/block/vda/hctx0/dispatch
>>> /sys/kernel/debug/block/vda/hctx0/flags
>>> /sys/kernel/debug/block/vda/hctx0/state
>>> /sys/kernel/debug/block/vda/sched
>>> /sys/kernel/debug/block/vda/sched/dispatch
>>> /sys/kernel/debug/block/vda/sched/starved
>>> /sys/kernel/debug/block/vda/sched/batching
>>> /sys/kernel/debug/block/vda/sched/write_next_rq
>>> 

Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-21 Thread Christian Borntraeger


On 11/21/2017 09:21 PM, Jens Axboe wrote:
> On 11/21/2017 01:19 PM, Christian Borntraeger wrote:
>>
>> On 11/21/2017 09:14 PM, Jens Axboe wrote:
>>> On 11/21/2017 01:12 PM, Christian Borntraeger wrote:


 On 11/21/2017 08:30 PM, Jens Axboe wrote:
> On 11/21/2017 12:15 PM, Christian Borntraeger wrote:
>>
>>
>> On 11/21/2017 07:39 PM, Jens Axboe wrote:
>>> On 11/21/2017 11:27 AM, Jens Axboe wrote:
 On 11/21/2017 11:12 AM, Christian Borntraeger wrote:
>
>
> On 11/21/2017 07:09 PM, Jens Axboe wrote:
>> On 11/21/2017 10:27 AM, Jens Axboe wrote:
>>> On 11/21/2017 03:14 AM, Christian Borntraeger wrote:
 Bisect points to

 1b5a7455d345b223d3a4658a9e5fce985b7998c1 is the first bad commit
 commit 1b5a7455d345b223d3a4658a9e5fce985b7998c1
 Author: Christoph Hellwig 
 Date:   Mon Jun 26 12:20:57 2017 +0200

 blk-mq: Create hctx for each present CPU
 
 commit 4b855ad37194f7bdbb200ce7a1c7051fecb56a08 upstream.
 
 Currently we only create hctx for online CPUs, which can lead 
 to a lot
 of churn due to frequent soft offline / online operations.  
 Instead
 allocate one for each present CPU to avoid this and 
 dramatically simplify
 the code.
 
 Signed-off-by: Christoph Hellwig 
 Reviewed-by: Jens Axboe 
 Cc: Keith Busch 
 Cc: linux-bl...@vger.kernel.org
 Cc: linux-n...@lists.infradead.org
 Link: 
 http://lkml.kernel.org/r/20170626102058.10200-3-...@lst.de
 Signed-off-by: Thomas Gleixner 
 Cc: Oleksandr Natalenko 
 Cc: Mike Galbraith 
 Signed-off-by: Greg Kroah-Hartman 
>>>
>>> I wonder if we're simply not getting the masks updated correctly. 
>>> I'll
>>> take a look.
>>
>> Can't make it trigger here. We do init for each present CPU, which 
>> means
>> that if I offline a few CPUs here and register a queue, those still 
>> show
>> up as present (just offline) and get mapped accordingly.
>>
>> From the looks of it, your setup is different. If the CPU doesn't 
>> show
>> up as present and it gets hotplugged, then I can see how this 
>> condition
>> would trigger. What environment are you running this in? We might 
>> have
>> to re-introduce the cpu hotplug notifier, right now we just monitor
>> for a dead cpu and handle that.
>
> I am not doing a hot unplug and the replug, I use KVM and add a 
> previously
> not available CPU.
>
> in libvirt/virsh speak:
>   4

 So that's why we run into problems. It's not present when we load the 
 device,
 but becomes present and online afterwards.

 Christoph, we used to handle this just fine, your patch broke it.

 I'll see if I can come up with an appropriate fix.
>>>
>>> Can you try the below?
>>
>>
>> It does prevent the crash but it seems that the new CPU is not "used " 
>> after the hotplug for mq:
>>
>>
>> output with 2 cpus:
>> /sys/kernel/debug/block/vda
>> /sys/kernel/debug/block/vda/hctx0
>> /sys/kernel/debug/block/vda/hctx0/cpu0
>> /sys/kernel/debug/block/vda/hctx0/cpu0/completed
>> /sys/kernel/debug/block/vda/hctx0/cpu0/merged
>> /sys/kernel/debug/block/vda/hctx0/cpu0/dispatched
>> /sys/kernel/debug/block/vda/hctx0/cpu0/rq_list
>> /sys/kernel/debug/block/vda/hctx0/active
>> /sys/kernel/debug/block/vda/hctx0/run
>> /sys/kernel/debug/block/vda/hctx0/queued
>> /sys/kernel/debug/block/vda/hctx0/dispatched
>> /sys/kernel/debug/block/vda/hctx0/io_poll
>> /sys/kernel/debug/block/vda/hctx0/sched_tags_bitmap
>> /sys/kernel/debug/block/vda/hctx0/sched_tags
>> /sys/kernel/debug/block/vda/hctx0/tags_bitmap
>> /sys/kernel/debug/block/vda/hctx0/tags
>> /sys/kernel/debug/block/vda/hctx0/ctx_map
>> /sys/kernel/debug/block/vda/hctx0/busy
>> /sys/kernel/debug/block/vda/hctx0/dispatch
>> /sys/kernel/debug/block/vda/hctx0/flags
>> /sys/kernel/debug/block/vda/hctx0/state
>> /sys/kernel/debug/block/vda/sched
>> /sys/kernel/debug/block/vda/sched/dispatch
>> /sys/kernel/debug/block/vda/sched/starved
>> /sys/kernel/debug/block/vda/sched/batching
>> /sys/kernel/debug/block/vda/sched/write_next_rq
>> /sys/kernel/debug/block/vda/sched/write_fifo_list
>> /sys/kernel/debug/block/vda/sched/read_next_rq
>> /sys/kernel/debug/block/vda/sched/read_fifo_list
>> 
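To make the failure mode discussed above concrete, the following is a minimal userspace sketch (an illustration only, not the actual blk-mq code; the array sizes and function names are invented for the example): the ctx-to-hctx map is built from the CPUs that are present when the queue is set up, so a vCPU that only becomes present afterwards is in no hctx cpumask, and the first request submitted from it trips the check corresponding to the warning at block/blk-mq.c:1144.

#include <stdio.h>
#include <stdbool.h>

#define NR_CPUS 4
#define NR_HCTX 1   /* virtio-blk with a single hardware queue */

static int  cpu_to_hctx[NR_CPUS];
static bool hctx_cpumask[NR_HCTX][NR_CPUS];

/* build the map from the CPUs present at init time only */
static void map_swqueue(const bool *present)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		cpu_to_hctx[cpu] = 0;              /* everything falls back to hctx 0 */
		if (present[cpu])
			hctx_cpumask[0][cpu] = true;   /* only present CPUs enter the mask */
	}
}

static void submit_io(int cpu)
{
	int h = cpu_to_hctx[cpu];

	if (!hctx_cpumask[h][cpu])
		printf("WARNING: cpu %d not in hctx%d cpumask\n", cpu, h);
	else
		printf("cpu %d dispatches on hctx%d\n", cpu, h);
}

int main(void)
{
	bool present_at_init[NR_CPUS] = { true, true, false, false };

	map_swqueue(present_at_init);
	submit_io(1);	/* present at init: dispatches normally */
	submit_io(3);	/* hot-added afterwards: warning analogue fires */
	return 0;
}

This mirrors the debugfs listing above: the mapping reflects only the CPUs known when the queue was created, and nothing re-runs it when the extra vCPU shows up.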

Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-21 Thread Jens Axboe
On 11/21/2017 01:19 PM, Christian Borntraeger wrote:
> 
> On 11/21/2017 09:14 PM, Jens Axboe wrote:
>> On 11/21/2017 01:12 PM, Christian Borntraeger wrote:
>>>
>>>
>>> On 11/21/2017 08:30 PM, Jens Axboe wrote:
 On 11/21/2017 12:15 PM, Christian Borntraeger wrote:
>
>
> On 11/21/2017 07:39 PM, Jens Axboe wrote:
>> On 11/21/2017 11:27 AM, Jens Axboe wrote:
>>> On 11/21/2017 11:12 AM, Christian Borntraeger wrote:


 On 11/21/2017 07:09 PM, Jens Axboe wrote:
> On 11/21/2017 10:27 AM, Jens Axboe wrote:
>> On 11/21/2017 03:14 AM, Christian Borntraeger wrote:
>>> Bisect points to
>>>
>>> 1b5a7455d345b223d3a4658a9e5fce985b7998c1 is the first bad commit
>>> commit 1b5a7455d345b223d3a4658a9e5fce985b7998c1
>>> Author: Christoph Hellwig 
>>> Date:   Mon Jun 26 12:20:57 2017 +0200
>>>
>>> blk-mq: Create hctx for each present CPU
>>> 
>>> commit 4b855ad37194f7bdbb200ce7a1c7051fecb56a08 upstream.
>>> 
>>> Currently we only create hctx for online CPUs, which can lead 
>>> to a lot
>>> of churn due to frequent soft offline / online operations.  
>>> Instead
>>> allocate one for each present CPU to avoid this and 
>>> dramatically simplify
>>> the code.
>>> 
>>> Signed-off-by: Christoph Hellwig 
>>> Reviewed-by: Jens Axboe 
>>> Cc: Keith Busch 
>>> Cc: linux-bl...@vger.kernel.org
>>> Cc: linux-n...@lists.infradead.org
>>> Link: http://lkml.kernel.org/r/20170626102058.10200-3-...@lst.de
>>> Signed-off-by: Thomas Gleixner 
>>> Cc: Oleksandr Natalenko 
>>> Cc: Mike Galbraith 
>>> Signed-off-by: Greg Kroah-Hartman 
>>
>> I wonder if we're simply not getting the masks updated correctly. 
>> I'll
>> take a look.
>
> Can't make it trigger here. We do init for each present CPU, which 
> means
> that if I offline a few CPUs here and register a queue, those still 
> show
> up as present (just offline) and get mapped accordingly.
>
> From the looks of it, your setup is different. If the CPU doesn't show
> up as present and it gets hotplugged, then I can see how this 
> condition
> would trigger. What environment are you running this in? We might have
> to re-introduce the cpu hotplug notifier, right now we just monitor
> for a dead cpu and handle that.

 I am not doing a hot unplug and the replug, I use KVM and add a 
 previously
 not available CPU.

 in libvirt/virsh speak:
   4
>>>
>>> So that's why we run into problems. It's not present when we load the 
>>> device,
>>> but becomes present and online afterwards.
>>>
>>> Christoph, we used to handle this just fine, your patch broke it.
>>>
>>> I'll see if I can come up with an appropriate fix.
>>
>> Can you try the below?
>
>
> It does prevent the crash but it seems that the new CPU is not "used " 
> after the hotplug for mq:
>
>
> output with 2 cpus:
> /sys/kernel/debug/block/vda
> /sys/kernel/debug/block/vda/hctx0
> /sys/kernel/debug/block/vda/hctx0/cpu0
> /sys/kernel/debug/block/vda/hctx0/cpu0/completed
> /sys/kernel/debug/block/vda/hctx0/cpu0/merged
> /sys/kernel/debug/block/vda/hctx0/cpu0/dispatched
> /sys/kernel/debug/block/vda/hctx0/cpu0/rq_list
> /sys/kernel/debug/block/vda/hctx0/active
> /sys/kernel/debug/block/vda/hctx0/run
> /sys/kernel/debug/block/vda/hctx0/queued
> /sys/kernel/debug/block/vda/hctx0/dispatched
> /sys/kernel/debug/block/vda/hctx0/io_poll
> /sys/kernel/debug/block/vda/hctx0/sched_tags_bitmap
> /sys/kernel/debug/block/vda/hctx0/sched_tags
> /sys/kernel/debug/block/vda/hctx0/tags_bitmap
> /sys/kernel/debug/block/vda/hctx0/tags
> /sys/kernel/debug/block/vda/hctx0/ctx_map
> /sys/kernel/debug/block/vda/hctx0/busy
> /sys/kernel/debug/block/vda/hctx0/dispatch
> /sys/kernel/debug/block/vda/hctx0/flags
> /sys/kernel/debug/block/vda/hctx0/state
> /sys/kernel/debug/block/vda/sched
> /sys/kernel/debug/block/vda/sched/dispatch
> /sys/kernel/debug/block/vda/sched/starved
> /sys/kernel/debug/block/vda/sched/batching
> /sys/kernel/debug/block/vda/sched/write_next_rq
> /sys/kernel/debug/block/vda/sched/write_fifo_list
> /sys/kernel/debug/block/vda/sched/read_next_rq
> /sys/kernel/debug/block/vda/sched/read_fifo_list
> /sys/kernel/debug/block/vda/write_hints
> /sys/kernel/debug/block/vda/state
> /sys/kernel/debug/block/vda/requeue_list
> /sys/kernel/debug/block/vda/poll_stat

 Try this, basically just a revert.
>>>

Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-21 Thread Christian Borntraeger

On 11/21/2017 09:14 PM, Jens Axboe wrote:
> On 11/21/2017 01:12 PM, Christian Borntraeger wrote:
>>
>>
>> On 11/21/2017 08:30 PM, Jens Axboe wrote:
>>> On 11/21/2017 12:15 PM, Christian Borntraeger wrote:


 On 11/21/2017 07:39 PM, Jens Axboe wrote:
> On 11/21/2017 11:27 AM, Jens Axboe wrote:
>> On 11/21/2017 11:12 AM, Christian Borntraeger wrote:
>>>
>>>
>>> On 11/21/2017 07:09 PM, Jens Axboe wrote:
 On 11/21/2017 10:27 AM, Jens Axboe wrote:
> On 11/21/2017 03:14 AM, Christian Borntraeger wrote:
>> Bisect points to
>>
>> 1b5a7455d345b223d3a4658a9e5fce985b7998c1 is the first bad commit
>> commit 1b5a7455d345b223d3a4658a9e5fce985b7998c1
>> Author: Christoph Hellwig 
>> Date:   Mon Jun 26 12:20:57 2017 +0200
>>
>> blk-mq: Create hctx for each present CPU
>> 
>> commit 4b855ad37194f7bdbb200ce7a1c7051fecb56a08 upstream.
>> 
>> Currently we only create hctx for online CPUs, which can lead to 
>> a lot
>> of churn due to frequent soft offline / online operations.  
>> Instead
>> allocate one for each present CPU to avoid this and dramatically 
>> simplify
>> the code.
>> 
>> Signed-off-by: Christoph Hellwig 
>> Reviewed-by: Jens Axboe 
>> Cc: Keith Busch 
>> Cc: linux-bl...@vger.kernel.org
>> Cc: linux-n...@lists.infradead.org
>> Link: http://lkml.kernel.org/r/20170626102058.10200-3-...@lst.de
>> Signed-off-by: Thomas Gleixner 
>> Cc: Oleksandr Natalenko 
>> Cc: Mike Galbraith 
>> Signed-off-by: Greg Kroah-Hartman 
>
> I wonder if we're simply not getting the masks updated correctly. I'll
> take a look.

 Can't make it trigger here. We do init for each present CPU, which 
 means
 that if I offline a few CPUs here and register a queue, those still 
 show
 up as present (just offline) and get mapped accordingly.

 From the looks of it, your setup is different. If the CPU doesn't show
 up as present and it gets hotplugged, then I can see how this condition
 would trigger. What environment are you running this in? We might have
 to re-introduce the cpu hotplug notifier, right now we just monitor
 for a dead cpu and handle that.
>>>
>>> I am not doing a hot unplug and the replug, I use KVM and add a 
>>> previously
>>> not available CPU.
>>>
>>> in libvirt/virsh speak:
>>>   4
>>
>> So that's why we run into problems. It's not present when we load the 
>> device,
>> but becomes present and online afterwards.
>>
>> Christoph, we used to handle this just fine, your patch broke it.
>>
>> I'll see if I can come up with an appropriate fix.
>
> Can you try the below?


 It does prevent the crash but it seems that the new CPU is not "used " 
 after the hotplug for mq:


 output with 2 cpus:
 /sys/kernel/debug/block/vda
 /sys/kernel/debug/block/vda/hctx0
 /sys/kernel/debug/block/vda/hctx0/cpu0
 /sys/kernel/debug/block/vda/hctx0/cpu0/completed
 /sys/kernel/debug/block/vda/hctx0/cpu0/merged
 /sys/kernel/debug/block/vda/hctx0/cpu0/dispatched
 /sys/kernel/debug/block/vda/hctx0/cpu0/rq_list
 /sys/kernel/debug/block/vda/hctx0/active
 /sys/kernel/debug/block/vda/hctx0/run
 /sys/kernel/debug/block/vda/hctx0/queued
 /sys/kernel/debug/block/vda/hctx0/dispatched
 /sys/kernel/debug/block/vda/hctx0/io_poll
 /sys/kernel/debug/block/vda/hctx0/sched_tags_bitmap
 /sys/kernel/debug/block/vda/hctx0/sched_tags
 /sys/kernel/debug/block/vda/hctx0/tags_bitmap
 /sys/kernel/debug/block/vda/hctx0/tags
 /sys/kernel/debug/block/vda/hctx0/ctx_map
 /sys/kernel/debug/block/vda/hctx0/busy
 /sys/kernel/debug/block/vda/hctx0/dispatch
 /sys/kernel/debug/block/vda/hctx0/flags
 /sys/kernel/debug/block/vda/hctx0/state
 /sys/kernel/debug/block/vda/sched
 /sys/kernel/debug/block/vda/sched/dispatch
 /sys/kernel/debug/block/vda/sched/starved
 /sys/kernel/debug/block/vda/sched/batching
 /sys/kernel/debug/block/vda/sched/write_next_rq
 /sys/kernel/debug/block/vda/sched/write_fifo_list
 /sys/kernel/debug/block/vda/sched/read_next_rq
 /sys/kernel/debug/block/vda/sched/read_fifo_list
 /sys/kernel/debug/block/vda/write_hints
 /sys/kernel/debug/block/vda/state
 /sys/kernel/debug/block/vda/requeue_list
 /sys/kernel/debug/block/vda/poll_stat
>>>
>>> Try this, basically just a revert.
>>
>> Yes, seems to work.
>>
>> Tested-by: Christian Borntraeger 
> 
> Great, thanks for testing.
> 
>> Do you know why the original commit made it into 4.12 stable? After all
>> it has no Fixes tag and 

Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-21 Thread Jens Axboe
On 11/21/2017 01:12 PM, Christian Borntraeger wrote:
> 
> 
> On 11/21/2017 08:30 PM, Jens Axboe wrote:
>> On 11/21/2017 12:15 PM, Christian Borntraeger wrote:
>>>
>>>
>>> On 11/21/2017 07:39 PM, Jens Axboe wrote:
 On 11/21/2017 11:27 AM, Jens Axboe wrote:
> On 11/21/2017 11:12 AM, Christian Borntraeger wrote:
>>
>>
>> On 11/21/2017 07:09 PM, Jens Axboe wrote:
>>> On 11/21/2017 10:27 AM, Jens Axboe wrote:
 On 11/21/2017 03:14 AM, Christian Borntraeger wrote:
> Bisect points to
>
> 1b5a7455d345b223d3a4658a9e5fce985b7998c1 is the first bad commit
> commit 1b5a7455d345b223d3a4658a9e5fce985b7998c1
> Author: Christoph Hellwig 
> Date:   Mon Jun 26 12:20:57 2017 +0200
>
> blk-mq: Create hctx for each present CPU
> 
> commit 4b855ad37194f7bdbb200ce7a1c7051fecb56a08 upstream.
> 
> Currently we only create hctx for online CPUs, which can lead to 
> a lot
> of churn due to frequent soft offline / online operations.  
> Instead
> allocate one for each present CPU to avoid this and dramatically 
> simplify
> the code.
> 
> Signed-off-by: Christoph Hellwig 
> Reviewed-by: Jens Axboe 
> Cc: Keith Busch 
> Cc: linux-bl...@vger.kernel.org
> Cc: linux-n...@lists.infradead.org
> Link: http://lkml.kernel.org/r/20170626102058.10200-3-...@lst.de
> Signed-off-by: Thomas Gleixner 
> Cc: Oleksandr Natalenko 
> Cc: Mike Galbraith 
> Signed-off-by: Greg Kroah-Hartman 

 I wonder if we're simply not getting the masks updated correctly. I'll
 take a look.
>>>
>>> Can't make it trigger here. We do init for each present CPU, which means
>>> that if I offline a few CPUs here and register a queue, those still show
>>> up as present (just offline) and get mapped accordingly.
>>>
>>> From the looks of it, your setup is different. If the CPU doesn't show
>>> up as present and it gets hotplugged, then I can see how this condition
>>> would trigger. What environment are you running this in? We might have
>>> to re-introduce the cpu hotplug notifier, right now we just monitor
>>> for a dead cpu and handle that.
>>
>> I am not doing a hot unplug and the replug, I use KVM and add a 
>> previously
>> not available CPU.
>>
>> in libvirt/virsh speak:
>>   4
>
> So that's why we run into problems. It's not present when we load the 
> device,
> but becomes present and online afterwards.
>
> Christoph, we used to handle this just fine, your patch broke it.
>
> I'll see if I can come up with an appropriate fix.

 Can you try the below?
>>>
>>>
>>> It does prevent the crash but it seems that the new CPU is not "used " 
>>> after the hotplug for mq:
>>>
>>>
>>> output with 2 cpus:
>>> /sys/kernel/debug/block/vda
>>> /sys/kernel/debug/block/vda/hctx0
>>> /sys/kernel/debug/block/vda/hctx0/cpu0
>>> /sys/kernel/debug/block/vda/hctx0/cpu0/completed
>>> /sys/kernel/debug/block/vda/hctx0/cpu0/merged
>>> /sys/kernel/debug/block/vda/hctx0/cpu0/dispatched
>>> /sys/kernel/debug/block/vda/hctx0/cpu0/rq_list
>>> /sys/kernel/debug/block/vda/hctx0/active
>>> /sys/kernel/debug/block/vda/hctx0/run
>>> /sys/kernel/debug/block/vda/hctx0/queued
>>> /sys/kernel/debug/block/vda/hctx0/dispatched
>>> /sys/kernel/debug/block/vda/hctx0/io_poll
>>> /sys/kernel/debug/block/vda/hctx0/sched_tags_bitmap
>>> /sys/kernel/debug/block/vda/hctx0/sched_tags
>>> /sys/kernel/debug/block/vda/hctx0/tags_bitmap
>>> /sys/kernel/debug/block/vda/hctx0/tags
>>> /sys/kernel/debug/block/vda/hctx0/ctx_map
>>> /sys/kernel/debug/block/vda/hctx0/busy
>>> /sys/kernel/debug/block/vda/hctx0/dispatch
>>> /sys/kernel/debug/block/vda/hctx0/flags
>>> /sys/kernel/debug/block/vda/hctx0/state
>>> /sys/kernel/debug/block/vda/sched
>>> /sys/kernel/debug/block/vda/sched/dispatch
>>> /sys/kernel/debug/block/vda/sched/starved
>>> /sys/kernel/debug/block/vda/sched/batching
>>> /sys/kernel/debug/block/vda/sched/write_next_rq
>>> /sys/kernel/debug/block/vda/sched/write_fifo_list
>>> /sys/kernel/debug/block/vda/sched/read_next_rq
>>> /sys/kernel/debug/block/vda/sched/read_fifo_list
>>> /sys/kernel/debug/block/vda/write_hints
>>> /sys/kernel/debug/block/vda/state
>>> /sys/kernel/debug/block/vda/requeue_list
>>> /sys/kernel/debug/block/vda/poll_stat
>>
>> Try this, basically just a revert.
> 
> Yes, seems to work.
> 
> Tested-by: Christian Borntraeger 

Great, thanks for testing.

> Do you know why the original commit made it into 4.12 stable? After all
> it has no Fixes tag and no cc stable-

I was wondering the same thing when you said it was in 4.12.stable and
not in 4.12 release. That patch should absolutely not have gone into
stable, it's not marked as such 
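For reference, the reproduction Christian describes (a KVM guest defined with more vCPUs than it is started with, then vCPUs hot-added at runtime) can be driven through the libvirt API as well as through virsh. A hypothetical helper is sketched below; the domain name "guest1" and the target count of 4 are assumptions, and the call is equivalent to "virsh setvcpus guest1 4 --live". The new vCPU still has to be onlined inside the guest (for example via /sys/devices/system/cpu/cpuN/online) before it can submit I/O.

/*
 * Hypothetical reproduction helper: raise the live vCPU count of a running
 * libvirt/KVM guest so that a previously not-present CPU appears, which is
 * the situation discussed in this thread.
 */
#include <stdio.h>
#include <libvirt/libvirt.h>

int main(void)
{
	virConnectPtr conn = virConnectOpen("qemu:///system");
	if (!conn) {
		fprintf(stderr, "failed to connect to libvirtd\n");
		return 1;
	}

	virDomainPtr dom = virDomainLookupByName(conn, "guest1");  /* assumed name */
	if (!dom) {
		fprintf(stderr, "no such domain\n");
		virConnectClose(conn);
		return 1;
	}

	/* hot-add vCPUs in the running guest; they must still be onlined inside it */
	if (virDomainSetVcpusFlags(dom, 4, VIR_DOMAIN_AFFECT_LIVE) < 0)
		fprintf(stderr, "vCPU hot-add failed\n");

	virDomainFree(dom);
	virConnectClose(conn);
	return 0;
}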

Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-21 Thread Christian Borntraeger


On 11/21/2017 08:30 PM, Jens Axboe wrote:
> On 11/21/2017 12:15 PM, Christian Borntraeger wrote:
>>
>>
>> On 11/21/2017 07:39 PM, Jens Axboe wrote:
>>> On 11/21/2017 11:27 AM, Jens Axboe wrote:
 On 11/21/2017 11:12 AM, Christian Borntraeger wrote:
>
>
> On 11/21/2017 07:09 PM, Jens Axboe wrote:
>> On 11/21/2017 10:27 AM, Jens Axboe wrote:
>>> On 11/21/2017 03:14 AM, Christian Borntraeger wrote:
 Bisect points to

 1b5a7455d345b223d3a4658a9e5fce985b7998c1 is the first bad commit
 commit 1b5a7455d345b223d3a4658a9e5fce985b7998c1
 Author: Christoph Hellwig 
 Date:   Mon Jun 26 12:20:57 2017 +0200

 blk-mq: Create hctx for each present CPU
 
 commit 4b855ad37194f7bdbb200ce7a1c7051fecb56a08 upstream.
 
 Currently we only create hctx for online CPUs, which can lead to a 
 lot
 of churn due to frequent soft offline / online operations.  Instead
 allocate one for each present CPU to avoid this and dramatically 
 simplify
 the code.
 
 Signed-off-by: Christoph Hellwig 
 Reviewed-by: Jens Axboe 
 Cc: Keith Busch 
 Cc: linux-bl...@vger.kernel.org
 Cc: linux-n...@lists.infradead.org
 Link: http://lkml.kernel.org/r/20170626102058.10200-3-...@lst.de
 Signed-off-by: Thomas Gleixner 
 Cc: Oleksandr Natalenko 
 Cc: Mike Galbraith 
 Signed-off-by: Greg Kroah-Hartman 
>>>
>>> I wonder if we're simply not getting the masks updated correctly. I'll
>>> take a look.
>>
>> Can't make it trigger here. We do init for each present CPU, which means
>> that if I offline a few CPUs here and register a queue, those still show
>> up as present (just offline) and get mapped accordingly.
>>
>> From the looks of it, your setup is different. If the CPU doesn't show
>> up as present and it gets hotplugged, then I can see how this condition
>> would trigger. What environment are you running this in? We might have
>> to re-introduce the cpu hotplug notifier, right now we just monitor
>> for a dead cpu and handle that.
>
> I am not doing a hot unplug and the replug, I use KVM and add a previously
> not available CPU.
>
> in libvirt/virsh speak:
>   4

 So that's why we run into problems. It's not present when we load the 
 device,
 but becomes present and online afterwards.

 Christoph, we used to handle this just fine, your patch broke it.

 I'll see if I can come up with an appropriate fix.
>>>
>>> Can you try the below?
>>
>>
>> It does prevent the crash but it seems that the new CPU is not "used " after 
>> the hotplug for mq:
>>
>>
>> output with 2 cpus:
>> /sys/kernel/debug/block/vda
>> /sys/kernel/debug/block/vda/hctx0
>> /sys/kernel/debug/block/vda/hctx0/cpu0
>> /sys/kernel/debug/block/vda/hctx0/cpu0/completed
>> /sys/kernel/debug/block/vda/hctx0/cpu0/merged
>> /sys/kernel/debug/block/vda/hctx0/cpu0/dispatched
>> /sys/kernel/debug/block/vda/hctx0/cpu0/rq_list
>> /sys/kernel/debug/block/vda/hctx0/active
>> /sys/kernel/debug/block/vda/hctx0/run
>> /sys/kernel/debug/block/vda/hctx0/queued
>> /sys/kernel/debug/block/vda/hctx0/dispatched
>> /sys/kernel/debug/block/vda/hctx0/io_poll
>> /sys/kernel/debug/block/vda/hctx0/sched_tags_bitmap
>> /sys/kernel/debug/block/vda/hctx0/sched_tags
>> /sys/kernel/debug/block/vda/hctx0/tags_bitmap
>> /sys/kernel/debug/block/vda/hctx0/tags
>> /sys/kernel/debug/block/vda/hctx0/ctx_map
>> /sys/kernel/debug/block/vda/hctx0/busy
>> /sys/kernel/debug/block/vda/hctx0/dispatch
>> /sys/kernel/debug/block/vda/hctx0/flags
>> /sys/kernel/debug/block/vda/hctx0/state
>> /sys/kernel/debug/block/vda/sched
>> /sys/kernel/debug/block/vda/sched/dispatch
>> /sys/kernel/debug/block/vda/sched/starved
>> /sys/kernel/debug/block/vda/sched/batching
>> /sys/kernel/debug/block/vda/sched/write_next_rq
>> /sys/kernel/debug/block/vda/sched/write_fifo_list
>> /sys/kernel/debug/block/vda/sched/read_next_rq
>> /sys/kernel/debug/block/vda/sched/read_fifo_list
>> /sys/kernel/debug/block/vda/write_hints
>> /sys/kernel/debug/block/vda/state
>> /sys/kernel/debug/block/vda/requeue_list
>> /sys/kernel/debug/block/vda/poll_stat
> 
> Try this, basically just a revert.

Yes, seems to work.

Tested-by: Christian Borntraeger 

Do you know why the original commit made it into 4.12 stable? After all
it has no Fixes tag and no cc stable-


> 
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 11097477eeab..bc1950fa9ef6 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -37,6 +37,9 @@
>  #include "blk-wbt.h"
>  #include "blk-mq-sched.h"
> 
> +static DEFINE_MUTEX(all_q_mutex);
> +static LIST_HEAD(all_q_list);
> +
>  static bool blk_mq_poll(struct request_queue *q, blk_qc_t cookie);
>  static void 

Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-21 Thread Jens Axboe
On 11/21/2017 12:15 PM, Christian Borntraeger wrote:
> 
> 
> On 11/21/2017 07:39 PM, Jens Axboe wrote:
>> On 11/21/2017 11:27 AM, Jens Axboe wrote:
>>> On 11/21/2017 11:12 AM, Christian Borntraeger wrote:


 On 11/21/2017 07:09 PM, Jens Axboe wrote:
> On 11/21/2017 10:27 AM, Jens Axboe wrote:
>> On 11/21/2017 03:14 AM, Christian Borntraeger wrote:
>>> Bisect points to
>>>
>>> 1b5a7455d345b223d3a4658a9e5fce985b7998c1 is the first bad commit
>>> commit 1b5a7455d345b223d3a4658a9e5fce985b7998c1
>>> Author: Christoph Hellwig 
>>> Date:   Mon Jun 26 12:20:57 2017 +0200
>>>
>>> blk-mq: Create hctx for each present CPU
>>> 
>>> commit 4b855ad37194f7bdbb200ce7a1c7051fecb56a08 upstream.
>>> 
>>> Currently we only create hctx for online CPUs, which can lead to a 
>>> lot
>>> of churn due to frequent soft offline / online operations.  Instead
>>> allocate one for each present CPU to avoid this and dramatically 
>>> simplify
>>> the code.
>>> 
>>> Signed-off-by: Christoph Hellwig 
>>> Reviewed-by: Jens Axboe 
>>> Cc: Keith Busch 
>>> Cc: linux-bl...@vger.kernel.org
>>> Cc: linux-n...@lists.infradead.org
>>> Link: http://lkml.kernel.org/r/20170626102058.10200-3-...@lst.de
>>> Signed-off-by: Thomas Gleixner 
>>> Cc: Oleksandr Natalenko 
>>> Cc: Mike Galbraith 
>>> Signed-off-by: Greg Kroah-Hartman 
>>
>> I wonder if we're simply not getting the masks updated correctly. I'll
>> take a look.
>
> Can't make it trigger here. We do init for each present CPU, which means
> that if I offline a few CPUs here and register a queue, those still show
> up as present (just offline) and get mapped accordingly.
>
> From the looks of it, your setup is different. If the CPU doesn't show
> up as present and it gets hotplugged, then I can see how this condition
> would trigger. What environment are you running this in? We might have
> to re-introduce the cpu hotplug notifier, right now we just monitor
> for a dead cpu and handle that.

 I am not doing a hot unplug and the replug, I use KVM and add a previously
 not available CPU.

 in libvirt/virsh speak:
   4
>>>
>>> So that's why we run into problems. It's not present when we load the 
>>> device,
>>> but becomes present and online afterwards.
>>>
>>> Christoph, we used to handle this just fine, your patch broke it.
>>>
>>> I'll see if I can come up with an appropriate fix.
>>
>> Can you try the below?
> 
> 
> It does prevent the crash but it seems that the new CPU is not "used " after 
> the hotplug for mq:
> 
> 
> output with 2 cpus:
> /sys/kernel/debug/block/vda
> /sys/kernel/debug/block/vda/hctx0
> /sys/kernel/debug/block/vda/hctx0/cpu0
> /sys/kernel/debug/block/vda/hctx0/cpu0/completed
> /sys/kernel/debug/block/vda/hctx0/cpu0/merged
> /sys/kernel/debug/block/vda/hctx0/cpu0/dispatched
> /sys/kernel/debug/block/vda/hctx0/cpu0/rq_list
> /sys/kernel/debug/block/vda/hctx0/active
> /sys/kernel/debug/block/vda/hctx0/run
> /sys/kernel/debug/block/vda/hctx0/queued
> /sys/kernel/debug/block/vda/hctx0/dispatched
> /sys/kernel/debug/block/vda/hctx0/io_poll
> /sys/kernel/debug/block/vda/hctx0/sched_tags_bitmap
> /sys/kernel/debug/block/vda/hctx0/sched_tags
> /sys/kernel/debug/block/vda/hctx0/tags_bitmap
> /sys/kernel/debug/block/vda/hctx0/tags
> /sys/kernel/debug/block/vda/hctx0/ctx_map
> /sys/kernel/debug/block/vda/hctx0/busy
> /sys/kernel/debug/block/vda/hctx0/dispatch
> /sys/kernel/debug/block/vda/hctx0/flags
> /sys/kernel/debug/block/vda/hctx0/state
> /sys/kernel/debug/block/vda/sched
> /sys/kernel/debug/block/vda/sched/dispatch
> /sys/kernel/debug/block/vda/sched/starved
> /sys/kernel/debug/block/vda/sched/batching
> /sys/kernel/debug/block/vda/sched/write_next_rq
> /sys/kernel/debug/block/vda/sched/write_fifo_list
> /sys/kernel/debug/block/vda/sched/read_next_rq
> /sys/kernel/debug/block/vda/sched/read_fifo_list
> /sys/kernel/debug/block/vda/write_hints
> /sys/kernel/debug/block/vda/state
> /sys/kernel/debug/block/vda/requeue_list
> /sys/kernel/debug/block/vda/poll_stat

Try this, basically just a revert.


diff --git a/block/blk-mq.c b/block/blk-mq.c
index 11097477eeab..bc1950fa9ef6 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -37,6 +37,9 @@
 #include "blk-wbt.h"
 #include "blk-mq-sched.h"
 
+static DEFINE_MUTEX(all_q_mutex);
+static LIST_HEAD(all_q_list);
+
 static bool blk_mq_poll(struct request_queue *q, blk_qc_t cookie);
 static void blk_mq_poll_stats_start(struct request_queue *q);
 static void blk_mq_poll_stats_fn(struct blk_stat_callback *cb);
@@ -2114,8 +2117,8 @@ static void blk_mq_init_cpu_queues(struct request_queue 
*q,
INIT_LIST_HEAD(&__ctx->rq_list);
__ctx->queue = q;
 
-   /* If the cpu isn't present, the cpu is mapped to first hctx */
-
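For context, the revert above effectively brings back the pre-4.12 scheme: every request_queue sits on a global list (the all_q_mutex/all_q_list re-added in the hunk above) and a CPU hotplug callback walks that list and rebuilds the ctx-to-hctx mapping, so a CPU that appears after queue setup still lands in a hctx cpumask. The sketch below shows roughly that shape; it is a simplification, not the literal reverted code, and blk_mq_queue_reinit() plus the all_q_node list member are stand-ins.

#include <linux/blkdev.h>
#include <linux/blk-mq.h>
#include <linux/cpuhotplug.h>
#include <linux/list.h>
#include <linux/mutex.h>

static DEFINE_MUTEX(all_q_mutex);
static LIST_HEAD(all_q_list);

/* stand-in for the real remapping of software queues to hardware queues */
static void blk_mq_queue_reinit(struct request_queue *q);

static int blk_mq_queue_reinit_prepare(unsigned int cpu)
{
	struct request_queue *q;

	/* redo the ctx->hctx mapping for every registered queue */
	mutex_lock(&all_q_mutex);
	list_for_each_entry(q, &all_q_list, all_q_node)
		blk_mq_queue_reinit(q);
	mutex_unlock(&all_q_mutex);
	return 0;
}

static int __init blk_mq_hotplug_init(void)
{
	/* invoked from the CPU hotplug state machine when a CPU is brought up */
	return cpuhp_setup_state_nocalls(CPUHP_BLK_MQ_PREPARE, "block/mq:prepare",
					 blk_mq_queue_reinit_prepare, NULL);
}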

Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-21 Thread Christian Borntraeger


On 11/21/2017 07:39 PM, Jens Axboe wrote:
> On 11/21/2017 11:27 AM, Jens Axboe wrote:
>> On 11/21/2017 11:12 AM, Christian Borntraeger wrote:
>>>
>>>
>>> On 11/21/2017 07:09 PM, Jens Axboe wrote:
 On 11/21/2017 10:27 AM, Jens Axboe wrote:
> On 11/21/2017 03:14 AM, Christian Borntraeger wrote:
>> Bisect points to
>>
>> 1b5a7455d345b223d3a4658a9e5fce985b7998c1 is the first bad commit
>> commit 1b5a7455d345b223d3a4658a9e5fce985b7998c1
>> Author: Christoph Hellwig 
>> Date:   Mon Jun 26 12:20:57 2017 +0200
>>
>> blk-mq: Create hctx for each present CPU
>> 
>> commit 4b855ad37194f7bdbb200ce7a1c7051fecb56a08 upstream.
>> 
>> Currently we only create hctx for online CPUs, which can lead to a 
>> lot
>> of churn due to frequent soft offline / online operations.  Instead
>> allocate one for each present CPU to avoid this and dramatically 
>> simplify
>> the code.
>> 
>> Signed-off-by: Christoph Hellwig 
>> Reviewed-by: Jens Axboe 
>> Cc: Keith Busch 
>> Cc: linux-bl...@vger.kernel.org
>> Cc: linux-n...@lists.infradead.org
>> Link: http://lkml.kernel.org/r/20170626102058.10200-3-...@lst.de
>> Signed-off-by: Thomas Gleixner 
>> Cc: Oleksandr Natalenko 
>> Cc: Mike Galbraith 
>> Signed-off-by: Greg Kroah-Hartman 
>
> I wonder if we're simply not getting the masks updated correctly. I'll
> take a look.

 Can't make it trigger here. We do init for each present CPU, which means
 that if I offline a few CPUs here and register a queue, those still show
 up as present (just offline) and get mapped accordingly.

 From the looks of it, your setup is different. If the CPU doesn't show
 up as present and it gets hotplugged, then I can see how this condition
 would trigger. What environment are you running this in? We might have
 to re-introduce the cpu hotplug notifier, right now we just monitor
 for a dead cpu and handle that.
>>>
>>> I am not doing a hot unplug and the replug, I use KVM and add a previously
>>> not available CPU.
>>>
>>> in libvirt/virsh speak:
>>>   4
>>
>> So that's why we run into problems. It's not present when we load the device,
>> but becomes present and online afterwards.
>>
>> Christoph, we used to handle this just fine, your patch broke it.
>>
>> I'll see if I can come up with an appropriate fix.
> 
> Can you try the below?


It does prevent the crash but it seems that the new CPU is not "used" after 
the hotplug for mq:


output with 2 cpus:
/sys/kernel/debug/block/vda
/sys/kernel/debug/block/vda/hctx0
/sys/kernel/debug/block/vda/hctx0/cpu0
/sys/kernel/debug/block/vda/hctx0/cpu0/completed
/sys/kernel/debug/block/vda/hctx0/cpu0/merged
/sys/kernel/debug/block/vda/hctx0/cpu0/dispatched
/sys/kernel/debug/block/vda/hctx0/cpu0/rq_list
/sys/kernel/debug/block/vda/hctx0/active
/sys/kernel/debug/block/vda/hctx0/run
/sys/kernel/debug/block/vda/hctx0/queued
/sys/kernel/debug/block/vda/hctx0/dispatched
/sys/kernel/debug/block/vda/hctx0/io_poll
/sys/kernel/debug/block/vda/hctx0/sched_tags_bitmap
/sys/kernel/debug/block/vda/hctx0/sched_tags
/sys/kernel/debug/block/vda/hctx0/tags_bitmap
/sys/kernel/debug/block/vda/hctx0/tags
/sys/kernel/debug/block/vda/hctx0/ctx_map
/sys/kernel/debug/block/vda/hctx0/busy
/sys/kernel/debug/block/vda/hctx0/dispatch
/sys/kernel/debug/block/vda/hctx0/flags
/sys/kernel/debug/block/vda/hctx0/state
/sys/kernel/debug/block/vda/sched
/sys/kernel/debug/block/vda/sched/dispatch
/sys/kernel/debug/block/vda/sched/starved
/sys/kernel/debug/block/vda/sched/batching
/sys/kernel/debug/block/vda/sched/write_next_rq
/sys/kernel/debug/block/vda/sched/write_fifo_list
/sys/kernel/debug/block/vda/sched/read_next_rq
/sys/kernel/debug/block/vda/sched/read_fifo_list
/sys/kernel/debug/block/vda/write_hints
/sys/kernel/debug/block/vda/state
/sys/kernel/debug/block/vda/requeue_list
/sys/kernel/debug/block/vda/poll_stat

> 
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index b600463791ec..ab3a66e7bd03 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -40,6 +40,7 @@
>  static bool blk_mq_poll(struct request_queue *q, blk_qc_t cookie);
>  static void blk_mq_poll_stats_start(struct request_queue *q);
>  static void blk_mq_poll_stats_fn(struct blk_stat_callback *cb);
> +static void blk_mq_map_swqueue(struct request_queue *q);
> 
>  static int blk_mq_poll_stats_bkt(const struct request *rq)
>  {
> @@ -1947,6 +1950,15 @@ int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, 
> struct blk_mq_tags *tags,
>   return -ENOMEM;
>  }
> 
> +static int blk_mq_hctx_notify_prepare(unsigned int cpu, struct hlist_node 
> *node)
> +{
> + struct blk_mq_hw_ctx *hctx;
> +
> + hctx = hlist_entry_safe(node, struct blk_mq_hw_ctx, cpuhp);
> + blk_mq_map_swqueue(hctx->queue);
> + return 0;
> +}
> +
>  /*
>   * 'cpu' is going away. splice any existing rq_list 
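
The debugfs listing above keeps showing only hctx0/cpu0 even though a second vCPU is now online. As a rough illustration (simplified sketch_* stand-ins, not the real blk_mq_map_swqueue()), this is the kind of ctx-to-hctx remapping that has to be rerun once a CPU shows up later:

/*
 * Illustrative sketch only, with made-up sketch_* types: each CPU is
 * attached to a hardware context by walking a CPU-to-queue map.  If
 * this walk happens only at probe time, a vCPU that becomes present
 * and online afterwards is never attached to any hctx, which is why
 * the debugfs tree above does not change.
 */
#include <linux/cpumask.h>

struct sketch_hctx {
	struct cpumask cpumask;		/* CPUs whose requests land here */
};

struct sketch_queue {
	unsigned int nr_hw_queues;
	unsigned int *mq_map;		/* cpu number -> hw queue index */
	struct sketch_hctx **hctxs;
};

static void sketch_map_swqueue(struct sketch_queue *q)
{
	unsigned int i, cpu;

	for (i = 0; i < q->nr_hw_queues; i++)
		cpumask_clear(&q->hctxs[i]->cpumask);

	/* only CPUs visited here ever show up under an hctx in debugfs */
	for_each_online_cpu(cpu)
		cpumask_set_cpu(cpu, &q->hctxs[q->mq_map[cpu]]->cpumask);
}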

Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-21 Thread Jens Axboe
On 11/21/2017 11:27 AM, Jens Axboe wrote:
> On 11/21/2017 11:12 AM, Christian Borntraeger wrote:
>>
>>
>> On 11/21/2017 07:09 PM, Jens Axboe wrote:
>>> On 11/21/2017 10:27 AM, Jens Axboe wrote:
 On 11/21/2017 03:14 AM, Christian Borntraeger wrote:
> Bisect points to
>
> 1b5a7455d345b223d3a4658a9e5fce985b7998c1 is the first bad commit
> commit 1b5a7455d345b223d3a4658a9e5fce985b7998c1
> Author: Christoph Hellwig 
> Date:   Mon Jun 26 12:20:57 2017 +0200
>
> blk-mq: Create hctx for each present CPU
> 
> commit 4b855ad37194f7bdbb200ce7a1c7051fecb56a08 upstream.
> 
> Currently we only create hctx for online CPUs, which can lead to a lot
> of churn due to frequent soft offline / online operations.  Instead
> allocate one for each present CPU to avoid this and dramatically 
> simplify
> the code.
> 
> Signed-off-by: Christoph Hellwig 
> Reviewed-by: Jens Axboe 
> Cc: Keith Busch 
> Cc: linux-bl...@vger.kernel.org
> Cc: linux-n...@lists.infradead.org
> Link: http://lkml.kernel.org/r/20170626102058.10200-3-...@lst.de
> Signed-off-by: Thomas Gleixner 
> Cc: Oleksandr Natalenko 
> Cc: Mike Galbraith 
> Signed-off-by: Greg Kroah-Hartman 

 I wonder if we're simply not getting the masks updated correctly. I'll
 take a look.
>>>
>>> Can't make it trigger here. We do init for each present CPU, which means
>>> that if I offline a few CPUs here and register a queue, those still show
>>> up as present (just offline) and get mapped accordingly.
>>>
>>> From the looks of it, your setup is different. If the CPU doesn't show
>>> up as present and it gets hotplugged, then I can see how this condition
>>> would trigger. What environment are you running this in? We might have
>>> to re-introduce the cpu hotplug notifier, right now we just monitor
>>> for a dead cpu and handle that.
>>
>> I am not doing a hot unplug and the replug, I use KVM and add a previously
>> not available CPU.
>>
>> in libvirt/virsh speak:
>>   4
> 
> So that's why we run into problems. It's not present when we load the device,
> but becomes present and online afterwards.
> 
> Christoph, we used to handle this just fine, your patch broke it.
> 
> I'll see if I can come up with an appropriate fix.

Can you try the below?


diff --git a/block/blk-mq.c b/block/blk-mq.c
index b600463791ec..ab3a66e7bd03 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -40,6 +40,7 @@
 static bool blk_mq_poll(struct request_queue *q, blk_qc_t cookie);
 static void blk_mq_poll_stats_start(struct request_queue *q);
 static void blk_mq_poll_stats_fn(struct blk_stat_callback *cb);
+static void blk_mq_map_swqueue(struct request_queue *q);
 
 static int blk_mq_poll_stats_bkt(const struct request *rq)
 {
@@ -1947,6 +1950,15 @@ int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct 
blk_mq_tags *tags,
return -ENOMEM;
 }
 
+static int blk_mq_hctx_notify_prepare(unsigned int cpu, struct hlist_node 
*node)
+{
+   struct blk_mq_hw_ctx *hctx;
+
+   hctx = hlist_entry_safe(node, struct blk_mq_hw_ctx, cpuhp);
+   blk_mq_map_swqueue(hctx->queue);
+   return 0;
+}
+
 /*
  * 'cpu' is going away. splice any existing rq_list entries from this
  * software queue to the hw queue dispatch list, and ensure that it
@@ -1958,7 +1970,7 @@ static int blk_mq_hctx_notify_dead(unsigned int cpu, 
struct hlist_node *node)
struct blk_mq_ctx *ctx;
LIST_HEAD(tmp);
 
-   hctx = hlist_entry_safe(node, struct blk_mq_hw_ctx, cpuhp_dead);
+   hctx = hlist_entry_safe(node, struct blk_mq_hw_ctx, cpuhp);
ctx = __blk_mq_get_ctx(hctx->queue, cpu);
 
spin_lock(&ctx->lock);
@@ -1981,8 +1993,7 @@ static int blk_mq_hctx_notify_dead(unsigned int cpu, 
struct hlist_node *node)
 
 static void blk_mq_remove_cpuhp(struct blk_mq_hw_ctx *hctx)
 {
-   cpuhp_state_remove_instance_nocalls(CPUHP_BLK_MQ_DEAD,
-   &hctx->cpuhp_dead);
+   cpuhp_state_remove_instance_nocalls(CPUHP_BLK_MQ_PREPARE, &hctx->cpuhp);
 }
 
 /* hctx->ctxs will be freed in queue's release handler */
@@ -2039,7 +2050,7 @@ static int blk_mq_init_hctx(struct request_queue *q,
hctx->queue = q;
hctx->flags = set->flags & ~BLK_MQ_F_TAG_SHARED;
 
-   cpuhp_state_add_instance_nocalls(CPUHP_BLK_MQ_DEAD, &hctx->cpuhp_dead);
+   cpuhp_state_add_instance_nocalls(CPUHP_BLK_MQ_PREPARE, &hctx->cpuhp);
 
hctx->tags = set->tags[hctx_idx];
 
@@ -2974,7 +2987,8 @@ static int __init blk_mq_init(void)
BUILD_BUG_ON((REQ_ATOM_STARTED / BITS_PER_BYTE) !=
(REQ_ATOM_COMPLETE / BITS_PER_BYTE));
 
-   cpuhp_setup_state_multi(CPUHP_BLK_MQ_DEAD, "block/mq:dead", NULL,
+   cpuhp_setup_state_multi(CPUHP_BLK_MQ_PREPARE, "block/mq:prepare",
+   blk_mq_hctx_notify_prepare,
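
For readers unfamiliar with the hotplug API the patch uses, here is a self-contained sketch of the multi-instance cpuhp pattern it relies on. CPUHP_BLK_MQ_PREPARE comes from the patch and is assumed to be added to enum cpuhp_state in a part of the diff not shown here; the example_* names are hypothetical:

/*
 * Sketch of the multi-instance cpuhp pattern: one callback registered
 * once, then one hlist_node instance linked in per object (per hctx in
 * the patch above).  CPUHP_BLK_MQ_PREPARE is assumed to be defined in
 * enum cpuhp_state; everything named example_* is made up.
 */
#include <linux/cpuhotplug.h>
#include <linux/list.h>

struct example_hctx {
	struct hlist_node cpuhp;	/* one instance per hardware context */
};

/* Runs at the prepare step every time a CPU is brought up. */
static int example_cpu_prepare(unsigned int cpu, struct hlist_node *node)
{
	struct example_hctx *hctx;

	hctx = hlist_entry_safe(node, struct example_hctx, cpuhp);
	/* here blk-mq would redo the ctx-to-hctx mapping for hctx's queue */
	(void)hctx;
	return 0;
}

static int example_register_state(void)
{
	/* registered once, with no teardown callback */
	return cpuhp_setup_state_multi(CPUHP_BLK_MQ_PREPARE, "block/mq:prepare",
				       example_cpu_prepare, NULL);
}

static int example_add_instance(struct example_hctx *hctx)
{
	/* _nocalls: only link the instance, do not invoke the callback now */
	return cpuhp_state_add_instance_nocalls(CPUHP_BLK_MQ_PREPARE,
						&hctx->cpuhp);
}

static void example_remove_instance(struct example_hctx *hctx)
{
	cpuhp_state_remove_instance_nocalls(CPUHP_BLK_MQ_PREPARE, &hctx->cpuhp);
}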

Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-21 Thread Jens Axboe
On 11/21/2017 11:12 AM, Christian Borntraeger wrote:
> 
> 
> On 11/21/2017 07:09 PM, Jens Axboe wrote:
>> On 11/21/2017 10:27 AM, Jens Axboe wrote:
>>> On 11/21/2017 03:14 AM, Christian Borntraeger wrote:
 Bisect points to

 1b5a7455d345b223d3a4658a9e5fce985b7998c1 is the first bad commit
 commit 1b5a7455d345b223d3a4658a9e5fce985b7998c1
 Author: Christoph Hellwig 
 Date:   Mon Jun 26 12:20:57 2017 +0200

 blk-mq: Create hctx for each present CPU
 
 commit 4b855ad37194f7bdbb200ce7a1c7051fecb56a08 upstream.
 
 Currently we only create hctx for online CPUs, which can lead to a lot
 of churn due to frequent soft offline / online operations.  Instead
 allocate one for each present CPU to avoid this and dramatically 
 simplify
 the code.
 
 Signed-off-by: Christoph Hellwig 
 Reviewed-by: Jens Axboe 
 Cc: Keith Busch 
 Cc: linux-bl...@vger.kernel.org
 Cc: linux-n...@lists.infradead.org
 Link: http://lkml.kernel.org/r/20170626102058.10200-3-...@lst.de
 Signed-off-by: Thomas Gleixner 
 Cc: Oleksandr Natalenko 
 Cc: Mike Galbraith 
 Signed-off-by: Greg Kroah-Hartman 
>>>
>>> I wonder if we're simply not getting the masks updated correctly. I'll
>>> take a look.
>>
>> Can't make it trigger here. We do init for each present CPU, which means
>> that if I offline a few CPUs here and register a queue, those still show
>> up as present (just offline) and get mapped accordingly.
>>
>> From the looks of it, your setup is different. If the CPU doesn't show
>> up as present and it gets hotplugged, then I can see how this condition
>> would trigger. What environment are you running this in? We might have
>> to re-introduce the cpu hotplug notifier, right now we just monitor
>> for a dead cpu and handle that.
> 
> I am not doing a hot unplug and the replug, I use KVM and add a previously
> not available CPU.
> 
> in libvirt/virsh speak:
>   4

So that's why we run into problems. It's not present when we load the device,
but becomes present and online afterwards.

Christoph, we used to handle this just fine, your patch broke it.

I'll see if I can come up with an appropriate fix.

-- 
Jens Axboe



Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-21 Thread Christian Borntraeger


On 11/21/2017 07:09 PM, Jens Axboe wrote:
> On 11/21/2017 10:27 AM, Jens Axboe wrote:
>> On 11/21/2017 03:14 AM, Christian Borntraeger wrote:
>>> Bisect points to
>>>
>>> 1b5a7455d345b223d3a4658a9e5fce985b7998c1 is the first bad commit
>>> commit 1b5a7455d345b223d3a4658a9e5fce985b7998c1
>>> Author: Christoph Hellwig 
>>> Date:   Mon Jun 26 12:20:57 2017 +0200
>>>
>>> blk-mq: Create hctx for each present CPU
>>> 
>>> commit 4b855ad37194f7bdbb200ce7a1c7051fecb56a08 upstream.
>>> 
>>> Currently we only create hctx for online CPUs, which can lead to a lot
>>> of churn due to frequent soft offline / online operations.  Instead
>>> allocate one for each present CPU to avoid this and dramatically 
>>> simplify
>>> the code.
>>> 
>>> Signed-off-by: Christoph Hellwig 
>>> Reviewed-by: Jens Axboe 
>>> Cc: Keith Busch 
>>> Cc: linux-bl...@vger.kernel.org
>>> Cc: linux-n...@lists.infradead.org
>>> Link: http://lkml.kernel.org/r/20170626102058.10200-3-...@lst.de
>>> Signed-off-by: Thomas Gleixner 
>>> Cc: Oleksandr Natalenko 
>>> Cc: Mike Galbraith 
>>> Signed-off-by: Greg Kroah-Hartman 
>>
>> I wonder if we're simply not getting the masks updated correctly. I'll
>> take a look.
> 
> Can't make it trigger here. We do init for each present CPU, which means
> that if I offline a few CPUs here and register a queue, those still show
> up as present (just offline) and get mapped accordingly.
> 
> From the looks of it, your setup is different. If the CPU doesn't show
> up as present and it gets hotplugged, then I can see how this condition
> would trigger. What environment are you running this in? We might have
> to re-introduce the cpu hotplug notifier, right now we just monitor
> for a dead cpu and handle that.

I am not doing a hot unplug and the replug, I use KVM and add a previously
not available CPU.

in libvirt/virsh speak:
  4



Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-21 Thread Jens Axboe
On 11/21/2017 10:27 AM, Jens Axboe wrote:
> On 11/21/2017 03:14 AM, Christian Borntraeger wrote:
>> Bisect points to
>>
>> 1b5a7455d345b223d3a4658a9e5fce985b7998c1 is the first bad commit
>> commit 1b5a7455d345b223d3a4658a9e5fce985b7998c1
>> Author: Christoph Hellwig 
>> Date:   Mon Jun 26 12:20:57 2017 +0200
>>
>> blk-mq: Create hctx for each present CPU
>> 
>> commit 4b855ad37194f7bdbb200ce7a1c7051fecb56a08 upstream.
>> 
>> Currently we only create hctx for online CPUs, which can lead to a lot
>> of churn due to frequent soft offline / online operations.  Instead
>> allocate one for each present CPU to avoid this and dramatically simplify
>> the code.
>> 
>> Signed-off-by: Christoph Hellwig 
>> Reviewed-by: Jens Axboe 
>> Cc: Keith Busch 
>> Cc: linux-bl...@vger.kernel.org
>> Cc: linux-n...@lists.infradead.org
>> Link: http://lkml.kernel.org/r/20170626102058.10200-3-...@lst.de
>> Signed-off-by: Thomas Gleixner 
>> Cc: Oleksandr Natalenko 
>> Cc: Mike Galbraith 
>> Signed-off-by: Greg Kroah-Hartman 
> 
> I wonder if we're simply not getting the masks updated correctly. I'll
> take a look.

Can't make it trigger here. We do init for each present CPU, which means
that if I offline a few CPUs here and register a queue, those still show
up as present (just offline) and get mapped accordingly.

From the looks of it, your setup is different. If the CPU doesn't show
up as present and it gets hotplugged, then I can see how this condition
would trigger. What environment are you running this in? We might have
to re-introduce the cpu hotplug notifier, right now we just monitor
for a dead cpu and handle that.

-- 
Jens Axboe
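
To make the present-vs-hotplugged distinction above concrete, a small sketch (assumed example_* name, not the blk-mq source) of why a queue map built only from the CPUs present at probe time leaves a later-added vCPU unmapped:

/*
 * Illustrative only.  A map built with for_each_present_cpu() at probe
 * time has no entry for a vCPU that KVM makes present and online
 * afterwards, so work run on that CPU hits an hctx whose cpumask does
 * not contain it (the WARN_ON from the thread title).  Walking
 * cpu_possible_mask instead would also cover such CPUs.
 */
#include <linux/cpumask.h>

static void example_build_map(unsigned int *mq_map, unsigned int nr_queues)
{
	unsigned int cpu, queue = 0;

	for_each_present_cpu(cpu) {	/* misses CPUs that appear later */
		mq_map[cpu] = queue;
		queue = (queue + 1) % nr_queues;
	}

	/*
	 * for_each_possible_cpu(cpu) { ... } would assign a queue to
	 * every CPU the platform could ever bring up.
	 */
}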



Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-21 Thread Jens Axboe
On 11/21/2017 03:14 AM, Christian Borntraeger wrote:
> Bisect points to
> 
> 1b5a7455d345b223d3a4658a9e5fce985b7998c1 is the first bad commit
> commit 1b5a7455d345b223d3a4658a9e5fce985b7998c1
> Author: Christoph Hellwig 
> Date:   Mon Jun 26 12:20:57 2017 +0200
> 
> blk-mq: Create hctx for each present CPU
> 
> commit 4b855ad37194f7bdbb200ce7a1c7051fecb56a08 upstream.
> 
> Currently we only create hctx for online CPUs, which can lead to a lot
> of churn due to frequent soft offline / online operations.  Instead
> allocate one for each present CPU to avoid this and dramatically simplify
> the code.
> 
> Signed-off-by: Christoph Hellwig 
> Reviewed-by: Jens Axboe 
> Cc: Keith Busch 
> Cc: linux-bl...@vger.kernel.org
> Cc: linux-n...@lists.infradead.org
> Link: http://lkml.kernel.org/r/20170626102058.10200-3-...@lst.de
> Signed-off-by: Thomas Gleixner 
> Cc: Oleksandr Natalenko 
> Cc: Mike Galbraith 
> Signed-off-by: Greg Kroah-Hartman 

I wonder if we're simply not getting the masks updated correctly. I'll
take a look.

-- 
Jens Axboe



Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-21 Thread Christian Borntraeger


On 11/21/2017 10:50 AM, Christian Borntraeger wrote:
> 
> 
> On 11/21/2017 09:35 AM, Christian Borntraeger wrote:
>>
>>
>> On 11/20/2017 09:52 PM, Jens Axboe wrote:
>>> On 11/20/2017 01:49 PM, Christian Borntraeger wrote:


 On 11/20/2017 08:42 PM, Jens Axboe wrote:
> On 11/20/2017 12:29 PM, Christian Borntraeger wrote:
>>
>>
>> On 11/20/2017 08:20 PM, Bart Van Assche wrote:
>>> On Fri, 2017-11-17 at 15:42 +0100, Christian Borntraeger wrote:
 This is 

 b7a71e66d (Jens Axboe          2017-08-01 09:28:24 -0600 1141)	 * are mapped to it.
 b7a71e66d (Jens Axboe          2017-08-01 09:28:24 -0600 1142)	 */
 6a83e74d2 (Bart Van Assche     2016-11-02 10:09:51 -0600 1143)	WARN_ON(!cpumask_test_cpu(raw_smp_processor_id(), hctx->cpumask) &&
 6a83e74d2 (Bart Van Assche     2016-11-02 10:09:51 -0600 1144)		cpu_online(hctx->next_cpu));
 6a83e74d2 (Bart Van Assche     2016-11-02 10:09:51 -0600 1145)
 b7a71e66d (Jens Axboe          2017-08-01 09:28:24 -0600 1146)	/*
>>>
>>> Did you really try to figure out when the code that reported the warning
>>> was introduced? I think that warning was introduced through the 
>>> following
>>> commit:
>>
>> This was more a cut'n'paste to show which warning triggered since line 
>> numbers are somewhat volatile.
>>
>>>
>>> commit fd1270d5df6a005e1248e87042159a799cc4b2c9
>>> Date:   Wed Apr 16 09:23:48 2014 -0600
>>>
>>> blk-mq: don't use preempt_count() to check for right CPU
>>>  
>>> UP or CONFIG_PREEMPT_NONE will return 0, and what we really
>>> want to check is whether or not we are on the right CPU.
>>> So don't make PREEMPT part of this, just test the CPU in
>>> the mask directly.
>>>
>>> Anyway, I think that warning is appropriate and useful. So the next step
>>> is to figure out what work item was involved and why that work item got
>>> executed on the wrong CPU.
>>
>> It seems to be related to virtio-blk (is triggered by fio on such 
>> disks). Your comment basically
>> says: "no this is not a known issue" then :-)
>> I will try to take a dump to find out the work item
>
> blk-mq does not attempt to freeze/sync existing work if a CPU goes away,
> and we reconfigure the mappings. So I don't think the above is unexpected,
> if you are doing CPU hot unplug while running a fio job.

 I did a cpu hot plug (adding a CPU) and I started fio AFTER that.
>>>
>>> OK, that's different, we should not be triggering a warning for that.
>>> What does your machine/virtblk topology look like in terms of CPUS,
>>> nr of queues for virtblk, etc?
>>
>> FWIW, 4.11 does work, 4.12 and later is broken.
> 
> In fact: 4.12 is fine, 4.12.14 is broken.


Bisect points to

1b5a7455d345b223d3a4658a9e5fce985b7998c1 is the first bad commit
commit 1b5a7455d345b223d3a4658a9e5fce985b7998c1
Author: Christoph Hellwig 
Date:   Mon Jun 26 12:20:57 2017 +0200

blk-mq: Create hctx for each present CPU

commit 4b855ad37194f7bdbb200ce7a1c7051fecb56a08 upstream.

Currently we only create hctx for online CPUs, which can lead to a lot
of churn due to frequent soft offline / online operations.  Instead
allocate one for each present CPU to avoid this and dramatically simplify
the code.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Jens Axboe 
Cc: Keith Busch 
Cc: linux-bl...@vger.kernel.org
Cc: linux-n...@lists.infradead.org
Link: http://lkml.kernel.org/r/20170626102058.10200-3-...@lst.de
Signed-off-by: Thomas Gleixner 
Cc: Oleksandr Natalenko 
Cc: Mike Galbraith 
Signed-off-by: Greg Kroah-Hartman 

:04 04 a61cb023014a7b7a6b9f24ea04fe8ab22299e706 059ba6dc3290c74e0468937348e580cd53f963e7 M  block
:04 04 432e719d7e738ffcddfb8fc964544d3b3e0a68f7 f4572aa21b249a851a1b604c148eea109e93b30d M  include





adding Christoph FWIW, your patch triggers the following on 4.14 when doing a 
cpu hotplug (adding a
CPU) and then accessing a virtio-blk device.


  747.652408] [ cut here ]
[  747.652410] WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 
__blk_mq_run_hw_queue+0xd4/0x100
[  747.652410] Modules linked in: dm_multipath
[  747.652412] CPU: 4 PID: 2895 Comm: kworker/4:1H Tainted: GW   
4.14.0+ #191
[  747.652412] Hardware name: IBM 2964 NC9 704 (KVM/Linux)
[  747.652414] Workqueue: kblockd blk_mq_run_work_fn
[  747.652414] task: 6068 task.stack: 5ea3
[  747.652415] Krnl PSW : 0704f0018000 00505864 
(__blk_mq_run_hw_queue+0xd4/0x100)
[  747.652417]R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:3 PM:0 
RI:0 EA:3
[  747.652417] Krnl GPRS: 0010 00ff 5cbec400