Re: [PATCH] nvme: unquiesce the queue before cleaup it

2018-04-28 Thread jianchao.wang
Hi Max

On 04/27/2018 04:51 PM, jianchao.wang wrote:
> Hi Max
>
> On 04/26/2018 06:23 PM, Max Gurtovoy wrote:
>> Hi Jianchao,
>> I actually tried this scenario with real HW and was able to repro the hang.
>> Unfortunately, after applying your patch I got NULL deref:
>> BUG: unable to handle kernel

Re: [PATCH] nvme: unquiesce the queue before cleaup it

2018-04-27 Thread jianchao.wang
Hi Max

On 04/26/2018 06:23 PM, Max Gurtovoy wrote:
> Hi Jianchao,
> I actually tried this scenario with real HW and was able to repro the hang.
> Unfortunately, after applying your patch I got NULL deref:
> BUG: unable to handle kernel NULL pointer dereference at 0014
> WARNING: CPU: 5

Re: [PATCH] nvme: unquiesce the queue before cleaup it

2018-04-26 Thread jianchao.wang
Hi Max

I did a similar test on nvme-rdma, where the underlying fabric is soft-RoCE.
It ran an IO loop, a reset controller loop and a delete/create controller loop,
and hit the IO hang below:

[ 230.884590] WARNING: CPU: 0 PID: 150 at /home/will/u04/source_code/linux-stable/drivers/nvme/host/rdma.c:1755

Re: [PATCH] nvme: unquiesce the queue before cleaup it

2018-04-22 Thread jianchao.wang
Hi Max

That's really appreciated!
Here is my test script.

loop_reset_controller.sh

#!/bin/bash
while true
do
	echo 1 > /sys/block/nvme0n1/device/reset_controller
	sleep 1
done

loop_unbind_driver.sh

#!/bin/bash
while true
do
	echo ":02:00.0" >

Re: [PATCH] nvme: unquiesce the queue before cleaup it

2018-04-22 Thread jianchao.wang
Hi Max

No, I only tested it on the PCIe one. Sorry that I didn't state that.

Thanks
Jianchao

On 04/22/2018 10:18 PM, Max Gurtovoy wrote:
> Hi Jianchao,
> Since this patch is in the core, have you tested it using some fabrics drivers
> too? RDMA/FC?
>
> thanks,
> Max.
>
> On 4/22/2018

Re: [PATCH] nvme: unquiesce the queue before cleaup it

2018-04-22 Thread jianchao.wang
Hi Keith

Would you please take a look at this patch?

This issue can be reproduced easily by running a driver bind/unbind loop,
a reset loop and an IO loop at the same time.

Thanks
Jianchao

On 04/19/2018 04:29 PM, Jianchao Wang wrote:
> There is race between nvme_remove and nvme_reset_work that can

[PATCH] nvme: unquiesce the queue before cleaup it

2018-04-19 Thread Jianchao Wang
There is a race between nvme_remove and nvme_reset_work that can lead to an IO hang.

nvme_remove                     nvme_reset_work
-> change state to DELETING
                                -> fail to change state to LIVE
                                -> nvme_remove_dead_ctrl
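
The interleaving above can be sketched as a toy state machine. This is only an illustration of the ordering described in the patch text, not kernel code: the `ToyCtrl` class, the `ALLOWED_FROM` table and `simulate_race` are hypothetical names, and the transition table is a simplified assumption rather than the driver's real one.

```python
from enum import Enum, auto


class CtrlState(Enum):
    LIVE = auto()
    RESETTING = auto()
    DELETING = auto()


# Hypothetical, simplified transition table (an assumption for illustration,
# not the kernel's actual table): new state -> states it may be entered from.
ALLOWED_FROM = {
    CtrlState.LIVE: {CtrlState.RESETTING},
    CtrlState.RESETTING: {CtrlState.LIVE},
    CtrlState.DELETING: {CtrlState.LIVE, CtrlState.RESETTING},
}


class ToyCtrl:
    """Toy controller that only tracks its state."""

    def __init__(self, state):
        self.state = state

    def change_state(self, new_state):
        """Apply the transition if the table allows it; return True on success."""
        if self.state in ALLOWED_FROM[new_state]:
            self.state = new_state
            return True
        return False


def simulate_race():
    """Replay the ordering from the patch description on the toy controller."""
    ctrl = ToyCtrl(CtrlState.RESETTING)  # reset work is already in flight

    # Remove path: change state to DELETING.
    remove_ok = ctrl.change_state(CtrlState.DELETING)

    # Reset path: the attempt to go LIVE now fails because the state is DELETING.
    live_ok = ctrl.change_state(CtrlState.LIVE)

    # In the description, this failure is what leads to nvme_remove_dead_ctrl.
    took_dead_ctrl_path = not live_ok
    return remove_ok, live_ok, took_dead_ctrl_path, ctrl.state


if __name__ == "__main__":
    print(simulate_race())
```

The point of the sketch is only that once the remove path wins the state change, the reset path can no longer reach LIVE and falls into its failure branch while the remove path is still running.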
