Re: [PATCH 3/6] nvme: Move all IO out of controller reset

2018-05-22 Thread Keith Busch
On Tue, May 22, 2018 at 09:46:32AM +0800, Ming Lei wrote: > It isn't a good practice to rely on timing for avoiding race, IMO. Sure thing, and it's easy enough to avoid this one. > OK, please fix other issues mentioned in the following comment together, > then I will review further and see if

Re: [PATCH 3/6] nvme: Move all IO out of controller reset

2018-05-21 Thread Ming Lei
On Mon, May 21, 2018 at 10:23:55AM -0600, Keith Busch wrote: > On Tue, May 22, 2018 at 12:04:53AM +0800, Ming Lei wrote: > > On Mon, May 21, 2018 at 09:44:33AM -0600, Keith Busch wrote: > > > On Mon, May 21, 2018 at 11:34:27PM +0800, Ming Lei wrote: > > > > nvme_dev_disable() quiesces queues first

Re: [PATCH 3/6] nvme: Move all IO out of controller reset

2018-05-21 Thread Keith Busch
On Tue, May 22, 2018 at 12:04:53AM +0800, Ming Lei wrote: > On Mon, May 21, 2018 at 09:44:33AM -0600, Keith Busch wrote: > > On Mon, May 21, 2018 at 11:34:27PM +0800, Ming Lei wrote: > > > nvme_dev_disable() quiesces queues first before killing queues. > > > > > > If queues are quiesced during or

Re: [PATCH 3/6] nvme: Move all IO out of controller reset

2018-05-21 Thread Ming Lei
On Mon, May 21, 2018 at 09:44:33AM -0600, Keith Busch wrote: > On Mon, May 21, 2018 at 11:34:27PM +0800, Ming Lei wrote: > > nvme_dev_disable() quiesces queues first before killing queues. > > > > If queues are quiesced during or before nvme_wait_freeze() is run > > from the 2nd part of reset,

Re: [PATCH 3/6] nvme: Move all IO out of controller reset

2018-05-21 Thread Keith Busch
On Mon, May 21, 2018 at 11:34:27PM +0800, Ming Lei wrote: > nvme_dev_disable() quiesces queues first before killing queues. > > If queues are quiesced during or before nvme_wait_freeze() is run > from the 2nd part of reset, the 2nd part can't move on, and IO hang > is caused. Finally no reset can

Re: [PATCH 3/6] nvme: Move all IO out of controller reset

2018-05-21 Thread Ming Lei
On Mon, May 21, 2018 at 09:03:31AM -0600, Keith Busch wrote: > On Mon, May 21, 2018 at 10:58:51PM +0800, Ming Lei wrote: > > On Mon, May 21, 2018 at 08:22:19AM -0600, Keith Busch wrote: > > > On Sat, May 19, 2018 at 07:03:58AM +0800, Ming Lei wrote: > > > > On Fri, May 18, 2018 at 10:38:20AM

Re: [PATCH 3/6] nvme: Move all IO out of controller reset

2018-05-21 Thread Keith Busch
On Mon, May 21, 2018 at 10:58:51PM +0800, Ming Lei wrote: > On Mon, May 21, 2018 at 08:22:19AM -0600, Keith Busch wrote: > > On Sat, May 19, 2018 at 07:03:58AM +0800, Ming Lei wrote: > > > On Fri, May 18, 2018 at 10:38:20AM -0600, Keith Busch wrote: > > > > + > > > > + if (unfreeze) > > > >

Re: [PATCH 3/6] nvme: Move all IO out of controller reset

2018-05-21 Thread Ming Lei
On Mon, May 21, 2018 at 08:22:19AM -0600, Keith Busch wrote: > On Sat, May 19, 2018 at 07:03:58AM +0800, Ming Lei wrote: > > On Fri, May 18, 2018 at 10:38:20AM -0600, Keith Busch wrote: > > > + > > > + if (unfreeze) > > > + nvme_wait_freeze(>ctrl); > > > + > > > > timeout may comes just

Re: [PATCH 3/6] nvme: Move all IO out of controller reset

2018-05-21 Thread Keith Busch
On Sat, May 19, 2018 at 07:03:58AM +0800, Ming Lei wrote: > On Fri, May 18, 2018 at 10:38:20AM -0600, Keith Busch wrote: > > + > > + if (unfreeze) > > + nvme_wait_freeze(>ctrl); > > + > > timeout may comes just before blk_mq_update_nr_hw_queues() or > the above nvme_wait_freeze(),

Re: [PATCH 3/6] nvme: Move all IO out of controller reset

2018-05-18 Thread Ming Lei
On Fri, May 18, 2018 at 10:38:20AM -0600, Keith Busch wrote: > IO may be retryable, so don't wait for them in the reset path. These > commands may trigger a reset if that IO expires without a completion, > placing it on the requeue list. Waiting for these would then deadlock > the reset handler. >

[PATCH 3/6] nvme: Move all IO out of controller reset

2018-05-18 Thread Keith Busch
IO may be retryable, so don't wait for them in the reset path. These commands may trigger a reset if that IO expires without a completion, placing it on the requeue list. Waiting for these would then deadlock the reset handler. To fix the theoretical deadlock, this patch unblocks IO submission