Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-05-10 Thread Ming Lei
On Fri, May 11, 2018 at 5:05 AM, Keith Busch wrote: > On Fri, May 11, 2018 at 04:52:11AM +0800, Ming Lei wrote: >> Hi Keith, >> >> On Tue, May 8, 2018 at 11:30 PM, Keith Busch wrote: >> > On Sat, Apr 28, 2018 at 11:50:17AM +0800, Ming Lei wrote: >> >> This sync may be raced with one timed-out req

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-05-10 Thread Ming Lei
On Thu, May 10, 2018 at 04:43:57PM -0600, Keith Busch wrote: > On Fri, May 11, 2018 at 06:03:59AM +0800, Ming Lei wrote: > > Sorry, forget to mention, it isn't enough to simply sync timeout inside > > reset(). > > > > Another tricky thing is about freeze & unfreeze, now freeze is done in > > nvme

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-05-10 Thread Keith Busch
On Fri, May 11, 2018 at 06:03:59AM +0800, Ming Lei wrote: > Sorry, forget to mention, it isn't enough to simply sync timeout inside > reset(). > > Another tricky thing is about freeze & unfreeze, now freeze is done in > nvme_dev_disable(), and unfreeze is done in nvme_reset_work. That means > we

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-05-10 Thread Ming Lei
On Fri, May 11, 2018 at 5:18 AM, Keith Busch wrote: > On Fri, May 11, 2018 at 05:10:40AM +0800, Ming Lei wrote: >> On Fri, May 11, 2018 at 5:05 AM, Keith Busch >> wrote: >> > On Fri, May 11, 2018 at 04:52:11AM +0800, Ming Lei wrote: >> >> Hi Keith, >> >> >> >> On Tue, May 8, 2018 at 11:30 PM, Kei

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-05-10 Thread Ming Lei
On Thu, May 10, 2018 at 03:44:41PM -0600, Keith Busch wrote: > On Fri, May 11, 2018 at 05:24:46AM +0800, Ming Lei wrote: > > Could you share me the link? > > The diff was in this reply here: > > http://lists.infradead.org/pipermail/linux-nvme/2018-April/017019.html > > > Firstly, the previous nv

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-05-10 Thread Ming Lei
On Thu, May 10, 2018 at 03:44:41PM -0600, Keith Busch wrote: > On Fri, May 11, 2018 at 05:24:46AM +0800, Ming Lei wrote: > > Could you share me the link? > > The diff was in this reply here: > > http://lists.infradead.org/pipermail/linux-nvme/2018-April/017019.html > > > Firstly, the previous nv

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-05-10 Thread Keith Busch
On Fri, May 11, 2018 at 05:24:46AM +0800, Ming Lei wrote: > Could you share me the link? The diff was in this reply here: http://lists.infradead.org/pipermail/linux-nvme/2018-April/017019.html > Firstly, the previous nvme_sync_queues() won't work reliably, so this > patch introduces blk_unquiesc

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-05-10 Thread Ming Lei
On Thu, May 10, 2018 at 03:18:29PM -0600, Keith Busch wrote: > On Fri, May 11, 2018 at 05:10:40AM +0800, Ming Lei wrote: > > On Fri, May 11, 2018 at 5:05 AM, Keith Busch > > wrote: > > > On Fri, May 11, 2018 at 04:52:11AM +0800, Ming Lei wrote: > > >> Hi Keith, > > >> > > >> On Tue, May 8, 2018 at

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-05-10 Thread Keith Busch
On Fri, May 11, 2018 at 05:10:40AM +0800, Ming Lei wrote: > On Fri, May 11, 2018 at 5:05 AM, Keith Busch > wrote: > > On Fri, May 11, 2018 at 04:52:11AM +0800, Ming Lei wrote: > >> Hi Keith, > >> > >> On Tue, May 8, 2018 at 11:30 PM, Keith Busch wrote: > >> > On Sat, Apr 28, 2018 at 11:50:17AM +0

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-05-10 Thread Ming Lei
On Fri, May 11, 2018 at 5:05 AM, Keith Busch wrote: > On Fri, May 11, 2018 at 04:52:11AM +0800, Ming Lei wrote: >> Hi Keith, >> >> On Tue, May 8, 2018 at 11:30 PM, Keith Busch wrote: >> > On Sat, Apr 28, 2018 at 11:50:17AM +0800, Ming Lei wrote: >> >> This sync may be raced with one timed-out req

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-05-10 Thread Keith Busch
On Fri, May 11, 2018 at 04:52:11AM +0800, Ming Lei wrote: > Hi Keith, > > On Tue, May 8, 2018 at 11:30 PM, Keith Busch wrote: > > On Sat, Apr 28, 2018 at 11:50:17AM +0800, Ming Lei wrote: > >> This sync may be raced with one timed-out request, which may be handled > >> as BLK_EH_HANDLED or BLK_EH

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-05-10 Thread Ming Lei
Hi Keith, On Tue, May 8, 2018 at 11:30 PM, Keith Busch wrote: > On Sat, Apr 28, 2018 at 11:50:17AM +0800, Ming Lei wrote: >> This sync may be raced with one timed-out request, which may be handled >> as BLK_EH_HANDLED or BLK_EH_RESET_TIMER, so the above sync queues can't >> work reliably. > > Min

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-05-08 Thread Keith Busch
On Sat, Apr 28, 2018 at 11:50:17AM +0800, Ming Lei wrote: > This sync may be raced with one timed-out request, which may be handled > as BLK_EH_HANDLED or BLK_EH_RESET_TIMER, so the above sync queues can't > work reliably. Ming, As proposed, that scenario is impossible to encounter. Resetting th

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-04-30 Thread Ming Lei
On Mon, Apr 30, 2018 at 01:52:17PM -0600, Keith Busch wrote: > On Sun, Apr 29, 2018 at 05:39:52AM +0800, Ming Lei wrote: > > On Sat, Apr 28, 2018 at 9:35 PM, Keith Busch > > wrote: > > > On Sat, Apr 28, 2018 at 11:50:17AM +0800, Ming Lei wrote: > > >> > I understand how the problems are happening

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-04-30 Thread Keith Busch
On Sun, Apr 29, 2018 at 05:39:52AM +0800, Ming Lei wrote: > On Sat, Apr 28, 2018 at 9:35 PM, Keith Busch > wrote: > > On Sat, Apr 28, 2018 at 11:50:17AM +0800, Ming Lei wrote: > >> > I understand how the problems are happening a bit better now. It used > >> > to be that blk-mq would lock an expire

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-04-29 Thread Ming Lei
On Sun, Apr 29, 2018 at 10:21 AM, jianchao.wang wrote: > Hi ming > > On 04/29/2018 09:36 AM, Ming Lei wrote: >> On Sun, Apr 29, 2018 at 6:27 AM, Ming Lei wrote: >>> On Sun, Apr 29, 2018 at 5:57 AM, Ming Lei wrote: On Sat, Apr 28, 2018 at 10:00 PM, jianchao.wang wrote: > Hi ming >>

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-04-28 Thread jianchao.wang
Hi ming On 04/29/2018 09:36 AM, Ming Lei wrote: > On Sun, Apr 29, 2018 at 6:27 AM, Ming Lei wrote: >> On Sun, Apr 29, 2018 at 5:57 AM, Ming Lei wrote: >>> On Sat, Apr 28, 2018 at 10:00 PM, jianchao.wang >>> wrote: Hi ming On 04/27/2018 10:57 PM, Ming Lei wrote: > I may not un

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-04-28 Thread Ming Lei
On Sun, Apr 29, 2018 at 6:27 AM, Ming Lei wrote: > On Sun, Apr 29, 2018 at 5:57 AM, Ming Lei wrote: >> On Sat, Apr 28, 2018 at 10:00 PM, jianchao.wang >> wrote: >>> Hi ming >>> >>> On 04/27/2018 10:57 PM, Ming Lei wrote: I may not understand your point, once blk_sync_queue() returns, the >>

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-04-28 Thread Ming Lei
On Sun, Apr 29, 2018 at 5:57 AM, Ming Lei wrote: > On Sat, Apr 28, 2018 at 10:00 PM, jianchao.wang > wrote: >> Hi ming >> >> On 04/27/2018 10:57 PM, Ming Lei wrote: >>> I may not understand your point, once blk_sync_queue() returns, the >>> timer itself is deactivated, meantime the synced .nvme_t

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-04-28 Thread Ming Lei
On Sat, Apr 28, 2018 at 10:00 PM, jianchao.wang wrote: > Hi ming > > On 04/27/2018 10:57 PM, Ming Lei wrote: >> I may not understand your point, once blk_sync_queue() returns, the >> timer itself is deactivated, meantime the synced .nvme_timeout() only >> returns EH_NOT_HANDLED before the deactiva

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-04-28 Thread Ming Lei
On Sat, Apr 28, 2018 at 9:35 PM, Keith Busch wrote: > On Sat, Apr 28, 2018 at 11:50:17AM +0800, Ming Lei wrote: >> > I understand how the problems are happening a bit better now. It used >> > to be that blk-mq would lock an expired command one at a time, so when >> > we had a batch of IO timeouts,

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-04-28 Thread jianchao.wang
Hi Ming and Keith Let me detail extend more here. :) On 04/28/2018 09:35 PM, Keith Busch wrote: >> Actually there isn't the case before, even for legacy path, one .timeout() >> handles one request only. Yes, .timeout should be invoked for every timeout request and .timeout should also handle th

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-04-28 Thread jianchao.wang
Hi ming On 04/27/2018 10:57 PM, Ming Lei wrote: > I may not understand your point, once blk_sync_queue() returns, the > timer itself is deactivated, meantime the synced .nvme_timeout() only > returns EH_NOT_HANDLED before the deactivation. > > That means this timer won't be expired any more, so c

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-04-28 Thread Keith Busch
On Sat, Apr 28, 2018 at 11:50:17AM +0800, Ming Lei wrote: > > I understand how the problems are happening a bit better now. It used > > to be that blk-mq would lock an expired command one at a time, so when > > we had a batch of IO timeouts, the driver was able to complete all of > > them inside a

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-04-27 Thread Ming Lei
On Fri, Apr 27, 2018 at 11:51:57AM -0600, Keith Busch wrote: > On Thu, Apr 26, 2018 at 08:39:55PM +0800, Ming Lei wrote: > > +/* > > + * This one is called after queues are quiesced, and no in-fligh timeout > > + * and nvme interrupt handling. > > + */ > > +static void nvme_pci_cancel_request(struc

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-04-27 Thread Keith Busch
On Thu, Apr 26, 2018 at 08:39:55PM +0800, Ming Lei wrote: > +/* > + * This one is called after queues are quiesced, and no in-fligh timeout > + * and nvme interrupt handling. > + */ > +static void nvme_pci_cancel_request(struct request *req, void *data, > + bool reserved) > +{ > + /

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-04-27 Thread Ming Lei
On Fri, Apr 27, 2018 at 09:37:06AM +0800, jianchao.wang wrote: > > > On 04/26/2018 11:57 PM, Ming Lei wrote: > > Hi Jianchao, > > > > On Thu, Apr 26, 2018 at 11:07:56PM +0800, jianchao.wang wrote: > >> Hi Ming > >> > >> Thanks for your wonderful solution. :) > >> > >> On 04/26/2018 08:39 PM, Min

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-04-26 Thread jianchao.wang
On 04/26/2018 11:57 PM, Ming Lei wrote: > Hi Jianchao, > > On Thu, Apr 26, 2018 at 11:07:56PM +0800, jianchao.wang wrote: >> Hi Ming >> >> Thanks for your wonderful solution. :) >> >> On 04/26/2018 08:39 PM, Ming Lei wrote: >>> +/* >>> + * This one is called after queues are quiesced, and no in-

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-04-26 Thread Ming Lei
On Thu, Apr 26, 2018 at 11:57:22PM +0800, Ming Lei wrote: > Hi Jianchao, > > On Thu, Apr 26, 2018 at 11:07:56PM +0800, jianchao.wang wrote: > > Hi Ming > > > > Thanks for your wonderful solution. :) > > > > On 04/26/2018 08:39 PM, Ming Lei wrote: > > > +/* > > > + * This one is called after queu

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-04-26 Thread Ming Lei
Hi Jianchao, On Thu, Apr 26, 2018 at 11:07:56PM +0800, jianchao.wang wrote: > Hi Ming > > Thanks for your wonderful solution. :) > > On 04/26/2018 08:39 PM, Ming Lei wrote: > > +/* > > + * This one is called after queues are quiesced, and no in-fligh timeout > > + * and nvme interrupt handling.

Re: [PATCH 1/2] nvme: pci: simplify timeout handling

2018-04-26 Thread jianchao.wang
Hi Ming Thanks for your wonderful solution. :) On 04/26/2018 08:39 PM, Ming Lei wrote: > +/* > + * This one is called after queues are quiesced, and no in-fligh timeout > + * and nvme interrupt handling. > + */ > +static void nvme_pci_cancel_request(struct request *req, void *data, > +

[PATCH 1/2] nvme: pci: simplify timeout handling

2018-04-26 Thread Ming Lei
When one req is timed out, now nvme_timeout() handles it by the following way: nvme_dev_disable(dev, false); nvme_reset_ctrl(&dev->ctrl); return BLK_EH_HANDLED. which may introduces the following issues: 1) the following timeout on other reqs may call nvme_dev_disable() a