Re: [PATCH] nvme-tcp: Check if request has started before processing it

2021-03-02 Thread Keith Busch
On Tue, Mar 02, 2021 at 08:18:40AM +0100, Hannes Reinecke wrote: > On 3/1/21 9:59 PM, Keith Busch wrote: > > On Mon, Mar 01, 2021 at 05:53:25PM +0100, Hannes Reinecke wrote: > >> On 3/1/21 5:05 PM, Keith Busch wrote: > >>> On Mon, Mar 01, 2021 at 02:55:30PM +0100, Hannes Reinecke wrote: > On 3

Re: [PATCH] nvme-tcp: Check if request has started before processing it

2021-03-02 Thread Hannes Reinecke
On 3/1/21 9:59 PM, Keith Busch wrote: > On Mon, Mar 01, 2021 at 05:53:25PM +0100, Hannes Reinecke wrote: >> On 3/1/21 5:05 PM, Keith Busch wrote: >>> On Mon, Mar 01, 2021 at 02:55:30PM +0100, Hannes Reinecke wrote: On 3/1/21 2:26 PM, Daniel Wagner wrote: > On Sat, Feb 27, 2021 at 02:19:01A

Re: [PATCH] nvme-tcp: Check if request has started before processing it

2021-03-01 Thread Keith Busch
On Mon, Mar 01, 2021 at 05:53:25PM +0100, Hannes Reinecke wrote: > On 3/1/21 5:05 PM, Keith Busch wrote: > > On Mon, Mar 01, 2021 at 02:55:30PM +0100, Hannes Reinecke wrote: > > > On 3/1/21 2:26 PM, Daniel Wagner wrote: > > > > On Sat, Feb 27, 2021 at 02:19:01AM +0900, Keith Busch wrote: > > > > >

Re: [PATCH] nvme-tcp: Check if request has started before processing it

2021-03-01 Thread Hannes Reinecke
On 3/1/21 5:05 PM, Keith Busch wrote: On Mon, Mar 01, 2021 at 02:55:30PM +0100, Hannes Reinecke wrote: On 3/1/21 2:26 PM, Daniel Wagner wrote: On Sat, Feb 27, 2021 at 02:19:01AM +0900, Keith Busch wrote: Crashing is bad, silent data corruption is worse. Is there truly no defense against that?

Re: [PATCH] nvme-tcp: Check if request has started before processing it

2021-03-01 Thread Keith Busch
On Mon, Mar 01, 2021 at 02:55:30PM +0100, Hannes Reinecke wrote: > On 3/1/21 2:26 PM, Daniel Wagner wrote: > > On Sat, Feb 27, 2021 at 02:19:01AM +0900, Keith Busch wrote: > >> Crashing is bad, silent data corruption is worse. Is there truly no > >> defense against that? If not, why should anyone r

Re: [PATCH] nvme-tcp: Check if request has started before processing it

2021-03-01 Thread Hannes Reinecke
On 3/1/21 2:26 PM, Daniel Wagner wrote: > On Sat, Feb 27, 2021 at 02:19:01AM +0900, Keith Busch wrote: >> Crashing is bad, silent data corruption is worse. Is there truly no >> defense against that? If not, why should anyone rely on this? > > If we receive an response for which we don't have a sta

Re: [PATCH] nvme-tcp: Check if request has started before processing it

2021-03-01 Thread Daniel Wagner
On Sat, Feb 27, 2021 at 02:19:01AM +0900, Keith Busch wrote: > Crashing is bad, silent data corruption is worse. Is there truly no > defense against that? If not, why should anyone rely on this? If we receive an response for which we don't have a started request, we know that something is wrong. C

Re: [PATCH] nvme-tcp: Check if request has started before processing it

2021-02-26 Thread Keith Busch
On Fri, Feb 26, 2021 at 05:42:46PM +0100, Hannes Reinecke wrote: > On 2/26/21 5:13 PM, Keith Busch wrote: > > > > That's just addressing a symptom. You can't fully verify the request is > > valid this way because the host could have started the same command ID > > the very moment before the code c

Re: [PATCH] nvme-tcp: Check if request has started before processing it

2021-02-26 Thread Hannes Reinecke
On 2/26/21 5:13 PM, Keith Busch wrote: On Fri, Feb 26, 2021 at 01:54:00PM +0100, Hannes Reinecke wrote: On 2/26/21 1:35 PM, Daniel Wagner wrote: On Mon, Feb 15, 2021 at 01:29:45PM -0800, Sagi Grimberg wrote: Well, I think we should probably figure out why that is happening first. I got my ha

Re: [PATCH] nvme-tcp: Check if request has started before processing it

2021-02-26 Thread Keith Busch
On Fri, Feb 26, 2021 at 01:54:00PM +0100, Hannes Reinecke wrote: > On 2/26/21 1:35 PM, Daniel Wagner wrote: > > On Mon, Feb 15, 2021 at 01:29:45PM -0800, Sagi Grimberg wrote: > > > Well, I think we should probably figure out why that is happening first. > > > > I got my hands on a tcpdump trace. I

Re: [PATCH] nvme-tcp: Check if request has started before processing it

2021-02-26 Thread Hannes Reinecke
On 2/26/21 1:35 PM, Daniel Wagner wrote: On Mon, Feb 15, 2021 at 01:29:45PM -0800, Sagi Grimberg wrote: Well, I think we should probably figure out why that is happening first. I got my hands on a tcpdump trace. I've trimmed it to this: No. Time SourceDestinatio

Re: [PATCH] nvme-tcp: Check if request has started before processing it

2021-02-26 Thread Daniel Wagner
On Mon, Feb 15, 2021 at 01:29:45PM -0800, Sagi Grimberg wrote: > Well, I think we should probably figure out why that is happening first. I got my hands on a tcpdump trace. I've trimmed it to this: No. Time SourceDestination Protocol Length Info 1 0

Re: [PATCH] nvme-tcp: Check if request has started before processing it

2021-02-16 Thread Hannes Reinecke
On 2/15/21 10:23 PM, Sagi Grimberg wrote: blk_mq_tag_to_rq() will always return a request if the command_id is in the valid range. Check if the request has been started. If we blindly process the request we might double complete a request which can be fatal. How did you get to this one? did t

Re: [PATCH] nvme-tcp: Check if request has started before processing it

2021-02-15 Thread Sagi Grimberg
blk_mq_tag_to_rq() will always return a request if the command_id is in the valid range. Check if the request has been started. If we blindly process the request we might double complete a request which can be fatal. How did you get to this one? did the controller send a completion for a comp

Re: [PATCH] nvme-tcp: Check if request has started before processing it

2021-02-15 Thread Sagi Grimberg
blk_mq_tag_to_rq() will always return a request if the command_id is in the valid range. Check if the request has been started. If we blindly process the request we might double complete a request which can be fatal. How did you get to this one? did the controller send a completion for a comp

Re: [PATCH] nvme-tcp: Check if request has started before processing it

2021-02-15 Thread Daniel Wagner
On Sat, Feb 13, 2021 at 09:46:41AM +0100, Hannes Reinecke wrote: > On 2/12/21 10:49 PM, Sagi Grimberg wrote: > > > > > > > blk_mq_tag_to_rq() will always return a request if the command_id is > > > > > in the valid range. Check if the request has been started. If we > > > > > blindly process the r

Re: [PATCH] nvme-tcp: Check if request has started before processing it

2021-02-13 Thread Hannes Reinecke
On 2/12/21 10:49 PM, Sagi Grimberg wrote: blk_mq_tag_to_rq() will always return a request if the command_id is in the valid range. Check if the request has been started. If we blindly process the request we might double complete a request which can be fatal. How did you get to this one? did t

Re: [PATCH] nvme-tcp: Check if request has started before processing it

2021-02-13 Thread Hannes Reinecke
On 2/12/21 7:17 PM, Daniel Wagner wrote: blk_mq_tag_to_rq() will always return a request if the command_id is in the valid range. Check if the request has been started. If we blindly process the request we might double complete a request which can be fatal. Signed-off-by: Daniel Wagner --- Thi

Re: [PATCH] nvme-tcp: Check if request has started before processing it

2021-02-12 Thread Sagi Grimberg
blk_mq_tag_to_rq() will always return a request if the command_id is in the valid range. Check if the request has been started. If we blindly process the request we might double complete a request which can be fatal. How did you get to this one? did the controller send a completion for a comp

Re: [PATCH] nvme-tcp: Check if request has started before processing it

2021-02-12 Thread Keith Busch
On Fri, Feb 12, 2021 at 12:58:27PM -0800, Sagi Grimberg wrote: > > blk_mq_tag_to_rq() will always return a request if the command_id is > > in the valid range. Check if the request has been started. If we > > blindly process the request we might double complete a request which > > can be fatal. >

Re: [PATCH] nvme-tcp: Check if request has started before processing it

2021-02-12 Thread Sagi Grimberg
blk_mq_tag_to_rq() will always return a request if the command_id is in the valid range. Check if the request has been started. If we blindly process the request we might double complete a request which can be fatal. How did you get to this one? did the controller send a completion for a complet