Re: [PATCH] nvme: validate cntlid's only for nvme >= 1.1.0
On Tue, Jun 30, 2020 at 04:01:45PM +0200, Maximilian Heyne wrote: > On 6/30/20 3:36 PM, Christoph Hellwig wrote: > > And actually - 1.0 did not have the concept of a subsystem. So having > > a duplicate serial number for a 1.0 controller actually is a pretty > > nasty bug. Can you point me to this broken controller? Do you think > > the OEM could fix it up to report a proper version number and controller > > ID? > > > > I meant that the VF NVMe controllers will all land in the same subsystem from > the kernel's point of view, because, as you said, there was no idea of > different > subsystems in the 1.0 spec. Each controller should have landed in its own subsystem in this case rather than the same subsystem. > It's an older in-house controller. Seems to set the same serial number for all > VF's. Should the firmware set unique serials for the VF's instead? Yes, the driver shouldn't be finding duplicate serial numbers.
Re: [PATCH] nvme: validate cntlid's only for nvme >= 1.1.0
On 6/30/20 3:36 PM, Christoph Hellwig wrote: On Tue, Jun 30, 2020 at 03:33:58PM +0200, Christoph Hellwig wrote: On Tue, Jun 30, 2020 at 12:29:23PM +, Maximilian Heyne wrote: Controller ID's (cntlid) for NVMe devices were introduced in version 1.1.0 of the specification. Controllers that follow the older 1.0.0 spec don't set this field so it doesn't make sense to validate it. On the contrary, when using SR-IOV this check breaks VFs as they are all part of the same NVMe subsystem. Signed-off-by: Maximilian Heyne Cc: # 5.4+ The first hunk looks ok, the second doesn't make sense as fabrics was only added with NVMe 1.2.2. I can fix it up when applying if you are ok with that. I'd be totally ok with that. But you guys really shouldn't be doing SR-IOV with 1.0 controllers independent of this.. So far it worked... And actually - 1.0 did not have the concept of a subsystem. So having a duplicate serial number for a 1.0 controller actually is a pretty nasty bug. Can you point me to this broken controller? Do you think the OEM could fix it up to report a proper version number and controller ID? I meant that the VF NVMe controllers will all land in the same subsystem from the kernel's point of view, because, as you said, there was no idea of different subsystems in the 1.0 spec. It's an older in-house controller. Seems to set the same serial number for all VF's. Should the firmware set unique serials for the VF's instead? Thanks Max Amazon Development Center Germany GmbH Krausenstr. 38 10117 Berlin Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B Sitz: Berlin Ust-ID: DE 289 237 879
Re: [PATCH] nvme: validate cntlid's only for nvme >= 1.1.0
On Tue, Jun 30, 2020 at 03:33:58PM +0200, Christoph Hellwig wrote: > On Tue, Jun 30, 2020 at 12:29:23PM +, Maximilian Heyne wrote: > > Controller ID's (cntlid) for NVMe devices were introduced in version > > 1.1.0 of the specification. Controllers that follow the older 1.0.0 spec > > don't set this field so it doesn't make sense to validate it. On the > > contrary, when using SR-IOV this check breaks VFs as they are all part > > of the same NVMe subsystem. > > > > Signed-off-by: Maximilian Heyne > > Cc: # 5.4+ > > The first hunk looks ok, the second doesn't make sense as fabrics > was only added with NVMe 1.2.2. I can fix it up when applying if you > are ok with that. > > But you guys really shouldn't be doing SR-IOV with 1.0 controllers > independent of this.. And actually - 1.0 did not have the concept of a subsystem. So having a duplicate serial number for a 1.0 controller actually is a pretty nasty bug. Can you point me to this broken controller? Do you think the OEM could fix it up to report a proper version number and controller ID?
Re: [PATCH] nvme: validate cntlid's only for nvme >= 1.1.0
On Tue, Jun 30, 2020 at 12:29:23PM +, Maximilian Heyne wrote: > Controller ID's (cntlid) for NVMe devices were introduced in version > 1.1.0 of the specification. Controllers that follow the older 1.0.0 spec > don't set this field so it doesn't make sense to validate it. On the > contrary, when using SR-IOV this check breaks VFs as they are all part > of the same NVMe subsystem. > > Signed-off-by: Maximilian Heyne > Cc: # 5.4+ The first hunk looks ok, the second doesn't make sense as fabrics was only added with NVMe 1.2.2. I can fix it up when applying if you are ok with that. But you guys really shouldn't be doing SR-IOV with 1.0 controllers independent of this..