Re: [PATCH v4 9/9] hw/nvme: Refer to dev->exp.sriov_pf.num_vfs

2024-02-14 Thread Akihiko Odaki

On 2024/02/15 1:34, Michael S. Tsirkin wrote:

On Thu, Feb 15, 2024 at 01:07:29AM +0900, Akihiko Odaki wrote:

On 2024/02/15 0:46, Michael S. Tsirkin wrote:

On Wed, Feb 14, 2024 at 11:09:50PM +0900, Akihiko Odaki wrote:

On 2024/02/14 16:07, Michael S. Tsirkin wrote:

On Wed, Feb 14, 2024 at 02:13:47PM +0900, Akihiko Odaki wrote:

NumVFs may not equal to the current effective number of VFs because VF
Enable is cleared, NumVFs is set after VF Enable is set, or NumVFs is
greater than TotalVFs.

Fixes: 11871f53ef8e ("hw/nvme: Add support for the Virtualization Management 
command")
Signed-off-by: Akihiko Odaki 


I don't get what this is saying about VF enable.
This code will not trigger on numVFs write when VF enable is set.
Generally this commit makes no sense on its own, squash it with
the pci core change pls.


This code is meant to run when it is clearing VF Enable, and its
functionality is to change the state of VFs currently enabled so that we can
disable them.

However, NumVFs does not necessarily represent VFs currently being enabled,
and have a different value in the case described above.


Ah so in this case, if numvfs is changed and then VFs are disabled,
we will not call nvme_virt_set_state? OK, it should say this in commit log.
And then, what happens?


We will call nvme_virt_set_state() but only for VFs already enabled.


And? What does it cause? memory leak? some buffer is overrun?
the guest behaviour is illegal so it does not really
matter what we do as long as nothing too bad happens.


nvme_sriov_pre_write_ctrl() is intended to free resources allocated to 
VFs. Previously, nvme_sriov_pre_write_ctrl() used NumVFs to iterate VFs 
with resources allocated. If NumVFs is changed and then VFs are 
disabled, this iteration resulted in buffer overrun.


With this patch, the changed value of NumVFs will be ignored, and 
nvme_sriov_pre_write_ctrl() only frees resources allocated to VFs 
actually enabled, thus no buffer overrun happens.







Such cases exist
even before the earlier patches and this fix is independently meaningful.


yes but the previous patch causes a regression without this one.
squash it.


I'll move this patch before the previous patch.







---
hw/nvme/ctrl.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index f8df622fe590..daedda5d326f 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -8481,7 +8481,7 @@ static void nvme_sriov_pre_write_ctrl(PCIDevice *dev, 
uint32_t address,
NvmeSecCtrlEntry *sctrl;
uint16_t sriov_cap = dev->exp.sriov_cap;
uint32_t off = address - sriov_cap;
-int i, num_vfs;
+int i;
if (!sriov_cap) {
return;
@@ -8489,8 +8489,7 @@ static void nvme_sriov_pre_write_ctrl(PCIDevice *dev, 
uint32_t address,
if (range_covers_byte(off, len, PCI_SRIOV_CTRL)) {
if (!(val & PCI_SRIOV_CTRL_VFE)) {
-num_vfs = pci_get_word(dev->config + sriov_cap + PCI_SRIOV_NUM_VF);
-for (i = 0; i < num_vfs; i++) {
+for (i = 0; i < dev->exp.sriov_pf.num_vfs; i++) {


If the assumption you now make is that num_vfs only changes
when VFs are disabled, we should add a comment documenting this.
In fact, I think there's a nicer way to do this:

static void nvme_pci_write_config(PCIDevice *dev, uint32_t address,
uint32_t val, int len)
{
  int old_num_vfs = dev->exp.sriov_pf.num_vfs;

  pci_default_write_config(dev, address, val, len);
  pcie_cap_flr_write_config(dev, address, val, len);
  nvme_sriov_pre_write_ctrl(dev, address, val, len, old_num_vfs);
}

and now, nvme_sriov_pre_write_ctrl can compare:

if (old_num_vfs && !dev->exp.sriov_pf.num_vfs)
disable everything


this, without bothering with detail of SRIOV capability.
No?


It looks better. I'll do so in the next version.






Re: [PATCH v4 9/9] hw/nvme: Refer to dev->exp.sriov_pf.num_vfs

2024-02-14 Thread Michael S. Tsirkin
On Thu, Feb 15, 2024 at 01:07:29AM +0900, Akihiko Odaki wrote:
> On 2024/02/15 0:46, Michael S. Tsirkin wrote:
> > On Wed, Feb 14, 2024 at 11:09:50PM +0900, Akihiko Odaki wrote:
> > > On 2024/02/14 16:07, Michael S. Tsirkin wrote:
> > > > On Wed, Feb 14, 2024 at 02:13:47PM +0900, Akihiko Odaki wrote:
> > > > > NumVFs may not equal to the current effective number of VFs because VF
> > > > > Enable is cleared, NumVFs is set after VF Enable is set, or NumVFs is
> > > > > greater than TotalVFs.
> > > > > 
> > > > > Fixes: 11871f53ef8e ("hw/nvme: Add support for the Virtualization 
> > > > > Management command")
> > > > > Signed-off-by: Akihiko Odaki 
> > > > 
> > > > I don't get what this is saying about VF enable.
> > > > This code will not trigger on numVFs write when VF enable is set.
> > > > Generally this commit makes no sense on its own, squash it with
> > > > the pci core change pls.
> > > 
> > > This code is meant to run when it is clearing VF Enable, and its
> > > functionality is to change the state of VFs currently enabled so that we 
> > > can
> > > disable them.
> > > 
> > > However, NumVFs does not necessarily represent VFs currently being 
> > > enabled,
> > > and have a different value in the case described above.
> > 
> > Ah so in this case, if numvfs is changed and then VFs are disabled,
> > we will not call nvme_virt_set_state? OK, it should say this in commit log.
> > And then, what happens?
> 
> We will call nvme_virt_set_state() but only for VFs already enabled.

And? What does it cause? memory leak? some buffer is overrun?
the guest behaviour is illegal so it does not really
matter what we do as long as nothing too bad happens.

> > 
> > > Such cases exist
> > > even before the earlier patches and this fix is independently meaningful.
> > 
> > yes but the previous patch causes a regression without this one.
> > squash it.
> 
> I'll move this patch before the previous patch.
> 
> > 
> > 
> > > > 
> > > > > ---
> > > > >hw/nvme/ctrl.c | 5 ++---
> > > > >1 file changed, 2 insertions(+), 3 deletions(-)
> > > > > 
> > > > > diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> > > > > index f8df622fe590..daedda5d326f 100644
> > > > > --- a/hw/nvme/ctrl.c
> > > > > +++ b/hw/nvme/ctrl.c
> > > > > @@ -8481,7 +8481,7 @@ static void nvme_sriov_pre_write_ctrl(PCIDevice 
> > > > > *dev, uint32_t address,
> > > > >NvmeSecCtrlEntry *sctrl;
> > > > >uint16_t sriov_cap = dev->exp.sriov_cap;
> > > > >uint32_t off = address - sriov_cap;
> > > > > -int i, num_vfs;
> > > > > +int i;
> > > > >if (!sriov_cap) {
> > > > >return;
> > > > > @@ -8489,8 +8489,7 @@ static void nvme_sriov_pre_write_ctrl(PCIDevice 
> > > > > *dev, uint32_t address,
> > > > >if (range_covers_byte(off, len, PCI_SRIOV_CTRL)) {
> > > > >if (!(val & PCI_SRIOV_CTRL_VFE)) {
> > > > > -num_vfs = pci_get_word(dev->config + sriov_cap + 
> > > > > PCI_SRIOV_NUM_VF);
> > > > > -for (i = 0; i < num_vfs; i++) {
> > > > > +for (i = 0; i < dev->exp.sriov_pf.num_vfs; i++) {
> > 
> > If the assumption you now make is that num_vfs only changes
> > when VFs are disabled, we should add a comment documenting this.
> > In fact, I think there's a nicer way to do this:
> > 
> > static void nvme_pci_write_config(PCIDevice *dev, uint32_t address,
> >uint32_t val, int len)
> > {
> >  int old_num_vfs = dev->exp.sriov_pf.num_vfs;
> > 
> >  pci_default_write_config(dev, address, val, len);
> >  pcie_cap_flr_write_config(dev, address, val, len);
> >  nvme_sriov_pre_write_ctrl(dev, address, val, len, old_num_vfs);
> > }
> > 
> > and now, nvme_sriov_pre_write_ctrl can compare:
> > 
> > if (old_num_vfs && !dev->exp.sriov_pf.num_vfs)
> > disable everything
> > 
> > 
> > this, without bothering with detail of SRIOV capability.
> > No?
> 
> It looks better. I'll do so in the next version.




Re: [PATCH v4 9/9] hw/nvme: Refer to dev->exp.sriov_pf.num_vfs

2024-02-14 Thread Akihiko Odaki

On 2024/02/15 0:46, Michael S. Tsirkin wrote:

On Wed, Feb 14, 2024 at 11:09:50PM +0900, Akihiko Odaki wrote:

On 2024/02/14 16:07, Michael S. Tsirkin wrote:

On Wed, Feb 14, 2024 at 02:13:47PM +0900, Akihiko Odaki wrote:

NumVFs may not equal to the current effective number of VFs because VF
Enable is cleared, NumVFs is set after VF Enable is set, or NumVFs is
greater than TotalVFs.

Fixes: 11871f53ef8e ("hw/nvme: Add support for the Virtualization Management 
command")
Signed-off-by: Akihiko Odaki 


I don't get what this is saying about VF enable.
This code will not trigger on numVFs write when VF enable is set.
Generally this commit makes no sense on its own, squash it with
the pci core change pls.


This code is meant to run when it is clearing VF Enable, and its
functionality is to change the state of VFs currently enabled so that we can
disable them.

However, NumVFs does not necessarily represent VFs currently being enabled,
and have a different value in the case described above.


Ah so in this case, if numvfs is changed and then VFs are disabled,
we will not call nvme_virt_set_state? OK, it should say this in commit log.
And then, what happens?


We will call nvme_virt_set_state() but only for VFs already enabled.




Such cases exist
even before the earlier patches and this fix is independently meaningful.


yes but the previous patch causes a regression without this one.
squash it.


I'll move this patch before the previous patch.







---
   hw/nvme/ctrl.c | 5 ++---
   1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index f8df622fe590..daedda5d326f 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -8481,7 +8481,7 @@ static void nvme_sriov_pre_write_ctrl(PCIDevice *dev, 
uint32_t address,
   NvmeSecCtrlEntry *sctrl;
   uint16_t sriov_cap = dev->exp.sriov_cap;
   uint32_t off = address - sriov_cap;
-int i, num_vfs;
+int i;
   if (!sriov_cap) {
   return;
@@ -8489,8 +8489,7 @@ static void nvme_sriov_pre_write_ctrl(PCIDevice *dev, 
uint32_t address,
   if (range_covers_byte(off, len, PCI_SRIOV_CTRL)) {
   if (!(val & PCI_SRIOV_CTRL_VFE)) {
-num_vfs = pci_get_word(dev->config + sriov_cap + PCI_SRIOV_NUM_VF);
-for (i = 0; i < num_vfs; i++) {
+for (i = 0; i < dev->exp.sriov_pf.num_vfs; i++) {


If the assumption you now make is that num_vfs only changes
when VFs are disabled, we should add a comment documenting this.
In fact, I think there's a nicer way to do this:

static void nvme_pci_write_config(PCIDevice *dev, uint32_t address,
   uint32_t val, int len)
{
 int old_num_vfs = dev->exp.sriov_pf.num_vfs;

 pci_default_write_config(dev, address, val, len);
 pcie_cap_flr_write_config(dev, address, val, len);
 nvme_sriov_pre_write_ctrl(dev, address, val, len, old_num_vfs);
}

and now, nvme_sriov_pre_write_ctrl can compare:

if (old_num_vfs && !dev->exp.sriov_pf.num_vfs)
disable everything


this, without bothering with detail of SRIOV capability.
No?


It looks better. I'll do so in the next version.



Re: [PATCH v4 9/9] hw/nvme: Refer to dev->exp.sriov_pf.num_vfs

2024-02-14 Thread Michael S. Tsirkin
On Wed, Feb 14, 2024 at 11:09:50PM +0900, Akihiko Odaki wrote:
> On 2024/02/14 16:07, Michael S. Tsirkin wrote:
> > On Wed, Feb 14, 2024 at 02:13:47PM +0900, Akihiko Odaki wrote:
> > > NumVFs may not equal to the current effective number of VFs because VF
> > > Enable is cleared, NumVFs is set after VF Enable is set, or NumVFs is
> > > greater than TotalVFs.
> > > 
> > > Fixes: 11871f53ef8e ("hw/nvme: Add support for the Virtualization 
> > > Management command")
> > > Signed-off-by: Akihiko Odaki 
> > 
> > I don't get what this is saying about VF enable.
> > This code will not trigger on numVFs write when VF enable is set.
> > Generally this commit makes no sense on its own, squash it with
> > the pci core change pls.
> 
> This code is meant to run when it is clearing VF Enable, and its
> functionality is to change the state of VFs currently enabled so that we can
> disable them.
> 
> However, NumVFs does not necessarily represent VFs currently being enabled,
> and have a different value in the case described above.

Ah so in this case, if numvfs is changed and then VFs are disabled,
we will not call nvme_virt_set_state? OK, it should say this in commit log.
And then, what happens?

> Such cases exist
> even before the earlier patches and this fix is independently meaningful.

yes but the previous patch causes a regression without this one.
squash it.


> > 
> > > ---
> > >   hw/nvme/ctrl.c | 5 ++---
> > >   1 file changed, 2 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> > > index f8df622fe590..daedda5d326f 100644
> > > --- a/hw/nvme/ctrl.c
> > > +++ b/hw/nvme/ctrl.c
> > > @@ -8481,7 +8481,7 @@ static void nvme_sriov_pre_write_ctrl(PCIDevice 
> > > *dev, uint32_t address,
> > >   NvmeSecCtrlEntry *sctrl;
> > >   uint16_t sriov_cap = dev->exp.sriov_cap;
> > >   uint32_t off = address - sriov_cap;
> > > -int i, num_vfs;
> > > +int i;
> > >   if (!sriov_cap) {
> > >   return;
> > > @@ -8489,8 +8489,7 @@ static void nvme_sriov_pre_write_ctrl(PCIDevice 
> > > *dev, uint32_t address,
> > >   if (range_covers_byte(off, len, PCI_SRIOV_CTRL)) {
> > >   if (!(val & PCI_SRIOV_CTRL_VFE)) {
> > > -num_vfs = pci_get_word(dev->config + sriov_cap + 
> > > PCI_SRIOV_NUM_VF);
> > > -for (i = 0; i < num_vfs; i++) {
> > > +for (i = 0; i < dev->exp.sriov_pf.num_vfs; i++) {

If the assumption you now make is that num_vfs only changes
when VFs are disabled, we should add a comment documenting this.
In fact, I think there's a nicer way to do this:

static void nvme_pci_write_config(PCIDevice *dev, uint32_t address,
  uint32_t val, int len)
{
int old_num_vfs = dev->exp.sriov_pf.num_vfs;

pci_default_write_config(dev, address, val, len);
pcie_cap_flr_write_config(dev, address, val, len);
nvme_sriov_pre_write_ctrl(dev, address, val, len, old_num_vfs);
}

and now, nvme_sriov_pre_write_ctrl can compare:

if (old_num_vfs && !dev->exp.sriov_pf.num_vfs)
disable everything


this, without bothering with detail of SRIOV capability.
No?



> > >   sctrl = >sec_ctrl_list.sec[i];
> > >   nvme_virt_set_state(n, le16_to_cpu(sctrl->scid), false);
> > >   }
> > > 
> > > -- 
> > > 2.43.0
> > 




Re: [PATCH v4 9/9] hw/nvme: Refer to dev->exp.sriov_pf.num_vfs

2024-02-14 Thread Akihiko Odaki

On 2024/02/14 16:07, Michael S. Tsirkin wrote:

On Wed, Feb 14, 2024 at 02:13:47PM +0900, Akihiko Odaki wrote:

NumVFs may not equal to the current effective number of VFs because VF
Enable is cleared, NumVFs is set after VF Enable is set, or NumVFs is
greater than TotalVFs.

Fixes: 11871f53ef8e ("hw/nvme: Add support for the Virtualization Management 
command")
Signed-off-by: Akihiko Odaki 


I don't get what this is saying about VF enable.
This code will not trigger on numVFs write when VF enable is set.
Generally this commit makes no sense on its own, squash it with
the pci core change pls.


This code is meant to run when it is clearing VF Enable, and its 
functionality is to change the state of VFs currently enabled so that we 
can disable them.


However, NumVFs does not necessarily represent VFs currently being 
enabled, and have a different value in the case described above. Such 
cases exist even before the earlier patches and this fix is 
independently meaningful.





---
  hw/nvme/ctrl.c | 5 ++---
  1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index f8df622fe590..daedda5d326f 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -8481,7 +8481,7 @@ static void nvme_sriov_pre_write_ctrl(PCIDevice *dev, 
uint32_t address,
  NvmeSecCtrlEntry *sctrl;
  uint16_t sriov_cap = dev->exp.sriov_cap;
  uint32_t off = address - sriov_cap;
-int i, num_vfs;
+int i;
  
  if (!sriov_cap) {

  return;
@@ -8489,8 +8489,7 @@ static void nvme_sriov_pre_write_ctrl(PCIDevice *dev, 
uint32_t address,
  
  if (range_covers_byte(off, len, PCI_SRIOV_CTRL)) {

  if (!(val & PCI_SRIOV_CTRL_VFE)) {
-num_vfs = pci_get_word(dev->config + sriov_cap + PCI_SRIOV_NUM_VF);
-for (i = 0; i < num_vfs; i++) {
+for (i = 0; i < dev->exp.sriov_pf.num_vfs; i++) {
  sctrl = >sec_ctrl_list.sec[i];
  nvme_virt_set_state(n, le16_to_cpu(sctrl->scid), false);
  }

--
2.43.0






Re: [PATCH v4 9/9] hw/nvme: Refer to dev->exp.sriov_pf.num_vfs

2024-02-13 Thread Michael S. Tsirkin
On Wed, Feb 14, 2024 at 02:13:47PM +0900, Akihiko Odaki wrote:
> NumVFs may not equal to the current effective number of VFs because VF
> Enable is cleared, NumVFs is set after VF Enable is set, or NumVFs is
> greater than TotalVFs.
> 
> Fixes: 11871f53ef8e ("hw/nvme: Add support for the Virtualization Management 
> command")
> Signed-off-by: Akihiko Odaki 

I don't get what this is saying about VF enable.
This code will not trigger on numVFs write when VF enable is set.
Generally this commit makes no sense on its own, squash it with
the pci core change pls.

> ---
>  hw/nvme/ctrl.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> index f8df622fe590..daedda5d326f 100644
> --- a/hw/nvme/ctrl.c
> +++ b/hw/nvme/ctrl.c
> @@ -8481,7 +8481,7 @@ static void nvme_sriov_pre_write_ctrl(PCIDevice *dev, 
> uint32_t address,
>  NvmeSecCtrlEntry *sctrl;
>  uint16_t sriov_cap = dev->exp.sriov_cap;
>  uint32_t off = address - sriov_cap;
> -int i, num_vfs;
> +int i;
>  
>  if (!sriov_cap) {
>  return;
> @@ -8489,8 +8489,7 @@ static void nvme_sriov_pre_write_ctrl(PCIDevice *dev, 
> uint32_t address,
>  
>  if (range_covers_byte(off, len, PCI_SRIOV_CTRL)) {
>  if (!(val & PCI_SRIOV_CTRL_VFE)) {
> -num_vfs = pci_get_word(dev->config + sriov_cap + 
> PCI_SRIOV_NUM_VF);
> -for (i = 0; i < num_vfs; i++) {
> +for (i = 0; i < dev->exp.sriov_pf.num_vfs; i++) {
>  sctrl = >sec_ctrl_list.sec[i];
>  nvme_virt_set_state(n, le16_to_cpu(sctrl->scid), false);
>  }
> 
> -- 
> 2.43.0