On Feb 21 00:33, Akihiko Odaki wrote: > On 2024/02/20 23:53, Kevin Wolf wrote: > > Am 20.02.2024 um 15:29 hat Kevin Wolf geschrieben: > > > Am 20.02.2024 um 13:24 hat Akihiko Odaki geschrieben: > > > > nvme_sriov_pre_write_ctrl() used to directly inspect SR-IOV > > > > configurations to know the number of VFs being disabled due to SR-IOV > > > > configuration writes, but the logic was flawed and resulted in > > > > out-of-bound memory access. > > > > > > > > It assumed PCI_SRIOV_NUM_VF always has the number of currently enabled > > > > VFs, but it actually doesn't in the following cases: > > > > - PCI_SRIOV_NUM_VF has been set but PCI_SRIOV_CTRL_VFE has never been. > > > > - PCI_SRIOV_NUM_VF was written after PCI_SRIOV_CTRL_VFE was set. > > > > - VFs were only partially enabled because of realization failure. > > > > > > > > It is a responsibility of pcie_sriov to interpret SR-IOV configurations > > > > and pcie_sriov does it correctly, so use pcie_sriov_num_vfs(), which it > > > > provides, to get the number of enabled VFs before and after SR-IOV > > > > configuration writes. > > > > > > > > Cc: qemu-sta...@nongnu.org > > > > Fixes: CVE-2024-26328 > > > > Fixes: 11871f53ef8e ("hw/nvme: Add support for the Virtualization > > > > Management command") > > > > Suggested-by: Michael S. Tsirkin <m...@redhat.com> > > > > Signed-off-by: Akihiko Odaki <akihiko.od...@daynix.com> > > > > --- > > > > hw/nvme/ctrl.c | 26 ++++++++------------------ > > > > 1 file changed, 8 insertions(+), 18 deletions(-) > > > > > > > > diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c > > > > index f026245d1e9e..7a56e7b79b4d 100644 > > > > --- a/hw/nvme/ctrl.c > > > > +++ b/hw/nvme/ctrl.c > > > > @@ -8466,36 +8466,26 @@ static void nvme_pci_reset(DeviceState *qdev) > > > > nvme_ctrl_reset(n, NVME_RESET_FUNCTION); > > > > } > > > > -static void nvme_sriov_pre_write_ctrl(PCIDevice *dev, uint32_t address, > > > > - uint32_t val, int len) > > > > +static void nvme_sriov_post_write_config(PCIDevice *dev, uint16_t > > > > old_num_vfs) > > > > { > > > > NvmeCtrl *n = NVME(dev); > > > > NvmeSecCtrlEntry *sctrl; > > > > - uint16_t sriov_cap = dev->exp.sriov_cap; > > > > - uint32_t off = address - sriov_cap; > > > > - int i, num_vfs; > > > > + int i; > > > > - if (!sriov_cap) { > > > > - return; > > > > - } > > > > - > > > > - if (range_covers_byte(off, len, PCI_SRIOV_CTRL)) { > > > > - if (!(val & PCI_SRIOV_CTRL_VFE)) { > > > > - num_vfs = pci_get_word(dev->config + sriov_cap + > > > > PCI_SRIOV_NUM_VF); > > > > - for (i = 0; i < num_vfs; i++) { > > > > - sctrl = &n->sec_ctrl_list.sec[i]; > > > > - nvme_virt_set_state(n, le16_to_cpu(sctrl->scid), > > > > false); > > > > - } > > > > - } > > > > + for (i = pcie_sriov_num_vfs(dev); i < old_num_vfs; i++) { > > > > + sctrl = &n->sec_ctrl_list.sec[i]; > > > > + nvme_virt_set_state(n, le16_to_cpu(sctrl->scid), false); > > > > } > > > > } > > > > > > Maybe I'm missing something, but if the concern is that 'i' could run > > > beyond the end of the array, I don't see anything that limits > > > pcie_sriov_num_vfs() to the static size of 127 that n->sec_ctrl_list.sec > > > has. register_vfs() seems to just take whatever 16 bit value the guest > > > wrote without imposing additional restrictions. > > > > > > If there is some mechanism that makes register_vf() fail if we exceed > > > the limit, maybe an assertion with a comment would be in order because > > > it doesn't seem obvious. I couldn't find any code that enforces it, > > > sriov_max_vfs only ever seems to be used as a hint for the guest. > > > > > > If not, do we need another check that fails gracefully in the error > > > case? > > > > Ok, I see now that patch 2 fixes this. But then the commit message is > > wrong because it implies that this patch is the only thing you need to > > fix the problem with nvme. You can't say "Fixes: CVE-2024-26328" if half > > of the fix is still missing. > > I didn't assign CVE-2024-26328 for the case that the value of > PCI_SRIOV_NUM_VF is greater than PCI_SRIOV_TOTAL_VF; it's what > CVE-2024-26327 deals with. > > The problem I dealt here is that the value of PCI_SRIOV_NUM_VF may not > represent the actual number of enabled VFs because another register > (PCI_SRIOV_CTRL_VFE) is not set, for example. > > If an assertion to be added, I think it should be in pcie_sriov_num_vfs() > and ensure the returning value is less than the value of PCI_SRIOV_TOTAL_VF > (aka sriov_max_vfs in hw/nvme/ctrl.c), but I think it's fine without it. > > > > > Maybe asserting old_num_vfs < n->sec_ctrl_list.numcntl would still be a > > good idea. But looking at this one, it seems to me that numcntl isn't > > completely correct either: > > > > list->numcntl = cpu_to_le16(max_vfs); > > > > Both list->numcntl and max_vfs are uint8_t, so I think this will always > > be 0 on big endian hosts? > > Indeed it looks wrong. Will you write a patch? >
I'll fix it. And give the SR-IOV parts of hw/nvme some love all around.
signature.asc
Description: PGP signature