Marcel Apfelbaum <mar...@redhat.com> writes: > On 03/02/2016 11:13 AM, Markus Armbruster wrote: >> This got lost over the Christmas break, sorry. >> >> Cc'ing Marcel for additional PCI expertise. >> >> Cao jin <caoj.f...@cn.fujitsu.com> writes: >> >>> msi_init() is a supporting function in PCI device initialization, >>> in order to convert .init() to .realize(), it should be modified first. >> >> "Supporting function" doesn't imply "should use Error to report >> errors". HACKING explains: >> >> Use the simplest suitable method to communicate success / failure to >> callers. Stick to common methods: non-negative on success / -1 on >> error, non-negative / -errno, non-null / null, or Error objects. >> >> Example: when a function returns a non-null pointer on success, and it >> can fail only in one way (as far as the caller is concerned), returning >> null on failure is just fine, and certainly simpler and a lot easier on >> the eyes than propagating an Error object through an Error ** parameter. >> >> Example: when a function's callers need to report details on failure >> only the function really knows, use Error **, and set suitable errors. >> >> Do not report an error to the user when you're also returning an error >> for somebody else to handle. Leave the reporting to the place that >> consumes the error returned. >> >> As we'll see in your patch of msi.c, msi_init() can fail in multiple >> ways, and uses -errno to comunicate them. That can be okay even in >> realize(). It also reports to the user. That's what makes it >> unsuitable for realize(). >> >>> Also modify the callers >> >> Actually, you're *fixing* callers! But the bugs aren't 100% clear, yet, >> see below for details. Once we know what the bugs are, we'll want to >> reword the commit message to describe the bugs and their impact. >> >> I recommend to skip ahead to msi.c, then come back to the device models. >> >>> Bonus: add more comment for msi_init(). >>> Signed-off-by: Cao jin <caoj.f...@cn.fujitsu.com> >>> --- >>> hw/audio/intel-hda.c | 10 ++++- >>> hw/ide/ich.c | 2 +- >>> hw/net/vmxnet3.c | 13 +++--- >>> hw/pci-bridge/ioh3420.c | 7 +++- >>> hw/pci-bridge/pci_bridge_dev.c | 8 +++- >>> hw/pci-bridge/xio3130_downstream.c | 8 +++- >>> hw/pci-bridge/xio3130_upstream.c | 8 +++- >>> hw/pci/msi.c | 18 +++++++-- >>> hw/scsi/megasas.c | 15 +++++-- >>> hw/scsi/vmw_pvscsi.c | 13 ++++-- >>> hw/usb/hcd-xhci.c | 81 >>> +++++++++++++++++++++----------------- >>> hw/vfio/pci.c | 20 +++++----- >>> include/hw/pci/msi.h | 4 +- >>> 13 files changed, 135 insertions(+), 72 deletions(-) >>> > > [...] > >> Except I'm not sure the function should fail in the first place! See >> below. >> >>> +int msi_init(struct PCIDevice *dev, uint8_t offset, unsigned int >>> nr_vectors, >>> + bool msi64bit, bool msi_per_vector_mask, Error **errp) >>> { >>> unsigned int vectors_order; >>> - uint16_t flags; >>> + uint16_t flags; /* Message Control register value */ >>> uint8_t cap_size; >>> int config_offset; >>> >>> if (!msi_supported) { >>> + error_setg(errp, "MSI is not supported by interrupt controller"); >>> return -ENOTSUP; >> >> First failure mode: board does not support MSI (-ENOTSUP). >> >> Question to the PCI guys: why is this even an error? A device with >> capability MSI should work just fine in such a board. > > Hi Markus, > > Adding Jan Kiszka, maybe he can help. > > That's a fair question. Is there any history for this decision? > The board not supporting MSI has nothing to do with the capability being > there. > The HW should not change because the board doe not support it. > > The capability should be present but not active. > >> >>> } >>> >>> @@ -182,7 +190,8 @@ int msi_init(struct PCIDevice *dev, uint8_t offset, >>> } >>> >>> cap_size = msi_cap_sizeof(flags); >>> - config_offset = pci_add_capability(dev, PCI_CAP_ID_MSI, offset, >>> cap_size); >>> + config_offset = pci_add_capability2(dev, PCI_CAP_ID_MSI, offset, >>> + cap_size, errp); >> >> pci_add_capability() is a wrapper around pci_add_capability2() that >> additionally reports errors with error_report_err(). This is what makes >> it unsuitable for realize(). >> >>> if (config_offset < 0) { >>> return config_offset; >> >> Inherits failing modes from pci_add_capability2(). Two: out of PCI >> config space (-ENOSPC), and capability overlap (-EINVAL). >> >> Question for the PCI guys: how can these happen? When they happen, is >> it a programming error? > > out of PCI config space: a device emulation error, not enough room > for all its capabilities - it seems to be a programming error.
Programming error should be an assertion failure, not falling to a variant of the device the user didn't order and that might not even exist in the real world. > capability overlap: is for device assignment. This checks for a real HW > that is broke. - not a programming error. Okay. Thanks! [...]