Hi Claudio,

Thank you for your review. Please see my in-line comments.

On Mon, Nov 24, 2014 at 11:34 AM, Claudio Fontana
<claudio.font...@huawei.com> wrote:
> On 21.11.2014 19:07, Alvise Rigo wrote:
>> Add a generic PCI host controller for virtual platforms, based on the
>> previous work by Rob Herring:
>> http://lists.gnu.org/archive/html/qemu-devel/2014-06/msg03482.html
>
> Would it not be better to add rob herring's signoff, and then the explanation
> of what you changed from his patch, followed by your signoff?
> Or did you change so much that you had to redo the original work by Rob 
> Herring
> from scratch?

It was actually difficult to create some meaningful patches over the
initial ones.
It's sure that for a final version these details will be properly
addressed (signoff and diff over the original patches if possible).

>
>>
>> The controller creates a PCI bus for PCI devices; it is intended to
>> receive from the platform all the needed information to initiate the
>> memory regions expected by the PCI system. Due to this dependence, a
>> header file is used to define the structure that the platform has to
>> fill with the data needed by the controller. The device provides also a
>> routine for the device tree node generation, which mostly has to take
>> care of the interrupt-map property generation. This property is fetched
>> by the guest operating system to map any PCI interrupt to the interrupt
>> controller.
>
>> For the time being, the device expects a GIC v2 to be used
>> by the guest.
>
> That's a pretty big limitation for a "generic" controller.
> If it's only about the size of a parent interrupt cell or something like
> that, why don't we provide it as part of the platform description that
> you pass to the machinery anyway?

Yes we can, however being totally interrupt controller independent
means that the platform has to fully provide the way to describe a
parent interrupt.
For the time being, we use a description compatible with GICv2 (and
v3), to cover the use case mach-virt + generic-pci (this is why I also
called the interrupt specific structures "gic*").
If we really need to be more general, I can see two solutions: the
first one is to find a way to pass all the necessary interrupt
controller information to the device machinery (phandle, #cells,
interrupt number, etc.).
I don't know how such a solution could get complicated in the future,
with some other virt-platform that requires a complicated interrupt
mapping for example.
The second would be to move all the device node generation back to the
platform, making its code even more crowded.
Are there other solutions that I'm missing?

>
>> Only mach-virt has been used to test the controller.
>>
>> Signed-off-by: Alvise Rigo <a.r...@virtualopensystems.com>
>> ---
>>  hw/pci-host/Makefile.objs         |   2 +-
>>  hw/pci-host/generic-pci.c         | 280 
>> ++++++++++++++++++++++++++++++++++++++
>>  include/hw/pci-host/generic-pci.h |  74 ++++++++++
>>  3 files changed, 355 insertions(+), 1 deletion(-)
>>  create mode 100644 hw/pci-host/generic-pci.c
>>  create mode 100644 include/hw/pci-host/generic-pci.h
>>
>> diff --git a/hw/pci-host/Makefile.objs b/hw/pci-host/Makefile.objs
>> index bb65f9c..8ef9fac 100644
>> --- a/hw/pci-host/Makefile.objs
>> +++ b/hw/pci-host/Makefile.objs
>> @@ -1,4 +1,4 @@
>> -common-obj-y += pam.o
>> +common-obj-y += pam.o generic-pci.o
>>
>>  # PPC devices
>>  common-obj-$(CONFIG_PREP_PCI) += prep.o
>> diff --git a/hw/pci-host/generic-pci.c b/hw/pci-host/generic-pci.c
>> new file mode 100644
>> index 0000000..9c90263
>> --- /dev/null
>> +++ b/hw/pci-host/generic-pci.c
>> @@ -0,0 +1,280 @@
>> +/*
>> + * Generic PCI host controller
>> + *
>> + * Copyright (c) 2014 Linaro, Ltd.
>> + * Author: Rob Herring <rob.herr...@linaro.org>
>> + *
>> + * Based on ARM Versatile PCI controller (hw/pci-host/versatile.c):
>> + * Copyright (c) 2006-2009 CodeSourcery.
>> + * Written by Paul Brook
>> + *
>> + * This code is licensed under the LGPL.
>> + */
>> +
>> +#include "hw/sysbus.h"
>> +#include "hw/pci-host/generic-pci.h"
>> +#include "exec/address-spaces.h"
>> +#include "sysemu/device_tree.h"
>> +
>> +static const VMStateDescription pci_generic_host_vmstate = {
>> +    .name = "generic-host-pci",
>> +    .version_id = 1,
>> +    .minimum_version_id = 1,
>> +};
>> +
>> +static void pci_cam_config_write(void *opaque, hwaddr addr,
>> +                                 uint64_t val, unsigned size)
>> +{
>
> I would add a comment:
> /* the MemoryRegionOps require uint64_t val, but we can only do 32bit */
>
> there are already asserts in pci_host.c, so that should be sufficient.

Agreed.

>
>> +    PCIGenState *s = opaque;
>> +    pci_data_write(&s->pci_bus, addr, val, size);
>> +}
>> +
>> +static uint64_t pci_cam_config_read(void *opaque, hwaddr addr, unsigned 
>> size)
>
> /* same here */
>
>> +{
>> +    PCIGenState *s = opaque;
>> +    uint32_t val;
>> +    val = pci_data_read(&s->pci_bus, addr, size);
>> +    return val;
>> +}
>> +
>> +static const MemoryRegionOps pci_vpb_config_ops = {
>> +    .read = pci_cam_config_read,
>> +    .write = pci_cam_config_write,
>> +    .endianness = DEVICE_NATIVE_ENDIAN,
>> +};
>> +
>> +static void pci_generic_set_irq(void *opaque, int irq_num, int level)
>> +{
>> +    qemu_irq *pic = opaque;
>> +    qemu_set_irq(pic[irq_num], level);
>> +}
>> +
>> +static void pci_generic_host_init(Object *obj)
>> +{
>> +    PCIHostState *h = PCI_HOST_BRIDGE(obj);
>> +    PCIGenState *s = PCI_GEN(obj);
>> +    GenericPCIHostState *gps = &s->pci_gen;
>> +
>> +    memory_region_init(&s->pci_io_space, OBJECT(s), "pci_io", 1ULL << 32);
>> +    memory_region_init(&s->pci_mem_space, OBJECT(s), "pci_mem", 1ULL << 32);
>> +
>> +    pci_bus_new_inplace(&s->pci_bus, sizeof(s->pci_bus), DEVICE(obj), "pci",
>> +                        &s->pci_mem_space, &s->pci_io_space,
>> +                        PCI_DEVFN(0, 0), TYPE_PCIE_BUS);
>
> TYPE_PCI_BUS, until we actually support PCIE.

This was present in the first RFC as well since harmless, but I will
make it PCI only.

>
>> +    h->bus = &s->pci_bus;
>> +
>> +    object_initialize(gps, sizeof(*gps), TYPE_GENERIC_PCI_HOST);
>> +    qdev_set_parent_bus(DEVICE(gps), BUS(&s->pci_bus));
>> +    gps->devfns = 0;
>> +}
>> +
>> +static int generic_pci_map_irq_fn(PCIDevice *pci_dev, int pin)
>> +{
>> +    BusState *bus = qdev_get_parent_bus(&pci_dev->qdev);
>> +    PCIBus *pci_bus = PCI_BUS(bus);
>> +    PCIDevice *pdev = pci_bus->devices[PCI_DEVFN(0, 0)];
>> +    GenericPCIHostState *gps = PCI_GEN_HOST(pdev);
>> +
>> +    return gps->irqmap.devfn_idx_map[PCI_SLOT(pci_dev->devfn)]
>> +                                    [PCI_FUNC(pci_dev->devfn)];
>> +}
>> +
>> +static void pci_generic_host_realize(DeviceState *dev, Error **errp)
>> +{
>> +    PCIGenState *s = PCI_GEN(dev);
>> +    SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
>> +    GenericPCIPlatformData *pdata = &s->plat_data;
>> +    int i;
>> +
>> +    if (!pdata->addr_map[PCI_CFG].size || !pdata->addr_map[PCI_IO].size ||
>> +                           !pdata->addr_map[PCI_MEM].size || !pdata->irqs) {
>> +        hw_error("generic_pci: PCI controller not fully configured.");
>> +    }
>> +
>> +    for (i = 0; i < pdata->irqs; i++) {
>> +        sysbus_init_irq(sbd, &s->irq[i]);
>> +    }
>> +
>> +    pci_bus_irqs(&s->pci_bus, pci_generic_set_irq, generic_pci_map_irq_fn,
>> +                 s->irq, pdata->irqs);
>> +
>> +    memory_region_init_io(&s->mem_config, OBJECT(s), &pci_vpb_config_ops, s,
>> +                          "pci-config", pdata->addr_map[PCI_CFG].size);
>> +    sysbus_init_mmio(sbd, &s->mem_config);
>> +
>> +    /* Depending on the available memory of the board using the controller, 
>> we
>> +       create a window on both the I/O and mememory space. */
>
> s/mememory/memory/

ACK.

>
>> +    memory_region_init_alias(&s->pci_io_window, OBJECT(s), "pci-io-win",
>> +                             &s->pci_io_space, 0,
>> +                             pdata->addr_map[PCI_IO].size);
>> +    sysbus_init_mmio(sbd, &s->pci_io_window);
>> +
>> +    memory_region_init_alias(&s->pci_mem_window, OBJECT(s), "pci-mem-win",
>> +                             &s->pci_mem_space,
>> +                             pdata->addr_map[PCI_MEM].addr,
>> +                             pdata->addr_map[PCI_MEM].size);
>> +    sysbus_init_mmio(sbd, &s->pci_mem_window);
>> +
>> +    /* TODO Remove once realize propagates to child devices. */
>> +    object_property_set_bool(OBJECT(&s->pci_gen), true, "realized", errp);
>> +}
>> +
>> +static void pci_generic_host_class_init(ObjectClass *klass, void *data)
>> +{
>> +    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
>> +    DeviceClass *dc = DEVICE_CLASS(klass);
>> +
>> +    k->vendor_id = PCI_VENDOR_ID_REDHAT;
>> +    k->device_id = 0x1234;
>> +    k->class_id = PCI_CLASS_PROCESSOR_CO;
>> +    /*
>> +     * PCI-facing part of the host bridge, not usable without the
>> +     * host-facing part, which can't be device_add'ed, yet.
>> +     */
>> +    dc->cannot_instantiate_with_device_add_yet = true;
>> +}
>> +
>> +struct dt_irq_mapping {
>> +        int irqs;
>> +        uint32_t gic_phandle;
>
> cannot this be generic enough that we don't talk about gic here?

Yes. I will change it.

>
>> +        int base_irq_num;
>> +        uint64_t *data;
>> +};
>> +
>> +/*  */
>
> ? Can you remove this empty comment ?

Of course, I missed it.

>
>
>> +#define IRQ_MAP_ENTRY_DESC_SZ 14
>> +static void gen_int_mapping_fn(PCIBus *b, PCIDevice *d, void *opaque)
>> +{
>> +    struct dt_irq_mapping *map_data;
>> +    int pin;
>> +
>> +    PCIDevice *pdev = b->devices[PCI_DEVFN(0, 0)];
>> +    GenericPCIHostState *gps = PCI_GEN_HOST(pdev);
>> +    map_data = (struct dt_irq_mapping *)opaque;
>> +
>> +    /* Check if the platform has enough IRQs available. */
>> +    if (gps->devfns > map_data->irqs) {
>> +        hw_error("generic_pci: too many PCI devices.");
>> +    }
>> +
>> +    pin = pci_get_byte(d->config + PCI_INTERRUPT_PIN);
>> +    if (pin) {
>> +        uint64_t *data_ptr = map_data->data + (IRQ_MAP_ENTRY_DESC_SZ * 
>> gps->devfns);
>> +        uint8_t pci_slot = PCI_SLOT(d->devfn);
>> +        uint8_t pci_func = PCI_FUNC(d->devfn);
>> +        uint8_t *devfn_idx = &gps->irqmap.devfn_idx_map[pci_slot][0];
>> +        uint8_t *devfn_irq = gps->irqmap.devfn_irq_map;
>> +
>> +        devfn_idx[pci_func] = gps->devfns;
>> +        devfn_irq[gps->devfns] = map_data->base_irq_num + gps->devfns;
>> +
>> +        uint64_t buffer[IRQ_MAP_ENTRY_DESC_SZ] =
>> +        {1, (pci_slot << 11)|(pci_func << 8), 2, 0x00000000, 1, pin,
>> +         1, map_data->gic_phandle, 1, 0x0, 1, devfn_irq[gps->devfns],
>> +         1, 0x1};
>
> ah now I see, I think you forgot a comment above. But maybe here is a better 
> place.
> The above needs to be commented heavily, to make it obvious what each field 
> is,
> and which "spec" this is following. You have mentioned already in the 
> previous discussion,
> but it needs to be in the code.

I will do it.

>
>> +
>> +        memcpy(data_ptr, buffer, IRQ_MAP_ENTRY_DESC_SZ * sizeof(*buffer));
>> +        gps->devfns++;
>> +    }
>> +}
>> +
>> +/* Generate the "interrup-map" node's data and store it in map_data */
>
> interrupt-map

ACK.

>
>> +static void generate_int_mapping(struct dt_irq_mapping *map_data,
>> +                                 PCIGenState *s)
>> +{
>> +    pci_for_each_device(s->parent_obj.bus, 0, gen_int_mapping_fn, map_data);
>> +}
>> +
>> +static void generate_dt_node(void *fdt, DeviceState *dev)
>> +{
>> +    PCIGenState *s = PCI_GEN(dev);
>> +    char *nodename;
>> +    uint32_t gic_phandle;
>> +    uint32_t plat_acells;
>> +    uint32_t plat_scells;
>> +
>> +    SysBusDevice *busdev;
>> +    busdev = SYS_BUS_DEVICE(dev);
>> +    MemoryRegion *cfg = sysbus_mmio_get_region(busdev, 0);
>> +    MemoryRegion *io = sysbus_mmio_get_region(busdev, 1);
>> +    MemoryRegion *mem = sysbus_mmio_get_region(busdev, 2);
>> +
>> +    nodename = g_strdup_printf("/pci@%" PRIx64, cfg->addr);
>> +    qemu_fdt_add_subnode(fdt, nodename);
>> +    qemu_fdt_setprop_string(fdt, nodename, "compatible",
>> +                            "pci-host-cam-generic");
>> +    qemu_fdt_setprop_string(fdt, nodename, "device_type", "pci");
>> +    qemu_fdt_setprop_cell(fdt, nodename, "#address-cells", 0x3);
>> +    qemu_fdt_setprop_cell(fdt, nodename, "#size-cells", 0x2);
>> +    qemu_fdt_setprop_cell(fdt, nodename, "#interrupt-cells", 0x1);
>> +
>> +    plat_acells = qemu_fdt_getprop_cell(fdt, "/", "#address-cells");
>> +    plat_scells = qemu_fdt_getprop_cell(fdt, "/", "#size-cells");
>> +    qemu_fdt_setprop_sized_cells(fdt, nodename, "reg", plat_acells, 
>> cfg->addr,
>> +                                        plat_scells, 
>> memory_region_size(cfg));
>> +
>> +    qemu_fdt_setprop_sized_cells(fdt, nodename, "ranges",
>> +                        1, 0x01000000, 2, 0x00000000, /* I/O space */
>> +                        2, io->addr, 2, memory_region_size(io),
>> +                        1, 0x02000000, 2, mem->addr, /* 32bit memory space 
>> */
>
> Is it commented somewhere sensible that the generic pci has this layout:
>
> cfg = whatever base address
> io = base + 0x1000000
> mem = base + 0x2000000

The value 0x1000000 actually means that the following address has to
be interpreted as I/O address, while 0x2000000 means that it's a 32bit
address.
I will add some more documentation here, for now I point to page 4 of
the following document where the whole encoding is explained:
http://www.o3one.org/hwdocs/openfirmware/pci_supplement_2_1.pdf
All the documentation should also be available at the website
http://www.openfirmware.org (as pointed also by the kernel
documentation file Documentation/devicetree/bindings/pci/pci.txt),
however the domain does not seem active.

>
>> +                        2, mem->addr, 2, memory_region_size(mem));
>> +
>> +    gic_phandle = qemu_fdt_get_phandle(fdt, s->plat_data.gic_node_name);
>
> hmm could it be gic-independent?
>
>> +    /* A mask covering bits [15,8] of phys.high allows to honor the
>> +       function number when resolving a triggered PCI interrupt. */
>> +    qemu_fdt_setprop_sized_cells(fdt, nodename, "interrupt-map-mask",
>> +                                  1, 0xff00, 1, 0x0, 1, 0x0, 1, 0x7);
>
> and what about 0, 0, 7, what do they mean? How does the mapping work,
> what do all the fields mean, and which paper/spec are you implementing?..

0, 0 are needed to mask out the phys.mid and hys.low bits of the PCI
address (since are not needed to resolve the interrupt).
I will add a more detailed documentation in the next version.

>
>> +
>> +    uint64_t *int_mapping_data = g_malloc0(IRQ_MAP_ENTRY_DESC_SZ *
>> +                                          sizeof(uint64_t) * 
>> s->plat_data.irqs);
>> +    struct dt_irq_mapping dt_map_data = {
>> +        .irqs = s->plat_data.irqs,
>> +        .gic_phandle = gic_phandle,
>
> could this be gic independent?

Yes, but still we need a field necessary to store the interrupt
controller phandle.

>
>> +        .base_irq_num = s->plat_data.base_irq,
>> +        .data = int_mapping_data
>> +    };
>> +    /* Generate the interrupt mapping according to the devices attached
>> +     * to the PCI bus of the controller. */
>> +    generate_int_mapping(&dt_map_data, s);
>> +    qemu_fdt_setprop_sized_cells_from_array(fdt, nodename, "interrupt-map",
>> +                (s->pci_gen.devfns * IRQ_MAP_ENTRY_DESC_SZ)/2, 
>> int_mapping_data);
>> +
>> +    g_free(int_mapping_data);
>> +    g_free(nodename);
>> +}
>> +
>> +void pci_controller_build_dt_node(void *fdt, DeviceState *dev)
>> +{
>> +    generate_dt_node(fdt, dev);
>> +}
>> +
>> +static const TypeInfo pci_generic_host_info = {
>> +    .name          = TYPE_GENERIC_PCI_HOST,
>> +    .parent        = TYPE_PCI_DEVICE,
>> +    .instance_size = sizeof(GenericPCIHostState),
>> +    .class_init    = pci_generic_host_class_init,
>> +};
>> +
>> +static void pci_generic_class_init(ObjectClass *klass, void *data)
>> +{
>> +    DeviceClass *dc = DEVICE_CLASS(klass);
>> +
>> +    dc->realize = pci_generic_host_realize;
>> +    dc->vmsd = &pci_generic_host_vmstate;
>> +}
>> +
>> +static const TypeInfo pci_generic_info = {
>> +    .name          = TYPE_GENERIC_PCI,
>> +    .parent        = TYPE_PCI_HOST_BRIDGE,
>> +    .instance_size = sizeof(PCIGenState),
>> +    .instance_init = pci_generic_host_init,
>> +    .class_init    = pci_generic_class_init,
>> +};
>> +
>> +static void generic_pci_host_register_types(void)
>> +{
>> +    type_register_static(&pci_generic_info);
>> +    type_register_static(&pci_generic_host_info);
>> +}
>> +
>> +type_init(generic_pci_host_register_types)
>> \ No newline at end of file
>> diff --git a/include/hw/pci-host/generic-pci.h 
>> b/include/hw/pci-host/generic-pci.h
>> new file mode 100644
>> index 0000000..43c7a0f
>> --- /dev/null
>> +++ b/include/hw/pci-host/generic-pci.h
>> @@ -0,0 +1,74 @@
>> +#ifndef QEMU_GENERIC_PCI_H
>> +#define QEMU_GENERIC_PCI_H
>> +
>> +#include "hw/pci/pci.h"
>> +#include "hw/pci/pci_bus.h"
>> +#include "hw/pci/pci_host.h"
>> +
>> +#define MAX_PCI_DEVICES (PCI_SLOT_MAX * PCI_FUNC_MAX)
>> +
>
> maybe this would be a good place to talk about what the address space
> layout assumptions of generic pci are, like
>
> /* The Configuration space starts at the PCI base address passed in the 
> virtual platform
>  * information,
>  * The I/O space starts at an offset of 0x1000000 from the PCI base address
>  * The Mem space starts at an offset of 0x2000000 from the PCI base address
>  */
>
> these the assumptions right?

Actually this is not completely true. As explained before, 0x01000000
and 0x02000000 indicate an I/O region and a 32bit addressable region.
The I/O window (pci_io_window) starts at the beginning of the I/O
address space (pci_io_space), no matter where it is mapped to by the
platform.
For example, mach-virt maps it to the CPU address 0x11000000 but from
the PCI bus point of view, it starts at address 0x0.
For the memory window (pci_mem_window) instead we need to place it at
the same address in the memory space (mem_io_space) to which it has
been mapped by the platform.
In mach-virt, this region starts at CPU (and also PCI) address 0x12000000.
I hope to have been enough clear.

Regards,
alvise

>
>> +enum {
>> +    PCI_CFG = 0,
>> +    PCI_IO,
>> +    PCI_MEM,
>> +};
>> +
>> +typedef struct {
>> +    /* addr_map[PCI_CFG].addr = base address of configuration memory
>> +       addr_map[PCI_CFG].size = configuration memory size
>> +       ... */
>> +    struct addr_map {
>> +        hwaddr addr;
>> +        hwaddr size;
>> +    } addr_map[3];
>> +
>> +    const char *gic_node_name;
>> +    uint32_t base_irq;
>> +    uint32_t irqs;
>> +} GenericPCIPlatformData;
>> +
>> +typedef struct {
>> +    PCIDevice parent_obj;
>> +
>> +    struct irqmap {
>> +        /* devfn_idx_map[i][j] = index inside slot_irq_map for device at 
>> slot i,
>> +           fn j */
>> +        uint8_t devfn_idx_map[PCI_SLOT_MAX][PCI_FUNC_MAX];
>> +        /* devfn_irq_map[i] = IRQ num. of the i-th device (devfn) attached 
>> to
>> +           the bus */
>> +        uint8_t devfn_irq_map[MAX_PCI_DEVICES];
>> +    } irqmap;
>> +    /* number of devices attached to the bus */
>> +    uint32_t devfns;
>> +} GenericPCIHostState;
>> +
>> +typedef struct {
>> +    PCIHostState parent_obj;
>> +
>> +    qemu_irq irq[MAX_PCI_DEVICES];
>> +    MemoryRegion mem_config;
>> +    /* Containers representing the PCI address spaces */
>> +    MemoryRegion pci_io_space;
>> +    MemoryRegion pci_mem_space;
>> +    /* Alias regions into PCI address spaces which we expose as sysbus 
>> regions.
>> +     * The offsets into pci_mem_space is fixed. */
>> +    MemoryRegion pci_io_window;
>> +    MemoryRegion pci_mem_window;
>> +    PCIBus pci_bus;
>> +    GenericPCIHostState pci_gen;
>> +
>> +    /* Platform dependent data */
>> +    GenericPCIPlatformData plat_data;
>> +} PCIGenState;
>> +
>> +#define TYPE_GENERIC_PCI "generic_pci"
>> +#define PCI_GEN(obj) \
>> +    OBJECT_CHECK(PCIGenState, (obj), TYPE_GENERIC_PCI)
>> +
>> +#define TYPE_GENERIC_PCI_HOST "generic_pci_host"
>> +#define PCI_GEN_HOST(obj) \
>> +    OBJECT_CHECK(GenericPCIHostState, (obj), TYPE_GENERIC_PCI_HOST)
>> +
>> +void pci_controller_build_dt_node(void *fdt, DeviceState *dev);
>> +
>> +#endif
>> \ No newline at end of file
>>
>
>

Reply via email to