Re: [Qemu-devel] [PATCH RFC] memory: drop _overlap variant
On Tue, Feb 19, 2013 at 05:58:38PM +0200, Avi Kivity wrote:
> On Tue, Feb 19, 2013 at 4:41 PM, Michael S. Tsirkin wrote:
> > On Thu, Feb 14, 2013 at 08:23:04PM +0200, Avi Kivity wrote:
> >> On Thu, Feb 14, 2013 at 8:12 PM, Michael S. Tsirkin wrote:
> >> >>
> >> >> Is there an actual real problem that needs fixing?
> >> >
> >> > Yes. Guests sometimes cause device BARs to temporarily overlap
> >> > the APIC range during BAR sizing. It works fine on a physical
> >> > system but fails on KVM since pci has same priority.
> >> >
> >> > See the report:
> >> > [BUG] Guest OS hangs on boot when 64bit BAR present
> >> >
> >>
> >> Is PCI_COMMAND_MEMORY set while this is going on?
> >
> > I think Linux never clears PCI_COMMAND_MEMORY because
> > it's buggy in some devices.
>
> Ok. Then I recommend defining the MSI message area as overlapped with
> sufficient priority. It should probably be a child of the PCI address
> space.
>
> The IOAPIC is actually closer to ISA, but again it's sufficient to
> move it to the PCI address space. I doubt its priority matters.

Well moving IOAPIC to PCI seems strange, it's not a PCI thing,
and I think it can be moved outside PCI though guests don't do it.

So I think ideally we really should have it look something like:

sysbus -> ioapic
       -> pci -> msi

--
MST
Re: [Qemu-devel] [PATCH RFC] memory: drop _overlap variant
On Tue, Feb 19, 2013 at 6:08 PM, Michael S. Tsirkin wrote:
>>
>> The IOAPIC is actually closer to ISA, but again it's sufficient to
>> move it to the PCI address space. I doubt its priority matters.
>
> Well moving IOAPIC to PCI seems strange, it's not a PCI thing,
> and I think it can be moved outside PCI though guests don't do it.

Look at the 440fx/piix datasheets. It's connected to the piix which
decodes its address. So it's definitely part of the pci address space.

> So I think ideally we really should have it look something like:
>
> sysbus -> ioapic
>        -> pci -> msi
>
Re: [Qemu-devel] [PATCH RFC] memory: drop _overlap variant
On Tue, Feb 19, 2013 at 4:41 PM, Michael S. Tsirkin wrote:
> On Thu, Feb 14, 2013 at 08:23:04PM +0200, Avi Kivity wrote:
>> On Thu, Feb 14, 2013 at 8:12 PM, Michael S. Tsirkin wrote:
>> >>
>> >> Is there an actual real problem that needs fixing?
>> >
>> > Yes. Guests sometimes cause device BARs to temporarily overlap
>> > the APIC range during BAR sizing. It works fine on a physical
>> > system but fails on KVM since pci has same priority.
>> >
>> > See the report:
>> > [BUG] Guest OS hangs on boot when 64bit BAR present
>> >
>>
>> Is PCI_COMMAND_MEMORY set while this is going on?
>
> I think Linux never clears PCI_COMMAND_MEMORY because
> it's buggy in some devices.

Ok. Then I recommend defining the MSI message area as overlapped with
sufficient priority. It should probably be a child of the PCI address
space.

The IOAPIC is actually closer to ISA, but again it's sufficient to
move it to the PCI address space. I doubt its priority matters.
Re: [Qemu-devel] [PATCH RFC] memory: drop _overlap variant
On Thu, Feb 14, 2013 at 08:23:04PM +0200, Avi Kivity wrote:
> On Thu, Feb 14, 2013 at 8:12 PM, Michael S. Tsirkin wrote:
> >>
> >> Is there an actual real problem that needs fixing?
> >
> > Yes. Guests sometimes cause device BARs to temporarily overlap
> > the APIC range during BAR sizing. It works fine on a physical
> > system but fails on KVM since pci has same priority.
> >
> > See the report:
> > [BUG] Guest OS hangs on boot when 64bit BAR present
> >
>
> Is PCI_COMMAND_MEMORY set while this is going on?

I think Linux never clears PCI_COMMAND_MEMORY because
it's buggy in some devices.
Re: [Qemu-devel] [PATCH RFC] memory: drop _overlap variant
On Thu, Feb 14, 2013 at 8:12 PM, Michael S. Tsirkin wrote:
>>
>> Is there an actual real problem that needs fixing?
>
> Yes. Guests sometimes cause device BARs to temporarily overlap
> the APIC range during BAR sizing. It works fine on a physical
> system but fails on KVM since pci has same priority.
>
> See the report:
> [BUG] Guest OS hangs on boot when 64bit BAR present
>

Is PCI_COMMAND_MEMORY set while this is going on?
Re: [Qemu-devel] [PATCH RFC] memory: drop _overlap variant
On Thu, Feb 14, 2013 at 07:02:15PM +0200, Avi Kivity wrote:
> On Thu, Feb 14, 2013 at 6:50 PM, Michael S. Tsirkin wrote:
> >> > As you see, ioapic at 0xfec0 overlaps pci-hole.
> >> > ioapic is guest programmable in theory - should use _overlap?
> >> > pci-hole is not but can overlap with ioapic.
> >> > So also _overlap?
> >>
> >> It's a bug. The ioapic is in the pci address space, not the system
> >> address space. And yes it's overlappable.
> >
> > So you want to put it where? Under pci-hole?
>
> No, under the pci address space. Look at the 440fx block diagram.
>
> > And we'll have to teach all machine types
> > creating pci-hole about it?
>
> No.
>
> >
> >> >
> >> > Let's imagine someone writes a guest programmable device for
> >> > ARM. Now we should update all ARM devices from regular to _overlap?
> >>
> >> It's sufficient to update the programmable device.
> >
> > Then the device can be higher priority (works for apic)
> > but not lower priority. Make priority signed?
>
> Is there an actual real problem that needs fixing?

Yes. Guests sometimes cause device BARs to temporarily overlap the APIC
range during BAR sizing. It works fine on a physical system but fails
on KVM since pci has same priority.

See the report:
[BUG] Guest OS hangs on boot when 64bit BAR present

--
MST
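
For readers who have not watched BAR sizing happen, here is a minimal
sketch of why the overlap is transient. It is a toy model, not QEMU or
kernel code: ToyBar and the toy_bar_* helpers are made-up names, the
32 MiB size is an assumed example, and 0xfec00000 is used as the
conventional PC IOAPIC base (the addresses in the quoted map are
truncated). The flag bits in the low nibble of a real BAR and the upper
half of a 64-bit BAR are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the address bits of a 32-bit memory BAR. The bits below
 * the BAR size are hardwired to zero, so writes to them are ignored. */
typedef struct {
    uint32_t size;   /* power of two; hypothetical device parameter */
    uint32_t value;  /* current register contents (address bits only) */
} ToyBar;

static void toy_bar_write(ToyBar *bar, uint32_t v)
{
    /* The masking below only makes sense for power-of-two sizes. */
    assert(bar->size != 0 && (bar->size & (bar->size - 1)) == 0);
    bar->value = v & ~(bar->size - 1);
}

/* Size probe as a guest does it: write all ones, read back, and take
 * the two's complement of the surviving address bits. The BAR is left
 * holding the probe value until the guest restores the old address. */
static uint32_t toy_bar_probe_size(ToyBar *bar)
{
    toy_bar_write(bar, 0xffffffffu);
    return ~bar->value + 1;
}

/* True if the BAR window [value, value + size) contains addr.
 * 64-bit arithmetic keeps a window that ends at 4 GiB from wrapping. */
static bool toy_bar_decodes(const ToyBar *bar, uint32_t addr)
{
    uint64_t base = bar->value;
    return addr >= base && (uint64_t)addr < base + bar->size;
}
```

With size 0x02000000 and the BAR initially at 0xe0000000, calling
toy_bar_probe_size() returns 0x02000000 and leaves the register at
0xfe000000, so toy_bar_decodes(&bar, 0xfec00000) is true until the
guest writes the old address back. If memory decoding stays enabled
during that window, the BAR and the IOAPIC page contend for the same
addresses.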
Re: [Qemu-devel] [PATCH RFC] memory: drop _overlap variant
On Thu, Feb 14, 2013 at 6:50 PM, Michael S. Tsirkin wrote:
>> > As you see, ioapic at 0xfec0 overlaps pci-hole.
>> > ioapic is guest programmable in theory - should use _overlap?
>> > pci-hole is not but can overlap with ioapic.
>> > So also _overlap?
>>
>> It's a bug. The ioapic is in the pci address space, not the system
>> address space. And yes it's overlappable.
>
> So you want to put it where? Under pci-hole?

No, under the pci address space. Look at the 440fx block diagram.

> And we'll have to teach all machine types
> creating pci-hole about it?

No.

>
>> >
>> > Let's imagine someone writes a guest programmable device for
>> > ARM. Now we should update all ARM devices from regular to _overlap?
>>
>> It's sufficient to update the programmable device.
>
> Then the device can be higher priority (works for apic)
> but not lower priority. Make priority signed?

Is there an actual real problem that needs fixing?
Re: [Qemu-devel] [PATCH RFC] memory: drop _overlap variant
On Thu, Feb 14, 2013 at 05:07:02PM +0200, Avi Kivity wrote:
> On Thu, Feb 14, 2013 at 4:40 PM, Michael S. Tsirkin wrote:
> > On Thu, Feb 14, 2013 at 04:14:39PM +0200, Avi Kivity wrote:
> >
> > But some parents are system created and shared by many devices so
> > children for such have no idea who their siblings are.
> >
> > Please take a look at the typical map in this mail:
> > '[BUG] Guest OS hangs on boot when 64bit BAR present'
> >
> > system overlap 0 pri 0 [0x0 - 0x7fff]
> > kvmvapic-rom overlap 1 pri 1000 [0xca000 - 0xcd000]
> > pc.ram overlap 0 pri 0 [0xca000 - 0xcd000]
> > ++ pc.ram [0xca000 - 0xcd000] is added to view
> >
> > smram-region overlap 1 pri 1 [0xa - 0xc]
> > pci overlap 0 pri 0 [0xa - 0xc]
> > cirrus-lowmem-container overlap 1 pri 1 [0xa - 0xc]
> > cirrus-low-memory overlap 0 pri 0 [0xa - 0xc]
> > ++cirrus-low-memory [0xa - 0xc] is added to view
> > kvm-ioapic overlap 0 pri 0 [0xfec0 - 0xfec01000]
> > ++kvm-ioapic [0xfec0 - 0xfec01000] is added to view
> > pci-hole64 overlap 0 pri 0 [0x1 - 0x4001]
> > pci overlap 0 pri 0 [0x1 - 0x4001]
> > pci-hole overlap 0 pri 0 [0x7d00 - 0x1]
> > pci overlap 0 pri 0 [0x7d00 - 0x1]
> > ivshmem-bar2-container overlap 1 pri 1 [0xfe00 - 0x1]
> > ivshmem.bar2 overlap 0 pri 0 [0xfe00 - 0x1]
> > ++ivshmem.bar2 [0xfe00 - 0xfec0] is added to view
> > ++ivshmem.bar2 [0xfec01000 - 0x1] is added to view
> >
> > As you see, ioapic at 0xfec0 overlaps pci-hole.
> > ioapic is guest programmable in theory - should use _overlap?
> > pci-hole is not but can overlap with ioapic.
> > So also _overlap?
>
> It's a bug. The ioapic is in the pci address space, not the system
> address space. And yes it's overlappable.

So you want to put it where? Under pci-hole?
And we'll have to teach all machine types
creating pci-hole about it?

> >
> > Let's imagine someone writes a guest programmable device for
> > ARM. Now we should update all ARM devices from regular to _overlap?
>
> It's sufficient to update the programmable device.

Then the device can be higher priority (works for apic)
but not lower priority. Make priority signed?

> >> >
> >> > Non overlapping is not a common case at all. E.g. with normal PCI
> >> > devices you have no way to know nothing overlaps - addresses are guest
> >> > programmable.
> >>
> >> Non overlapping is mostly useful for embedded platforms.
> >
> > Maybe it should have a longer name like _nonoverlap then?
> > Current API makes people assume _overlap is only for special
> > cases and default should be non overlap.
>
> The assumption is correct.
Re: [Qemu-devel] [PATCH RFC] memory: drop _overlap variant
On Thu, Feb 14, 2013 at 4:40 PM, Michael S. Tsirkin wrote:
> On Thu, Feb 14, 2013 at 04:14:39PM +0200, Avi Kivity wrote:
>
> But some parents are system created and shared by many devices so
> children for such have no idea who their siblings are.
>
> Please take a look at the typical map in this mail:
> '[BUG] Guest OS hangs on boot when 64bit BAR present'
>
> system overlap 0 pri 0 [0x0 - 0x7fff]
> kvmvapic-rom overlap 1 pri 1000 [0xca000 - 0xcd000]
> pc.ram overlap 0 pri 0 [0xca000 - 0xcd000]
> ++ pc.ram [0xca000 - 0xcd000] is added to view
>
> smram-region overlap 1 pri 1 [0xa - 0xc]
> pci overlap 0 pri 0 [0xa - 0xc]
> cirrus-lowmem-container overlap 1 pri 1 [0xa - 0xc]
> cirrus-low-memory overlap 0 pri 0 [0xa - 0xc]
> ++cirrus-low-memory [0xa - 0xc] is added to view
> kvm-ioapic overlap 0 pri 0 [0xfec0 - 0xfec01000]
> ++kvm-ioapic [0xfec0 - 0xfec01000] is added to view
> pci-hole64 overlap 0 pri 0 [0x1 - 0x4001]
> pci overlap 0 pri 0 [0x1 - 0x4001]
> pci-hole overlap 0 pri 0 [0x7d00 - 0x1]
> pci overlap 0 pri 0 [0x7d00 - 0x1]
> ivshmem-bar2-container overlap 1 pri 1 [0xfe00 - 0x1]
> ivshmem.bar2 overlap 0 pri 0 [0xfe00 - 0x1]
> ++ivshmem.bar2 [0xfe00 - 0xfec0] is added to view
> ++ivshmem.bar2 [0xfec01000 - 0x1] is added to view
>
> As you see, ioapic at 0xfec0 overlaps pci-hole.
> ioapic is guest programmable in theory - should use _overlap?
> pci-hole is not but can overlap with ioapic.
> So also _overlap?

It's a bug. The ioapic is in the pci address space, not the system
address space. And yes it's overlappable.

>
> Let's imagine someone writes a guest programmable device for
> ARM. Now we should update all ARM devices from regular to _overlap?

It's sufficient to update the programmable device.

>> >
>> > Non overlapping is not a common case at all. E.g. with normal PCI
>> > devices you have no way to know nothing overlaps - addresses are guest
>> > programmable.
>>
>> Non overlapping is mostly useful for embedded platforms.
>
> Maybe it should have a longer name like _nonoverlap then?
> Current API makes people assume _overlap is only for special
> cases and default should be non overlap.

The assumption is correct.
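
The "is added to view" lines in the quoted map show a lower-priority
region being split around a higher-priority sibling. A rough sketch of
that flattening step for siblings of a single container follows. It is
an illustration only, not the algorithm in memory.c: ToySibling,
ToyRange and toy_flatten are hypothetical names, the ranges are
half-open, and any addresses fed to it are assumptions because the ones
in the map above are truncated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_MAX_SIBLINGS 8

typedef struct {
    const char *name;
    uint64_t start, end;      /* half-open [start, end) */
    int priority;             /* higher wins where siblings intersect */
} ToySibling;

typedef struct {
    const ToySibling *owner;  /* sibling visible in this range */
    uint64_t start, end;
} ToyRange;

static int toy_cmp_u64(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a, y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* Writes the visible ranges to out[] (room for 2 * n entries is enough)
 * and returns how many were written. */
static size_t toy_flatten(const ToySibling *s, size_t n, ToyRange *out)
{
    uint64_t cuts[2 * TOY_MAX_SIBLINGS];
    size_t ncuts = 0, nout = 0;

    assert(n <= TOY_MAX_SIBLINGS);
    /* Every start and end is a point where the winner may change. */
    for (size_t i = 0; i < n; i++) {
        cuts[ncuts++] = s[i].start;
        cuts[ncuts++] = s[i].end;
    }
    qsort(cuts, ncuts, sizeof(cuts[0]), toy_cmp_u64);

    for (size_t c = 0; c + 1 < ncuts; c++) {
        uint64_t lo = cuts[c], hi = cuts[c + 1];
        const ToySibling *best = NULL;

        if (lo == hi) {
            continue;  /* duplicate cut point, empty interval */
        }
        /* Pick the highest-priority sibling covering [lo, hi). */
        for (size_t i = 0; i < n; i++) {
            if (s[i].start <= lo && hi <= s[i].end &&
                (best == NULL || s[i].priority > best->priority)) {
                best = &s[i];
            }
        }
        if (best == NULL) {
            continue;  /* hole: nothing mapped here */
        }
        /* Merge with the previous range when the owner is unchanged. */
        if (nout > 0 && out[nout - 1].owner == best &&
            out[nout - 1].end == lo) {
            out[nout - 1].end = hi;
        } else {
            out[nout++] = (ToyRange){ best, lo, hi };
        }
    }
    return nout;
}
```

Feeding it a BAR at [0xfe000000, 0x100000000) with priority 0 and an
ioapic page at [0xfec00000, 0xfec01000) with priority 1 yields three
ranges: BAR, ioapic, BAR. That has the same shape as the two
ivshmem.bar2 entries around kvm-ioapic in the log. With equal
priorities the first sibling in the array wins in this sketch, which is
an arbitrary choice of the model.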
Re: [Qemu-devel] [PATCH RFC] memory: drop _overlap variant
On Thu, Feb 14, 2013 at 02:34:20PM +, Peter Maydell wrote:
> On 14 February 2013 14:02, Michael S. Tsirkin wrote:
> > Well that's the status quo. One of the issues is, you have
> > no idea what else uses each priority. With this change,
> > at least you can grep for it.
>
> No, because most of the code you find will be setting
> priorities for completely irrelevant containers (for
> instance PCI doesn't care at all about priorities used
> by the v7m NVIC).
>
> > Imagine the specific example: ioapic and pci devices. ioapic has
> > an address within the pci hole but it is not a subregion.
> > If priority has no meaning how would you decide which one
> > to use?
>
> I don't know about the specifics of the PC's memory layout,
> but *something* has to manage the address space that is
> being set up. I would expect something like:
>
>  * PCI host controller has a memory region (container) which
>    all the PCI devices are mapped into as per guest programming
>  * ioapic has a memory region
>  * there is another container which contains both these
>    memory regions. The code that controls and sets up that
>    container [which is probably the pc board model] gets to
>    decide priorities, which are purely local to it

This assumes we set up devices in code. We are trying to move away from
that, and have APIs that let you set up boards from command line.

> (It's possible that at the moment the "another container" is
> the get_system_memory() system address space. If it makes life
> easier you can always invent another container to give you a
> fresh level of indirection.)
>
> > Also, on a PC many addresses are guest programmable. We need to behave
> > in some defined way if guest programs addresses to something silly.
>
> Yes, this is the job of the code controlling the container(s)
> into which those memory regions may be mapped.

Some containers don't know what is mapped into them.

> >> If the guest can
> >> program overlap then presumably PCI specifies semantics
> >> for what happens then, and there need to be PCI specific
> >> wrappers that enforce those semantics and they can call
> >> the relevant _overlap functions when mapping things.
> >> In any case this isn't a concern for the PCI *device*,
> >> which can just provide its memory regions. It's a problem
> >> the PCI *host adaptor* has to deal with when it's figuring
> >> out how to map those regions into the container which
> >> corresponds to its area of the address space.
> >
> > Issue is, a PCI device overlapping something else suddenly
> > becomes this something else's problem.
>
> Nope, because the PCI host controller model should be in
> complete control of the container all the PCI devices live
> in, and it is the thing doing the mapping and unmapping
> so it gets to set priorities and mark things as OK to
> overlap. Also, memory.c permits overlap if either of the
> two memory regions in question is marked as may-overlap;
> they don't both have to be marked.

That's undocumented, isn't it?
And then which one wins?

> >> > We could add a wrapper for MEMORY_PRIO_LOWEST - will that address
> >> > your concern?
> >>
> >> Well, I'm entirely happy with the memory API we have at
> >> the moment, and I'm trying to figure out why you want to
> >> change it...
> >
> > I am guessing your systems all have hardcoded addresses
> > not controlled by guest.
>
> Nope. omap_gpmc.c for instance has guest programmable subregions;
> it uses a container so the guest's manipulation of these can't
> leak out and cause weird things to happen to other bits of QEMU.
> [I think we don't implement the correct guest-facing behaviour
> when the guest asks for overlapping regions, but we shouldn't
> hit the memory.c overlapping-region issue, or if we do it's
> a bug to be fixed.]
>
> There's also PCI on the versatilepb, but PCI devices can't
> just appear anywhere, the PCI memory windows are at known
> addresses and the PCI device can't escape from the wrong
> side of the PCI controller.

But, there are devices whose addresses can overlap the PCI
window.

> >> >> Maybe we should take the printf() about subregion collisions
> >> >> in memory_region_add_subregion_common() out of the #if 0
> >> >> that it currently sits in?
> >>
> >> > This is just a debugging tool, it won't fix anything.
> >>
> >> It might tell us what bits of code are currently erroneously
> >> mapping regions that overlap without using the _overlap()
> >> function. Then we could fix them.
>
> > When there is a single guest programmable device,
> > any address can be overlapped by it.
>
> Do we really have an example of a guest programmable
> device where the *device itself* decides where it lives
> in the address space, rather than the guest being able to
> program a host controller/bus fabric/equivalent thing to
> specify where the device should live, or the device
> effectively negotiating with its bus controller? That
> seems very implausible to me just because hardware itself
> gener
Re: [Qemu-devel] [PATCH RFC] memory: drop _overlap variant
On Thu, Feb 14, 2013 at 04:14:39PM +0200, Avi Kivity wrote:
> On Thu, Feb 14, 2013 at 3:09 PM, Michael S. Tsirkin wrote:
> > On Thu, Feb 14, 2013 at 12:56:02PM +, Peter Maydell wrote:
> >> On 14 February 2013 12:45, Michael S. Tsirkin wrote:
> >> > overlap flag in the region is currently unused, most devices have no
> >> > idea whether their region overlaps with anything, so drop it,
> >> > assume that all regions can overlap and always require priority.
> >>
> >> Devices themselves shouldn't care, for the most part -- they just
> >> provide a memory region and it's their parent that has to map it
> >> and know whether it overlaps or not. Similarly, parents should
> >> generally be in control of the container they're mapping the
> >> memory region into, and know whether it will be an overlapping
> >> map or not.
> >>
> >> > It's also not clear how should devices allocate priorities.
> >>
> >> Up to the parent which controls the region being mapped into.
> >
> > We could just assume same priority as parent but what happens if it
> > has to be different?
>
> Priority is only considered relative to siblings. The parent's
> priority is only considered wrt the parent's siblings, not its
> children.

But some parents are system created and shared by many devices so
children for such have no idea who their siblings are.

Please take a look at the typical map in this mail:
'[BUG] Guest OS hangs on boot when 64bit BAR present'

system overlap 0 pri 0 [0x0 - 0x7fff]
kvmvapic-rom overlap 1 pri 1000 [0xca000 - 0xcd000]
pc.ram overlap 0 pri 0 [0xca000 - 0xcd000]
++ pc.ram [0xca000 - 0xcd000] is added to view

smram-region overlap 1 pri 1 [0xa - 0xc]
pci overlap 0 pri 0 [0xa - 0xc]
cirrus-lowmem-container overlap 1 pri 1 [0xa - 0xc]
cirrus-low-memory overlap 0 pri 0 [0xa - 0xc]
++cirrus-low-memory [0xa - 0xc] is added to view
kvm-ioapic overlap 0 pri 0 [0xfec0 - 0xfec01000]
++kvm-ioapic [0xfec0 - 0xfec01000] is added to view
pci-hole64 overlap 0 pri 0 [0x1 - 0x4001]
pci overlap 0 pri 0 [0x1 - 0x4001]
pci-hole overlap 0 pri 0 [0x7d00 - 0x1]
pci overlap 0 pri 0 [0x7d00 - 0x1]
ivshmem-bar2-container overlap 1 pri 1 [0xfe00 - 0x1]
ivshmem.bar2 overlap 0 pri 0 [0xfe00 - 0x1]
++ivshmem.bar2 [0xfe00 - 0xfec0] is added to view
++ivshmem.bar2 [0xfec01000 - 0x1] is added to view

As you see, ioapic at 0xfec0 overlaps pci-hole.
ioapic is guest programmable in theory - should use _overlap?
pci-hole is not but can overlap with ioapic.
So also _overlap?

Let's imagine someone writes a guest programmable device for
ARM. Now we should update all ARM devices from regular to _overlap?

> > There are also aliases so a region
> > can have multiple parents. Presumably it will have to have
> > different priorities depending on what the parent does?
>
> The alias region has its own priority
>
> > Here's a list of instances using priority != 0.
> >
> > hw/armv7m_nvic.c:MEMORY_PRIO_LOW);
> > hw/cirrus_vga.c:MEMORY_PRIO_LOW);
> > hw/cirrus_vga.c:MEMORY_PRIO_LOW);
> > hw/cirrus_vga.c:&s->low_mem_container, MEMORY_PRIO_LOW);
> > hw/kvm/pci-assign.c: &r_dev->mmio, MEMORY_PRIO_LOW);
> > hw/kvmvapic.c:memory_region_add_subregion(as, rom_paddr, &s->rom, MEMORY_PRIO_HIGH);
> > hw/lpc_ich9.c:MEMORY_PRIO_LOW);
> > hw/onenand.c:&s->mapped_ram, MEMORY_PRIO_LOW);
> > hw/pam.c:MEMORY_PRIO_LOW);
> > hw/pc.c:MEMORY_PRIO_LOW);
> > hw/pc_sysfw.c:isa_bios, MEMORY_PRIO_LOW);
> > hw/pc_sysfw.c:isa_bios, MEMORY_PRIO_LOW);
> > hw/pci/pci.c:MEMORY_PRIO_LOW);
> > hw/pci/pci_bridge.c:memory_region_add_subregion(parent_space, base, alias, MEMORY_PRIO_LOW);
> > hw/piix_pci.c:MEMORY_PRIO_LOW);
> > hw/piix_pci.c:&d->rcr_mem, MEMORY_PRIO_LOW);
> > hw/q35.c:&mch->smram_region, MEMORY_PRIO_LOW);
> > hw/vga-isa.c:MEMORY_PRIO_LOW);
> > hw/vga.c:MEMORY_PRIO_MEDIUM);
> > hw/vga.c:vga_io_memory, MEMORY_PRIO_LOW);
> > hw/xen_pt_msi.c:MEMORY_PRIO_MEDIUM); /* Priority: pci default + 1
> >
> > Making priority relative to parent but not the same just seems like a
> > recip
Re: [Qemu-devel] [PATCH RFC] memory: drop _overlap variant
On 14 February 2013 14:02, Michael S. Tsirkin wrote:
> Well that's the status quo. One of the issues is, you have
> no idea what else uses each priority. With this change,
> at least you can grep for it.

No, because most of the code you find will be setting
priorities for completely irrelevant containers (for
instance PCI doesn't care at all about priorities used
by the v7m NVIC).

> Imagine the specific example: ioapic and pci devices. ioapic has
> an address within the pci hole but it is not a subregion.
> If priority has no meaning how would you decide which one
> to use?

I don't know about the specifics of the PC's memory layout,
but *something* has to manage the address space that is
being set up. I would expect something like:

 * PCI host controller has a memory region (container) which
   all the PCI devices are mapped into as per guest programming
 * ioapic has a memory region
 * there is another container which contains both these
   memory regions. The code that controls and sets up that
   container [which is probably the pc board model] gets to
   decide priorities, which are purely local to it

(It's possible that at the moment the "another container" is
the get_system_memory() system address space. If it makes life
easier you can always invent another container to give you a
fresh level of indirection.)

> Also, on a PC many addresses are guest programmable. We need to behave
> in some defined way if guest programs addresses to something silly.

Yes, this is the job of the code controlling the container(s)
into which those memory regions may be mapped.

>> If the guest can
>> program overlap then presumably PCI specifies semantics
>> for what happens then, and there need to be PCI specific
>> wrappers that enforce those semantics and they can call
>> the relevant _overlap functions when mapping things.
>> In any case this isn't a concern for the PCI *device*,
>> which can just provide its memory regions. It's a problem
>> the PCI *host adaptor* has to deal with when it's figuring
>> out how to map those regions into the container which
>> corresponds to its area of the address space.
>
> Issue is, a PCI device overlapping something else suddenly
> becomes this something else's problem.

Nope, because the PCI host controller model should be in
complete control of the container all the PCI devices live
in, and it is the thing doing the mapping and unmapping
so it gets to set priorities and mark things as OK to
overlap. Also, memory.c permits overlap if either of the
two memory regions in question is marked as may-overlap;
they don't both have to be marked.

>> > We could add a wrapper for MEMORY_PRIO_LOWEST - will that address
>> > your concern?
>>
>> Well, I'm entirely happy with the memory API we have at
>> the moment, and I'm trying to figure out why you want to
>> change it...
>
> I am guessing your systems all have hardcoded addresses
> not controlled by guest.

Nope. omap_gpmc.c for instance has guest programmable subregions;
it uses a container so the guest's manipulation of these can't
leak out and cause weird things to happen to other bits of QEMU.
[I think we don't implement the correct guest-facing behaviour
when the guest asks for overlapping regions, but we shouldn't
hit the memory.c overlapping-region issue, or if we do it's
a bug to be fixed.]

There's also PCI on the versatilepb, but PCI devices can't
just appear anywhere, the PCI memory windows are at known
addresses and the PCI device can't escape from the wrong
side of the PCI controller.

>> >> Maybe we should take the printf() about subregion collisions
>> >> in memory_region_add_subregion_common() out of the #if 0
>> >> that it currently sits in?
>>
>> > This is just a debugging tool, it won't fix anything.
>>
>> It might tell us what bits of code are currently erroneously
>> mapping regions that overlap without using the _overlap()
>> function. Then we could fix them.
>
> When there is a single guest programmable device,
> any address can be overlapped by it.

Do we really have an example of a guest programmable
device where the *device itself* decides where it lives
in the address space, rather than the guest being able to
program a host controller/bus fabric/equivalent thing to
specify where the device should live, or the device
effectively negotiating with its bus controller? That
seems very implausible to me just because hardware itself
generally has some kind of hierarchy of buses and it's not
really possible for a leaf node to make itself appear
anywhere in the hierarchy; all it can do is by agreement
with the thing above it appear at some different address
at the same level.

[of course there are trivial systems with a totally flat
bus but that's just a degenerate case of the above where
there's only one thing (the board) managing a single
layer, and typically those systems have everything at
a fixed address anyhow.]

-- PMM
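
The "either region may be marked" rule and the collision printf()
mentioned above can be sketched as a check over the subregions of one
container. This is a toy model of the rule as described in this mail,
not the code in memory_region_add_subregion_common(): ToySubregion and
the toy_* functions are hypothetical names and the message text is
invented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {
    const char *name;
    uint64_t start, end;   /* half-open [start, end), container-relative */
    bool may_overlap;      /* set when mapped via an _overlap style call */
} ToySubregion;

static bool toy_intersects(const ToySubregion *a, const ToySubregion *b)
{
    return a->start < b->end && b->start < a->end;
}

/* Reports every pair of siblings that intersect while neither one is
 * marked may_overlap, and returns the number of such pairs. A pair is
 * accepted as soon as either side carries the flag. */
static size_t toy_report_collisions(const ToySubregion *subs, size_t n)
{
    size_t collisions = 0;

    assert(subs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            if (!toy_intersects(&subs[i], &subs[j])) {
                continue;
            }
            if (subs[i].may_overlap || subs[j].may_overlap) {
                continue;
            }
            fprintf(stderr, "warning: subregion collision %s/%s\n",
                    subs[i].name, subs[j].name);
            collisions++;
        }
    }
    return collisions;
}
```

Only siblings are compared, which matches the point that overlap is a
question for whoever owns the container. The flag says an overlap is
intentional; which region is actually visible is still decided by
priority, a separate attribute that this sketch leaves out.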
Re: [Qemu-devel] [PATCH RFC] memory: drop _overlap variant
On Thu, Feb 14, 2013 at 4:02 PM, Michael S. Tsirkin wrote:
> On Thu, Feb 14, 2013 at 01:22:18PM +, Peter Maydell wrote:
>> On 14 February 2013 13:09, Michael S. Tsirkin wrote:
>> > On Thu, Feb 14, 2013 at 12:56:02PM +, Peter Maydell wrote:
>> >> Up to the parent which controls the region being mapped into.
>> >
>> > We could just assume same priority as parent
>>
>> Er, no. I mean the code in control of the parent MR sets the
>> priority, when it calls memory_region_add_subregion_overlap().
>>
>> > but what happens if it
>> > has to be different? There are also aliases so a region
>> > can have multiple parents.
>>
>> The alias has its own priority.
>
> Well that's the status quo. One of the issues is, you have
> no idea what else uses each priority. With this change,
> at least you can grep for it.

The question "what priorities do aliases of this region have" is not
an interesting question. Priority is a local attribute, not an
attribute of the region being prioritized.

>
>> > Presumably it will have to have
>> > different priorities depending on what the parent does?
>> > Here's a list of instances using priority != 0.
>> >
>> > hw/armv7m_nvic.c:MEMORY_PRIO_LOW);
>>
>> So this one I know about, and it's a good example of what
>> I'm talking about. This function sets up a container memory
>> region ("nvic"), and it is in total control of what is
>> mapped into that container. Specifically, it puts in a
>> "nvic_sysregs" background region which covers the whole
>> 0x1000 size of the container (at an implicit priority of
>> zero). It then layers over that an alias of the GIC
>> registers ("nvic-gic") at a specific address and with
>> a priority of 1 so it appears above the background region.
>> Nobody else ever puts anything in this container, so
>> the only thing we care about is that the priority of
>> the nvic-gic region is higher than that of the nvic_sysregs
>> region; and it's clear from the code that we do that.
>> Priority is a local question whose meaning is only relevant
>> within a particular container region, not system-wide, and
>> having system-wide MEMORY_PRIO_ defines obscures that IMHO.
>
> Well that's not how it seems to work, and I don't see how it *could*
> work. Imagine the specific example: ioapic and pci devices. ioapic has
> an address within the pci hole but it is not a subregion.
> If priority has no meaning how would you decide which one
> to use?

Like PMM said. You look at the semantics of the hardware, and map
that onto the API. If the pci controller says that BARs hide the
ioapic, then you give them higher priority. If it says that the
ioapic hides BARs, then that gets higher priority. If it doesn't say
anything, take your pick (or give them the same priority).

>
> Also, on a PC many addresses are guest programmable. We need to behave
> in some defined way if guest programs addresses to something silly.

That's why _overlap exists.

> The only reason it works sometimes is because some systems
> use fixed addresses which never overlap.

That's why the no overlap API exists.

>
>> >>
>> >> I definitely don't like making the priority argument mandatory:
>> >> this is just introducing pointless boilerplate for the common
>> >> case where nothing overlaps and you know nothing overlaps.
>> >
>> > Non overlapping is not a common case at all. E.g. with normal PCI
>> > devices you have no way to know nothing overlaps - addresses are guest
>> > programmable.
>>
>> That means PCI is a special case :-)
>> If the guest can
>> program overlap then presumably PCI specifies semantics
>> for what happens then, and there need to be PCI specific
>> wrappers that enforce those semantics and they can call
>> the relevant _overlap functions when mapping things.
>> In any case this isn't a concern for the PCI *device*,
>> which can just provide its memory regions. It's a problem
>> the PCI *host adaptor* has to deal with when it's figuring
>> out how to map those regions into the container which
>> corresponds to its area of the address space.
>
> Issue is, a PCI device overlapping something else suddenly
> becomes this something else's problem.

It is not a problem at all.

>
>> >> Maybe we should take the printf() about subregion collisions
>> >> in memory_region_add_subregion_common() out of the #if 0
>> >> that it currently sits in?
>>
>> > This is just a debugging tool, it won't fix anything.
>>
>> It might tell us what bits of code are currently erroneously
>> mapping regions that overlap without using the _overlap()
>> function. Then we could fix them.
>>
>> -- PMM
>
> When there is a single guest programmable device,
> any address can be overlapped by it.

No. Only addresses within the same container. Other containers work
fine without overlap.

> We could invent rules like 'non overlappable is higher
> priority' but it seems completely arbitrary, a single
> priority is clearer.

It's just noise for the xx% of cases which don't need it.
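
To make "priority is a local attribute" concrete, here is a small
sketch of address resolution through nested containers. It is a toy
model under stated assumptions, not QEMU's memory API: ToyRegion,
toy_resolve and the example tree are hypothetical, the addresses are
illustrative, and a container that has nothing mapped at the address is
treated as transparent so lower-priority siblings can still answer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_CHILDREN 4

typedef struct ToyRegion ToyRegion;
struct ToyRegion {
    const char *name;
    uint64_t offset;    /* relative to the parent container */
    uint64_t size;
    int priority;       /* compared against siblings only */
    size_t nchildren;   /* 0 means a leaf (device) region */
    const ToyRegion *children[TOY_MAX_CHILDREN];
};

/* Returns the leaf that answers an access at addr (an address in the
 * coordinate space of r's parent), or NULL if nothing is mapped. */
static const ToyRegion *toy_resolve(const ToyRegion *r, uint64_t addr)
{
    const ToyRegion *best = NULL;
    int best_prio = 0;

    assert(r != NULL);
    if (addr < r->offset || addr - r->offset >= r->size) {
        return NULL;
    }
    if (r->nchildren == 0) {
        return r;
    }
    for (size_t i = 0; i < r->nchildren; i++) {
        const ToyRegion *child = r->children[i];
        const ToyRegion *hit = toy_resolve(child, addr - r->offset);

        /* Only the direct child's priority is compared here; the
         * priorities used deeper inside it never leak out. */
        if (hit != NULL && (best == NULL || child->priority > best_prio)) {
            best = hit;
            best_prio = child->priority;
        }
    }
    return best;
}

/* Example tree: a BAR with a numerically large priority inside the pci
 * container, and an ioapic page that is a sibling of that container. */
static const ToyRegion toy_bar = {
    .name = "bar", .offset = 0xfe000000u, .size = 0x02000000u,
    .priority = 5,
};
static const ToyRegion toy_pci = {
    .name = "pci", .offset = 0, .size = 0x100000000ull,
    .priority = 0, .nchildren = 1, .children = { &toy_bar },
};
static const ToyRegion toy_ioapic = {
    .name = "ioapic", .offset = 0xfec00000u, .size = 0x1000u,
    .priority = 1,
};
static const ToyRegion toy_system = {
    .name = "system", .offset = 0, .size = 0x100000000ull,
    .priority = 0, .nchildren = 2, .children = { &toy_pci, &toy_ioapic },
};
```

In this tree toy_resolve(&toy_system, 0xfec00000) returns the ioapic
even though the BAR carries priority 5, because the BAR only competes
with other children of "pci" while the ioapic competes with "pci"
itself (priority 1 against 0). Everywhere else in the BAR window the
BAR still answers. Whoever owns the "system" container picks those two
numbers without needing to know what PCI does internally.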
Re: [Qemu-devel] [PATCH RFC] memory: drop _overlap variant
On Thu, Feb 14, 2013 at 3:09 PM, Michael S. Tsirkin wrote:
> On Thu, Feb 14, 2013 at 12:56:02PM +, Peter Maydell wrote:
>> On 14 February 2013 12:45, Michael S. Tsirkin wrote:
>> > overlap flag in the region is currently unused, most devices have no
>> > idea whether their region overlaps with anything, so drop it,
>> > assume that all regions can overlap and always require priority.
>>
>> Devices themselves shouldn't care, for the most part -- they just
>> provide a memory region and it's their parent that has to map it
>> and know whether it overlaps or not. Similarly, parents should
>> generally be in control of the container they're mapping the
>> memory region into, and know whether it will be an overlapping
>> map or not.
>>
>> > It's also not clear how should devices allocate priorities.
>>
>> Up to the parent which controls the region being mapped into.
>
> We could just assume same priority as parent but what happens if it
> has to be different?

Priority is only considered relative to siblings. The parent's
priority is only considered wrt the parent's siblings, not its
children.

> There are also aliases so a region
> can have multiple parents. Presumably it will have to have
> different priorities depending on what the parent does?

The alias region has its own priority

> Here's a list of instances using priority != 0.
>
> hw/armv7m_nvic.c:MEMORY_PRIO_LOW);
> hw/cirrus_vga.c:MEMORY_PRIO_LOW);
> hw/cirrus_vga.c:MEMORY_PRIO_LOW);
> hw/cirrus_vga.c:&s->low_mem_container, MEMORY_PRIO_LOW);
> hw/kvm/pci-assign.c: &r_dev->mmio, MEMORY_PRIO_LOW);
> hw/kvmvapic.c:memory_region_add_subregion(as, rom_paddr, &s->rom, MEMORY_PRIO_HIGH);
> hw/lpc_ich9.c:MEMORY_PRIO_LOW);
> hw/onenand.c:&s->mapped_ram, MEMORY_PRIO_LOW);
> hw/pam.c:MEMORY_PRIO_LOW);
> hw/pc.c:MEMORY_PRIO_LOW);
> hw/pc_sysfw.c:isa_bios, MEMORY_PRIO_LOW);
> hw/pc_sysfw.c:isa_bios, MEMORY_PRIO_LOW);
> hw/pci/pci.c:MEMORY_PRIO_LOW);
> hw/pci/pci_bridge.c:memory_region_add_subregion(parent_space, base, alias, MEMORY_PRIO_LOW);
> hw/piix_pci.c:MEMORY_PRIO_LOW);
> hw/piix_pci.c:&d->rcr_mem, MEMORY_PRIO_LOW);
> hw/q35.c:&mch->smram_region, MEMORY_PRIO_LOW);
> hw/vga-isa.c:MEMORY_PRIO_LOW);
> hw/vga.c:MEMORY_PRIO_MEDIUM);
> hw/vga.c:vga_io_memory, MEMORY_PRIO_LOW);
> hw/xen_pt_msi.c:MEMORY_PRIO_MEDIUM); /* Priority: pci default + 1
>
> Making priority relative to parent but not the same just seems like a
> recipe for disaster.
>
>> I definitely don't like making the priority argument mandatory:
>> this is just introducing pointless boilerplate for the common
>> case where nothing overlaps and you know nothing overlaps.
>
> Non overlapping is not a common case at all. E.g. with normal PCI
> devices you have no way to know nothing overlaps - addresses are guest
> programmable.

Non overlapping is mostly useful for embedded platforms.
Re: [Qemu-devel] [PATCH RFC] memory: drop _overlap variant
On Thu, Feb 14, 2013 at 01:22:18PM +, Peter Maydell wrote: > On 14 February 2013 13:09, Michael S. Tsirkin wrote: > > On Thu, Feb 14, 2013 at 12:56:02PM +, Peter Maydell wrote: > >> Up to the parent which controls the region being mapped into. > > > > We could just assume same priority as parent > > Er, no. I mean the code in control of the parent MR sets the > priority, when it calls memory_region_add_subregion_overlap(). > > > but what happens if it > > has to be different? There are also aliases so a region > > can have multiple parents. > > The alias has its own priority. Well that's the status quo. One of the issues is, you have no idea what else uses each priority. With this change, at least you can grep for it. > > Presumably it will have to have > > different priorities depending on what the parent does? > > Here's a list of instances using priority != 0. > > > > hw/armv7m_nvic.c:MEMORY_PRIO_LOW); > > So this one I know about, and it's a good example of what > I'm talking about. This function sets up a container memory > region ("nvic"), and it is in total control of what is > mapped into that container. Specifically, it puts in a > "nvic_sysregs" background region which covers the whole > 0x1000 size of the container (at an implicit priority of > zero). It then layers over that an alias of the GIC > registers ("nvic-gic") at a specific address and with > a priority of 1 so it appears above the background region. > Nobody else ever puts anything in this container, so > the only thing we care about is that the priority of > the nvic-gic region is higher than that of the nvic_sysregs > region; and it's clear from the code that we do that. > Priority is a local question whose meaning is only relevant > within a particular container region, not system-wide, > and > having system-wide MEMORY_PRIO_ defines obscures that IMHO. Well that's not how it seems to work, and I don't see how it *could* work. Imagine the specific example: ioapic and pci devices. 
ioapic has an address within the pci hole but it is not a subregion. If priority has no meaning how would you decide which one to use? Also, on a PC many addresses are guest programmable. We need to behave in some defined way if the guest programs addresses to something silly. The only reason it works sometimes is because some systems use fixed addresses which never overlap. > > >> I definitely don't like making the priority argument mandatory: > >> this is just introducing pointless boilerplate for the common > >> case where nothing overlaps and you know nothing overlaps. > > > > Non overlapping is not a common case at all. E.g. with normal PCI > > devices you have no way to know nothing overlaps - addresses are guest > > programmable. > > That means PCI is a special case :-) > If the guest can > program overlap then presumably PCI specifies semantics > for what happens then, and there need to be PCI specific > wrappers that enforce those semantics and they can call > the relevant _overlap functions when mapping things. > In any case this isn't a concern for the PCI *device*, > which can just provide its memory regions. It's a problem > the PCI *host adaptor* has to deal with when it's figuring > out how to map those regions into the container which > corresponds to its area of the address space. The issue is, a PCI device overlapping something else suddenly becomes this something else's problem. > > We could add a wrapper for MEMORY_PRIO_LOWEST - will that address > > your concern? > > Well, I'm entirely happy with the memory API we have at > the moment, and I'm trying to figure out why you want to > change it... I am guessing your systems all have hardcoded addresses not controlled by the guest. > >> Maybe we should take the printf() about subregion collisions > >> in memory_region_add_subregion_common() out of the #if 0 > >> that it currently sits in? > > > This is just a debugging tool, it won't fix anything. 
> > It might tell us what bits of code are currently erroneously > mapping regions that overlap without using the _overlap() > function. Then we could fix them. > > -- PMM When there is a single guest-programmable device, any address can be overlapped by it. We could invent rules like 'non-overlappable is higher priority' but that seems completely arbitrary; a single priority is clearer. -- MST
Re: [Qemu-devel] [PATCH RFC] memory: drop _overlap variant
On 14 February 2013 13:09, Michael S. Tsirkin wrote: > On Thu, Feb 14, 2013 at 12:56:02PM +, Peter Maydell wrote: >> Up to the parent which controls the region being mapped into. > > We could just assume same priority as parent Er, no. I mean the code in control of the parent MR sets the priority, when it calls memory_region_add_subregion_overlap(). > but what happens if it > has to be different? There are also aliases so a region > can have multiple parents. The alias has its own priority. > Presumably it will have to have > different priorities depending on what the parent does? > Here's a list of instances using priority != 0. > > hw/armv7m_nvic.c:MEMORY_PRIO_LOW); So this one I know about, and it's a good example of what I'm talking about. This function sets up a container memory region ("nvic"), and it is in total control of what is mapped into that container. Specifically, it puts in a "nvic_sysregs" background region which covers the whole 0x1000 size of the container (at an implicit priority of zero). It then layers over that an alias of the GIC registers ("nvic-gic") at a specific address and with a priority of 1 so it appears above the background region. Nobody else ever puts anything in this container, so the only thing we care about is that the priority of the nvic-gic region is higher than that of the nvic_sysregs region; and it's clear from the code that we do that. Priority is a local question whose meaning is only relevant within a particular container region, not system-wide, and having system-wide MEMORY_PRIO_ defines obscures that IMHO. >> I definitely don't like making the priority argument mandatory: >> this is just introducing pointless boilerplate for the common >> case where nothing overlaps and you know nothing overlaps. > > Non overlapping is not a common case at all. E.g. with normal PCI > devices you have no way to know nothing overlaps - addresses are guest > programmable. 
That means PCI is a special case :-) If the guest can program overlap then presumably PCI specifies semantics for what happens then, and there need to be PCI specific wrappers that enforce those semantics and they can call the relevant _overlap functions when mapping things. In any case this isn't a concern for the PCI *device*, which can just provide its memory regions. It's a problem the PCI *host adaptor* has to deal with when it's figuring out how to map those regions into the container which corresponds to its area of the address space. > We could add a wrapper for MEMORY_PRIO_LOWEST - will that address > your concern? Well, I'm entirely happy with the memory API we have at the moment, and I'm trying to figure out why you want to change it... >> Maybe we should take the printf() about subregion collisions >> in memory_region_add_subregion_common() out of the #if 0 >> that it currently sits in? > This is just a debugging tool, it won't fix anything. It might tell us what bits of code are currently erroneously mapping regions that overlap without using the _overlap() function. Then we could fix them. -- PMM
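Editorial sketch: the nvic container described above (a background region at priority 0 with an overlay at priority 1) can be modelled as a flat priority lookup. This is a toy model, not the code in hw/armv7m_nvic.c. The `Sub` type and `nvic_decode()` are hypothetical names. The 0x1000 container size comes from the message above, but the overlay offset and size are assumed values chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { NVIC_NONE, NVIC_SYSREGS, NVIC_GIC } NvicTarget;

/* One subregion of the container; a stand-in, not QEMU's MemoryRegion. */
typedef struct {
    NvicTarget target;
    uint64_t offset;    /* offset within the container */
    uint64_t size;
    int priority;       /* meaningful only inside this container */
} Sub;

#define NVIC_SIZE      0x1000u  /* container size, as stated in the thread */
#define GIC_ALIAS_OFF  0x100u   /* assumed overlay offset */
#define GIC_ALIAS_SIZE 0xc00u   /* assumed overlay size */

static const Sub nvic_subs[] = {
    { NVIC_SYSREGS, 0,             NVIC_SIZE,      0 },  /* background */
    { NVIC_GIC,     GIC_ALIAS_OFF, GIC_ALIAS_SIZE, 1 },  /* overlay */
};

/* Return which subregion answers an access at the given offset. */
static NvicTarget nvic_decode(uint64_t offset)
{
    const Sub *best = NULL;
    size_t n = sizeof(nvic_subs) / sizeof(nvic_subs[0]);

    /* The overlay must lie entirely inside the container. */
    assert(GIC_ALIAS_OFF + GIC_ALIAS_SIZE <= NVIC_SIZE);

    for (size_t i = 0; i < n; i++) {
        const Sub *s = &nvic_subs[i];

        if (offset < s->offset || offset - s->offset >= s->size) {
            continue;   /* this subregion does not cover the offset */
        }
        if (!best || s->priority > best->priority) {
            best = s;   /* higher priority hides lower priority */
        }
    }
    return best ? best->target : NVIC_NONE;
}
```

The only property the container's owner has to guarantee is that 1 is greater than 0. No other code maps anything into this container, so the two numbers never need to be coordinated with priorities used elsewhere in the system.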
Re: [Qemu-devel] [PATCH RFC] memory: drop _overlap variant
On Thu, Feb 14, 2013 at 12:56:02PM +, Peter Maydell wrote: > On 14 February 2013 12:45, Michael S. Tsirkin wrote: > > overlap flag in the region is currently unused, most devices have no > > idea whether their region overlaps with anything, so drop it, > > assume that all regions can overlap and always require priority. > > Devices themselves shouldn't care, for the most part -- they just > provide a memory region and it's their parent that has to map it > and know whether it overlaps or not. Similarly, parents should > generally be in control of the container they're mapping the > memory region into, and know whether it will be an overlapping > map or not. > > > It's also not clear how should devices allocate priorities. > > Up to the parent which controls the region being mapped into. We could just assume same priority as parent but what happens if it has to be different? There are also aliases so a region can have multiple parents. Presumably it will have to have different priorities depending on what the parent does? Here's a list of instances using priority != 0. 
hw/armv7m_nvic.c:MEMORY_PRIO_LOW); hw/cirrus_vga.c:MEMORY_PRIO_LOW); hw/cirrus_vga.c:MEMORY_PRIO_LOW); hw/cirrus_vga.c:&s->low_mem_container, MEMORY_PRIO_LOW); hw/kvm/pci-assign.c: &r_dev->mmio, MEMORY_PRIO_LOW); hw/kvmvapic.c:memory_region_add_subregion(as, rom_paddr, &s->rom, MEMORY_PRIO_HIGH); hw/lpc_ich9.c:MEMORY_PRIO_LOW); hw/onenand.c:&s->mapped_ram, MEMORY_PRIO_LOW); hw/pam.c:MEMORY_PRIO_LOW); hw/pc.c:MEMORY_PRIO_LOW); hw/pc_sysfw.c:isa_bios, MEMORY_PRIO_LOW); hw/pc_sysfw.c:isa_bios, MEMORY_PRIO_LOW); hw/pci/pci.c:MEMORY_PRIO_LOW); hw/pci/pci_bridge.c:memory_region_add_subregion(parent_space, base, alias, MEMORY_PRIO_LOW); hw/piix_pci.c:MEMORY_PRIO_LOW); hw/piix_pci.c:&d->rcr_mem, MEMORY_PRIO_LOW); hw/q35.c:&mch->smram_region, MEMORY_PRIO_LOW); hw/vga-isa.c:MEMORY_PRIO_LOW); hw/vga.c:MEMORY_PRIO_MEDIUM); hw/vga.c:vga_io_memory, MEMORY_PRIO_LOW); hw/xen_pt_msi.c:MEMORY_PRIO_MEDIUM); /* Priority: pci default + 1 Making priority relative to parent but not the same just seems like a recipe for disaster. > I definitely don't like making the priority argument mandatory: > this is just introducing pointless boilerplate for the common > case where nothing overlaps and you know nothing overlaps. Non overlapping is not a common case at all. E.g. with normal PCI devices you have no way to know nothing overlaps - addresses are guest programmable. See also recent discussion about 64 bit BARs. We could add a wrapper for MEMORY_PRIO_LOWEST - will that address your concern? > Maybe we should take the printf() about subregion collisions > in memory_region_add_subregion_common() out of the #if 0 > that it currently sits in? > > -- PMM This is just a debugging tool, it won't fix anything. -- MST
Re: [Qemu-devel] [PATCH RFC] memory: drop _overlap variant
On 14 February 2013 12:45, Michael S. Tsirkin wrote: > overlap flag in the region is currently unused, most devices have no > idea whether their region overlaps with anything, so drop it, > assume that all regions can overlap and always require priority. Devices themselves shouldn't care, for the most part -- they just provide a memory region and it's their parent that has to map it and know whether it overlaps or not. Similarly, parents should generally be in control of the container they're mapping the memory region into, and know whether it will be an overlapping map or not. > It's also not clear how should devices allocate priorities. Up to the parent which controls the region being mapped into. I definitely don't like making the priority argument mandatory: this is just introducing pointless boilerplate for the common case where nothing overlaps and you know nothing overlaps. Maybe we should take the printf() about subregion collisions in memory_region_add_subregion_common() out of the #if 0 that it currently sits in? -- PMM
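Editorial sketch: the collision warning mentioned above amounts to an interval-intersection check over mappings that were not declared as overlapping. The code below is a standalone toy model of that idea and not the disabled code in memory_region_add_subregion_common(). The `Mapping` type, its `may_overlap` field, and `report_collisions()` are hypothetical names, and the model assumes `addr + size` does not wrap around.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a mapped subregion; not QEMU's MemoryRegion. */
typedef struct {
    const char *name;
    uint64_t addr;      /* offset within the container */
    uint64_t size;
    bool may_overlap;   /* true if mapped through an _overlap()-style call */
} Mapping;

/* Half-open ranges [addr, addr + size) intersect if and only if each
 * one starts before the other ends. Assumes addr + size does not wrap. */
static bool ranges_intersect(const Mapping *a, const Mapping *b)
{
    return a->addr < b->addr + b->size && b->addr < a->addr + a->size;
}

/* Warn about each pair that overlaps although neither side asked for it.
 * Returns the number of such pairs. */
static size_t report_collisions(const Mapping *maps, size_t n)
{
    size_t collisions = 0;

    assert(maps != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            const Mapping *a = &maps[i];
            const Mapping *b = &maps[j];

            if (a->may_overlap || b->may_overlap) {
                continue;   /* overlap was declared, so it is not an error */
            }
            if (!ranges_intersect(a, b)) {
                continue;
            }
            fprintf(stderr,
                    "warning: subregion collision %s [%" PRIx64 "+%" PRIx64
                    "] vs %s [%" PRIx64 "+%" PRIx64 "]\n",
                    a->name, a->addr, a->size, b->name, b->addr, b->size);
            collisions++;
        }
    }
    return collisions;
}
```

As noted elsewhere in the thread, a check like this only reports problems. With guest-programmable addresses, a mapping that is collision-free at setup time can still be moved on top of another one later.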