Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>> On 04.09.18 at 18:24, wrote:
> On 04/09/18 17:11, Juergen Gross wrote:
>> On 16/08/18 13:27, Jan Beulich wrote:
>>> On 16.08.18 at 12:56, wrote:
>>>> On 16/08/18 11:29, Jan Beulich wrote:
>>>>> Following some further discussion with Andrew, he looks to be
>>>>> convinced that the issue is to be fixed in the balloon driver,
>>>>> which so far (intentionally afaict) does not remove the direct
>>>>> map entries for ballooned out pages in the HVM case. I'm not
>>>>> convinced of this, but I'd nevertheless like to inquire whether
>>>>> such a change (resulting in shattered super page mappings)
>>>>> would be acceptable in the first place.
>>>>
>>>> We don't tolerate anything else in the directmap pointing to
>>>> invalid/unimplemented frames. Why should ballooning be any different?
>>>
>>> Because ballooning is something virtualization specific, which
>>> doesn't have any equivalent on bare hardware (memory hot
>>> unplug doesn't come quite close enough imo, not the least
>>> because that doesn't work on a page granular basis). Hence
>>> we're to define the exact behavior here, and hence such a
>>> definition could as well include special behavior of accesses
>>> to the involved guest-physical addresses.
>>
>> After discussing the issue with some KVM guys I still think it would be
>> better to leave the ballooned pages mapped in the direct map. KVM does
>> it the same way. They return "something" in case the guest tries to
>> read from such a page (might be the real data, 0's or all 1's).
>>
>> So we should either map an all 0's or 1's page via EPT, or we should
>> return 0's or 1's via emulation of the read instruction.
>>
>> Performance shouldn't be a major issue, as such reads should be really
>> rare.
>
> Such reads should be non-existent. One way or another, there's still a
> bug to fix in the kernel, because it isn't keeping suitable track of
> the pfns.

So you put yourself in opposition to both what KVM and Xen do in their
Linux implementations. I can only re-iterate: We're talking about a PV
extension here. Behavior of this is entirely defined by us. Hence it is
not a given that "such reads should be non-existent".

> As for how Xen could do things better...
>
> We could map a page of all-ones (all zeroes would definitely be wrong),
> but you've still got the problem of what happens if a write occurs. We
> absolutely can't sacrifice enough RAM to fill in the ballooned-out
> frames with read/write frames.

Of course, or else the ballooning effect would be nullified. However,
besides a page full of 0s or 1s, a simple "sink" page could also be
used, where reads return undefined data (i.e. whatever has last been
written into it through one of its perhaps very many aliases). Another
possibility for the sink page would be a (hardware) MMIO one we know
has no actual device backing it, thus allowing writes to be terminated
(discarded) by hardware, and reads to return all ones (again due to
hardware behavior). The question is how we would universally find such
a page (accesses to which must obviously not have any other side
effects).

> I'd prefer not to see any emulation here, but that is more from an
> attack surface limitation point of view. x86 still offers us the
> option to not tolerate misaligned accesses and terminate early
> write-discard when hitting one of these pages.

Well - for now we have the series hopefully fixing the emulation
misbehavior here (and elsewhere at the same time). But I certainly
appreciate your desire for there not being emulation here in the first
place. Which I think leaves as the only option the sink page described
above.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
On 04/09/18 17:11, Juergen Gross wrote:
> On 16/08/18 13:27, Jan Beulich wrote:
>> On 16.08.18 at 12:56, wrote:
>>> On 16/08/18 11:29, Jan Beulich wrote:
>>>> Following some further discussion with Andrew, he looks to be
>>>> convinced that the issue is to be fixed in the balloon driver,
>>>> which so far (intentionally afaict) does not remove the direct
>>>> map entries for ballooned out pages in the HVM case. I'm not
>>>> convinced of this, but I'd nevertheless like to inquire whether
>>>> such a change (resulting in shattered super page mappings)
>>>> would be acceptable in the first place.
>>>
>>> We don't tolerate anything else in the directmap pointing to
>>> invalid/unimplemented frames. Why should ballooning be any different?
>>
>> Because ballooning is something virtualization specific, which
>> doesn't have any equivalent on bare hardware (memory hot
>> unplug doesn't come quite close enough imo, not the least
>> because that doesn't work on a page granular basis). Hence
>> we're to define the exact behavior here, and hence such a
>> definition could as well include special behavior of accesses
>> to the involved guest-physical addresses.
>
> After discussing the issue with some KVM guys I still think it would be
> better to leave the ballooned pages mapped in the direct map. KVM does
> it the same way. They return "something" in case the guest tries to
> read from such a page (might be the real data, 0's or all 1's).
>
> So we should either map an all 0's or 1's page via EPT, or we should
> return 0's or 1's via emulation of the read instruction.
>
> Performance shouldn't be a major issue, as such reads should be really
> rare.

Such reads should be non-existent. One way or another, there's still a
bug to fix in the kernel, because it isn't keeping suitable track of
the pfns.

As for how Xen could do things better...

We could map a page of all-ones (all zeroes would definitely be wrong),
but you've still got the problem of what happens if a write occurs. We
absolutely can't sacrifice enough RAM to fill in the ballooned-out
frames with read/write frames.

I'd prefer not to see any emulation here, but that is more from an
attack surface limitation point of view. x86 still offers us the option
to not tolerate misaligned accesses and terminate early write-discard
when hitting one of these pages.

~Andrew
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
On 16/08/18 13:27, Jan Beulich wrote:
> On 16.08.18 at 12:56, wrote:
>> On 16/08/18 11:29, Jan Beulich wrote:
>>> Following some further discussion with Andrew, he looks to be
>>> convinced that the issue is to be fixed in the balloon driver,
>>> which so far (intentionally afaict) does not remove the direct
>>> map entries for ballooned out pages in the HVM case. I'm not
>>> convinced of this, but I'd nevertheless like to inquire whether
>>> such a change (resulting in shattered super page mappings)
>>> would be acceptable in the first place.
>>
>> We don't tolerate anything else in the directmap pointing to
>> invalid/unimplemented frames. Why should ballooning be any different?
>
> Because ballooning is something virtualization specific, which
> doesn't have any equivalent on bare hardware (memory hot
> unplug doesn't come quite close enough imo, not the least
> because that doesn't work on a page granular basis). Hence
> we're to define the exact behavior here, and hence such a
> definition could as well include special behavior of accesses
> to the involved guest-physical addresses.

After discussing the issue with some KVM guys I still think it would be
better to leave the ballooned pages mapped in the direct map. KVM does
it the same way. They return "something" in case the guest tries to
read from such a page (might be the real data, 0's or all 1's).

So we should either map an all 0's or 1's page via EPT, or we should
return 0's or 1's via emulation of the read instruction.

Performance shouldn't be a major issue, as such reads should be really
rare.

Juergen
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
On Thu, Aug 30, Jan Beulich wrote:
> approach): One is Paul's idea of making null_handler actually retrieve
> RAM contents when (part of) the access touches RAM. Another might

This works for me:

static int null_read(const struct hvm_io_handler *io_handler,
                     uint64_t addr, uint32_t size, uint64_t *data)
{
    struct vcpu *curr = current;
    struct domain *currd = curr->domain;
    p2m_type_t p2mt = p2m_invalid;
    unsigned long gmfn = paddr_to_pfn(addr);
    struct page_info *page;
    char *p;

    get_gfn_query_unlocked(currd, gmfn, &p2mt);
    if ( p2mt != p2m_ram_rw )
    {
        *data = ~0ul;
    }
    else
    {
        page = get_page_from_gfn(currd, gmfn, NULL, P2M_UNSHARE);
        if ( !page )
        {
            memset(data, 0xee, size);
        }
        else
        {
            p = (char *)__map_domain_page(page) + (addr & ~PAGE_MASK);
            memcpy(data, p, size);
            unmap_domain_page(p);
            put_page(page);
        }
    }

    return X86EMUL_OKAY;
}

Olaf
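[Editorial note: for readers without the Xen tree at hand, the decision
ladder in null_read() above can be modelled in plain userspace C. The
lookup helpers below are hypothetical stand-ins for
get_gfn_query_unlocked()/get_page_from_gfn(), not the real Xen APIs:
non-RAM reads as all ones, RAM whose frame can't be grabbed is filled
with a 0xee poison pattern, and ordinary RAM is copied through.]

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u

/* Toy stand-ins for the p2m lookups (hypothetical, userspace only). */
enum p2mt { P2M_INVALID, P2M_RAM_RW };
static uint8_t ram_page[PAGE_SIZE];          /* backing for gfn 0 */

static enum p2mt lookup_type(unsigned long gfn)
{
    return gfn == 0 ? P2M_RAM_RW : P2M_INVALID;
}

static uint8_t *lookup_page(unsigned long gfn)
{
    return gfn == 0 ? ram_page : NULL;
}

/* Same decision ladder as null_read() above: non-RAM reads as ~0,
 * RAM whose frame can't be mapped gets a 0xee poison fill, and
 * ordinary RAM is copied through. */
static void null_read_model(uint64_t addr, uint32_t size, uint64_t *data)
{
    unsigned long gfn = addr / PAGE_SIZE;

    if ( lookup_type(gfn) != P2M_RAM_RW )
        *data = ~0ul;
    else
    {
        uint8_t *p = lookup_page(gfn);

        if ( !p )
            memset(data, 0xee, size);
        else
            memcpy(data, p + (addr & (PAGE_SIZE - 1)), size);
    }
}
```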
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>> On 30.08.18 at 10:10, wrote:
> On Wed, Aug 29, Olaf Hering wrote:
>> On Mon, Aug 13, Jan Beulich wrote:
>>> And hence the consideration of mapping in an all zeros page
>>> instead. This is because of the way __hvmemul_read() /
>>> __hvm_copy() work: The latter doesn't tell its caller how many
>>> bytes it was able to read, and hence the former considers the
>>> entire range MMIO (and forwards the request for emulation).
>>> Of course all of this is an issue only because
>>> hvmemul_virtual_to_linear() sees no need to split the request
>>> at the page boundary, due to the balloon driver having left in
>>> place the mapping of the ballooned out page.
>
> So how is this bug supposed to be fixed?
>
> What I see in my tracing is that __hvmemul_read gets called with
> gla==880223b9/bytes==8. Then hvm_copy_from_guest_linear fills
> the buffer from gpa 223b9 with data, but finally it returns
> HVMTRANS_bad_gfn_to_mfn, which it got from a failed get_page_from_gfn
> for the second page.
>
> Now things go downhill. hvmemul_linear_mmio_read is called, which calls
> hvmemul_do_io/hvm_io_intercept. That returns X86EMUL_UNHANDLEABLE. As a
> result hvm_process_io_intercept(null_handler) is called, which
> overwrites the return buffer with 0xff.

There are a number of options (besides fixing the issue on the Linux
side, which I continue to not be entirely convinced of being the best
approach): One is Paul's idea of making null_handler actually retrieve
RAM contents when (part of) the access touches RAM. Another might be to
make __hvm_copy() return back what parts of the access could be
read/written (so that MMIO emulation would only be triggered for the
missing piece). A third might be to make the splitting of accesses more
intelligent in __hvmemul_read(). I'm meaning to look into this in some
more detail later today, unless a patch has appeared by then from e.g.
Paul.

Jan
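[Editorial note: the second of Jan's options, having __hvm_copy() report
how far it got, can be sketched in isolation. This is a userspace toy
(guest_page() and the error convention are made up for illustration),
not the actual Xen interface; the point is merely that returning a byte
count lets the caller forward only the uncopied tail to MMIO emulation
instead of the whole range.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u

/* Toy model: guest pfn 0 is backed by RAM, pfn 1 is ballooned out. */
static uint8_t page0[PAGE_SIZE];

static uint8_t *guest_page(unsigned long pfn)
{
    return pfn == 0 ? page0 : NULL;     /* pfn 1 has no backing frame */
}

/*
 * Variant of a guest-copy helper that reports partial progress: on a
 * missing page it returns an error but sets *done to the number of
 * bytes successfully copied, so the caller can hand off only the
 * remainder (e.g. to MMIO emulation).
 */
static int copy_from_guest(void *buf, unsigned long gaddr, size_t len,
                           size_t *done)
{
    *done = 0;
    while ( len )
    {
        unsigned long off = gaddr & (PAGE_SIZE - 1);
        size_t chunk = PAGE_SIZE - off < len ? PAGE_SIZE - off : len;
        uint8_t *p = guest_page(gaddr / PAGE_SIZE);

        if ( !p )
            return -1;                  /* caller sees how far we got */
        memcpy((uint8_t *)buf + *done, p + off, chunk);
        *done += chunk;
        gaddr += chunk;
        len -= chunk;
    }
    return 0;
}
```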
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
On Wed, Aug 29, Olaf Hering wrote:
> On Mon, Aug 13, Jan Beulich wrote:
>> And hence the consideration of mapping in an all zeros page
>> instead. This is because of the way __hvmemul_read() /
>> __hvm_copy() work: The latter doesn't tell its caller how many
>> bytes it was able to read, and hence the former considers the
>> entire range MMIO (and forwards the request for emulation).
>> Of course all of this is an issue only because
>> hvmemul_virtual_to_linear() sees no need to split the request
>> at the page boundary, due to the balloon driver having left in
>> place the mapping of the ballooned out page.

So how is this bug supposed to be fixed?

What I see in my tracing is that __hvmemul_read gets called with
gla==880223b9/bytes==8. Then hvm_copy_from_guest_linear fills
the buffer from gpa 223b9 with data, but finally it returns
HVMTRANS_bad_gfn_to_mfn, which it got from a failed get_page_from_gfn
for the second page.

Now things go downhill. hvmemul_linear_mmio_read is called, which calls
hvmemul_do_io/hvm_io_intercept. That returns X86EMUL_UNHANDLEABLE. As a
result hvm_process_io_intercept(null_handler) is called, which
overwrites the return buffer with 0xff.

Olaf
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>> On 29.08.18 at 13:09, wrote:
> On 29/08/18 12:00, Olaf Hering wrote:
>> On Wed, Aug 29, Andrew Cooper wrote:
>>> Architecturally speaking, handing #MC back is probably the closest we
>>> can get to sensible behaviour, but it is still a bug that Linux is
>>> touching the ballooned out page in the first place.
>>
>> Well, the issue is that a read crosses a page boundary. If that were
>> forbidden, load_unaligned_zeropad() would not exist. It can not know
>> what is in the following page. And such page crossing happens also in
>> the unballooned case. Sadly I can not trigger the reported NFS bug
>> myself. But it can be enforced by ballooning enough pages so that an
>> allocated readdir reply eventually is right in front of a ballooned
>> page.
>
> The Linux bug is not shooting the ballooned page out of the directmap.
> Linux should be taking a fatal #PF for that read, because it's a
> virtual mapping for a frame which Linux has voluntarily elected to
> make invalid.
>
> As Xen can't prevent Linux from making/maintaining such an invalid
> mapping, throwing #MC back is the next best thing, because terminating
> the access with ~0 is just going to hide the bug, and run at a glacial
> pace while doing so.

I still do not understand why you think so: Handing back ~0 is far more
correct than raising #MC imo. The x86 architecture is bound to its
history, and in pre-Pentium days there was no #MC to be raised in the
first place. Furthermore, while I can see that _some other_ bug may be
hidden this way, there's no bug at all to be hidden in
load_unaligned_zeropad() (leaving aside the balloon driver behavior).

Jan
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
On 29/08/18 13:09, Andrew Cooper wrote:
> On 29/08/18 12:00, Olaf Hering wrote:
>> On Wed, Aug 29, Andrew Cooper wrote:
>>> Architecturally speaking, handing #MC back is probably the closest we
>>> can get to sensible behaviour, but it is still a bug that Linux is
>>> touching the ballooned out page in the first place.
>>
>> Well, the issue is that a read crosses a page boundary. If that were
>> forbidden, load_unaligned_zeropad() would not exist. It can not know
>> what is in the following page. And such page crossing happens also in
>> the unballooned case. Sadly I can not trigger the reported NFS bug
>> myself. But it can be enforced by ballooning enough pages so that an
>> allocated readdir reply eventually is right in front of a ballooned
>> page.
>
> The Linux bug is not shooting the ballooned page out of the directmap.
> Linux should be taking a fatal #PF for that read, because it's a
> virtual mapping for a frame which Linux has voluntarily elected to
> make invalid.
>
> As Xen can't prevent Linux from making/maintaining such an invalid
> mapping, throwing #MC back is the next best thing, because terminating
> the access with ~0 is just going to hide the bug, and run at a glacial
> pace while doing so.

I think you are right: the kernel should in no case access a random
page without knowing it is RAM. Hitting a ballooned page is just much
more probable than hitting a MMIO page this way. There are _no_ guard
pages around MMIO areas, so it could in theory happen that
load_unaligned_zeropad() would access an MMIO area, triggering random
behavior.

So removing ballooned pages from the directmap just hides an underlying
problem.

Juergen
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
On 29/08/18 12:09, Andrew Cooper wrote:
> On 29/08/18 12:00, Olaf Hering wrote:
>> On Wed, Aug 29, Andrew Cooper wrote:
>>> Architecturally speaking, handing #MC back is probably the closest we
>>> can get to sensible behaviour, but it is still a bug that Linux is
>>> touching the ballooned out page in the first place.
>>
>> Well, the issue is that a read crosses a page boundary. If that were
>> forbidden, load_unaligned_zeropad() would not exist. It can not know
>> what is in the following page. And such page crossing happens also in
>> the unballooned case. Sadly I can not trigger the reported NFS bug
>> myself. But it can be enforced by ballooning enough pages so that an
>> allocated readdir reply eventually is right in front of a ballooned
>> page.
>
> The Linux bug is not shooting the ballooned page out of the directmap.
> Linux should be taking a fatal #PF for that read, because it's a
> virtual mapping for a frame which Linux has voluntarily elected to
> make invalid.
>
> As Xen can't prevent Linux from making/maintaining such an invalid
> mapping, throwing #MC back is the next best thing, because terminating
> the access with ~0 is just going to hide the bug, and run at a glacial
> pace while doing so.

Yes - having looked at load_unaligned_zeropad(), it exists explicitly
to cope with #PF occurring from the page crossing. This reinforces my
opinion that the underlying bug is not shooting ballooned-out pages out
of the directmap.

~Andrew
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
On 29/08/18 12:00, Olaf Hering wrote:
> On Wed, Aug 29, Andrew Cooper wrote:
>
>> Architecturally speaking, handing #MC back is probably the closest we
>> can get to sensible behaviour, but it is still a bug that Linux is
>> touching the ballooned out page in the first place.
>
> Well, the issue is that a read crosses a page boundary. If that were
> forbidden, load_unaligned_zeropad() would not exist. It can not know
> what is in the following page. And such page crossing happens also in
> the unballooned case. Sadly I can not trigger the reported NFS bug
> myself. But it can be enforced by ballooning enough pages so that an
> allocated readdir reply eventually is right in front of a ballooned
> page.

The Linux bug is not shooting the ballooned page out of the directmap.
Linux should be taking a fatal #PF for that read, because it's a
virtual mapping for a frame which Linux has voluntarily elected to make
invalid.

As Xen can't prevent Linux from making/maintaining such an invalid
mapping, throwing #MC back is the next best thing, because terminating
the access with ~0 is just going to hide the bug, and run at a glacial
pace while doing so.

~Andrew
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
On Wed, Aug 29, Andrew Cooper wrote:

> Architecturally speaking, handing #MC back is probably the closest we
> can get to sensible behaviour, but it is still a bug that Linux is
> touching the ballooned out page in the first place.

Well, the issue is that a read crosses a page boundary. If that were
forbidden, load_unaligned_zeropad() would not exist. It can not know
what is in the following page. And such page crossing happens also in
the unballooned case. Sadly I can not trigger the reported NFS bug
myself. But it can be enforced by ballooning enough pages so that an
allocated readdir reply eventually is right in front of a ballooned
page.

Olaf
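[Editorial note: the contract of load_unaligned_zeropad() that Olaf
relies on can be stated in plain C. The real Linux function performs one
potentially page-crossing 8-byte load and, if the second page faults, an
exception fixup makes the inaccessible bytes read as zero. Below is a
minimal model of that recovered result only, not the asm fixup itself;
little-endian byte order is assumed, and `valid` counts the bytes that
precede the missing page.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Semantic model of Linux's load_unaligned_zeropad(): the loaded word
 * contains the accessible bytes in their normal (little-endian)
 * positions, and every byte that would have come from the faulting
 * page reads as zero.
 */
static uint64_t zeropad_result(const uint8_t *p, size_t valid)
{
    uint64_t v = 0;

    /* little-endian: byte i of the load lands at bit position 8*i */
    memcpy(&v, p, valid < 8 ? valid : 8);
    return v;
}
```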
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
On 29/08/18 11:36, Olaf Hering wrote:
> On Mon, Aug 13, Jan Beulich wrote:
>
>> And hence the consideration of mapping in an all zeros page
>> instead. This is because of the way __hvmemul_read() /
>> __hvm_copy() work: The latter doesn't tell its caller how many
>> bytes it was able to read, and hence the former considers the
>> entire range MMIO (and forwards the request for emulation).
>> Of course all of this is an issue only because
>> hvmemul_virtual_to_linear() sees no need to split the request
>> at the page boundary, due to the balloon driver having left in
>> place the mapping of the ballooned out page.
>
> Should perhaps __hvm_copy detect the fault and copy 0xf for the
> unavailable page into 'buf', and finally return success?
>
> Clearly something must be done at the Xen level.

This is first and foremost a Linux bug. No amount of fixing Xen is
going to alter that.

Architecturally speaking, handing #MC back is probably the closest we
can get to sensible behaviour, but it is still a bug that Linux is
touching the ballooned out page in the first place.

~Andrew
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
On Mon, Aug 13, Jan Beulich wrote:

> And hence the consideration of mapping in an all zeros page
> instead. This is because of the way __hvmemul_read() /
> __hvm_copy() work: The latter doesn't tell its caller how many
> bytes it was able to read, and hence the former considers the
> entire range MMIO (and forwards the request for emulation).
> Of course all of this is an issue only because
> hvmemul_virtual_to_linear() sees no need to split the request
> at the page boundary, due to the balloon driver having left in
> place the mapping of the ballooned out page.

Should perhaps __hvm_copy detect the fault and copy 0xf for the
unavailable page into 'buf', and finally return success?

Clearly something must be done at the Xen level.

Olaf
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
On 16/08/18 13:27, Jan Beulich wrote:
> On 16.08.18 at 12:56, wrote:
>> On 16/08/18 11:29, Jan Beulich wrote:
>>> Following some further discussion with Andrew, he looks to be
>>> convinced that the issue is to be fixed in the balloon driver,
>>> which so far (intentionally afaict) does not remove the direct
>>> map entries for ballooned out pages in the HVM case. I'm not
>>> convinced of this, but I'd nevertheless like to inquire whether
>>> such a change (resulting in shattered super page mappings)
>>> would be acceptable in the first place.
>>
>> We don't tolerate anything else in the directmap pointing to
>> invalid/unimplemented frames. Why should ballooning be any different?
>
> Because ballooning is something virtualization specific, which
> doesn't have any equivalent on bare hardware (memory hot
> unplug doesn't come quite close enough imo, not the least
> because that doesn't work on a page granular basis). Hence
> we're to define the exact behavior here, and hence such a
> definition could as well include special behavior of accesses
> to the involved guest-physical addresses.

If I read drivers/virtio/virtio_balloon.c correctly KVM does the same
as Xen: ballooned pages are _not_ removed from the direct mappings.

Juergen
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>> On 16.08.18 at 12:56, wrote:
> On 16/08/18 11:29, Jan Beulich wrote:
>> Following some further discussion with Andrew, he looks to be
>> convinced that the issue is to be fixed in the balloon driver,
>> which so far (intentionally afaict) does not remove the direct
>> map entries for ballooned out pages in the HVM case. I'm not
>> convinced of this, but I'd nevertheless like to inquire whether
>> such a change (resulting in shattered super page mappings)
>> would be acceptable in the first place.
>
> We don't tolerate anything else in the directmap pointing to
> invalid/unimplemented frames. Why should ballooning be any different?

Because ballooning is something virtualization specific, which
doesn't have any equivalent on bare hardware (memory hot
unplug doesn't come quite close enough imo, not the least
because that doesn't work on a page granular basis). Hence
we're to define the exact behavior here, and hence such a
definition could as well include special behavior of accesses
to the involved guest-physical addresses.

Jan
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
On 13/08/18 07:50, Jan Beulich wrote:
> On 10.08.18 at 18:37, wrote:
>> On 10/08/18 17:30, George Dunlap wrote:
>>> Sorry, what exactly is the issue here? Linux has a function called
>>> load_unaligned_zeropad() which is reading into a ballooned region?
>
> Yes.
>
>>> Fundamentally, a ballooned page is one which has been allocated to a
>>> device driver. I'm having a hard time coming up with a justification
>>> for having code which reads memory owned by B in the process of
>>> reading memory owned by A. Or is there some weird architectural
>>> reason that I'm not aware of?
>
> Well, they do this no matter who owns the successive page (or
> perhaps at a smaller granularity also the successive allocation).
> I guess their goal is to have just a single MOV in the common
> case (with the caller ignoring the uninteresting to it high bytes),
> while recovering gracefully from #PF should one occur.
>
>> The underlying issue is that the emulator can't cope with a single
>> misaligned access which crosses RAM and MMIO. It gives up and
>> presumably throws #UD back.
>
> We wouldn't have observed any problem if there was #UD in
> such a case, as Linux'es fault recovery code doesn't care what
> kind of fault has occurred. We're getting back a result of all
> ones, even for the part of the read that has actually hit the
> last few bytes of the present page.
>
>> One longstanding Xen bug is that simply ballooning a page out
>> shouldn't be able to trigger MMIO emulation to begin with. It is a
>> side effect of mixed p2m types, and the fix for this is to have Xen
>> understand the guest physmap layout.
>
> And hence the consideration of mapping in an all zeros page
> instead. This is because of the way __hvmemul_read() /
> __hvm_copy() work: The latter doesn't tell its caller how many
> bytes it was able to read, and hence the former considers the
> entire range MMIO (and forwards the request for emulation).
> Of course all of this is an issue only because
> hvmemul_virtual_to_linear() sees no need to split the request
> at the page boundary, due to the balloon driver having left in
> place the mapping of the ballooned out page.

Actually, the more I think about this, the more of a bad idea emulating
a zero page is. It gives the illusion of a working piece of zeroed ram,
except that writes definitely can't take effect. It's going to make
bugs even more subtle.

~Andrew
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
On 16/08/18 11:29, Jan Beulich wrote:
> On 13.08.18 at 08:50, wrote:
>> On 10.08.18 at 18:37, wrote:
>>> On 10/08/18 17:30, George Dunlap wrote:
>>>> Sorry, what exactly is the issue here? Linux has a function called
>>>> load_unaligned_zeropad() which is reading into a ballooned region?
>>
>> Yes.
>>
>>>> Fundamentally, a ballooned page is one which has been allocated to a
>>>> device driver. I'm having a hard time coming up with a justification
>>>> for having code which reads memory owned by B in the process of
>>>> reading memory owned by A. Or is there some weird architectural
>>>> reason that I'm not aware of?
>>
>> Well, they do this no matter who owns the successive page (or
>> perhaps at a smaller granularity also the successive allocation).
>> I guess their goal is to have just a single MOV in the common
>> case (with the caller ignoring the uninteresting to it high bytes),
>> while recovering gracefully from #PF should one occur.
>>
>>> The underlying issue is that the emulator can't cope with a single
>>> misaligned access which crosses RAM and MMIO. It gives up and
>>> presumably throws #UD back.
>>
>> We wouldn't have observed any problem if there was #UD in
>> such a case, as Linux'es fault recovery code doesn't care what
>> kind of fault has occurred. We're getting back a result of all
>> ones, even for the part of the read that has actually hit the
>> last few bytes of the present page.
>>
>>> One longstanding Xen bug is that simply ballooning a page out
>>> shouldn't be able to trigger MMIO emulation to begin with. It is a
>>> side effect of mixed p2m types, and the fix for this is to have Xen
>>> understand the guest physmap layout.
>>
>> And hence the consideration of mapping in an all zeros page
>> instead. This is because of the way __hvmemul_read() /
>> __hvm_copy() work: The latter doesn't tell its caller how many
>> bytes it was able to read, and hence the former considers the
>> entire range MMIO (and forwards the request for emulation).
>> Of course all of this is an issue only because
>> hvmemul_virtual_to_linear() sees no need to split the request
>> at the page boundary, due to the balloon driver having left in
>> place the mapping of the ballooned out page.
>>
>> Obviously the opposite case (access starting in a ballooned
>> out page and crossing into an "ordinary" one) would have a
>> similar issue, which is presumably even harder to fix without
>> going the map-a-zero-page route (or Paul's suggested
>> null_handler hack).
>>
>>> However, the real bug is Linux making such a misaligned access into a
>>> ballooned out page in the first place. This is a Linux kernel bug
>>> which (presumably) manifests in a very obvious way due to
>>> shortcomings in Xen's emulation handling.
>>
>> I wouldn't dare to judge whether this is a bug, especially in
>> light that they recover gracefully from the #PF that might result in
>> the native case. Arguably the caller has to have some knowledge
>> about what might live in the following page, as to not inadvertently
>> hit an MMIO page rather than a non-present mapping. But I'd
>> leave such judgment to them; our business is to get working a case
>> that is working without Xen underneath.
>
> Following some further discussion with Andrew, he looks to be
> convinced that the issue is to be fixed in the balloon driver,
> which so far (intentionally afaict) does not remove the direct
> map entries for ballooned out pages in the HVM case. I'm not
> convinced of this, but I'd nevertheless like to inquire whether
> such a change (resulting in shattered super page mappings)
> would be acceptable in the first place.

We don't tolerate anything else in the directmap pointing to
invalid/unimplemented frames. Why should ballooning be any different?

~Andrew
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>> On 13.08.18 at 08:50, wrote: On 10.08.18 at 18:37, wrote: > > On 10/08/18 17:30, George Dunlap wrote: > >> Sorry, what exactly is the issue here? Linux has a function called > >> load_unaligned_zeropad() which is reading into a ballooned region? > > Yes. > > >> Fundamentally, a ballooned page is one which has been allocated to a > >> device driver. I'm having a hard time coming up with a justification > >> for having code which reads memory owned by B in the process of reading > >> memory owned by A. Or is there some weird architectural reason that I'm > >> not aware of? > > Well, they do this no matter who owns the successive page (or > perhaps at a smaller granularity also the successive allocation). > I guess their goal is to have just a single MOV in the common > case (with the caller ignoring the uninteresting to it high bytes), > while recovering gracefully from #PF should one occur. > > > The underlying issue is that the emulator can't cope with a single > > misaligned access which crosses RAM and MMIO. It gives up and > > presumably throws #UD back. > > We wouldn't have observed any problem if there was #UD in > such a case, as Linux'es fault recovery code doesn't care what > kind of fault has occurred. We're getting back a result of all > ones, even for the part of the read that has actually hit the > last few bytes of the present page. > > > One longstanding Xen bug is that simply ballooning a page out shouldn't > > be able to trigger MMIO emulation to begin with. It is a side effect of > > mixed p2m types, and the fix for this to have Xen understand the guest > > physmap layout. > > And hence the consideration of mapping in an all zeros page > instead. This is because of the way __hvmemul_read() / > __hvm_copy() work: The latter doesn't tell its caller how many > bytes it was able to read, and hence the former considers the > entire range MMIO (and forwards the request for emulation). 
> Of course all of this is an issue only because > hvmemul_virtual_to_linear() sees no need to split the request > at the page boundary, due to the balloon driver having left in > place the mapping of the ballooned out page. > > Obviously the opposite case (access starting in a ballooned > out page and crossing into an "ordinary" one) would have a > similar issue, which is presumably even harder to fix without > going the map-a-zero-page route (or Paul's suggested > null_handler hack). > > > However, the real bug is Linux making such a misaligned access into a > > ballooned out page in the first place. This is a Linux kernel bug which > > (presumably) manifests in a very obvious way due to shortcomings in > > Xen's emulation handling. > > I wouldn't dare to judge whether this is a bug, especially in > light that they recover gracefully from the #PF that might result in > the native case. Arguably the caller has to have some knowledge > about what might live in the following page, as to not inadvertently > hit an MMIO page rather than a non-present mapping. But I'd > leave such judgment to them; our business is to get working a case > that is working without Xen underneath. Following some further discussion with Andrew, he looks to be convinced that the issue is to be fixed in the balloon driver, which so far (intentionally afaict) does not remove the direct map entries for ballooned out pages in the HVM case. I'm not convinced of this, but I'd nevertheless like to inquire whether such a change (resulting in shattered super page mappings) would be acceptable in the first place. Jan
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>> On 10.08.18 at 18:37, wrote: > On 10/08/18 17:30, George Dunlap wrote: >> Sorry, what exactly is the issue here? Linux has a function called >> load_unaligned_zeropad() which is reading into a ballooned region? Yes. >> Fundamentally, a ballooned page is one which has been allocated to a >> device driver. I'm having a hard time coming up with a justification >> for having code which reads memory owned by B in the process of reading >> memory owned by A. Or is there some weird architectural reason that I'm >> not aware of? Well, they do this no matter who owns the successive page (or perhaps at a smaller granularity also the successive allocation). I guess their goal is to have just a single MOV in the common case (with the caller ignoring the high bytes it is not interested in), while recovering gracefully from #PF should one occur. > The underlying issue is that the emulator can't cope with a single > misaligned access which crosses RAM and MMIO. It gives up and > presumably throws #UD back. We wouldn't have observed any problem if there was #UD in such a case, as Linux's fault recovery code doesn't care what kind of fault has occurred. We're getting back a result of all ones, even for the part of the read that has actually hit the last few bytes of the present page. > One longstanding Xen bug is that simply ballooning a page out shouldn't > be able to trigger MMIO emulation to begin with. It is a side effect of > mixed p2m types, and the fix for this is to have Xen understand the guest > physmap layout. And hence the consideration of mapping in an all zeros page instead. This is because of the way __hvmemul_read() / __hvm_copy() work: The latter doesn't tell its caller how many bytes it was able to read, and hence the former considers the entire range MMIO (and forwards the request for emulation). 
Of course all of this is an issue only because hvmemul_virtual_to_linear() sees no need to split the request at the page boundary, due to the balloon driver having left in place the mapping of the ballooned out page. Obviously the opposite case (access starting in a ballooned out page and crossing into an "ordinary" one) would have a similar issue, which is presumably even harder to fix without going the map-a-zero-page route (or Paul's suggested null_handler hack). > However, the real bug is Linux making such a misaligned access into a > ballooned out page in the first place. This is a Linux kernel bug which > (presumably) manifests in a very obvious way due to shortcomings in > Xen's emulation handling. I wouldn't dare to judge whether this is a bug, especially in light of the fact that they recover gracefully from the #PF that might result in the native case. Arguably the caller has to have some knowledge about what might live in the following page, so as not to inadvertently hit an MMIO page rather than a non-present mapping. But I'd leave such judgment to them; our business is to make a case that works without Xen underneath work under Xen as well. Jan
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
On 10/08/18 17:30, George Dunlap wrote: > On 08/10/2018 05:00 PM, Jan Beulich wrote: >>>>> On 10.08.18 at 17:35, wrote: >>>> -Original Message- >>>> From: Jan Beulich [mailto:jbeul...@suse.com] >>>> Sent: 10 August 2018 16:31 >>>> To: Paul Durrant >>>> Cc: Andrew Cooper ; xen-devel >>> de...@lists.xenproject.org> >>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>> >>>>>>> On 10.08.18 at 17:08, wrote: >>>>>> -Original Message----- >>>>>> From: Andrew Cooper >>>>>> Sent: 10 August 2018 13:56 >>>>>> To: Paul Durrant ; 'Jan Beulich' >>>>>> >>>>>> Cc: xen-devel >>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>>>> >>>>>> On 10/08/18 13:43, Paul Durrant wrote: >>>>>>>> -Original Message- >>>>>>>> From: Jan Beulich [mailto:jbeul...@suse.com] >>>>>>>> Sent: 10 August 2018 13:37 >>>>>>>> To: Paul Durrant >>>>>>>> Cc: xen-devel >>>>>>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>>>>>> >>>>>>>>>>> On 10.08.18 at 14:22, wrote: >>>>>>>>>> -Original Message----- >>>>>>>>>> From: Jan Beulich [mailto:jbeul...@suse.com] >>>>>>>>>> Sent: 10 August 2018 13:13 >>>>>>>>>> To: Paul Durrant >>>>>>>>>> Cc: xen-devel >>>>>>>>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>>>>>>>> >>>>>>>>>>>>> On 10.08.18 at 14:08, wrote: >>>>>>>>>>>> -Original Message- >>>>>>>>>>>> From: Jan Beulich [mailto:jbeul...@suse.com] >>>>>>>>>>>> Sent: 10 August 2018 13:02 >>>>>>>>>>>> To: Paul Durrant >>>>>>>>>>>> Cc: xen-devel >>>>>>>>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>>>>>>>>>> >>>>>>>>>>>>>>> On 10.08.18 at 12:37, wrote: >>>>>>>>>>>>> These are probably both candidates for back-port. 
>>>>>>>>>>>>> >>>>>>>>>>>>> Paul Durrant (2): >>>>>>>>>>>>> x86/hvm/ioreq: MMIO range checking completely ignores >>>>>> direction >>>>>>>> flag >>>>>>>>>>>>> x86/hvm/emulate: make sure rep I/O emulation does not cross >>>>>> GFN >>>>>>>>>>>>> boundaries >>>>>>>>>>>>> >>>>>>>>>>>>> xen/arch/x86/hvm/emulate.c | 17 - >>>>>>>>>>>>> xen/arch/x86/hvm/ioreq.c | 15 ++- >>>>>>>>>>>>> 2 files changed, 26 insertions(+), 6 deletions(-) >>>>>>>>>>>> I take it this isn't yet what we've talked about yesterday on irc? >>>>>>>>>>>> >>>>>>>>>>> This is the band-aid fix. I can now show correct handling of a rep >>>> mov >>>>>>>>>>> walking off MMIO into RAM. >>>>>>>>>> But that's not the problem we're having. In our case the bad >>>> behavior >>>>>>>>>> is with a single MOV. That's why I had assumed that your plan to >>>> fiddle >>>>>>>>>> with null_handler would help in our case as well, while this series >>>>>> clearly >>>>>>>>>> won't (afaict). >>>>>>>>>> >>>>>>>>> Oh, I see. A single MOV spanning MMIO and RAM has undefined >>>>>> behaviour >>>>>>>> though >>>>>>>>> as I understand it. Am I incorrect? >>>>>>>> I'm not aware of SDM or PM saying anything like this. Anyway, the >>>>>>>> specific case where this is b
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
On 08/10/2018 05:00 PM, Jan Beulich wrote: >>>> On 10.08.18 at 17:35, wrote: >>> -Original Message- >>> From: Jan Beulich [mailto:jbeul...@suse.com] >>> Sent: 10 August 2018 16:31 >>> To: Paul Durrant >>> Cc: Andrew Cooper ; xen-devel >> de...@lists.xenproject.org> >>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>> >>>>>> On 10.08.18 at 17:08, wrote: >>>>> -Original Message- >>>>> From: Andrew Cooper >>>>> Sent: 10 August 2018 13:56 >>>>> To: Paul Durrant ; 'Jan Beulich' >>>>> >>>>> Cc: xen-devel >>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>>> >>>>> On 10/08/18 13:43, Paul Durrant wrote: >>>>>>> -Original Message- >>>>>>> From: Jan Beulich [mailto:jbeul...@suse.com] >>>>>>> Sent: 10 August 2018 13:37 >>>>>>> To: Paul Durrant >>>>>>> Cc: xen-devel >>>>>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>>>>> >>>>>>>>>> On 10.08.18 at 14:22, wrote: >>>>>>>>> -Original Message- >>>>>>>>> From: Jan Beulich [mailto:jbeul...@suse.com] >>>>>>>>> Sent: 10 August 2018 13:13 >>>>>>>>> To: Paul Durrant >>>>>>>>> Cc: xen-devel >>>>>>>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>>>>>>> >>>>>>>>>>>> On 10.08.18 at 14:08, wrote: >>>>>>>>>>> -Original Message- >>>>>>>>>>> From: Jan Beulich [mailto:jbeul...@suse.com] >>>>>>>>>>> Sent: 10 August 2018 13:02 >>>>>>>>>>> To: Paul Durrant >>>>>>>>>>> Cc: xen-devel >>>>>>>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>>>>>>>>> >>>>>>>>>>>>>> On 10.08.18 at 12:37, wrote: >>>>>>>>>>>> These are probably both candidates for back-port. 
>>>>>>>>>>>> >>>>>>>>>>>> Paul Durrant (2): >>>>>>>>>>>> x86/hvm/ioreq: MMIO range checking completely ignores >>>>> direction >>>>>>> flag >>>>>>>>>>>> x86/hvm/emulate: make sure rep I/O emulation does not cross >>>>> GFN >>>>>>>>>>>> boundaries >>>>>>>>>>>> >>>>>>>>>>>> xen/arch/x86/hvm/emulate.c | 17 - >>>>>>>>>>>> xen/arch/x86/hvm/ioreq.c | 15 ++- >>>>>>>>>>>> 2 files changed, 26 insertions(+), 6 deletions(-) >>>>>>>>>>> I take it this isn't yet what we've talked about yesterday on irc? >>>>>>>>>>> >>>>>>>>>> This is the band-aid fix. I can now show correct handling of a rep >>> mov >>>>>>>>>> walking off MMIO into RAM. >>>>>>>>> But that's not the problem we're having. In our case the bad >>> behavior >>>>>>>>> is with a single MOV. That's why I had assumed that your plan to >>> fiddle >>>>>>>>> with null_handler would help in our case as well, while this series >>>>> clearly >>>>>>>>> won't (afaict). >>>>>>>>> >>>>>>>> Oh, I see. A single MOV spanning MMIO and RAM has undefined >>>>> behaviour >>>>>>> though >>>>>>>> as I understand it. Am I incorrect? >>>>>>> I'm not aware of SDM or PM saying anything like this. Anyway, the >>>>>>> specific case where this is being observed as an issue is when >>>>>>> accessing the last few bytes of a normal RAM page followed by a >>>>>>> ballooned out one. The balloon driver doesn't remove the virtual >>>>>>> mapping of such pages (presumably in order to not shatter super >>>>>>
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>> On 10.08.18 at 17:35, wrote: >> -Original Message- >> From: Jan Beulich [mailto:jbeul...@suse.com] >> Sent: 10 August 2018 16:31 >> To: Paul Durrant >> Cc: Andrew Cooper ; xen-devel > de...@lists.xenproject.org> >> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >> >> >>> On 10.08.18 at 17:08, wrote: >> >> -Original Message- >> >> From: Andrew Cooper >> >> Sent: 10 August 2018 13:56 >> >> To: Paul Durrant ; 'Jan Beulich' >> >> >> >> Cc: xen-devel >> >> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >> >> >> >> On 10/08/18 13:43, Paul Durrant wrote: >> >> >> -----Original Message- >> >> >> From: Jan Beulich [mailto:jbeul...@suse.com] >> >> >> Sent: 10 August 2018 13:37 >> >> >> To: Paul Durrant >> >> >> Cc: xen-devel >> >> >> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >> >> >> >> >> >>>>> On 10.08.18 at 14:22, wrote: >> >> >>>> -Original Message- >> >> >>>> From: Jan Beulich [mailto:jbeul...@suse.com] >> >> >>>> Sent: 10 August 2018 13:13 >> >> >>>> To: Paul Durrant >> >> >>>> Cc: xen-devel >> >> >>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >> >> >>>> >> >> >>>>>>> On 10.08.18 at 14:08, wrote: >> >> >>>>>> -Original Message- >> >> >>>>>> From: Jan Beulich [mailto:jbeul...@suse.com] >> >> >>>>>> Sent: 10 August 2018 13:02 >> >> >>>>>> To: Paul Durrant >> >> >>>>>> Cc: xen-devel >> >> >>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >> >> >>>>>> >> >> >>>>>>>>> On 10.08.18 at 12:37, wrote: >> >> >>>>>>> These are probably both candidates for back-port. 
>> >> >>>>>>> >> >> >>>>>>> Paul Durrant (2): >> >> >>>>>>> x86/hvm/ioreq: MMIO range checking completely ignores >> >> direction >> >> >> flag >> >> >>>>>>> x86/hvm/emulate: make sure rep I/O emulation does not cross >> >> GFN >> >> >>>>>>> boundaries >> >> >>>>>>> >> >> >>>>>>> xen/arch/x86/hvm/emulate.c | 17 - >> >> >>>>>>> xen/arch/x86/hvm/ioreq.c | 15 ++- >> >> >>>>>>> 2 files changed, 26 insertions(+), 6 deletions(-) >> >> >>>>>> I take it this isn't yet what we've talked about yesterday on irc? >> >> >>>>>> >> >> >>>>> This is the band-aid fix. I can now show correct handling of a rep >> mov >> >> >>>>> walking off MMIO into RAM. >> >> >>>> But that's not the problem we're having. In our case the bad >> behavior >> >> >>>> is with a single MOV. That's why I had assumed that your plan to >> fiddle >> >> >>>> with null_handler would help in our case as well, while this series >> >> clearly >> >> >>>> won't (afaict). >> >> >>>> >> >> >>> Oh, I see. A single MOV spanning MMIO and RAM has undefined >> >> behaviour >> >> >> though >> >> >>> as I understand it. Am I incorrect? >> >> >> I'm not aware of SDM or PM saying anything like this. Anyway, the >> >> >> specific case where this is being observed as an issue is when >> >> >> accessing the last few bytes of a normal RAM page followed by a >> >> >> ballooned out one. The balloon driver doesn't remove the virtual >> >> >> mapping of such pages (presumably in order to not shatter super >> >> >> pages); observation is with the old XenoLinux one, but from code >> >> >> inspection the upstream one behaves the same. >> >> >> >> >> >> Unless we want to change the balloon driv
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
> -Original Message- > From: Jan Beulich [mailto:jbeul...@suse.com] > Sent: 10 August 2018 16:31 > To: Paul Durrant > Cc: Andrew Cooper ; xen-devel de...@lists.xenproject.org> > Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes > > >>> On 10.08.18 at 17:08, wrote: > >> -Original Message- > >> From: Andrew Cooper > >> Sent: 10 August 2018 13:56 > >> To: Paul Durrant ; 'Jan Beulich' > >> > >> Cc: xen-devel > >> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes > >> > >> On 10/08/18 13:43, Paul Durrant wrote: > >> >> -Original Message- > >> >> From: Jan Beulich [mailto:jbeul...@suse.com] > >> >> Sent: 10 August 2018 13:37 > >> >> To: Paul Durrant > >> >> Cc: xen-devel > >> >> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes > >> >> > >> >>>>> On 10.08.18 at 14:22, wrote: > >> >>>> -Original Message- > >> >>>> From: Jan Beulich [mailto:jbeul...@suse.com] > >> >>>> Sent: 10 August 2018 13:13 > >> >>>> To: Paul Durrant > >> >>>> Cc: xen-devel > >> >>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes > >> >>>> > >> >>>>>>> On 10.08.18 at 14:08, wrote: > >> >>>>>> -Original Message- > >> >>>>>> From: Jan Beulich [mailto:jbeul...@suse.com] > >> >>>>>> Sent: 10 August 2018 13:02 > >> >>>>>> To: Paul Durrant > >> >>>>>> Cc: xen-devel > >> >>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes > >> >>>>>> > >> >>>>>>>>> On 10.08.18 at 12:37, wrote: > >> >>>>>>> These are probably both candidates for back-port. > >> >>>>>>> > >> >>>>>>> Paul Durrant (2): > >> >>>>>>> x86/hvm/ioreq: MMIO range checking completely ignores > >> direction > >> >> flag > >> >>>>>>> x86/hvm/emulate: make sure rep I/O emulation does not cross > >> GFN > >> >>>>>>> boundaries > >> >>>>>>> > >> >>>>>>> xen/arch/x86/hvm/emulate.c | 17 - > >> >>>>>>> xen/arch/x86/hvm/ioreq.c | 15 ++- > >> >>>>>>> 2 files changed, 26 insertions(+), 6 deletions(-) > >> >>>>>> I take it this isn't yet what we've talked about yesterday on irc? 
> >> >>>>>> > >> >>>>> This is the band-aid fix. I can now show correct handling of a rep > mov > >> >>>>> walking off MMIO into RAM. > >> >>>> But that's not the problem we're having. In our case the bad > behavior > >> >>>> is with a single MOV. That's why I had assumed that your plan to > fiddle > >> >>>> with null_handler would help in our case as well, while this series > >> clearly > >> >>>> won't (afaict). > >> >>>> > >> >>> Oh, I see. A single MOV spanning MMIO and RAM has undefined > >> behaviour > >> >> though > >> >>> as I understand it. Am I incorrect? > >> >> I'm not aware of SDM or PM saying anything like this. Anyway, the > >> >> specific case where this is being observed as an issue is when > >> >> accessing the last few bytes of a normal RAM page followed by a > >> >> ballooned out one. The balloon driver doesn't remove the virtual > >> >> mapping of such pages (presumably in order to not shatter super > >> >> pages); observation is with the old XenoLinux one, but from code > >> >> inspection the upstream one behaves the same. > >> >> > >> >> Unless we want to change the balloon driver's behavior, at least > >> >> this specific case needs to be considered having defined behavior, > >> >> I think. > >> >> > >> > Ok. I'll see what I can do. > >> > >> It is a software error to try and cross boundaries. Modern processors > >> do their best to try and cause
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>> On 10.08.18 at 17:08, wrote: >> -Original Message- >> From: Andrew Cooper >> Sent: 10 August 2018 13:56 >> To: Paul Durrant ; 'Jan Beulich' >> >> Cc: xen-devel >> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >> >> On 10/08/18 13:43, Paul Durrant wrote: >> >> -Original Message- >> >> From: Jan Beulich [mailto:jbeul...@suse.com] >> >> Sent: 10 August 2018 13:37 >> >> To: Paul Durrant >> >> Cc: xen-devel >> >> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >> >> >> >>>>> On 10.08.18 at 14:22, wrote: >> >>>> -----Original Message- >> >>>> From: Jan Beulich [mailto:jbeul...@suse.com] >> >>>> Sent: 10 August 2018 13:13 >> >>>> To: Paul Durrant >> >>>> Cc: xen-devel >> >>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >> >>>> >> >>>>>>> On 10.08.18 at 14:08, wrote: >> >>>>>> -Original Message- >> >>>>>> From: Jan Beulich [mailto:jbeul...@suse.com] >> >>>>>> Sent: 10 August 2018 13:02 >> >>>>>> To: Paul Durrant >> >>>>>> Cc: xen-devel >> >>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >> >>>>>> >> >>>>>>>>> On 10.08.18 at 12:37, wrote: >> >>>>>>> These are probably both candidates for back-port. >> >>>>>>> >> >>>>>>> Paul Durrant (2): >> >>>>>>> x86/hvm/ioreq: MMIO range checking completely ignores >> direction >> >> flag >> >>>>>>> x86/hvm/emulate: make sure rep I/O emulation does not cross >> GFN >> >>>>>>> boundaries >> >>>>>>> >> >>>>>>> xen/arch/x86/hvm/emulate.c | 17 - >> >>>>>>> xen/arch/x86/hvm/ioreq.c | 15 ++- >> >>>>>>> 2 files changed, 26 insertions(+), 6 deletions(-) >> >>>>>> I take it this isn't yet what we've talked about yesterday on irc? >> >>>>>> >> >>>>> This is the band-aid fix. I can now show correct handling of a rep mov >> >>>>> walking off MMIO into RAM. >> >>>> But that's not the problem we're having. In our case the bad behavior >> >>>> is with a single MOV. 
That's why I had assumed that your plan to fiddle >> >>>> with null_handler would help in our case as well, while this series >> clearly >> >>>> won't (afaict). >> >>>> >> >>> Oh, I see. A single MOV spanning MMIO and RAM has undefined >> behaviour >> >> though >> >>> as I understand it. Am I incorrect? >> >> I'm not aware of SDM or PM saying anything like this. Anyway, the >> >> specific case where this is being observed as an issue is when >> >> accessing the last few bytes of a normal RAM page followed by a >> >> ballooned out one. The balloon driver doesn't remove the virtual >> >> mapping of such pages (presumably in order to not shatter super >> >> pages); observation is with the old XenoLinux one, but from code >> >> inspection the upstream one behaves the same. >> >> >> >> Unless we want to change the balloon driver's behavior, at least >> >> this specific case needs to be considered having defined behavior, >> >> I think. >> >> >> > Ok. I'll see what I can do. >> >> It is a software error to try and cross boundaries. Modern processors >> do their best to try and cause the correct behaviour to occur, albeit >> with a massive disclaimer about the performance hit. Older processors >> didn't cope. >> >> As far as I'm concerned, its fine to terminate a emulation which crosses >> a boundary with the null ops. > > Alas we never even get as far as the I/O handlers in some circumstances... > > I just set up a variant of an XTF test doing a backwards rep movsd into a > well aligned stack buffer where source buffer starts 1 byte before a boundary > between RAM and MMIO. The code in hvmemul_rep_movs() (rightly) detects that > both the source and dest of the initial rep are RAM, skips over the I/O > emulation calls, and then fails when the hvm_copy_from_guest_phys() > (unsurprisingly) fails to grab the 8 bytes for the initial rep. > So, any logic we add to deal with handling page spanning ops is going to > have to go in at the top level of instruction emulation... 
which I fear is > going to be quite a major change and not something that's going to be easy to > back-port. Well, wasn't it clear from the beginning that a proper fix would be too invasive to backport? And wasn't it for that reason that you intended to add a small hack first, to deal with just the case(s) that we currently have issues with? Jan
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
> -Original Message- > From: Andrew Cooper > Sent: 10 August 2018 13:56 > To: Paul Durrant ; 'Jan Beulich' > > Cc: xen-devel > Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes > > On 10/08/18 13:43, Paul Durrant wrote: > >> -Original Message- > >> From: Jan Beulich [mailto:jbeul...@suse.com] > >> Sent: 10 August 2018 13:37 > >> To: Paul Durrant > >> Cc: xen-devel > >> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes > >> > >>>>> On 10.08.18 at 14:22, wrote: > >>>> -Original Message- > >>>> From: Jan Beulich [mailto:jbeul...@suse.com] > >>>> Sent: 10 August 2018 13:13 > >>>> To: Paul Durrant > >>>> Cc: xen-devel > >>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes > >>>> > >>>>>>> On 10.08.18 at 14:08, wrote: > >>>>>> -Original Message- > >>>>>> From: Jan Beulich [mailto:jbeul...@suse.com] > >>>>>> Sent: 10 August 2018 13:02 > >>>>>> To: Paul Durrant > >>>>>> Cc: xen-devel > >>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes > >>>>>> > >>>>>>>>> On 10.08.18 at 12:37, wrote: > >>>>>>> These are probably both candidates for back-port. > >>>>>>> > >>>>>>> Paul Durrant (2): > >>>>>>> x86/hvm/ioreq: MMIO range checking completely ignores > direction > >> flag > >>>>>>> x86/hvm/emulate: make sure rep I/O emulation does not cross > GFN > >>>>>>> boundaries > >>>>>>> > >>>>>>> xen/arch/x86/hvm/emulate.c | 17 - > >>>>>>> xen/arch/x86/hvm/ioreq.c | 15 ++- > >>>>>>> 2 files changed, 26 insertions(+), 6 deletions(-) > >>>>>> I take it this isn't yet what we've talked about yesterday on irc? > >>>>>> > >>>>> This is the band-aid fix. I can now show correct handling of a rep mov > >>>>> walking off MMIO into RAM. > >>>> But that's not the problem we're having. In our case the bad behavior > >>>> is with a single MOV. That's why I had assumed that your plan to fiddle > >>>> with null_handler would help in our case as well, while this series > clearly > >>>> won't (afaict). > >>>> > >>> Oh, I see. 
A single MOV spanning MMIO and RAM has undefined > behaviour > >> though > >>> as I understand it. Am I incorrect? > >> I'm not aware of SDM or PM saying anything like this. Anyway, the > >> specific case where this is being observed as an issue is when > >> accessing the last few bytes of a normal RAM page followed by a > >> ballooned out one. The balloon driver doesn't remove the virtual > >> mapping of such pages (presumably in order to not shatter super > >> pages); observation is with the old XenoLinux one, but from code > >> inspection the upstream one behaves the same. > >> > >> Unless we want to change the balloon driver's behavior, at least > >> this specific case needs to be considered having defined behavior, > >> I think. > >> > > Ok. I'll see what I can do. > > It is a software error to try and cross boundaries. Modern processors > do their best to try and cause the correct behaviour to occur, albeit > with a massive disclaimer about the performance hit. Older processors > didn't cope. > > As far as I'm concerned, its fine to terminate a emulation which crosses > a boundary with the null ops. Alas we never even get as far as the I/O handlers in some circumstances... I just set up a variant of an XTF test doing a backwards rep movsd into a well-aligned stack buffer where the source buffer starts 1 byte before a boundary between RAM and MMIO. The code in hvmemul_rep_movs() (rightly) detects that both the source and dest of the initial rep are RAM, skips over the I/O emulation calls, and then fails when the hvm_copy_from_guest_phys() (unsurprisingly) fails to grab the 8 bytes for the initial rep. So, any logic we add to deal with handling page spanning ops is going to have to go in at the top level of instruction emulation... which I fear is going to be quite a major change and not something that's going to be easy to back-port. Paul > > ~Andrew
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
On 10/08/18 13:43, Paul Durrant wrote: >> -Original Message- >> From: Jan Beulich [mailto:jbeul...@suse.com] >> Sent: 10 August 2018 13:37 >> To: Paul Durrant >> Cc: xen-devel >> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >> >>>>> On 10.08.18 at 14:22, wrote: >>>> -Original Message- >>>> From: Jan Beulich [mailto:jbeul...@suse.com] >>>> Sent: 10 August 2018 13:13 >>>> To: Paul Durrant >>>> Cc: xen-devel >>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>> >>>>>>> On 10.08.18 at 14:08, wrote: >>>>>> -----Original Message----- >>>>>> From: Jan Beulich [mailto:jbeul...@suse.com] >>>>>> Sent: 10 August 2018 13:02 >>>>>> To: Paul Durrant >>>>>> Cc: xen-devel >>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>>>> >>>>>>>>> On 10.08.18 at 12:37, wrote: >>>>>>> These are probably both candidates for back-port. >>>>>>> >>>>>>> Paul Durrant (2): >>>>>>> x86/hvm/ioreq: MMIO range checking completely ignores direction >> flag >>>>>>> x86/hvm/emulate: make sure rep I/O emulation does not cross GFN >>>>>>> boundaries >>>>>>> >>>>>>> xen/arch/x86/hvm/emulate.c | 17 - >>>>>>> xen/arch/x86/hvm/ioreq.c | 15 ++- >>>>>>> 2 files changed, 26 insertions(+), 6 deletions(-) >>>>>> I take it this isn't yet what we've talked about yesterday on irc? >>>>>> >>>>> This is the band-aid fix. I can now show correct handling of a rep mov >>>>> walking off MMIO into RAM. >>>> But that's not the problem we're having. In our case the bad behavior >>>> is with a single MOV. That's why I had assumed that your plan to fiddle >>>> with null_handler would help in our case as well, while this series clearly >>>> won't (afaict). >>>> >>> Oh, I see. A single MOV spanning MMIO and RAM has undefined behaviour >> though >>> as I understand it. Am I incorrect? >> I'm not aware of SDM or PM saying anything like this. 
Anyway, the >> specific case where this is being observed as an issue is when >> accessing the last few bytes of a normal RAM page followed by a >> ballooned out one. The balloon driver doesn't remove the virtual >> mapping of such pages (presumably in order to not shatter super >> pages); observation is with the old XenoLinux one, but from code >> inspection the upstream one behaves the same. >> >> Unless we want to change the balloon driver's behavior, at least >> this specific case needs to be considered having defined behavior, >> I think. >> > Ok. I'll see what I can do. It is a software error to try and cross boundaries. Modern processors do their best to try and cause the correct behaviour to occur, albeit with a massive disclaimer about the performance hit. Older processors didn't cope. As far as I'm concerned, it's fine to terminate an emulation which crosses a boundary with the null ops. ~Andrew
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
> -Original Message- > From: Jan Beulich [mailto:jbeul...@suse.com] > Sent: 10 August 2018 13:37 > To: Paul Durrant > Cc: xen-devel > Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes > > >>> On 10.08.18 at 14:22, wrote: > >> -Original Message- > >> From: Jan Beulich [mailto:jbeul...@suse.com] > >> Sent: 10 August 2018 13:13 > >> To: Paul Durrant > >> Cc: xen-devel > >> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes > >> > >> >>> On 10.08.18 at 14:08, wrote: > >> >> -Original Message- > >> >> From: Jan Beulich [mailto:jbeul...@suse.com] > >> >> Sent: 10 August 2018 13:02 > >> >> To: Paul Durrant > >> >> Cc: xen-devel > >> >> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes > >> >> > >> >> >>> On 10.08.18 at 12:37, wrote: > >> >> > These are probably both candidates for back-port. > >> >> > > >> >> > Paul Durrant (2): > >> >> > x86/hvm/ioreq: MMIO range checking completely ignores direction > flag > >> >> > x86/hvm/emulate: make sure rep I/O emulation does not cross GFN > >> >> > boundaries > >> >> > > >> >> > xen/arch/x86/hvm/emulate.c | 17 - > >> >> > xen/arch/x86/hvm/ioreq.c | 15 ++- > >> >> > 2 files changed, 26 insertions(+), 6 deletions(-) > >> >> > >> >> I take it this isn't yet what we've talked about yesterday on irc? > >> >> > >> > > >> > This is the band-aid fix. I can now show correct handling of a rep mov > >> > walking off MMIO into RAM. > >> > >> But that's not the problem we're having. In our case the bad behavior > >> is with a single MOV. That's why I had assumed that your plan to fiddle > >> with null_handler would help in our case as well, while this series clearly > >> won't (afaict). > >> > > > > Oh, I see. A single MOV spanning MMIO and RAM has undefined behaviour > though > > as I understand it. Am I incorrect? > > I'm not aware of SDM or PM saying anything like this. 
Anyway, the > specific case where this is being observed as an issue is when > accessing the last few bytes of a normal RAM page followed by a > ballooned out one. The balloon driver doesn't remove the virtual > mapping of such pages (presumably in order to not shatter super > pages); observation is with the old XenoLinux one, but from code > inspection the upstream one behaves the same. > > Unless we want to change the balloon driver's behavior, at least > this specific case needs to be considered having defined behavior, > I think. > Ok. I'll see what I can do. Paul > Jan
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>> On 10.08.18 at 14:22, wrote:
>> -----Original Message-----
>> From: Jan Beulich [mailto:jbeul...@suse.com]
>> Sent: 10 August 2018 13:13
>> To: Paul Durrant
>> Cc: xen-devel
>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>
>> >>> On 10.08.18 at 14:08, wrote:
>> >> -----Original Message-----
>> >> From: Jan Beulich [mailto:jbeul...@suse.com]
>> >> Sent: 10 August 2018 13:02
>> >> To: Paul Durrant
>> >> Cc: xen-devel
>> >> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>> >>
>> >> >>> On 10.08.18 at 12:37, wrote:
>> >> > These are probably both candidates for back-port.
>> >> >
>> >> > Paul Durrant (2):
>> >> >   x86/hvm/ioreq: MMIO range checking completely ignores direction flag
>> >> >   x86/hvm/emulate: make sure rep I/O emulation does not cross GFN
>> >> >     boundaries
>> >> >
>> >> >  xen/arch/x86/hvm/emulate.c | 17 -
>> >> >  xen/arch/x86/hvm/ioreq.c   | 15 ++-
>> >> >  2 files changed, 26 insertions(+), 6 deletions(-)
>> >>
>> >> I take it this isn't yet what we've talked about yesterday on irc?
>> >>
>> >
>> > This is the band-aid fix. I can now show correct handling of a rep mov
>> > walking off MMIO into RAM.
>>
>> But that's not the problem we're having. In our case the bad behavior
>> is with a single MOV. That's why I had assumed that your plan to fiddle
>> with null_handler would help in our case as well, while this series clearly
>> won't (afaict).
>>
>
> Oh, I see. A single MOV spanning MMIO and RAM has undefined behaviour though
> as I understand it. Am I incorrect?

I'm not aware of SDM or PM saying anything like this. Anyway, the
specific case where this is being observed as an issue is when
accessing the last few bytes of a normal RAM page followed by a
ballooned out one. The balloon driver doesn't remove the virtual
mapping of such pages (presumably in order to not shatter super
pages); observation is with the old XenoLinux one, but from code
inspection the upstream one behaves the same.

Unless we want to change the balloon driver's behavior, at least
this specific case needs to be considered having defined behavior,
I think.

Jan
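[Editorial note: the straddling access discussed above can be illustrated with a minimal sketch. This is invented example code, not actual Xen source; the helper name and layout are assumptions made purely for illustration.]

```c
/*
 * Illustrative sketch only -- not actual Xen code. An emulator handling a
 * single access that straddles a 4KiB page (GFN) boundary, e.g. the last
 * few bytes of a normal RAM page followed by a ballooned-out one, can
 * split the access at the boundary so each part is routed to its own
 * backing type (RAM, MMIO, hole) separately.
 */
#include <stdint.h>

#define PAGE_SIZE 4096u

/*
 * Number of bytes of [addr, addr + bytes) that lie within addr's own
 * page; any remainder spills into the following page and must be
 * handled as a second, independent access.
 */
static unsigned int bytes_in_first_page(uint64_t addr, unsigned int bytes)
{
    uint64_t room = PAGE_SIZE - (addr & (PAGE_SIZE - 1));

    return bytes < room ? bytes : (unsigned int)room;
}
```

For example, an 8-byte access at offset 0xffd of a page yields a 3-byte first chunk, leaving 5 bytes for the next page.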
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
> -----Original Message-----
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 10 August 2018 13:13
> To: Paul Durrant
> Cc: xen-devel
> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>
> >>> On 10.08.18 at 14:08, wrote:
> >> -----Original Message-----
> >> From: Jan Beulich [mailto:jbeul...@suse.com]
> >> Sent: 10 August 2018 13:02
> >> To: Paul Durrant
> >> Cc: xen-devel
> >> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
> >>
> >> >>> On 10.08.18 at 12:37, wrote:
> >> > These are probably both candidates for back-port.
> >> >
> >> > Paul Durrant (2):
> >> >   x86/hvm/ioreq: MMIO range checking completely ignores direction flag
> >> >   x86/hvm/emulate: make sure rep I/O emulation does not cross GFN
> >> >     boundaries
> >> >
> >> >  xen/arch/x86/hvm/emulate.c | 17 -
> >> >  xen/arch/x86/hvm/ioreq.c   | 15 ++-
> >> >  2 files changed, 26 insertions(+), 6 deletions(-)
> >>
> >> I take it this isn't yet what we've talked about yesterday on irc?
> >>
> >
> > This is the band-aid fix. I can now show correct handling of a rep mov
> > walking off MMIO into RAM.
>
> But that's not the problem we're having. In our case the bad behavior
> is with a single MOV. That's why I had assumed that your plan to fiddle
> with null_handler would help in our case as well, while this series clearly
> won't (afaict).
>

Oh, I see. A single MOV spanning MMIO and RAM has undefined behaviour though
as I understand it. Am I incorrect?

  Paul

> Jan
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>> On 10.08.18 at 14:08, wrote:
>> -----Original Message-----
>> From: Jan Beulich [mailto:jbeul...@suse.com]
>> Sent: 10 August 2018 13:02
>> To: Paul Durrant
>> Cc: xen-devel
>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>
>> >>> On 10.08.18 at 12:37, wrote:
>> > These are probably both candidates for back-port.
>> >
>> > Paul Durrant (2):
>> >   x86/hvm/ioreq: MMIO range checking completely ignores direction flag
>> >   x86/hvm/emulate: make sure rep I/O emulation does not cross GFN
>> >     boundaries
>> >
>> >  xen/arch/x86/hvm/emulate.c | 17 -
>> >  xen/arch/x86/hvm/ioreq.c   | 15 ++-
>> >  2 files changed, 26 insertions(+), 6 deletions(-)
>>
>> I take it this isn't yet what we've talked about yesterday on irc?
>>
>
> This is the band-aid fix. I can now show correct handling of a rep mov
> walking off MMIO into RAM.

But that's not the problem we're having. In our case the bad behavior
is with a single MOV. That's why I had assumed that your plan to fiddle
with null_handler would help in our case as well, while this series clearly
won't (afaict).

Jan
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
> -----Original Message-----
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 10 August 2018 13:02
> To: Paul Durrant
> Cc: xen-devel
> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>
> >>> On 10.08.18 at 12:37, wrote:
> > These are probably both candidates for back-port.
> >
> > Paul Durrant (2):
> >   x86/hvm/ioreq: MMIO range checking completely ignores direction flag
> >   x86/hvm/emulate: make sure rep I/O emulation does not cross GFN
> >     boundaries
> >
> >  xen/arch/x86/hvm/emulate.c | 17 -
> >  xen/arch/x86/hvm/ioreq.c   | 15 ++-
> >  2 files changed, 26 insertions(+), 6 deletions(-)
>
> I take it this isn't yet what we've talked about yesterday on irc?
>

This is the band-aid fix. I can now show correct handling of a rep mov
walking off MMIO into RAM.

  Paul

> Jan
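[Editorial note: the second patch's goal, keeping rep I/O emulation within one GFN, can be sketched roughly as below. This is an invented illustration under assumed names, not the actual patch.]

```c
/*
 * Illustrative sketch only, not the actual patch: clamp the repeat count
 * of a rep I/O operation so the whole access stays within the 4KiB page
 * (GFN) containing the first access, preventing a rep mov from walking
 * off MMIO into RAM (or vice versa). With DF set the string op moves
 * downwards, so the clamp is against the start of the page instead.
 */
#include <stdint.h>

#define PAGE_SIZE 4096u

static uint64_t clamp_reps(uint64_t gpa, unsigned int bytes_per_rep,
                           uint64_t reps, int df)
{
    uint64_t offset = gpa & (PAGE_SIZE - 1);
    uint64_t max = df ? offset / bytes_per_rep + 1
                      : (PAGE_SIZE - offset) / bytes_per_rep;

    return reps < max ? reps : max;
}
```

The emulator would then perform at most `clamp_reps(...)` iterations per round, re-translating the next GFN before continuing.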
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>> On 10.08.18 at 12:37, wrote:
> These are probably both candidates for back-port.
>
> Paul Durrant (2):
>   x86/hvm/ioreq: MMIO range checking completely ignores direction flag
>   x86/hvm/emulate: make sure rep I/O emulation does not cross GFN
>     boundaries
>
>  xen/arch/x86/hvm/emulate.c | 17 -
>  xen/arch/x86/hvm/ioreq.c   | 15 ++-
>  2 files changed, 26 insertions(+), 6 deletions(-)

I take it this isn't yet what we've talked about yesterday on irc?

Jan
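[Editorial note: the direction-flag issue named in the first patch's title comes down to which address range a rep access actually covers. A hedged sketch follows; the names are invented for illustration and this is not the patch itself.]

```c
/*
 * Illustrative sketch, not the actual patch: the address range covered by
 * a rep string operation depends on the direction flag (EFLAGS.DF). With
 * DF clear the range grows upwards from addr; with DF set it grows
 * downwards, so a range check that ignores DF tests the wrong span.
 */
#include <stdint.h>

/* Compute the half-open range [*start, *end) touched by the operation. */
static void rep_range(uint64_t addr, unsigned int bytes, uint64_t reps,
                      int df, uint64_t *start, uint64_t *end)
{
    if ( df )
    {
        *start = addr - (reps - 1) * (uint64_t)bytes;
        *end = addr + bytes;
    }
    else
    {
        *start = addr;
        *end = addr + reps * (uint64_t)bytes;
    }
}
```

A 4-iteration, 4-byte rep at 0x2000 covers [0x2000, 0x2010) with DF clear, but [0x1ff4, 0x2004) with DF set.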