Re: [Xen-devel] [PATCH] x86/VMX: sanitize VM86 TSS handling
At 04:49 -0700 on 16 Feb (1487220558), Jan Beulich wrote:
> >>> On 16.02.17 at 12:14, wrote:
> On 15.02.17 at 12:21, wrote:
> >> At 01:18 -0700 on 15 Feb (1487121525), Jan Beulich wrote:
> >>> >>> On 14.02.17 at 18:33, wrote:
> >>> >> TBD: Do we really want to re-init the TSS every time we are about to
> >>> >> use it?
> >>> >
> >>> > No - I think we should init it when the guest writes the param(s) and
> >>> > leave it at that. Hvmloader marks it as reserved so the guest should
> >>> > know better than to write to it, and we can't protect it against all
> >>> > the possible ways the guest could break itself.
> >>> >
> >>> > If you do want to re-init it more often, then I think it would still
> >>> > be better to set legacy guests' (lack of a) size limit once, when the
> >>> > guest writes the base param.
> >>>
> >>> The only problem with this being that at the time the base gets
> >>> written we don't know the size yet (nor whether the guest is
> >>> going to write it), and hence we don't know how much space to
> >>> initialize. The lower limit we enforce on the size (if being set) is
> >>> below the 128 byte default for old guests.
> >>
> >> Why not make the lower limit 128? I'd happily exchange simpler
> >> hypervisor code for the theoretical case of a guest that needs to run
> >> realmode code and cares about those few bytes.
> >
> > Actually there's one more issue with doing it when the parameter is
> > being set: If a guest wanted to move the TSS (the parameter isn't
> > one-shot after all), we can't use the old value of the other parameter
> > when the first of the two is being changed. Of course we could make
> > both parameters one-shot, but this would again seem arbitrary. So I
> > think the better model is to record when either parameter changed,
> > and do the initialization just once after that.
>
> Actually no, this wouldn't work either, for the same reason (we might
> again be using an inconsistent pair of parameters). Which makes me
> come back to my original plan (not mentioned anywhere so far):
> Instead of a new size param, we need one which allows setting both
> size and address at the same time. Since the address needs to be
> below 4Gb anyway, we could simply use the high half of the 64-bit
> value to hold the size.

Sure, that seems like it avoids a lot of potential pitfalls.

Tim.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] x86/VMX: sanitize VM86 TSS handling
On Thu, Feb 16, 2017 at 04:49:18AM -0700, Jan Beulich wrote:
> >>> On 16.02.17 at 12:14, wrote:
> On 15.02.17 at 12:21, wrote:
> >> At 01:18 -0700 on 15 Feb (1487121525), Jan Beulich wrote:
> >>> >>> On 14.02.17 at 18:33, wrote:
> >>> >> TBD: Do we really want to re-init the TSS every time we are about to
> >>> >> use it?
> >>> >
> >>> > No - I think we should init it when the guest writes the param(s) and
> >>> > leave it at that. Hvmloader marks it as reserved so the guest should
> >>> > know better than to write to it, and we can't protect it against all
> >>> > the possible ways the guest could break itself.
> >>> >
> >>> > If you do want to re-init it more often, then I think it would still
> >>> > be better to set legacy guests' (lack of a) size limit once, when the
> >>> > guest writes the base param.
> >>>
> >>> The only problem with this being that at the time the base gets
> >>> written we don't know the size yet (nor whether the guest is
> >>> going to write it), and hence we don't know how much space to
> >>> initialize. The lower limit we enforce on the size (if being set) is
> >>> below the 128 byte default for old guests.
> >>
> >> Why not make the lower limit 128? I'd happily exchange simpler
> >> hypervisor code for the theoretical case of a guest that needs to run
> >> realmode code and cares about those few bytes.
> >
> > Actually there's one more issue with doing it when the parameter is
> > being set: If a guest wanted to move the TSS (the parameter isn't
> > one-shot after all), we can't use the old value of the other parameter
> > when the first of the two is being changed. Of course we could make
> > both parameters one-shot, but this would again seem arbitrary. So I
> > think the better model is to record when either parameter changed,
> > and do the initialization just once after that.
>
> Actually no, this wouldn't work either, for the same reason (we might
> again be using an inconsistent pair of parameters). Which makes me
> come back to my original plan (not mentioned anywhere so far):
> Instead of a new size param, we need one which allows setting both
> size and address at the same time. Since the address needs to be
> below 4Gb anyway, we could simply use the high half of the 64-bit
> value to hold the size.

IMHO this looks sensible. I've rebased my PVHv2 Dom0 series on top of
this (and fixed the TSS allocation in the Dom0 build), and it seems to
work fine.

Roger.
Re: [Xen-devel] [PATCH] x86/VMX: sanitize VM86 TSS handling
>>> On 16.02.17 at 12:14, wrote:
On 15.02.17 at 12:21, wrote:
>> At 01:18 -0700 on 15 Feb (1487121525), Jan Beulich wrote:
>>> >>> On 14.02.17 at 18:33, wrote:
>>> >> TBD: Do we really want to re-init the TSS every time we are about to
>>> >> use it?
>>> >
>>> > No - I think we should init it when the guest writes the param(s) and
>>> > leave it at that. Hvmloader marks it as reserved so the guest should
>>> > know better than to write to it, and we can't protect it against all
>>> > the possible ways the guest could break itself.
>>> >
>>> > If you do want to re-init it more often, then I think it would still
>>> > be better to set legacy guests' (lack of a) size limit once, when the
>>> > guest writes the base param.
>>>
>>> The only problem with this being that at the time the base gets
>>> written we don't know the size yet (nor whether the guest is
>>> going to write it), and hence we don't know how much space to
>>> initialize. The lower limit we enforce on the size (if being set) is
>>> below the 128 byte default for old guests.
>>
>> Why not make the lower limit 128? I'd happily exchange simpler
>> hypervisor code for the theoretical case of a guest that needs to run
>> realmode code and cares about those few bytes.
>
> Actually there's one more issue with doing it when the parameter is
> being set: If a guest wanted to move the TSS (the parameter isn't
> one-shot after all), we can't use the old value of the other parameter
> when the first of the two is being changed. Of course we could make
> both parameters one-shot, but this would again seem arbitrary. So I
> think the better model is to record when either parameter changed,
> and do the initialization just once after that.

Actually no, this wouldn't work either, for the same reason (we might
again be using an inconsistent pair of parameters). Which makes me
come back to my original plan (not mentioned anywhere so far):
Instead of a new size param, we need one which allows setting both
size and address at the same time. Since the address needs to be
below 4Gb anyway, we could simply use the high half of the 64-bit
value to hold the size.

Jan
Re: [Xen-devel] [PATCH] x86/VMX: sanitize VM86 TSS handling
>>> On 15.02.17 at 12:21, wrote:
> At 01:18 -0700 on 15 Feb (1487121525), Jan Beulich wrote:
>> >>> On 14.02.17 at 18:33, wrote:
>> >> TBD: Do we really want to re-init the TSS every time we are about to
>> >> use it?
>> >
>> > No - I think we should init it when the guest writes the param(s) and
>> > leave it at that. Hvmloader marks it as reserved so the guest should
>> > know better than to write to it, and we can't protect it against all
>> > the possible ways the guest could break itself.
>> >
>> > If you do want to re-init it more often, then I think it would still
>> > be better to set legacy guests' (lack of a) size limit once, when the
>> > guest writes the base param.
>>
>> The only problem with this being that at the time the base gets
>> written we don't know the size yet (nor whether the guest is
>> going to write it), and hence we don't know how much space to
>> initialize. The lower limit we enforce on the size (if being set) is
>> below the 128 byte default for old guests.
>
> Why not make the lower limit 128? I'd happily exchange simpler
> hypervisor code for the theoretical case of a guest that needs to run
> realmode code and cares about those few bytes.

Actually there's one more issue with doing it when the parameter is
being set: If a guest wanted to move the TSS (the parameter isn't
one-shot after all), we can't use the old value of the other parameter
when the first of the two is being changed. Of course we could make
both parameters one-shot, but this would again seem arbitrary. So I
think the better model is to record when either parameter changed,
and do the initialization just once after that.

Jan
Re: [Xen-devel] [PATCH] x86/VMX: sanitize VM86 TSS handling
>>> On 15.02.17 at 12:21, wrote:
> At 01:18 -0700 on 15 Feb (1487121525), Jan Beulich wrote:
>> >>> On 14.02.17 at 18:33, wrote:
>> >> TBD: Do we really want to re-init the TSS every time we are about to
>> >> use it?
>> >
>> > No - I think we should init it when the guest writes the param(s) and
>> > leave it at that. Hvmloader marks it as reserved so the guest should
>> > know better than to write to it, and we can't protect it against all
>> > the possible ways the guest could break itself.
>> >
>> > If you do want to re-init it more often, then I think it would still
>> > be better to set legacy guests' (lack of a) size limit once, when the
>> > guest writes the base param.
>>
>> The only problem with this being that at the time the base gets
>> written we don't know the size yet (nor whether the guest is
>> going to write it), and hence we don't know how much space to
>> initialize. The lower limit we enforce on the size (if being set) is
>> below the 128 byte default for old guests.
>
> Why not make the lower limit 128? I'd happily exchange simpler
> hypervisor code for the theoretical case of a guest that needs to run
> realmode code and cares about those few bytes.

Well, this would seem a pretty arbitrary limit to me, which I dislike
(but could probably live with).

Jan
Re: [Xen-devel] [PATCH] x86/VMX: sanitize VM86 TSS handling
At 01:18 -0700 on 15 Feb (1487121525), Jan Beulich wrote:
> >>> On 14.02.17 at 18:33, wrote:
> >> TBD: Do we really want to re-init the TSS every time we are about to
> >> use it?
> >
> > No - I think we should init it when the guest writes the param(s) and
> > leave it at that. Hvmloader marks it as reserved so the guest should
> > know better than to write to it, and we can't protect it against all
> > the possible ways the guest could break itself.
> >
> > If you do want to re-init it more often, then I think it would still
> > be better to set legacy guests' (lack of a) size limit once, when the
> > guest writes the base param.
>
> The only problem with this being that at the time the base gets
> written we don't know the size yet (nor whether the guest is
> going to write it), and hence we don't know how much space to
> initialize. The lower limit we enforce on the size (if being set) is
> below the 128 byte default for old guests.

Why not make the lower limit 128? I'd happily exchange simpler
hypervisor code for the theoretical case of a guest that needs to run
realmode code and cares about those few bytes.

Cheers,

Tim.
Re: [Xen-devel] [PATCH] x86/VMX: sanitize VM86 TSS handling
At 01:13 -0700 on 15 Feb (1487121231), Jan Beulich wrote:
> >>> On 14.02.17 at 18:35, wrote:
> > At 06:37 -0700 on 13 Feb (1486967832), Jan Beulich wrote:
> >> >>> On 13.02.17 at 14:19, wrote:
> >> > -    tss = mem_alloc(128, 128);
> >> > -    memset(tss, 0, 128);
> >> > +    tss = mem_alloc(TSS_SIZE, TSS_SIZE);
> >>
> >> tss = mem_alloc(TSS_SIZE, 128);
> >>
> >> is sufficient here, as I've noticed (only) while reviewing Roger's
> >> series, v4 of which did trigger the creation of this patch. I've made
> >> the change locally for now.
> >
> > Should Xen check the alignment when the param gets written?
>
> I did think about this too, but then decided not to, since the guest
> would only shoot itself in the foot (the more that in non-root mode
> no actual task switching by the hardware occurs, so the alignment
> requirement is pretty theoretical anyway).

Righto.

Tim.
Re: [Xen-devel] [PATCH] x86/VMX: sanitize VM86 TSS handling
>>> On 14.02.17 at 16:48, wrote:
> On 14/02/17 08:55, Jan Beulich wrote:
> On 13.02.17 at 19:26, wrote:
>>> On 13/02/17 13:19, Jan Beulich wrote:
>>>> ---
>>>> TBD: Do we really want to re-init the TSS every time we are about to
>>>> use it? This can happen quite often during boot, especially while
>>>> grub is running.
>>> The only case we need worry about is if the guest manually writes to the
>>> area covered by the vm86 tss. I think, looking at the logic, that we
>>> properly restore the old %tr when re-entering protected mode, so we
>>> aren't at risk of having the vm86 tss on the leaving-side of a hardware
>>> task switch.
>> Yes. But that's the whole point of the question: The guest could
>> equally well write to the TSS manually right after we've initialized
>> it. And it only harms itself by doing so, hence we could as well
>> init the TSS on a less frequently executed path.
>
> Per this patch (I think) we initialise it each time the guest tries to
> clear CR0.PE.
>
> Unless the VM86_TSS location is below the 1MB boundary, the guest can't
> actually clobber it.

Is that the case? Don't we mimic big real mode (by using the emulator
for all insns)?

> As an alternative, if you don't reinitialise each time, when would you
> do so?

Well, ideally when the params get set, but see the reply just sent to
Tim's question in that direction. Alternatively we could track whether
either parameter changed, and gate the call to hvm_prepare_vm86_tss()
on that flag.

>>> We have lasted thus far without, but given the issues recently
>>> identified, the only safe conclusion to be drawn is that this
>>> functionality hasn't been used in anger.
>>>
>>> For sensible performance, we need to keep the IO bitmap clear to avoid
>>> taking the #GP emulation path.
>>>
>>> For correct behaviour, we need the entire software bitmap to be 0.
>> This one is just for reasonable performance too, afaict. If faulting,
>> just like with port accesses, we'd emulate the INT $nn.
>
> Does that actually work? Upon re-injection of the event, won't hardware
> follow the same bad path which caused the #GP fault to start with?

realmode_deliver_exception() does everything manually, so I don't
think there's any re-injection.

Jan
Re: [Xen-devel] [PATCH] x86/VMX: sanitize VM86 TSS handling
>>> On 14.02.17 at 18:33, wrote:
> Hi,
>
> At 06:19 -0700 on 13 Feb (1486966797), Jan Beulich wrote:
>> The present way of setting this up is flawed: Leaving the I/O bitmap
>> pointer at zero means that the interrupt redirection bitmap lives
>> outside (ahead of) the allocated space of the TSS. Similarly setting a
>> TSS limit of 255 when only 128 bytes get allocated means that 128 extra
>> bytes may be accessed by the CPU during I/O port access processing.
>>
>> Introduce a new HVM param to set the allocated size of the TSS, and
>> have the hypervisor actually take care of setting, in particular, the
>> I/O bitmap pointer. Both this and the segment limit now take the
>> allocated size into account.
>>
>> Signed-off-by: Jan Beulich
>
> This looks pretty good to me. Once the question below is resolved,
> Reviewed-by: Tim Deegan
>
>> ---
>> TBD: Do we really want to re-init the TSS every time we are about to
>> use it?
>
> No - I think we should init it when the guest writes the param(s) and
> leave it at that. Hvmloader marks it as reserved so the guest should
> know better than to write to it, and we can't protect it against all
> the possible ways the guest could break itself.
>
> If you do want to re-init it more often, then I think it would still
> be better to set legacy guests' (lack of a) size limit once, when the
> guest writes the base param.

The only problem with this being that at the time the base gets
written we don't know the size yet (nor whether the guest is
going to write it), and hence we don't know how much space to
initialize. The lower limit we enforce on the size (if being set) is
below the 128 byte default for old guests.

Jan
Re: [Xen-devel] [PATCH] x86/VMX: sanitize VM86 TSS handling
>>> On 14.02.17 at 18:35, wrote:
> At 06:37 -0700 on 13 Feb (1486967832), Jan Beulich wrote:
>> >>> On 13.02.17 at 14:19, wrote:
>> > -    tss = mem_alloc(128, 128);
>> > -    memset(tss, 0, 128);
>> > +    tss = mem_alloc(TSS_SIZE, TSS_SIZE);
>>
>> tss = mem_alloc(TSS_SIZE, 128);
>>
>> is sufficient here, as I've noticed (only) while reviewing Roger's
>> series, v4 of which did trigger the creation of this patch. I've made
>> the change locally for now.
>
> Should Xen check the alignment when the param gets written?

I did think about this too, but then decided not to, since the guest
would only shoot itself in the foot (the more that in non-root mode
no actual task switching by the hardware occurs, so the alignment
requirement is pretty theoretical anyway).

Jan
Re: [Xen-devel] [PATCH] x86/VMX: sanitize VM86 TSS handling
At 06:37 -0700 on 13 Feb (1486967832), Jan Beulich wrote:
> >>> On 13.02.17 at 14:19, wrote:
> > -    tss = mem_alloc(128, 128);
> > -    memset(tss, 0, 128);
> > +    tss = mem_alloc(TSS_SIZE, TSS_SIZE);
>
> tss = mem_alloc(TSS_SIZE, 128);
>
> is sufficient here, as I've noticed (only) while reviewing Roger's
> series, v4 of which did trigger the creation of this patch. I've made
> the change locally for now.

Should Xen check the alignment when the param gets written?

Tim.
Re: [Xen-devel] [PATCH] x86/VMX: sanitize VM86 TSS handling
Hi,

At 06:19 -0700 on 13 Feb (1486966797), Jan Beulich wrote:
> The present way of setting this up is flawed: Leaving the I/O bitmap
> pointer at zero means that the interrupt redirection bitmap lives
> outside (ahead of) the allocated space of the TSS. Similarly setting a
> TSS limit of 255 when only 128 bytes get allocated means that 128 extra
> bytes may be accessed by the CPU during I/O port access processing.
>
> Introduce a new HVM param to set the allocated size of the TSS, and
> have the hypervisor actually take care of setting, in particular, the
> I/O bitmap pointer. Both this and the segment limit now take the
> allocated size into account.
>
> Signed-off-by: Jan Beulich

This looks pretty good to me. Once the question below is resolved,
Reviewed-by: Tim Deegan

> ---
> TBD: Do we really want to re-init the TSS every time we are about to
> use it?

No - I think we should init it when the guest writes the param(s) and
leave it at that. Hvmloader marks it as reserved so the guest should
know better than to write to it, and we can't protect it against all
the possible ways the guest could break itself.

If you do want to re-init it more often, then I think it would still
be better to set legacy guests' (lack of a) size limit once, when the
guest writes the base param.

Cheers,

Tim.
Re: [Xen-devel] [PATCH] x86/VMX: sanitize VM86 TSS handling
On 14/02/17 08:55, Jan Beulich wrote:
> On 13.02.17 at 19:26, wrote:
>> On 13/02/17 13:19, Jan Beulich wrote:
>>> ---
>>> TBD: Do we really want to re-init the TSS every time we are about to
>>> use it? This can happen quite often during boot, especially while
>>> grub is running.
>> The only case we need worry about is if the guest manually writes to the
>> area covered by the vm86 tss. I think, looking at the logic, that we
>> properly restore the old %tr when re-entering protected mode, so we
>> aren't at risk of having the vm86 tss on the leaving-side of a hardware
>> task switch.
> Yes. But that's the whole point of the question: The guest could
> equally well write to the TSS manually right after we've initialized
> it. And it only harms itself by doing so, hence we could as well
> init the TSS on a less frequently executed path.

Per this patch (I think) we initialise it each time the guest tries to
clear CR0.PE.

Unless the VM86_TSS location is below the 1MB boundary, the guest can't
actually clobber it.

As an alternative, if you don't reinitialise each time, when would you
do so?

>
>> We have lasted thus far without, but given the issues recently
>> identified, the only safe conclusion to be drawn is that this
>> functionality hasn't been used in anger.
>>
>> For sensible performance, we need to keep the IO bitmap clear to avoid
>> taking the #GP emulation path.
>>
>> For correct behaviour, we need the entire software bitmap to be 0.
> This one is just for reasonable performance too, afaict. If faulting,
> just like with port accesses, we'd emulate the INT $nn.

Does that actually work? Upon re-injection of the event, won't hardware
follow the same bad path which caused the #GP fault to start with?

>
>>> +void hvm_prepare_vm86_tss(struct vcpu *v, uint32_t base, uint32_t limit)
>>> +{
>>> +    /*
>>> +     * If the provided area is large enough to cover at least the ISA port
>>> +     * range, keep the bitmaps outside the base structure. For rather small
>>> +     * areas (namely relevant for guests having been migrated from older
>>> +     * Xen versions), maximize interrupt vector and port coverage by pointing
>>> +     * the I/O bitmap at 0x20 (which puts the interrupt redirection bitmap
>>> +     * right at zero), accepting accesses to port 0x235 (represented by bit 5
>>> +     * of byte 0x46) to trigger #GP (which will simply result in the access
>>> +     * being handled by the emulator via a slightly different path than it
>>> +     * would be anyway). Be sure to include one extra byte at the end of the
>>> +     * I/O bitmap (hence the missing "- 1" in the comparison is not an
>>> +     * off-by-one mistake), which we deliberately don't fill with all ones.
>>> +     */
>>> +    uint16_t iomap = (limit >= sizeof(struct tss32) + (0x100 / 8) + (0x400 / 8)
>>> +                      ? sizeof(struct tss32) : 0) + (0x100 / 8);
>>> +
>>> +    ASSERT(limit >= sizeof(struct tss32) - 1);
>>> +    hvm_copy_to_guest_phys(base, NULL, limit + 1, v);
>>> +    hvm_copy_to_guest_phys(base + offsetof(struct tss32, iomap),
>>> +                           &iomap, sizeof(iomap), v);
>> This should be linear rather than phys.
> Strictly speaking yes, but since we're running with an ident map,
> it doesn't matter. And I'd prefer not to have to introduce a vcpu
> parameter to the linear copy function, as that would mean quite
> a few changes elsewhere. Would you be okay with me just
> attaching a respective comment here?

Ok.

~Andrew
Re: [Xen-devel] [PATCH] x86/VMX: sanitize VM86 TSS handling
>>> On 13.02.17 at 19:26, wrote:
> On 13/02/17 13:19, Jan Beulich wrote:
>> ---
>> TBD: Do we really want to re-init the TSS every time we are about to
>> use it? This can happen quite often during boot, especially while
>> grub is running.
>
> The only case we need worry about is if the guest manually writes to the
> area covered by the vm86 tss. I think, looking at the logic, that we
> properly restore the old %tr when re-entering protected mode, so we
> aren't at risk of having the vm86 tss on the leaving-side of a hardware
> task switch.

Yes. But that's the whole point of the question: The guest could
equally well write to the TSS manually right after we've initialized
it. And it only harms itself by doing so, hence we could as well
init the TSS on a less frequently executed path.

> We have lasted thus far without, but given the issues recently
> identified, the only safe conclusion to be drawn is that this
> functionality hasn't been used in anger.
>
> For sensible performance, we need to keep the IO bitmap clear to avoid
> taking the #GP emulation path.
>
> For correct behaviour, we need the entire software bitmap to be 0.

This one is just for reasonable performance too, afaict. If faulting,
just like with port accesses, we'd emulate the INT $nn.

>> +void hvm_prepare_vm86_tss(struct vcpu *v, uint32_t base, uint32_t limit)
>> +{
>> +    /*
>> +     * If the provided area is large enough to cover at least the ISA port
>> +     * range, keep the bitmaps outside the base structure. For rather small
>> +     * areas (namely relevant for guests having been migrated from older
>> +     * Xen versions), maximize interrupt vector and port coverage by pointing
>> +     * the I/O bitmap at 0x20 (which puts the interrupt redirection bitmap
>> +     * right at zero), accepting accesses to port 0x235 (represented by bit 5
>> +     * of byte 0x46) to trigger #GP (which will simply result in the access
>> +     * being handled by the emulator via a slightly different path than it
>> +     * would be anyway). Be sure to include one extra byte at the end of the
>> +     * I/O bitmap (hence the missing "- 1" in the comparison is not an
>> +     * off-by-one mistake), which we deliberately don't fill with all ones.
>> +     */
>> +    uint16_t iomap = (limit >= sizeof(struct tss32) + (0x100 / 8) + (0x400 / 8)
>> +                      ? sizeof(struct tss32) : 0) + (0x100 / 8);
>> +
>> +    ASSERT(limit >= sizeof(struct tss32) - 1);
>> +    hvm_copy_to_guest_phys(base, NULL, limit + 1, v);
>> +    hvm_copy_to_guest_phys(base + offsetof(struct tss32, iomap),
>> +                           &iomap, sizeof(iomap), v);
>
> This should be linear rather than phys.

Strictly speaking yes, but since we're running with an ident map,
it doesn't matter. And I'd prefer not to have to introduce a vcpu
parameter to the linear copy function, as that would mean quite
a few changes elsewhere. Would you be okay with me just
attaching a respective comment here?

> In practice it will only make a difference if the guest manages to
> corrupt its IDENT_PT (which is why I suggested that we move the IDENT_PT
> entirely outside of the guest control), but in such a case, we will
> re-enter the guest with the vm86 tss pointing to something other than
> what we have just initialised.

Right, but again the guest would only harm itself.

>> @@ -4484,6 +4511,31 @@ static int hvmop_set_param(
>>          }
>>          d->arch.x87_fip_width = a.value;
>>          break;
>> +
>> +    case HVM_PARAM_VM86_TSS:
>> +        /* Hardware would silently truncate high bits. */
>
> Does it? I'd have thought it would be a vmentry failure.

I'm pretty sure I've tried it out, as I didn't have it that way
initially. I think a VM entry failure would result only if the
address was non-canonical.

Jan
Re: [Xen-devel] [PATCH] x86/VMX: sanitize VM86 TSS handling
On 13/02/17 13:19, Jan Beulich wrote:
> The present way of setting this up is flawed: Leaving the I/O bitmap
> pointer at zero means that the interrupt redirection bitmap lives
> outside (ahead of) the allocated space of the TSS. Similarly setting a
> TSS limit of 255 when only 128 bytes get allocated means that 128 extra
> bytes may be accessed by the CPU during I/O port access processing.
>
> Introduce a new HVM param to set the allocated size of the TSS, and
> have the hypervisor actually take care of setting, in particular, the
> I/O bitmap pointer. Both this and the segment limit now take the
> allocated size into account.
>
> Signed-off-by: Jan Beulich
> ---
> TBD: Do we really want to re-init the TSS every time we are about to
> use it? This can happen quite often during boot, especially while
> grub is running.

The only case we need worry about is if the guest manually writes to the
area covered by the vm86 tss. I think, looking at the logic, that we
properly restore the old %tr when re-entering protected mode, so we
aren't at risk of having the vm86 tss on the leaving-side of a hardware
task switch.

We have lasted thus far without, but given the issues recently
identified, the only safe conclusion to be drawn is that this
functionality hasn't been used in anger.

For sensible performance, we need to keep the IO bitmap clear to avoid
taking the #GP emulation path.

For correct behaviour, we need the entire software bitmap to be 0.

As this functionality exists only for first-gen VT-x lacking
unrestricted_guest, I can't claim to care too much about the
performance impact.

> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -2850,6 +2850,43 @@ static int hvm_load_segment_selector(
>      return 1;
>  }
>  
> +struct tss32 {
> +    uint16_t back_link, :16;
> +    uint32_t esp0;
> +    uint16_t ss0, :16;
> +    uint32_t esp1;
> +    uint16_t ss1, :16;
> +    uint32_t esp2;
> +    uint16_t ss2, :16;
> +    uint32_t cr3, eip, eflags, eax, ecx, edx, ebx, esp, ebp, esi, edi;
> +    uint16_t es, :16, cs, :16, ss, :16, ds, :16, fs, :16, gs, :16, ldt, :16;
> +    uint16_t trace /* :1 */, iomap;
> +};
> +
> +void hvm_prepare_vm86_tss(struct vcpu *v, uint32_t base, uint32_t limit)
> +{
> +    /*
> +     * If the provided area is large enough to cover at least the ISA port
> +     * range, keep the bitmaps outside the base structure. For rather small
> +     * areas (namely relevant for guests having been migrated from older
> +     * Xen versions), maximize interrupt vector and port coverage by pointing
> +     * the I/O bitmap at 0x20 (which puts the interrupt redirection bitmap
> +     * right at zero), accepting accesses to port 0x235 (represented by bit 5
> +     * of byte 0x46) to trigger #GP (which will simply result in the access
> +     * being handled by the emulator via a slightly different path than it
> +     * would be anyway). Be sure to include one extra byte at the end of the
> +     * I/O bitmap (hence the missing "- 1" in the comparison is not an
> +     * off-by-one mistake), which we deliberately don't fill with all ones.
> +     */
> +    uint16_t iomap = (limit >= sizeof(struct tss32) + (0x100 / 8) + (0x400 / 8)
> +                      ? sizeof(struct tss32) : 0) + (0x100 / 8);
> +
> +    ASSERT(limit >= sizeof(struct tss32) - 1);
> +    hvm_copy_to_guest_phys(base, NULL, limit + 1, v);
> +    hvm_copy_to_guest_phys(base + offsetof(struct tss32, iomap),
> +                           &iomap, sizeof(iomap), v);

This should be linear rather than phys. In practice it will only make a
difference if the guest manages to corrupt its IDENT_PT (which is why I
suggested that we move the IDENT_PT entirely outside of the guest
control), but in such a case, we will re-enter the guest with the vm86
tss pointing to something other than what we have just initialised.

> +}
> +
>  void hvm_task_switch(
>      uint16_t tss_sel, enum hvm_task_switch_reason taskswitch_reason,
>      int32_t errcode)
> @@ -2862,18 +2899,7 @@ void hvm_task_switch(
>      unsigned int eflags;
>      pagefault_info_t pfinfo;
>      int exn_raised, rc;
> -    struct {
> -        u16 back_link,__blh;
> -        u32 esp0;
> -        u16 ss0, _0;
> -        u32 esp1;
> -        u16 ss1, _1;
> -        u32 esp2;
> -        u16 ss2, _2;
> -        u32 cr3, eip, eflags, eax, ecx, edx, ebx, esp, ebp, esi, edi;
> -        u16 es, _3, cs, _4, ss, _5, ds, _6, fs, _7, gs, _8, ldt, _9;
> -        u16 trace, iomap;
> -    } tss;
> +    struct tss32 tss;
>  
>      hvm_get_segment_register(v, x86_seg_gdtr, &gdt);
>      hvm_get_segment_register(v, x86_seg_tr, &prev_tr);
> @@ -4276,6 +4302,7 @@ static int hvm_allow_set_param(struct do
>      /* The following parameters can be set by the guest. */
>      case HVM_PARAM_CALLBACK_IRQ:
>      case HVM_PARAM_VM86_TSS:
> +    case HVM_PARAM_VM86_TSS_SIZE:
>      case HVM_PARAM_ACPI_IOPORTS_LOCATION:
>      case HVM_PARAM_VM_GENERATION_ID_ADDR:
>      case HVM_PARAM_STORE_EVT
Re: [Xen-devel] [PATCH] x86/VMX: sanitize VM86 TSS handling
>>> On 13.02.17 at 14:37, wrote:
> On 13.02.17 at 14:19, wrote:
>> --- a/tools/firmware/hvmloader/hvmloader.c
>> +++ b/tools/firmware/hvmloader/hvmloader.c
>> @@ -177,18 +177,30 @@ static void cmos_write_memory_size(void)
>>  }
>>  
>>  /*
>> - * Set up an empty TSS area for virtual 8086 mode to use.
>> - * The only important thing is that it musn't have any bits set
>> - * in the interrupt redirection bitmap, so all zeros will do.
>> + * Set up an empty TSS area for virtual 8086 mode to use. Its content is
>> + * going to be managed by Xen, but zero fill it just in case.
>>  */
>>  static void init_vm86_tss(void)
>>  {
>> +    /*
>> +     * Have the TSS cover the ISA port range, which makes it
>> +     * - 104 bytes base structure
>> +     * - 32 bytes interrupt redirection bitmap
>> +     * - 128 bytes I/O bitmap
>> +     * - one trailing byte
>> +     * or a total of 265 bytes. As it needs to be a power of two for
>> +     * now (or else the alignment parameter to mem_alloc() needs
>> +     * adjustment), this ends up requiring 512 bytes.
>> +     */
>> +#define TSS_SIZE 512
>>      void *tss;
>>  
>> -    tss = mem_alloc(128, 128);
>> -    memset(tss, 0, 128);
>> +    tss = mem_alloc(TSS_SIZE, TSS_SIZE);
>
>     tss = mem_alloc(TSS_SIZE, 128);
>
> is sufficient here, as I've noticed (only) while reviewing Roger's
> series, v4 of which did trigger the creation of this patch. I've made
> the change locally for now.

Which in turn means the size can also be reduced to 384, and then the
comment needs adjustment. Resulting hunk:

@@ -177,18 +177,29 @@ static void cmos_write_memory_size(void)
 }
 
 /*
- * Set up an empty TSS area for virtual 8086 mode to use.
- * The only important thing is that it musn't have any bits set
- * in the interrupt redirection bitmap, so all zeros will do.
+ * Set up an empty TSS area for virtual 8086 mode to use. Its content is
+ * going to be managed by Xen, but zero fill it just in case.
  */
 static void init_vm86_tss(void)
 {
+    /*
+     * Have the TSS cover the ISA port range, which makes it
+     * - 104 bytes base structure
+     * - 32 bytes interrupt redirection bitmap
+     * - 128 bytes I/O bitmap
+     * - one trailing byte
+     * or a total of 265 bytes. As it needs to be a multiple of the
+     * requested alignment, this ends up requiring 384 bytes.
+     */
+#define TSS_SIZE (3 * 128)
     void *tss;
 
-    tss = mem_alloc(128, 128);
-    memset(tss, 0, 128);
+    tss = mem_alloc(TSS_SIZE, 128);
+    memset(tss, 0, TSS_SIZE);
 
     hvm_param_set(HVM_PARAM_VM86_TSS, virt_to_phys(tss));
+    hvm_param_set(HVM_PARAM_VM86_TSS_SIZE, TSS_SIZE);
     printf("vm86 TSS at %08lx\n", virt_to_phys(tss));
+#undef TSS_SIZE
 }
 
 static void apic_setup(void)

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] x86/VMX: sanitize VM86 TSS handling
>>> On 13.02.17 at 14:19, wrote:
> --- a/tools/firmware/hvmloader/hvmloader.c
> +++ b/tools/firmware/hvmloader/hvmloader.c
> @@ -177,18 +177,30 @@ static void cmos_write_memory_size(void)
>  }
>  
>  /*
> - * Set up an empty TSS area for virtual 8086 mode to use.
> - * The only important thing is that it musn't have any bits set
> - * in the interrupt redirection bitmap, so all zeros will do.
> + * Set up an empty TSS area for virtual 8086 mode to use. Its content is
> + * going to be managed by Xen, but zero fill it just in case.
>  */
>  static void init_vm86_tss(void)
>  {
> +    /*
> +     * Have the TSS cover the ISA port range, which makes it
> +     * - 104 bytes base structure
> +     * - 32 bytes interrupt redirection bitmap
> +     * - 128 bytes I/O bitmap
> +     * - one trailing byte
> +     * or a total of 265 bytes. As it needs to be a power of two for
> +     * now (or else the alignment parameter to mem_alloc() needs
> +     * adjustment), this ends up requiring 512 bytes.
> +     */
> +#define TSS_SIZE 512
>      void *tss;
>  
> -    tss = mem_alloc(128, 128);
> -    memset(tss, 0, 128);
> +    tss = mem_alloc(TSS_SIZE, TSS_SIZE);

    tss = mem_alloc(TSS_SIZE, 128);

is sufficient here, as I've noticed (only) while reviewing Roger's
series, v4 of which did trigger the creation of this patch. I've made
the change locally for now.

Jan
[Xen-devel] [PATCH] x86/VMX: sanitize VM86 TSS handling
The present way of setting this up is flawed: Leaving the I/O bitmap
pointer at zero means that the interrupt redirection bitmap lives
outside (ahead of) the allocated space of the TSS. Similarly, setting a
TSS limit of 255 when only 128 bytes get allocated means that 128 extra
bytes may be accessed by the CPU during I/O port access processing.

Introduce a new HVM param to set the allocated size of the TSS, and
have the hypervisor actually take care of setting up the structure,
namely the I/O bitmap pointer. Both this and the segment limit now
take the allocated size into account.

Signed-off-by: Jan Beulich
---
TBD: Do we really want to re-init the TSS every time we are about to
use it? This can happen quite often during boot, especially while grub
is running.

--- a/tools/firmware/hvmloader/hvmloader.c
+++ b/tools/firmware/hvmloader/hvmloader.c
@@ -177,18 +177,30 @@ static void cmos_write_memory_size(void)
 }
 
 /*
- * Set up an empty TSS area for virtual 8086 mode to use.
- * The only important thing is that it musn't have any bits set
- * in the interrupt redirection bitmap, so all zeros will do.
+ * Set up an empty TSS area for virtual 8086 mode to use. Its content is
+ * going to be managed by Xen, but zero fill it just in case.
  */
 static void init_vm86_tss(void)
 {
+    /*
+     * Have the TSS cover the ISA port range, which makes it
+     * - 104 bytes base structure
+     * - 32 bytes interrupt redirection bitmap
+     * - 128 bytes I/O bitmap
+     * - one trailing byte
+     * or a total of 265 bytes. As it needs to be a power of two for
+     * now (or else the alignment parameter to mem_alloc() needs
+     * adjustment), this ends up requiring 512 bytes.
+     */
+#define TSS_SIZE 512
     void *tss;
 
-    tss = mem_alloc(128, 128);
-    memset(tss, 0, 128);
+    tss = mem_alloc(TSS_SIZE, TSS_SIZE);
+    memset(tss, 0, TSS_SIZE);
 
     hvm_param_set(HVM_PARAM_VM86_TSS, virt_to_phys(tss));
+    hvm_param_set(HVM_PARAM_VM86_TSS_SIZE, TSS_SIZE);
     printf("vm86 TSS at %08lx\n", virt_to_phys(tss));
+#undef TSS_SIZE
 }
 
 static void apic_setup(void)

--- a/tools/libxc/xc_sr_save_x86_hvm.c
+++ b/tools/libxc/xc_sr_save_x86_hvm.c
@@ -68,6 +68,7 @@ static int write_hvm_params(struct xc_sr
         HVM_PARAM_MONITOR_RING_PFN,
         HVM_PARAM_SHARING_RING_PFN,
         HVM_PARAM_VM86_TSS,
+        HVM_PARAM_VM86_TSS_SIZE,
         HVM_PARAM_CONSOLE_PFN,
         HVM_PARAM_ACPI_IOPORTS_LOCATION,
         HVM_PARAM_VIRIDIAN,

--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2850,6 +2850,43 @@ static int hvm_load_segment_selector(
     return 1;
 }
 
+struct tss32 {
+    uint16_t back_link, :16;
+    uint32_t esp0;
+    uint16_t ss0, :16;
+    uint32_t esp1;
+    uint16_t ss1, :16;
+    uint32_t esp2;
+    uint16_t ss2, :16;
+    uint32_t cr3, eip, eflags, eax, ecx, edx, ebx, esp, ebp, esi, edi;
+    uint16_t es, :16, cs, :16, ss, :16, ds, :16, fs, :16, gs, :16, ldt, :16;
+    uint16_t trace /* :1 */, iomap;
+};
+
+void hvm_prepare_vm86_tss(struct vcpu *v, uint32_t base, uint32_t limit)
+{
+    /*
+     * If the provided area is large enough to cover at least the ISA port
+     * range, keep the bitmaps outside the base structure. For rather small
+     * areas (namely relevant for guests having been migrated from older
+     * Xen versions), maximize interrupt vector and port coverage by pointing
+     * the I/O bitmap at 0x20 (which puts the interrupt redirection bitmap
+     * right at zero), accepting accesses to port 0x235 (represented by bit 5
+     * of byte 0x46) to trigger #GP (which will simply result in the access
+     * being handled by the emulator via a slightly different path than it
+     * would be anyway). Be sure to include one extra byte at the end of the
+     * I/O bitmap (hence the missing "- 1" in the comparison is not an
+     * off-by-one mistake), which we deliberately don't fill with all ones.
+     */
+    uint16_t iomap = (limit >= sizeof(struct tss32) + (0x100 / 8) + (0x400 / 8)
+                      ? sizeof(struct tss32) : 0) + (0x100 / 8);
+
+    ASSERT(limit >= sizeof(struct tss32) - 1);
+    hvm_copy_to_guest_phys(base, NULL, limit + 1, v);
+    hvm_copy_to_guest_phys(base + offsetof(struct tss32, iomap),
+                           &iomap, sizeof(iomap), v);
+}
+
 void hvm_task_switch(
     uint16_t tss_sel, enum hvm_task_switch_reason taskswitch_reason,
     int32_t errcode)
@@ -2862,18 +2899,7 @@ void hvm_task_switch(
     unsigned int eflags;
     pagefault_info_t pfinfo;
     int exn_raised, rc;
-    struct {
-        u16 back_link,__blh;
-        u32 esp0;
-        u16 ss0, _0;
-        u32 esp1;
-        u16 ss1, _1;
-        u32 esp2;
-        u16 ss2, _2;
-        u32 cr3, eip, eflags, eax, ecx, edx, ebx, esp, ebp, esi, edi;
-        u16 es, _3, cs, _4, ss, _5, ds, _6, fs, _7, gs, _8, ldt, _9;
-        u16 trace, iomap;
-    } tss;
+    struct tss32 tss;
 
     hvm_get_segment_register(v, x86_seg_gdtr, &gdt);
     hvm_get_segment_register(v, x86_seg_tr, &prev_tr);
@@ -4276,6 +430