Hi Quentin,

this patch (or rather parts of it) is part of VBox 5.0.22, so the problem
should be fixed now.

Kind regards,

Frank

On Tuesday 17 May 2016 16:54:56 quentin buathier wrote:
> Hi Michal,
>
> Thank you for taking the time to reply to all my messy explanations. I
> hope my patch will be useful.
>
> Regards
>
> 2016-05-17 16:05 GMT+02:00 Michal Necasek <michal.neca...@oracle.com>:
> > Hi Quentin,
> >
> > A couple of minor points:
> >
> > - VM exits are not only triggered by guest code but also by external
> >   interrupts arriving on the host; that introduces a nice element of
> >   unpredictability into the mix
> >
> > - Some x86 CPUs may only save the FP data pointer on FP exceptions;
> >   however, the FP instruction pointer should still always be saved
> >
> > - One of the situations that could cause problems is when the VM process
> >   gets rescheduled on a different host CPU and the guest FPU state needs
> >   to be correctly saved/loaded; again, it is unpredictable when that
> >   happens
> >
> > I'm still not certain why your testcase always immediately fails on a
> > 64-bit host OS. I suspect that the OS actively destroys the FP CS/DS while
> > manipulating the 64-bit FPU state, which is probably forced by the signal
> > processing.
> >
> > This whole thing was really poorly designed on the hardware level and I
> > don't understand why there's no instruction that can save/restore the full
> > FPU state at once (16-bit segments + 64-bit offsets). Then again, Intel
> > clearly wants to get away from tracking the CS/DS completely, which also
> > "solves" the issue. It actually causes problems for some existing software
> > because FP exception handlers will fail/crash when CS/DS is invalid.
> >
> > Anyway, your patch should solve the problem on the affected CPUs. We'll
> > take another look at it (too easy to get something wrong with assembler)
> > but I think it's doing the right thing.
> >
> > Thanks,
> >
> > Michal
> >
> > On 5/16/2016 11:28 AM, quentin buathier wrote:
> >> Hi Michal,
> >>
> >> I was only mentioning the 32-bit architecture to highlight that we're
> >> running a 32-bit OS (not just an executable) on the 64-bit host. Plus,
> >> you are entirely right that the behavior occurs only on Sandy Bridge and
> >> Ivy Bridge. My understanding of when the CS & DS registers are lost is
> >> incomplete, especially since it's not systematic. Let me try to detail
> >> what I think happens, based on the behavior the provided sample can show:
> >>
> >> 1- From the guest, we execute an FPU instruction, which sets FPU CS & DS.
> >> 2- I assume other code, which triggers a VM exit, runs inside the guest.
> >> 3- VirtualBox uses the 64-bit xsave to save the guest state (see ref 1
> >> at the end of this mail).
> >> 4- Some host code is run which (sometimes only) resets the CS & DS. I
> >> see two possibilities here: maybe FPU instructions are executed which,
> >> because they are on the 64-bit host, reset CS and DS. Or maybe the host
> >> executes something like fninit. I'm not too sure about this part.
> >> 5- VirtualBox then uses the 64-bit xrstor to restore the guest state,
> >> which ignores CS / DS (see ref 2).
> >> 6- Back in the guest, we test FPU CS and DS: the expected behaviour is
> >> that between the execution of the FPU instruction and this test, CS and
> >> DS aren't reset.
> >> 7- If step 4 reset CS & DS, the previous assumption turns out wrong and
> >> we stop the program. Otherwise, we keep looping back to step 1.
> >>
> >> So, from a guest point of view, CS & DS can sometimes be reset for no
> >> obvious reason. On later versions of the CPU, the CS & DS are never
> >> saved anyway, so that wouldn't be a problem.
> >>
> >> The submitted patch forces VirtualBox to properly save FPU CS and DS
> >> when the guest state is saved using xsave.
> >> After the patch is applied, the test program should run indefinitely
> >> on a 32-bit guest, matching the behaviour of a non-virtualised 32-bit
> >> system.
> >>
> >> Ref 1: In subchapter 13.5.1 "x87 State" of the "Intel 64 and IA-32
> >> Architectures Software Developer's Manual" we read that in 64-bit mode
> >> (REX.W=1), xsave uses the space of CS and DS to save the upper parts of
> >> FIP and FDP, so FPU CS and DS are not saved.
> >> Ref 2: As in ref 1, xrstor doesn't restore FPU CS and DS.
> >>
> >> Regards
> >>
> >> 2016-05-13 15:39 GMT+02:00 Michal Necasek <michal.neca...@oracle.com>:
> >> Hi Quentin,
> >>
> >> Please add a bit more detail... we're on a 64-bit system the whole
> >> time so that's not useful in explaining what happens. When we're
> >> running a 32-bit executable, it's executing a 32-bit FNSTENV and the
> >> segment registers should be saved. Where exactly are they lost? By
> >> the way, feel free to point to the relevant sections of the Intel
> >> SDM.
> >>
> >> Also, am I understanding correctly that this problem only affects
> >> Sandy Bridge and Ivy Bridge generation CPUs? Older CPUs shouldn't be
> >> affected because they have no AVX, and Haswell and later are broken
> >> by design and never save the CS/DS at all.
> >>
> >> Regards,
> >>
> >> Michal
> >>
> >> ----- Original Message -----
> >> From: qbuath...@tetrane.com
> >> To: vbox-dev@virtualbox.org
> >> Sent: Friday, May 13, 2016 2:55:20 PM GMT +01:00 Amsterdam / Berlin
> >> / Bern / Rome / Stockholm / Vienna
> >> Subject: Re: [vbox-dev] Virtualbox don't restore FPU segments with
> >> 32-bit guests while using xsave/xrstor
> >>
> >> Hi Frank,
> >>
> >> When we run an FPU instruction on an x86 system, FPU CS and DS are set
> >> to the same values as the CS and DS registers. This is for historical
> >> reasons, dating from when the FPU was a separate chip with its own
> >> registers.
> >> So, the expected behaviour of this sample is to run forever on an x86
> >> guest.
> >>
> >> On an x86_64 system, this is different because it removes this
> >> historical behavior and the FPU segments are always set to 0. So, if
> >> you run this sample on an x86_64 system, it's the normal behaviour to
> >> have "segs unset 1".
> >>
> >> The problem is that VirtualBox doesn't properly restore FPU CS and
> >> DS when it uses xsave/xrstor, but does if it uses fxsave/fxrstor.
> >> The problem happens randomly and I think that it's caused by the
> >> switch between guest code execution and host code execution (when
> >> VirtualBox saves the guest state, restores the host state and, after
> >> the host code execution, repeats those operations in reverse order).
> >>
> >> My patch fixes that problem by using the same behaviour as for
> >> fxsave / fxrstor to save CS and DS when using xsave / xrstor in that
> >> particular case.
> >>
> >> Regards,
> >>
> >> 2016-05-13 14:53 GMT+02:00 quentin buathier <qbuath...@tetrane.com>:
> >> Hi Frank,
> >>
> >> When we run an FPU instruction on an x86 system, FPU CS and DS are
> >> set to the same values as the CS and DS registers. This is for
> >> historical reasons, dating from when the FPU was a separate chip with
> >> its own registers. So, the expected behaviour of this sample is to run
> >> forever on an x86 guest.
> >>
> >> On an x86_64 system, this is different because it removes this
> >> historical behavior and the FPU segments are always set to 0.
> >> So, if you run this sample on an x86_64 system, it's the normal
> >> behaviour to have "segs unset 1".
> >>
> >> The problem is that VirtualBox doesn't properly restore FPU CS
> >> and DS when it uses xsave/xrstor, but does if it uses
> >> fxsave/fxrstor.
> >> The problem happens randomly and I think that it's caused by the
> >> switch between guest code execution and host code execution (when
> >> VirtualBox saves the guest state, restores the host state and, after
> >> the host code execution, repeats those operations in reverse order).
> >>
> >> My patch fixes that problem by using the same behaviour as for
> >> fxsave / fxrstor to save CS and DS when using xsave / xrstor in
> >> that particular case.
> >>
> >> Regards,
> >>
> >> 2016-05-13 9:14 GMT+02:00 Frank Mehnert <frank.mehn...@oracle.com>:
> >> Hi Quentin,
> >>
> >> what is the expected behaviour of this sample? Should it run
> >> forever?
> >> Running this sample in a 32-bit guest stops with "segs unset" after
> >> a short time. After applying your patch and running the example in
> >> the guest, it runs forever.
> >>
> >> But: If I run this sample on the host (Linux 4.5.4), it will always
> >> stop with "segs unset 1" after the first turn.
> >>
> >> Kind regards,
> >>
> >> Frank
> >>
> >> On Thursday 12 May 2016 14:47:01 quentin buathier wrote:
> >> > This is a sample in C++ which reproduces the problem randomly
> >> > (1 ~ 2 seconds), on the same host / guest / CPU as in my previous
> >> > mail.
> >> >
> >> > 2016-05-12 12:20 GMT+02:00 quentin buathier <qbuath...@tetrane.com>:
> >> > > Hi Michal,
> >> > >
> >> > > I can't give a way to reproduce the bug right now, but I'll send
> >> > > an executable if I manage to reproduce the problem on something
> >> > > minimalist.
> >> > >
> >> > > But I can give you the context of the problem:
> >> > > Host OS: Debian jessie 64-bit
> >> > > Guest OS: Debian jessie 32-bit
> >> > > Processor: i7-2600 (and all i7 tested)
> >> > >
> >> > > PS: Sorry for the previous mail that was accidentally sent
> >> > >
> >> > > Regards,
> >> > >
> >> > > 2016-05-12 12:18 GMT+02:00 quentin buathier <qbuath...@tetrane.com>:
> >> > >> Hi Michal,
> >> > >>
> >> > >> I can't give a way to reproduce the bug right now. I'll send an
> >> > >> executable if I manage to reproduce the problem on something
> >> > >> minimalist.
> >> > >>
> >> > >> But I can give you the context of the problem:
> >> > >> Host OS: Debian jessie 64-bit
> >> > >>
> >> > >> 2016-05-12 11:52 GMT+02:00 Michal Necasek <michal.neca...@oracle.com>:
> >> > >>> Hi Quentin,
> >> > >>>
> >> > >>> Thank you for the patch!
> >> > >>>
> >> > >>> Unfortunately (?) I can't reproduce the problem that was
> >> > >>> originally fixed. Could you please provide a bit more
> >> > >>> information? What's the host OS, guest OS, host CPU type? How to
> >> > >>> reproduce the problem?
> >> > >>>
> >> > >>> Regards,
> >> > >>>
> >> > >>> Michal
> >> > >>>
> >> > >>> On 5/12/2016 11:26 AM, quentin buathier wrote:
> >> > >>>> Hi,
> >> > >>>>
> >> > >>>> As I understand it, there used to be a problem with restoring
> >> > >>>> the FPU segments in the case of a 64-bit host with a 32-bit
> >> > >>>> guest. This issue was fixed by using the macros
> >> > >>>> "SAVE_32_OR_64_FPU" and "RESTORE_32_OR_64_FPU" in
> >> > >>>> "src/VBox/VMM/VMMR0/CPUMR0A.asm" (when VirtualBox was using
> >> > >>>> fxsave and fxrstor to save and restore the FPU context).
> >> > >>>>
> >> > >>>> But along with the recent support of xsave / xrstor, the bug
> >> > >>>> was reintroduced: if the CPU supports xsave/xrstor, VirtualBox
> >> > >>>> uses these instructions and the guest's FPU segments are not
> >> > >>>> restored properly.
> >> > >>>>
> >> > >>>> Please find attached a possible patch to fix this issue (MIT
> >> > >>>> licence).
> >> > >>>>
> >> > >>>> Regards,
> >> > >>>>
> >> > >>>> _______________________________________________
> >> > >>>> vbox-dev mailing list
> >> > >>>> vbox-dev@virtualbox.org
> >> > >>>> https://www.virtualbox.org/mailman/listinfo/vbox-dev
> >>
> >> --
> >> Dr.-Ing. Frank Mehnert | Software Development Director, VirtualBox
> >> ORACLE Deutschland B.V. & Co. KG | Werkstr. 24 | 71384 Weinstadt, Germany
> >>
> >> ORACLE Deutschland B.V. & Co. KG
> >> Hauptverwaltung: Riesstraße 25, D-80992 München
> >> Registergericht: Amtsgericht München, HRA 95603
> >>
> >> Komplementärin: ORACLE Deutschland Verwaltung B.V.
> >> Hertogswetering 163/167, 3543 AS Utrecht, Niederlande
> >> Handelsregister der Handelskammer Midden-Niederlande, Nr. 30143697
> >> Geschäftsführer: Alexander van der Ven, Jan Schultheiss, Val Maher

--
Dr.-Ing. Frank Mehnert | Software Development Director, VirtualBox
ORACLE Deutschland B.V. & Co. KG | Werkstr. 24 | 71384 Weinstadt, Germany
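
The layout difference cited in Ref 1 of the thread can be illustrated with a
small C++ model of the first 32 bytes of the FXSAVE/XSAVE legacy region. This
is only a sketch of the memory image, under stated assumptions: it is not
VirtualBox code, it does not model what any CPU loads into its selector
registers on xrstor, and the names X87Pointers, LegacyHeader, encode32,
encode64, decode32 and demo are hypothetical. In the 32-bit format the FPU CS
and DS selectors have their own 16-bit slots (byte offsets 12 and 20); in the
64-bit format (REX.W=1) those bytes hold bits 47:32 of FIP and FDP instead.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of the x87 pointer fields; not a VirtualBox type.
struct X87Pointers {
    std::uint64_t fip;  // FPU instruction pointer offset
    std::uint16_t fcs;  // FPU instruction pointer selector (FPU CS)
    std::uint64_t fdp;  // FPU data pointer offset
    std::uint16_t fds;  // FPU data pointer selector (FPU DS)
};

// First 32 bytes of the legacy region (FCW, FSW, FTW, FOP, pointers, MXCSR).
using LegacyHeader = std::array<std::uint8_t, 32>;

constexpr std::size_t kFipOffset = 8;   // FIP[31:0]
constexpr std::size_t kFcsOffset = 12;  // FCS, or FIP[47:32] in 64-bit format
constexpr std::size_t kFdpOffset = 16;  // FDP[31:0]
constexpr std::size_t kFdsOffset = 20;  // FDS, or FDP[47:32] in 64-bit format

// Store the low `bytes` bytes of `value` in little-endian order.
inline void put_le(LegacyHeader& area, std::size_t off,
                   std::uint64_t value, std::size_t bytes) {
    for (std::size_t i = 0; i < bytes; ++i)
        area[off + i] = static_cast<std::uint8_t>(value >> (8 * i));
}

// Read `bytes` little-endian bytes starting at `off`.
inline std::uint64_t get_le(const LegacyHeader& area, std::size_t off,
                            std::size_t bytes) {
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < bytes; ++i)
        value |= static_cast<std::uint64_t>(area[off + i]) << (8 * i);
    return value;
}

// 32-bit format: 32-bit offsets plus 16-bit selectors.
inline LegacyHeader encode32(const X87Pointers& p) {
    LegacyHeader area{};
    put_le(area, kFipOffset, p.fip & 0xFFFFFFFFu, 4);
    put_le(area, kFcsOffset, p.fcs, 2);
    put_le(area, kFdpOffset, p.fdp & 0xFFFFFFFFu, 4);
    put_le(area, kFdsOffset, p.fds, 2);
    return area;
}

// 64-bit format (REX.W=1): 64-bit offsets; the selectors have no slot.
inline LegacyHeader encode64(const X87Pointers& p) {
    LegacyHeader area{};
    put_le(area, kFipOffset, p.fip, 8);
    put_le(area, kFdpOffset, p.fdp, 8);
    return area;
}

// Interpret an image through the 32-bit layout, as a 32-bit reader would.
inline X87Pointers decode32(const LegacyHeader& area) {
    X87Pointers p{};
    p.fip = get_le(area, kFipOffset, 4);
    p.fcs = static_cast<std::uint16_t>(get_le(area, kFcsOffset, 2));
    p.fdp = get_le(area, kFdpOffset, 4);
    p.fds = static_cast<std::uint16_t>(get_le(area, kFdsOffset, 2));
    return p;
}

// Self-check with illustrative (made-up) guest values.
inline void demo() {
    const X87Pointers guest{0x08048000u, 0x0073, 0x0804A010u, 0x007B};
    assert(decode32(encode32(guest)).fcs == 0x0073);  // selector kept
    assert(decode32(encode64(guest)).fcs == 0);       // selector slot reused
    assert(decode32(encode64(guest)).fds == 0);
}
```

With a 32-bit guest's pointers, the upper FIP/FDP bits are zero, so an image
written in the 64-bit format reads back through the 32-bit layout with both
selectors equal to 0. That is consistent with the "segs unset" symptom
described in the thread, and with the patch's approach of treating the
selectors separately when xsave/xrstor is used.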