On Thu, Jun 15, 2017 at 3:18 PM, Andy Lutomirski <l...@kernel.org> wrote:
> On Thu, Jun 15, 2017 at 7:33 AM, Dave Hansen <dave.han...@intel.com> wrote:
>> On 06/14/2017 10:18 PM, Andy Lutomirski wrote:
>>> Dave, why is XINUSE exposed at all to userspace?
>>
>> You need it for XSAVEOPT when it is using the init optimization to be
>> able to tell which state was written and which state in the XSAVE buffer
>> is potentially stale with respect to what's in the registers. I guess
>> you can just use XSAVE instead of XSAVEOPT, though.
>>
>> As you pointed out, if you are using XSAVEC's compaction features by
>> leaving bits unset in the requested feature bitmap registers, you have
>> no idea how much data XSAVEC will write, unless you read XINUSE with
>> XGETBV. But, you can get around *that* by just presizing the XSAVE
>> buffer to be big.
>
> I imagine that, if you're going to save, do something quick, and
> restore, you'd be better off allocating a big buffer rather than
> trying to find the smallest buffer you can get away with by reading
> XINUSE. Also, what happens if XINUSE nondeterministically changes out
> from under you before you do XSAVEC? I assume you can avoid this
> becoming a problem by using RFBM carefully.
>
>>
>> So, I guess that leaves its use to just figuring out how much XSAVEOPT
>> (and friends) are going to write.
>>
>>> To be fair, glibc uses this new XGETBV feature, but I suspect its
>>> usage is rather dubious. Shouldn't it just do XSAVEC directly rather
>>> than rolling its own code?
>>
>> A quick grep through my glibc source only shows XGETBV(0) used which
>> reads XCR0. I don't see any XGETBV(1) which reads XINUSE. Did I miss it.
>
> Take a look at sysdeps/x86_64/dl-trampoline.h in a new enough version.

I wrote a test to compare the latency of the different approaches.
This is on Skylake:

[hjl@gnu-skl-1 glibc-test]$ make
./test
move  : 47212
fxsave: 719440
xsave : 925146
xsavec: 811036
xsave_state_size: 1088
xsave_state_comp_size: 896

Load/store (the "move" row) is about 17X faster than xsavec.

I put my hjl/pr21265/xsavec branch at

https://sourceware.org/git/?p=glibc.git;a=summary

It uses fxsave/xsave/xsavec in _dl_runtime_resolve.

--
H.J.
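
For reference, the "about 17X" figure can be reproduced from the numbers printed above. This is only a small arithmetic sketch: the helper name slowdown_vs_move is made up for illustration, and the values are copied from the test output in whatever unit the test prints.

```c
#include <assert.h>

/* Latency figures copied verbatim from the test output above.
   The unit is whatever ./test prints; it is not stated in the mail. */
static const double LAT_MOVE   = 47212.0;
static const double LAT_FXSAVE = 719440.0;
static const double LAT_XSAVE  = 925146.0;
static const double LAT_XSAVEC = 811036.0;

/* Hypothetical helper: how many times slower a save method is
   than the plain load/store ("move") baseline. */
static double slowdown_vs_move(double latency, double baseline)
{
    assert(baseline > 0.0);
    return latency / baseline;
}
```

Calling slowdown_vs_move(LAT_XSAVEC, LAT_MOVE) gives roughly 17.2, fxsave comes out near 15.2 and xsave near 19.6, so all three save instructions land in the same order of magnitude relative to plain moves.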
plt.tar.xz
Description: application/xz
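
On the "presize the buffer" point from the quoted discussion: one way to get an upper bound on the save area without reading XINUSE is CPUID leaf 0xD. The following is a rough sketch, not the code in the branch above. The function names are hypothetical, it assumes GCC or Clang (for __get_cpuid_count from <cpuid.h>), and the values it returns need not match the xsave_state_size / xsave_state_comp_size lines printed by the test, which may count only the components actually being saved.

```c
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Hypothetical helper: size in bytes of the standard-format XSAVE area
   for the features currently enabled in XCR0 (CPUID.(EAX=0DH,ECX=0):EBX).
   Returns 0 if the leaf is not available or on non-x86 builds. */
static uint32_t xsave_standard_size(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid_count(0xd, 0, &eax, &ebx, &ecx, &edx))
        return 0;
    return ebx;
#else
    return 0;
#endif
}

/* Hypothetical helper: size in bytes of the compacted-format area for
   everything enabled in XCR0 | IA32_XSS (CPUID.(EAX=0DH,ECX=1):EBX).
   Returns 0 if XSAVEC is not reported (EAX bit 1) or on non-x86 builds. */
static uint32_t xsave_compact_size(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid_count(0xd, 1, &eax, &ebx, &ecx, &edx))
        return 0;
    if (!(eax & (1u << 1)))   /* bit 1: XSAVEC and compacted format supported */
        return 0;
    return ebx;
#else
    return 0;
#endif
}
```

A buffer sized with either value (and aligned to 64 bytes) is big enough for the matching save format regardless of what XINUSE happens to be at save time, which sidesteps the question of XINUSE changing between the XGETBV and the XSAVEC.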