https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110899

--- Comment #11 from Michael Matz <matz at gcc dot gnu.org> ---
(In reply to Florian Weimer from comment #10)
> (In reply to Michael Matz from comment #9)
> > > > I don't see how that helps.  Imagine a preserve_all function foo that
> > > > calls printf.  How do you propose that 'foo' saves all parts of the SSE
> > > > registers, even those that aren't invented yet, or those that can't be
> > > > touched by the current ISA?  (printf might clobber all of these)
> > > 
> > > Vector registers are out of scope for this.
> > 
> > Why do you say that?  From clang: "Furthermore it also preserves all
> > floating-point registers (XMMs/YMMs)."  (for preserve_all, but this
> > bugreport does include that variant of the attribute).
> 
> Ugh, I preferred not to look at it because it's likely that the Clang
> implementation is broken (not future-proof).

I see.  Then we need to make clear that, in the context of this report, we
aren't going to do anything about preserve_all with clang's wording.

FWIW, in my implementation referred to above I chose to also have two variants:
one saving/restoring only the SSE2 parts of *mm8-*mm15 (i.e. xmm8-xmm15),
and one guaranteeing not to clobber anything of *mm8-*mm15.  (No guarantees
about *mm16 upwards.)  The first variant can call foreign functions; the
second variant is only allowed to call functions that also give that
guarantee.

(There is also the question of mask registers; the clang documentation doesn't
mention them.  And I would still like to know the reason for the seemingly
arbitrary choice to leave some regs call-clobbered for aarch64.)

> > > But let's look at APX. If printf is recompiled to use APX, then it will
> > > clobber the extended register file. If we define __preserve_most__ the way
> > > we do in my psABI proposal (i.e., *not* as everything but %r11), the
> > > extended APX registers are still caller-saved.
> > 
> > Right, for preserve_most _with your wording_ it works out.  preserve_all
> > or preserve_most with clang wording doesn't.
> 
> In glibc, we already use a full context switch with XSAVE for the dynamic
> loader trampoline. As far as I understand it, it's not future-proof. The
> kernel could provide an interface that is guaranteed to work because it only
> enables those parts of the register file that it can context-switch. I can
> probably get the userspace-only implementation into glibc, but the kernel
> interface seems unlikely. We'd also have to work out the interaction of
> preserve_all and unwinding, setjmp etc.; not sure if there is a proper
> solution for that.

There are a couple of possibilities to implement a halfway solution for this:
via XSAVE and friends, or via runtime dispatching dependent on the current CPU
(e.g. provide a generic save/restore-stuff function in libgcc).  The problem
will always be where the memory for this save/restore pattern should come from,
because its size isn't constant at compile time.  That's also solvable, but
it's becoming more and more hairy.
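
To make the "size isn't constant at compile time" point concrete, here is a
minimal sketch of how the XSAVE area size could be queried at run time via
CPUID leaf 0xD.  The function name xsave_area_size is made up for
illustration; this is not what libgcc or glibc actually do, only an assumption
of how such a helper might look.

```c
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/clang wrappers around the CPUID instruction */
#endif

/* Hypothetical helper: number of bytes XSAVE needs for the state
   components currently enabled in XCR0, or 0 if that can't be
   determined (non-x86 target, no XSAVE, or OS hasn't enabled it).  */
size_t xsave_area_size(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* CPUID leaf 1, ECX bit 27 (OSXSAVE): the OS has enabled XSAVE.  */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx) || !(ecx & (1u << 27)))
        return 0;

    /* CPUID leaf 0xD, subleaf 0: EBX is the save area size for the
       features enabled in XCR0 right now; it differs between CPUs
       and OS configurations, so it can't be a compile-time constant.  */
    if (!__get_cpuid_count(0xD, 0, &eax, &ebx, &ecx, &edx))
        return 0;

    return ebx;
#else
    return 0;
#endif
}
```

A caller would still need 64-byte aligned memory of that size from somewhere
(stack, heap, or a per-thread buffer), which is exactly the hairy part.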

That's why I chose to simply disallow calling foreign functions from those
that want to give very strict guarantees.  We could do the same for
preserve_all, if we absolutely want to have it.
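
For illustration, this is the kind of code such a restriction would reject: a
function carrying clang's preserve_all attribute that calls a foreign function
(printf).  The PRESERVE_ALL macro and the function log_value are made up for
this sketch; on compilers without the attribute the macro expands to nothing,
so it only shows the shape of the problem.

```c
#include <stdio.h>

#ifndef __has_attribute
#define __has_attribute(x) 0   /* compilers without __has_attribute */
#endif

/* Hypothetical wrapper macro: use clang's attribute where available.  */
#if __has_attribute(preserve_all)
#define PRESERVE_ALL __attribute__((preserve_all))
#else
#define PRESERVE_ALL
#endif

/* log_value promises its caller that (nearly) all registers survive,
   yet printf follows the normal ABI and may clobber any call-clobbered
   register, including vector state the compiler doesn't know about.  */
PRESERVE_ALL int log_value(int v)
{
    printf("value = %d\n", v);
    return v + 1;
}
```

Under the "no foreign calls" rule, log_value would only be allowed to call
functions that give the same guarantee, so the printf call would be diagnosed.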
