On 6/21/23 02:14, Wang, Yanzhang wrote:
Hi Jeff, sorry for the late reply.

The long branch handling is done at the assembler level.  So the clobbering
of $ra isn't visible to the compiler.  Thus the compiler has to be
extremely careful to not hold values in $ra because the assembler may
clobber $ra.

If the assembler can modify $ra, it seems the rules we defined in
riscv.cc will be ignored. For example, the $ra saving generated by this
patch may be modified by the assembler, and everything that depends on it
will be wrong. So implementing the long jump in the compiler is better.
Basically correct. The assembler potentially clobbers $ra. That's why in the long jump patches $ra becomes a fixed register -- the compiler doesn't know when it's clobbered by the assembler.

Even if this were done in the compiler, we'd still have to do something special with $ra. The point at which decisions about register allocation and such are made is before the point where we know the final positions of jumps/labels. It's a classic problem in GCC's design.
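
To make the range problem concrete, here is a rough sketch in C. It assumes the standard RISC-V encodings (conditional branches reach about +/-4KiB, jal about +/-1MiB) and it is not the code from gas or from the long jump patches. The names classify_branch and needs_scratch_register are made up for illustration. The point is that only the farthest case needs a register to build the target address, and which case applies is only known once label positions are final.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum jump_form {
    JUMP_DIRECT_BRANCH,  /* bxx label */
    JUMP_INVERTED_JAL,   /* inverted bxx over "jal x0, label" */
    JUMP_INVERTED_FAR    /* inverted bxx over auipc+jalr; needs a scratch register */
};

/* Pick a form from the byte displacement between branch and target.
   Ignores the few bytes of adjustment the extra instructions introduce. */
enum jump_form classify_branch(int64_t disp)
{
    if (disp >= -4096 && disp <= 4094)        /* B-type: 13-bit signed, even */
        return JUMP_DIRECT_BRANCH;
    if (disp >= -1048576 && disp <= 1048574)  /* J-type: 21-bit signed, even */
        return JUMP_INVERTED_JAL;
    return JUMP_INVERTED_FAR;
}

/* Only the far form has to materialize the address in a register. */
int needs_scratch_register(enum jump_form form)
{
    return form == JUMP_INVERTED_FAR;
}
```

Since register allocation runs before the displacement is known, the compiler cannot tell up front which branches will end up in the last case, so the scratch register has to be kept out of the allocator's hands everywhere.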


If you're not going to use dwarf, then my recommendation is to ensure that
the data you need is *always* available in the stack at known
offsets.   That will mean your code isn't optimized as well.  It means
hand written assembly code has to follow the conventions, you can't link
against libraries that do not follow those conventions, etc etc.  But
that's the price you pay for not using dwarf (or presumably ORC/SFRAME
which I haven't studied in detail).

Yes. That's right. All the libraries need to follow the same logic. But as
you said, this is the price if we choose this solution. And fortunately,
this will only be used in special scenarios.
The key point is you want the location of the return pointer to be consistent in every function and you want to know that every function has a frame pointer.

Otherwise you end up having to either consult on-the-side tables (at which point you might as well look at ORC/SFRAME) or disassemble code in the executable to deduce where to find fp, ra, etc (which is a path to madness).

Thus for the usage scenario you're looking at, I would recommend always having a frame pointer, every function, no matter how trivial and that $ra always be saved into a suitable slot relative to the frame pointer, again, no matter how trivial the function.
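
As a sketch of what that convention buys you, here is a minimal unwinder in C that needs no tables at all. It runs over a simulated stack (an array of 64-bit words indexed by word, not real memory), and it assumes one particular layout: $ra in the word just below where fp points, and the caller's fp in the word below that. That layout, and the names walk_frames, RA_SLOT and FP_SLOT, are assumptions for illustration; the actual slots are whatever the patch ends up guaranteeing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed fixed layout, in words below the frame pointer (hypothetical). */
#define RA_SLOT 1  /* saved return address at stack[fp - 1] */
#define FP_SLOT 2  /* caller's frame pointer at stack[fp - 2] */

/* Walk the fp chain starting at word index fp, storing each saved return
   address in out[]. A saved fp of 0 marks the outermost frame.
   Returns the number of return addresses written. */
size_t walk_frames(const uint64_t *stack, size_t nwords, size_t fp,
                   uint64_t *out, size_t max_out)
{
    size_t n = 0;

    assert(stack != NULL && out != NULL);
    while (fp != 0 && n < max_out) {
        if (fp < FP_SLOT || fp > nwords)
            break;                      /* fp points outside the stack */
        out[n++] = stack[fp - RA_SLOT];
        size_t next = (size_t)stack[fp - FP_SLOT];
        if (next != 0 && next <= fp)
            break;                      /* callers must sit at higher addresses */
        fp = next;
    }
    return n;
}
```

The walk only works if every function on the stack, leaf or not, hand-written or compiled, fills both slots. A single frame that skips the convention either ends the trace early or sends it off into garbage.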


And Jeff, do you have any other comments about this patch? Should we add
some descriptions somewhere in the doc?
We may need to adjust the documentation a bit since I think I'm suggesting slight changes in the behavior of existing -m options.

I'd like to see an updated patch before commenting further on implementation details.

jeff
