Re: [OpenRISC] OpenRISC LLVM backend improvements

Alessandro Di Federico Fri, 06 Jun 2014 06:50:34 -0700

On Sat, 24 May 2014 22:52:03 +0100
Christian Svensson <[email protected]> wrote:
> Also, I'm definitely interested in knowing more of the ABI changes
> you are proposing.


OK, we put up a proposal. Feedback is welcome!

--
Ale

# OR1K NewABI description #

NewABI is a variant of the standard OR1K ABI. The two ABIs differ only
in the calling convention of variadic functions. The objective of
NewABI is to use the same parameter-passing convention for ordinary and
variadic functions.

Consider the following example:

    void callee(int a, ...);

    void caller() {
      callee(0, 1, 2, 3, 4, 5, 6, 7);
    }

Using NewABI, the assembly code for the caller will contain something
like:

    caller:
        // ...
        l.ori r3, r0, 6
        l.ori r4, r0, 7
        l.sw  4(r1), r4
        l.sw  0(r1), r3
        l.ori r3, r0, 0
        l.ori r4, r0, 1
        l.ori r5, r0, 2
        l.ori r6, r0, 3
        l.ori r7, r0, 4
        l.ori r8, r0, 5
        l.jal callee
        l.nop
        // ...

Basically parameters are passed by register as long as possible, and
then are pushed on the stack, just as would happen with a call to a
non-variadic function. From the caller's point of view nothing changes
between the variadic definition of callee or the following ordinary
definition:

    void callee(int, int, int, int, int, int, int, int);
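The register-then-stack assignment described above can be sketched on the caller side as follows. This is only an illustration, not part of the proposal: the names `assign_args` and `struct arg_loc` are hypothetical, and the sketch assumes six argument registers (`r3`-`r8`), no alignment padding for 8-byte arguments, and that once an argument spills to the stack all later ones do too (mirroring the `va_arg` logic in Proposal 1).

```c
#include <assert.h>
#include <stddef.h>

enum { NUM_ARG_REGS = 6, FIRST_ARG_REG = 3, WORD = 4 };

/* Where one argument ends up (hypothetical helper type). */
struct arg_loc {
    int reg;          /* first register number (3..8), or -1 if on the stack */
    int stack_offset; /* byte offset from the stack pointer, or -1 if in registers */
};

/* Assign a location to each of the n arguments whose sizes (4 or 8 bytes)
 * are listed in sizes[]. The same routine would serve ordinary and variadic
 * callees. Returns the number of stack bytes used. */
static int assign_args(const int *sizes, size_t n, struct arg_loc *out)
{
    int reg_bytes = 0;   /* bytes of argument registers consumed so far */
    int stack_bytes = 0; /* bytes of outgoing stack area consumed so far */

    for (size_t i = 0; i < n; i++) {
        assert(sizes[i] == 4 || sizes[i] == 8);
        if (reg_bytes + sizes[i] <= NUM_ARG_REGS * WORD) {
            out[i].reg = FIRST_ARG_REG + reg_bytes / WORD;
            out[i].stack_offset = -1;
            reg_bytes += sizes[i];
        } else {
            /* Assumption: never split an argument, and stop using
             * registers after the first spill. */
            reg_bytes = NUM_ARG_REGS * WORD;
            out[i].reg = -1;
            out[i].stack_offset = stack_bytes;
            stack_bytes += sizes[i];
        }
    }
    return stack_bytes;
}
```

For the eight `int` arguments of the example, this places the first six in `r3`-`r8` and the last two at offsets 0 and 4 from the stack pointer.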

The main difference is located in the way the callee function manages
variadic arguments.

In the following we present two different implementation proposals for
an ABI with this property.

## Proposal 1 ##

A common solution is to define the `__builtin_va_list` type as a
structure that will hold all the parameters necessary to implement the
`va_arg` construct.

In our first proposal we have the following definitions:

    struct __or1k_va_list {
        unsigned reg_size;
        void *reg_save_area;
        void *stack_area;
    };

    typedef struct __or1k_va_list __builtin_va_list[1];

In the `va_list` we have two pointer fields:

* `reg_save_area` points to the callee stack section reserved for
  variadic parameters passed by registers,
* `stack_area` points to the caller stack section reserved for stack
  parameters.

We suggest the following stack layout:

        |                 | (higher addresses)
        +-----------------+
        |                 |
        |  stack params   |
        |                 |  caller 
    ----+-----------------+----------------
        |  RA slot (opt)  |  callee
        +-----------------+
        |  FP slot (opt)  |
        +-----------------+
        |                 |
        |  reg save area  |
        |                 |
        +-----------------+
        |                 |
        |    CSR slots    |
        |                 |
        +-----------------+          /\
        |                 |         /  \
        | Local variables |          ||
        |                 |          ||
        +-----------------+  addresses grow upwards
        |                 |
        | Dynamic objects |  stack grows downwards
        |                 |          ||
        +-----------------+          ||
        |                 |         \  /
        |  stack params   |          \/
        |                 |
        +-----------------+ (lower addresses)

On entry to a variadic function, the registers that may hold parameters
must be saved on the stack in a predefined location. The `va_start`
macro then just initializes the `va_list` fields as follows:

* `reg_save_area` points to the stack section of variadic register
  parameters,
* `stack_area` points to the stack parameters,
* `reg_size` is the size (in bytes) of ordinary register parameters.

The `va_arg` implementation should do the following:

    va_arg(va_list ap, type t) {
      // Aggregates are passed indirectly, so they take one 4-byte slot
      int size = 4;
      if (t == long long || t == double)
        size = 8;
      void *ptr = 0;
      // If it doesn't fit in the registers switch to the stack
      if (ap->reg_size + size > 24) {
        ptr = ap->stack_area;
        ap->stack_area = (char*)(ap->stack_area) + size;
        // From now on every argument is read from the stack
        ap->reg_size = 24;
      } else {
        ptr = ap->reg_save_area;
        ap->reg_save_area = (char*)(ap->reg_save_area) + size;
        ap->reg_size += size;
      }
      if (is_aggregate(t)) {
        // The slot holds a pointer to the aggregate
        return **((t**)ptr);
      }
      return *((t*)ptr);
    }
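Since `va_arg` takes a type, the pseudocode above cannot be compiled as is. A minimal host-side sketch of the same logic, specialized by hand for `int` and `long long`, might look like the following. All `p1_` names are hypothetical; the sketch assumes a 4-byte `int` and an 8-byte `long long`, and it models the register save area and the stack parameters as plain buffers.

```c
#include <assert.h>
#include <string.h>

#define P1_REG_AREA_BYTES 24u /* r3..r8, 4 bytes each */

struct p1_va_list {
    unsigned reg_size;   /* bytes of argument registers consumed so far */
    char *reg_save_area; /* next unread byte of the saved registers */
    char *stack_area;    /* next unread byte of the caller's stack params */
};

/* va_start: named_bytes is the register space taken by the named
 * parameters; saved_regs points to the saved first variadic register. */
static void p1_va_start(struct p1_va_list *ap, void *saved_regs,
                        void *stack_params, unsigned named_bytes)
{
    ap->reg_size = named_bytes;
    ap->reg_save_area = saved_regs;
    ap->stack_area = stack_params;
}

/* Common part of va_arg: address of the next size-byte slot. */
static void *p1_next_slot(struct p1_va_list *ap, unsigned size)
{
    void *ptr;
    assert(size == 4 || size == 8);
    if (ap->reg_size + size > P1_REG_AREA_BYTES) {
        /* Does not fit in the registers: switch to the stack for good. */
        ptr = ap->stack_area;
        ap->stack_area += size;
        ap->reg_size = P1_REG_AREA_BYTES;
    } else {
        ptr = ap->reg_save_area;
        ap->reg_save_area += size;
        ap->reg_size += size;
    }
    return ptr;
}

static int p1_va_arg_int(struct p1_va_list *ap)
{
    int v;
    memcpy(&v, p1_next_slot(ap, 4), sizeof v);
    return v;
}

static long long p1_va_arg_ll(struct p1_va_list *ap)
{
    long long v;
    memcpy(&v, p1_next_slot(ap, 8), sizeof v);
    return v;
}
```

For the earlier `callee(0, 1, ..., 7)` example, `named_bytes` would be 4, the save area would hold the values of `r4`-`r8`, and the sixth `p1_va_arg_int` call would be the first to read from the stack area.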

## Proposal 2 ##

An alternative approach for laying out the stack for a variadic
function is the following:

        |                 | (higher addresses)
        +-----------------+
        |                 |
        |  stack params   |
        |                 |  caller 
    ----+-----------------+----------------
        |                 |  callee
        |  reg save area  |
        |                 |
        +-----------------+
        |  RA slot (opt)  |
        +-----------------+
        |  FP slot (opt)  |
        +-----------------+
        |                 |
        |    CSR slots    |
        |                 |
        +-----------------+          /\
        |                 |         /  \
        | Local variables |          ||
        |                 |          ||
        +-----------------+  addresses grow upwards
        |                 |
        | Dynamic objects |  stack grows downwards
        |                 |          ||
        +-----------------+          ||
        |                 |         \  /
        |  stack params   |          \/
        |                 |
        +-----------------+ (lower addresses)

This layout, at first sight, keeps the parameters contiguous on the
stack, which allows easier access to the arguments using a single
pointer which is properly incremented at each `va_arg(...)` call. 

However, because of the way 8-byte parameters are handled (see the
`va_arg` logic above), in general we cannot assume that the parameters
will actually be contiguous. A simple example is the following prototype:

    void foo(int, long long, long long, ...);

* The first parameter is passed using `r3`.
* The second parameter is passed using `r4-r5`.
* The third parameter is passed using `r6-r7`.

Now, if the first variadic parameter is an aggregate type (passed
indirectly, through a pointer to the aggregate) or its size is at most 4
bytes, then it is passed using `r8`. The other possible case is that the
first variadic parameter is a `long long` or a `double`, thus requiring
two registers. The standard ABI says that in this case we should not
split the parameter half in a register and half on the stack, so we have
to pass the whole parameter on the stack. This particular case leaves
the `r8` slot unused and produces a hole in the stack layout that cannot
be avoided.

However, we want the `va_list` type to be `void*` and thus have a simple
`va_arg` that just uses the `va_list` as a pointer to the argument.

For this reason we propose two solutions.

### Solution A ###

Force variadic functions to always have a frame pointer defined that
points to the beginning of the reg save area. This constraint allows us
to compute the number of parameter bytes we have already consumed as
follows:

    params_bytes = va_list_value - FP + #static_params_bytes

Note that `#static_params_bytes` is a compile time constant.

The `va_arg` construct for `long long` or `double` must handle the
particular case of `params_bytes` equal to 20 differently: only `r8` is
left, so the argument was passed on the stack and its pointer is
`va_list_value + 4` instead of just `va_list_value`.
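A minimal sketch of this special case, under the assumption that the reg save area holds only the variadic registers and is immediately followed by the stack parameters (as in the Proposal 2 layout). The `sa_` names are hypothetical, `fp` stands for the frame pointer, and `static_bytes` for `#static_params_bytes`; a 4-byte `int` and an 8-byte `long long` are assumed.

```c
#include <assert.h>
#include <string.h>

/* va_arg for long long (a double would be handled the same way).
 * *ap is the va_list value, a plain pointer into the frame. */
static long long sa_va_arg_ll(char **ap, const char *fp, long static_bytes)
{
    long params_bytes = (long)(*ap - fp) + static_bytes;
    long long v;

    assert(params_bytes >= static_bytes);
    if (params_bytes == 20)
        *ap += 4; /* only r8 was left: skip its slot, the value is on the stack */
    memcpy(&v, *ap, sizeof v);
    *ap += sizeof v;
    return v;
}

/* va_arg for 4-byte values needs no special case: just bump the pointer. */
static int sa_va_arg_int(char **ap)
{
    int v;
    memcpy(&v, *ap, sizeof v);
    *ap += sizeof v;
    return v;
}
```

With `void foo(int, long long, long long, ...)` the constant `static_bytes` is 20, so a `long long` requested as the first variadic argument hits the special case immediately.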

### Solution B ###

Change the ABI to require that 8-byte parameters are split among
registers and the stack if appropriate.

This approach has a lower overhead than the previous one, but it
requires a change in the ABI outside the context of variadic functions.
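To illustrate why splitting removes the hole, the sketch below models the arguments as one flat byte image: the first 24 bytes travel in registers and the rest on the stack, so once the callee stores the registers right below the stack parameters the image is contiguous again. The `sb_` names are hypothetical, and which half of a split value goes in the register is not specified by this proposal; the sketch simply assumes memory-image order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SB_REG_AREA_BYTES 24u /* r3..r8, 4 bytes each */

/* Caller plus callee prologue combined: the first 24 bytes of the argument
 * image land in the reg save area, the remainder in the stack parameters. */
static void sb_split_args(const char *args, size_t total,
                          char *reg_save_area, char *stack_params)
{
    size_t in_regs = total < SB_REG_AREA_BYTES ? total : SB_REG_AREA_BYTES;
    memcpy(reg_save_area, args, in_regs);
    memcpy(stack_params, args + in_regs, total - in_regs);
}

/* va_arg reduces to a pointer bump, with no special cases. */
static void *sb_va_arg_slot(char **ap, size_t size)
{
    void *ptr = *ap;
    assert(size == 4 || size == 8);
    *ap += size;
    return ptr;
}
```

When the reg save area ends exactly where the stack parameters begin, an 8-byte value that straddles the boundary can be read through a single pointer.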


In both Solution A and Solution B we need to relax the frame pointer
definition to guarantee correctness: the frame pointer must always
point to the base of the reg save area, i.e. right after the return
address slot. For ordinary functions the reg save area size is zero,
thus the frame pointer will be the value of the stack pointer at the
function entry point, as in the standard ABI. For variadic functions
this is generally false. This relaxation of the frame pointer
definition is necessary in order to be able to follow the linked list
of function frames.

The drawback of this relaxed definition is that we are no longer able
to identify stack frame boundaries correctly: the reg save area would
virtually become part of the caller stack frame, since there is no way
to check whether a given stack frame belongs to a variadic or an
ordinary function.
