>       * 8 bit RISC microcontroller

Not 16?

>       * 16 general purpose registers
>       * 16 level deep hardware call stack

If you have RAM, why not use it?  Model calls like the PPC - put
current $pc in a register and jump.  The caller saves the old $pc in
the regular stack.  GCC is going to want a "normal" frame.  This is
easy to do in hardware, and more flexible than a hardware call stack.
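
To make the link-register idea concrete, here is a small C model of
it.  This is only a sketch: the struct, the function names, the
word-addressed $pc and the downward-growing stack are all my
assumptions, not anything from your design.

```c
#include <stdint.h>

/* Hypothetical CPU state: pc, a link register, and a stack pointer
   into ordinary RAM.  All names and sizes are illustrative. */
struct cpu {
    uint16_t pc;
    uint16_t lr;
    uint16_t sp;
    uint8_t ram[1024];
};

/* Call: remember the return address in lr, then jump. */
static void cpu_call(struct cpu *c, uint16_t target)
{
    c->lr = (uint16_t)(c->pc + 1);   /* assumes one address per insn */
    c->pc = target;
}

/* Return: jump back to whatever lr holds. */
static void cpu_ret(struct cpu *c)
{
    c->pc = c->lr;
}

/* A function that makes further calls spills lr to the RAM stack
   first (stack grows downward in this sketch). */
static void cpu_push_lr(struct cpu *c)
{
    c->ram[--c->sp] = (uint8_t)(c->lr & 0xFF);
    c->ram[--c->sp] = (uint8_t)(c->lr >> 8);
}

/* ...and reloads it before returning. */
static void cpu_pop_lr(struct cpu *c)
{
    uint16_t hi = c->ram[c->sp++];
    uint16_t lo = c->ram[c->sp++];
    c->lr = (uint16_t)((hi << 8) | lo);
}
```

Leaf functions never touch the stack at all here, and the call depth
is limited only by RAM rather than by a fixed 16 entries.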

>       * load/store to a flat "memory" (minimum 1k, up to 16k)
>         (addressing in that memory uses a register for the 8 lsb and
>            a special register for the msbs)
>         that memory can be loaded along with the code to have known
>         content at startup for eg.

GCC is going to want register pairs to work as larger registers.
Like, if you have $r2 and $r3 as 8 bit registers, gcc wants [$r2$r3]
to be usable as a 16 bit register.  Another reason to go with 16 bit
registers ;-)

GCC won't like having an address split across "special" registers.

But it's OK to limit index registers to evenly numbered ones.
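
Roughly what I mean by pairs, as a C model.  Again just a sketch under
assumptions of my own: the names are made up, the even register is
taken to hold the high byte, and memory is taken to be the full 16k.

```c
#include <stdint.h>

#define NUM_REGS 16
#define MEM_SIZE (16 * 1024)

static uint8_t regs[NUM_REGS];
static uint8_t mem[MEM_SIZE];

/* Read registers n and n+1 as one 16-bit value.  Only even n is
   meaningful; the low bit of the index is simply ignored. */
static uint16_t read_pair(unsigned n)
{
    n &= 0xEu;   /* force an even index in 0..14 */
    return (uint16_t)((regs[n] << 8) | regs[n + 1]);
}

/* Write a 16-bit value into the pair starting at even register n. */
static void write_pair(unsigned n, uint16_t v)
{
    n &= 0xEu;
    regs[n] = (uint8_t)(v >> 8);
    regs[n + 1] = (uint8_t)(v & 0xFF);
}

/* Load through a pair used as a full address, so no separate
   "special" register is needed for the high bits. */
static uint8_t load_indirect(unsigned n)
{
    return mem[read_pair(n) % MEM_SIZE];
}
```

With something like this the compiler sees one 16-bit pointer register
instead of an address split across two unrelated places.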

>       * pipelined with some restriction on instruction scheduling
>         (cfr later)

GCC works better if the hardware enforces the interlocks itself.  GCC
is good at scheduling pipelines but it doesn't *always* do the right
thing, so it's easier if your hardware tolerates a badly scheduled
sequence by stalling, even if the result is suboptimal.

Of course, I don't know *that* much about the current scheduler.
There may be a way to deal with this cleanly now.

>       * 2 flags Carry & Zero for testing.

GCC will want 4 (add sign and overflow) to support signed comparisons.
Sign should be easy (it's bit 7 of the result); overflow is the carry
out of bit 6 XORed with the carry out of bit 7.
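
For what it's worth, here is how the four flags fall out of an 8-bit
adder, as a C sketch.  The struct and function names are made up for
illustration.  Overflow is computed as the carry into bit 7 (that is,
the carry out of bit 6) XORed with the carry out of bit 7, and a
signed "less than" after a compare is then sign XOR overflow.

```c
#include <stdint.h>

/* Hypothetical flag record; field names are illustrative only. */
struct flags {
    unsigned carry;     /* carry out of bit 7 */
    unsigned zero;      /* result is 0 */
    unsigned sign;      /* bit 7 of the result */
    unsigned overflow;  /* signed result does not fit in 8 bits */
};

/* 8-bit add with carry-in that also reports the four flags. */
static uint8_t add8(uint8_t a, uint8_t b, unsigned cin, struct flags *f)
{
    unsigned sum = (unsigned)a + (unsigned)b + (cin & 1u);
    /* Carry out of bit 6, which is the carry into bit 7. */
    unsigned c6 = (((a & 0x7Fu) + (b & 0x7Fu) + (cin & 1u)) >> 7) & 1u;
    unsigned c7 = (sum >> 8) & 1u;
    uint8_t r = (uint8_t)sum;

    f->carry = c7;
    f->zero = (r == 0);
    f->sign = (r >> 7) & 1u;
    f->overflow = c6 ^ c7;
    return r;
}

/* Compare as a - b, done as a + ~b + 1, then test sign XOR overflow.
   Returns 1 when a < b with both taken as signed 8-bit values. */
static int signed_less_than(uint8_t a, uint8_t b)
{
    struct flags f;
    (void)add8(a, (uint8_t)~b, 1u, &f);
    return (int)(f.sign ^ f.overflow);
}
```

Without the overflow flag, a compare like -128 < 1 comes out wrong,
because the subtraction wraps and the sign bit alone says "positive".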

> I mentioned earlier that there are some scheduling restrictions on the
> instructions due to internal pipelining. For example, the result of a
> fetch from memory may not be used in the instruction directly following
> the fetch. When there is a conditional branch, the instruction just
> following the branch will always be executed, no matter what the result
> is (branch/call/... are not immediate but have a 1 instruction latency
> before they occur). Are these kinds of limitations 'easily' supported by
> gcc ?

Delay slots are common; gcc handles them well.  You might need to add
custom code to enforce the pipeline rules if your pipeline won't
automatically stall.
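
The sort of custom code I have in mind is a late fix-up pass that pads
with NOPs wherever the rule is still violated.  Here is a standalone C
model of that idea for the load-use case.  It is not GCC internals:
the instruction struct, the opcodes and the function names are all
invented for the sketch.

```c
#include <stddef.h>

enum opcode { OP_NOP, OP_LOAD, OP_ADD };

/* Hypothetical instruction record; a register field of -1 is unused. */
struct insn {
    enum opcode op;
    int dest;
    int src1;
    int src2;
};

/* Does instruction i read register reg? */
static int reads_reg(const struct insn *i, int reg)
{
    return reg >= 0 && (i->src1 == reg || i->src2 == reg);
}

/* Copy in[0..n) to out, inserting a NOP after any load whose result
   is read by the very next instruction.  Returns the number of
   instructions written, or 0 if out (capacity cap) is too small. */
static size_t fix_load_use(const struct insn *in, size_t n,
                           struct insn *out, size_t cap)
{
    size_t w = 0;

    for (size_t i = 0; i < n; i++) {
        if (w == cap)
            return 0;
        out[w++] = in[i];

        if (in[i].op == OP_LOAD && i + 1 < n
            && reads_reg(&in[i + 1], in[i].dest)) {
            if (w == cap)
                return 0;
            out[w++] = (struct insn){ OP_NOP, -1, -1, -1 };
        }
    }
    return w;
}
```

A pass like this only has to guarantee correctness; the scheduler
still gets first shot at filling the slot with something useful.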

> I saw several times that gcc works better with a lot of GPRs. I could
> increase them to 32 but then arithmetic instructions would have to use
> the same register for destination as for src1.

16 is sufficient.
