> * 8 bit RISC microcontroller

Not 16?

> * 16 general purpose registers
> * 16 level deep hardware call stack

If you have RAM, why not use it?  Model calls like the PPC: put the
current $pc in a register and jump.  The caller saves the old $pc on
the regular stack.  GCC is going to want a "normal" frame.  This is
easy to do in hardware, and more flexible than a hardware call stack.

> * load/store to a flat "memory" (minimum 1k, up to 16k)
>   (addressing in that memory uses a register for the 8 lsb and
>   a special register for the msbs)
>   that memory can be loaded along with the code to have known
>   content at startup for eg.

GCC is going to want register pairs to work as larger registers.
For example, if you have $r2 and $r3 as 8 bit registers, gcc wants
[$r2$r3] to be usable as a 16 bit register.  Another reason to go with
16 bit registers ;-)

GCC won't like having an address split across "special" registers.
But it's OK to limit index registers to even-numbered ones.

> * pipelined with some restriction on instruction scheduling
>   (cfr later)

GCC works better if the hardware enforces the interlocks; it's good at
scheduling pipelines but it doesn't *always* do the right thing, so
it's easier if your hardware tolerates a bad schedule, even if it runs
suboptimally.  Of course, I don't know *that* much about the current
scheduler.  There may be a way to deal with this cleanly now.

> * 2 flags Carry & Zero for testing.

GCC will want 4 (add sign and overflow) to support signed comparisons.
Sign should be easy (bit 7 of the result); overflow is the carry out of
bit 6 XORed with the carry out of bit 7.

> I mentioned earlier that there is some scheduling restriction on the
> instructions due to internal pipelining. For example, the result of a
> fetch from memory may not be used in the instruction directly following
> the fetch. When there is a conditional branch, the instruction just
> following the branch will always be executed, no matter what the result
> is (branch/call/... are not immediate but have a 1 instruction latency
> before they occur). Are these kinds of limitations 'easily' supported by
> gcc ?

Delay slots are common; gcc handles them well.  You might need to add
custom code to enforce the pipeline rules if your pipeline won't
automatically stall.

> I saw several times that gcc works better with a lot of GPRs. I could
> increase them to 32 but then arithmetic instructions would have to use
> the same register for destination as for src1.

16 is sufficient.
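
Going back to the flags for a moment, here is a minimal sketch in C of
what the four flags would compute for an 8 bit add, and how a signed
"less than" falls out of sign and overflow.  The names (struct flags,
alu8_add, signed_lt8) are made up for illustration; they are not part
of your design or of GCC, and this only models the arithmetic, not any
particular hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag set; the names are illustrative, not from a real ISA. */
struct flags {
    bool c; /* carry out of bit 7 */
    bool z; /* result is zero */
    bool n; /* sign: bit 7 of the result */
    bool v; /* signed overflow */
};

/* Model of an 8-bit adder: computes a + b + carry_in and all four flags. */
static uint8_t alu8_add(uint8_t a, uint8_t b, bool carry_in, struct flags *f)
{
    unsigned cin  = carry_in ? 1u : 0u;
    unsigned wide = (unsigned)a + (unsigned)b + cin;   /* 9-bit sum */
    unsigned low7 = (a & 0x7Fu) + (b & 0x7Fu) + cin;   /* sum of bits 0..6 */
    uint8_t  r    = (uint8_t)wide;
    bool c6 = ((low7 >> 7) & 1u) != 0;  /* carry out of bit 6 (into bit 7) */
    bool c7 = ((wide >> 8) & 1u) != 0;  /* carry out of bit 7 */

    f->c = c7;
    f->z = (r == 0);
    f->n = (r & 0x80u) != 0;
    f->v = (c6 != c7);                  /* overflow = c6 XOR c7 */
    return r;
}

/* Signed a < b: compute a - b as a + ~b + 1, then test N XOR V. */
static bool signed_lt8(uint8_t a, uint8_t b)
{
    struct flags f;
    (void)alu8_add(a, (uint8_t)~b, true, &f);
    return f.n != f.v;
}
```

With only carry and zero you can do the unsigned compares; the signed
ones need N XOR V as in signed_lt8 above, which is why both extra flags
are worth having.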