Jan sent me a pointer this morning to a blog entry about picolisp:

   http://software-lab.de/doc64/README

I think the document is interesting, mainly because it seems to make a
number of assumptions that don't seem obvious to me. Let me go through Mr.
Burger's points in turn.

1. Stack manipulations.

The indefinite push/pop thing is problematic. Usually, this can be done in
C using alloca(). But more importantly, most native runtimes have size
constraints on the stack that make indefinite push/pop highly undesirable.
It's possible that I'm missing something here - my head has been out of
Scheme for a while, and with static typing some of this particular concern
is eliminated.

Of the various points that Mr. Burger brings forward, the argument for
(more) direct manipulation of the stack to support apply frames seems the
most compelling. If you've been following the BitC list for a while, you
know that I have also put some thought into this for BitC, mostly in the
context of our discussion about currying. I think it's a slight
mis-statement of the problem. The real problem is reconciling [possibly
curried] application with the native register-based calling convention In
the end, my conclusion was that (a) the native (typically register-based)
calling convention exists for a reason and should be exploited whenever
possible, and (b) native register conventions don't get along with
curried-style application very well, and (c) the desired registerization
sure as hell can't be expressed in C.

I'd add that this is even harder to do in LISP, because you don't have much
statically-analyzable type information to work with. You can *sort of* get
procedure arity, but not reliably.

Doing a good job on apply (including curried apply) probably requires that
the compiler emit certain functions using variant register conventions. I
agree that this can't be done by way of C, but it also can't be done by way
of generic assembler.

The alternative is to set up a special frame style (either stack based or
heap based) for such calls. That can be done in C.

2. Alignments and memory layout.

Since Mr. Burger doesn't give examples, it's hard to imagine what heap
structures he is describing. I'm aware of some structures that are a
nuisance to initialize from C, but not any compelling ones *unless* you
need to initialize values with unnatural alignments. Which is something you
really want to avoid. Maybe he's thinking about tag values? In modern C
compilers those can be done with a suitable union and the dotted structure
field notation.

I disagree that 16+2 alignment for SUBRs is difficult. First, he's doing
that to avoid the tag, and masking the tag just isn't that big a deal. More
importantly, every one of his target compilers supports alignment
directives. The real problem is that the C compiler provides no generic way
to insert NOPs at the front of a procedure. Turns out that can be done with
inline assembler. But the bigger problem is that the "+2" assumption simply
isn't enforceable on RISC (because they require 4 byte instruction
alignment anyway). The right fix here is to rearrange the tag structure
into something that can be directly expressed by alignment directives
(either in C or in asm).

In BitC, it turned out that *procedure objects* needed a tag, but the
associated *code thunks* did not. So the real problem here is that he wants
to use a four bit tag value with the LSB always being zero. I looked at
re-tagging TinyScheme at one point. Three bits are definitely enough.
Actually, it looks to me like *two* bits would be enough for Mr. Burger,
though some of the tag would need to move into the object header in that
case, and I understand why he's not fond of that.

3. The argument for using the carry flag is important, but it turns out
that if you write the C code correctly, a decent optimizer will use ADC (or
equivalent) on platforms that support it. See discussion
here<http://stackoverflow.com/questions/6659414/efficient-128-bit-addition-using-carry-flag>.
It does require some knowledge of compiler peephole optimizations, but the
resulting code is portable.

4. The case for manual register allocation is pretty compelling *provided* you
aren't trying to optimize by hand across procedures. The native register
convention generally *isn't* good for a managed language. This is
especially true for closure parameters, as we've found in our own
implementation.

5. Condition code return. NO NO NO NO. There's a REASON that functions are
assumed to whack condition codes!

6. Multiple entry points.

Yes, this is something that C is really bad at. It turns out that you
*can* code
this if you do something comparable to a monster switch statement for
built-in functions, but it's decidedly unnatural.

But I'm not convinced. There aren't that many compelling cases for multiple
entry points, and if you are only doing two or three of them the code size
savings isn't worth the complexity increase.



shap
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev

Reply via email to