On 6/19/2012 3:55 PM, Manu wrote:
On 20 June 2012 01:07, Walter Bright <newshou...@digitalmars.com
    Do a grep for "naked" across the druntime library sources. For example, its
    use in druntime/src/rt/alloca.d, where it is very much needed, as alloca()
    is one of those "magic" functions.
I never argued against naked... I agree it's mandatory.

Then I misunderstood you.


    Do a grep for "asm" across the druntime library sources. Can you justify all
    of that with some other scheme?


I think almost all the blocks I just browsed through could be easily written
with nothing more than the register alias feature I suggested, and perhaps a
couple of opcode intrinsics.

But I see nothing gained by that.


And as a bonus, they would also be readable.

I don't agree. The point of IA to me is so I can specify exactly what I want. If I wanted to do it at a higher level, I'd use normal D syntax.


I can imagine cases where the
optimiser would have more freedom too.

But if I'm writing IA, I want to do it my way. Not the optimizer's way, which may or may not be able to give me what I want.


        Thinking more about the implications of removing the inline asm, what would
        REALLY roxors, would be a keyword to insist a variable is represented by a
        register, and by extension, to associate it with a specific register:


    This was a failure in C.


Really?

Yes. C has a register keyword, and nobody uses it anymore. The troubles are many, starting with the fact that people always "register"ed the wrong variables, and it really didn't work out too well once compilers started doing live range register assignments. It's ignored by modern C compilers, and hasn't been carried forward into other languages.
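For readers who have not met the keyword, here is a minimal C sketch of what "register" looks like in source. The function names sum_with_hint and sum_plain are made up for illustration and are not from any library. The storage class is only a hint, so both functions must compute the same result, and a compiler that does its own register allocation is free to generate identical code for them.

```c
#include <stddef.h>

/* Hypothetical example: sum an array, asking for registers explicitly. */
static long sum_with_hint(const int *data, size_t n)
{
    register long total = 0;          /* a hint only; the compiler may ignore it */
    for (register size_t i = 0; i < n; i++)
        total += data[i];
    /* The one rule a C compiler does enforce: writing &total here
       would be rejected, because a register variable has no address. */
    return total;
}

/* The same loop with no hint; the register allocator decides by itself. */
static long sum_plain(const int *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += data[i];
    return total;
}
```

The only observable difference between the two is the restriction on taking the address of the hinted variables.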

This is the missing link between mandatory asm blocks, and being able to
do it in high level code with intrinsics.
The 'register' keyword was similarly fail as 'inline'.. __forceinline was not
fail, it is actually mandatory. I'd argue that __forceregister would be
similarly useful in C as well, but the real power would come from being able to
specify the particular register to alias.

        This would almost entirely eliminate the usefulness of an inline assembler.
        Better yet, this could use the 'new' attribute syntax, which most agree will
        support arguments:
        @register(rsp) int x;


    Some C compilers did have such pseudo-register abilities. It was a failure
    in practice.


Really? I've never seen that. What about it was fail?

It's actually in DMC, believe it or not. It was a giant failure because nobody used it. It was in Borland's TurboC, too. It pretty much just throws a wrench into the gears of more sophisticated code generators.


    I really don't understand preferring all these rather convoluted
    enhancements to avoid something simple and straightforward like the inline
    assembler. The use of IA in the D runtime library, for example, has been
    quite successful.


I agree, IA is useful and has been successful, but it has drawbacks too.
   * IA ruins optimisation around the IA block

dmd's optimizer is not so sensitive to that.

   * IA doesn't inline well.

True, but that's fixable (excluding naked functions). Currently, you can use mixins to do it.

intrinsics allow much greater opportunity for
efficient integration into the calling context
   * most IA functions are small, and prime candidates for inlining (see points
1 and 2)
   * IA is difficult for the majority of programmers to follow/understand

IA isn't for everyone. But when you do need it, it has been a marvelous tool for D.

   * even to experienced programmers, poorly commented asm takes a lot of time
to mentally parse

It's a shame that there are IA constructs that can't be expressed any other way.
I don't think it would take much to address that.



This one seems trivial, you just need one intrinsic:

   size_t reqsize = size * newcapacity;
   __jc(&Loverflow);

That's highly risky. The optimizer knows nothing at all about the state of the flags register, and does not take into account a dependency on the C flag when doing code motion. Nor would the compiler guarantee that the C flag is even set by however it chose to do the previous multiply (for example, the LEA instruction is often used to do multiplies, which leaves the C flag untouched. Oops!). Nothing connects the __jc intrinsic to that multiply operation.
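One way to avoid the dangling dependency described above is to make the overflow test part of the multiply itself, so that the result and the overflow status come out of a single operation the compiler understands. The C sketch below is an illustration, not something proposed in this thread. The helper name checked_reqsize is hypothetical, and __builtin_mul_overflow is a GCC/Clang extension that is assumed to be available (GCC 5 or later, or Clang); a plain division check is used otherwise.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: computes size * newcapacity into *reqsize.
   Returns true on success, false if the product does not fit in size_t. */
static bool checked_reqsize(size_t size, size_t newcapacity, size_t *reqsize)
{
#if defined(__clang__) || (defined(__GNUC__) && __GNUC__ >= 5)
    /* The builtin returns true when the multiplication overflowed.
       Multiply and overflow test are one operation, so the compiler
       cannot separate them or pick an instruction that loses the flag. */
    return !__builtin_mul_overflow(size, newcapacity, reqsize);
#else
    /* Portable fallback: test before multiplying. */
    if (size != 0 && newcapacity > SIZE_MAX / size)
        return false;
    *reqsize = size * newcapacity;
    return true;
#endif
}
```

A caller would branch on the returned value (for example, jump to its overflow handling when the helper returns false) instead of testing the carry flag after the fact.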


 Although it depends on a '&codeLabel' mechanism to get the label address (GCC
supports this in C, I'd love to see this in D too).

Note that supporting such will wind up disabling a lot of the data flow analysis, which is not set up to handle unknown edges between basic blocks.
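For reference, the GCC feature mentioned above is the "labels as values" extension: the address of a label is written &&label (not &label) and is jumped to with goto *ptr. The C sketch below is a toy dispatcher written for this illustration; the opcode names and the run function are hypothetical. It also shows where the unknown edges come from: every goto * could land on any label whose address was taken, so the compiler has to assume a control-flow edge to each of them.

```c
/* Hypothetical opcodes for a toy accumulator machine. */
enum { OP_INC = 0, OP_DBL = 1, OP_HALT = 2 };

/* Runs a program that is assumed to contain only the opcodes above
   and to end with OP_HALT. Uses the GCC/Clang labels-as-values
   extension; this is not standard C. */
static int run(const unsigned char *ops)
{
    /* Table of label addresses, indexed by opcode. */
    static void *const dispatch[] = { &&op_inc, &&op_dbl, &&op_halt };
    int acc = 0;
    int pc = 0;

    goto *dispatch[ops[pc]];      /* target is not known at compile time */

op_inc:
    acc += 1;
    pc++;
    goto *dispatch[ops[pc]];

op_dbl:
    acc *= 2;
    pc++;
    goto *dispatch[ops[pc]];

op_halt:
    return acc;
}
```

Each indirect goto in this function may reach op_inc, op_dbl or op_halt, which is the kind of edge between basic blocks that flow analysis cannot resolve statically.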

To summarize, I see a lot of complex new features, a significant rewrite of the optimizer, and a rewrite of a lot of existing code, and at the end of all that we're pretty much at the same state we are at now.
