On Sat, 16 Apr 2016 21:46:08 -0700, Walter Bright <newshou...@digitalmars.com> wrote:
> On 4/16/2016 2:40 PM, Marco Leise wrote:
> > Tell me again, what's more elegant!
>
> If I wanted to write in assembler, I wouldn't write in a high level language,
> especially a weird one like GNU version.

I hate the many pitfalls of extended asm: Forget to mention a side effect in the "clobbers" list and the compiler assumes that the register or memory location still holds the value from before the asm. Have an _input_ reg clobbered? You must NOT name it in the clobber list, but use it as a dummy output with a dummy variable assignment. The learning curve is steep and, as you said, the result is usually unintelligible without prior knowledge.

But what I really miss in the previous generation of inline assemblers are these points:

1. In most cases you can make the asm transparent to the optimizer, leading to:
   1.a Inlining of asm
   1.b Dead-code removal of asm blocks

2. Asm template arguments (e.g. input variables) are bound via constraints:
   2.a You can use the output constraint `"=a" var` to mean any of "AL", "AX", "EAX" or "RAX", depending on the size of 'var'.
   2.b `"r" ptr` can bind 32-bit and 64-bit pointers, often eliminating the need for duplicate asm blocks that only differ in one mention of e.g. RSI vs. ESI.
   2.c The compiler seamlessly integrates host code variables with the asm. No need to manually pick tmp registers to move parameters and output. `"r" myUint` is all it takes for 'myUint' to end up in any of EAX, EDX, ... (whatever the register allocator deems efficient at that point).
   2.d As a net result, asm templates often reduce to a single mnemonic and work with X86, X32 and AMD64.

3. In DMD I often see "naked" used to get rid of the function prolog and epilog in an attempt to get an intrinsic-like, fast function. This requires extra care to get the calling convention right and may require more code duplication for e.g. Win32. Asm templates in GCC and LLVM benefit from this speedup automatically, because the backend will remove unneeded prolog/epilog code and even inline small functions.
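To make the constraint points a bit more concrete, here is a rough sketch in C of what I mean. It assumes GCC or Clang on x86 with the default AT&T syntax; the function names (`lowest_set_bit`, `byte_swap`, `load_u32`) are made up for illustration, and the `#else` branch is only a plain C fallback so the snippet also builds elsewhere.

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))

/* Index of the lowest set bit (x must be non-zero).
   "=r" and "rm" leave the operand choice to the register allocator;
   the "cc" clobber declares that BSF changes the flags. */
static inline uint32_t lowest_set_bit(uint32_t x)
{
    uint32_t idx;
    __asm__("bsfl %1, %0" : "=r"(idx) : "rm"(x) : "cc");
    return idx;
}

/* An input that gets overwritten is not listed as a clobber;
   it is declared as a read-write operand ("+r") instead. */
static inline uint32_t byte_swap(uint32_t x)
{
    __asm__("bswap %0" : "+r"(x));
    return x;
}

/* "r"(p) prints as a 32-bit or 64-bit register depending on the
   pointer size, so one template serves both targets.
   The extra "m"(*p) input tells the compiler that *p is read. */
static inline uint32_t load_u32(const uint32_t *p)
{
    uint32_t v;
    __asm__("movl (%1), %0" : "=r"(v) : "r"(p), "m"(*p));
    return v;
}

#else  /* hypothetical portable fallback, not part of the asm technique */

static inline uint32_t lowest_set_bit(uint32_t x)
{
    uint32_t idx = 0;
    while ((x & 1u) == 0u) { x >>= 1; ++idx; }
    return idx;
}

static inline uint32_t byte_swap(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u) | (x << 24);
}

static inline uint32_t load_u32(const uint32_t *p)
{
    return *p;
}

#endif
```

None of the asm statements is marked `volatile`, so the optimizer is free to inline the functions and to drop the asm entirely when the result is unused, which is what points 1.a and 1.b are about.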
GCC's historically grown template syntax, based on multiple _external_ assembler backends, ain't that great, and it is a PITA that it cannot understand the mnemonics and figure out side effects itself like DMD does. But I hope I could highlight a few points where classic assemblers as found in Delphi or DMD fall behind in modern convenience and native efficiency.

When C was invented it matched the CPUs quite well, but today we have dozens of instructions that C and D syntax have no expression for. All modern compilers devote a considerable amount of backend code to pattern matching code constructs like a layman's POPCNT and replacing them with optimal CPU instructions. More and more we turn to browsing the list of readily available compiler built-ins first. The next step is to acknowledge the need and make inline assemblers powerful enough for programmers to efficiently implement non-existing intrinsics in library code.

-- 
Marco