A nice article, thank you for writing it for us for free (I used to pay to read 
similar texts).

>I had the job at the time of converting a huge (and very successful) 
>electronic schematic editor, DASH, from assembler into C. In C it could then 
>be recompiled for 32 bits, and even ported to other platforms like the Sun 
>workstations. The conversion took months, but was a big success.<

Sounds like a so much bug-prone job that the probability of having a final 
working product seem small. I guess you have translated it part-by-part, 
keeping it functional all the time, as you are doing with optlink.


>9. Optlink has no unit tests. Writing them would require understanding what 
>the various functions do, and I won't know precisely what they are supposed to 
>do until most of the program is converted.<

Translating a largish program from a language to a different language is a good 
way to convince most programmers that unit tests are a Good Thing. Unit tests 
make this work quite simpler. And many good programs eventually need to be 
translated, it's not an uncommon event.


>Although I have not run any speed tests, I expect the performance of the 
>non-I/O bound code to be about 30% slower. Since a linker tends to be I/O 
>bound, the actual performance loss probably will be about 10%, which I can 
>live with.<

We'll see. Modern C compilers, like the Intel one, LLVM or GCC 4.4 sometimes 
give surprises :-) Modern C compilers also use CPU instructions that were not 
available in the past (you have to use compilation flags for Pentium4/Core2 or 
similar CPUs of course).


>The difference in object code size is primarily due to the assembler having 
>done register assignments that cross over multiple function calls. There's 
>just a lot less pushing and popping of parameters.<

In GCC there are annotations (that LLVM doesn't support yet) that allow to nail 
variables into registers. One Haskell compiler (GHC) uses this feature a lot, 
it's explained here:
http://www.cse.unsw.edu.au/~pls/thesis/davidt-thesis.pdf

>Registered mode is implemented by both the C and NCG back-ends and involves 
>primarily
two optimisations which significantly increase the performance of generated 
code. The first
optimisation is known as TABLES_NEXT_TO_CODE and optimises the data layout of 
the
code GHC produces, this is further explained in section 2.5. The second 
optimisation though
is far more significant, it deals with the register allocation for the 
generated code. As part of
the execution model that GHC uses for Haskell, it defines a variety of virtual 
registers which
mirror many of the registers typically found on a CPU, such as a stack pointer 
register. These
are examined in more detail in section 2.4.2 but their purpose is to create a 
virtual machine
architecture which can be explicitly managed instead of using the C stack for 
example. In
unregistered mode these virtual registers are all stored on the heap, needing 
memory access for
any reads and writes. Given the frequency of access of these virtual registers, 
this creates a
significant amount of memory traffic which becomes a limiting factor on the 
performance of
the generated code. In registered mode many of these virtual registers are 
instead permanently
assigned to a specific hardware registers (register pinning), greatly improving 
performance. The
runtime of code compiled in registered mode as opposed to unregistered mode is 
on average 55%
shorter. More details of this can be seen in section 4.<

Eventually this GCC feature can be added to D too.
D may also use a function annotation that tells a function how to manage 
certain registers.


>There are also a lot of functions with multiple entry points. I don't know of 
>any current compiler that is able to enregister across a flow graph of 
>function calls, or is able to tail merge multiple functions.<

Probably no common compiler.


>After a few false starts, I remembered the lessons from converting DASH, which 
>are:<

How I translate from a higher level language to another, for example Java to D 
or Python to C:
1) I make sure to have a certain amount of correct input-output pairs for the 
original code, and sometimes I even write few unittests.
2) I clean & reformat the original code, because a well formatted code is more 
readable. I use the formatting style that I am able to read better. I sometimes 
remove excessive comments, and all the parts of the code that don't need to be 
translated. I keep testing that the functional tests and unit test keep working.
3) I slowly modify the original code, lowering its style, or just changing it, 
to make the code look and work as closer as possible to the destination 
language. Often this is a lowering of semantics, to use more basic constructs 
that are shared between the two languages. Every change is of course followed 
by running the tests. This is the intermediate version.
4) And only now I jump to the destination language, fix things, and hope. 
Sometimes I see problems, so I go back to the intermediate version and fix it 
so its semantics is closer to the destination language.

Sometimes I even use an intermediate language, for example to translate Python 
to C I first use D with my dlibs that allow me to write very Python-like D 
code. Then I lower that D version to normal D-style code. And then I slowly 
lower the D version to C-style code. Running tests every few minutes. At the 
end I convert it to C, and I try to see if it works. D1 language is closer to C 
compared to D2 so this is simpler. Some bugs are common here when I translate D 
to C, like forgetting to initialize C variables, etc. To solve this problem I 
slowly assign "= void" to *all* variables and structs in the D code, so I can 
see it crash before going to C.

As you, I have seen that doing such things in the most tidy and regular way 
possible saves me time. Even using an intermediate language (like D1 to 
translate Python => C) looks like a slow thing, but it allows me to avoid many 
bugs along the way, so even if it's boring and it looks like a waste of time, 
in the end I save time. D1 is useful to me to perform such translations too (D2 
will be less fit for this purpose, so I'll probably keep using D1 for this).

Bye,
bearophile

Reply via email to