64 bits roadmap

Yves Jaradin Thu, 10 Feb 2011 01:37:04 -0800

Hi all,

Sorry for the cross-posting, the 64 bit discussion should probably be migrated to hackers but since it started in users, I post this message in both.

Here is a small explanation of how I see the way to a 64 bit build. There is quite a lot of coding work but also some design decisions that could be discussed on the hackers mailing-list.

If anyone wants to take the challenge, I'll be glad to help or discuss specific points but unfortunately, I don't currently have much time to spend on the emulator. Certainly not enough to do the 64 bit porting myself.


Cheers,
Yves

Porting Mozart to 64-bit architectures
======================================

First a choice has to be made as to what level of cross-architecture usage we want to achieve.

The three natural choices are:
1. Oz source only. The same Oz code can be executed on both architectures.
2. Pickle level. The pickle format is identical on both architectures; functors compiled on one architecture can be run on the other.
3. Distribution level. A distributed application can be made by sharing entities on two sites running on emulators compiled for different architectures. This implies 2, as the distribution messages re-use the pickle format.

Currently we have distribution-level compatibility between all (32-bit) platforms, regardless of processor type, endianness, etc., so it would seem reasonable to try to achieve the same level of compatibility for differing bit-widths.

Then a choice has to be made as to what level of knowledge/usage we want to give to the emulator.
1. Compile-time only. We add -m32 where needed in the Makefiles and adjust the dependencies and build-dependencies of the packages so that they can build easily on 64-bit platforms, but as 32-bit executables.
2. Pointers only. Every integer that is actually a pointer in disguise is made to be platform-length. Every other integer is forced to 32 bits. This allows compiling as native 64-bit: gcc can use 64-bit instructions and registers, but nothing user-visible is changed, including maximum heap size, etc.
4. Everything. For every integer, the decision is made to make it platform-length, fixed to 32 bits (should be uncommon) or fixed to 64 bits. The decision is taken purely on conceptual grounds, based on the meaning of the integer. Every integer that appears in an external format (pickle or distribution message) should be fixed-width if we want interoperability. Most others should be platform-sized. Some may be pushed to 64 bits even on 32-bit platforms to increase the limits for data structures (bytestring lengths, tuple widths, etc.) at the cost of some performance on 32-bit platforms. The range of FDs and FSs could be increased, etc. In short, nothing in the code is decided because of the 32-bit legacy. It could even go as far as considering 64-bit the main platform, with some workarounds so that it can continue to work on 32-bit platforms.
3. Between 2 and 4. Like in 4, every integer is considered and a similar decision is made. However, if the amount of work to make it platform-sized or fixed to 64 bits is too big compared to the benefits, we fix it to 32 bits as in 2. It might not be very useful to have bytestrings of more than 4 GB or tuples with millions of millions of fields, but there have been reports of people hitting the maximum heap size by dealing with an enormous number of entities; increasing the range of small (optimized) integers can probably improve performance a lot; some other limits may be easy enough to remove that we would not want to leave them in place. However, the evaluation of the trade-offs is ultimately in the hands of the one doing the implementation work.
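
To make the three kinds of integer in levels 3 and 4 concrete, here is a minimal C++ sketch. All type and function names are hypothetical and not taken from the Mozart sources. The most-significant-byte-first order is only an assumption for illustration; it is not a statement about the actual pickle format. The first check assumes that uintptr_t has the size of a pointer, which is true on common platforms but not guaranteed by the standard.

```cpp
#include <cstdint>

// Hypothetical names, for illustration only.

// Platform-length: an integer that is really a pointer in disguise.
typedef std::uintptr_t PointerWord;

// Fixed to 32 bits: a value that appears in an external format
// (pickle or distribution message).
typedef std::uint32_t WireInt32;

// Fixed to 64 bits even on 32-bit platforms: a limit we want to raise.
typedef std::uint64_t ByteStringLength;

static_assert(sizeof(PointerWord) == sizeof(void*),
              "PointerWord must be able to hold a pointer");
static_assert(sizeof(WireInt32) == 4, "wire integers are 32 bits everywhere");
static_assert(sizeof(ByteStringLength) == 8, "lengths are 64 bits everywhere");

// Byte number `index` of a wire value (0 = most significant, assumed order).
// Using shifts keeps the result independent of host endianness and width.
constexpr unsigned char wireByte(WireInt32 value, unsigned index) {
  return static_cast<unsigned char>((value >> (8u * (3u - index))) & 0xFFu);
}

// Rebuild a wire value from its four bytes, most significant first.
constexpr WireInt32 wireFromBytes(unsigned char b0, unsigned char b1,
                                  unsigned char b2, unsigned char b3) {
  return (static_cast<WireInt32>(b0) << 24) |
         (static_cast<WireInt32>(b1) << 16) |
         (static_cast<WireInt32>(b2) << 8) |
         static_cast<WireInt32>(b3);
}
```

The static_assert lines turn the conceptual decision about each integer into something the compiler checks on every platform.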

Level 1 should be considered because it is probably rather easy and could be a stop-gap measure while changes are made for another level. Level 2 probably has too high a cost for not much benefit. I would go for level 3, taking advantage where it is easy enough.

Mozart was designed at a time when 32 bits seemed more than enough for the foreseeable future, and 16-bit platforms were excluded from the start. Therefore, no attempts were made to avoid or limit dependencies on the bit-width of the platform.

The following parts of Mozart are probably the most critical for 64-bit porting.

The memory manager. Mozart manages its memory in order to implement its own GC. By necessity, this involves lots of manipulation of pointers as integers. Alignment of memory blocks may need to be increased in certain cases.
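
As an illustration of the kind of pointer-as-integer arithmetic involved, here is a small sketch of rounding an address up to an alignment boundary. The helper names are hypothetical, not from the Mozart sources. The point is only that the integer holding the address has to be uintptr_t, not a 32-bit type.

```cpp
#include <cstdint>

// Hypothetical helper, for illustration only.
// Round `address` up to the next multiple of `alignment`.
// `alignment` must be a power of two.
constexpr std::uintptr_t alignUp(std::uintptr_t address,
                                 std::uintptr_t alignment) {
  return (address + (alignment - 1)) & ~(alignment - 1);
}

// Pointer version. Going through std::uintptr_t keeps all the bits of the
// address; going through a 32-bit integer would silently drop the upper
// half on a 64-bit platform.
inline void* alignPointer(void* p, std::uintptr_t alignment) {
  return reinterpret_cast<void*>(
      alignUp(reinterpret_cast<std::uintptr_t>(p), alignment));
}
```

With 8-byte pointers, a block allocator that used to align on 4 bytes would call such a helper with 8 (or more) instead.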

The tagged pointers. A tagged pointer is a combination of a pointer and a small integer (2-3 bits depending on places), put in the low-order bits of the pointer as they are guaranteed to be zero by alignment. The small integer identifies the type of the pointed-to value. In certain cases (small integers) the "pointer" is not a pointer but the value itself (shifted a few bits). Replacing tagged pointers with something cleaner (e.g. a pair of a type and a union pointer/int) could be done but could increase memory usage 4-fold. By their nature, tagged pointers involve lots of bit-level manipulation of pointers and therefore casting to and from integers. Most tag manipulations are encapsulated in macros but I wouldn't bet they all are.
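
For readers not familiar with the technique, a tagged word could look roughly like the sketch below. This is not the actual Mozart layout: the 2-bit tag width, the tag value and all names are assumptions made for the example. The small-integer decoding relies on arithmetic right shift of negative values, which mainstream compilers provide (and which C++20 guarantees).

```cpp
#include <cstdint>

// Hypothetical layout, for illustration only: 2 tag bits in the low-order
// bits of a pointer-sized word.
typedef std::uintptr_t TaggedWord;

constexpr unsigned kTagBits = 2;
constexpr TaggedWord kTagMask = (TaggedWord(1) << kTagBits) - 1;
constexpr unsigned kTagSmallInt = 1;  // assumed tag value for small integers

// Combine an aligned address (low bits zero) with a tag.
constexpr TaggedWord makeTagged(std::uintptr_t alignedBits, unsigned tag) {
  return alignedBits | tag;
}

constexpr unsigned tagOf(TaggedWord w) {
  return static_cast<unsigned>(w & kTagMask);
}

constexpr std::uintptr_t untag(TaggedWord w) { return w & ~kTagMask; }

// Small integers are stored in the word itself, shifted past the tag bits.
// With a pointer-sized word their range grows automatically on 64 bits.
constexpr TaggedWord makeSmallInt(std::intptr_t value) {
  return (static_cast<TaggedWord>(value) << kTagBits) | kTagSmallInt;
}

constexpr std::intptr_t smallIntValue(TaggedWord w) {
  return static_cast<std::intptr_t>(w) >> kTagBits;
}

// Pointer helpers: every cast goes through std::uintptr_t, never int.
inline TaggedWord tagPointer(void* p, unsigned tag) {
  return makeTagged(reinterpret_cast<std::uintptr_t>(p), tag);
}

inline void* pointerOf(TaggedWord w) {
  return reinterpret_cast<void*>(untag(w));
}
```

Any place in the code that does the equivalent of these casts by hand, through a 32-bit integer type instead of a pointer-sized one, is a candidate for breakage.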

The threaded code. Mozart bytecode is loaded in an optimized threaded code format. Some array indexes are explicitly multiplied by the member size to transform a shift-then-add into a simple add at machine-code level. Instruction sizes are used in many places (at least emulation, GC, pickling) and may need to be adjusted depending on what their parameters are. Alignment requirements may also be a concern. All in all, this part of the code has been very micro-optimized and can contain lots of problematic hacks.
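
The pre-multiplied index trick can be sketched as follows. The names are hypothetical and the real code certainly looks different. What matters is that the scale factor has to come from sizeof, since a hard-coded 4 (or a shift by 2) is exactly the kind of hack that breaks when slots become 8 bytes wide.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical register slot: one pointer-sized word.
typedef std::uintptr_t Slot;

// Stored in the loaded instruction instead of the plain index, so that the
// emulator only has to add it to the base address.
constexpr std::size_t scaledIndex(std::size_t index) {
  return index * sizeof(Slot);
}

// Inverse, needed wherever the plain index has to be recovered
// (for example when writing the instruction back out).
constexpr std::size_t plainIndex(std::size_t byteOffset) {
  return byteOffset / sizeof(Slot);
}

// Access by pre-multiplied byte offset: a single add, no shift.
inline Slot* slotAt(Slot* base, std::size_t byteOffset) {
  return reinterpret_cast<Slot*>(reinterpret_cast<char*>(base) + byteOffset);
}
```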

The FD implementation. This is also very old, micro-optimized code that can contain anything... Keeping that part as close as possible to 32 bits would be easiest, but because of the tagged pointers used for small integers (which would now be pointer-sized), some changes may be needed.

The pickling system. Non-DAG data structures are pickled using numbered coreferences. What should be done when the number of these exceeds what fits in a 32-bit integer? Changing the pickle format could be an option but we would then need to change the implementation on the 32-bit side. Other fields may have similar problems.
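
One conservative possibility, given here only as a sketch and not as a decision, is to keep the 32-bit field and have a 64-bit pickler check the count before writing it, so that an oversized structure is reported as an error instead of being silently truncated. The names below are hypothetical.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical: the largest coreference number a 32-bit pickle field can hold.
constexpr std::uint64_t kMaxWireRef =
    std::numeric_limits<std::uint32_t>::max();

// True if a coreference number counted in a 64-bit integer can be written
// into the 32-bit field without loss.
constexpr bool fitsInWireRef(std::uint64_t refNumber) {
  return refNumber <= kMaxWireRef;
}
```

This keeps pickles readable by 32-bit emulators, at the price of a limit that a 64-bit emulator could otherwise exceed.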


The way I would go about the porting is as follows:

1. Remove all the x86 assembly language. It is optimized for old CPUs that nobody uses anymore, compilers have improved a lot, and we don't need it since Mozart runs on other architectures.
2. Reverse-engineer the functioning of the critical parts above from the source, removing dead code and cruft before fixing them to 64-bit cleanness.
3. Look at remaining compiler warnings for dangerous casts, and grep for the idioms that were previously found to need fixing.
4. Test, debug, fix. Lather, rinse, repeat.
_________________________________________________________________________________
mozart-users mailing list                               
[email protected]
http://www.mozart-oz.org/mailman/listinfo/mozart-users
