On Tue, 15 Feb 2011 14:15:06 -0500, Rainer Schuetze <r.sagita...@gmx.de> wrote:


I think David has raised a good point here that seems to have been lost in the discussion about naming.

Please note that the C type for the machine-word integer was usually "int". The C standard only specifies a minimum bit-size for the different types (see for example http://www.ericgiguere.com/articles/ansi-c-summary.html). Most current C and C++ implementations have identical "int" sizes, but "long" now differs between platforms. This approach has failed and has caused many headaches when porting software from one platform to another. D has recognized this and has explicitly defined the bit-size of the various integer types. That's good!

Now, with size_t, the distinction between platforms creeps back into the language. It is everywhere across Phobos, be it as the length of ranges or the size of containers. This can become viral, as everything that comes into contact with these values might have to stick to size_t. Is this really desired?

Do you really want portable code? The thing is, size_t is specifically defined to be *the word size*, whereas C defines int only fuzzily ("should be at least 16 bits, and is recommended to be equivalent to the natural size of the machine"). size_t is *guaranteed* to be the same size on a given platform, even across different compilers.
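To put numbers on that, these asserts should hold on any D platform (a quick illustrative check, not from the spec):

// D pins these widths everywhere, unlike C's int/long:
static assert(short.sizeof == 2);
static assert(int.sizeof   == 4);
static assert(long.sizeof  == 8);
// Only the pointer-sized size_t tracks the architecture:
static assert(size_t.sizeof == (void*).sizeof);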

In addition, size_t isn't actually defined by the compiler; it's declared in the runtime library. So the library controls the size of size_t, not the compiler. This should make it extremely portable.
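It's just an alias in druntime's object.d, roughly along these lines (sketched from memory; the exact declarations may differ between releases):

// The compiler only supplies typeof(int.sizeof); the runtime library
// names it, so its width automatically follows the target's pointer size:
alias typeof(int.sizeof) size_t;
alias typeof(cast(void*)0 - cast(void*)0) ptrdiff_t;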

Consider saving an array to disk and then reading it back on another platform. How many bits should be written for the size of that array?

It depends on the protocol or file format definition. It should be irrelevant what platform/architecture you are on. Any format or protocol worth its salt will define what size integers you should store.

Then you need a protocol implementation that converts between the native size and the stored size.

This is just like network endianness vs. host endianness. You always use htonl and ntohl even if your platform has the same endianness as the network, because you want your code to be portable. Not using them is a no-no even if it works fine on your big-endian system.
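Something along these lines is all the format layer needs (an untested sketch; the helper names and the choice of a 32-bit big-endian length field are just assumptions for illustration):

import std.exception : enforce;

// Store the length in the format's fixed representation: 32-bit,
// big-endian, regardless of the platform's size_t width or endianness
// (the same idea as htonl).
ubyte[4] encodeLength(size_t len)
{
    enforce(len <= uint.max, "array too large for this format");
    immutable n = cast(uint) len;
    ubyte[4] b;
    b[0] = cast(ubyte)(n >> 24);
    b[1] = cast(ubyte)(n >> 16);
    b[2] = cast(ubyte)(n >> 8);
    b[3] = cast(ubyte) n;
    return b;
}

// And convert back to the native size when loading (like ntohl):
size_t decodeLength(const ubyte[4] b)
{
    return ((cast(uint) b[0]) << 24) | ((cast(uint) b[1]) << 16)
         | ((cast(uint) b[2]) << 8)  |  (cast(uint) b[3]);
}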

I don't have a perfect solution, but maybe built-in arrays could be limited to 2^^32-1 elements (or maybe 2^^31-1 to get rid of endless signed/unsigned conversions), so that the normal type to use is still "int". Ranges should adopt the type sizes of the underlying objects.

No, this is too limiting. If I have 64GB of memory (not out of the question), and I want to have a 5GB array, I think I should be allowed to. This is one of the main reasons to go to 64-bit in the first place.
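For scale: a 5GB byte array already has more elements than a 32-bit length can represent (5 * 2^^30 = 5,368,709,120, while uint.max is 4,294,967,295), so its length only fits in a 64-bit size_t:

// 5 GiB worth of bytes overflows any 32-bit length field:
static assert(5UL * 2^^30 > uint.max);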

-Steve
