On Thursday, 29 May 2014 at 03:29:31 UTC, Jonathan M Davis via Digitalmars-d-announce wrote:
1. The order of the dimensions of multi-dimensional static arrays is backwards
in comparison to what most everyone expects.

    int[4][5][6] foo;

is the same as

    int foo[6][5][4];

and has the same dimensions as

    auto bar = new int[][][](6, 5, 4);

The reasons for it stem from the fact that the compiler reads types outward from the variable name (which is very important to understand in C because of its function pointer syntax but not so important in D). However, once we did

    const(int)* foo;

and didn't allow

    (int)const* foo;

I think that we threw that particular bit of consistency with C/C++ out the window, and we really should have just made static array dimensions be read from left-to-right. Unfortunately, I don't think that we can fix that at this point, because doing so would cause silent breakage (or at minimum, would be silent until RangeErrors were thrown at runtime).

I don't see this as an inconsistency. Just read it as follows:

    int[6][5]* foo;

- start with the type int
- make an array from it
- make an array from that
- and finally, turn it into a pointer.

    const(int)* bar;

Just read `const(int)` as one entity here (as its form suggests, some kind of "function call"):

- start with a const(int)
- make a pointer from it
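
The reading rule above can be checked with the compiler. This is only a minimal sketch, assuming a current D compiler; the variable names are just for illustration:

```d
void main()
{
    // Reading left to right: int -> array of 4 -> array of 5 of those -> array of 6 of those.
    int[4][5][6] foo;
    static assert(typeof(foo).length == 6);          // the outermost dimension is written last
    static assert(is(typeof(foo[0]) == int[4][5]));
    static assert(is(typeof(foo[0][0]) == int[4]));
    foo[5][4][3] = 1;                                 // indices run in the opposite order to the declaration

    // The dynamic-array form lists the dimensions outermost first.
    auto bar = new int[][][](6, 5, 4);
    assert(bar.length == 6 && bar[0].length == 5 && bar[0][0].length == 4);

    // int -> array of 6 -> array of 5 of those -> pointer to that.
    int[6][5]* p;
    static assert(is(typeof(*p) == int[6][5]));

    // const(int) read as one entity, then a pointer to it.
    const(int)* q;
    static assert(is(typeof(*q) == const(int)));
}
```

So the declaration order is consistent with "build the type step by step"; what surprises people is that indexing then peels the layers off in the reverse order.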

3. const, immutable, and inout on the left-hand side of a function declaration are unfortunately legal.

Agreed. At least it's possible to do it by convention (but see 4.).
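
To illustrate why the left-hand placement is confusing, here is a small sketch. `Holder` and its member names are made up for this example; the point is only that a leading `const` qualifies the member function (its `this`), not the return type:

```d
struct Holder
{
    int* p;

    // Looks like "returns const int*", but `const` applies to `this`.
    const int* left() { return null; }

    // Same meaning as `left`, written in the conventional position.
    int* right() const { return null; }

    // A const return type needs the parentheses; this method itself is mutable.
    const(int)* qualified() { return p; }
}

void main()
{
    const Holder h;

    // Both are const member functions returning a plain int*.
    static assert(is(typeof(h.left()) == int*));
    static assert(is(typeof(h.right()) == int*));

    // `qualified` is not a const method, so it cannot be called on a const object.
    static assert(!__traits(compiles, h.qualified()));

    Holder m;
    static assert(is(typeof(m.qualified()) == const(int)*));
}
```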

4. There are some cases (such as with static constructors and unittest blocks) that the attributes have to go on the left for some reason. I don't remember the reasons for it, but it's an inconsistency which definitely trips up even seasoned D programmers from time to time.

I don't know these cases, but the reason might be that function declarations and unittests need to be followed by braces (or a semicolon in the case of functions), whereas some other keywords also allow non-compound statements. This could therefore lead to ambiguities as to whether the type qualifier applies to the declaration or the following statement.

5. The fact that pure is called pure is very problematic at this point as far as explaining things to folks goes. We should probably consider renaming it to something like @noglobal, but I'm not sure that that would go over very well given the amount of breakage involved. It _does_ require a lot of explaining though.

Well, it's just a name, and it's for hysterical raisins ;-) I don't think it's so bad, because the purity concept already differs from language to language.
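
For readers coming from other languages, a rough sketch of what D's `pure` does and does not promise may help. The function names here are invented for the example; the key point is that `pure` forbids access to global mutable state, not mutation through parameters:

```d
int counter; // module-level mutable state (thread-local in D)

// Touches no global mutable state, so it can be `pure`,
// even though it mutates the caller's variable through `ref`.
void increment(ref int x) pure
{
    ++x;
}

// No mutable indirections in the parameters: the result depends only on the argument.
int square(int x) pure
{
    return x * x;
}

// Rejected by the compiler if uncommented: a pure function may not read mutable globals.
// int readCounter() pure { return counter; }

void main()
{
    int n = 0;
    increment(n);
    assert(n == 1);
    assert(square(4) == 16);
}
```

That is roughly what the suggested name @noglobal tries to convey.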

6. The situation with ranges and string is kind of ugly, with them being treated as ranges of code points. I don't know what the correct solution to this is, since treating them as ranges of code units promotes efficiency but makes code more error-prone, whereas treating them as ranges of graphemes would just cost too much. Ranges of code points is _mostly_ correct but still incorrect and _more_ efficient than graphemes but still quite a bit less efficient than code units. So, it's kind of like it's got the best and worst of both worlds. The current situation causes inconsistencies with everything else (forcing us to use isNarrowString all over the place) and definitely requires frequent explaining, but it does prevent some classes of problems. So, I don't know. I used to be in favor of the current situation, but at this point, if we could change it, I think that I'd argue in favor of just treating them as ranges of code units and then have wrappers for ranges of code points or graphemes. It seems like the current situation promotes either using ubyte[] (if you care about efficiency) or the new grapheme facilities in std.uni if you care about correctness, whereas just using strings as ranges of dchar is probably a bad idea unless you just don't want to deal with any of the Unicode stuff, don't care all that much about efficiency, and are willing to have bugs in the areas where operating at the code point level is incorrect.

My preferred solution would be to disallow iterating over bare char/wchar/dchar ranges, but require an explicit .byCodeUnit, .byCodePoint or .byGrapheme. Probably not going to happen, though...
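
To make the three levels concrete, here is a sketch that assumes a Phobos version providing `std.utf.byCodeUnit` and `std.utf.byDchar` alongside `std.uni.byGrapheme`. `byDchar` stands in for the `.byCodePoint` suggested above, which is not an existing string API as far as I know:

```d
import std.range : walkLength;
import std.uni : byGrapheme;
import std.utf : byCodeUnit, byDchar;

void main()
{
    // 'e' followed by U+0301 (combining acute accent): one visible character.
    string s = "e\u0301";

    assert(s.length == 3);                 // UTF-8 code units: 1 for 'e', 2 for U+0301
    assert(s.byCodeUnit.walkLength == 3);  // explicit code units
    assert(s.byDchar.walkLength == 2);     // explicit code points
    assert(s.walkLength == 2);             // bare string as a range: decoded to code points
    assert(s.byGrapheme.walkLength == 1);  // explicit graphemes
}
```

With explicit adaptors at every call site, the reader never has to remember which of the three counts the bare string would give.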
