On Friday, 7 March 2014 at 19:43:57 UTC, Walter Bright wrote:
On 3/7/2014 7:03 AM, Dicebot wrote:
1) It is a huge breakage, and you have been refusing to do one even for more important problems. What explains this sudden change of mind?

1. Performance Performance Performance

Not important enough. D has always been a "safe by default, fast when asked" language, not the other way around. There is no fundamental performance problem here, only a lack of knowledge about Phobos.
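
To make the point concrete, here is a minimal sketch (assuming current Phobos) of where the cost comes from and how the existing tools avoid it: range primitives on `char[]`/`string` decode UTF-8 on every step, while `std.string.representation` gives a plain byte view that skips that work.

```d
import std.algorithm : count;
import std.stdio : writeln;
import std.string : representation;

void main()
{
    string s = "häßlich";

    // Range-based code auto-decodes: the lambda sees a dchar and every
    // step pays for UTF-8 decoding, whether the caller needs it or not.
    writeln(s.count!(c => c == 'l'));

    // Opting out: representation() reinterprets the data as
    // immutable(ubyte)[], so the same algorithm walks raw code units.
    writeln(s.representation.count!(b => b == 'l'));
}
```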

2. The current behavior is surprising (it sure surprised me, I didn't notice it until I looked at the assembler to figure out why the performance sucked)

That may imply that better documentation is needed. You were only surprised because of a wrong initial assumption about what the `char[]` type means.
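
For reference, a small sketch of what the documented semantics actually are (assuming current Phobos): indexing stays at the code-unit level, while the range primitives decode.

```d
import std.range : ElementType, front;

void main()
{
    string s = "π=3.14";

    // Indexing/slicing stay at the code-unit level...
    static assert(is(typeof(s[0]) == immutable(char)));

    // ...while the range primitives decode, so iteration yields code points.
    static assert(is(typeof(s.front) == dchar));
    static assert(is(ElementType!string == dchar));
}
```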

3. Weirdnesses like ElementEncodingType

ElementEncodingType is extremely annoying, but I think it is just a side effect of a bigger problem: how string algorithms are currently handled. It does not need to be that way.
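
A sketch of the split in question (assuming current Phobos): for narrow strings the two traits disagree, which is why generic string code ends up written against `ElementEncodingType` instead of `ElementType`.

```d
import std.range : ElementEncodingType, ElementType;

// For narrow strings the traits disagree: ElementType reflects the
// auto-decoded view, ElementEncodingType the underlying code unit.
static assert(is(ElementType!(char[]) == dchar));
static assert(is(ElementEncodingType!(char[]) == char));

// For every other range they coincide, which is why the extra trait only
// ever shows up in string-handling code.
static assert(is(ElementType!(int[]) == int));
static assert(is(ElementEncodingType!(int[]) == int));

void main() {}
```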

4. Strange behavior differences between char[], char*, and InputRange!char types

Again, there is nothing strange about it. `char[]` is a special type with special semantics that are defined in the documentation and followed consistently everywhere except raw array indexing/slicing (which I find unfortunate, but also beyond any feasible fix).
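
A rough illustration of the differences being discussed, assuming current DMD/Phobos:

```d
import std.range : ElementType, InputRange, isInputRange;
import std.stdio : writefln;

void main()
{
    char[] s = "héllo".dup;

    // foreach over char[] walks code units (char) by default, while asking
    // for dchar switches to decoded code points.
    size_t units, points;
    foreach (char c; s)  ++units;
    foreach (dchar d; s) ++points;
    writefln("%s code units, %s code points", units, points); // 6, 5

    // Range algorithms see the decoded view of char[]...
    static assert(is(ElementType!(char[]) == dchar));
    // ...a class-based InputRange!char really does yield char...
    static assert(is(ElementType!(InputRange!char) == char));
    // ...and char* is not a range at all; it has to be sliced first.
    static assert(!isInputRange!(char*));
}
```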

5. Funky anomalous issues with writing OutputRange!char (the put(T) must take a dchar)

Bad but not worth even a small breaking change.
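
For readers who have not hit it, a sketch of the anomaly (assuming current Phobos behaviour; `CharSink` is a made-up example type): the free function `std.range.put` transcodes character types, so even a sink that only deals in `char` gets driven with `dchar`.

```d
import std.range.primitives : isOutputRange, put;

// CharSink is a made-up output range that only knows about single chars.
struct CharSink
{
    char[] data;
    void put(char c) { data ~= c; }
}

void main()
{
    CharSink sink;
    dchar d = 'π';

    // std.range.put transcodes: handed a dchar, it encodes to UTF-8 and
    // feeds the code units to CharSink.put(char) one at a time.
    put(sink, d);
    assert(sink.data.length == 2);   // 'π' is two UTF-8 code units

    // Which is why even a char-only sink ends up counting as an output
    // range of dchar - the anomaly being complained about.
    static assert(isOutputRange!(CharSink, dchar));
}
```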

2) lack of convenient .raw property which will effectively do cast(ubyte[])

I've done the cast as a workaround, but when working with generic code it turns out the ubyte type becomes viral - you have to use it everywhere. So you end up with ubyte <=> char casts all over the place, in unexpected spots. You also wind up with ugly ubyte <=> dchar casts, with the commensurate risk that you goofed and have a truncation bug.

Of course it is viral, because you never want to have char[] at all if you don't work with Unicode (or work with it at the raw byte level). And in that case it is your responsibility to do manual decoding when appropriate. Trying to squeeze out that performance often means going low-level with all the associated risks; there is nothing special about char[] here. It is not a common use case.
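
A sketch of the cast churn being described, assuming current Phobos (`std.string.representation` is the usual way to get the `ubyte[]` view):

```d
import std.algorithm : splitter;
import std.stdio : writeln;
import std.string : representation;

void main()
{
    string line = "key=value";

    // Dropping to ubyte[] switches off auto-decoding for the whole
    // pipeline...
    auto bytes = line.representation;            // immutable(ubyte)[]

    // ...but every consumer downstream now deals in ubyte, so both the
    // separator and the results need char <=> ubyte casts at the edges.
    foreach (field; bytes.splitter(cast(ubyte) '='))
        writeln(cast(string) field);             // "key", then "value"
}
```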

Essentially, auto-decoding makes trivial code look better, but if you're writing a more comprehensive string-processing program and care about performance, it makes an ugly mess of things.

And this is how it should be. Again, I am all for creating a language that favors performance-critical power-programming needs over common/casual needs, but that is not what D is, and you have been making such choices consistently for quite a long time now (array literals that allocate; I will never forgive that). Suddenly changing your mind only because you have encountered this specific issue personally, as opposed to just reading reports, does not fit the role of a language author. It does not really matter whether the new approach itself is good or bad - being unpredictable is reputation damage D simply can't afford.
