On Thursday, 8 March 2018 at 17:35:11 UTC, H. S. Teoh wrote:
Yeah, the only reason autodecoding survived in the beginning was because Andrei (wrongly) thought that a Unicode code point was equivalent to a grapheme. If that had been the case, the cost associated with auto-decoding may have been justifiable. Unfortunately, that is not the case, which greatly diminishes most of the advantages that autodecoding was meant to have. So it ended up being something that incurred a significant performance hit, yet did not offer the advantages it was supposed to. To fully live up to Andrei's original vision, it would have to include grapheme segmentation as well. Unfortunately, graphemes are of arbitrary length and cannot in general fit in a single dchar (or any fixed-size type), and grapheme segmentation is extremely costly to compute, so doing it by default would kill D's string manipulation performance.
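To make the code unit / code point / grapheme distinction in that paragraph concrete, here is a small illustration of my own (the example string and counts are mine, not from Teoh's post): a string with a combining mark has more code points than graphemes, and only std.uni.byGrapheme gets the user-perceived count.

import std.range : walkLength;
import std.uni : byGrapheme;
import std.utf : byCodeUnit;

void main()
{
    // "noe\u0308l" is "noël" written with a combining diaeresis:
    // 6 UTF-8 code units, 5 code points, but only 4 graphemes.
    string s = "noe\u0308l";

    assert(s.byCodeUnit.walkLength == 6); // raw char code units
    assert(s.walkLength == 5);            // auto-decoded dchars (code points)
    assert(s.byGrapheme.walkLength == 4); // user-perceived characters
}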


I remember something a bit different from the last time it was discussed:

- removing auto-decoding would break a lot of code; it's used in lots of places
- the performance loss could be mitigated by using .byCodeUnit every time (see the sketch after this list)
- Andrei correctly advocated against the breakage
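A minimal sketch of that .byCodeUnit mitigation, under my own assumptions (the helper name and the semicolon search are just an example, not from this thread): wrapping the string makes the algorithm iterate raw char code units instead of decoding each element to dchar.

import std.algorithm.searching : canFind;
import std.utf : byCodeUnit;

// Hypothetical helper: scan a line for ';' without decoding UTF-8.
// byCodeUnit yields the raw char code units, so the search never pays
// the per-element decode that auto-decoding would otherwise impose.
bool hasSemicolon(string line)
{
    return line.byCodeUnit.canFind(';');
}

unittest
{
    assert(hasSemicolon("int x;"));
    assert(!hasSemicolon("résumé"));
}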

Personally I do use auto-decoding, often iterating by code point, and use it for fonts and parsers. It's correct for a large subset of languages. You gave us a feature and now we are using it ;)
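The pattern I mean looks roughly like this (the function name and the whitespace-counting task are made up for illustration): a foreach with a dchar loop variable decodes the UTF-8 on the fly, one code point per iteration, which is what a glyph lookup or a tokenizer typically wants.

import std.uni : isWhite;

// Hypothetical parser step: count the non-whitespace code points.
size_t countVisibleCodePoints(string text)
{
    size_t n;
    foreach (dchar cp; text) // decodes UTF-8 to one code point per step
    {
        if (!cp.isWhite)
            ++n;
    }
    return n;
}

unittest
{
    assert(countVisibleCodePoints("a b\tc") == 3);
    assert(countVisibleCodePoints("héllo") == 5); // 'é' counts once, not twice
}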
