On Thursday, 2 June 2016 at 21:38:02 UTC, default0 wrote:
> On Thursday, 2 June 2016 at 21:30:51 UTC, tsbockman wrote:
>> 1) It does not say that level 2 should be opt-in; it says that level 2 should be toggle-able. Nowhere does it say which of level 1 and 2 should be the default.
>>
>> 2) It says that working with graphemes is slower than UTF-16 code UNITS (level 1), but says nothing about streaming decoding of code POINTS (what we have).
>>
>> 3) That document is from 2000, and its claims about performance are surely extremely out-dated, anyway. Computers and the Unicode standard have both changed much since then.
>
> 1) Right, because a special toggleable syntax is definitely not "opt-in".

It is not "opt-in" unless it is toggled off by default. The only reason it doesn't talk about toggling in the level 1 section, is because that section is written with the assumption that many programs will *only* support level 1.

> 2) Several people in this thread noted that working on graphemes is way slower than working on code points (which makes sense, because it's yet another processing step you need to do after decoding - therefore more work, therefore slower).

And working on code points is way slower than working on code units (the actual level 1).
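To make the cost ordering concrete, here is a minimal sketch using only ranges that Phobos already ships (`byCodeUnit` and `byDchar` from `std.utf`, `byGrapheme` from `std.uni`); each level does strictly more work per element than the one below it:

```d
import std.range : walkLength;
import std.utf : byCodeUnit, byDchar;
import std.uni : byGrapheme;

void main()
{
    // "noël" spelled with a combining diaeresis: 'e' (U+0065) + U+0308
    string s = "noe\u0308l";

    assert(s.byCodeUnit.walkLength == 6); // level 1: raw UTF-8 code units, zero decoding
    assert(s.byDchar.walkLength == 5);    // code points: UTF-8 decoded on every step
    assert(s.byGrapheme.walkLength == 4); // graphemes: decoding *plus* cluster segmentation
}
```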

> 3) Not an argument - doing more work makes code slower.

What do you think I'm arguing for? It's not graphemes-by-default.

What I actually want to see is this: permanently deprecate the auto-decoding range primitives, and force the user to explicitly specify whichever of `by!dchar`, `byCodePoint`, or `byGrapheme` their specific algorithm actually needs. Removing the implicit conversions between `char`, `wchar`, and `dchar` would also be nice, but isn't really necessary, I think.
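As a rough sketch of what call sites would look like under that scheme, using only ranges that exist in Phobos today (`std.utf.byCodeUnit` and `std.uni.byGrapheme`; note there is currently no `by!dchar` in Phobos - `std.utf.byDchar` is the nearest existing spelling):

```d
import std.algorithm.searching : canFind;
import std.uni : byGrapheme;
import std.utf : byCodeUnit;

void main()
{
    string s = "noe\u0308l";

    // A plain search needs only level 1: compare raw code units, decode nothing.
    bool found = s.byCodeUnit.canFind('l');

    // Taking the first *user-perceived* character genuinely needs graphemes,
    // so the caller asks for them - and pays for them - explicitly.
    auto first = s.byGrapheme.front;
}
```

Nothing decodes behind the caller's back; the level of abstraction, and its cost, is always visible at the call site.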

That would be a standards-compliant solution (one of several possible). What we have now is non-standard, at least going by the old version Walter linked.
