On Thursday, August 02, 2012 15:14:17 Walter Bright wrote: > Remember, it's the consumer doing the decoding, not the input range.
But that's the problem. The consumer has to treat character ranges specially to make this work. It's not generic. If it were generic, then it would simply be using front, popFront, etc. It's going to have to special case strings to do the buffering that you're suggesting. And if you have to special case strings, then how is that any different from what we have now? If you're arguing that strings should be treated as ranges of code units, then pretty much _every_ range-based function will have to special case strings to even work correctly - otherwise it'll be operating on individual code units rather than code points (e.g. filtering code units rather than code points, which can generate an invalid string).

This makes the default behavior incorrect, forcing _everyone_ to special case strings _everywhere_ if they want correct behavior with ranges which are strings. And efficiency means nothing if the result is wrong. As it is now, the default behavior of strings with range-based functions is correct but inefficient, so at least we get correct code. And if someone wants their string processing to be efficient, then they special case strings and do things like the buffering that you're suggesting.

So, we have correct by default with efficiency as an option. The alternative that you seem to be suggesting (treating strings as ranges of code units) would be fast by default but correct only as an option, which is completely backwards IMHO. Efficiency is important, but no amount of it helps if the answer is wrong, and expecting the average programmer to write Unicode-aware code that functions correctly is completely unrealistic.

- Jonathan M Davis
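The filtering hazard above can be sketched in Python (the thread is about D, but the UTF-8 mechanics are identical); the predicate `value < 0xC0` is just an arbitrary example filter, not anything from the thread:

```python
s = "café"
units = s.encode("utf-8")  # code units: b'caf\xc3\xa9' ('é' is the 2-unit sequence C3 A9)

# Filtering at the code-unit level: dropping any unit >= 0xC0 removes the
# lead byte 0xC3 but keeps the continuation byte 0xA9, stranding it. The
# result is no longer valid UTF-8.
unit_filtered = bytes(b for b in units if b < 0xC0)
try:
    unit_filtered.decode("utf-8")
except UnicodeDecodeError as e:
    print("invalid UTF-8:", e)  # 0xA9 is a continuation byte with no lead byte

# Filtering at the code-point level: the same predicate applied to decoded
# characters drops 'é' (U+00E9) as a whole, so the result is always a valid
# string - correct, though it pays the cost of decoding every character.
point_filtered = "".join(c for c in s if ord(c) < 0xC0)
print(point_filtered)  # prints "caf"
```

The two results differ precisely because the code-unit view lets a predicate split a multi-unit sequence down the middle, which is the failure mode a correct-by-default range of code points rules out.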