On Thursday, August 02, 2012 01:14:30 Walter Bright wrote:
> On 8/2/2012 12:43 AM, Jonathan M Davis wrote:
> > It is for ranges in general. In the general case, a range of UTF-8 or
> > UTF-16 makes no sense whatsoever. Having range-based functions which
> > understand the encodings and optimize accordingly can be very beneficial
> > (which happens with strings but can't happen with general ranges without
> > the concept of a variably-length encoded range like we have with forward
> > range or random access range), but to actually have a range of UTF-8 or
> > UTF-16 just wouldn't work. Range-based functions operate on elements, and
> > doing stuff like filter or map or reduce on code units doesn't make any
> > sense at all.
>
> Yes, it can work.

How? If you operate on a range of code units, then you're operating on
individual code units, which almost never makes sense. There are plenty of
cases where a function which understands the encoding can avoid some of the
costs associated with decoding and whatnot, but since range-based functions
operate on their elements, if the elements are code units, a range-based
function will operate on individual code units with _no_ understanding of
the encoding at all. Ranges have no concept of encoding.

Do you really think that it makes sense for a function like map or filter to
operate on individual code units? Because that's what would end up happening
with a range of code units. Your average range-based function only makes
sense with _characters_, not code units. Functions which can operate on
ranges of code units without screwing up the encoding are a rarity.

Unless a range-based function special-cases a range type which is
variable-length encoded (e.g. string), it just isn't going to deal with the
encoding properly. It operates either on the encoded code units or on the
decoded values, depending on what its element type is.

I concur that operating on strings as code units is better from the
standpoint of efficiency, but it just doesn't work with a generic function
unless that function has a special case, which means that it _isn't_
generic.

- Jonathan M Davis
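
To make the point about element-wise functions concrete, here is a minimal sketch. It is a Python analogy rather than D or Phobos code: a str stands in for a range of code points and a bytes object for a range of UTF-8 code units. The names generic_filter and is_valid_utf8 are made up for illustration.

```python
def generic_filter(pred, elements):
    """Encoding-agnostic filter: it only ever sees individual elements."""
    return [e for e in elements if pred(e)]


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte sequence decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


text = "café"

# Elements are whole characters: removing one leaves valid text behind.
filtered_chars = "".join(generic_filter(lambda ch: ch != "é", text))

# Elements are UTF-8 code units: "é" is the two bytes 0xC3 0xA9.
units = text.encode("utf-8")

# The same generic filter, now rejecting a single code unit value.
# It has no idea that 0xA9 is the second half of a two-byte sequence.
kept = bytes(generic_filter(lambda b: b != 0xA9, units))

print(filtered_chars)        # the character was removed as a unit
print(is_valid_utf8(units))  # the untouched input is valid
print(is_valid_utf8(kept))   # a lone lead byte 0xC3 is left behind
```

The filter itself is identical in both calls; only the element type differs. Getting a correct result on the code unit version would require the function to know about the encoding, which is exactly the special case described above.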