Re: VLERange: a range in between BidirectionalRange and RandomAccessRange

spir Wed, 12 Jan 2011 11:58:27 -0800

On 01/12/2011 08:28 PM, Don wrote:

> I think the only problem that we really have, is that "char[]",
> "dchar[]" implies that code points is always the appropriate level of
> abstraction.

I'd like to know when it happens that codepoint is the appropriate level
of abstraction.

* If pieces of text are not manipulated, meaning just used in the
  application, or just transferred via the application as is (from file /
  input / literal to any kind of output), then any kind of encoding just
  works. One can even concatenate, provided all pieces use the same
  encoding. --> _lower_ level than codepoint is OK.
* But any kind of manipulation (indexing, slicing, compare, search, count,
  replace, not to speak about regex/parsing) requires operating at the
  _higher_ level of characters (in the common sense). Just like with
  historic character sets in which codes used to represent characters (not
  lower-level thingies as in UCS). Else, one reads, compares, changes
  meaningless bits of text.
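To make both points concrete, here is a small sketch in Python (chosen only for illustration, it is not part of the D proposal). It assumes the text is UTF-8 and uses the standard `unicodedata` module. The variable names are made up for this example.

```python
import unicodedata

# The same user-perceived character "é", spelled two ways.
precomposed = "\u00e9"   # one code point: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # two code points: "e" + COMBINING ACUTE ACCENT

# Pass-through use: concatenating the raw UTF-8 bytes just works,
# because both pieces use the same encoding.
joined = precomposed.encode("utf-8") + decomposed.encode("utf-8")
roundtrip_ok = joined.decode("utf-8") == precomposed + decomposed

# Counting gives a different answer at each level, and neither level
# agrees on "one character" for both spellings.
utf8_lengths = (len(precomposed.encode("utf-8")),
                len(decomposed.encode("utf-8")))          # (2, 3) code units
codepoint_lengths = (len(precomposed), len(decomposed))   # (1, 2) code points

# Comparing at code point level says the two spellings differ.
equal_raw = precomposed == decomposed                     # False

# Only after normalizing to one representation do they compare equal.
equal_nfc = (unicodedata.normalize("NFC", precomposed)
             == unicodedata.normalize("NFC", decomposed)) # True

# Indexing by code point cuts the character apart: the accent is lost.
first = decomposed[0]                                     # "e"
```

Concatenation survives at the byte level, but counting, comparing and indexing all give answers that mean nothing in terms of characters until the text is normalized and grouped.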


As I see it now, we need 2 types:

* One plain string similar to good old ones (bytestring would do the
  job, since most unicode is utf8 encoded) for the first kind of use
  above. With optional validity check when it's supposed to be unicode
  text.
* One higher-level type abstracting from codepoint (not code unit)
  issues, restoring the necessary properties: (1) each character is one
  element in the sequence (2) each character is always represented the
  same way.
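A rough sketch of what the two types could look like, again in Python purely for illustration. Everything named here (`is_valid_utf8`, `Text`, `chars`) is hypothetical. Two simplifying assumptions: NFC normalization stands in for "always represented the same way", and a character is approximated as a base code point plus any following combining marks, which is cruder than full Unicode grapheme segmentation.

```python
import unicodedata


def is_valid_utf8(data: bytes) -> bool:
    """Optional validity check for the plain byte-string type."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


class Text:
    """Hypothetical character-level string: one element per character."""

    def __init__(self, source: str):
        # Property (2): one representation per character (assumed: NFC).
        normalized = unicodedata.normalize("NFC", source)
        # Property (1): group each base code point with the combining
        # marks that follow it (a simplification, see lead-in).
        self.chars = []
        for cp in normalized:
            if self.chars and unicodedata.combining(cp) != 0:
                self.chars[-1] += cp
            else:
                self.chars.append(cp)

    def __len__(self):
        return len(self.chars)

    def __getitem__(self, index):
        if isinstance(index, slice):
            piece = Text("")
            piece.chars = self.chars[index]
            return piece
        return self.chars[index]

    def __eq__(self, other):
        return isinstance(other, Text) and self.chars == other.chars

    def __str__(self):
        return "".join(self.chars)

    def count(self, char: str) -> int:
        # Normalize the needle the same way as the stored characters.
        return self.chars.count(unicodedata.normalize("NFC", char))
```

With this sketch, `Text("e\u0301") == Text("\u00e9")` holds, `len(Text("xq\u0303y"))` is 3 even though "q" plus combining tilde has no precomposed form, and slicing never splits a base letter from its marks.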



Denis
_________________
vita es estrany
spir.wikidot.com
