> >What do you mean by character size if it does not support variable
> >length?
> 
> Well, if strings are to be treated relatively abstractly, and we still
> want to poke around through the string buffer, we need to know how big
> a character is.

I agree with this. I think support for variable-length encodings should be
included.
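
To illustrate why a single fixed "character size" cannot describe a
variable-length encoding, here is a small sketch. Python is used only for
illustration, and char_widths is a hypothetical helper, not part of any
proposed API. It reports how many bytes each character occupies in UTF-8:

```python
def char_widths(text, encoding="utf-8"):
    """Return (character, byte length) pairs for text in the given encoding."""
    return [(ch, len(ch.encode(encoding))) for ch in text]


# 'a' (U+0061), e-acute (U+00E9) and a CJK ideograph (U+4E2D)
# take 1, 2 and 3 bytes respectively in UTF-8.
widths = char_widths("a\u00e9\u4e2d")
```

So for such an encoding, "character size" can at best mean a minimum or
maximum width, not one constant.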

> I'm thinking locale is, in some ways, like tainting where it's really a 
> property of the data rather than a property of the code region.

I think you misunderstand my point. It is not "a property of the code
region" but "a property of the context in which the code is running". For
example, people in Taiwan read traditional Chinese characters, while people
in the PRC read simplified Chinese. Even if we take the same data and the
same program (code), people just read differently. As an end user, I want
to make that decision. It will drive me crazy if Perl renders/displays a
text file using traditional Chinese just because it was tagged as "Big5".

> Yep, I fully agree. (Well, I'm not sure of the ASCII restriction on the 
> name, but I can live with that as a lowest-common-denominator 
> sort of thing)

> >The byte based is more useful. I have utf-8, and I want to substr it
> >to another utf-8. It is painful to convert it or linear search for
> >character position.
> 
> The pain is the reason for specifying it in the API. If we force the pain 
> to be local to the encoding then it means that we don't have 
> to embed it in the core.

If it is a common API, I would like to specify it in the core, so that each
encoding implementation can follow it strictly. I believe it is common
enough.
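
A rough sketch of the difference, in Python for illustration only. The
names substr_bytes and char_to_byte_offset are hypothetical, and checking
the slice by decoding it is my own assumption, not something proposed in
this thread. The byte-based substr is a plain slice, while a
character-based one first needs a linear scan to find the byte offset:

```python
def substr_bytes(buf: bytes, start: int, length: int) -> bytes:
    """Slice a UTF-8 buffer by byte offsets, without converting the string."""
    piece = buf[start:start + length]
    # Assumption: reject offsets that cut a character in half.
    piece.decode("utf-8")  # raises UnicodeDecodeError on a split character
    return piece


def char_to_byte_offset(buf: bytes, char_index: int) -> int:
    """Linear search: byte offset of the char_index-th character in UTF-8."""
    offset = 0
    seen = 0
    while seen < char_index:
        if offset >= len(buf):
            raise IndexError("character index out of range")
        offset += 1
        # Skip continuation bytes, which have the bit pattern 10xxxxxx.
        while offset < len(buf) and (buf[offset] & 0xC0) == 0x80:
            offset += 1
        seen += 1
    return offset


# "a", e-acute, a CJK ideograph, "b": 1 + 2 + 3 + 1 = 7 bytes in UTF-8.
buf = "a\u00e9\u4e2db".encode("utf-8")
middle = substr_bytes(buf, 3, 3)        # the 3-byte ideograph, found directly
start = char_to_byte_offset(buf, 2)     # same position, found by scanning
```

The scan costs time proportional to the character index, which is the pain
I would like to avoid when I already have byte offsets.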

Hong
