Re: [Pharo-dev] [squeak-dev] Re: String >> #=

Andres Valloud Wed, 28 May 2014 13:25:29 -0700

Hey Philippe,

Yes but #= is blissfully unaware of normalization in Squeak/Pharo. In
fact AFAIK Squeak/Pharo is unaware of normalization. Having a short look
at it doesn't even look as if case insensitivity worked in Squeak/Pharo
outside of Latin-1 (I could be wrong though).

Yes, that's what I am thinking about. To be more explicit, suppose
"Unicode" series of characters got into the image via the keyboard, a
file, a socket... once decoded, what could one do with them? Are all
types of decoded character series going to be represented as instances
of a single class, although they have inherently different behavior?

In addition you probably don't want #= to do normalization "because
performance". And even if you did you probably still want a fast path
for ByteString receiver and ByteString argument in which case #size is safe.

Assuming all fixed width representation strings (e.g. byte strings) will
always have the same encoding (e.g. same code page), then the size check
for those seems ok to me.
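
To make the size-check point concrete, here is a minimal sketch in Python rather than Smalltalk, so it says nothing about how Squeak/Pharo implements #=. The helper names fast_equal and normalized_equal are made up for illustration. It shows that a size check is safe when comparing code point by code point, but would reject canonically equivalent strings if equality were defined after normalization.

```python
import unicodedata


def fast_equal(a: str, b: str) -> bool:
    """Code-point-wise equality with a size check first (no normalization)."""
    if len(a) != len(b):
        return False
    return a == b


def normalized_equal(a: str, b: str) -> bool:
    """Equality under canonical equivalence: normalize both sides to NFC first."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


precomposed = "\u00e9"   # e-acute as a single code point
decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT

# The two spellings differ in size, so the size check rejects them...
print(len(precomposed), len(decomposed))
print(fast_equal(precomposed, decomposed))

# ...although they compare equal once both are normalized.
print(normalized_equal(precomposed, decomposed))

# In a fixed-width single-byte encoding such as Latin-1 there are no
# combining sequences, so equal text always has equal size.
print(len(precomposed.encode("latin-1")))
```

For two strings that share one fixed-width encoding, the early size test cannot produce a false negative, which is the case the fast path above is meant for.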

Just to make sure, I am not celebrating all this complexity in the
world... however, given that it's there, how are we going to deal with
it? I'm concerned about the long term consequences of making things
more complex than they are by reinterpreting them. The problem I see is
that ultimately programs just won't Work(TM).


Andres.
