On Fri, Sep 21, 2001 at 03:52:13PM -0700, Paul Prescod wrote:
> At some point you have two strings and the engine is asked to
> concatenate them and it can't ask either of them to do the job itself.

Correct, at which point it has to transcode.

> have tried to deal with this are Unicode and ISO 2022 which seems dead.
 
Some transcodings are trivial; for instance, transcoding between UTFs
can be done completely algorithmically. Transcoding between a "native"
string and Unicode is a little trickier - obviously, it's trivial for
ASCII and also for ISO 8859-1, but for other 8-bit encodings you have to
specify a table to go through for the top 128 characters. This can be done
relatively efficiently.
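
As a rough sketch of that table approach (Python, purely for illustration;
the names make_high_table, native_to_unicode and CYRILLIC are made up here,
and the table is borrowed from Python's own iso8859_5 codec rather than
written out by hand):

```python
def make_high_table(codec_name):
    """Build the 128-entry table for bytes 0x80..0xFF.

    Assumption: codec_name is a single-byte codec that defines all of
    the top 128 byte values (iso8859_5 does).
    """
    return [bytes([b]).decode(codec_name) for b in range(0x80, 0x100)]


def native_to_unicode(data, high_table=None):
    """Map a single-byte "native" string to a Unicode string.

    With no table the input is treated as ISO 8859-1, where the byte
    value is already the Unicode code point.
    """
    out = []
    for b in data:
        if b < 0x80:
            out.append(chr(b))                # ASCII: identity mapping
        elif high_table is None:
            out.append(chr(b))                # ISO 8859-1: identity mapping
        else:
            out.append(high_table[b - 0x80])  # everything else: table lookup
    return "".join(out)


CYRILLIC = make_high_table("iso8859_5")
```

The table costs 128 entries per encoding and the lookup is a single index
per byte, which is about as cheap as this can get for 8-bit encodings.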

> Another example is when one string is a regular expression (in one
> encoding) and the other is a string to match against (in another
> encoding).

Yeah, Perl 5 hit this problem running and solved it. Badly, but it solved
it. :)

> If the interpreter has a built-in concept of regular expression or
> string concatenation (rather than dispatching these to the types) then
> it needs to have a built-in understanding of the semantics of encoding
> combination. 

Correct.

> I don't think you can define that without "standardizing"
> on Unicode or some other unifying character set.

Well, one of the whole *reasons* for Unicode is that it can be a pivot
for transcoding. So this isn't entirely unreasonable.
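
To make the pivot idea concrete, here is a minimal sketch in Python (only
for illustration, not a proposal for how the engine should do it; transcode
and concat are hypothetical names, and the encodings are whatever codecs
Python happens to ship):

```python
def transcode(data, src, dst):
    """Convert bytes from encoding src to encoding dst via Unicode."""
    text = data.decode(src)   # source encoding -> Unicode (the pivot)
    return text.encode(dst)   # Unicode -> target encoding


def concat(a, a_enc, b, b_enc, out_enc="utf-8"):
    """Concatenate two byte strings that use different encodings.

    Neither string can do the job by itself, so both are lifted to
    Unicode first and the result is encoded once at the end.
    """
    return (a.decode(a_enc) + b.decode(b_enc)).encode(out_enc)
```

With a pivot you need one table per encoding instead of one per pair of
encodings, so N encodings cost N tables rather than N*(N-1).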

-- 
Um. There is no David conspiracy. Definitely not. No Kate conspiracy either.
No. No, there is definitely not any sort of David conspiracy, and we are
definitely *not* in league with the Kate conspiracy. Who doesn't exist. And
nor does the David conspiracy. No. No conspiracies here. - Thorfinn, ASR
