On Jun 16, 2010, at 12:04 AM, Henning Michael Møller Just wrote:

> Hello (loved your PostgreSQL presentation at the most recent OSCON, BTW)

Thanks. Come see my tutorial at OSCON this year, if you can: Test-Driven 
Database Development. :-) Not sure I can make a tutorial as entertaining, alas. 
Perhaps if I bring beer for the audience.

> Which editor do you use? When loading the script in Komodo IDE 5.2 the string 
> looks broken. Running the script (ActivePerl 5.10.1 on Windows) only the 
> second line is correct - the first (no surprise) and third are broken.

Yes, that's how it looks to me in GNU Emacs (compiled from source with Cocoa 
bindings).

> Loading the file in UltraEdit-32 13.20+3, set to not convert the script on 
> loading, it becomes obvious that what should have been one character is 
> represented by 4 bytes, \xC3 \x84 \xC2 \x8D, which modern editors would 
> probably show as 2 characters and as broken.

Right.
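
For anyone following along, here is a small sketch of how one character can 
turn into those four bytes. It is written in Python purely for illustration 
(the script under discussion is Perl), and it assumes the intended character 
is U+010D, which is what the bytes \xC4 \x8D decode to as UTF-8. The variable 
names are made up for this example.

```python
# Assumption: the intended character is U+010D (c with caron), since its
# UTF-8 encoding is the two bytes C4 8D.
original = "\u010d"

# Step 1: correct UTF-8 encoding, two bytes.
utf8_once = original.encode("utf-8")

# Step 2: a reader wrongly treats those bytes as Latin-1, so each byte
# becomes its own character (U+00C4 and U+008D).
misread = utf8_once.decode("latin-1")

# Step 3: the misread text is encoded as UTF-8 again, giving four bytes.
utf8_twice = misread.encode("utf-8")

print(utf8_once.hex(" "))   # c4 8d
print(utf8_twice.hex(" "))  # c3 84 c2 8d
```

The second line of output matches the byte sequence seen in UltraEdit, which 
is consistent with the text having been UTF-8 encoded twice somewhere along 
the way.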

> It looks to me like the string is being displayed as a byte representation of 
> the characters, if that makes sense. My English isn't perfect :-/ and what I 
> am trying to say is that this is a problem that I am quite familiar with. It 
> happens whenever the source and the reader do not agree on whether a string 
> is encoded in utf-8 or not.
> 
> Apparently Encode fixes the incorrect string which is nice. The interesting 
> thing is, where should this be fixed? If it's at Yahoo! Pipes you'll probably 
> have to use Encode as a work-around for some time...

Yes.
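
A rough sketch of what such a work-around could look like, in Python rather 
than Perl and not using the Encode module itself. The function name 
repair_double_utf8 is hypothetical, and the approach assumes the damage came 
from UTF-8 bytes being misread as Latin-1 and then encoded again; other kinds 
of mojibake would need a different fix.

```python
def repair_double_utf8(text: str) -> str:
    """Undo one round of UTF-8-read-as-Latin-1 damage, if present.

    Hypothetical helper for illustration. If the text does not look
    double-encoded, it is returned unchanged.
    """
    try:
        # Map each character back to the single byte it was misread from,
        # then decode those bytes as the UTF-8 they originally were.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text has characters outside Latin-1 (so it was not
        # misread this way) or the bytes are not valid UTF-8: leave it alone.
        return text


# The four bytes from the feed, decoded as UTF-8, give two wrong characters.
broken = b"\xc3\x84\xc2\x8d".decode("utf-8")
fixed = repair_double_utf8(broken)
print(len(broken), len(fixed))  # 2 1
```

The try/except is there so that strings which are already correct pass 
through untouched. Plain ASCII text is unaffected either way, though a guard 
like this is a heuristic and cannot tell every case apart.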

Best,

David
