On Jun 16, 2010, at 12:04 AM, Henning Michael Møller Just wrote:

> Hello (loved your PostgreSQL presentation at the most recent OSCON, BTW)

Thanks. Come see my tutorial at OSCON this year, if you can: Test-Driven Database Development. :-) Not sure I can make a tutorial as entertaining, alas. Perhaps if I bring beer for the audience.

> Which editor do you use? When loading the script in Komodo IDE 5.2 the string
> looks broken. Running the script (ActivePerl 5.10.1 on Windows) only the
> second line is correct - the first (no surprise) and third are broken.

Yes, that's how it looks to me in GNU Emacs (compiled from source with Cocoa bindings).

> Loading the file in UltraEdit-32 13.20+3, set to not convert the script on
> loading, it becomes obvious that what should have been one character is
> represented by 4 bytes, \xC3 \x84 \xC2 \x8D, which modern editors would
> probably show as 2 characters and as broken.

Right.

> It looks to me like the string is being displayed as a byte representation of
> the characters, if that makes sense. My English isn't perfect :-/ and what I
> am trying to say is that this is a problem that I am quite familiar with. It
> happens whenever the source and the reader do not agree on whether a string
> is encoded in utf-8 or not.
>
> Apparently Encode fixes the incorrect string which is nice. The interesting
> thing is, where should this be fixed? If it's at Yahoo! Pipes you'll probably
> have to use Encode as a work-around for some time...

Yes.

Best,

David
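
For anyone following along, here is a rough sketch of how those four bytes can arise and be undone. It is written in Python purely for illustration and is not the Perl Encode call from the thread. The function names are made up for this example, and the example character č (U+010D) is inferred from the quoted bytes: \xC3 \x84 \xC2 \x8D is what you get when the UTF-8 bytes \xC4 \x8D are misread as Latin-1 and encoded to UTF-8 a second time. That the feed was mangled in exactly this way is an assumption.

```python
def double_encode(text: str) -> bytes:
    """Simulate the bug: UTF-8 bytes misread as Latin-1, then re-encoded as UTF-8.

    Hypothetical helper for illustration only.
    """
    utf8_bytes = text.encode("utf-8")
    misread = utf8_bytes.decode("latin-1")  # each byte becomes one character
    return misread.encode("utf-8")


def repair(data: bytes) -> str:
    """Undo one layer of double encoding by reversing the steps above.

    Hypothetical helper; assumes the input really was double-encoded.
    """
    misread = data.decode("utf-8")
    utf8_bytes = misread.encode("latin-1")  # recover the original UTF-8 bytes
    return utf8_bytes.decode("utf-8")


broken = double_encode("\u010d")  # "\u010d" is the character č
fixed = repair(broken)
print(broken, fixed)
```

The repair only makes sense on input that really was double-encoded. Applied to correctly encoded text it will either raise an error or corrupt it, which is why fixing the problem at the source is preferable to a permanent work-around.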