On Mon, May 09, 2011 at 07:42:53PM +0100, Martin J. Evans wrote:
> I've recently had an rt posted
> (http://rt.cpan.org/Public/Bug/Display.html?id=67994) after a
> discussion on stackoverflow
> (http://stackoverflow.com/questions/5912082/automatic-character-encoding-handling-in-perl-dbi-dbdodbc).
>
> In this case the Perl script is binding the columns but the data
> returned is windows-1252 and the user is having to manually
> Encode::decode all bound columns. DBD::ODBC already had a
> odbc_utf8_on flag
> (http://search.cpan.org/~mjevans/DBD-ODBC-1.29/ODBC.pm#odbc_utf8_on)
> for a derivative of Postgres which returns bound data UTF-8 encoded
> but in that case I can just call sv_utf8_decode (in the XS) and it
> is converted in place. Initially I thought I could combine
> odbc_utf8_on into a new flag saying my data is returned as xxx and
> just call Encode::decode with xxx (then eventually I could drop
> odbc_utf8_on).

I'd be wary of going down this path. I sense pain just beyond the
horizon. A twisty-turny maze of sharp edge cases and unforeseen issues.

For a start:
What about the charset of bind values?
What about the charset of SQL literals?

Can't the database connection/session settings be altered to assume utf8
at the client end and have the server or client libs automatically
convert for you? If so, that's a good way to go.

Tim.