Use the Encode module to test and convert back and forth between UTF-8
characters and bytes for the SQL_ASCII database. Assuming the input is already
UTF-8 text:
    use Encode qw(:all);
    # connect to db, prepare insert statement, etc.
    # turn the character string into UTF-8 bytes before handing it to DBI
    my $bytes = encode('utf8', $utf8_text);
    $sth->execute($bytes, $i)
        or errexit("execute of insert into public_suffixes tbl failed: ", $DBI::errstr);
If your input is not already UTF-8, you will have to call decode inside an eval
block to convert it from its original encoding, then check for failure before
re-encoding as UTF-8 and inserting into the database. Or something similar.
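
A rough sketch of that decode-in-eval check, in case it helps. The helper name
to_utf8_bytes and the cp1252 fallback are my own assumptions (cp1252 only
because the bad data reportedly comes from Windows cut and paste); substitute
whatever encoding your users actually send.

```perl
use strict;
use warnings;
use Encode qw(decode encode FB_CROAK LEAVE_SRC);

# Hypothetical helper: returns UTF-8 encoded bytes, or undef if the input
# cannot be decoded as strict UTF-8 or as the fallback encoding.
sub to_utf8_bytes {
    my ($raw, $fallback) = @_;
    $fallback ||= 'cp1252';    # assumption: Windows cut-and-paste text

    for my $enc ('UTF-8', $fallback) {
        # FB_CROAK makes decode die on malformed input, so eval can trap it.
        # LEAVE_SRC keeps $raw intact for the next attempt.
        my $chars = eval { decode($enc, $raw, FB_CROAK | LEAVE_SRC) };
        return encode('UTF-8', $chars) if defined $chars;
    }
    return;    # nothing worked; the caller should reject or log the input
}
```

In the CGI script you would call this on each incoming field and only run
$sth->execute when the result is defined.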
This seems to work for me. When I need to pull the data back out of the
database, I have to reconvert from the byte string into UTF-8 characters before
displaying the output.
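
The read side might look something like the following. Again only a sketch:
from_db_bytes is a made-up name, and the literal string stands in for a value
fetched from the database.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: turn a byte string fetched from the SQL_ASCII
# database into Perl characters; returns undef if the bytes are not UTF-8.
sub from_db_bytes {
    my ($bytes) = @_;
    my $chars = eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC) };
    return $chars;
}

# Encode characters back to UTF-8 automatically when printing.
binmode(STDOUT, ':encoding(UTF-8)');

my $text = from_db_bytes("na\xC3\xAFve");    # stand-in for a fetched column
print "$text\n" if defined $text;
```

After decoding, length() and regexes work on characters rather than bytes,
which is the main reason to do it before displaying anything.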
Susan
________________________________
From: [email protected]
[mailto:[email protected]] On Behalf Of Mike Blackwell
Sent: Thursday, July 21, 2011 7:49 AM
To: [email protected]
Subject: [GENERAL] SQL-ASCII database cleanup
I have an older database that was created with SQL-ASCII encoding. Over time
users have managed to enter all manner of interesting characters, mostly via
cut and paste from Windows documents. I'm attempting to clean up and
eventually convert the database to UTF8. I've managed to find most of the data that
won't nicely convert from some-random-encoding to UTF8, but it seems the users
are entering it as fast as I can find it. Is there a way the incoming data from
a Perl CGI web application can be automatically limited to UTF8 even though the
database is SQL-ASCII?
Mike