On Wed, Sep 10, 2003 at 11:42:23AM +0100, Steve Hay wrote:
> Bart Lateur wrote:
>
> >On Wed, 10 Sep 2003 10:40:29 +0100, Steve Hay wrote:
> >
> >>But the question was: How can I arrange for such conversions to be
> >>performed automatically by DBI whenever it receives or returns data?
> >
> >Well, there are two options... either the database stores a flag somewhere
> >indicating that some string is in UTF8, or you have to add that
> >information yourself. For the latter, I don't know if it'll actually
> >work, but it seems like an appropriate way to do it: add a "BOM" marker
> >at the start of the string.
>
> I don't think MySQL 3.x stores any flag to indicate that a string is
> UTF8, and even if it did I'm not aware of anything in DBI or DBD-mysql
> that would make use of it, e.g. to decode data flagged in such a way
> into Perl's internal format.
>
> Adding a BOM myself to the string seems to have problems of its own (see
> http://www.unicode.org/unicode/faq/utf_bom.html#27), and again I'm not
> aware of DBI / DBD-mysql having anything in them that would make use of
> such a BOM. Please correct me if I'm wrong - that could be just the
> sort of thing that I'm looking for here.

Basically it should be the job of the drivers to set the utf8 flag on
data being retrieved if it is utf8. I believe that the new mysql v4.1
protocol does provide information about the character set of each column.
DBD::mysql can use that.
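
To make "setting the utf8 flag" concrete, here is a rough sketch of the difference between undecoded bytes and a decoded Perl string. It uses the core Encode module by hand and is only an illustration of the effect a driver would need to produce, not anything DBD::mysql is known to do; the sample string is made up.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Raw UTF-8 bytes for "cafe" with an accented e, as a driver might
# hand them back without any character set information attached.
my $bytes = "caf\xc3\xa9";

# Decoding turns the bytes into a character string with the utf8 flag on.
my $chars = decode('UTF-8', $bytes);

printf "bytes: length=%d flag=%s\n",
    length($bytes), utf8::is_utf8($bytes) ? 'on' : 'off';   # length=5 flag=off
printf "chars: length=%d flag=%s\n",
    length($chars), utf8::is_utf8($chars) ? 'on' : 'off';   # length=4 flag=on
```

Until the data is decoded, Perl counts the accented character as two separate bytes, which is why length, regexes and case mapping go wrong on unflagged data.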

For people stuck with older versions of mysql, a driver-private
option could be used to indicate that all char fields are utf8,
or there could be some way of indicating that per column, such as

    $sth->bind_col(1, undef, { mysql_charset => 'utf8' });

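
In the meantime the same per-column idea can be approximated in application code. A minimal sketch, assuming you know which columns hold utf8: decode_utf8_columns is a hypothetical helper (not part of DBI or DBD::mysql), and the array ref below merely stands in for a row from $sth->fetchrow_arrayref.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode the given (0-based) columns of a fetched
# row from UTF-8 bytes into Perl character strings. NULLs (undef) are
# left untouched.
sub decode_utf8_columns {
    my ($row, @cols) = @_;
    for my $i (@cols) {
        next unless defined $row->[$i];
        $row->[$i] = decode('UTF-8', $row->[$i]);
    }
    return $row;
}

# Stand-in for a fetched row: an integer id, a utf8 name, a NULL.
my $row = [ 42, "na\xc3\xafve", undef ];

decode_utf8_columns($row, 1, 2);

printf "name has %d characters\n", length($row->[1]);   # 5, not 6 bytes
```

Doing this in the driver would be better, since the application would not have to repeat the column list for every query.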
Tim.