Hi,

this is not a problem but a solution. I know some of you use the BerkeleyDB module to store data. Recently I tried to use UTF-8 keys and failed: when reading keys back, I sometimes got character strings and sometimes octet strings. I had used the following two filters to ensure that the data in the database are octet strings and the data I get back are character strings:

  $db->filter_fetch_key(sub { $_=Encode::decode('utf8', $_) });
  $db->filter_store_key(sub { $_=Encode::encode('utf8', $_) });

The problem is BerkeleyDB doesn't reset the UTF8 bit when storing data to @_ variables as in c_get() or db_get().

One possible solution is

  $db->filter_fetch_key(sub {
    Encode::_utf8_off($_);
    $_=Encode::decode('utf8', $_);
  });

The other/better one is to fix it in BerkeleyDB.xs. This is what the attached patch does. I have sent it to the author, Paul Marquess. Here is his reply:

On Tue 10 Mar 2009, Paul Marquess wrote:
> Your patch looks fine and should be ok to include in my development
> copy without any changes.

Torsten

-- 
Need professional mod_perl support? Just hire me: torsten.foert...@gmx.net

--- BerkeleyDB.xs~	2009-02-18 21:31:46.000000000 +0100
+++ BerkeleyDB.xs	2009-03-06 14:38:04.000000000 +0100
@@ -430,7 +430,10 @@
 #define getInnerObject(x) ((SV*)SvRV(sv))
 #endif
 
-#define my_sv_setpvn(sv, d, s) (s ? sv_setpvn(sv, d, s) : sv_setpv(sv, "") )
+#define my_sv_setpvn(sv, d, s) do { \
+	s ? sv_setpvn(sv, d, s) : sv_setpv(sv, ""); \
+	SvUTF8_off(sv); \
+    } while(0)
 
 #define GetValue_iv(h,k) (((sv = readHash(h, k)) && sv != &PL_sv_undef) \
 		? SvIV(sv) : 0)
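
For readers who want to see the effect without a database, here is a minimal sketch in plain Perl. It does not touch BerkeleyDB at all; it only simulates the symptom described in the message by switching the UTF8 flag on over raw octets with Encode::_utf8_on(), and then applies the body of the workaround filter. The sample key and the variable names are made up for illustration.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical sample key: one non-ASCII character followed by ASCII.
my $key = "\x{e4}bc";

# What the store filter writes: an octet string with the UTF8 flag off.
my $octets = Encode::encode('utf8', $key);

# Simulate the symptom: the same bytes, but with the UTF8 flag left on,
# as if they had been copied into a scalar that previously held characters.
my $stale = $octets;
Encode::_utf8_on($stale);

# Body of the workaround fetch filter, run on a localized copy in $_.
my $fixed = do {
    local $_ = $stale;
    Encode::_utf8_off($_);            # treat the buffer as octets again
    $_ = Encode::decode('utf8', $_);  # then decode to a character string
    $_;
};

printf "octets: length=%d flag=%s\n",
    length($octets), utf8::is_utf8($octets) ? 'on' : 'off';
printf "stale:  length=%d flag=%s\n",
    length($stale), utf8::is_utf8($stale) ? 'on' : 'off';
printf "fixed:  %s\n", $fixed eq $key ? 'matches original key' : 'differs';
```

The stale scalar holds exactly the same bytes as the octet string, yet Perl reports a different length for it because of the flag. That is why the workaround clears the flag before decoding, and why the patch calls SvUTF8_off() right after setting the buffer.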