Net::LDAP::LDIF and UTF8 data

Alexander Bergolth Thu, 29 Jul 2004 08:07:37 -0700

Hi!

Since attribute values should be UTF-8 encoded, shouldn't Net::LDAP::LDIF explicitly convert values that are character strings to UTF-8?

--- LDIF.pm.orig        2004-07-29 15:45:25.000000000 +0200
+++ LDIF.pm     2004-07-29 15:44:54.000000000 +0200
@@ -296,7 +296,9 @@
     my $ln = $lower ? lc $attr : $attr;
     if ($v =~ /(^[ :]|[\x00-\x1f\x7f-\xff])/) {
       require MIME::Base64;
-      $ln .= ":: " . MIME::Base64::encode($v,"");
+      use Encode;
+      $ln .= ":: " .MIME::Base64::encode(
+        Encode::is_utf8($v)? encode_utf8($v) : $v,"");
     }
     else {
       $ln .= ": " . $v;

Otherwise, a Perl string that contains only code points up to 0xFF (e.g. a Latin-1 string) is represented internally as 8-bit characters and is written as Base64-encoded Latin-1, not as Base64-encoded UTF-8.
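To make the difference concrete, here is a minimal standalone sketch (not part of LDIF.pm; the sample value and variable names are only for illustration) that Base64-encodes the same Perl string once as-is and once after converting it to UTF-8 octets:

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);
use MIME::Base64 qw(encode_base64);

# Illustrative value: "M", U+00FC (u with diaeresis), "ller".
# All code points are <= 0xFF, so Perl stores it as 8-bit characters.
my $v = "M\xFCller";

# Encoding the string as-is emits the single Latin-1 byte 0xFC.
my $latin1_b64 = encode_base64($v, "");

# Converting to UTF-8 octets first emits the two bytes 0xC3 0xBC instead.
my $utf8_b64 = encode_base64(encode_utf8($v), "");

print "as-is: $latin1_b64\n";
print "UTF-8: $utf8_b64\n";
```

The two printed values differ, and only the second decodes to valid UTF-8. Note that for a string like this the internal UTF8 flag is normally off, so a check based on Encode::is_utf8 alone may not catch this case.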

The same applies to all methods that communicate with the LDAP server. Shouldn't they convert Perl character strings to UTF-8 octet strings (as above) when sending data to the server?
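As a rough sketch of what such a conversion step could look like, the helper below applies the same test as the patch above. The name to_ldap_octets is hypothetical and does not exist in Net::LDAP; it only illustrates the idea:

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of Net::LDAP): return UTF-8 octets for
# values Perl has flagged as character strings; pass other values through.
sub to_ldap_octets {
    my ($v) = @_;
    return Encode::is_utf8($v) ? Encode::encode_utf8($v) : $v;
}

# U+20AC (euro sign) is above 0xFF, so this string carries the UTF8 flag.
my $chars  = "\x{20AC}uro";
my $octets = to_ldap_octets($chars);

printf "%d characters -> %d octets\n", length($chars), length($octets);
```

Values that are already octet strings (e.g. binary attributes) pass through unchanged, which is the reason for testing the flag instead of encoding unconditionally.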

Cheers,
--leo
--
-------------------------------------------------------
Alexander (Leo) Bergolth          [EMAIL PROTECTED]
WU-Wien - Zentrum fuer Informatikdienste - Projektbuero
