Re: [rt-users] Bad characters in names loaded from LDAP (AD)

Bill Cole Mon, 10 Oct 2016 20:47:28 -0700

On 10 Oct 2016, at 16:26, Jan Burian wrote:

Hi all,
we have RT 4.4.0 on CentOS 7 and Perl v5.22.1. And we are starting to
use RT in production.

We configured RT to authenticate users via LDAP
(RT::Authen::ExternalAuth::LDAP). Our LDAP server is MS AD (Win 2008R2).

[...]

Authentication is working fine. Users can log in, if the user doesn't
exist in RT the account is autocreated. All the configured attributes
are transferred.

This is a strong sign that the LDAP part is working correctly. If the LDAP server (AD) and client (Perl's Net::LDAP module) are using mismatched encodings, it is likely to show up in authentication failures due to incompatible encodings of the same (logical) characters that 8-bit encodings assign to byte values 0x80-0xff.
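To make the byte-level point concrete, here is a minimal sketch in Python (chosen only for brevity; it is not part of RT or Net::LDAP). The sample name is a made-up example, not taken from the original report. It shows that the same logical characters produce different byte sequences in cp1250 and UTF-8, so a byte-wise comparison between two sides using different encodings would not match.

```python
# Illustration only: "Jiří" is a hypothetical sample name.
name = "Jiří"

utf8_bytes = name.encode("utf-8")      # multi-byte sequences for the accented letters
cp1250_bytes = name.encode("cp1250")   # one byte per letter, accented ones in 0x80-0xff

# Bytes above 0x7f are where the two encodings disagree.
high_utf8 = [hex(b) for b in utf8_bytes if b >= 0x80]
high_cp1250 = [hex(b) for b in cp1250_bytes if b >= 0x80]

print(utf8_bytes, high_utf8)
print(cp1250_bytes, high_cp1250)
print("byte-wise equal:", utf8_bytes == cp1250_bytes)
```

Plain ASCII names encode identically in both, which is why such a mismatch tends to surface only for users with accented characters in the compared values.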

Fortunately, it is somewhere between arcane and impossible to make Net::LDAP use anything other than UTF-8. There's *probably* some way to make it do T.61 for ancient-history compatibility, but that's mostly pointless.


[...]

We had a similar problem with Moodle. When we configured Moodle against
Active Directory and set cp1250 encoding, it did exactly the same
thing. After we changed the encoding for the LDAP connector to utf-8, the
names were corrected.

Which makes sense: LDAP v3 by default uses UTF-8 and you have a modern system with a mature LDAP client. I know of no way to configure a CentOS 7/Perl 5.22 system such that the LDAP interaction with an AD LDAP server talking UTF-8 would be the source of this sort of encoding conflict. I'm mildly surprised that anything talking LDAPv3 can be made to use cp1250 encoding, but I suppose Microsoft makes their own rules to go along with their own unique code pages.


[...]

Also I read that MS AD in LDAP protocol version 3 returns any string to
the LDAP client in utf-8 encoding.
I really don't know where the problem could be.

The most likely place is in your database. I'm guessing that you are using MySQL, which defaults to latin1 encoding. When you store a UTF-8 string into a latin1 table, it breaks any multi-byte characters into 2 or 3 characters, but the right bits are still there. This issue has come up a few times on this list over the past decade and I think Best Practical has documented how to safely convert a RT database with that sort of problem from latin1 to utf8. It is probably worth looking through their docs (possibly one of the UPGRADING* files?) and the RT Wiki for a solution. I expect it could be done with a binary dump of the database, altering of any latin1 tables to use utf8, and a re-import of the binary dump. I'm not enough of a MySQL expert to detail that process (I generally use Postgres where possible.)
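The "right bits are still there" point can be sketched in a few lines of Python. This only simulates the effect with Python's own codecs and a made-up sample name; it is not the MySQL conversion procedure, and MySQL's latin1 is not byte-for-byte identical to Python's "latin-1" codec, so treat it as an illustration of the principle rather than a repair tool.

```python
def read_utf8_as_latin1(text: str) -> str:
    """Simulate UTF-8 bytes being interpreted as latin1: one character per byte."""
    return text.encode("utf-8").decode("latin-1")


def reinterpret_as_utf8(garbled: str) -> str:
    """Undo the misinterpretation: recover the raw bytes, then decode them as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")


original = "Jiří"                          # hypothetical sample name
garbled = read_utf8_as_latin1(original)    # each accented letter becomes two characters

print(len(original), len(garbled))
print(reinterpret_as_utf8(garbled) == original)
```

Because the underlying bytes are unchanged, reinterpreting them with the correct encoding recovers the original text, which is the same idea behind a binary dump followed by a re-import with the tables declared as utf8.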

