jo3c wrote:
I'm trying to get some information out of a Chinese Windows Server 2003
Active Directory system,
so let's say the encoding is probably Big5 or UTF-8.

The Unicode encoding of LDAP attributes with syntax Directory String is always UTF-8 (e.g. attributes 'cn', 'sn', 'givenName' or 'displayName').
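As a minimal sketch of what that means in practice: the values of such attributes can simply be decoded as UTF-8. The snippet assumes Python 3, where python-ldap returns attribute values as bytes (the session in this thread is Python 2, where `unicode(value, 'utf-8')` does the same job). The `entry` dict, the `TEXT_ATTRS` set and the `decode_entry` helper are hypothetical names for illustration, not part of python-ldap.

```python
# Hypothetical entry shaped like the search result quoted in this thread.
entry = {
    'cn': [b'\xe6\xb1\x9f\xe6\x9f\x8f\xe5\xa3\x95'],
    'displayName': [b'\xe6\xb1\x9f\xe6\x9f\x8f\xe5\xa3\x95'],
}

# Attributes assumed here to have Directory String syntax.
TEXT_ATTRS = {'cn', 'sn', 'givenName', 'displayName'}


def decode_entry(entry, text_attrs=TEXT_ATTRS):
    """Return a copy of entry with Directory String values decoded as UTF-8."""
    decoded = {}
    for attr, values in entry.items():
        if attr in text_attrs:
            decoded[attr] = [value.decode('utf-8') for value in values]
        else:
            # Leave other (possibly binary) attributes untouched.
            decoded[attr] = values
    return decoded


decoded = decode_entry(entry)
# ascii() shows the code points even on a terminal without Chinese fonts.
print(ascii(decoded['cn'][0]))
```

Attributes outside the set are passed through unchanged, since binary values are not guaranteed to be valid UTF-8.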

What I'm doing is similar to ldapsearch in the shell, but with my Python script
using the Python ldap module.

The result is not in the correct encoding.

What exactly did you expect?

 'cn': ['\xe6\xb1\x9f\xe6\x9f\x8f\xe5\xa3\x95'],

>>> unicode('\xe6\xb1\x9f\xe6\x9f\x8f\xe5\xa3\x95','utf-8')
u'\u6c5f\u67cf\u58d5'

I cannot tell whether this Unicode string of length 3 is correct since I cannot read Chinese and I probably don't have the necessary fonts installed. At least it decodes as UTF-8, which is correct at the LDAP level.
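Even without the fonts, the decoded value can be inspected by code point. The following is only a sketch, assuming Python 3 and the byte string from the search result above; `decodes_as` is a hypothetical helper for trying a candidate encoding such as the Big5 guess, not something python-ldap provides.

```python
import unicodedata

raw = b'\xe6\xb1\x9f\xe6\x9f\x8f\xe5\xa3\x95'


def decodes_as(data, encoding):
    """Return True if data is a valid byte sequence in the given encoding."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# Try the two encodings guessed in the original question.
for candidate in ('utf-8', 'big5'):
    print(candidate, decodes_as(raw, candidate))

text = raw.decode('utf-8')
print(len(text))
for ch in text:
    # Print the code point and its Unicode character name.
    print('U+%04X' % ord(ch), unicodedata.name(ch, '<unnamed>'))
```

A successful decode only shows that the bytes are well-formed in that encoding; whether the characters are the intended name still has to be checked by someone who can read them.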

 'displayName': ['\xe6\xb1\x9f\xe6\x9f\x8f\xe5\xa3\x95'],

>>> unicode('\xe6\xb1\x9f\xe6\x9f\x8f\xe5\xa3\x95','utf-8')
u'\u6c5f\u67cf\u58d5'

Maybe you should provide the original Unicode string (e.g. in Python syntax) and tell us how you store that into your AD server. Note that the tools used to maintain AD are also part of the game.
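For example, a comparison along these lines would settle it. This is a sketch assuming Python 3; `expected` is a placeholder for whatever name was actually entered into AD (here it is just the decoded value from above), and `matches_stored` is a hypothetical helper.

```python
# Placeholder: replace with the string you believe was stored in AD.
expected = '\u6c5f\u67cf\u58d5'

# Raw value as returned for 'cn' in the search result quoted above.
stored = b'\xe6\xb1\x9f\xe6\x9f\x8f\xe5\xa3\x95'


def matches_stored(expected_text, stored_bytes):
    """Compare the UTF-8 encoding of the expected text with the LDAP value."""
    return expected_text.encode('utf-8') == stored_bytes


print(matches_stored(expected, stored))
# On a mismatch, the two reprs show exactly where the values differ.
print(ascii(expected), repr(stored))
```

If the comparison fails, the difference was most likely introduced by the tool that wrote the value into AD rather than by the LDAP search.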

Ciao, Michael.
--
http://mail.python.org/mailman/listinfo/python-list
