The charset conversion that happens in LDAP is actually quite specialized. The general functionality of converting from one charset to another already exists in APR in the form of apr_xlate_xxx(). LDAP is only interested in converting the user ID from a given charset to UTF-8. Up until auth_ldap calls ap_get_basic_auth_pw(), the user ID and password are still encoded in the "Authorization" header entry. Until the user ID and password have been decoded, the conversion to UTF-8 cannot occur. Therefore the conversion must take place within auth_ldap, or any other authentication module, after decoding the user information.

A module or filter outside of the authentication module that does a blind charset conversion on the header information would not work, because it would not be able to decode the user ID and password, convert them, and re-encode them in order to make the process transparent to all authentication modules. (Actually you could probably make it work for base64, but what about digest?)

On the other hand, the one place where the conversion could be done is within the call to ap_get_basic_auth_pw(). But ap_get_basic_auth_pw(), or whatever function handles decoding digest authentication, would have to be modified so that it has access to the "Accept-Language" header values. This would allow it to convert from the browser's assumed charset to UTF-8 or any other charset. The downside is that the "Accept-Language" header value does not guarantee that this is the charset the browser used when it sent the request. It is simply an indicator of the language(s) the browser will accept, from which the charset can only be guessed. Auth_LDAP would be utilizing this functionality to at least attempt to do the right thing rather than always failing.

I do agree that we need some type of functionality that will convert requests made in a particular charset to a universal charset that Apache can rely on. I'm just not sure this is it. It seems to work for auth_LDAP, but I'm not sure how to generalize it.
This is where a much broader discussion needs to take place.
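
To make the ordering argument concrete, here is a minimal sketch in Python (not the auth_ldap C code) of why the conversion has to come after the Basic credentials are decoded. The function names decode_basic_credentials() and user_to_utf8() are made up for illustration, and the assumed charset is simply passed in by the caller. How that charset is chosen is a separate question.

```python
import base64


def decode_basic_credentials(header_value):
    """Split a 'Basic <base64>' header value into raw (user, password) bytes.

    Hypothetical helper: it stands in for the decoding step that
    ap_get_basic_auth_pw() performs inside the server.
    """
    scheme, _, payload = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic Authorization header")
    raw = base64.b64decode(payload.strip())
    user, sep, password = raw.partition(b":")
    if not sep:
        raise ValueError("malformed credentials, no ':' separator")
    return user, password


def user_to_utf8(user_bytes, assumed_charset):
    """Convert the user ID from the assumed browser charset to UTF-8 bytes.

    The charset is only an assumption; if it is wrong, decode() may raise
    UnicodeDecodeError or silently produce the wrong characters.
    """
    return user_bytes.decode(assumed_charset).encode("utf-8")
```

Only the bytes returned by the decoding step can be converted; the base64 text in the header itself looks the same regardless of which charset the browser used, which is why a blind conversion on the header does nothing useful.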
Brad Nicholes
Senior Software Engineer
Novell, Inc., the leading provider of Net business solutions
http://www.novell.com

>>> [EMAIL PROTECTED] Thursday, December 12, 2002 4:09:57 AM >>>
> This patch eliminates the hardcoded charset table. Instead it
> reads the charset table from a conf file. The directive
> AuthLDAPCharsetConfig allows the admin to specify the charset conf
> file. Is there also a need to specify additional conversions
> directly in the httpd.conf file through a different directive? It
> seems that the charset conf file would be sufficient. If there are
> multiple charsets per language, these can be set by specifying the
> 5 character language ID rather than the 2 character ID similar to
> the example in the charset.conv file for chinese.

As nd said, if someone needs additional conversion, he will scream for it. :-)

But something else is going around in my head. Why should this charset conversion be limited to LDAP? Well, I don't know where else we need the conversion table. But the table itself should be generally available to all modules. Maybe some other modules would like to do the same. A core (?) directive like LanguageCharsetConfig might be much more useful than AuthLDAPCharsetConfig.

So the next step would be to move the conversion function to core or APR as well. Each module that needs a conversion can call this function instead of having its own code.

Maybe there is some overlap with mod_charset_lite, which also does charset conversion.

Kess
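
For reference, the table lookup discussed above (5 character language ID first, then the 2 character ID) could be sketched as follows. This is a Python illustration under stated assumptions, not the patch itself: the file layout (language ID, charset, optional description, '#' comments) is assumed, the function names are hypothetical, and q-values in Accept-Language are ignored for simplicity.

```python
def parse_charset_table(text):
    """Parse conf-file text into a {language_id: charset} dictionary.

    Assumed layout, one entry per line: '<lang-id> <charset> [description]'.
    Anything after '#' is treated as a comment.
    """
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # skip lines that do not name a charset
        table[fields[0].lower()] = fields[1]
    return table


def charset_for_accept_language(accept_language, table):
    """Return the charset guessed from an Accept-Language value, or None.

    Each listed language is tried in order: first the full ID
    (e.g. 'zh-tw'), then its 2 character prefix (e.g. 'zh').
    """
    for item in accept_language.split(","):
        lang = item.split(";", 1)[0].strip().lower()
        if not lang:
            continue
        if lang in table:
            return table[lang]
        primary = lang.split("-", 1)[0]
        if primary in table:
            return table[primary]
    return None
```

A table like this is not tied to LDAP in any way, which is the argument for exposing it to all modules: any caller that has the Accept-Language value can ask for a charset guess and then decide for itself whether to trust it.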