Re: httpd and locales
* Garrett Rooney wrote:

> > It doesn't belong here, but... I'm wondering why the path isn't passed as UTF-8. Why is it translated to the locale at all? It's all happening within the svn file system, so I'd really expect to get utf-8 and would consider locale translation as a bug.
>
> Well, I imagine that the assumption is that any hook script is going to be using the actual locale specified in LANG/LC_ALL/etc env variables, so if we don't translate to that locale it'll get rather confused by utf8 data in its command line. As a general rule svn translates from native -> utf8 on input and from utf8 -> native for output.
>
> Ironically, if the LANG/LC_ALL/etc env vars were being followed by httpd this translation would be a noop, since the system uses a utf8 locale...

So whether the users of a repository (httpd or svnserve) may use the full Unicode range for their files depends on the locale of the server? That feels just wrong ;-)

I don't see how there is any command line confusion... As long as one references files enclosed in the filesystem, no translation should occur at all. It's just Unicode (in UTF-8 format). The only part of the Subversion system which should deal with filename recodings of repository-stored paths should be a client.

But as said, this doesn't belong here.

nd
Re: httpd and locales
André Malo wrote:

> * Garrett Rooney wrote:
>
> > > It doesn't belong here, but... I'm wondering why the path isn't passed as UTF-8. Why is it translated to the locale at all? It's all happening within the svn file system, so I'd really expect to get utf-8 and would consider locale translation as a bug.
> >
> > Well, I imagine that the assumption is that any hook script is going to be using the actual locale specified in LANG/LC_ALL/etc env variables, so if we don't translate to that locale it'll get rather confused by utf8 data in its command line. As a general rule svn translates from native -> utf8 on input and from utf8 -> native for output.
> >
> > Ironically, if the LANG/LC_ALL/etc env vars were being followed by httpd this translation would be a noop, since the system uses a utf8 locale...
>
> So whether the users of a repository (httpd or svnserve) may use the full Unicode range for their files depends on the locale of the server? That feels just wrong ;-)
>
> I don't see how there is any command line confusion...

You're confusing the content of the SVN repository and hook scripts stored on the local filesystem. Paths in the first are always encoded in UTF-8. The latter naturally have to obey the server's locale.

-- Brane
Re: httpd and locales
* Branko Čibej wrote:

> You're confusing the content of the SVN repository and hook scripts stored on the local filesystem. Paths in the first are always encoded in UTF-8. The latter naturally have to obey the server's locale.

I don't think so. The task was to pass the name of a file stored in the repository to a hook script via the command line. Otherwise I must have misunderstood something quite heavily.

nd
--
The only things that survive a building collapse (or even a thermonuclear war) unscathed are cockroaches and AOL CDs. -- Bastian Lipp in dcsm
Re: httpd and locales
On 1/19/06, André Malo wrote:

> * Branko Čibej wrote:
>
> > You're confusing the content of the SVN repository and hook scripts stored on the local filesystem. Paths in the first are always encoded in UTF-8. The latter naturally have to obey the server's locale.
>
> I don't think so. The task was to pass the name of a file stored in the repository to a hook script via the command line. Otherwise I must have misunderstood something quite heavily.

That is correct, it's an argument to the hook script that happens to contain the path of a file in the repository. Currently all arguments are transcoded from utf8 to native before we execute the hook script.

-garrett
Re: httpd and locales
On Thu, Jan 19, 2006 at 11:09:13AM -0800, Garrett Rooney wrote:

> On 1/19/06, André Malo wrote:
>
> > * Branko Čibej wrote:
> >
> > > You're confusing the content of the SVN repository and hook scripts stored on the local filesystem. Paths in the first are always encoded in UTF-8. The latter naturally have to obey the server's locale.
> >
> > I don't think so. The task was to pass the name of a file stored in the repository to a hook script via the command line. Otherwise I must have misunderstood something quite heavily.
>
> That is correct, it's an argument to the hook script that happens to contain the path of a file in the repository. Currently all arguments are transcoded from utf8 to native before we execute the hook script.

I really don't think that relying on that working properly is a good idea. All it takes is for one rogue PHP script to set the locale to some odd locale to be able to print currency symbols properly or whatever, and the hook scripts would start behaving really strangely.

As a module author, presuming the locale is undefined is the safest bet, and as an administrator, starting the server in the C locale is the safest bet.

joe
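
For illustration only, a hook script that follows the "presume the locale is undefined" advice might look roughly like the Python sketch below. It recovers the raw bytes of each argument and decodes them with an explicitly chosen encoding instead of trusting the process locale. The helper name `decode_hook_arg` and the UTF-8 default are assumptions made for this sketch: per the thread, Subversion currently transcodes arguments to the native encoding, so UTF-8 is only right if the server actually hands over UTF-8 bytes.

```python
import os


def decode_hook_arg(arg: str, encoding: str = "utf-8") -> str:
    """Decode one command-line argument without relying on the locale.

    `arg` is an entry of sys.argv, which Python has already decoded using
    the filesystem encoding. os.fsencode() reverses that step and yields
    the original bytes, which are then decoded with the encoding the
    administrator expects (hypothetically UTF-8 here).
    """
    raw = os.fsencode(arg)
    return raw.decode(encoding)


# Typical use inside a hook script (not executed here):
#
#     import sys
#     path = decode_hook_arg(sys.argv[2])
```

If the bytes are not valid in the chosen encoding, the call raises `UnicodeDecodeError`, which at least fails loudly rather than producing a silently mangled path.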
Re: httpd and locales
On Wed, Jan 18, 2006 at 11:17:30AM -0800, Garrett Rooney wrote:

> Is there any particular reason that httpd never does the 'setlocale(LC_ALL, "");' magic necessary to get libc to respect the various locale related environment variables? As far as I can tell, despite system settings for locale (e.g. /etc/sysconfig/i18n on RHEL) httpd always runs with a locale of C, which is fine for most things, but pretty irritating if you have a need to do stuff with multibyte strings in a module.
>
> Just adding a call to setlocale with a locale of "" in httpd's main makes my particular problem go away, but I'm kind of hesitant to propose actually doing so since I don't know what kind of fallout there would be from having httpd all of a sudden start respecting the environment variables...

Ideally the locale shouldn't matter, but in practice it does: notably strcasecmp() and the is* functions behave differently. This can cause things to fail in surprising ways, so it's generally to be avoided. Various modules will do it at startup anyway, so it's hard to avoid completely, but it's not something that I'd really advise propagating.

joe
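
As a rough illustration of the call being discussed, Python's `locale` module wraps the same C `setlocale()` function. The sketch below resets the process to the C locale, to mimic the state httpd is described as running in, and then adopts whatever LANG/LC_ALL request. The function name `adopt_environment_locale` is made up for this example, and the resulting locale string depends entirely on the machine's environment.

```python
import locale


def adopt_environment_locale() -> str:
    """Python equivalent of C's setlocale(LC_ALL, "").

    Returns the locale string in effect afterwards.
    """
    try:
        return locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        # The environment names a locale this system does not provide;
        # the previous setting stays in effect, so just report it.
        return locale.setlocale(locale.LC_ALL)


# Start from the plain C locale, as httpd is described as doing.
locale.setlocale(locale.LC_ALL, "C")
print("before:", locale.setlocale(locale.LC_CTYPE))

# Now follow the environment variables, as the proposed change would.
print("after: ", adopt_environment_locale())
```

The switch is process-wide, which is the concern raised above: every piece of code in the process sees the new locale, whether or not it was written with that in mind.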
Re: httpd and locales
On 1/18/06, Joe Orton wrote:

> On Wed, Jan 18, 2006 at 11:17:30AM -0800, Garrett Rooney wrote:
>
> > Is there any particular reason that httpd never does the 'setlocale(LC_ALL, "");' magic necessary to get libc to respect the various locale related environment variables? As far as I can tell, despite system settings for locale (e.g. /etc/sysconfig/i18n on RHEL) httpd always runs with a locale of C, which is fine for most things, but pretty irritating if you have a need to do stuff with multibyte strings in a module.
> >
> > Just adding a call to setlocale with a locale of "" in httpd's main makes my particular problem go away, but I'm kind of hesitant to propose actually doing so since I don't know what kind of fallout there would be from having httpd all of a sudden start respecting the environment variables...
>
> Ideally the locale shouldn't matter, but in practice it does: notably strcasecmp() and the is* functions behave differently. This can cause things to fail in surprising ways, so it's generally to be avoided. Various modules will do it at startup anyway, so it's hard to avoid completely, but it's not something that I'd really advise propagating.

The specific problem I'm trying to fix is that mod_dav_svn fails to run a pre-lock hook script when you try to lock a filename with double-byte characters. It never even gets to the point of trying to run the script: it fails while building the command line, because it can't convert the filename from utf8 to the native encoding. The locale is C, so the native encoding is 7-bit ASCII.

I'm having trouble finding a workaround for this that doesn't involve setting the locale, although if there's anything obvious I'm missing I'd love to hear it.

-garrett
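
The failure mode described here can be reproduced in miniature. The Python sketch below is only an analogy, not Subversion's code: `utf8_to_native` and the sample path are invented for illustration, and the native encoding is passed in as a plain parameter rather than derived from the locale.

```python
def utf8_to_native(path_utf8: bytes, native_encoding: str) -> bytes:
    """Re-encode a UTF-8 path into a (hypothetical) native encoding.

    Raises UnicodeEncodeError when the path contains characters the
    native encoding cannot represent.
    """
    text = path_utf8.decode("utf-8")
    return text.encode(native_encoding)


# A repository path containing multibyte characters, stored as UTF-8.
repo_path = "trunk/日本語.txt".encode("utf-8")

# With a UTF-8 native encoding the conversion changes nothing.
assert utf8_to_native(repo_path, "utf-8") == repo_path

# With 7-bit ASCII (what the C locale implies) the conversion fails
# before any hook command line could be built.
try:
    utf8_to_native(repo_path, "ascii")
except UnicodeEncodeError as exc:
    print("cannot build hook command line:", exc.reason)
```

Plain ASCII paths convert fine either way, which is presumably why the problem only shows up once someone locks a file with a non-ASCII name.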
Re: httpd and locales
* Garrett Rooney wrote:

> The specific problem I'm trying to fix is that mod_dav_svn fails to run a pre-lock hook script when you try to lock a filename with double-byte characters. It never even gets to the point of trying to run the script: it fails while building the command line, because it can't convert the filename from utf8 to the native encoding. The locale is C, so the native encoding is 7-bit ASCII.
>
> I'm having trouble finding a workaround for this that doesn't involve setting the locale, although if there's anything obvious I'm missing I'd love to hear it.

It doesn't belong here, but... I'm wondering why the path isn't passed as UTF-8. Why is it translated to the locale at all? It's all happening within the svn file system, so I'd really expect to get utf-8 and would consider locale translation as a bug.

nd
--
Gates's behavior had proved to me that I need not count on him and his two companions -- Karl May, Winnetou III
Something new in the West: http://pub.perlig.de/books.html#apache2
Re: httpd and locales
On 1/18/06, André Malo wrote:

> * Garrett Rooney wrote:
>
> > The specific problem I'm trying to fix is that mod_dav_svn fails to run a pre-lock hook script when you try to lock a filename with double-byte characters. It never even gets to the point of trying to run the script: it fails while building the command line, because it can't convert the filename from utf8 to the native encoding. The locale is C, so the native encoding is 7-bit ASCII.
> >
> > I'm having trouble finding a workaround for this that doesn't involve setting the locale, although if there's anything obvious I'm missing I'd love to hear it.
>
> It doesn't belong here, but... I'm wondering why the path isn't passed as UTF-8. Why is it translated to the locale at all? It's all happening within the svn file system, so I'd really expect to get utf-8 and would consider locale translation as a bug.

Well, I imagine that the assumption is that any hook script is going to be using the actual locale specified in LANG/LC_ALL/etc env variables, so if we don't translate to that locale it'll get rather confused by utf8 data in its command line. As a general rule svn translates from native -> utf8 on input and from utf8 -> native for output.

Ironically, if the LANG/LC_ALL/etc env vars were being followed by httpd this translation would be a noop, since the system uses a utf8 locale...

-garrett