On Mon, Oct 21, 2013 at 12:45:58AM +0200, Martin Pelikan wrote:
> > > > Obviously, our locale support still sucks, this patch is mostly
> > > > providing the API for filling the blanks later.
> >
> > Which blanks exactly? Locale features we don't have, such as collation?
>
> Yes. The features why for example PostgreSQL won't sort tables
> correctly, which if you live in a country with weird characters in their
> language is... quite unfortunate.
>
> I was planning on bringing specifically LC_COLLATE support for a long
> time, but it's quite a lot of work. (and testing, and bugfixing with
> languages I don't even know existed)

Indeed, doing collation properly (i.e. with Unicode, not just 8 bit
characters like FreeBSD does) really is a non-trivial effort.
It requires some expertise in linguistics and a solid understanding of
the Unicode standard. You'd need to make use of something like ICU
(icu-project.org) to keep your sanity, or implement a whole lot of that
code base yourself...

> > > How much did the ramdisks grow by when you built release with this?
> > > Having just freed up a bunch of space on the ramdisks, I'll be pissed if
> > > we squander it all immediately.
>
> No objections against #ifndef SMALL_KERNEL-ing the big bits.

This is about userland parts of the ramdisk. Locales don't affect
the kernel. Bloat in ramdisk libc is avoided by compiling API stubs
instead of the usual source files. Or some special-case #ifdef other
than SMALL_KERNEL.

> > I'm not very excited about xlocale. If the only goal here is an API
> > shim to compile a C++ library, can't we put the shim somewhere else
> > than libc? Like the misc/libutf8 port we used to have?
>
> Thought about it too, but since apps expect to find this stuff in libc,

Applications don't care where a symbol comes from. Build scripts and
Makefiles might expect them to be in libc and would need to link an
additional library, but that's trivial to do.

> I went for a libc diff hoping that porters will have their lives easier.
> The functions I ported were the ones ld-2.17 complained about. I have
> no idea whether that port is complete and I don't claim the diff to be
> ready. It gets the job done at the cost of being huge and probably
> wrong in places, and is open for discussion.

In my opinion, if you're putting something into libc it should be a
correct and functional implementation, not an API stub that doesn't
really provide the advertised functionality. Libc is not a stash for
missing symbols that exist on other operating systems.

> I don't care about xlocale either.
> What I'd like is to have C++11
> working out of the box for the next release (Is that real?) and
> hopefully collation support some time in the future.

I think you should tackle your goals (C++11, collation) in isolation.
They aren't coupled, really.

> Later in the
> process I noticed there is an even smaller shim intended for Solaris as
> a part of libcxx,

Ah. So perhaps the right approach for now is to reuse that shim?

> but my thoughts were:
> - Locale has always been a pain in the ass, but something users demand.
>   (or is it just me with postgresql?)

You'll have to be a bit more specific about what's wrong with
the current locale implementation and what user demands we don't
serve. Otherwise it's hard to have a productive discussion.

> - Sharing this stuff with FreeBSD will make our lives easier should
>   anything go wrong with it. Less work for us + satisfied customers.

The FreeBSD collation implementation only works for characters from Latin1.
It is based on code from 1995:
http://svnweb.freebsd.org/base?view=revision&revision=6485
There haven't been any functional enhancements since.
If we bother with collation I think we should try to do better.

> - We don't have to be complete, or even advertise it very much. But
>   stuff that is increasingly popular (like C++11) will work out of the
>   box. The ability to use modern toolchains for ports should make the
>   latency-savvy desktop users happier.

Then again we must resist adding stubs to our base libraries to make
3rd party stuff happy. Otherwise the quality of base will suffer in
the long term.

> - Since a lot of operating systems have now adopted solutions (be it
>   xlocale or others), I suspect libcxx maintainers won't be very happy
>   about #ifdef __OpenBSD__ <remove half of the functionality>

So they're happy about their Solaris shim, but won't take an OpenBSD one?

> Please correct me if the philosophy is wrong.
> Or better, suggest other
> ways forward :-)

Since your priorities clearly lie with C++11 and not locale support
in our libc, I think you should try to make libcxx more portable.
Going the other way of adding a proper xlocale implementation to libc
is going to be frustrating for you if it is not a goal in itself.

I would suggest implementing a small and non-OS-specific stub for libcxx
that they can use on any OS lacking xlocale support. Replace/enhance
the existing shim for Solaris as part of this effort. Work with the
libcxx team to integrate your changes there. If they tell you that
they won't run on a non-xlocale OS that isn't Solaris, implement the
shim anyway and add it to our ports tree.

The shim is going to be a lot less work, and doesn't preclude an
implementation inside libc at a later stage.
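
To make the idea of a non-OS-specific stub a bit more concrete, here is
a rough sketch in C. It is only an illustration, not the libcxx Solaris
shim and not a proposed diff: every shim_* name below is hypothetical,
and the set of wrapped functions is picked arbitrarily. The point is
just that each *_l style entry point can ignore its locale argument and
fall back to the plain libc function, which works on any OS whether or
not it has xlocale.

```c
/*
 * Hypothetical locale shim. All shim_* names are made up for this
 * example; they do not come from libcxx or any libc.
 */
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Opaque stand-in for locale_t on a system without xlocale. */
typedef struct shim_locale { int unused; } *shim_locale_t;

/* The only locale the shim knows about: the process-wide one. */
static struct shim_locale shim_global_locale;

/* Stand-in for newlocale(): always hands back the same dummy object. */
shim_locale_t
shim_newlocale(void)
{
	return &shim_global_locale;
}

/* Each wrapper drops the locale argument and defers to plain libc. */
double
shim_strtod_l(const char *s, char **end, shim_locale_t loc)
{
	(void)loc;
	return strtod(s, end);
}

int
shim_isdigit_l(int c, shim_locale_t loc)
{
	(void)loc;
	return isdigit((unsigned char)c);
}

int
shim_strcoll_l(const char *a, const char *b, shim_locale_t loc)
{
	(void)loc;
	/* Collation is only as good as the underlying strcoll(). */
	return strcoll(a, b);
}
```

A caller would obtain a handle with shim_newlocale() and pass it to the
wrappers, e.g. shim_strtod_l("1.5x", &end, loc). Per-locale behaviour is
of course not provided, which is exactly why something like this belongs
next to libcxx or in ports rather than in libc.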