>From: Apache Wiki [mailto:[EMAIL PROTECTED]
>
>The new interface will need to make it easy to specify such a set of
>locales without explicitly naming them, and it will need to
>retrieve such locales without returning duplicates.
>
As mentioned before, I don't know a good way to avoid duplicates other
than to compare every attribute of each facet of each locale against
all of the other locales. Just testing whether the return from
setlocale() is the same as the input string is not enough. The user
could have installed locales that have unique names but are copies of
the data from some other locale.

>The interface should make it easy to
>express conjunction, disjunction, and negation of the terms
>(parameters) and support (a perhaps simplified version of)
>[http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html#tag_09_03
>Basic Regular Expression] syntax.

Conjunction, disjunction and negation? Are you saying you want to be
able to select all locales that are _not_ in some set, something like
you would get with a caret (^) in a grep expression? I'm hoping that
I'm just misunderstanding your comments. If not, then this is news to
me, and I'm curious how this addition is necessary to minimize the
number of locales tested [i.e. the objective].

>We've
>decided to use shell brace expansion as a means of expressing
>logical conjunction between terms: a valid brace expression is
>expanded to obtain a set of terms implicitly connected by a
>logical AND. Individual ('\n'-separated) lines of the query
>string are taken to be implicitly connected by a logical OR.
>This approach models the
>[http://www.opengroup.org/onlinepubs/009695399/utilities/grep.html
>grep] interface with each line loosely corresponding to
>the argument of the `-e` option to `grep`.
>

I've seen you mention the '\n'-separated list thing before, but I
still can't make sense of it. Are you saying that to select `en_US.*'
with a 1 byte encoding or `zh_*.UTF-8' with a 2, 3, or 4 byte
encoding, I would write the following query?

  const char* locales =
      rw_locale_query ("en_US.* 1\nzh_*.UTF-8 {2..4}", 10);

I don't see why that would be necessary.
You can do it with the following query using normal brace expansion,
and it's human readable.

  const char* locales =
      rw_locale_query ("{en_US.* 1,zh_*.UTF-8 {2..4}}", 10);

I know that the '\n' is how you'd use `grep -e', but does it really
make sense? We aren't using `grep -e' here.

Travis
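
P.S. To make the duplicate problem concrete, here is a rough sketch of
the kind of attribute-by-attribute comparison described above. It is
only an illustration, not a proposal for the test driver: the names
numeric_snapshot, take_snapshot and same_numeric_data are made up for
this example, and it looks at the LC_NUMERIC data only. A real check
would presumably have to do the same for every category and facet. It
relies on nothing beyond setlocale() and localeconv() from <locale.h>.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type: a copy of one locale's LC_NUMERIC data. */
struct numeric_snapshot {
    char decimal_point[32];
    char thousands_sep[32];
    char grouping[32];
};

/* Fill *snap with the LC_NUMERIC data of the named locale.
   Returns 0 on success, -1 if the locale cannot be set. */
int take_snapshot(const char *name, struct numeric_snapshot *snap)
{
    const struct lconv *lc;

    if (setlocale(LC_NUMERIC, name) == NULL)
        return -1;

    /* localeconv() returns a pointer to storage that later calls may
       overwrite, so copy the strings out right away. */
    lc = localeconv();
    snprintf(snap->decimal_point, sizeof snap->decimal_point,
             "%s", lc->decimal_point);
    snprintf(snap->thousands_sep, sizeof snap->thousands_sep,
             "%s", lc->thousands_sep);
    snprintf(snap->grouping, sizeof snap->grouping,
             "%s", lc->grouping);
    return 0;
}

/* Compare two locales by data rather than by name.
   Returns 1 if their LC_NUMERIC data match, 0 if they differ,
   and -1 if either locale is not available. */
int same_numeric_data(const char *name_a, const char *name_b)
{
    struct numeric_snapshot a, b;
    char saved[256];
    int result;

    assert(name_a != NULL && name_b != NULL);

    /* Remember the current setting so it can be restored afterwards. */
    snprintf(saved, sizeof saved, "%s", setlocale(LC_NUMERIC, NULL));

    if (take_snapshot(name_a, &a) != 0 || take_snapshot(name_b, &b) != 0)
        result = -1;
    else
        result = strcmp(a.decimal_point, b.decimal_point) == 0
              && strcmp(a.thousands_sep, b.thousands_sep) == 0
              && strcmp(a.grouping, b.grouping) == 0;

    setlocale(LC_NUMERIC, saved);
    return result;
}
```

Calling same_numeric_data ("C", "POSIX") is the simplest case of two
different names that refer to the same data. Doing this for every
pair of installed locales, across every category, is the quadratic
cost I was worried about.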