Re: low hanging fruit while cleaning up test failures
Travis Vitek wrote:
> Travis Vitek wrote:
>> Martin Sebor wrote:
>>> My only requirement is to get those tests to pass in a reasonable
>>> amount of time (i.e., without timing out), and without compromising
>>> their effectiveness.
>>>
>>> > Do we want to give up on the locale name matching, or do we want to
>>> > include zh_CN in the list of locales to test? What about matching the
>>> > encoding? Should we ignore all of this and just find one locale for
>>> > each value of MB_CUR_MAX from 1 to MB_LEN_MAX and run the test on them?
>>>
>>> Maybe. I'll let you propose what makes the most sense to you :)
>>>
>>> Martin
>>
>> Well, the AIX I'm testing on has 683 installed locale files. Of those,
>> many are links to locales with different names. For example, we have
>>
>> $ locale -a | grep "_CN" | grep -v "\."
>> ZH_CN
>> Zh_CN
>> zh_CN
>> $ ls -l /usr/lib/nls/loc/ZH_CN
>> lrwxrwxrwx 1 binbin bin 28 Feb 8 2008 /usr/lib/nls/loc/ZH_CN -> /usr/lib/nls/loc/ZH_CN.UTF-8
>> $ ls -l /usr/lib/nls/loc/Zh_CN
>> lrwxrwxrwx 1 binbin bin 28 Feb 8 2008 /usr/lib/nls/loc/ZH_CN -> /usr/lib/nls/loc/Zh_CN.GB18030
>> $ ls -l /usr/lib/nls/loc/zh_CN
>> lrwxrwxrwx 1 binbin bin 28 Feb 8 2008 /usr/lib/nls/loc/ZH_CN -> /usr/lib/nls/loc/zh_CN.IBM-eucCN
>>
>> The locales that are mapped to [ZH_CN.UTF-8, Zh_CN.GB18030,
>> zh_CN.IBM-eucCN] also appear in the locale list, so we have many
>> duplicated locales. So, for an immediate reduction in the number of tested
>> locales, we could eliminate these duplicates. How to tell if a locale is a
>> duplicate? I'm not sure.
>>
>> Another option would be to ignore all locales that don't match the regular
>> expression "[a-z][a-z]_[A-Z][A-Z]([EMAIL PROTECTED])?$" or the fnmatch
>> expressions "[a-z][a-z]_[A-Z][A-Z]" and "[EMAIL PROTECTED]". The C/POSIX
>> locales don't match this, but we can explicitly allow them.
>>
>> This alone cuts the number of locales down significantly, though it does
>> affect other platforms. Here is a small table showing the total number of
>> locales, and the number of locales that match the above regular
>> expression.
>>
>>           Okay  Total
>> AIX        226    603
>> Compaq      33     40
>> HP-UX      142    160
>> Irix        39     60
>> Linux      479    582
>> Solaris    223    331
>>
>> Another option is to build up a list of all installed locales [their names
>> and other properties], and then provide a mechanism to search through, or
>> iterate over that list. If you want to run a test on all locales that have
>> a name matching some expression, you write a function or function object
>> to return true on match. You pass that to the rw_locales_match() routine,
>> and it gives you the first match. Call again to get the next match or
>> null.
>>
>> for (const rw_locale_entry* e = rw_locales_match(0, fun);
>>      e; e = rw_locales_match(e, fun))
>> {
>> }
>>
>> If you want to select only locales with mb_cur_max of 4, you either write
>> a filter, or you explicitly iterate over the list. If we really decide
>> that it is necessary to write up a SQL type language for selecting
>> locales, then that system can be implemented on top of this.
>>
>> Travis
>
> Ah, my primitive scheme above isn't quite good enough. The time to run the
> 22.locale.ctype.is test was 28m35s, and I've reduced it down to 6m28s with
> an 11s build on AIX. The test would have timed out at 5 minutes.

Because we're still testing far too many locales...

> Now that I've seen that, it makes me wonder about the other proposal and
> the SQL-like query string idea. If we get a locale from the system, we
> don't have access to the original data that was in the ASCII source file.
> We just get the data presented from the C/C++ locale. This means that we
> have to discover information about the locale [like the mb_cur_max value].
> This may take considerable time.

Maybe we should start by putting together a comprehensive list of
locales installed on all our systems and their properties: std-country,
std-lang, std-codeset, native-name, aliases, MB_CUR_MAX, ...(anything
else of interest)...

std-country (ISO-3166):
http://www.iso.org/iso/english_country_names_and_code_elements
std-lang (ISO-639):
http://www.loc.gov/standards/iso639-2/php/English_list.php
std-codeset:
http://www.iana.org/assignments/character-sets

We could then create a database mapping each of the set of standard
names to the platform-specific names on every supported OS. We'd also
need the ability to ask for one out of a set of options (i.e., give me
a locale that corresponds to en_US.UTF-8 if it exists, or else
en_US.ISO-8859-1, or an en_US locale in any encoding, or if even
that's not available, anything you've got in English ;-)

Martin
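The "one out of a set of options" lookup described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the types locale_info and locale_query, the function find_preferred(), and all field names are hypothetical and are not part of the stdcxx test driver; the database is assumed to have been filled in by some other means.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One row of the hypothetical locale database (all names are assumptions).
struct locale_info {
    std::string lang;         // ISO-639 language code, e.g. "en"
    std::string country;      // ISO-3166 country code, e.g. "US"
    std::string codeset;      // IANA character set name, e.g. "UTF-8"
    std::string native_name;  // platform-specific locale name
};

// A request; an empty field means "any value is acceptable".
struct locale_query {
    std::string lang;
    std::string country;
    std::string codeset;
};

// True when every non-empty field of the query equals the row's field.
inline bool matches(const locale_info& info, const locale_query& q) {
    assert(!info.native_name.empty());  // every row must name a real locale
    return (q.lang.empty()    || q.lang    == info.lang)
        && (q.country.empty() || q.country == info.country)
        && (q.codeset.empty() || q.codeset == info.codeset);
}

// Try the queries in order of preference and return the first row that
// satisfies the earliest satisfiable query, or null if none does.
inline const locale_info*
find_preferred(const std::vector<locale_info>& db,
               const std::vector<locale_query>& prefs) {
    for (std::size_t i = 0; i != prefs.size(); ++i)
        for (std::size_t j = 0; j != db.size(); ++j)
            if (matches(db[j], prefs[i]))
                return &db[j];
    return 0;
}
```

A caller would list the queries from most to least specific, for example {en, US, UTF-8}, {en, US, ISO-8859-1}, {en, US, any}, {en, any, any}, and run the test on whatever native_name comes back.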
Re: low hanging fruit while cleaning up test failures
Travis Vitek wrote:
> Martin Sebor wrote:
>> My only requirement is to get those tests to pass in a reasonable
>> amount of time (i.e., without timing out), and without compromising
>> their effectiveness.
>>
>> > Do we want to give up on the locale name matching, or do we want to
>> > include zh_CN in the list of locales to test? What about matching the
>> > encoding? Should we ignore all of this and just find one locale for
>> > each value of MB_CUR_MAX from 1 to MB_LEN_MAX and run the test on them?
>>
>> Maybe. I'll let you propose what makes the most sense to you :)
>>
>> Martin
>
> Well, the AIX I'm testing on has 683 installed locale files. Of those,
> many are links to locales with different names. For example, we have
>
> $ locale -a | grep "_CN" | grep -v "\."
> ZH_CN
> Zh_CN
> zh_CN
> $ ls -l /usr/lib/nls/loc/ZH_CN
> lrwxrwxrwx 1 binbin bin 28 Feb 8 2008 /usr/lib/nls/loc/ZH_CN -> /usr/lib/nls/loc/ZH_CN.UTF-8
> $ ls -l /usr/lib/nls/loc/Zh_CN
> lrwxrwxrwx 1 binbin bin 28 Feb 8 2008 /usr/lib/nls/loc/ZH_CN -> /usr/lib/nls/loc/Zh_CN.GB18030
> $ ls -l /usr/lib/nls/loc/zh_CN
> lrwxrwxrwx 1 binbin bin 28 Feb 8 2008 /usr/lib/nls/loc/ZH_CN -> /usr/lib/nls/loc/zh_CN.IBM-eucCN
>
> The locales that are mapped to [ZH_CN.UTF-8, Zh_CN.GB18030,
> zh_CN.IBM-eucCN] also appear in the locale list, so we have many
> duplicated locales. So, for an immediate reduction in the number of tested
> locales, we could eliminate these duplicates. How to tell if a locale is a
> duplicate? I'm not sure.

Test the result of setlocale(LC_ALL, name) for equality?

> Another option would be to ignore all locales that don't match the regular
> expression "[a-z][a-z]_[A-Z][A-Z]([EMAIL PROTECTED])?$" or the fnmatch
> expressions "[a-z][a-z]_[A-Z][A-Z]" and "[EMAIL PROTECTED]". The C/POSIX
> locales don't match this, but we can explicitly allow them.

I suspect this kind of matching isn't going to be robust enough.

> This alone cuts the number of locales down significantly, though it does
> affect other platforms. Here is a small table showing the total number of
> locales, and the number of locales that match the above regular
> expression.
>
>           Okay  Total
> AIX        226    603
> Compaq      33     40
> HP-UX      142    160
> Irix        39     60
> Linux      479    582
> Solaris    223    331
>
> Another option is to build up a list of all installed locales [their names
> and other properties], and then provide a mechanism to search through, or
> iterate over that list.

This would work.

> If you want to run a test on all locales that have a name matching some
> expression, you write a function or function object to return true on
> match. You pass that to the rw_locales_match() routine, and it gives you
> the first match. Call again to get the next match or null.
>
> for (const rw_locale_entry* e = rw_locales_match(0, fun);
>      e; e = rw_locales_match(e, fun))
> {
> }
>
> If you want to select only locales with mb_cur_max of 4, you either write
> a filter, or you explicitly iterate over the list. If we really decide
> that it is necessary to write up a SQL type language for selecting
> locales, then that system can be implemented on top of this.

I like it!

Martin
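The setlocale() idea for spotting duplicates could look something like the sketch below. It is a guess at one way to do it, not tested on AIX: the helper names canonical_locale_name() and unique_locales() are made up for illustration, and whether two aliases of the same locale really produce the same setlocale() return string is platform-dependent and would have to be verified on each system.

```cpp
#include <cassert>
#include <clocale>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Return the string setlocale() reports after switching to `name`, or an
// empty string if the locale cannot be set. The previous locale is restored.
// Each result is copied at once because a later setlocale() call may
// invalidate the returned pointer.
inline std::string canonical_locale_name(const std::string& name) {
    const char* cur = std::setlocale(LC_ALL, 0);
    const std::string saved(cur ? cur : "C");
    const char* set = std::setlocale(LC_ALL, name.c_str());
    const std::string result(set ? set : "");
    std::setlocale(LC_ALL, saved.c_str());
    return result;
}

// Hypothetical signature for the "what is this locale really called" step,
// so that a different strategy (e.g. resolving symlinks) can be swapped in.
typedef std::string (*canon_fn)(const std::string&);

// Keep the first name seen for each distinct canonical name and drop
// names that cannot be canonicalized at all.
inline std::vector<std::string>
unique_locales(const std::vector<std::string>& names, canon_fn canon) {
    assert(canon != 0);
    std::set<std::string> seen;
    std::vector<std::string> kept;
    for (std::size_t i = 0; i != names.size(); ++i) {
        const std::string key = canon(names[i]);
        if (!key.empty() && seen.insert(key).second)
            kept.push_back(names[i]);
    }
    return kept;
}
```

Passing canonical_locale_name as the second argument gives the setlocale()-based behavior; keeping the canonicalizer as a parameter leaves room for a platform-specific replacement if the return strings turn out not to be comparable.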
Re: low hanging fruit while cleaning up test failures
Travis Vitek wrote:
>
> Martin Sebor wrote:
>>
>> My only requirement is to get those tests to pass in a reasonable
>> amount of time (i.e., without timing out), and without compromising
>> their effectiveness.
>>
>> > Do we want to give up on the locale name matching, or do we want to
>> > include zh_CN in the list of locales to test? What about matching the
>> > encoding? Should we ignore all of this and just find one locale for
>> > each value of MB_CUR_MAX from 1 to MB_LEN_MAX and run the test on them?
>>
>> Maybe. I'll let you propose what makes the most sense to you :)
>>
>> Martin
>>
>
> Well, the AIX I'm testing on has 683 installed locale files. Of those,
> many are links to locales with different names. For example, we have
>
> $ locale -a | grep "_CN" | grep -v "\."
> ZH_CN
> Zh_CN
> zh_CN
> $ ls -l /usr/lib/nls/loc/ZH_CN
> lrwxrwxrwx 1 binbin bin 28 Feb 8 2008 /usr/lib/nls/loc/ZH_CN -> /usr/lib/nls/loc/ZH_CN.UTF-8
> $ ls -l /usr/lib/nls/loc/Zh_CN
> lrwxrwxrwx 1 binbin bin 28 Feb 8 2008 /usr/lib/nls/loc/ZH_CN -> /usr/lib/nls/loc/Zh_CN.GB18030
> $ ls -l /usr/lib/nls/loc/zh_CN
> lrwxrwxrwx 1 binbin bin 28 Feb 8 2008 /usr/lib/nls/loc/ZH_CN -> /usr/lib/nls/loc/zh_CN.IBM-eucCN
>
> The locales that are mapped to [ZH_CN.UTF-8, Zh_CN.GB18030,
> zh_CN.IBM-eucCN] also appear in the locale list, so we have many
> duplicated locales. So, for an immediate reduction in the number of tested
> locales, we could eliminate these duplicates. How to tell if a locale is a
> duplicate? I'm not sure.
>
> Another option would be to ignore all locales that don't match the regular
> expression "[a-z][a-z]_[A-Z][A-Z]([EMAIL PROTECTED])?$" or the fnmatch
> expressions "[a-z][a-z]_[A-Z][A-Z]" and "[EMAIL PROTECTED]". The C/POSIX
> locales don't match this, but we can explicitly allow them.
>
> This alone cuts the number of locales down significantly, though it does
> affect other platforms. Here is a small table showing the total number of
> locales, and the number of locales that match the above regular
> expression.
>
>           Okay  Total
> AIX        226    603
> Compaq      33     40
> HP-UX      142    160
> Irix        39     60
> Linux      479    582
> Solaris    223    331
>
> Another option is to build up a list of all installed locales [their names
> and other properties], and then provide a mechanism to search through, or
> iterate over that list. If you want to run a test on all locales that have
> a name matching some expression, you write a function or function object
> to return true on match. You pass that to the rw_locales_match() routine,
> and it gives you the first match. Call again to get the next match or
> null.
>
> for (const rw_locale_entry* e = rw_locales_match(0, fun);
>      e; e = rw_locales_match(e, fun))
> {
> }
>
> If you want to select only locales with mb_cur_max of 4, you either write
> a filter, or you explicitly iterate over the list. If we really decide
> that it is necessary to write up a SQL type language for selecting
> locales, then that system can be implemented on top of this.
>
> Travis
>

Ah, my primitive scheme above isn't quite good enough. The time to run the
22.locale.ctype.is test was 28m35s, and I've reduced it down to 6m28s with
an 11s build on AIX. The test would have timed out at 5 minutes.

Now that I've seen that, it makes me wonder about the other proposal and
the SQL-like query string idea. If we get a locale from the system, we
don't have access to the original data that was in the ASCII source file.
We just get the data presented from the C/C++ locale. This means that we
have to discover information about the locale [like the mb_cur_max value].
This may take considerable time.

Travis
--
View this message in context: http://www.nabble.com/low-hanging-fruit-while-cleaning-up-test-failures-tp13634803p14821525.html
Sent from the stdcxx-dev mailing list archive at Nabble.com.
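Discovering mb_cur_max at run time, and then keeping just one locale per value, might be sketched like this. The function names probe_mb_cur_max() and one_locale_per_width() are invented for the example; the only library facilities used are std::setlocale() and the MB_CUR_MAX macro. Each probe costs a pair of setlocale() calls, which is where the "considerable time" would go on a system with hundreds of locales.

```cpp
#include <cassert>
#include <clocale>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <string>
#include <vector>

// Return MB_CUR_MAX for the named locale, or 0 if the locale cannot be set.
// The previous LC_CTYPE setting is restored before returning.
inline int probe_mb_cur_max(const std::string& name) {
    const char* cur = std::setlocale(LC_CTYPE, 0);
    const std::string saved(cur ? cur : "C");
    int result = 0;
    if (std::setlocale(LC_CTYPE, name.c_str()))
        result = static_cast<int>(MB_CUR_MAX);
    std::setlocale(LC_CTYPE, saved.c_str());
    return result;
}

// Hypothetical probe signature, so a cached or table-driven lookup can be
// substituted for the (slow) setlocale()-based one.
typedef int (*mb_probe)(const std::string&);

// Keep the first locale seen for each distinct MB_CUR_MAX value;
// locales the probe cannot load (result 0) are skipped.
inline std::map<int, std::string>
one_locale_per_width(const std::vector<std::string>& names, mb_probe probe) {
    assert(probe != 0);
    std::map<int, std::string> picked;
    for (std::size_t i = 0; i != names.size(); ++i) {
        const int width = probe(names[i]);
        if (width > 0 && picked.find(width) == picked.end())
            picked[width] = names[i];
    }
    return picked;
}
```

Called with the output of `locale -a` and probe_mb_cur_max, this yields at most MB_LEN_MAX locales to test; caching the probe results in a file would avoid paying the discovery cost on every run.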
Re: [Fwd: New wiki created: stdcxx]
William A. Rowe, Jr. wrote:
> FYI, it's live.

Excellent!

We might want to discuss the organization of the Wiki at some point.
I'd like to put up the release process document so we can finalize
the examples. I expect Farid might want to document the Boost test
suite setup he described in his December post here:
http://www.nabble.com/Running-the-boost-regression-tests-with-stdcxx-to14536939.html#a14536939

Since we're getting close to 4.2.1 and have been talking about 4.3
for months we should probably start putting together the set of
"supported" platforms (primary, secondary, etc.)

What else?

Martin

> Subject: New wiki created: stdcxx
> From: [EMAIL PROTECTED]
> Date: Mon, 14 Jan 2008 22:29:52 +0000 (GMT)
> To: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
>
> A new wiki has just been created by root.
>
> Wiki name: stdcxx
> Wiki title: Apache C++ Standard Library
> Commits: [EMAIL PROTECTED]
> PMC notify: [EMAIL PROTECTED]
Re: low hanging fruit while cleaning up test failures
Martin Sebor wrote:
>
> My only requirement is to get those tests to pass in a reasonable
> amount of time (i.e., without timing out), and without compromising
> their effectiveness.
>
> > Do we want to give up on the locale name matching, or do we want to
> > include zh_CN in the list of locales to test? What about matching the
> > encoding? Should we ignore all of this and just find one locale for
> > each value of MB_CUR_MAX from 1 to MB_LEN_MAX and run the test on them?
>
> Maybe. I'll let you propose what makes the most sense to you :)
>
> Martin
>

Well, the AIX I'm testing on has 683 installed locale files. Of those,
many are links to locales with different names. For example, we have

$ locale -a | grep "_CN" | grep -v "\."
ZH_CN
Zh_CN
zh_CN
$ ls -l /usr/lib/nls/loc/ZH_CN
lrwxrwxrwx 1 binbin bin 28 Feb 8 2008 /usr/lib/nls/loc/ZH_CN -> /usr/lib/nls/loc/ZH_CN.UTF-8
$ ls -l /usr/lib/nls/loc/Zh_CN
lrwxrwxrwx 1 binbin bin 28 Feb 8 2008 /usr/lib/nls/loc/ZH_CN -> /usr/lib/nls/loc/Zh_CN.GB18030
$ ls -l /usr/lib/nls/loc/zh_CN
lrwxrwxrwx 1 binbin bin 28 Feb 8 2008 /usr/lib/nls/loc/ZH_CN -> /usr/lib/nls/loc/zh_CN.IBM-eucCN

The locales that are mapped to [ZH_CN.UTF-8, Zh_CN.GB18030,
zh_CN.IBM-eucCN] also appear in the locale list, so we have many
duplicated locales. So, for an immediate reduction in the number of tested
locales, we could eliminate these duplicates. How to tell if a locale is a
duplicate? I'm not sure.

Another option would be to ignore all locales that don't match the regular
expression "[a-z][a-z]_[A-Z][A-Z]([EMAIL PROTECTED])?$" or the fnmatch
expressions "[a-z][a-z]_[A-Z][A-Z]" and "[EMAIL PROTECTED]". The C/POSIX
locales don't match this, but we can explicitly allow them.

This alone cuts the number of locales down significantly, though it does
affect other platforms. Here is a small table showing the total number of
locales, and the number of locales that match the above regular
expression.

          Okay  Total
AIX        226    603
Compaq      33     40
HP-UX      142    160
Irix        39     60
Linux      479    582
Solaris    223    331

Another option is to build up a list of all installed locales [their names
and other properties], and then provide a mechanism to search through, or
iterate over that list. If you want to run a test on all locales that have
a name matching some expression, you write a function or function object
to return true on match. You pass that to the rw_locales_match() routine,
and it gives you the first match. Call again to get the next match or
null.

for (const rw_locale_entry* e = rw_locales_match(0, fun);
     e; e = rw_locales_match(e, fun))
{
}

If you want to select only locales with mb_cur_max of 4, you either write
a filter, or you explicitly iterate over the list. If we really decide
that it is necessary to write up a SQL type language for selecting
locales, then that system can be implemented on top of this.

Travis
--
View this message in context: http://www.nabble.com/low-hanging-fruit-while-cleaning-up-test-failures-tp13634803p14818795.html
Sent from the stdcxx-dev mailing list archive at Nabble.com.
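For concreteness, the rw_locales_match() iteration proposed above could be backed by something like the following. Only the names rw_locale_entry and rw_locales_match and the calling convention come from the message; the fields of the entry, the predicate type, the rw_locale_table() storage and the sample filter are all assumptions made for the sake of a compilable sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record for one installed locale; the fields are assumptions.
struct rw_locale_entry {
    std::string name;        // locale name as reported by the system
    int         mb_cur_max;  // value of MB_CUR_MAX in that locale
};

// Hypothetical predicate type: returns true when an entry should be tested.
typedef bool (*rw_locale_pred)(const rw_locale_entry&);

// Hypothetical storage; a real harness would fill this once at startup
// and must not modify it while an iteration is in progress.
inline std::vector<rw_locale_entry>& rw_locale_table() {
    static std::vector<rw_locale_entry> table;
    return table;
}

// Return the first matching entry after `prev` (or from the start of the
// table when `prev` is null), or null when no further entry matches.
inline const rw_locale_entry*
rw_locales_match(const rw_locale_entry* prev, rw_locale_pred pred) {
    assert(pred != 0);
    const std::vector<rw_locale_entry>& table = rw_locale_table();
    if (table.empty())
        return 0;
    const rw_locale_entry* const begin = &table[0];
    const rw_locale_entry* const end   = begin + table.size();
    for (const rw_locale_entry* e = prev ? prev + 1 : begin; e != end; ++e)
        if (pred(*e))
            return e;
    return 0;
}

// Example filter: select only locales whose MB_CUR_MAX is 4.
inline bool has_mb_cur_max_4(const rw_locale_entry& e) {
    return e.mb_cur_max == 4;
}
```

With this in place the for loop from the message works unchanged when `fun` is has_mb_cur_max_4, and a query language could later be layered on top by compiling a query string into a predicate.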
[Fwd: New wiki created: stdcxx]
FYI, it's live.

--- Begin Message ---
A new wiki has just been created by root.

Wiki name: stdcxx
Wiki title: Apache C++ Standard Library
Commits: [EMAIL PROTECTED]
PMC notify: [EMAIL PROTECTED]
--- End Message ---
[jira] Commented: (STDCXX-692) [gcc 4.0.1/Mac OS X 10.5.1 Leopard] 25.search test failure
[ https://issues.apache.org/jira/browse/STDCXX-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558765#action_12558765 ]

Martin Sebor commented on STDCXX-692:
-------------------------------------

Can you include build type info and a stack trace?

> [gcc 4.0.1/Mac OS X 10.5.1 Leopard] 25.search test failure
> ----------------------------------------------------------
>
>                 Key: STDCXX-692
>                 URL: https://issues.apache.org/jira/browse/STDCXX-692
>             Project: C++ Standard Library
>          Issue Type: Bug
>          Components: Tests
>    Affects Versions: 4.2.0
>         Environment: Darwin hostname 9.1.0 Darwin Kernel Version 9.1.0: Wed Oct 31 17:46:22 PDT 2007; root:xnu-1228.0.2~1/RELEASE_I386 i386
>            Reporter: Eric Lemings
>            Priority: Minor
>             Fix For: 4.2.1
>
>
> This test program crashes with a bus error on Apple's latest cat. Don't know
> any more than that. Just filing an issue for tracking purposes.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Created: (STDCXX-696) Remove etc/config/configure.sh script from repository.
Remove etc/config/configure.sh script from repository.
------------------------------------------------------

                 Key: STDCXX-696
                 URL: https://issues.apache.org/jira/browse/STDCXX-696
             Project: C++ Standard Library
          Issue Type: Task
          Components: Configuration
            Reporter: Eric Lemings
            Priority: Trivial
             Fix For: 4.2.1

From what I see, this file does absolutely nothing yet it is included as part
of the distribution because it still resides in the repository. If it is no
longer used, it should be deleted from the repository.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[NOTICE] Releases -> archive.apache.org
i've now deleted the original releases from people.apache.org/dist/incubator. copies can be found in archive.apache.org/dist/incubator. the .htaccess seems to be working ok but please check the links ASAP.

i will unsubscribe from this list no earlier than thursday. if you find you have any questions after then please ask on general.

- robert
[jira] Commented: (STDCXX-686) redesign web site
[ https://issues.apache.org/jira/browse/STDCXX-686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558650#action_12558650 ]

Martin Sebor commented on STDCXX-686:
-------------------------------------

Okay, I'll work on that.

Oddly, I don't see your request to subscribe. Unless someone else has already approved it could you try again?

> redesign web site
> -----------------
>
>                 Key: STDCXX-686
>                 URL: https://issues.apache.org/jira/browse/STDCXX-686
>             Project: C++ Standard Library
>          Issue Type: Improvement
>          Components: Web
>            Reporter: Martin Sebor
>
> The current web site is a bunch of static HTML pages with a lot of
> difficult-to-maintain formatting cruft copied from pages of another incubator
> project. We should look into generating the site from easier-to-maintain
> "sources" using a tool like Apache Forrest or some such.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (STDCXX-686) redesign web site
[ https://issues.apache.org/jira/browse/STDCXX-686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558587#action_12558587 ]

Gavin commented on STDCXX-686:
------------------------------

If only a few pages then yes I would do that. Just create a 'forrest seed-sample' or 'forrest seed-basic' and alter/delete as required.

I'm waiting for a moderator to approve my request to join the dev list. It would be good to discuss finer details there.

> redesign web site
> -----------------
>
>                 Key: STDCXX-686
>                 URL: https://issues.apache.org/jira/browse/STDCXX-686
>             Project: C++ Standard Library
>          Issue Type: Improvement
>          Components: Web
>            Reporter: Martin Sebor
>
> The current web site is a bunch of static HTML pages with a lot of
> difficult-to-maintain formatting cruft copied from pages of another incubator
> project. We should look into generating the site from easier-to-maintain
> "sources" using a tool like Apache Forrest or some such.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.