Branch: refs/heads/smoke-me/khw-locale4
Home:   https://github.com/Perl/perl5

Commit: dc6501f2f655464183db9cd9fc77acefeaf92b91
    https://github.com/Perl/perl5/commit/dc6501f2f655464183db9cd9fc77acefeaf92b91
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M embed.fnc
  M embed.h
  M ext/POSIX/POSIX.xs
  M ext/POSIX/lib/POSIX.pm
  M locale.c
  M proto.h

Log Message:
-----------
Move code from POSIX.xs to locale.c

This avoids duplicated logic.

Commit: a6f0dd16ac7accdc6dd85e9492c37d4f351d1603
    https://github.com/Perl/perl5/commit/a6f0dd16ac7accdc6dd85e9492c37d4f351d1603
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M locale.c

Log Message:
-----------
locale.c: Reorder cases in a switch

This moves handling the CODESET to the end, as future commits will make
its handling more complicated. The cases are now ordered so the simplest
(based on the direction of future commits) are first

Commit: 3a666d679e9dcb2100db01e5c23453782c3d984b
    https://github.com/Perl/perl5/commit/3a666d679e9dcb2100db01e5c23453782c3d984b
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M locale.c

Log Message:
-----------
locale.c: Make statics of repeated string constants

These strings are (or soon will be) used in multiple places; so have
just one definition for them.

Commit: e93f6b19a5ad51ca8a3f95f293967739ae92b4fd
    https://github.com/Perl/perl5/commit/e93f6b19a5ad51ca8a3f95f293967739ae92b4fd
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M locale.c

Log Message:
-----------
locale.c: Add two #defines

This makes sure that we handle having any variant of nl_langinfo() or
localeconv().

Commit: 39cbe40172de0d17fee39ff6b0103662e2efb56d
    https://github.com/Perl/perl5/commit/39cbe40172de0d17fee39ff6b0103662e2efb56d
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M embed.fnc
  M embed.h
  M locale.c
  M proto.h

Log Message:
-----------
locale.c: Return defaults for uncomputable langinfo items

Return the values from the C locale for nl_langinfo() items that aren't
computable on this platform.
If the platform has nl_langinfo(), then all of them are computable, but
if not, some can't be computed, and others can be, but only if there are
alternative methods available on the platform.

As part of this commit, S_my_nl_langinfo() and S_save_to_buffer() are no
longer used when USE_LOCALE is not defined, so don't compile them.

Commit: 19a305057c3b43f64df0796acc3138d9ecf3cb0f
    https://github.com/Perl/perl5/commit/19a305057c3b43f64df0796acc3138d9ecf3cb0f
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M locale.c

Log Message:
-----------
locale.c: Rmv reimplementation of my_strftime()

Prior to this commit, there was a near duplicate copy of the code from
util.c that implements my_strftime(). This was done because the util.c
version zaps the wday field, which made it incompatible.

But it dawned on me that if the arbitrary date we use to do our
calculations were such that it was for a year in which the wday field
gets zapped to the value we want it to be, then the util.c version
automatically works. This happens in years when January 1 falls on a
Sunday.
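
The reasoning in the commit above can be checked with a little arithmetic. The following is a minimal C sketch, not code from locale.c or util.c. The helpers day_of_week() and year_starts_on_sunday() are hypothetical names (the first uses Sakamoto's day-of-week method), and the interpretation is an assumption: in a year whose January 1 is a Sunday, January 1 through 7 fall on weekdays 0 through 6, so a wday recomputed from the date agrees with the one that was asked for.

```c
#include <assert.h>

/* Day of week for a Gregorian date: 0 = Sunday ... 6 = Saturday.
 * Sakamoto's method; an illustrative helper, not taken from Perl. */
static int day_of_week(int year, int month, int day)
{
    static const int offsets[] = { 0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4 };

    assert(month >= 1 && month <= 12);
    if (month < 3)
        year -= 1;      /* January and February count with the prior year */
    return (year + year / 4 - year / 100 + year / 400
            + offsets[month - 1] + day) % 7;
}

/* True if January 1 of `year` is a Sunday. In such a year, January N
 * (for N in 1..7) has weekday N - 1, so a weekday derived from the date
 * equals the weekday one would naively pair with that day of the month. */
static int year_starts_on_sunday(int year)
{
    return day_of_week(year, 1, 1) == 0;
}
```

For example, year_starts_on_sunday(2017) and year_starts_on_sunday(2023) are both true, whereas 2022 began on a Saturday.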

Commit: c1123e2a4bae569c23408e77b1fbfc258bd20767
    https://github.com/Perl/perl5/commit/c1123e2a4bae569c23408e77b1fbfc258bd20767
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M locale.c

Log Message:
-----------
locale.c: White-space only

Align with previous commit and properly indent some preprocessor
directives

Commit: 5350097ffb1172357c59e0a2b378ddafde7edd62
    https://github.com/Perl/perl5/commit/5350097ffb1172357c59e0a2b378ddafde7edd62
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M embed.fnc
  M embed.h
  M locale.c
  M proto.h

Log Message:
-----------
locale.c: Shorten static function name

The extra syllable(s) are unnecessary noise

Commit: 32913fa130dab6210dd32fe7bec44fb763aa44f4
    https://github.com/Perl/perl5/commit/32913fa130dab6210dd32fe7bec44fb763aa44f4
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M embed.fnc
  M locale.c
  M proto.h

Log Message:
-----------
locale.c: Extend a static function

This will allow it to be used in situations where the buffer it controls
is single use, and we don't need to keep track of the size for future
calls.

Commit: 7e00f2bfb793b4632aeb45b6bc946ef9c16eae05
    https://github.com/Perl/perl5/commit/7e00f2bfb793b4632aeb45b6bc946ef9c16eae05
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M locale.c

Log Message:
-----------
locale.c: Use typedef to simplify

This allows some preprocessor conditionals to be removed

Commit: e415c28e84a573ece99bad27e975854b4c5c8f81
    https://github.com/Perl/perl5/commit/e415c28e84a573ece99bad27e975854b4c5c8f81
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M locale.c

Log Message:
-----------
locale.c: Rmv redundant cBOOL()

strEQ and && already return booleans

Commit: 961655eaec15c801d91cb37abffe83994c4d559e
    https://github.com/Perl/perl5/commit/961655eaec15c801d91cb37abffe83994c4d559e
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M locale.c

Log Message:
-----------
locale.c: Fix currency symbol derivation

On platforms without nl_langinfo(), we derive the currency symbol from
localeconv(). The symbol must be tweaked to conform to nl_langinfo()
standards. Prior to this commit, it guessed at how to tweak a rare
circumstance. I found evidence this guess was wrong, so looked around,
and copied the way cygwin does it.

This also no longer returns just an empty string in certain cases.
nl_langinfo() itself doesn't, so conform to that.

Commit: 119169ae097c585fbbb4f80080363c2db0701eb3
    https://github.com/Perl/perl5/commit/119169ae097c585fbbb4f80080363c2db0701eb3
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M locale.c

Log Message:
-----------
locale.c: Don't add CP to Windows code page names

The actual name appears to be just the number for purposes of
nl_langinfo()-ish things.

Commit: 891e4358edbb3681a2e6cf882091dbc44c46069b
    https://github.com/Perl/perl5/commit/891e4358edbb3681a2e6cf882091dbc44c46069b
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M embed.fnc
  M locale.c
  M proto.h

Log Message:
-----------
locale.c: Don't ask a static fcn to be inlined

It's too complicated to really be inlined, and the compiler can figure
things out itself given it is a static function

Commit: 0faaf72fa3e9f931a631f6b558cb1684e61c4aa6
    https://github.com/Perl/perl5/commit/0faaf72fa3e9f931a631f6b558cb1684e61c4aa6
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M embed.fnc
  M proto.h

Log Message:
-----------
S_save_to_buffer() allow ignoring return value

Future commits will want to use this, while discarding the return value.

Commit: b503a5953bd04e9fa97d2d82a7edbf2c1b635deb
    https://github.com/Perl/perl5/commit/b503a5953bd04e9fa97d2d82a7edbf2c1b635deb
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M embed.fnc
  M locale.c
  M proto.h

Log Message:
-----------
locale.c: Rmv no longer used param from static fnc

Previous commits have gotten rid of this parameter to S_save_to_buffer

Commit: 82354d2f9fbfb442d76d5d6e982b0d725fa9c5be
    https://github.com/Perl/perl5/commit/82354d2f9fbfb442d76d5d6e982b0d725fa9c5be
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M locale.c

Log Message:
-----------
locale.c: Don't read off buffer end

In some configurations, given exactly the right input, it would have
been possible to read past the buffer end. This commit adds a
conditional to prevent that.

Commit: 3716372427fade85c840c017ebfe057cc08fd49e
    https://github.com/Perl/perl5/commit/3716372427fade85c840c017ebfe057cc08fd49e
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M locale.c

Log Message:
-----------
locale.c: Fix Windows bug with broken localeconv()

localeconv() was broken on Windows until VS 2015. As a workaround, this
was using my_snprintf() to find what the decimal point character is,
trying to avoid our workaround for localeconv(), which has a (slight)
chance of a race condition.

The problem is that my_snprintf() might not end up calling snprintf at
all; I didn't trace all possibilities in Windows. So it doesn't make for
a reliable sentinel.

This commit now specifically uses libc snprintf(), and if it fails,
drops down to try localeconv(). It also changes things so that if
localeconv() is not present or not usable on the platform, this snprintf
method is used.

Commit: e1fcdbd0998c8c9fc1413afe3a3731915cd9568b
    https://github.com/Perl/perl5/commit/e1fcdbd0998c8c9fc1413afe3a3731915cd9568b
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M locale.c

Log Message:
-----------
locale.c: Use a scratch buf; instead of reusing old

This is in preparation for the next commit

Commit: b00b648f474d643149ca2d423a4ddb53576682d6
    https://github.com/Perl/perl5/commit/b00b648f474d643149ca2d423a4ddb53576682d6
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M intrpvar.h
  M locale.c

Log Message:
-----------
locale: make PL_langinfo_buf const *

The previous commit allows this change to be made.
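
The snprintf() approach described in the localeconv() fix above (commit 37163724) might look roughly like the following. This is a sketch under assumptions, not the code in locale.c: radix_via_snprintf() is a hypothetical name, and formatting the value 1.5 with "%.1f" is chosen only for illustration. It formats a known value with libc snprintf() and treats whatever lies between the two digits as the radix string; a return of 0 signals that the caller should fall back to something else, such as localeconv().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy the radix (decimal point) string of the current LC_NUMERIC locale
 * into `out`. Returns its length in bytes, or 0 if it could not be
 * determined. Hypothetical helper; the name and signature are not Perl's. */
static size_t radix_via_snprintf(char *out, size_t out_size)
{
    char buf[64];
    int n;
    size_t len;

    assert(out != NULL);

    /* Expected output shape: '1', then the radix bytes, then '5'. */
    n = snprintf(buf, sizeof buf, "%.1f", 1.5);
    if (n < 3 || (size_t) n >= sizeof buf)
        return 0;                       /* error, truncation, or no radix */
    if (buf[0] != '1' || buf[n - 1] != '5')
        return 0;                       /* unexpected formatting */

    len = (size_t) n - 2;               /* the radix may be multi-byte */
    if (len + 1 > out_size)
        return 0;                       /* caller's buffer is too small */

    memcpy(out, buf + 1, len);
    out[len] = '\0';
    return len;
}
```

In the default "C" locale this yields "." with length 1. The result depends on the locale in effect when it is called, so a caller would need LC_NUMERIC set appropriately first.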

Commit: b7e0907e46e6cb7f439a7cd7c704b839c41113d8
    https://github.com/Perl/perl5/commit/b7e0907e46e6cb7f439a7cd7c704b839c41113d8
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M embed.fnc
  M embed.h
  M proto.h

Log Message:
-----------
embed.fnc: Also check for NL_LANGINFO_L

The preprocessor directives were only looking for plain nl_langinfo().
It's quite unlikely that a platform will have the '_l' version without
also having the plain one. But this makes sure.

Commit: 911e59093a8ef913afba18822b1196c2d845eaf6
    https://github.com/Perl/perl5/commit/911e59093a8ef913afba18822b1196c2d845eaf6
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M embed.fnc
  M embed.h
  M locale.c
  M proto.h

Log Message:
-----------
locale.c: Make static fcn reentrant

This makes my_langinfo() reentrant by adding parameters specifying where
to store the result. This prepares for future commits, and fixes some
minor bugs for XS writers, in that the claim was that the buffer in
calling Perl_langinfo() was safe from getting zapped until the next call
to it in the same thread. It turns out there were cases where, because
of internal calls, the buffer did get zapped.

Commit: 8d76e1d2be16f7fc40b44a795a6922afce8887a0
    https://github.com/Perl/perl5/commit/8d76e1d2be16f7fc40b44a795a6922afce8887a0
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M locale.c

Log Message:
-----------
locale.c: langinfo: Use Windows fcn to find CODESET

There is a Windows function, available for quite a long time, that will
return the current code page. Use this for the nl_langinfo() CODESET, as
that libc function isn't implemented on Windows.

Commit: 92142e1c99bf2950a3f0ef8c0cd87b0c3e9f2bc4
    https://github.com/Perl/perl5/commit/92142e1c99bf2950a3f0ef8c0cd87b0c3e9f2bc4
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M embed.fnc
  M embed.h
  M locale.c
  M proto.h

Log Message:
-----------
locale.c: Add static fcn to analyze locale name codeset

It determines if the name indicates it is UTF-8 or not. There are
several variant spellings in use, and this hides that from the callers.

It won't actually be used until the next commit

Commit: e95977d9a2d6b4287792e3bab036c325c3c6a32d
    https://github.com/Perl/perl5/commit/e95977d9a2d6b4287792e3bab036c325c3c6a32d
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M ext/I18N-Langinfo/Langinfo.pm
  M locale.c

Log Message:
-----------
locale.c: Improve non-nl_langinfo() CODESET calc

Prior to this commit, on non-Windows platforms that don't have a
nl_langinfo() libc function, the code completely punted computation of
the CODESET item.

I have not been able to figure out how to do this, even going to the
locale definition files on disk (which may vary anyway), but we can do a
lot better than punting.

This commit adds three checks:
1) If the locale name is C or POSIX, we know the codeset
2) We can detect if a locale is UTF-8. If it is, that is the codeset.
   Many modern locales are of this ilk.
3) Failing that, some locales have the codeset appear in the name,
   following a dot.

It isn't perfect, but it's a lot better than completely punting.

Commit: 9306d93a908a4530b117b5f38d353298c11f3f97
    https://github.com/Perl/perl5/commit/9306d93a908a4530b117b5f38d353298c11f3f97
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M embed.fnc
  M locale.c
  M proto.h

Log Message:
-----------
Add toggle_locale() fcns

These are designed to temporarily switch the locale for a category
around some operation that needs it to be different than the current
one.
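
The toggle-and-restore pattern just described could be sketched in plain C as follows. This is illustrative only and is not the code added to locale.c: toggle_locale_sketch() and restore_toggled_locale_sketch() are hypothetical names, and the sketch assumes the ordinary process-wide setlocale(), which is not safe to call while other threads are running.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Switch `category` to `new_locale`. Returns a malloc'd copy of the
 * previous locale name, to be handed to the restore function, or NULL if
 * nothing was changed (already in that locale, or the switch failed). */
static char *toggle_locale_sketch(int category, const char *new_locale)
{
    const char *current;
    char *saved;
    size_t size;

    assert(new_locale != NULL);

    current = setlocale(category, NULL);    /* query only */
    if (current == NULL || strcmp(current, new_locale) == 0)
        return NULL;

    /* Copy the name: the string returned by setlocale() may be
     * overwritten by the next call to it. */
    size = strlen(current) + 1;
    saved = malloc(size);
    if (saved == NULL)
        return NULL;
    memcpy(saved, current, size);

    if (setlocale(category, new_locale) == NULL) {
        free(saved);
        return NULL;
    }
    return saved;
}

/* Undo a toggle. A NULL `saved` means there is nothing to restore. */
static void restore_toggled_locale_sketch(int category, char *saved)
{
    if (saved == NULL)
        return;
    setlocale(category, saved);
    free(saved);
}
```

A caller would bracket the locale-sensitive operation between the two calls. Returning NULL when no switch is needed lets the restore step be skipped cheaply.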
They will be used in the next commit.

These will eventually replace the more unwieldy
_is_cur_LC_category_utf8() function, which toggles as a side effect

Commit: 802821f24fc836f706cc92d913f9039462abbc30
    https://github.com/Perl/perl5/commit/802821f24fc836f706cc92d913f9039462abbc30
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M embed.fnc
  M embed.h
  M locale.c
  M proto.h

Log Message:
-----------
New signature for static fcn my_langinfo()

This commit changes the calling sequence for my_langinfo to add the
desired locale, and the locale category of the desired item.

This allows the function to be able to return the desired value for any
locale, avoiding some locale changes that would happen until this
commit, and hiding the need for locale changes from outside functions,
though a couple continue to do so to avoid potential multiple changes.

Commit: aa832b8871708e0d6b6d194556e9e5bee34b9dab
    https://github.com/Perl/perl5/commit/aa832b8871708e0d6b6d194556e9e5bee34b9dab
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M locale.c

Log Message:
-----------
locale.c: Need CTYPE to match other category for nl_langinfo

nl_langinfo knows about various components of locales that are supposed
to be defined for every locale, such as a string for a Yes/No response
or the name of a month in a particular language. These are associated
with various locale categories. In the examples cited, the month names
are in the LC_TIME category, and the responses in the LC_MESSAGES one.

But (perhaps because these are text strings), some platforms require the
LC_CTYPE locale to be the same as the other locale. cygwin is an
example.
Rather than try to figure out which platforms require this, and which do
not, it is a simple matter to just toggle LC_CTYPE at the same time as
the other category

Commit: 3b4179bf63634157f5929d4384907fe29471263f
    https://github.com/Perl/perl5/commit/3b4179bf63634157f5929d4384907fe29471263f
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M embed.fnc
  M embed.h
  M locale.c
  M proto.h

Log Message:
-----------
locale.c: Add is_locale_utf8()

Previous commits have added the infrastructure to be able to determine
if a locale is UTF-8. This will prove useful, and this commit adds a
function to encapsulate this information, and uses it in a couple of
places, with more to come in future commits.

As a final fallback, this uses mbtowc(), which is supposed to be
available in C99. Future commits will add heuristics when that function
isn't available or is known to be unreliable on a particular system.

Commit: be6d73adfccd0e42240dde3084272a28b9417bcb
    https://github.com/Perl/perl5/commit/be6d73adfccd0e42240dde3084272a28b9417bcb
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M embed.fnc
  M embed.h
  M locale.c
  M perl.h
  M proto.h
  M utf8.h

Log Message:
-----------
locale.c: Add fcn for UTF8ness determination

get_locale_string_utf8ness_i() will determine if the string it is passed
in the locale it is passed is to be treated as UTF-8, or not.

Commit: f71833831181f086f8f1fd161545ce2036c43ff0
    https://github.com/Perl/perl5/commit/f71833831181f086f8f1fd161545ce2036c43ff0
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M embed.fnc
  M embed.h
  M ext/POSIX/POSIX.xs
  M locale.c
  M proto.h

Log Message:
-----------
XXX perldelta Move POSIX::localeconv() logic to locale.c

The code currently in POSIX.xs is moved to locale.c, and reworked some
to fit in that scheme, and the logic for the workaround for the Windows
broken localeconv() is made more robust.
This is in preparation for the next commit which will use this logic
instead of (imperfectly) duplicating it.

This also creates Perl_localeconv() for direct XS calls of this
functionality.

Commit: a9003fc36e1d4a31b39eec8dce8bfaec0c31ed2b
    https://github.com/Perl/perl5/commit/a9003fc36e1d4a31b39eec8dce8bfaec0c31ed2b
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M locale.c

Log Message:
-----------
locale.c: localeconv() unconditional NUMERIC toggle

It is possible to lock out changing the LC_NUMERIC locale. This is done
in some printf cases where a recursive call could get the radix
character wrong. But localeconv(), which could be called during this
recursion on some platforms, toggles the locale briefly, without
affecting the surrounding calls; so it can do the toggle
unconditionally.

The previous commit merely moved the functionality of localeconv() from
POSIX.xs to locale.c. This commit expands upon that.

Commit: fb337a6d30292b44a1f0a740ee0c53db87d8a064
    https://github.com/Perl/perl5/commit/fb337a6d30292b44a1f0a740ee0c53db87d8a064
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M embed.fnc
  M embed.h
  M locale.c
  M proto.h

Log Message:
-----------
locale.c: Collapse duplicate logic into one instance

A previous commit moved the logic for localeconv() into locale.c. This
commit takes advantage of that to use it instead of repeating the logic.

Notably, this commit removes the inconsistent duplicate logic that had
been used to deal with the Windows broken localeconv() bug.

Commit: dd9d8d0ed3c6a9e4c65fa60f150c6c5d76876b80
    https://github.com/Perl/perl5/commit/dd9d8d0ed3c6a9e4c65fa60f150c6c5d76876b80
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M locale.c

Log Message:
-----------
locale.c: Add branch prediction, comments

Commit: 254dc105489bae919c97000a88ffa3dec9116be5
    https://github.com/Perl/perl5/commit/254dc105489bae919c97000a88ffa3dec9116be5
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M embed.fnc
  M embed.h
  M ext/POSIX/POSIX.xs
  M locale.c
  M proto.h

Log Message:
-----------
XXXdelta Add my_strftime8()

This is like plain my_strftime(), but additionally returns an indication
of the UTF-8ness of the returned string

Commit: 3cd55787d9deb31f16106eb821ba9f752aebb40a
    https://github.com/Perl/perl5/commit/3cd55787d9deb31f16106eb821ba9f752aebb40a
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M embed.fnc
  M embed.h
  M locale.c
  M proto.h

Log Message:
-----------
locale.c: Add utf8ness return param to static fcn

my_langinfo_i() now will additionally return the UTF-8ness of the
returned string.

Commit: e9e6b10fa9cece094baefea4e95f77b977b796af
    https://github.com/Perl/perl5/commit/e9e6b10fa9cece094baefea4e95f77b977b796af
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M embed.fnc
  M ext/I18N-Langinfo/Langinfo.xs
  M locale.c
  M proto.h

Log Message:
-----------
XXXdelta Add Perl_langinfo8()

This is like Perl_langinfo() but additionally returns information about
the UTF-8ness of the returned string.

Commit: ce4161554f37992ccf4c363938a7cf6aa4a6871b
    https://github.com/Perl/perl5/commit/ce4161554f37992ccf4c363938a7cf6aa4a6871b
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M lib/locale.t

Log Message:
-----------
lib/locale.t: Use I18N::Langinfo, not POSIX::localeconv()

Now that Langinfo is ported to every box, it requires less work than
localeconv(), and offers more choices. This commit changes to use it,
and for more info when debugging, gets some additional info from it,
while avoiding some calls when not debugging

Commit: 6c2950ff240afd8d0c2de4828373c15a895ed529
    https://github.com/Perl/perl5/commit/6c2950ff240afd8d0c2de4828373c15a895ed529
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M lib/locale_threads.t

Log Message:
-----------
locale_threads.t: Use I18N::Langinfo, not POSIX::localeconv()

The former is always present; the latter might not be

Commit: b769e8e78462ba368a6019eef0703f39c12f8662
    https://github.com/Perl/perl5/commit/b769e8e78462ba368a6019eef0703f39c12f8662
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M locale.c

Log Message:
-----------
locale.c: Add fallbacks if no mbtowc()

This adds heuristics that work well for non-English locales to determine
if a locale is UTF-8 or not when mbtowc() isn't available. It would be a
very rare compiler that didn't have that these days, but this covers
that case as best as I have been able to figure out.

Commit: 8e565e96ee3e7b88b18d74462c91860e9535fc2c
    https://github.com/Perl/perl5/commit/8e565e96ee3e7b88b18d74462c91860e9535fc2c
Author: Karl Williamson <k...@cpan.org>
Date:   2022-08-09 (Tue, 09 Aug 2022)

Changed paths:
  M ext/I18N-Langinfo/t/Langinfo.t

Log Message:
-----------
Revert "XXX Temporarily skip on Windows"

This should now be fixed by intervening commits

Compare: https://github.com/Perl/perl5/compare/d18a98477947...8e565e96ee3e