[ANNOUNCEMENT] Upgraded: grep 3.10
The following package has been upgraded in the Cygwin distribution:

* grep 3.10

GNU grep searches one or more input files for lines containing a match
to a specified pattern. By default, grep outputs the matching lines.
The GNU implementation includes several useful extensions over POSIX.

The previous release stated that egrep and fgrep are deprecated
obsolescent commands, will be dropped in future, and from this release
until then, every use will show a stderr warning message, reminding you
how to change your commands and scripts:

    $ egrep ...
    egrep: warning: egrep is obsolescent; using grep -E
    ...
    $ fgrep ...
    fgrep: warning: fgrep is obsolescent; using grep -F
    ...

Cygwin releases will suppress the egrep and fgrep warning messages, but
developers and maintainers should rigorously remove all such usages from
their practices and scripts, as those commands could be dropped, or any
warning messages could be treated as fatal errors, in future.

Other invalid usages documented previously also now generate stderr
warning or error messages e.g.

    grep: warning: * at start of expression
    grep: warning: ? at start of expression
    grep: warning: + at start of expression
    grep: warning: {...} at start of expression
    grep: warning: stray \ before
    grep: warning: stray \ before unprintable character
    grep: warning: stray \ before white space

For more information see the project home pages:

    https://www.gnu.org/software/grep/
    https://sv.gnu.org/projects/grep/

For changes since the previous Cygwin release please see below or read
/usr/share/doc/grep/NEWS after installation; for complete details see:

    /usr/share/doc/grep/ChangeLog
    https://git.sv.gnu.org/gitweb/?p=grep.git;a=log;h=refs/tags/v3.9

Noteworthy changes in release 3.10	2023-03-22

* Bug fixes

  With -P, \d now matches only ASCII digits, regardless of PCRE
  options/modes. The changes in grep-3.9 to make \b and \w work
  properly had the undesirable side effect of making \d also match
  e.g., the Arabic digits: ٠١٢٣٤٥٦٧٨٩. With grep-3.9, -P '\d+' would
  match that ten-digit (20-byte) string. Now, to match such a digit,
  you would use \p{Nd}. Similarly, \D is now mapped to [^0-9].
  [bug introduced in grep 3.9]

--
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple
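The difference between ASCII-only and Unicode-aware \d described in the grep 3.10 notes above can be illustrated outside grep. The following is a minimal sketch in Python, not grep itself: it assumes Python's `re` module, where `\d` on text patterns matches any Unicode decimal digit by default (comparable in spirit to \p{Nd}) and `re.ASCII` narrows it to [0-9]. The names `ARABIC_DIGITS` and `describe` are made up for this illustration.

```python
import re

# The ten Arabic-Indic digits quoted in the release note (U+0660..U+0669),
# built from code points so the example does not depend on editor encoding.
ARABIC_DIGITS = "".join(chr(0x0660 + i) for i in range(10))

# Default str patterns: \d matches any Unicode decimal digit.
unicode_digits = re.compile(r"\d+")

# With re.ASCII, \d is restricted to [0-9].
ascii_digits = re.compile(r"\d+", re.ASCII)


def describe(text):
    """Return (matches_unicode_digits, matches_ascii_digits) for the whole text."""
    return (
        unicode_digits.fullmatch(text) is not None,
        ascii_digits.fullmatch(text) is not None,
    )


if __name__ == "__main__":
    print(len(ARABIC_DIGITS), len(ARABIC_DIGITS.encode("utf-8")))
    print(describe(ARABIC_DIGITS))
    print(describe("2023"))
```

The string is ten characters but twenty bytes in UTF-8, which is the "ten-digit (20-byte) string" the note refers to.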
[ANNOUNCEMENT] Updated: tzcode, tzdata 2023a
The following packages have been upgraded in the Cygwin distribution:

* tzcode2023a
* tzdata2023a

The Time Zone Database (often called tz, tzdb, or zoneinfo) contains
data that represents the history of local time for many locations around
the world, and supports conversion of UTC time to local time at those
locations to allow display of those local times. It is updated
periodically to reflect changes made by political bodies to summer
daylight saving time rules, UTC offsets, and time zone boundaries.

The tzcode package provides the tzselect, zdump, and zic utilities.

For more information, see the project home page:

    https://www.iana.org/time-zones

For more details on changes, see the announcement or below:

    https://mm.icann.org/pipermail/tz-announce/2023-March/77.html

Release 2023a	2023-03-22

  Briefly:

    * Egypt now uses DST again, from April through October.
    * This year Morocco springs forward April 23, not April 30.
    * Palestine delays the start of DST this year.
    * Much of Greenland still uses DST from 2024 on.
    * America/Yellowknife now links to America/Edmonton.
    * tzselect can now use current time to help infer timezone.
    * The code now defaults to C99 or later.
    * Fix use of C23 attributes.

  Changes to future timestamps

    * Starting in 2023, Egypt will observe DST from April's last Friday
      through October's last Thursday. Assume the transition times are
      00:00 and 24:00, respectively.

    * In 2023 Morocco's spring-forward transition after Ramadan will
      occur April 23, not April 30. Adjust predictions for future years
      accordingly. This affects predictions for 2023, 2031, 2038, and
      later years.

    * This year Palestine will delay its spring forward from March 25 to
      April 29 due to Ramadan. Make guesses for future Ramadans too.

    * Much of Greenland, represented by America/Nuuk, will continue to
      observe DST using European Union rules. When combined with
      Greenland's decision not to change the clocks in fall 2023,
      America/Nuuk therefore changes from -03/-02 to -02/-01 effective
      2023-10-29 at 01:00 UTC. This change from 2022g doesn't affect
      timestamps until 2024-03-30, and doesn't affect tm_isdst until
      2023-03-25.

  Changes to past timestamps

    * America/Yellowknife has changed from a Zone to a backward
      compatibility Link, as it no longer differs from America/Edmonton
      since 1970. This affects some pre-1948 timestamps. The old data
      are now in 'backzone'.

  Changes to past time zone abbreviations

    * When observing Moscow time, Europe/Kirov and Europe/Volgograd now
      use the abbreviations MSK/MSD instead of numeric abbreviations,
      for consistency with other timezones observing Moscow time.

  Changes to code

    * You can now tell tzselect local time, to simplify later choices.
      Select the 'time' option in its first prompt.

    * You can now compile with -DTZNAME_MAXIMUM=N to limit time zone
      abbreviations to N bytes (default 255). The reference runtime
      library now rejects POSIX-style TZ strings that contain longer
      abbreviations, treating them as UTC. Previously the limit was
      platform dependent and abbreviations were silently truncated to
      16 bytes even when the limit was greater than 16.

    * The code by default is now designed for C99 or later. To build in
      a C89 environment, compile with -DPORT_TO_C89. To support C89
      callers of the tzcode library, compile with -DSUPPORT_C89. The
      two new macros are transitional aids planned to be removed in a
      future version, when C99 or later will be required.

    * The code now builds again on pre-C99 platforms, if you compile
      with -DPORT_TO_C89. This fixes a bug introduced in 2022f.

    * On C23-compatible platforms tzcode no longer uses syntax like
      'static [[noreturn]] void usage(void);'. Instead, it uses
      '[[noreturn]] static void usage(void);' as strict C23 requires.

    * The code's functions now constrain their arguments with the C
      'restrict' keyword consistently with their documentation. This
      may allow future optimizations.

    * zdump again builds standalone with ckdadd and without setenv,
      fixing a bug introduced in 2022g.

    * leapseconds.awk can now process a leap seconds file that never
      expires; this might be useful if leap seconds are discontinued.

  Changes to commentary

    * tz-link.html has a new section "Coordinating with governments and
      distributors".

    * To improve tzselect diagnostics, zone1970.tab's comments column is
      now limited to countries that have multiple timezones.

    * Note that leap seconds are planned to be discontinued by 2035.
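The Egypt rule in the 2023a notes above ("April's last Friday through October's last Thursday") is a "last weekday of the month" calculation. A minimal sketch of that date arithmetic follows. It only computes calendar dates and is not how zic or the tz data express the rule; the helper names `last_weekday` and `egypt_dst_days` are made up for this illustration.

```python
import calendar
from datetime import date


def last_weekday(year, month, weekday):
    """Return the date of the last `weekday` (calendar.MONDAY == 0) in a month."""
    last_day = calendar.monthrange(year, month)[1]
    d = date(year, month, last_day)
    # Step back from the month's final day to the requested weekday.
    offset = (d.weekday() - weekday) % 7
    return d.replace(day=last_day - offset)


def egypt_dst_days(year):
    """Return (start_day, end_day) per the rule quoted in the release notes.

    Per the notes, DST is assumed to begin at 00:00 on the start day and
    to end at 24:00 on the end day; only the dates are computed here.
    """
    start = last_weekday(year, 4, calendar.FRIDAY)
    end = last_weekday(year, 10, calendar.THURSDAY)
    return start, end


if __name__ == "__main__":
    print(egypt_dst_days(2023))
```

For 2023 this yields April 28 and October 26.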
Re: newlocale: Linux incompatibility
On Mar 25 13:03, Brian Inglis via Cygwin wrote:
> On 2023-03-25 05:49, Corinna Vinschen via Cygwin wrote:
> It looks like /proc/locales contains the same content as produced by locale
> -a?

Yes, locale -a actually opens /proc/locales to read the locales from
the Cygwin core, just as it opens /proc/codesets to implement locale -m.

The idea was to have these definitions collected inside the DLL instead
of having to duplicate code in an external tool.

Corinna
Re: newlocale: Linux incompatibility
On Mar 25 13:03, Brian Inglis via Cygwin wrote:
> On 2023-03-25 05:49, Corinna Vinschen via Cygwin wrote:
> > On Mar 24 16:49, Brian Inglis via Cygwin wrote:
> > I never heard about an environment variable called LANGUAGE. This is
> > about LANG/LC_ALL/LC_whatever, so POSIX syntax is required...
>
> Used by gettext:
>
> https://www.gnu.org/software/gettext/manual/html_node/The-LANGUAGE-variable.html

Ok, I'm not using that because I didn't even know that. But I'm not
sure why you even mention it, it has nothing to do with Cygwin's locale
implementation which is based on the POSIX definitions. Exception here
is where the data comes from since we don't maintain locale definition
files and thus we don't follow
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html
to the letter.

> > > Aha - a nice new 3.5.0 feature - as well as /proc/codesets - is that
> > > charsets e.g. ISO-10646, etc. rather than encodings e.g. UTF-8, etc.!
> >
> > It's a list of what you can use as codeset in $LANG and friends as in
> >
> >   LC_CTYPE=lang_TERRITORY.codeset@modifier
>
> You are using codeset to mean encoding, whereas in Unicode and W3 it usually
> means coded character set/charset; it can also mean charmap; see iconv(1):

I'm using the POSIX definition here. Codeset is codeset, as in
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html

Quote:

  If the locale value has the form:

    language[_territory][.codeset]

  it refers to an implementation-provided locale, where settings of
  language, territory, and codeset are implementation-defined.

So I'm using the name "codesets" to follow POSIX documentation for
setting the matching locale environment variables, exactly to avoid
confusion.

Corinna
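The locale value form discussed above, language[_territory][.codeset] with the optional @modifier shown in the LC_CTYPE example, can be split into its parts mechanically. The following is a minimal sketch under stated assumptions: the regular expression is a simplification chosen for this illustration, not Cygwin's parser, it does not special-case "C" or "POSIX", and the names `LocaleName` and `parse_locale` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class LocaleName(NamedTuple):
    language: str
    territory: Optional[str]
    codeset: Optional[str]
    modifier: Optional[str]


# language[_territory][.codeset][@modifier], using "_" (not "-") as separator.
_LOCALE_RE = re.compile(
    r"^(?P<language>[A-Za-z]+)"
    r"(?:_(?P<territory>[A-Za-z0-9]+))?"
    r"(?:\.(?P<codeset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?$"
)


def parse_locale(value):
    """Split a POSIX-style locale value into its parts, or return None."""
    m = _LOCALE_RE.match(value)
    if m is None:
        return None
    return LocaleName(**m.groupdict())


if __name__ == "__main__":
    print(parse_locale("en_DE.utf8"))
    print(parse_locale("sr_RS.UTF-8@latin"))
    print(parse_locale("en-DE"))  # RFC 5646 style tag, rejected by this sketch
```

A hyphen-separated value such as "en-DE" is rejected here, mirroring the statement in the thread that the POSIX syntax with "_" is required.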
[ANNOUNCEMENT] Updated: {,mingw64-{x86_64,i686}-}xz 5.4.2-1
The following packages have been uploaded to the Cygwin distribution:

    xz-5.4.2-1
    liblzma5-5.4.2-1
    liblzma-devel-5.4.2-1
    mingw64-i686-xz-5.4.2-1
    mingw64-x86_64-xz-5.4.2-1

XZ Utils is free general-purpose data compression software with high
compression ratio. XZ Utils are the successor to LZMA Utils.

This is an update to the latest upstream release.

--
*** CYGWIN-ANNOUNCE UNSUBSCRIBE INFO ***

If you want to unsubscribe from the cygwin-announce mailing list, look
at the "List-Unsubscribe: " tag in the email header of this message.
Send email to the address specified there. It will be in the format:

    cygwin-announce-unsubscribe-you=yourdomain@cygwin.com

If you need more information on unsubscribing, start reading here:

    http://sourceware.org/lists.html#unsubscribe-simple

Please read *all* of the information on unsubscribing that is available
starting at this URL.
[ANNOUNCEMENT] Updated: Perl distributions
The following Perl distributions have been updated to their latest
release version available on CPAN:

noarch
--
perl-Business-ISBN-3.008-1
perl-Business-ISBN-Data-20230322.001-1
perl-DateTime-TimeZone-2.59-1
perl-Exporter-Tiny-1.006001-1
perl-Test2-Suite-0.000150-1
perl-YAML-Tiny-1.74-1
Re: newlocale: Linux incompatibility
On 2023-03-25 05:49, Corinna Vinschen via Cygwin wrote:
> On Mar 24 16:49, Brian Inglis via Cygwin wrote:
> > On 2023-03-24 06:18, Corinna Vinschen via Cygwin wrote:
> > > > First, it's a bug in the Emacs testsuite. The test simply assumes that
> > > > there's no en_DE locale on any system, but that's just not true.
> > > > Windows supports the RFC 5646 locale "en-DE", which is called "English
> > > > (Germany)" in the "Region" settings.
> > > >
> > > > You can also check with `locale -av | less' and search for en_DE.
> > > >
> > > > For the remainder of this mail, I assume you're talking about Cygwin 3.5.
> > > > I won't fix this for 3.4 anymore, given how much locale handling has
> > > > changed for 3.5.
> > > >
> > > > The second bug is that Cygwin blindly trusts the Windows function
> > > > ResolveLocaleName(). That function blatantly converts even vaguely
> > > > similar locales into something it supports. E.g., it converts "en-XY"
> > > > to "en-US". I.e., even if you use "en_XY.utf8" as locale, the above
> > > > testcase will wrongly succeed. So I have to rethink how I resolve POSIX
> > > > locales to Windows locales.
> >
> > Does Windows even consider https://www.rfc-editor.org/rfc/rfc4647 "Matching
> > of Language Tags", part of https://www.rfc-editor.org/info/bcp47 "Language
> > Tags", and if POSIX only matches exactly, will LANGUAGE be able to be used
> > for fallback?
>
> I never heard about an environment variable called LANGUAGE. This is
> about LANG/LC_ALL/LC_whatever, so POSIX syntax is required...

Used by gettext:

https://www.gnu.org/software/gettext/manual/html_node/The-LANGUAGE-variable.html

also LINGUAS FYI controlling, documenting, or limiting translations:

https://www.gnu.org/software/gettext/manual/html_node/po_002fLINGUAS.html
https://www.gnu.org/software/gettext/manual/html_node/Installers.html

as POSIX punts a lot of locale handling into the (hand waving) implementation
defined bucket, where this is the primary implementation.

> > I currently define LANGUAGE=en_CA:en_GB:en in case en-CA is unsupported by
> > anything.
> > [I use my own en-CA locale not the glibc default created by https://rap.dk/.]
> >
> > Will "-" be supported like "_" as a separator in values?
>
> In Cygwin? No. The POSIX syntax is required, it's converted into a
> matching Windows RFC 5646 locale internally.
>
> > > > And the third bug is that Cygwin fails to set errno if it doesn't
> > > > support a locale, but that's a minor inconvenience in comparison.
> > > >
> > > > Thanks for the report, I totally missed the above problem with
> > > > ResolveLocaleName.
> > >
> > > I pushed a couple of patches which hopefully clean up the code. It's
> > > really frustrating how these Windows locale functions work. Or, rather,
> > > not work. I mean, come on...
> > >
> > > - ResolveLocaleName() resolves "ff-BF" to "ff-Latn-SN", not to
> > >   "ff-Adlm-BF" or "ff-Latn-BF", even though both exist.
> > >
> > > - There's a locale called "sd-Arab-PK" and a locale "sd-Deva-IN". If
> > >   you ask for the script used in "sd-IN", the result is "Arab", not
> > >   "Deva".
> > >
> > > I had to create a replacement function for ResolveLocaleName which
> > > doesn't return totally screwy and unexpected results, and special case
> > > two more locales in /proc/locales output so the output makes sense.
> >
> > Aha - a nice new 3.5.0 feature - as well as /proc/codesets - is that
> > charsets e.g. ISO-10646, etc. rather than encodings e.g. UTF-8, etc.!
>
> It's a list of what you can use as codeset in $LANG and friends as in
>
>   LC_CTYPE=lang_TERRITORY.codeset@modifier

You are using codeset to mean encoding, whereas in Unicode and W3 it usually
means coded character set/charset; it can also mean charmap; see iconv(1):

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/iconv.html

Further confused by codeset definition:

https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_99

linking to:

https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap06.html#tag_06_02

which says POSIX "provides no means of defining a wide-character codeset"
implying encodings such as UCS-2/UTF-16 and UCS-4/UTF-32 cannot be
specified, requiring a non-POSIX approach to conversion.

Also IBM uses codeset to distinguish between EBCDIC and ASCII encodings.
Adding to the confusion ISO uses codeset to refer generically to each set
of codes supported by each part of ISO-639-1/2/3/5, ISO-3166-1/2/3, and
ISO-15924, as well as ISO-8859-1...16. I get no hits from RFCs.

To avoid ambiguity and reduce possible confusion, it may be better to name
this charmaps as used in locale(1), and produced by locale -m with the same
apparent content?

It looks like /proc/locales contains the same content as produced by locale
-a?

JM2c ;^>

--
Take care. Thanks, Brian Inglis              Calgary, Alberta, Canada

La perfection est atteinte                   Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer     but when there is no more to cut
                                -- Antoine de Saint-Exupéry
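The LANGUAGE=en_CA:en_GB:en setting mentioned in the message above is, per the gettext manual linked there, a colon-separated priority list. A minimal sketch of walking such a list follows. It is a simplification for illustration only: gettext's real lookup involves more steps, and the function name `pick_language` and the sample set of available translations are assumptions.

```python
def pick_language(language_var, available):
    """Return the first entry of a colon-separated LANGUAGE value that is
    present in `available`, or None if none of the entries match."""
    for candidate in language_var.split(":"):
        # Skip empty entries such as those produced by "en_CA::en".
        if candidate and candidate in available:
            return candidate
    return None


if __name__ == "__main__":
    # Hypothetical set of installed message catalogs.
    catalogs = {"en_GB", "en", "de"}
    print(pick_language("en_CA:en_GB:en", catalogs))
```

With no en_CA catalog in the sample set, the sketch falls through to en_GB, which is the kind of fallback the message describes wanting.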
RE: Strange link /bin/rungs in 32-bit Cygwin
>> Maybe of diminishing interest (32-bit Cygwin) - but:
>> Out of nowhere (but see below (*)) a link has occurred
>> /bin/rungs -> /usr/share/texmf-dist/scripts/texlive/rungs.tlu
>> which is, I think, a typo for /usr/share/texmf-dist/scripts/texlive/rungs.lua
>> Can anybody confirm?

> The symlink and script come from texlive-collection-basic. In current
> TeX Live on 64-bit Cygwin, the link does indeed point to rungs.lua. But
> I think the .tlu extension is used for texlua scripts, so what you're
> seeing might not be a typo. I'd have to look back at last year's
> texlive-collection-basic to be sure, but you can do that more easily
> than I can, since you already have a system with last year's
> texlive-collection-basic.

You are right. The provision is right though different:

    Cygwin32: texlive-collection-basic-20220321-1
    Cygwin64: texlive-collection-basic-20230313-2

Both setup.ini files are right and so is their enactment.
Actually both my platforms were correctly configured with correct links too.
My housekeeping, or rather my interpretation of housekeeping reports, was
faulty. Sorry for wasted time.
Re: newlocale: Linux incompatibility
On Mar 24 16:49, Brian Inglis via Cygwin wrote:
> On 2023-03-24 06:18, Corinna Vinschen via Cygwin wrote:
> > > First, it's a bug in the Emacs testsuite. The test simply assumes that
> > > there's no en_DE locale on any system, but that's just not true.
> > > Windows supports the RFC 5646 locale "en-DE", which is called "English
> > > (Germany)" in the "Region" settings.
> > >
> > > You can also check with `locale -av | less' and search for en_DE.
> > >
> > > For the remainder of this mail, I assume you're talking about Cygwin 3.5.
> > > I won't fix this for 3.4 anymore, given how much locale handling has
> > > changed for 3.5.
> > >
> > > The second bug is that Cygwin blindly trusts the Windows function
> > > ResolveLocaleName(). That function blatantly converts even vaguely
> > > similar locales into something it supports. E.g., it converts "en-XY"
> > > to "en-US". I.e., even if you use "en_XY.utf8" as locale, the above
> > > testcase will wrongly succeed. So I have to rethink how I resolve POSIX
> > > locales to Windows locales.
>
> Does Windows even consider https://www.rfc-editor.org/rfc/rfc4647 "Matching
> of Language Tags", part of https://www.rfc-editor.org/info/bcp47 "Language
> Tags", and if POSIX only matches exactly, will LANGUAGE be able to be used
> for fallback?

I never heard about an environment variable called LANGUAGE. This is
about LANG/LC_ALL/LC_whatever, so POSIX syntax is required...

> I currently define LANGUAGE=en_CA:en_GB:en in case en-CA is unsupported by
> anything.
> [I use my own en-CA locale not the glibc default created by https://rap.dk/.]
>
> Will "-" be supported like "_" as a separator in values?

In Cygwin? No. The POSIX syntax is required, it's converted into a
matching Windows RFC 5646 locale internally.

> > > And the third bug is that Cygwin fails to set errno if it doesn't
> > > support a locale, but that's a minor inconvenience in comparison.
> > >
> > > Thanks for the report, I totally missed the above problem with
> > > ResolveLocaleName.
> >
> > I pushed a couple of patches which hopefully clean up the code. It's
> > really frustrating how these Windows locale functions work. Or, rather,
> > not work. I mean, come on...
> >
> > - ResolveLocaleName() resolves "ff-BF" to "ff-Latn-SN", not to
> >   "ff-Adlm-BF" or "ff-Latn-BF", even though both exist.
> >
> > - There's a locale called "sd-Arab-PK" and a locale "sd-Deva-IN". If
> >   you ask for the script used in "sd-IN", the result is "Arab", not
> >   "Deva".
> >
> > I had to create a replacement function for ResolveLocaleName which
> > doesn't return totally screwy and unexpected results, and special case
> > two more locales in /proc/locales output so the output makes sense.
>
> Aha - a nice new 3.5.0 feature - as well as /proc/codesets - is that
> charsets e.g. ISO-10646, etc. rather than encodings e.g. UTF-8, etc.!

It's a list of what you can use as codeset in $LANG and friends as in

  LC_CTYPE=lang_TERRITORY.codeset@modifier

Corinna
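The problem described in this thread, where a fuzzy resolver turns "en-XY" into "en-US", comes down to exact versus best-effort matching. The following is a minimal sketch of the exact-match idea only. It is not Cygwin's replacement for ResolveLocaleName(), it ignores the script-selection cases ("ff-BF", "sd-IN") mentioned above, and `SUPPORTED_TAGS`, `posix_to_tag`, and `resolve_exact` are hypothetical names with a made-up sample of locale tags.

```python
# Hypothetical sample of locale tags; a real system reports many more.
SUPPORTED_TAGS = {"en-US", "en-DE", "ff-Latn-SN", "ff-Latn-BF", "ff-Adlm-BF"}


def posix_to_tag(posix_name):
    """Turn 'en_DE.utf8@modifier' into the RFC 5646 style tag 'en-DE'."""
    # Drop the @modifier first, then the .codeset, then swap the separator.
    base = posix_name.split("@", 1)[0].split(".", 1)[0]
    return base.replace("_", "-")


def resolve_exact(posix_name, supported=SUPPORTED_TAGS):
    """Return the matching tag only if it is supported exactly, else None.

    Unlike a fuzzy resolver, an unknown territory is never replaced by a
    "similar" locale such as en-US.
    """
    tag = posix_to_tag(posix_name)
    return tag if tag in supported else None


if __name__ == "__main__":
    print(resolve_exact("en_DE.utf8"))
    print(resolve_exact("en_XY.utf8"))
```

Under this scheme a request for "en_XY.utf8" fails instead of silently succeeding, which is the behaviour the Emacs test in the thread expects for a nonexistent locale.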