Re: RFR: 8306323: Update license files in CLDR v43

2023-04-18 Thread Steven R . Loomis
On Tue, 18 Apr 2023 18:40:03 GMT, Naoto Sato  wrote:

> The upgrade to CLDR v43 was missing the license-related file updates. Here 
> are the supplemental updates.

Marked as reviewed by srl...@github.com (no known OpenJDK username).

-

PR Review: https://git.openjdk.org/jdk/pull/13517#pullrequestreview-1390749259


Re: RFR: 8296248: Update CLDR to Version 43.0

2023-04-13 Thread Steven R . Loomis
On Thu, 13 Apr 2023 20:47:39 GMT, Steven R. Loomis  wrote:

>> Upgrading the CLDR to [version 
>> 43](https://cldr.unicode.org/index/downloads/cldr-43). This semi-annual 
>> release is their `limited-submission` release so I would not expect 
>> regressions caused by formatting changes as we had in JDK20/CLDRv42 
>> (https://inside.java/2023/03/28/quality-heads-up/)
>
> Marked as reviewed by srl...@github.com (no known OpenJDK username).

`@srl295 (no known openjdk.org user name / role)` 

-

PR Comment: https://git.openjdk.org/jdk/pull/13469#issuecomment-1507597408


Re: RFR: 8296248: Update CLDR to Version 43.0

2023-04-13 Thread Steven R . Loomis
On Thu, 13 Apr 2023 20:20:02 GMT, Naoto Sato  wrote:

> Upgrading the CLDR to [version 
> 43](https://cldr.unicode.org/index/downloads/cldr-43). This semi-annual 
> release is their `limited-submission` release so I would not expect 
> regressions caused by formatting changes as we had in JDK20/CLDRv42 
> (https://inside.java/2023/03/28/quality-heads-up/)

Marked as reviewed by srl...@github.com (no known OpenJDK username).

make/data/cldr/common/main/ken.xml line 19:

> 17:   
> 18:   
> 19:   [a á à ǎ b c d e é è ě ɛ {ɛ\u0301} 
> {ɛ\u0300} {ɛ\u030C} f g {gb} {gh} h i ɨ {ɨ\u0301} {ɨ\u0300} {ɨ\u030C} j k 
> {kp} m n {ny} ŋ o ó ò ǒ ɔ {ɔ\u0301} {ɔ\u0300} {ɔ\u030C} p r s t u ú ù ǔ ʉ 
> {ʉ\u0301} {ʉ\u0300} {ʉ\u030C} w y]

@naotoj this is in common/main but is not at the basic coverage level. Is this intentional?

make/data/cldr/common/properties/coverageLevels.txt line 115:

> 113: sq ; modern ;Albanian
> 114: sr ; modern ;Serbian
> 115: sr_Latn ;modern ;Serbian (Latin)

@naotoj BTW this was fixed

src/java.base/share/legal/cldr.md line 1:

> 1: ## Unicode Common Local Data Repository (CLDR) v43

BTW the license is now just named `LICENSE` in the repo starting with v44

-

PR Review: https://git.openjdk.org/jdk/pull/13469#pullrequestreview-1384183612
PR Review Comment: https://git.openjdk.org/jdk/pull/13469#discussion_r1166009340
PR Review Comment: https://git.openjdk.org/jdk/pull/13469#discussion_r1166009761
PR Review Comment: https://git.openjdk.org/jdk/pull/13469#discussion_r1166010264


Re: RFR: 8305400: ISO 4217 Amendment 175 Update

2023-03-31 Thread Steven R . Loomis
On Fri, 31 Mar 2023 21:38:31 GMT, Justin Lu  wrote:

> Please review the ISO 4217 amendment 175 update.
> 
> There are no meaningful code changes, but the version number should be 
> updated accordingly to be in sync.

Marked as reviewed by srl...@github.com (no known OpenJDK username).

-

PR Review: https://git.openjdk.org/jdk/pull/13275#pullrequestreview-1367679317


Re: RFR: 8305400: ISO 4217 Amendment 175 Update

2023-03-31 Thread Steven R . Loomis
On Fri, 31 Mar 2023 21:38:31 GMT, Justin Lu  wrote:

> Please review the ISO 4217 amendment 175 update.
> 
> There are no meaningful code changes, but the version number should be 
> updated accordingly to be in sync.

Do you track the legal tender date? That did change. For CLDR, see 
https://github.com/unicode-org/cldr/pull/2825

-

PR Review: https://git.openjdk.org/jdk/pull/13275#pullrequestreview-1367657807


Re: RFR: 8303833: java.util.LocaleISOData has wrong comments for 'Norwegian Bokmål' and 'Volapük'

2023-03-08 Thread Steven R . Loomis
On Wed, 8 Mar 2023 19:57:43 GMT, Eirik Bjorsnos  wrote:

> Please review this comment-only PR which fixes incorrect language names  
> 'Norwegian Bokmål' and 'Volapük' in the comments in LocaleISOData:
> 
> `+ "nb" + "nob"  // Norwegian Bokm?l`
> `+ "vo" + "vol"  // Volap?k`
> 
> These encoding issues seem to have been around since Duke imported the file 
> in 2007. Let's fix them now.
> 
> For context: 'Norwegian bokmål'  is the most common written form of the 
> Norwegian language:
> 
> https://en.wikipedia.org/wiki/Bokm%C3%A5l
> 
> 'Volapük' is a constructed language: 
> 
> https://en.wikipedia.org/wiki/Volap%C3%BCk

Marked as reviewed by srl...@github.com (no known OpenJDK username).

-

PR: https://git.openjdk.org/jdk/pull/12932


Re: RFR: 8303472: Display name for region TR [v2]

2023-03-02 Thread Steven R . Loomis
On Wed, 1 Mar 2023 23:45:47 GMT, Justin Lu  wrote:

>> This PR changes the English name for the region `TR`, from `Turkey` to 
>> `Türkiye`. Although this change is included in the upcoming CLDR v43, it 
>> should be applied as a spot change so that it can be back-ported properly 
>> (As it is a common English region name).
>> 
>> 
>> 
>> This change targets both the CLDR and COMPAT data.
>
> Justin Lu has updated the pull request incrementally with two additional 
> commits since the last revision:
> 
>  - Supply test with changes, use unicode escapes to be consistent
>  - copyright year

Marked as reviewed by srl...@github.com (no known OpenJDK username).

-

PR: https://git.openjdk.org/jdk/pull/12816


Re: RFR: 8303039: Utilize `coverageLevels.txt` [v2]

2023-03-01 Thread Steven R . Loomis
On Thu, 2 Mar 2023 01:03:20 GMT, Naoto Sato  wrote:

>> This is a pre-requisite for supporting CLDR v43, where they combine `seeds` 
>> locales with `common` locales 
>> (https://cldr.unicode.org/index/downloads/cldr-43#h.7s25aqdv767e). In order 
>> to have the same coverage level of locales, CLDRConverter tool needs to comb 
>> through the locale files based on the `coverageLevels.txt` file, (and the 
>> ones we already included as of v42). Confirmed the same set of locales is 
>> generated before and after this modification.
>
> Naoto Sato has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Explicitly filter coverage levels

Marked as reviewed by srl...@github.com (no known OpenJDK username).

-

PR: https://git.openjdk.org/jdk/pull/12812


Re: RFR: 8303039: Utilize `coverageLevels.txt`

2023-03-01 Thread Steven R . Loomis
On Wed, 1 Mar 2023 23:19:01 GMT, Naoto Sato  wrote:

>> make/jdk/src/classes/build/tools/cldrconverter/CLDRConverter.java line 1212:
>> 
>>> 1210: a -> Locale.forLanguageTag(a[0].trim().replaceAll("_", "-")),
>>> 1211: a -> a[1].trim(),
>>> 1212: (v1, v2) -> v2,
>> 
>> this will grab all listed entries.  Right now, that will get you basic and 
>> above.  Ideally you would include if `v1` is one of 
>> `(basic|moderate|modern|comprehensive)` — I'm proposing to add `core` or 
>> maybe even `undefined` locales in this list (though no consensus yet)
>
> OK, will filter explicitly for those levels.

The whole set is `undetermined|core|basic|moderate|modern|comprehensive`. You 
could use an enum, and then you can select just one.
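
A minimal sketch of the enum idea, not the code that went into the PR. The class and method names (`CoverageLevelDemo`, `CoverageLevel`, `atLeast`) are hypothetical, and it assumes the levels rank from lowest to highest in the order they are listed above:

```java
import java.util.EnumSet;
import java.util.Locale;
import java.util.Set;

final class CoverageLevelDemo {
    /** Coverage levels, declared in the order listed in the comment above (assumed lowest to highest). */
    enum CoverageLevel {
        UNDETERMINED, CORE, BASIC, MODERATE, MODERN, COMPREHENSIVE;

        /** Parses a level name as written in coverageLevels.txt, e.g. "modern". */
        static CoverageLevel parse(String text) {
            return valueOf(text.trim().toUpperCase(Locale.ROOT));
        }
    }

    /** Returns every level from the given minimum up to COMPREHENSIVE, inclusive. */
    static Set<CoverageLevel> atLeast(CoverageLevel minimum) {
        return EnumSet.range(minimum, CoverageLevel.COMPREHENSIVE);
    }

    public static void main(String[] args) {
        // Selecting a single threshold level yields the whole accepted set.
        Set<CoverageLevel> accepted = atLeast(CoverageLevel.BASIC);
        System.out.println(accepted.contains(CoverageLevel.parse(" modern "))); // true
        System.out.println(accepted.contains(CoverageLevel.parse("core")));     // false
    }
}
```

With this shape, changing the cut-off (say, to include `core`) is a one-argument change rather than an edit to a list of strings.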

-

PR: https://git.openjdk.org/jdk/pull/12812


Re: RFR: 8303039: Utilize `coverageLevels.txt`

2023-03-01 Thread Steven R . Loomis
On Wed, 1 Mar 2023 23:18:59 GMT, Naoto Sato  wrote:

>> is this from the alpha2 drop? or v42's?
>
> Yes, this is the file from the released CLDR v42. We have not integrated v43 
> yet.

this is great groundwork.

-

PR: https://git.openjdk.org/jdk/pull/12812


Re: RFR: 8303039: Utilize `coverageLevels.txt`

2023-03-01 Thread Steven R . Loomis
On Wed, 1 Mar 2023 23:02:06 GMT, Steven R. Loomis  wrote:

>> This is a pre-requisite for supporting CLDR v43, where they combine `seeds` 
>> locales with `common` locales 
>> (https://cldr.unicode.org/index/downloads/cldr-43#h.7s25aqdv767e). In order 
>> to have the same coverage level of locales, CLDRConverter tool needs to comb 
>> through the locale files based on the `coverageLevels.txt` file, (and the 
>> ones we already included as of v42). Confirmed the same set of locales is 
>> generated before and after this modification.
>
> make/data/cldr/common/properties/coverageLevels.txt line 2:
> 
>> 1: # coverageLevels.txt
>> 2: # Copyright © 2022 Unicode, Inc.
> 
> older version?

is this from the alpha2 drop? or v42's?

> make/jdk/src/classes/build/tools/cldrconverter/OtherCommonLocales.properties 
> line 140:
> 
>> 138: 
>> 139: # Not listed, but existed
>> 140: sr-Latn=Serbian (Latin)
> 
> [CLDR-16449](https://unicode-org.atlassian.net/browse/CLDR-16449)

you might want to give yourself a task to periodically review this file.

-

PR: https://git.openjdk.org/jdk/pull/12812


Re: RFR: 8303039: Utilize `coverageLevels.txt`

2023-03-01 Thread Steven R . Loomis
On Wed, 1 Mar 2023 19:50:56 GMT, Naoto Sato  wrote:

> This is a pre-requisite for supporting CLDR v43, where they combine `seeds` 
> locales with `common` locales 
> (https://cldr.unicode.org/index/downloads/cldr-43#h.7s25aqdv767e). In order 
> to have the same coverage level of locales, CLDRConverter tool needs to comb 
> through the locale files based on the `coverageLevels.txt` file, (and the 
> ones we already included as of v42). Confirmed the same set of locales is 
> generated before and after this modification.

My OpenJDK name should be `srl`… I'm still there: 
https://openjdk.org/census#srl

-

PR: https://git.openjdk.org/jdk/pull/12812


Re: RFR: 8303039: Utilize `coverageLevels.txt`

2023-03-01 Thread Steven R . Loomis
On Wed, 1 Mar 2023 19:50:56 GMT, Naoto Sato  wrote:

> This is a pre-requisite for supporting CLDR v43, where they combine `seeds` 
> locales with `common` locales 
> (https://cldr.unicode.org/index/downloads/cldr-43#h.7s25aqdv767e). In order 
> to have the same coverage level of locales, CLDRConverter tool needs to comb 
> through the locale files based on the `coverageLevels.txt` file, (and the 
> ones we already included as of v42). Confirmed the same set of locales is 
> generated before and after this modification.

Marked as reviewed by srl...@github.com (no known OpenJDK username).

make/data/cldr/common/properties/coverageLevels.txt line 2:

> 1: #  coverageLevels.txt
> 2: #  Copyright © 2022 Unicode, Inc.

older version?

make/jdk/src/classes/build/tools/cldrconverter/CLDRConverter.java line 1212:

> 1210: a -> Locale.forLanguageTag(a[0].trim().replaceAll("_", "-")),
> 1211: a -> a[1].trim(),
> 1212: (v1, v2) -> v2,

this will grab all listed entries.  Right now, that will get you basic and 
above.  Ideally you would include if `v1` is one of 
`(basic|moderate|modern|comprehensive)` — I'm proposing to add `core` or maybe 
even `undefined` locales in this list (though no consensus yet)
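
A rough sketch of what explicit filtering could look like, assuming each data line of `coverageLevels.txt` has the `locale ; level ; name` shape seen in the file excerpts in this thread. The class name, the `ACCEPTED` constant, and the `xx` sample entry are made up for illustration; this is not the actual CLDRConverter code:

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

final class CoverageFilterDemo {
    // Levels to keep, as suggested in the review comment (hypothetical constant).
    static final Set<String> ACCEPTED = Set.of("basic", "moderate", "modern", "comprehensive");

    /** Maps each accepted locale to its coverage level, skipping comments and other levels. */
    static Map<Locale, String> parse(List<String> lines) {
        return lines.stream()
            .map(String::strip)
            .filter(line -> !line.isEmpty() && !line.startsWith("#"))
            .map(line -> line.split(";", 3))
            // Keep only entries whose level is explicitly accepted.
            .filter(a -> a.length >= 2 && ACCEPTED.contains(a[1].trim()))
            .collect(Collectors.toMap(
                a -> Locale.forLanguageTag(a[0].trim().replace('_', '-')),
                a -> a[1].trim(),
                (v1, v2) -> v2)); // on duplicate locales, the later entry wins
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
            "# coverageLevels.txt",
            "sq ; modern ;Albanian",
            "sr_Latn ;modern ;Serbian (Latin)",
            "xx ; core ;Hypothetical entry below the cut-off");
        System.out.println(parse(sample).keySet());
    }
}
```

The filter runs before the collector, so new level names added upstream later are ignored until they are accepted deliberately.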

make/jdk/src/classes/build/tools/cldrconverter/OtherCommonLocales.properties 
line 140:

> 138: 
> 139: # Not listed, but existed
> 140: sr-Latn=Serbian (Latin)

[CLDR-16449](https://unicode-org.atlassian.net/browse/CLDR-16449)

-

PR: https://git.openjdk.org/jdk/pull/12812


Re: RFR: 8284840: Update CLDR to Version 42.0

2022-10-22 Thread Steven R . Loomis
On Sat, 22 Oct 2022 08:14:00 GMT, Alan Bateman  wrote:

> > Yes. These translation changes affect formatting. We don't usually file a 
> > CSR for such changes, but cover them in our release notes.
> 
> Indeed and periodically CLDR upgrades do cause breakage somewhere, often it 
> will be a library or application tests that compare some result that is 
> outdated due to changes that impact the formatting.

Yes, although libraries and tests shouldn't be testing against culturally 
sensitive formatting, whose goal is to produce the best current result, not the 
same result every time. For example, we had an internal corporate client that 
needed to use fixed formatting instead of culturally sensitive formatting: they 
wanted exactly the same output every time, because the output data was going to 
be consumed by machines and not just by humans.
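
To illustrate the distinction, here is a small sketch using `java.time`. The class and method names are hypothetical. The ISO formatter is locale-independent, while the localized formatter draws on locale data and so may produce different text after a CLDR upgrade:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Locale;

final class FixedVsLocalizedDemo {
    /** Fixed format for machine consumption: does not depend on locale data. */
    static String forMachines(LocalDate date) {
        return DateTimeFormatter.ISO_LOCAL_DATE.format(date);
    }

    /** Culturally sensitive format for people: the exact text may change between CLDR versions. */
    static String forHumans(LocalDate date, Locale locale) {
        return DateTimeFormatter.ofLocalizedDate(FormatStyle.LONG)
                .withLocale(locale)
                .format(date);
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2022, 10, 22);
        System.out.println(forMachines(date));            // 2022-10-22
        System.out.println(forHumans(date, Locale.US));   // locale-data dependent
        System.out.println(forHumans(date, Locale.JAPAN)); // locale-data dependent
    }
}
```

Tests that need a stable expected string can assert on the fixed form and only check that the localized form is non-empty or parses back to the same date.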

-

PR: https://git.openjdk.org/jdk/pull/10820


Re: RFR: 8284840: Update CLDR to Version 42.0

2022-10-21 Thread Steven R . Loomis
On Fri, 21 Oct 2022 16:55:28 GMT, Naoto Sato  wrote:

> This is to update the CLDR data from version 41 to version 42. The vast 
> majority of the changes are basically replacing the CLDR data, along with 
> tools/testcase alignments to those upstream changes:
> 
> https://unicode-org.atlassian.net/browse/CLDR-14032 (" at " is no longer used 
> for standard date/time format)
> https://unicode-org.atlassian.net/browse/CLDR-14831 (NBSP prefixed to `a`, 
> instead of a normal space )
> https://unicode-org.atlassian.net/browse/CLDR-11510 (Fix first day of week 
> info for China (CN))
> https://unicode-org.atlassian.net/browse/CLDR-15966 (Japanese: Support 
> numbers up to 京)
> 
> Here is the link to CLDR v42's release notes: 
> https://cldr.unicode.org/index/downloads/cldr-42

Hi folks, looking good… congrats on the quick update!

-

PR: https://git.openjdk.org/jdk/pull/10820


Re: RFR: 8289227: Support for BCP 47 Extension T - Transformed Content [v6]

2022-08-01 Thread Steven R . Loomis
On Mon, 1 Aug 2022 21:27:47 GMT, Steven R. Loomis  wrote:

>> Naoto Sato has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Use `assertThrows`
>
> test/jdk/java/util/Locale/bcp47/TExtensionTests.java line 80:
> 
>> 78: return new Object[][] {
>> 79: {L1, Locale.US,
>> 80: "Cyrillic (Transform: Latin, Transform Rules: UN GEGN Transliteration 2007)"},
> 
>  I like these display names!

Note: another class of use for `-t-` is keyboards, for example 
`ar-t-k0-windows-azerty` in 
https://github.com/unicode-org/cldr/blob/release-41/keyboards/windows/ar-t-k0-windows-azerty.xml
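
As a quick illustration, `java.util.Locale` already keeps the `t` extension of such a tag as a generic BCP 47 extension, independent of the display-name support proposed in this PR. A sketch, with a hypothetical class name:

```java
import java.util.Locale;

final class TExtensionDemo {
    /** Returns the raw value of the 't' extension, or null if the tag has none. */
    static String transformExtension(String languageTag) {
        return Locale.forLanguageTag(languageTag).getExtension('t');
    }

    public static void main(String[] args) {
        String tag = "ar-t-k0-windows-azerty";
        Locale keyboard = Locale.forLanguageTag(tag);
        System.out.println(keyboard.getLanguage());      // ar
        System.out.println(transformExtension(tag));     // k0-windows-azerty
        System.out.println(transformExtension("ar"));    // null
    }
}
```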

-

PR: https://git.openjdk.org/jdk/pull/9620


Re: RFR: 8289227: Support for BCP 47 Extension T - Transformed Content [v6]

2022-08-01 Thread Steven R . Loomis
On Sat, 30 Jul 2022 20:52:49 GMT, Naoto Sato  wrote:

>> This PR is to propose supporting the `T` extension to the BCP 47 to which 
>> `java.util.Locale` class conforms. There are two extensions to the BCP 47, 
>> one is `Unicode Locale Extension` which has been supported since JDK7, the 
>> other is this `Transformed Content` extension. A CSR has also been drafted.
>
> Naoto Sato has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Use `assertThrows`

lgtm

Hi! By the way, if you want to see some more `-t-` names in the wild, I recently 
merged https://github.com/unicode-org/cldr/pull/1755

test/jdk/java/util/Locale/bcp47/TExtensionTests.java line 80:

> 78: return new Object[][] {
> 79: {L1, Locale.US,
> 80: "Cyrillic (Transform: Latin, Transform Rules: UN GEGN Transliteration 2007)"},

 I like these display names!

-

Marked as reviewed by srl...@github.com (no known OpenJDK username).

PR: https://git.openjdk.org/jdk/pull/9620