Bug#1013946: lintian: wrongly report unknown-locale-code ber

2022-06-29 Thread Axel Beckert
Hi Tobias,

Dr. Tobias Quathamer wrote:
> The safest way for lintian would probably be to use ISO 639-3 as a source
> for locale checking, because those codes represent an individual language.
> The vast majority of program translations are into an individual language,
> so the check seems plausible.
> 
> For bonus points, you could also check ISO 639-5 and print a warning (or
> info) that this locale code represents a language group rather than an
> individual language. :-)
> 
> This is essentially Axel's latest suggestion -- except that I'd suggest to
> use ISO 639-3 instead of ISO 639-2 as authoritative source.

Thanks for this summary and especially also all the history! (And for
reading our long mails which showed bit by bit our experiences and
discoveries. :-)

> Sorry for this long e-mail, but languages and their codes are pretty
> hard ...

No need to be sorry!

Will try to implement this in lintian.

Thanks again!

Regards, Axel
-- 
 ,''`.  |  Axel Beckert , https://people.debian.org/~abe/
: :' :  |  Debian Developer, ftp.ch.debian.org Admin
`. `'   |  4096R: 2517 B724 C5F6 CA99 5329  6E61 2FF9 CD59 6126 16B5
  `-|  1024D: F067 EA27 26B9 C3FC 1486  202E C09E 1D89 9593 0EDE



Bug#1013946: lintian: wrongly report unknown-locale-code ber

2022-06-29 Thread Dr. Tobias Quathamer

On 28.06.22 at 02:31, Axel Beckert wrote:

> Still would be happy about input from Toddy on this. :-)


Hi all,

thanks a lot for your research and insights ...

I'm not an ISO expert, either, but from my reading and understanding the 
relationship between the standards (and the intended use) is as follows:


First, ISO 639 (without the suffix -1) was created, and it included most 
of the major (spoken and written) languages of the world. All included 
languages had a two letter code (like "en" for English).


As it quickly turned out, the two letter code was not enough to 
categorize all written and spoken languages. :-)


Therefore, ISO 639 became ISO 639-1 and the development of ISO 639-2 
started. As far as I know, ISO 639-1 is a strict subset of ISO 639-2.


The standard ISO 639-2 introduced a three letter code, so they could 
include many more languages. As the standard was intended to be used in 
bibliographic contexts, they created a code for every individual 
language which has at least a "modest body of literature" (whatever 
"modest" means here.)


In order to accommodate languages with an even smaller proportion of 
literature, they created collections of languages, called "language 
groups" or "families". One such example is the Berber languages, which 
you've discovered.


Lastly, ISO 639-3 and 639-5 have been created. Those standards aim to 
make a clear distinction between individual languages (in ISO 639-3) and 
language groups or families (ISO 639-5).


Apart from the language families (like Berber), every element that 
represents an individual language in ISO 639-2 is included in ISO 639-3. 
So for individual languages, ISO 639-2 is a strict subset of ISO 639-3.




I'm not entirely sure if it's a good idea to use a language family as a 
locale (or in this context, program translation). It might work for the 
Berber example, if the Berber languages are really so similar that it 
doesn't matter which language it is exactly. However, I don't know 
anything about Berber languages, so I cannot tell if this approach makes 
sense.


From a quick search, at least Kabyle, Shilha, the Tuareg languages, 
Tarifit, and Central Atlas Tamazight are grouped together as Berber 
languages.




The safest way for lintian would probably be to use ISO 639-3 as a 
source for locale checking, because those codes represent an individual 
language. The vast majority of program translations are into an 
individual language, so the check seems plausible.


For bonus points, you could also check ISO 639-5 and print a warning (or 
info) that this locale code represents a language group rather than an 
individual language. :-)


This is essentially Axel's latest suggestion -- except that I'd suggest 
to use ISO 639-3 instead of ISO 639-2 as authoritative source.
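
A minimal sketch of the check described above, written in Python rather than Lintian's Perl. It assumes the ISO 639-3 and ISO 639-5 code sets have already been loaded (for example from the iso-codes JSON files). The function name `classify_locale_code` and the small sample sets are hypothetical and only for illustration; this is not Lintian's implementation.

```python
def classify_locale_code(code, individual_codes, group_codes):
    """Classify a language code as 'individual', 'group' or 'unknown'.

    individual_codes: codes from ISO 639-3 (treated as authoritative)
    group_codes:      codes from ISO 639-5 (language groups or families)
    """
    if code in individual_codes:
        return "individual"
    if code in group_codes:
        # A valid code, but one that deserves a warning or info-level hint.
        return "group"
    return "unknown"


# Tiny illustrative samples, not the full standards.
ISO_639_3_SAMPLE = {"deu", "eng", "jbe", "kab"}
ISO_639_5_SAMPLE = {"ber", "cpe", "grk"}

for code in ("kab", "ber", "xyz"):
    print(code, classify_locale_code(code, ISO_639_3_SAMPLE, ISO_639_5_SAMPLE))
```

Checking ISO 639-3 first keeps individual languages on the normal path, so only codes that are exclusively language groups would trigger the extra hint.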


Sorry for this long e-mail, but languages and their codes are pretty 
hard ...


Regards,
Tobias




Bug#1013946: lintian: wrongly report unknown-locale-code ber

2022-06-28 Thread Russ Allbery
Axel Beckert  writes:

> Hrm, a serious thought on this: Why not implement both variants?

> What if we

> * make unknown-locale-code look at ISO 639-1, 639-2, 639-3 and even
>   639-5 for generally valid codes, and then

> * add a new, maybe pedantic-level warning which is only emitted if a
>   language group is used in a locale name, i.e. check locales against
>   ISO 639-5 and if one of these (which IIRC include the language groups
>   present in ISO 639-2) is used as locale, we emit a tag which might
>   be named locale-uses-language-group-code or similar?

> This currently sounds as if it would make use of all our arguments for
> and against including ISO 639-2, would be backwards compatible and
> more precise and helpful.

Oh, this is a great idea.  I like this.  That way if someone is using a
language group on purpose, such as in this case, they can just override or
ignore the tag.

FYI, the only tags found in 639-2 that are not in 639-3 plus 639-5 are:

{
  "alpha_3": "cnr",
  "name": "Montenegrin"
},
{
  "alpha_3": "him",
  "name": "Himachali languages; Western Pahari languages"
},
{
  "alpha_3": "qaa-qtz",
  "name": "Reserved for local use"
},

According to https://iso639-3.sil.org/code/cnr, cnr is in 639-3, so the
iso-codes data for it may be out of date.  Likewise, him is apparently now
in 639-5.

The conclusion I'd draw from that is that there's probably no need to add
639-2 if you include both 639-3 and 639-5, and it may be simpler to just
ignore it.
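
A comparison like the one above can be sketched as a set difference. This Python sketch assumes each part is available as a list of entries with an `alpha_3` field, as in the JSON fragment quoted above. The helper names and the toy data are hypothetical; the real lists would come from the iso-codes JSON files.

```python
def codes(entries):
    """Collect the alpha_3 codes from a list of iso-codes style entries."""
    return {entry["alpha_3"] for entry in entries}


def only_in_639_2(part2, part3, part5):
    """Return the 639-2 codes covered by neither 639-3 nor 639-5, sorted."""
    return sorted(codes(part2) - (codes(part3) | codes(part5)))


# Toy data standing in for the real JSON files.
part2 = [{"alpha_3": "ber"}, {"alpha_3": "cnr"}, {"alpha_3": "deu"}]
part3 = [{"alpha_3": "deu"}, {"alpha_3": "jbe"}]
part5 = [{"alpha_3": "ber"}]

print(only_in_639_2(part2, part3, part5))
```

If the result is empty or contains only reserved ranges, 639-2 adds nothing beyond 639-3 plus 639-5.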

-- 
Russ Allbery (r...@debian.org)  



Bug#1013946: lintian: wrongly report unknown-locale-code ber

2022-06-27 Thread Axel Beckert
Hi Russ,

Russ Allbery wrote:
> So in short, I think I talked myself back around to your solution.
> :)

Same to me, I talked myself back around to your (previous) opinion.
:-) Hilarious!

So we both seem to have had good arguments. :-)

Hrm, a serious thought on this: Why not implement both variants?

What if we

* make unknown-locale-code look at ISO 639-1, 639-2, 639-3 and even
  639-5 for generally valid codes, and then

* add a new, maybe pedantic-level warning which is only emitted if a
  language group is used in a locale name, i.e. check locales against
  ISO 639-5 and if one of these (which IIRC include the language groups
  present in ISO 639-2) is used as locale, we emit a tag which might
  be named locale-uses-language-group-code or similar?

This currently sounds as if it would make use of all our arguments for
and against including ISO 639-2, would be backwards compatible and
more precise and helpful.

Ok, and I should really go to bed now. :-) 

Regards, Axel
-- 
 ,''`.  |  Axel Beckert , https://people.debian.org/~abe/
: :' :  |  Debian Developer, ftp.ch.debian.org Admin
`. `'   |  4096R: 2517 B724 C5F6 CA99 5329  6E61 2FF9 CD59 6126 16B5
  `-|  1024D: F067 EA27 26B9 C3FC 1486  202E C09E 1D89 9593 0EDE



Bug#1013946: lintian: wrongly report unknown-locale-code ber

2022-06-27 Thread Axel Beckert
Hi,

one more comment:

Russ Allbery wrote:
> I worked out the same thing, and I'm fairly sure that means that this is
> not a valid locale.  It's the code for the Berber language *group*, and
> the individual members of that group have their own 639-3 codes, so that
> seems to imply to me that those translations were tagged with the wrong
> code.

So I wondered what they should actually be tagged as. "Judeo-Berber"
is the only language with the string "berber" I found in ISO 639-3.

Unfortunately I found no mapping between ISO 639-2/-5 language groups
and actual languages in ISO 639-3, not in any of the JSON files for
these three parts.

So I dug around in Wikipedia and figured out that Judeo-Berber is a
"Non-Zenati Northern Berber language". Its use of the Hebrew alphabet
seems to be a unique characteristic. That again seems rather specific:
if a translation is in a Berber language but doesn't use Hebrew
letters, it's likely not Judeo-Berber.

Given https://en.wikipedia.org/wiki/File:Linguistic_Diagram_Berber.png
(from https://en.wikipedia.org/wiki/Berber_languages) there are tons
of possible languages gathered under "Berber languages", so the longer
I look at it, the more I tend to agree with Russ' arguments to stay
with ISO 639-3 only.

Plus maybe add a few more notes to the tag description to explain why
language groups are probably not a good idea for locales.

Still would be happy about input from Toddy on this. :-)

Regards, Axel
-- 
 ,''`.  |  Axel Beckert , https://people.debian.org/~abe/
: :' :  |  Debian Developer, ftp.ch.debian.org Admin
`. `'   |  4096R: 2517 B724 C5F6 CA99 5329  6E61 2FF9 CD59 6126 16B5
  `-|  1024D: F067 EA27 26B9 C3FC 1486  202E C09E 1D89 9593 0EDE



Bug#1013946: lintian: wrongly report unknown-locale-code ber

2022-06-27 Thread Russ Allbery
Axel Beckert  writes:

> Anyway, JFTR: I just looked at how lintian in Debian Stable (i.e.
> 2.104.0 in Bullseye) does the locale code lookup. It had its own data
> file for that (and hence now using iso-codes is good as it is no longer
> duplicating these 33kB of data) and that file
> (/usr/share/lintian/data/files/locale-codes) states:

>   # List of locale codes.  This is derived from the ISO 639-1, ISO
>   # 639-2, and ISO 639-3 standards.

> And indeed, "ber" was in that file.

> So previously lintian did use ISO 639-1, 639-2 and 639-3.

> So using just ISO 639-3 was either an accident, on purpose or a
> regression and was introduced when lintian switched to
> iso-codes' files as data source in commit
> https://salsa.debian.org/lintian/lintian/-/commit/fcaded19

What I think I managed to reconstruct from reading about this [1] is that
639-2 was the original work to supplement 639-1 (which is limited to
two-letter codes and omits a lot of smaller languages).  However, ISO
639-2 also assigned codes to language families and some other things,
whereas ISO 639-3 is limited to just languages and the families moved to
ISO 639-5.

[1] https://en.wikipedia.org/wiki/ISO_639-2 mostly.

Looking at ISO 639-5, I think a lot of those wouldn't make sense as
translations.  It has a lot of things like zhx (Chinese family), cpe (all
English-based creoles), or grk (Greek languages).  Some of those (cpe for
example) also appear in ISO 639-2, which implies to me that 639-2 is a bit
too broad for useful translations.
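
To see which ISO 639-2 codes are really group codes, one could intersect the 639-2 and 639-5 code sets. This is a minimal Python sketch under the assumption that both sets are already loaded. The function name and the sample sets are hypothetical and far smaller than the real standards.

```python
def group_codes_in_639_2(part2_codes, part5_codes):
    """Return the codes that ISO 639-2 shares with ISO 639-5, sorted."""
    return sorted(set(part2_codes) & set(part5_codes))


# Small illustrative samples only.
part2_sample = {"ber", "cpe", "deu", "eng"}
part5_sample = {"ber", "cpe", "grk", "zhx"}

print(group_codes_in_639_2(part2_sample, part5_sample))
```

Every code in the result is one that a check based on ISO 639-2 would accept but a check based only on ISO 639-3 would reject.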

That said, reading more about the Berber languages [2], I understand how
this happened with this group in particular.  Specifically, this:

    A listing of the other Berber languages is complicated by their
    closeness; there is little distinction between language and
    dialect. The primary difficulty of subclassification, however, lies in
    the eastern Berber languages, where there is little agreement.

probably implies that the languages are sufficiently mutually
comprehensible that it may make sense to translate something to "Berber"
without specifying a specific language in the family.  (I could imagine
that sometimes it may avoid political and social issues to not specify a
specific language from the family, although I have no idea if that's the
case here.)

[2] https://en.wikipedia.org/wiki/Berber_languages

However, that wouldn't really make sense for "cpe" (creoles are very
different from each other even if they're English-based).  So that still
feels to me like it leans away from including everything in 639-2.

I think I may be talking myself into adding an exception list of non-639-3
language codes that nonetheless are used by translators.  But that's an
ongoing maintenance burden, so maybe that's not the right move either.

The alternate argument is that Lintian's check is really mostly there to
catch typos, and maybe we should assume anyone who uses any 639-2 or 639-3
code knows what they're doing.  And since that's what Lintian used to do,
it has the benefit of fixing a regression and I don't think anyone was
complaining about the breadth of the previous list, just the duplication
of information.

So in short, I think I talked myself back around to your solution.  :)
(Maybe all of this can be captured in comments for the next poor
maintainer who has to try to understand what's going on.)

-- 
Russ Allbery (r...@debian.org)  



Bug#1013946: lintian: wrongly report unknown-locale-code ber

2022-06-27 Thread Axel Beckert
Control: tag -1 + help

Hi Russ,

Russ Allbery wrote:
> > But upon deeper inspection I found that this is likely not an issue in
> > iso-codes as "ber" is correctly not in
> > /usr/share/iso-codes/json/iso_639-3.json but in …/iso_639-2.json and
> > …/iso_639-5.json as it is a code for a language group. (Which kinda
> > makes it suspicious for me to be used in locales. But then again I'm
> > not a linguist.)
> 
> Sorry, I followed up on the bug and forgot to explicitly cc Lintian

Not needed. I got the message via the lintian ML / maintainer address.
(Somehow, though, I didn't get my own messages to that bug report back
via the list.)

> I worked out the same thing, and I'm fairly sure that means that this is
> not a valid locale.  It's the code for the Berber language *group*, and
> the individual members of that group have their own 639-3 codes, so that
> seems to imply to me that those translations were tagged with the wrong
> code.

Yep, I also noticed that. I'm just not sure where exactly the border
between just a group of languages, which has no common grounds to be
spoken anywhere, and a group of very similar languages, which likely
can be understood by members of another language from the same group
and maybe even have a common written language, is.

Toddy may indeed have some more input for us here.

> Fabio also followed up and noted that there are a few translations for ber
> in Launchpad, but they're all partial and probably not usable.

Ok, I didn't get that mail. So maybe I really didn't get your initial
mail, just another mail from you to the bug report. :-)

> Tobias probably knows more, as iso-codes maintainer, but my guess is that
> this is a mistake on the Launchpad side and those translations should be
> for one of the specific languages of the group rather than being coded to
> the 639-5 language group code.  I think Lintian should still continue to
> use 639-3.
> 
> That said, I'll leave it to you to decide if you want to hang on to the
> bug or not.  :)

Thanks for your input here. Actually, that variant (the stricter one)
was my second choice so far. See the very end of that one long
mail from me. :-)

Anyway, JFTR: I just looked at how lintian in Debian Stable (i.e.
2.104.0 in Bullseye) does the locale code lookup. It had its own data
file for that (and hence now using iso-codes is good as it is no longer
duplicating these 33kB of data) and that file
(/usr/share/lintian/data/files/locale-codes) states:

  # List of locale codes.  This is derived from the ISO 639-1, ISO
  # 639-2, and ISO 639-3 standards.

And indeed, "ber" was in that file.

So previously lintian did use ISO 639-1, 639-2 and 639-3.

So using just ISO 639-3 was either an accident, on purpose or a
regression and was introduced when lintian switched to
iso-codes' files as data source in commit
https://salsa.debian.org/lintian/lintian/-/commit/fcaded19

Unfortunately this commit was tagged "Gbp-Dch: ignore" in git
(why?!?), so it didn't appear in debian/changelog. *g* (I may
retroactively add it to the debian/changelog entry of 2.115.0 like I
already added the item about switching to Text::Glob which also caused
bugs.)

Anyway, with you proposing stricter checking here and me, at least
initially, proposing to go back to the laxer parsing used
previously, it would be really good to have some additional input
from someone with a bit more experience on that topic. I hope that
Toddy can provide that. :-)

Tagging as help for that reason.

Regards, Axel
-- 
 ,''`.  |  Axel Beckert , https://people.debian.org/~abe/
: :' :  |  Debian Developer, ftp.ch.debian.org Admin
`. `'   |  4096R: 2517 B724 C5F6 CA99 5329  6E61 2FF9 CD59 6126 16B5
  `-|  1024D: F067 EA27 26B9 C3FC 1486  202E C09E 1D89 9593 0EDE



Bug#1013946: lintian: wrongly report unknown-locale-code ber

2022-06-27 Thread Russ Allbery
Axel Beckert  writes:

> Thanks for your effort, Russ! That was my first guess, too.

> But upon deeper inspection I found that this is likely not an issue in
> iso-codes as "ber" is correctly not in
> /usr/share/iso-codes/json/iso_639-3.json but in …/iso_639-2.json and
> …/iso_639-5.json as it is a code for a language group. (Which kinda
> makes it suspicious for me to be used in locales. But then again I'm
> not a linguist.)

Sorry, I followed up on the bug and forgot to explicitly cc Lintian and of
course that message didn't come through.

I worked out the same thing, and I'm fairly sure that means that this is
not a valid locale.  It's the code for the Berber language *group*, and
the individual members of that group have their own 639-3 codes, so that
seems to imply to me that those translations were tagged with the wrong
code.

Fabio also followed up and noted that there are a few translations for ber
in Launchpad, but they're all partial and probably not usable.

Tobias probably knows more, as iso-codes maintainer, but my guess is that
this is a mistake on the Launchpad side and those translations should be
for one of the specific languages of the group rather than being coded to
the 639-5 language group code.  I think Lintian should still continue to
use 639-3.

That said, I'll leave it to you to decide if you want to hang on to the
bug or not.  :)

-- 
Russ Allbery (r...@debian.org)  



Bug#1013946: lintian: wrongly report unknown-locale-code ber

2022-06-27 Thread Axel Beckert
Control: affects -1 - lintian
Control: reassign -1 lintian
Control: found -1 2.115.1

Hi Russ,

Russ Allbery wrote:
> Thanks for the report!  Lintian gets the canonical list of locales from
> the iso-codes package, and if I'm reading the last modification times from
> its Salsa repository correctly, it may have been a bit since it was
> updated.
> 
> I'm reassigning this bug to iso-codes for further investigation and cc'ing
> the maintainer.

Thanks for your effort, Russ! That was my first guess, too.

But upon deeper inspection I found that this is likely not an issue in
iso-codes as "ber" is correctly not in
/usr/share/iso-codes/json/iso_639-3.json but in …/iso_639-2.json and
…/iso_639-5.json as it is a code for a language group. (Which kinda
makes it suspicious for me to be used in locales. But then again I'm
not a linguist.)

Lintian only uses …/iso_639-3.json as of now. And according to source
code comments it thinks that ISO 639-1 and ISO 639-2 are both subsets
of ISO 639-3 which is clearly wrong.

In the end it is one of the cases where the POSIX specification is
ambiguous as it doesn't state which part of ISO 639 is relevant. (And
ISO doesn't make this easier as ISO 639-1 was just called "ISO 639"
when it was first published in 1967.)

So reassigning back to lintian. :-)

I'll implement a change in lintian which also takes ISO 639-2 into
account.

Toddy: I wouldn't be unhappy, though, if you could have a look at my
reasoning in
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1013946#33 which I
unfortunately didn't Cc to you. My main question is: Which parts of
ISO 639 are valid for usage in POSIX locales? I couldn't answer it
even after like 2 hours of digging through standards and Wikipedia.
Maybe you can. :-)

Regards, Axel
-- 
 ,''`.  |  Axel Beckert , https://people.debian.org/~abe/
: :' :  |  Debian Developer, ftp.ch.debian.org Admin
`. `'   |  4096R: 2517 B724 C5F6 CA99 5329  6E61 2FF9 CD59 6126 16B5
  `-|  1024D: F067 EA27 26B9 C3FC 1486  202E C09E 1D89 9593 0EDE



Bug#1013946: lintian: wrongly report unknown-locale-code ber

2022-06-27 Thread Axel Beckert
user lintian-ma...@debian.org
usertag 1013946 + false-positive unknown-locale-code
tag 1013946 + confirmed
retitle 1013946 lintian: [FP] Wrongly reports unknown-locale-code "ber" (POSIX locales: ISO 639-2 vs 639-3 vs 639-5)
kthxbye

Hi Fabio,

Fabio Fantoni wrote:
> Package: lintian
> Version: 2.115.1
> Severity: normal
> 
> Hi, on a lintian output I saw:
> 
> W: xapps-common: unknown-locale-code ber [usr/share/locale/ber/]
> 
> but ber locale exists:
> https://www.loc.gov/standards/iso639-2/php/langcodes_name.php?code_ID=54

thanks for your bug report. This brought up a quite difficult question:
Which parts of ISO 639 are meant to be used for POSIX locales?


Summary / TL;DR
---------------

It is currently not clear if this is really a false positive or a true
positive. It basically boils down to the question which parts of ISO
639 should be used for POSIX locales: ISO 639-2, 639-3, 639-5 or a
combination of these? Lintian currently uses only ISO 639-3, which
probably includes all of ISO 639-1 and most but not all of ISO 639-2.

And ISO 639-3 currently doesn't include "ber" (which is a
group of languages and not a language) but includes e.g. "jbe"
("Judeo-Berber"). ISO 639-2 and 639-5 though do include "ber".

For locales, POSIX refers to ISO/IEC 15897. And that one refers to ISO
639, but not explicitly to any part of it.

I came to the conclusion that this lintian check should be expanded
from using only ISO 639-3 to also using ISO 639-2 (both of which
include ISO 639-1), which makes "ber" a locale accepted by lintian.

For a more detailed reasoning and the used sources, see below.


Long Story and Reasoning
------------------------

It seems as if Lintian only takes ISO 639-3 into account, not ISO
639-2. And https://iso639-3.sil.org/about says

  At the core of ISO 639-3 are the individual languages already
  accounted for in ISO 639-2. The large number of living languages in
  the initial inventory of ISO 639-3 beyond those already included in
  ISO 639-2 was derived primarily from […]

For me, it's currently not clear if that means that all languages in
ISO 639-2 are literally included in ISO 639-3 (i.e. ISO 639-3 is a
superset of ISO 639-2) or if ISO 639-3 is just an addition to
ISO 639-2 (i.e. the languages in ISO 639-2 and ISO 639-3 are
disjoint).

In the former case, this would be a bug in the package iso-codes (or
isoquery, depending on the data model; see below), in the latter case
this would be a bug in Lintian as it would need to take ISO 639-2 into
account here, too.

And "ber" has been in ISO 639-2 since 2009 according to
https://www.loc.gov/standards/iso639-2/php/code_changes_bycode.php?code_ID=54

And "isoquery" also finds it in ISO 639-2, but not ISO 639-3:

  → isoquery -i 639-3 ber
  isoquery: The code "ber" is not defined in ISO 639-3.
  → isoquery -i 639-2 ber
  ber Berber languages
  →

And indeed, the word "ber" can only be found in the ISO 639-2 and ISO
639-5 datasets:

  → fgrep -wA1 ber /usr/share/iso-codes/json/iso_639-?.json
  /usr/share/iso-codes/json/iso_639-2.json:  "alpha_3": "ber",
  /usr/share/iso-codes/json/iso_639-2.json-  "name": "Berber languages"
  --
  /usr/share/iso-codes/json/iso_639-5.json:  "alpha_3": "ber",
  /usr/share/iso-codes/json/iso_639-5.json-  "name": "Berber languages"
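
The same lookup can be sketched in Python instead of grepping. The sketch assumes each `iso_639-?.json` file holds a single top-level key whose value is a list of entries with an `alpha_3` field (as the Lintian code for `iso_639-3.json` suggests). The function name `parts_defining` is hypothetical.

```python
import json
from pathlib import Path


def parts_defining(code, json_dir="/usr/share/iso-codes/json"):
    """Return the names of the iso_639-?.json files that define `code`."""
    found = []
    for path in sorted(Path(json_dir).glob("iso_639-?.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        # Assumption: one top-level key per file, holding a list of entries.
        for entries in data.values():
            if not isinstance(entries, list):
                continue
            if any(isinstance(e, dict) and e.get("alpha_3") == code
                   for e in entries):
                found.append(path.name)
                break
    return found


if __name__ == "__main__":
    # Prints an empty list if the iso-codes package is not installed.
    print(parts_defining("ber"))
```

Unlike a word-based grep, this only matches the `alpha_3` field, so a name that happens to contain the string would not produce a hit.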

ISO 639-5 is also said to be a "supplement" according to
https://www.loc.gov/standards/iso639-5/

Then again there are three letter codes for languages like "deu" (and
its alias "ger") for German which are in both, ISO 639-2 as well as
ISO 639-3, but not ISO 639-5. For me, this only adds to the confusion.

Relevant file is lib/Lintian/Check/Files/Locales.pm at lines 69 to 90
as of today:

 69 has ISO639_3_by_alpha3 => (
 70     is => 'rw',
 71     lazy => 1,
 72     default => sub {
 73         my ($self) = @_;
 74 
 75         local $ENV{LC_ALL} = 'C';
 76 
 77         my $bytes = path('/usr/share/iso-codes/json/iso_639-3.json')->slurp;
 78         my $json = decode_json($bytes);
 79 
 80         my %iso639_3;
 81         for my $entry (@{$json->{'639-3'}}) {
 82 
 83             my $alpha_3 = $entry->{alpha_3};
 84 
 85             $iso639_3{$alpha_3} = $entry;
 86         }
 87 
 88         return \%iso639_3;
 89     }
 90 );

Lines 100 to 122 though give a hint that the author of this code
thinks that ISO 639-3 is just a union of ISO 639-1 and ISO 639-2:

100     my %CODES;
101     for my $entry (values %{$self->ISO639_3_by_alpha3}) {
102 
103         my $type = $entry->{type};
104 
105         # https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=692548#10
106         next
107           if $type eq $RESERVED || $type eq $SPECIAL;
108 
109         # also have two letters, ISO 639-1
 ^
110  

Bug#1013946: lintian: wrongly report unknown-locale-code ber

2022-06-27 Thread Fabio Fantoni

On 27/06/2022 23:53, Russ Allbery wrote:

> Russ Allbery  writes:
>
>> Fabio Fantoni  writes:
>>
>>> Hi, on a lintian output I saw:
>>> W: xapps-common: unknown-locale-code ber [usr/share/locale/ber/]
>>> but ber locale exists:
>>> https://www.loc.gov/standards/iso639-2/php/langcodes_name.php?code_ID=54
>>
>> Thanks for the report!  Lintian gets the canonical list of locales from
>> the iso-codes package, and if I'm reading the last modification times from
>> its Salsa repository correctly, it may have been a bit since it was
>> updated.
>> I'm reassigning this bug to iso-codes for further investigation and cc'ing
>> the maintainer.
>
> Hm, sorry about the self-follow-up.  I looked at this a bit further, and
> now I think this may be intentional.  I believe ber refers to a language
> *group* rather than a single language, and thus is an ISO 639-5 code but
> not an ISO 639-3 code.
>
> Lintian specifically uses ISO 639-3 to get a list of known locale
> identifiers.
>
> Do you know if an ISO 639-5 locale works properly on a Debian system, in
> the sense that it can be used for translations and the other normal locale
> things?  ber seems to be the code for the Berber *family* of languages,
> which has multiple members that have their own ISO 639-3 codes.  I'm not
> sure that locale information for the entire family, as opposed to
> individual members of that family, makes sense.

I don't know whether the "ber" locale works, but xapp and other Cinnamon 
components ship a partial translation for it, which is why I 
encountered this warning in lintian. The translation is done on 
Launchpad: https://translations.launchpad.net/linuxmint/latest/+lang/ber


ber is supported on Launchpad and on Transifex 
(https://www.transifex.com/explore/languages/), which are the ones I 
use; I haven't checked others.


From another search just now I found this: 
https://www.debian.org/international/l10n/po/ber.en.html 
It seems ber is used in only a few packages, all with partial 
translations, so I suppose that there are no users using it; even if it 
works, there are too few translated strings.






Bug#1013946: lintian: wrongly report unknown-locale-code ber

2022-06-27 Thread Russ Allbery
Russ Allbery  writes:
> Fabio Fantoni  writes:

>> Hi, on a lintian output I saw:

>> W: xapps-common: unknown-locale-code ber [usr/share/locale/ber/]

>> but ber locale exists:
>> https://www.loc.gov/standards/iso639-2/php/langcodes_name.php?code_ID=54

> Thanks for the report!  Lintian gets the canonical list of locales from
> the iso-codes package, and if I'm reading the last modification times from
> its Salsa repository correctly, it may have been a bit since it was
> updated.

> I'm reassigning this bug to iso-codes for further investigation and cc'ing
> the maintainer.

Hm, sorry about the self-follow-up.  I looked at this a bit further, and
now I think this may be intentional.  I believe ber refers to a language
*group* rather than a single language, and thus is an ISO 639-5 code but
not an ISO 639-3 code.

Lintian specifically uses ISO 639-3 to get a list of known locale
identifiers.

Do you know if an ISO 639-5 locale works properly on a Debian system, in
the sense that it can be used for translations and the other normal locale
things?  ber seems to be the code for the Berber *family* of languages,
which has multiple members that have their own ISO 639-3 codes.  I'm not
sure that locale information for the entire family, as opposed to
individual members of that family, makes sense.

-- 
Russ Allbery (r...@debian.org)  



Bug#1013946: lintian: wrongly report unknown-locale-code ber

2022-06-27 Thread Russ Allbery
Control: reassign -1 iso-codes
Control: retitle -1 ber locale missing from iso_639-3.json
Control: affects -1 lintian

Fabio Fantoni  writes:

> Hi, on a lintian output I saw:

> W: xapps-common: unknown-locale-code ber [usr/share/locale/ber/]

> but ber locale exists:
> https://www.loc.gov/standards/iso639-2/php/langcodes_name.php?code_ID=54

Thanks for the report!  Lintian gets the canonical list of locales from
the iso-codes package, and if I'm reading the last modification times from
its Salsa repository correctly, it may have been a bit since it was
updated.

I'm reassigning this bug to iso-codes for further investigation and cc'ing
the maintainer.

-- 
Russ Allbery (r...@debian.org)  



Bug#1013946: lintian: wrongly report unknown-locale-code ber

2022-06-27 Thread Fabio Fantoni

Package: lintian
Version: 2.115.1
Severity: normal

Hi, on a lintian output I saw:

W: xapps-common: unknown-locale-code ber [usr/share/locale/ber/]

but ber locale exists: 
https://www.loc.gov/standards/iso639-2/php/langcodes_name.php?code_ID=54