/show_language.asp?code=LPR
Dialects RUIJA, TORNE, SEA LAPPISH. Ruija is the Finnish name for the
territory covered by Northern-Troms and Finnmark provinces (fylker); it's
not a dialect.
The Ethnologue data is drawn from multiple sources and, while every effort
is made to obtain the most reliable information
[EMAIL PROTECTED]
Sent: 17 February 2002 01:54
Subject: SV: Analysis of ISO 639 and mappings to SIL Ethnologue
The Norwegian and Sami language pages on this web site are unfortunately
so full of errors that they should be removed or corrected immediately
in order to avoid misleading
PROTECTED];
[EMAIL PROTECTED]; [EMAIL PROTECTED]; Trond
Trosterud ; Håvard Hjulstad
Subject: Re: Analysis of ISO 639 and mappings to SIL Ethnologue
Isn't that *both* the Finnish name for the territory *and*
the Finnish name for the dialect?
Stefan
No, not as far as I know. Since you refer
[apologies in advance to those who receive this multiple times]
In connection to work that Gary Simons and I have been doing in
interaction with ISO/TC 37/SC 2/WG 1, we have added some new pages to the
Ethnologue web site that present an analysis we have done of the existing
ISO 639 language
After considerable and unfortunate delay, the new Ethnologue site,
including the online version of the 14th Edition, is at last available to
the public: http://www.ethnologue.com/home.asp. There are still refinements
being made, but all the basics are there and working.
- Peter
After considerable and unfortunate delay, the new Ethnologue site,
including the online version of the 14th Edition, is at last available to
the public: http://www.ethnologue.com/home.asp. There are still refinements
being made, but all the basics are there and working.
Very nice
At 7:18 AM -0800 11/23/00, Christopher John Fynn wrote:
Spoken language is not necessarily at all the same
thing as written language.
There are e.g. plenty of mutually incomprehensible
forms of spoken English which might each deserve a
code in a standard for spoken languages but
Elliotte Rusty Harold [EMAIL PROTECTED] wrote:
At 7:18 AM -0800 11/23/00, Christopher John Fynn wrote:
Spoken language is not necessarily at all the same thing as
written language. There are e.g. plenty of mutually
incomprehensible forms of spoken English which might each deserve
Elliotte Rusty Harold wrote:
I've yet to encounter a spoken
version of English that I couldn't understand, after at most a couple
of minutes of accustoming myself to the accent.
You live in a country where dialect differentiation is a feeble thing,
consisting mainly in pronunciation, and
John Cowan noted:
In general, Geordie (the traditional dialect spoken around the Tyne
River in England) is considered to be the English dialect most difficult
for North Americans.
To that I would add Glaswegian. When watching the
Scots-produced mystery shows that show up on PBS in the
Kenneth Whistler wrote:
To that I would add Glaswegian. When watching the
Scots-produced mystery shows that show up on PBS in the United
States on occasion, my wife and I often turn to each other
in bafflement and say, "Subtitles, please."
Scots is a separate language! If you understand
John Cowan replied:
Kenneth Whistler wrote:
To that I would add Glaswegian. When watching the
Scots-produced mystery shows that show up on PBS in the United
States on occasion, my wife and I often turn to each other
in bafflement and say, "Subtitles, please."
Scots is a separate
On Thu, 30 Nov 2000, Kenneth Whistler wrote:
Scots is a separate language! If you understand anything at all
it's by a happy accident. (There is of course Scots-flavored
English as well, which is another matter.)
I was, of course, referring to Scots (alleged) English, and not
to
Peter Constable wrote:
This is a good example of why an enumeration of "languages"
based only on written forms (as found in ISO 639) is
insufficient for all user needs.
Of course ISO 639 is insufficient for *all* user needs
- no standard is. And is there actually a remit for
ISO 639 to
,
de-Arabicized.
Literary Hindi, or Hindi-Urdu, has four varieties: Hindi (High Hindi, Nagari
Hindi, Literary Hindi, Standard Hindi)...
/CITE
from the online Ethnologue database, 13th ed.
URL:http://www.sil.org/ethnologue/countries/Inda.html#HND
Mm. Maybe
Peter Constable wrote:
SRC is the code for 'Bosnian', 'Croatian', and 'Serbo-Croatian', which
means that there is a many-to-one mapping from ISO 639-1 'bs', 'hr',
'sr' to Ethnologue 'SRC'.
By Ethnologue standards of mutual intelligibility, there is only one
language here.
Well
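The many-to-one relationship described in this message can be sketched as a small lookup table. This is only an illustration: the table and function names are hypothetical, and the table holds just the three pairs named in the message, not a real mapping file.

```python
# Hypothetical lookup table; it lists only the pairs named in the message.
ISO_639_1_TO_ETHNOLOGUE = {
    "bs": "SRC",
    "hr": "SRC",
    "sr": "SRC",
}


def iso_codes_for(ethnologue_code):
    """Return every ISO 639-1 code that maps to the given Ethnologue code."""
    return sorted(
        iso
        for iso, eth in ISO_639_1_TO_ETHNOLOGUE.items()
        if eth == ethnologue_code
    )


# Going from ISO to Ethnologue is a plain lookup; going back is ambiguous.
print(ISO_639_1_TO_ETHNOLOGUE["hr"])  # SRC
print(iso_codes_for("SRC"))           # ['bs', 'hr', 'sr']
```

The reverse direction returns a list rather than a single code, which is the practical consequence of the many-to-one mapping.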
, has four varieties: Hindi (High Hindi, Nagari
Hindi, Literary Hindi, Standard Hindi)...
/CITE
from the online Ethnologue database, 13th ed.
URL:http://www.sil.org/ethnologue/countries/Inda.html#HND
Mm. Maybe a more polite (more PC) turn of phrase
se of
"co-" to denote subsidiary status, as in "co-pilot." I suspect the
Ethnologue staff intended the former (polite?) sense, but it could be
interpreted either way as desired.
What fun language is!
-Doug Ewell
Fullerton, California
plane".
(repeated several times in different messages)
Agreed. This is a refreshing departure from the position I perceived
earlier, that ISO 639 was severely broken and the Ethnologue approach
was inherently superior. The truth, of course, is that each approach
has its advantages and drawbacks fo
a separate, contrasting sense of
"co-" to denote subsidiary status, as in "co-pilot." I suspect the
Ethnologue staff intended the former (polite?) sense, but it could be
interpreted either way as desired.
What fun language is.
As far as I'm aware the co- prefix does mean an equa
Kevin Bracey said:
As far as I'm aware the co- prefix does mean an equal grouping. Examples that
spring to mind are co-worker, co-conspirator, co-exist, coincidence and
co-operative. I thought co-dialects was a cunningly concise way of saying
that they could all be considered dialects of
[Apologies if you already got this. It seems to be bouncing, and so am
sending it again.]
On 09/21/2000 10:52:22 AM Doug Ewell wrote:
[snip]
Agreed. This is a refreshing departure from the position I perceived
earlier, that ISO 639 was severely broken and the Ethnologue approach
On 09/16/2000 04:27:45 PM Doug Ewell wrote:
All I am asking in this particular case is for the Ethnologue editor to
assign *one* primary name (and spelling) to each three-letter language
code, and to relegate the other names to alternate status in a
consistent way. That is the first necessary
On 09/17/2000 03:19:32 PM Doug Ewell wrote:
Well, perhaps this is another, unintended example of a problem with
incorporating the Ethnologue linguistic distinctions into other
standards without serious review. If Spaniards consider their language
sufficiently different from the Spanish spoken
On 09/17/2000 11:39:14 AM Doug Ewell wrote:
What names am I supposed to associate with codes like SHU, MKJ, and
SRC in my (possibly hypothetical) application that deals with language
tags? Such associations are normally expected to be one-to-one.
If Ethnologue codes are going to be regarded
On 09/17/2000 07:22:05 PM "Carl W. Brown" wrote:
You are right the Ethnologue is not appropriate as a standard.
If we're assuming a single standard, in the sense of a single "tiling of
the plane" of languages, we're not proposing that the Ethnologue be the
standard. We ar
-arc, and Aramaic is arc. If
you want to specify Assyrian Neo-Aramaic specifically, you can use
i-sil-aii.
John is absolutely correct here, and I need to qualify my agreement to
Carl's statement along exactly the lines John is indicating here.
Ethnologue can supplement ISO codes, but we're
"i-" and "x-" codes, namely that ISO 639-1 codes must be used
whenever possible, followed in turn by ISO 639-2 codes,
Absolutely.
"i-sil-xxx"
Ethnologue codes (whoops, John, that's a real code (for Keo)), other
"i-" codes, and finally "x-" codes
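The order of preference discussed in this message (ISO 639-1, then ISO 639-2, then "i-sil-xxx" Ethnologue codes, then other "i-" codes, then "x-" codes) can be sketched as below. This is a rough illustration only: the function and parameter names are hypothetical, and the "i-sil-" prefix is the form proposed in the thread, not an assumption about what is actually registered.

```python
def choose_tag(iso639_1=None, iso639_2=None, ethnologue=None,
               other_i=None, private=None):
    """Pick a language tag using the precedence discussed in the thread.

    All parameter names are hypothetical; each is the code available for
    the language in that scheme, or None if the scheme has no code for it.
    """
    if iso639_1:
        return iso639_1.lower()
    if iso639_2:
        return iso639_2.lower()
    if ethnologue:
        # "i-sil-xxx" is the form proposed in the discussion.
        return "i-sil-" + ethnologue.lower()
    if other_i:
        return "i-" + other_i.lower()
    if private:
        return "x-" + private.lower()
    raise ValueError("no code available for this language")


print(choose_tag(iso639_2="arc"))       # arc
print(choose_tag(ethnologue="AII"))     # i-sil-aii
print(choose_tag(other_i="klingon"))    # i-klingon
```

Each scheme is consulted only when every higher-priority scheme has no code, so an ISO code always wins over an Ethnologue code for the same language.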
On 09/17/2000 11:13:36 PM John Cowan wrote:
Exactly so. And BTW "my proposal" is also Harald Alvestrand's proposal.
I wasn't aware of that until Harald mentioned something not too many days
ago.
- Peter
and
individuals are already using Ethnologue codes in this way precisely
because ISO provides very limited coverage.
I agree. For example, when it was brought up that other Turkic languages
might be using the dotless i, I noticed that the SIL confirmed that
Azerbaijan uses the Latin alphabet
aijan uses the Latin alphabet. On the other hand, it said that Urum was
"Spoken by ethnic 'Greeks'". Unless this is some kind of inside joke, I
cannot imagine any Greek having anything to do with anything Turkish.
Apart from cohabiting in Anatolia for a millennium. :-) In any case, the
E
From: Nick Nicholas [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, September 20, 2000 4:48 PM
Apart from cohabiting in Anatolia for a millennium. :-) In any case, the
Ethnologue is correct about Urum; Urum and Mariupolitan Greek are the two
languages spoken by an ethnically Greek population, which
08:21:04 AM Michael Everson wrote:
[snip]
The Ethnologue lists six different Ancash Quechuas, five different Huánuco
Quechuas, and a lot of other Quechuas besides. It's got five kinds of
Italian. How do we evaluate this? And I don't know how many Zapotecos;
there are too many to count. Do we just
John Cowan [EMAIL PROTECTED] wrote:
Doug wants the Ethnologue to give each of its languages (uniquely
tagged) a single unique worldwide authoritative name. That's not
reasonable in all cases, though it is in 99.5%.
What names am I supposed to associate with codes like SHU, MKJ, and
SRC
an
they reserve for their own (pure) Spanish
Well, perhaps this is another, unintended example of a problem with
incorporating the Ethnologue linguistic distinctions into other
standards without serious review. If Spaniards consider their language
sufficiently different from the Spanish spo
://www.i18nWithVB.com/
- Original Message -
From: "Doug Ewell" [EMAIL PROTECTED]
To: "Unicode List" [EMAIL PROTECTED]
Sent: Sunday, September 17, 2000 1:19 PM
Subject: Re: [OT] Re: the Ethnologue
Michael Kaplan [EMAIL PROTECTED] wrote:
Spaniards generally refer to the
Michka wrote :
Most seem to be okay with the addition of the country/region tag from
ISO-3166 for determining the difference between languages spoken in several
places -- this is usually what is done for English, Arabic, Portuguese,
French, and Chinese, as well.
I don't see how one can use
Well, to cover THAT level of variation, there is only the Ethnologue that I
have ever seen. But the specific question was about language differences
that ISO *can* cover.
michka
a new book on internationalization in VB at
http://www.i18nWithVB.com/
- Original Message -
From: "C
John Cowan wrote:
I see the problem: the same language (with the same code) may be preferentially
known by one name in one country and another name in another. Because
the Ethnologue names languages by country, conflicts like this can appear.
The entry on "Chadian Spoken Arabic" (in C
On Sun, 17 Sep 2000, Carl W. Brown wrote:
I can understand your point of view as a standards person.
You are right the Ethnologue is not appropriate as a standard. But that
does not make it useless.
I am not a "standards person", and I think you have my stand mixed up.
I a
From: "John Cowan" [EMAIL PROTECTED]
Besides, I cannot
take any standard that implements i-klingon as a human language too
seriously.
Why not? Human beings speak it (some more fluently than others), and
write texts in it. Just follow the links from www.kli.org. It is not
anybody's
Michael Kaplan [EMAIL PROTECTED] wrote:
Don't forget to use 1554 (0x0612) if you need a Windows LCID for
Klingon - Latin and 2578 (0x0A12) for Klingon - pIqaD.
There's nothing more powerful than a user defined area. :-)
This is, at once, the best argument for and the best argument against
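For readers wondering why those two numbers count as a "user defined area", here is a minimal sketch of how a Windows language identifier splits into its parts. It assumes the commonly documented layout (low 10 bits for the primary language, next 6 bits for the sublanguage, primary values 0x200 to 0x3FF reserved for user definition). The function names are hypothetical, and the Klingon values are the ones given in the message, not official assignments.

```python
def split_langid(lcid):
    """Split an LCID into (primary language ID, sublanguage ID).

    Assumes the low 16 bits of the LCID are the language identifier,
    with the primary language in bits 0-9 and the sublanguage in bits 10-15.
    """
    langid = lcid & 0xFFFF
    primary = langid & 0x3FF
    sublang = langid >> 10
    return primary, sublang


def is_user_defined_primary(primary):
    """True if the primary language ID falls in the user-defined range."""
    return 0x200 <= primary <= 0x3FF


for lcid in (1554, 2578):  # the two values quoted in the message
    primary, sublang = split_langid(lcid)
    print(hex(lcid), hex(primary), sublang, is_user_defined_primary(primary))
# 0x612 0x212 1 True
# 0xa12 0x212 2 True
```

Both values share the primary language 0x212 and differ only in the sublanguage (1 and 2), which matches the message's pairing of one language with two scripts.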
.
The Ethnologue lists six different Ancash Quechuas, five different Huánuco
Quechuas, and a lot of other Quechuas besides. It's got five kinds of
Italian. How do we evaluate this? And I don't know how many Zapotecos;
there are too many to count. Do we just accept that it's all been evaluated?
Well, then we
Here's another thing about the Ethnologue list that has been almost,
but not quite, addressed. Just so everyone knows, the point here is
*NOT* that the six or seven thousand additional languages in Ethnologue
are somehow not worthy of encoding, but that the list is incompletely
edited
At 08:46 -0800 2000-09-16, Doug Ewell wrote:
Here's another thing about the Ethnologue list that has been almost,
but not quite, addressed. Just so everyone knows, the point here is
*NOT* that the six or seven thousand additional languages in Ethnologue
are somehow not worthy of encoding
: the same language (with the same code) may be preferentially
known by one name in one country and another name in another. Because
the Ethnologue names languages by country, conflicts like this can appear.
The entry on "Chadian Spoken Arabic" (in Chad) lists "Shuwa Arabic" as
From: "John Cowan" [EMAIL PROTECTED]
It seems clear from the detailed information that in all 14 cases,
there is only one language, known by different names in different
countries. Expecting the Ethnologue to solve this problem by fiat,
or even to openly prefer one name ov
On 09/14/2000 04:59:55 AM JÖRG KNAPPEN wrote:
What really makes me wonder is that the Ethnologue seems to ignore the
vast amount of published information on the German language and its
dialects.
There is more than a century of dialectological research on German, and
there are easily accessible
On Wed, 13 Sep 2000, Michael Everson wrote:
It names Hancock 1990 as the source of this (impossibly incorrect)
information. In the bibliography there is no Hancock 1990.
Just like The Unicode Standard Version 3.0, page 317, which names
ISIRI 3342 as a source for ZWJ and ZWNJ, but there's no
: The Ethnologue treats Valencian as a dialect of Catalan, which
is correct based on the mutual intelligibility criterion, but they have distinct
orthographies. Unfortunately, the two are in the same country, so the 3166
trick (en-us vs. en-gb, e.g.) doesn't work. (If Valenciana has a 3166-2
regional code
On 09/14/2000 10:29:52 AM John Cowan wrote:
In a nutshell: The Ethnologue treats Valencian as a dialect of Catalan,
which is correct based on the mutual intelligibility criterion, but they have
distinct orthographies. Unfortunately, the two are in the same country, so the
3166 trick (en-us vs
Roozbeh wrote:
On Wed, 13 Sep 2000, Michael Everson wrote:
It names Hancock 1990 as the source of this (impossibly incorrect)
information. In the bibliography there is no Hancock 1990.
Just like The Unicode Standard Version 3.0, page 317, which names
ISIRI 3342 as a source for ZWJ and
ging lists? Just
wonderin'...
I have no problem with that whatsoever. Creating an alternate
namespace mechanism with Ethnologue codes in a separate
namespace seems to offer exactly what you describe.
I'm wary of having two competing namespaces. As an alternative,
I'd like to suggest something on
Rick McGowan asked:
Can anyone point me to an existing list of languages that is more
comprehensive and better researched than the Ethnologue? If there is no
such list, then we don't need to consider any alternatives, right?
Ask the closest university department of comparative
with fair consistency. The Ethnologue is a place to start.
Can anyone point me to an existing list of languages that is more
comprehensive and better researched than the Ethnologue? If there is no
such list, then we don't need to consider any alternatives, right?
I agree with everything Rick has s
At 09:19 -0800 2000-09-12, [EMAIL PROTECTED] wrote:
First, by the definitions assumed in the Ethnologue, they are all
considered to be distinct languages; they would be candidates for separate
literacy and literature development (if currently spoken-only), and if
literature were
At 23:56 +0100 2000-09-12, Christopher J. Fynn wrote:
A lot of what are listed as "languages" in the Ethnologue are what most people
would call dialects. For instance almost every known dialect of spoken Tibetan
is listed as a separate language in the Ethnologue although they all
The Library of Congress is very closely involved with ISO 639-2.
In fact, it is mostly their list of codes.
Misha
Oh Michael...
I think there are codes given to entities in the Ethnologue list that
aren't languages in the sense that we need to identify languages
it is failing to adequately serve some.
...
Furthermore, we would contend that the categories enumerated in the
Ethnologue by-and-large *are* the categories that need to be identified for
general IT purposes. In the majority of cases, the distinctions made are
those that would be needed
At 02:10 AM 9/14/2000 -0700, [EMAIL PROTECTED] wrote:
The problem here is that ISO639 has, for better or worse, been adopted by
a wide array of DIFFERING applications. It's a convenience standard that
we vaguely have to live with.
No, it's an inconvenience standard that we vaguely have to live
Re the Linguasphere, Peter C wrote:
- As Chris mentioned, the info isn't available online.
Actually, the Linguasphere is available on-line, if you pay for it... One hundred
sixty pounds sterling (two hundred seventy-five US dollars) for a license to use the
electronic version.
Rick
With English, the problem with spell checking is quite different, and different
lists of words would not be as easy for a solution: the en-US vs. en-GB
tagging does not seem to adequately cover the various differences such as
-ise vs. -ize, -our vs. -or, -re vs. -er, use of shall vs.
Otto Stolz wrote:
I think the Ethnologue lacks information about variant orthographies.
Yes, it does. But that's OK, because we can make a composite tagging system that tags
orthography separately from language.
So... does anyone have a comprehensive list of orthographies?
Rick
From: Arnt Gulbrandsen [mailto:[EMAIL PROTECTED]]
Are there valid reasons why the imperfect but comprehensive needs to be a
standard? I can see one reason for it _not_ to be a standard: A list can
be added to faster, so it's easier for a list to be truly comprehensive.
Michael Everson wrote (amplified by me):
tire, civilize, color, center (US)
tyre, civilize, colour, centre (GB-Oxonia)
tyre, civilise, colour, centre (GB-Demotica)
tire, civilise, colour, centre (CA)
I have seen a photograph of an actual Canadian sign saying "Tire Centre",
which in GB
st.
Misha
[This mail was written using voice recognition software]
Perhaps
another organization (like the Unicode Consortium) could take it upon
itself to massage the Ethnologue language list and add corrections,
deletions, and insertions; and put the new list on-line as "the most
up-to-d
On 09/13/2000 01:39:37 AM JÖRG KNAPPEN wrote:
I once looked at the Ethnologue and its subdivision of the German language
is just ridiculous. Not small errors, a gross misconception. I don't trust
the Ethnologue in areas where I don't know the facts well, since it fails in
one area where I know
trying to find the perfect solution.
Without such an approach, any new standard work will be plagued with
exactly the kind of inconsistencies that make both ISO 639 and the
Ethnologue of dubious merit for IT purposes.
I don't understand assertions that the Ethnologue is of dubious merit
are listed as "languages" in the Ethnologue are what most people
would call dialects. For instance almost every known dialect of spoken
Tibetan is listed as a separate language in the Ethnologue although they all
share only one written form.
YES. This is one of the serious problems o
, it is not clear that an attempt to adopt
a comprehensive enumeration of languages will lead to many more problems.
There will *always* be somebody who says they need something different. On
the other hand, if we use the Ethnologue to add coverage for lesser-known
languages to existing systems, many users
On 09/13/2000 06:37:25 AM Michael Everson wrote:
First, by the definitions assumed in the Ethnologue, they are all
considered to be distinct languages; they would be candidates for separate
literacy and literature development (if currently spoken-only), and if
literature were to be developed
On 09/13/2000 11:59:01 AM Rick McGowan wrote:
Re the Linguasphere, Peter C wrote:
- As Chris mentioned, the info isn't available online.
Actually, the Linguasphere is available on-line, if you pay for it... One
hundred sixty pounds sterling (two hundred seventy-five US dollars) for a
license
..
I have no problem with that whatsoever. Creating an alternate namespace
mechanism with Ethnologue codes in a separate namespace seems to offer
exactly what you describe.
- Peter
---
Peter Constable
Non-Roman Script Initi
that this issue is orthogonal to the country code of RFC 1766.
E.g., de-AT, de-CH and de-DE could each be spelled either the 1902 or the
1996 way. Hence the spelling subtag and the country subtag should be
optional, independent of each other.
I would agree.
I think the Ethnologue lacks
I think there are codes given to entities in the Ethnologue list that aren't
languages in the sense that we need to identify languages in IT and in
Bibliography (which is what the codes are for). I think that it is not
mature for International Standardization. It is a work in progress, subject
On 09/12/2000 12:18:37 PM Michael Everson wrote:
I think there are codes given to entities in the Ethnologue list that aren't
languages in the sense that we need to identify languages in IT and in
Bibliography (which is what the codes are for).
Perhaps there is a cat that needs to be let out
Oh Michael...
I think there are codes given to entities in the Ethnologue list that
aren't languages in the sense that we need to identify languages in IT
and in Bibliography
ISO 639, and every other "standard" for language/locale codes also has this problem,
and from what
Can anyone point me to an existing list of languages that is more
comprehensive and better researched than the Ethnologue?
If there is no such list, then we don't need to consider any
alternatives, right?
I'm not qualified to judge the merits of one list over another
but there certainly
78 matches