Re: [Foundation-l] Languages and numbers

2011-07-02 Thread Amir E. Aharoni
2011/7/1 Milos Rancic mill...@gmail.com:
> As Russia is a fairly developed country, it is likely that reaching people
> who speak those languages and teaching them how to use Wikimedia
> projects would be the task for WM RU. Besides that, I think that all
> languages of Russia have writing systems and support in Unicode.

Actually, a few small languages in Northern and Eastern Russia don't
have writing systems, but at least for some of them one is being
developed by the government.

And all the current languages of Russia are indeed supported in
Unicode, but in a few discussions i had just a couple of weeks ago i
learned the shocking truth: While we have taken Unicode for granted for
about a decade, it is not so for quite a lot of people around the
globe. In less developed parts of Russia there are still computers
with Windows 98 and even earlier, and Unicode support there is poor to
non-existent. Maybe in Russia WM-RU can indeed handle this - for
example, by organizing the sending of donated second-hand computers to
key organizations in these regions (schools, libraries, local
newspapers etc.)

This, however, happens in many other countries, some of which need
Unicode even more desperately than these Russian regions, and which
don't have a chapter. For example, Ethiopia. There the Foundation or
other chapters will be able to help. WM-IL, for example, sent
second-hand computers pre-installed with Ubuntu and offline Wikipedia
to African countries, and maybe other chapters did similar things,
too.

Long story short: Unicode support cannot be taken for granted, but
something can be done about it.

___
foundation-l mailing list
foundation-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l


Re: [Foundation-l] Languages and numbers

2011-07-01 Thread Yaroslav M. Blanter
Milosh, thanks for your work. Just to correct: Moksha, Erzya, Yakut
(=Sakha), Komi-Zyrian (=Komi) and Lak all have Wikipedias (though
admittedly for Lak I am the only active contributor). Adyge is almost
identical to Kabardino-Circassian, and Adyge speakers probably will never
have their own Wikipedia. Balkar is a part of Karachai-Balkar which has a
Wikipedia.

Cheers
Yaroslav


> Russian Federation
> 783720 Lezgi
> 696630 Erzya
> 614000 Moksha
> 516490 Dargwa
> 499300 Adyghe
> 460090 Mari, Meadow
> 422550 Kumyk
> 413000 Ingush
> 363000 Yakut
> 264400 Tuva
> 217000 Komi-Zyrian
> 164420 Lak
> 128900 Tabassaran
> 113710 Balkar



Re: [Foundation-l] Languages and numbers

2011-07-01 Thread Amir E. Aharoni
2011/7/1 Yaroslav M. Blanter pute...@mccme.ru:
> Adyge is almost
> identical to Kabardino-Circassian, and Adyge speakers probably will never
> have their own Wikipedia.

From what i hear about this, Adyge and Kabardian may be two varieties
of a Circassian [[macrolanguage]]. Maybe someone who cares about it
will submit a request to ISO to consider redefining their codes
accordingly.

The recently created Kabardian Wikipedia ( kbd.wikipedia.org ) is
developing quite nicely. It already has contributors in both varieties
of this language and they get along well.

--
Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
http://aharoni.wordpress.com
We're living in pieces,
 I want to live in peace. - T. Moore



Re: [Foundation-l] Languages and numbers

2011-07-01 Thread Milos Rancic
On 07/01/2011 01:24 PM, Yaroslav M. Blanter wrote:
> Milosh, thanks for your work. Just to correct: Moksha, Erzya, Yakut
> (=Sakha), Komi-Zyrian (=Komi) and Lak all have Wikipedias (though
> admittedly for Lak I am the only active contributor). Adyge is almost
> identical to Kabardino-Circassian, and Adyge speakers probably will never
> have their own Wikipedia. Balkar is a part of Karachai-Balkar which has a
> Wikipedia.

Thanks! I've updated the database for those which have Wikipedias.

As Russia is a fairly developed country, it is likely that reaching people
who speak those languages and teaching them how to use Wikimedia
projects would be the task for WM RU. Besides that, I think that all
languages of Russia have writing systems and support in Unicode.



Re: [Foundation-l] Languages and numbers

2011-06-27 Thread Milos Rancic
On 06/27/2011 12:30 AM, M. Williamson wrote:
> Some of these actually already have Wikipedias:
>
> Meadow Mari
> Yakut (aka Sakha)
> Lak
> Balkar (aka Karachay-Balkar)
> Yiddish, Eastern (= standard Yiddish, Western Yiddish is the one we are
> missing but it has far fewer speakers; according to Ethnologue there are
> only 5,400 around the world)
>
> In addition, in another message you stated that we probably had Wikipedias
> in every Sinitic language that was distinct enough from Mandarin to receive
> its own Wikipedia; Min Bei has 10.3 million speakers and does not have a
> Wikipedia and is definitely far removed from Mandarin; Xiang is also
> probably deserving of its own Wikipedia and has 30 million+ speakers.

Thanks for the corrections!

As for Han languages, because of the languages which you mentioned, I
intentionally left all of them in. Obviously, they will be analyzed on
a case-by-case basis.

But Han languages are not endangered, China is a fairly developed
country, and their basic written language needs are covered by CJK
characters and fonts etc. If they want to have a Wikipedia, it is likely
that they would get it, but it is not a priority.

If we are talking about languages of China, Hmong–Mien (or Miao–Yao)
languages, for example, should be more in focus, as some of them have
enough speakers to create viable Wikimedia projects if supported
(Chuanqiandian Cluster Miao has 1.4M speakers).



Re: [Foundation-l] Languages and numbers

2011-06-27 Thread Milos Rancic
More data can be found at [1]. It shows the coverage of languages by
Wikimedia projects by population size, on a logarithmic scale.

Numbers are not a surprise.

[1]
https://spreadsheets.google.com/spreadsheet/ccc?key=tCwO11tFPLPB-SJafDesypgauthkey=CPCE5pMB#gid=1




Re: [Foundation-l] Languages and numbers

2011-06-26 Thread M. Williamson
Some of these actually already have Wikipedias:

Meadow Mari
Yakut (aka Sakha)
Lak
Balkar (aka Karachay-Balkar)
Yiddish, Eastern (= standard Yiddish, Western Yiddish is the one we are
missing but it has far fewer speakers; according to Ethnologue there are
only 5,400 around the world)

In addition, in another message you stated that we probably had Wikipedias
in every Sinitic language that was distinct enough from Mandarin to receive
its own Wikipedia; Min Bei has 10.3 million speakers and does not have a
Wikipedia and is definitely far removed from Mandarin; Xiang is also
probably deserving of its own Wikipedia and has 30 million+ speakers.



Re: [Foundation-l] Languages and numbers

2011-06-25 Thread Isabell Long
Hi,

On 25 Jun 2011, at 05:52, Milos Rancic mill...@gmail.com wrote:

> While preparing Missing Wikipedias [1], I've got numbers of speakers and
> languages, by area and by country with a chapter, not covered by Wikipedias.

Fascinating!  Thanks for the work!  :-)

Isabell.



Re: [Foundation-l] Languages and numbers

2011-06-25 Thread Milos Rancic
Forwarding Deryck Chan's email and my response, at his request.

-------- Original Message --------
Subject: Re: [Internal-l] Fwd: [Foundation-l] Languages and numbers
Date: Sat, 25 Jun 2011 13:55:58 +0200
From: Milos Rancic mill...@gmail.com
To: Deryck Chan deryckc...@gmail.com

On 06/25/2011 01:28 PM, Deryck Chan wrote:
> (sorry, am on mobile, can't post to list. Feel free to forward this onto
> the list)
>
> 2 obvious queries:
> 1. How are we going to do a Wikipedia on... Indian Sign Language?
> 2. If we exclude the Chinese languages from the table (which is a move I
> agree with), we should also exclude all other languages which defer to
> the standard written form of a related language that has a Wikipedia,
> e.g. Mainfränkisch (because we have a standard German Wikipedia).

1. There are requests for Wikipedias in sign languages (search for "sign
language" here [1]). They intend to use SignWriting [2]. We are waiting
for implementation of top-to-bottom writing to be able to host sign
languages.

2. I didn't say that we should exclude Chinese languages, but that it is
likely that some of them should be excluded. If they are so close to
Mandarin that there is no significant difference in writing, yes. If not,
no. But I think that all of the Han languages not closely related to
Mandarin already have their own Wikipedia.

Note, also, that there is a request for a Wikipedia in Swabian [3], and
that there are a number of Wikipedias in German languages. So, it's up to
them to decide what they want. Besides that, Standard Chinese is one
thing and Standard German is another. A logographic script allows many
more varieties to be covered than a phonetic one. For example, with
a logographic script Serbian and English could be written in one
orthography (while English and German, or Serbian and Bulgarian, could
not with a phonetic one).

Besides that, I intentionally categorized Han China, Korea and Japan
together (as East Asia) because it is unlikely that WMF needs to do
anything there. All countries are developed enough (OK, North Korea is
not, but there is South Korea) and languages in those areas are doing
well enough. That's true for most languages of countries which are
OECD members.

The main purpose of this document is to point to the large populations
without a Wikipedia in their native language. India, Indonesia and
the Philippines will be in focus, obviously.

[1] http://meta.wikimedia.org/wiki/Requests_for_new_languages
[2] http://en.wikipedia.org/wiki/SignWriting
[3]
http://meta.wikimedia.org/wiki/Requests_for_new_languages/Wikipedia_Swabian



Re: [Foundation-l] Languages and numbers

2011-06-25 Thread Bishakha Datta
I posted this on the India list (many people are not subscribed to
foundation-l) - forwarding this question which just popped up.

Bishakha

---------- Forwarded message ----------
From: Vickram Crishna vvcris...@radiophony.com
Date: Sat, Jun 25, 2011 at 6:08 PM
Subject: Re: [Wikimediaindia-l] Fwd: [Foundation-l] Languages and numbers
To: Wikimedia India Community list wikimediaindi...@lists.wikimedia.org


It is fascinating, although I think I may not have understood the
classifications. Is there only one Indian Sign Language, for instance? I was
told by a user (in the UK) that several are in use in different parts of the
country. Still, perhaps the variants do not have sufficient numbers of users
to qualify for this listing. However, the context in which I was told was
precisely the severe lack of support materials for helping users become
self-sufficient and good communicators, so the list itself becomes a
barrier.

Unfortunately, I do not know at the moment how to fix the problem.

[296,097,274 349 India]

Does the population number mean that the existing Indic language Wikipedias
cover the rest of the population, i.e. over 90 crore? Is this information
updated from the current census?



Re: [Foundation-l] Languages and numbers

2011-06-25 Thread Milos Rancic
On 06/25/2011 03:11 PM, Bishakha Datta wrote:
> I posted this on the India list (many people are not subscribed to
> foundation-l) - forwarding this question which just popped up.

First of all, although the numbers look fascinatingly precise, they are far
from that. When you make a sum of approximations like
~1M+800k+30k+4k+700+20+ the language spoken by three individuals, you
will get the fascinating number 1,834,723. So, the numbers are far from
census-level precision.
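
The arithmetic in that example can be checked with a short sketch. The helper name `round_to_significant` is hypothetical and the choice of two significant digits is only an assumption about how such a sum might be reported more honestly; nothing here comes from the actual database.

```python
from math import floor, log10

# Approximate speaker counts from the example above; the final 3 stands for
# the language spoken by three individuals.
estimates = [1_000_000, 800_000, 30_000, 4_000, 700, 20, 3]

def round_to_significant(value, digits=2):
    """Round value to the given number of significant digits (hypothetical helper)."""
    if value == 0:
        return 0
    magnitude = floor(log10(abs(value)))
    factor = 10 ** (magnitude - digits + 1)
    return round(value / factor) * factor

total = sum(estimates)
print(total)                           # 1834723
print(round_to_significant(total, 2))  # 1800000
```

The exact sum carries seven digits, but the largest terms are only known to about one significant digit, so something like 1.8M is closer to what the data supports.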

All of the numbers are based on Ethnologue data [1], which varies from
very good to very bad approximations. Ethnologue also varies a lot in
linguistic classification. (Being educated in Serbian linguistics,
I know how bad the description of the South Slavic area is.) BUT, it is
the best source covering all languages of the world that has ever been
made, and it gives a good general picture.

> [296,097,274 349 India]
>
> Does the population number mean that the existing Indic language
> Wikipedias cover the rest of the population, i.e. over 90 crore? Is
> this information updated from the current census?

By making a quick approximation of the number of speakers of some large
official languages of India [2], and not counting English, I came to
a figure of ~650M and stopped counting (BTW, that includes the figure
of 180M Hindi speakers from 1991; given the population
growth in India, there should be at least 250M Hindi speakers today).
Thus, I think that ~300M more could be accounted for by other languages
with Wikipedias and by adjusting the existing numbers for population
growth. (I could make a more precise calculation if needed, but I would
need some time.) It should also be noted that the dates of the entries in
Ethnologue vary a lot and that some of them could be 20 years old or more.
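
The Hindi adjustment can be sketched as a proportional scaling by overall population growth. The two census totals below are approximate figures brought in purely as assumptions for illustration (they are not from this thread), the function name is hypothetical, and the sketch assumes the share of Hindi speakers stayed constant, which may not hold.

```python
# Approximate census totals for India; assumptions for illustration only,
# not taken from the thread or from Ethnologue.
INDIA_POPULATION_1991 = 846_000_000
INDIA_POPULATION_2011 = 1_210_000_000

def scale_by_population(speakers_then, population_then, population_now):
    """Scale an old speaker count by overall population growth (hypothetical helper)."""
    return speakers_then * population_now / population_then

hindi_1991 = 180_000_000  # the 1991 figure quoted above
hindi_2011 = scale_by_population(
    hindi_1991, INDIA_POPULATION_1991, INDIA_POPULATION_2011
)
print(f"about {hindi_2011 / 1e6:.0f}M")
```

Under those assumptions the result lands a little above 250M, in line with the "at least 250M" estimate.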

And, again, this should be used as a very general guideline, not as a
precise one. This list is good enough to tell that there are many
more speakers of Awadhi than of Merwari today. However, it is not good
enough to compare the number of speakers of Awadhi and
Maithili. But, anyway, that's not important. We know that we should work
to cover both Awadhi and Maithili.

On the other hand, I will, indeed, try to make those numbers more useful
(although I think that their most important use is in pointing
to the large populations without Wikipedias).

> It is fascinating, although I think I may not have understood the
> classifications. Is there only one Indian Sign Language, for
> instance? I was told by a user (in the UK) that several are in use in
> different parts of the country. Still, perhaps the variants do not
> have sufficient numbers of users to qualify for this listing.
> However, the context in which I was told was precisely the severe
> lack of support materials for helping users become self-sufficient
> and good communicators, so the list itself becomes a barrier.
>
> Unfortunately, I do not know at the moment how to fix the problem.

I've checked the whole database and just one Indian Sign Language is
listed, which doesn't tell us a lot. The Ethnologue entry about Indian
Sign Language [3] says that it is also called "Indo-Pakistani Sign
Language" or "Urban Indian Sign Language". However, given the statement
that "Deaf schools mainly do not use ISL...", dialectal
divergence could be very high (thus, it could look like a number of
different languages), even though it is also used in Pakistan
and Bangladesh.

That said, I have to admit that my knowledge of sign languages is very
limited.

[1] http://www.ethnologue.com/
[2] http://en.wikipedia.org/wiki/Languages_with_official_status_in_India
[3] http://www.ethnologue.com/show_language.asp?code=ins



[Foundation-l] Languages and numbers

2011-06-24 Thread Milos Rancic
While preparing Missing Wikipedias [1], I've got numbers of speakers and
languages, by area and by country with a chapter, not covered by Wikipedias.

Numbers are preliminary, some of them should be corrected. I didn't
exclude Han languages, which mostly shouldn't be counted, and similar.
Note, also, that every language should be analyzed separately. Many
languages are spoken not just inside of one country.

Please, fix errors and comment.

* * *

Areas. They approximate the usual definitions of areas, but they are
different because of linguistic corrections.

* Afro-Asiatic Area: Area where Afro-Asiatic languages are dominant.
North Africa + Middle East + Sudan, Ethiopia, Eritrea and Somalia - Iran.
* Europe: Europe (including Caucasus) includes Turkey.
* South Asia: South Asia + Iran. Dominantly Indo-European and Dravidian
languages.
* Sub-Saharan Africa: The rest of Africa.
* Polynesia, Australia and Oceania: Includes Malaysia and Taiwan
(Taiwanese languages not covered in Wikipedias are dominantly Austronesian.)
* East Asia: Han China China (Central), Korea and Japan.
* South-East Asia: Includes non-Han south China China (South).
* Latin America: Parts of America where Spanish and Portuguese are
official languages.
* Anglo-French America: Parts of America where English, French and Dutch
are official languages.
* North Asia: Asian part of former USSR, Mongolia and non-Han northern
and western China China (North).

The first column is number of speakers, the second number of languages,
the third is area.

399259294 592 South Asia
353676706 1805 Sub-Saharan Africa
221855457 253 Afro-Asiatic Area
138979263 2198 Polynesia, Australia and Oceania
107363760 37 East Asia
99260271 447 South-East Asia
47901185 143 Europe
30361602 724 Latin America
8481452 227 Anglo-French America
3724384 45 North Asia

* * *

Countries with chapters. (Numbers are not fully correct, as they include
some languages removed in the list below this one.)

If any chapter (or interested group) is interested in the full list of
missing languages, I'll provide it on request before completing the
work. I suppose that some chapters are interested in languages with fewer
than 100K speakers, as well.

296,097,274 349 India
71,356,176 681 Indonesia
46,676,395 157 Philippines
7,819,010 9 Germany
7,994,871 76 Russian Federation
5,386,580 5 Serbia
4,785,299 6 South Africa
2,841,300 17 Israel
1,139,750 4 Ukraine
1,085,931 125 United States
832,000 3 Netherlands
705,967 70 Canada
472,470 1 Czech Republic
375,704 17 Taiwan
313,642 6 Chile
246,900 3 United Kingdom
200,500 4 Spain
191,430 5 Poland
151,240 7 Sweden
132,809 12 Argentina
86,390 155 Australia
50,000 1 France
30,000 1 Hungary
29,980 4 Switzerland
17,460 5 Finland
15,000 1 Portugal
10,500 2 Norway
5,000 1 Denmark
4,500 1 Estonia

Languages with more than million or more than 100,000 of speakers
without Wikipedia and with chapter in the country:

India (more than million)
38261000 Awadhi
3470 Maithili
1750 Chhattisgarhi
1300 Magahi
1300 Haryanvi
1280 Deccan
1040 Malvi
950 Kanauji
900 Dhundari
776 Bagheli
697 Varhadi-Nagpuri
6170900 Santali
600 Lambadi
5622600 Marwari
500 Mewati
473 Hadothi
4004490 Konkani
390 Merwari
380 Mina
3633900 Konkani, Goan
300 Shekhawati
300 Godwari
292 Garhwali
268 Indian Sign Language
236 Kumaoni
211 Dogri
210 Bagri
2094200 Kurux
200 Mewari
197 Sadri
195 Tulu
195 Gondi, Northern
193 Waddar
171 Wagdi
170 Kangri
158 Khandesi
1560280 Mundari
1543300 Bodo
150 Ho
143 Nimadi
1391000 Meitei
130 Bhili
120 Vasavi
115 Bhilali
1045000 Panjabi, Mirpur
100 Pahari, Mahasu

Indonesia (more than million)
13600900 Madura
553 Minangkabau
393 Musi
3502300 Banjar
333 Bali
270 Betawi
235 Malay, Central
210 Sasak
200 Batak Toba
188 Malay, Makassar
160 Makasar
120 Batak Simalungun
120 Batak Dairi
110 Batak Mandailing
100 Malay, Jambi

Philippines (more than 100k)
577 Hiligaynon
250 Bicolano, Central
190 Bicolano, Albay
1062000 Tausug
100 Maguindanao
776000 Maranao
639000 Capiznon
54 Bontoc, Central
50 Ibanag
395000 Inakeanon
378000 Kinaray-a
35 Masbatenyo
345000 Surigaonon
319000 Sama, Southern
293000 Chavacano
234000 Bicolano, Iriga
20 Romblomanon
20 Bantoanon
185000 Sorsogon, Waray
15 Kankanaey
15 Blaan, Koronadal
147000 Davawenyo
14 Subanen, Central
134000 Itawit
123000 Cuyonon
122000 Bicolano, Northern Catanduanes
111000 Ibaloi
107000 Yakan
10 Philippine Sign Language
10 Binukid

Germany
491 Mainfränkisch
200 Saxon, Upper
819000 Swabian

Russian Federation
783720 Lezgi
696630 Erzya
614000 Moksha
516490 Dargwa
499300 Adyghe
460090 Mari, Meadow
422550 Kumyk
413000 Ingush
363000 Yakut
264400 Tuva
217000 Komi-Zyrian
164420 Lak
128900 Tabassaran
113710 Balkar

Serbia and Kosovo
4156090 Albanian, Gheg
709570 Romani, Balkan
318920 Romani, Sinte