Re: Polyglot keyboards (was: Non-standard 8-bit fonts still in use)

2016-05-10 Thread Philippe Verdy
Very true, and this will likely not change.
Even users of "ergonomic" layouts want to keep that ergonomics for their
letters (and letter pairs).
All that can reasonably be done is to extend existing layouts with minimal
changes: basic letters, decimal digits, and basic punctuation must remain
in the same places (and there's also some resistance concerning the few most
common additional letters used in each language, which are typically placed
on the first row or near the Enter key).
What is likely to change is the placement of combinations using AltGr on
the first row (but on non-US keyboards, these also include some ASCII
characters considered essential on a computer, such as the backslash, hash
sign, tilde, at sign, or underscore).

This leaves little freedom for changes, except for keys currently assigned
to less essential characters such as the degree sign, the micro sign, the
pound sign (in countries not using this symbol daily), the "universal"
currency sign, the paragraph mark... Those can be reassigned to better
candidates for extensions.

But without an extension of keyboard rows, it will be difficult to get
wide adoption on physical keyboards. The function keys F1..F12 could easily
be reduced to fit additional keys for letters and diacritics.

Keyboards have instead been extended with many things that most people in
fact almost never use or don't need there, such as multimedia keys,
shortcuts to launch the browser or calculator app, the contextual
menu/options key (added for Windows), or TWO (sic!) Windows keys
(keep only one, and map the few additional keys found on Japanese keyboards instead).

But it is challenging to keep decent key sizes on notebook keyboards,
which are already extremely packed (F1..F12 are already reduced
vertically). Manufacturers invented another way: a new "Fn" modifier key
for additional multimedia functions (or keys for switching the Wi-Fi,
Bluetooth, or display adapters, controlling the display brightness or
sound volume/mute, or replacing the PrintScreen function or the
ScrollLock and NumLock mode-switch keys). A few of them added a couple of
character keys for currency units ($ and €) instead of the Japanese mode keys.

In fact, every brand has extended its keyboards however it wanted...
except for really extending the usable alphabets.

For virtual on-screen layouts, there's much more freedom, as the display
panel is adaptive and allows more innovative input methods, including
things never found on physical keyboards, such as entering emoji.


2016-05-10 16:55 GMT+02:00 Doug Ewell :

> Otto Stolz wrote:
>
> > Yes, there is somebody going there. E. g., the German standard
> > DIN 2137:2012-06 defines a “T2” layout which is meant
> > for all official, Latin-based orthographies worldwide, and
> > additionally for the Latin-based minority languages of Germany
> > and Austria. The layout is based on the traditional QWERTZU layout
> > for German and Austrian keyboards (which is now dubbed “T1”).
> > Cf. .
>
> Yes, but there's the rub. QWERTY users are about as willing to switch to
> QWERTZ in the name of global standardization as Germans would be to
> switch to QWERTY.
>
> --
> Doug Ewell | http://ewellic.org | Thornton, CO 
>
>
>


Polyglot keyboards (was: Non-standard 8-bit fonts still in use)

2016-05-10 Thread Doug Ewell
Otto Stolz wrote:

> Yes, there is somebody going there. E. g., the German standard
> DIN 2137:2012-06 defines a “T2” layout which is meant
> for all official, Latin-based orthographies worldwide, and
> additionally for the Latin-based minority languages of Germany
> and Austria. The layout is based on the traditional QWERTZU layout
> for German and Austrian keyboards (which is now dubbed “T1”).
> Cf. .

Yes, but there's the rub. QWERTY users are about as willing to switch to
QWERTZ in the name of global standardization as Germans would be to
switch to QWERTY.

--
Doug Ewell | http://ewellic.org | Thornton, CO 




Polyglot keyboards (was: Non-standard 8-bit fonts still in use)

2016-05-10 Thread Otto Stolz

Hello,

am 2016-05-08 um 20:11 Uhr schrieb Don Osborn:

Another thing about user needs is that the polyglot/pluriliterate user
may prefer something that reflects that, as opposed to having multiple
keyboards for languages whose character repertoires are much the same.
From a national or regional (sub-continental) point of view I would
think a one-size fits all/many standard or set of keyboard standards
would be ideal. But no one seems to be going there yet, after all these
years.


Yes, there is somebody going there. E. g., the German standard
DIN 2137:2012-06 defines a “T2” layout which is meant
for all official, Latin-based orthographies worldwide, and
additionally for the Latin-based minority languages of Germany
and Austria. The layout is based on the traditional QWERTZU layout
for German and Austrian keyboards (which is now dubbed “T1”).
Cf. .

There is also a “T3” layout defined which comprises all characters
mentioned in ISO/IEC 9995-3:2010.

You can even buy a hardware T2 keyboard; however I have not tried it,
because I have defined my own keyboard layout suite (pan-European Latin,
pan-European Cyrillic, monotonic Greek, and Yiddish) for personal use,
long ago.

Best wishes,
  Otto Stolz


Re: Non-standard 8-bit fonts still in use

2016-05-09 Thread Marcel Schneider
On Sun, 8 May 2016 10:19:54 -0400, Don Osborn  wrote:

> Marcel, I would be very interested to know more about what you are
> working on wrt Bambara - perhaps offline.

Thank you for your interest. Iʼm glad to get in touch 
with on-going work, and I already started mailing, but 
eventually I would like to give acknowledgements on-list; although…

On Sun, 8 May 2016 14:11:20 -0400, Don Osborn  wrote:

> To get this a little on-topic for the list, the
> good news is that Unicode means we're talking just about keyboards and
> not about multiple incompatible fonts as well.

Indeed; however, font issues are IMHO even more suitable 
for the List (though strictly they are out of scope too) 
than keyboard layouts, which must not be discussed 
on the Unicode List. Only giving some hints is suitable, 
as has been done in this thread up to now. Consequently 
I switched off-list immediately. But here I’m doing some 
metadiscussion, so please disregard.

> In the background one should bring in the issue of whether computer
> science students and IT experts in Africa had any introduction to
> Unicode. That could be a big missing piece in the equation.

For future archive readers, there may be some need to recall 
that this phenomenon is a global one. A lack of Unicode training 
is observed in Europe as well, and on other continents. 
Please see the following recent thread:

Unicode in the Curriculum? 
from Andre Schappo on 2015-12-30 (Unicode Mail List Archive). 
Retrieved March 11, 2016, from 
http://www.unicode.org/mail-arch/unicode-ml/y2015-m12/0073.html

> On the font side, my impression (a bit dated) is that there is/was a
> policy dimension or gap. Back when Unicode was becoming more widely
> adopted, there were new computers marketed in Africa without the then
> limited repertoire of fonts with extended Latin. Even when these were
> included, there are some instances where it is possible that 8-bit fonts
> with extended characters were created on machines that already had one
> or two Unicode fonts - evidently unbeknownst to the user. So there was,
> and always has been, a public education side to this that none of us in
> position or interest to do so have been able to address.

Please see also the capital left-hook N glyph issue Don documented 
at the very beginning of this thread:

Non-standard 8-bit fonts still in use from Don Osborn on 2015-10-15 (Unicode 
Mail List Archive). 
(2015, October 21). Retrieved October 21, 2015, from 
http://www.unicode.org/mail-arch/unicode-ml/y2015-m10/0135.html

For one more comment on that issue:
http://unicode.org/mail-arch/unicode-ml/y2015-m10/0214.html


On Sun, 8 May 2016 12:31:59 -0600, Doug Ewell  wrote:

> Don Osborn wrote:
> 
> > In the multilingual settings I'm most interested in, the language
> > requirements often overlap, sometimes considerably (thinking here of
> > extended Latin alphabets). This is because many languages use
> > characters that are part of the African Reference Alphabet. So it is
> > possible to have one keyboard layout for each language, or merge
> > requirements if you will for two or more. When the A12n-collab group
> > was active* one concept discussed at some length was a "pan-Sahelian"
> > layout that could serve many languages across a number of countries.
> 
> I wonder if there is a good and fairly comprehensive reference to the
> most common Latin-based alphabets used for African languages, comparable
> to Michael Everson's "The Alphabets of Europe" [1]. Such would be
> helpful for determining the level of effort to create a pan-African
> keyboard layout, or to adapt (if necessary) an existing multilingual
> layout like John Cowan's Moby Latin [2].
> 
> [1] http://www.evertype.com/alphabets/
> [2]
> 
>http://recycledknowledge.blogspot.com/2013/09/us-moby-latin-keyboard-for-windows.html

On Sun, 8 May 2016 19:15:20 +, d...@bisharat.net replied:

> Rhonda Hartell did a compilation based on available info, 
> published 23 yrs ago by SIL. Christian Chanard put that info 
> into a database, Systemes alphabetiques, accessible via links from 
> http://www.bisharat.net/wikidoc/pmwiki.php/PanAfrLoc/WritingSystems#toc11
> 
> All I have right now (taking break from shoveling leaf compost). 

Thanks for this resource. Iʼve taken a look and I like the interface. 
But some updates are missing; or more accurately, the source was outdated, 
as shows when looking at the Bambara section, which does not take into account 
the new orthography, even though it had already been valid for over one decade 
(1982..1993).

Sadly, this valuable database is unreliable unless the data is revised. 
I hope that can be done soon. Unfortunately, however, Iʼm unable to do this job.


Best regards,

Marcel



Re: Non-standard 8-bit fonts still in use

2016-05-08 Thread dzo
Rhonda Hartell did a compilation based on available info, published 23 yrs ago 
by SIL. Christian Chanard put that info into a database, Systemes 
alphabetiques, accessible via links from 
http://www.bisharat.net/wikidoc/pmwiki.php/PanAfrLoc/WritingSystems#toc11

All I have right now (taking break from shoveling leaf compost). 

Don



--Original Message--
From: Doug Ewell
Sender: Unicode
To: unicode@unicode.org
To: Don Osborn
Subject: Re: Non-standard 8-bit fonts still in use
Sent: May 8, 2016 2:31 PM

Don Osborn wrote:

> In the multilingual settings I'm most interested in, the language
> requirements often overlap, sometimes considerably (thinking here of
> extended Latin alphabets). This is because many languages use
> characters that are part of the African Reference Alphabet. So it is
> possible to have one keyboard layout for each language, or merge
> requirements if you will for two or more. When the A12n-collab group
> was active* one concept discussed at some length was a "pan-Sahelian"
> layout that could serve many languages across a number of countries.

I wonder if there is a good and fairly comprehensive reference to the 
most common Latin-based alphabets used for African languages, comparable 
to Michael Everson's "The Alphabets of Europe" [1]. Such would be 
helpful for determining the level of effort to create a pan-African 
keyboard layout, or to adapt (if necessary) an existing multilingual 
layout like John Cowan's Moby Latin [2].

[1] http://www.evertype.com/alphabets/
[2] 
http://recycledknowledge.blogspot.com/2013/09/us-moby-latin-keyboard-for-windows.html

--
Doug Ewell | http://ewellic.org | Thornton, CO 



Sent via BlackBerry by AT&T



Re: Non-standard 8-bit fonts still in use

2016-05-08 Thread Doug Ewell

Don Osborn wrote:


In the multilingual settings I'm most interested in, the language
requirements often overlap, sometimes considerably (thinking here of
extended Latin alphabets). This is because many languages use
characters that are part of the African Reference Alphabet. So it is
possible to have one keyboard layout for each language, or merge
requirements if you will for two or more. When the A12n-collab group
was active* one concept discussed at some length was a "pan-Sahelian"
layout that could serve many languages across a number of countries.


I wonder if there is a good and fairly comprehensive reference to the 
most common Latin-based alphabets used for African languages, comparable 
to Michael Everson's "The Alphabets of Europe" [1]. Such would be 
helpful for determining the level of effort to create a pan-African 
keyboard layout, or to adapt (if necessary) an existing multilingual 
layout like John Cowan's Moby Latin [2].


[1] http://www.evertype.com/alphabets/
[2] 
http://recycledknowledge.blogspot.com/2013/09/us-moby-latin-keyboard-for-windows.html


--
Doug Ewell | http://ewellic.org | Thornton, CO 




Re: Non-standard 8-bit fonts still in use

2016-05-08 Thread Don Osborn
Thanks Doug. You're right as far as that goes, but I'd suggest there's 
more to it.


Languages (by which of course we mean their written forms) have 
requirements, and for cross-border languages, requirements may be 
defined differently by the different countries where they are spoken. 
And users have needs and experience.


In the multilingual settings I'm most interested in, the language 
requirements often overlap, sometimes considerably (thinking here of 
extended Latin alphabets). This is because many languages use 
characters that are part of the African Reference Alphabet. So it is 
possible to have one keyboard layout for each language, or merge 
requirements if you will for two or more. When the A12n-collab group was 
active* one concept discussed at some length was a "pan-Sahelian" layout 
that could serve many languages across a number of countries.


But even then, considering variations by country (orthographies often 
set by country not by language), there can be several possible sets of 
language requirements, in a "pan-Sahelian" layout. And that's just one 
example.


Then there is the question of key assignments for any given character. 
Unfortunately in Africa there are not established layouts to deal with - 
most formally educated people will be most familiar with QWERTY or 
AZERTY for the official languages. Everything else is pretty much a 
matter of choice, although some small communities of users may have 
developed familiarity with particular layouts (perhaps a reason for 
persistence of something like Bambara Arial). So another reason there 
are a zillion keyboards is that people are inventing them - for good 
reasons and intent, we can admit, but often without awareness of other 
efforts, or communication with other communities of users.


You are right however that none of these are standards (with a possible 
exception - would have to go back and check) - I was trying to be clever 
- but there are different layouts.


Another thing about user needs is that the polyglot/pluriliterate user 
may prefer something that reflects that, as opposed to having multiple 
keyboards for languages whose character repertoires are much the same. 
From a national or regional (sub-continental) point of view I would 
think a one-size fits all/many standard or set of keyboard standards 
would be ideal. But no one seems to be going there yet, after all these 
years.


And one could go on. To get this a little on-topic for the list, the 
good news is that Unicode means we're talking just about keyboards and 
not about multiple incompatible fonts as well.


Don

* I'm floating the idea of a new list on the full spectrum of African 
languages & technology issues. Anyone interested or who has thoughts on 
that idea one way or another, please contact me offline.



On 5/8/2016 12:50 PM, Doug Ewell wrote:

Don Osborn wrote:


Concerning the keyboard side of the issue, there has been a lot of
discussion about unified standards over the years, but what we end up
with is maybe another case of "The nice thing about standards is that
there are so many to choose from."


There are a zillion keyboard layouts, not because of too many 
conflicting standards per se, but primarily because people don't want 
to change away from the layout they're familiar with, and secondarily 
because different languages have different needs.


--
Doug Ewell | http://ewellic.org | Thornton, CO 




Re: Non-standard 8-bit fonts still in use

2016-05-08 Thread Doug Ewell

Don Osborn wrote:


Concerning the keyboard side of the issue, there has been a lot of
discussion about unified standards over the years, but what we end up
with is maybe another case of "The nice thing about standards is that
there are so many to choose from."


There are a zillion keyboard layouts, not because of too many 
conflicting standards per se, but primarily because people don't want to 
change away from the layout they're familiar with, and secondarily 
because different languages have different needs.


--
Doug Ewell | http://ewellic.org | Thornton, CO  



Re: Non-standard 8-bit fonts still in use

2016-05-08 Thread Philippe Verdy
2016-05-08 16:19 GMT+02:00 Don Osborn :

> The flexibility of touchpad keyboards in theory gets beyond the
> limitations of the physical keyboards - has anyone tried adding a row to
> say a QWERTY layout, which includes additional characters, rather than
> sweating the issues about shoehorning them in other levels or key
> sequences? Is that even possible? Still would be helpful to have standards,
> but where something is visible, it is easy to use.
>

It is technically possible, but the problem is to add distinctive hardware
"scan codes" to keys in this row.

See this table:
https://msdn.microsoft.com/en-us/library/aa299374(v=vs.60).aspx

You'll note that almost all scancodes in the 7-bit range are used. So you'd
need "extended scancodes", i.e. prefixing the special virtual scancode 00
on Windows (or the hardware scancode E0) before the extended scan code for
the actual key. (The special scancode "00" turns the 7-bit table into an
equivalent 8-bit table, but note that keyboards use 7-bit scancodes only,
as the 8th bit is used for the press/release flag)

For that, you could then reuse the scancodes of the first row (those for
digits). Note that the scancodes for the row of "standard" function keys
(F1..F12) is already extended this way (for additional function keys).

But note also this table:
https://www.win.tue.nl/~aeb/linux/kbd/scancodes-10.html

You'll see that the hardware scancodes E0-0A and E0-0B are already assigned
on PC for special functions, and so cannot be used to "extend" the keys for
digits 9 and 0 on the first row (whose scancodes are 0A and 0B
respectively). This is not so critical: you can perfectly well have
additional keys assigned to a row using non-contiguous hardware scancodes
(after all, the alphabetic part of the keyboard already uses multiple ranges
of hardware and virtual scancodes).

But you'd need a new keyboard driver (and an extension to MSKLC on Windows)
to allow mapping this supplementary row, and an industry agreement to assign
new extended keys in non-conflicting ways. (These days, it is the Microsoft
hardware labs that centralize the extensions used on PC-compatible hardware;
Apple used to have its own registry for its own keyboards, but Macs are now
PCs and can use the same keyboards, not necessarily built by Apple, e.g. by
Logitech.) The connectors are compatible with the same USB interface.

There are some differences in the hardware scancodes used on the USB
interface. (Windows internally translates hardware scancodes from some
interfaces into the same virtual scancodes before sending them to upper
keyboard drivers and applications: this is how scancode E0 on the old
PC-keyboard interface, the newer PS/2 interface, the USB interface, or the
old BIOS interface is remapped to the same virtual scancode 00 for Windows
drivers and apps.)

There's also an additional hardware extension code E1 for a few function
keys (it is used for a few functions encoded on 3 bytes, for upward
compatibility reasons, such as the "Pause" key).

Various other vendors have used specific hardware scancodes, but today
almost everyone agrees to the same PC standard.
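The E0 prefix and the press/release bit described above can be sketched in code. The following is a minimal, illustrative Python decoder for a PS/2-style scancode byte stream; the event-tuple shape is my own choice for illustration, not any driver's actual API.

```python
def decode_scancodes(stream):
    """Yield (extended, code, pressed) tuples from an iterable of scancode bytes.

    0xE0 is the extended-key prefix; the high bit (0x80) of the following
    byte marks key release, so only 7 bits carry the actual scancode.
    """
    extended = False
    for byte in stream:
        if byte == 0xE0:                  # extended-key prefix
            extended = True
            continue
        pressed = not (byte & 0x80)       # high bit set => key release
        code = byte & 0x7F                # 7-bit scancode
        yield (extended, code, pressed)
        extended = False

# Example: press and release of scancode 0x0B (the '0' key on the first row),
# then a press of the extended key E0-0B (reserved on PCs, per the above).
events = list(decode_scancodes([0x0B, 0x8B, 0xE0, 0x0B]))
# events == [(False, 0x0B, True), (False, 0x0B, False), (True, 0x0B, True)]
```

This also shows why E0-0A and E0-0B being taken matters: the extended variant of a first-row key is a distinct (extended, code) pair, so a conflicting assignment elsewhere blocks reuse of that pair.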


Re: Non-standard 8-bit fonts still in use

2016-05-08 Thread Don Osborn
Thanks all for the replies on this matter. Concerning the keyboard side 
of the issue, there has been a lot of discussion about unified standards 
over the years, but what we end up with is maybe another case of "The 
nice thing about standards is that there are so many to choose from." 
Within that, there seem to be two main questions addressed by keyboard 
creation: production and popular use. Many keyboards are made with 
production in one or maybe a couple of languages in mind - this is in 
line with the thinking behind the creation of old 8-bit modified fonts. On 
the other hand, there is the need for keyboard layouts that can be used 
broadly without users having to learn new key assignments on each 
new device. In terms of philosophy, I'd see common keyboards as more in 
line with the intent of Unicode.


In the ideal world, there would be no distinction between keyboards 
created with limited/focused production in mind (limited in the sense of 
one language in a multilingual society and/or focused on a particular 
production need), and keyboards intended to facilitate broad usage. Like 
a QWERTY+ or AZERTY+ perhaps? That has not been easy - kind of another 
theory of everything problem.


The flexibility of touchpad keyboards in theory gets beyond the 
limitations of the physical keyboards - has anyone tried adding a row to 
say a QWERTY layout, which includes additional characters, rather than 
sweating the issues about shoehorning them in other levels or key 
sequences? Is that even possible? Still would be helpful to have 
standards, but where something is visible, it is easy to use.


On the font side, my impression (a bit dated) is that there is/was a 
policy dimension or gap. Back when Unicode was becoming more widely 
adopted, there were new computers marketed in Africa without the then 
limited repertoire of fonts with extended Latin. Even when these were 
included, there are some instances where it is possible that 8-bit fonts 
with extended characters were created on machines that already had one 
or two Unicode fonts - evidently unbeknownst to the user. So there was, 
and always has been, a public education side to this that none of us in 
position or interest to do so have been able to address.


In the background one should bring in the issue of whether computer 
science students and IT experts in Africa had any introduction to 
Unicode. That could be a big missing piece in the equation.


The case of the Chinese publications using modified 8-bit fonts for both 
Hausa boko and Chinese pinyin is a specialized one. Given the small 
number of people working on both those languages it may be just the 
chance outcome of their not being aware that Unicode already had their 
needs covered. A specialized keyboard for production of text including 
hooked consonants and tone-marked vowels, plus awareness of Unicode 
would probably set them on a new course.


Marcel, I would be very interested to know more about what you are 
working on wrt Bambara - perhaps offline.


Don


On 5/5/2016 10:35 PM, Marcel Schneider wrote:

On Sat, 30 Apr 2016 13:27:02 -0400, Don Osborn  wrote:


If the latter be the case, that would seem to have implications
regarding dissemination of information about Unicode. "If you
standardize it, they will adopt" certainly holds for industry and
well-informed user communities (such as in open source software), but
not necessarily for more localized initiatives. This is not to seek to
assign blame in any way, but rather to point out what seems to be a
persistent issue with long term costs in terms of usability of text in
writing systems as diverse as Bambara, Hausa boko, and Chinese pinyin.

The situation Don describes challenges the work already done and on-going in 
Mali, with several keyboard layouts at hand. If widening the range is really 
suitable, one might wish to test a couple of solutions other than those 
already mentioned, which roughly fall into two subsets:

1) Letters on the digits row. Thanks to a kindly shared resource, Iʼm able to 
tell that over one dozen Windows layouts (mainly French, as used in Mali, but 
also Lithuanian, Czech, Slovak, and Vietnamese) have the digits in the Shift or 
AltGr shift states. The latter is the only useful way of mapping letters onto 
digit keys, and it becomes handy if the Kana toggle is added, either alone or in 
synergy with the Kana modifier instead of AltGr. With all bracketing characters 
in group 2, level 1 on the home row, and so on, there is enough room to have all 
characters for Bambara and French directly accessible.

2) Letters through dead keys. This is the ISO/IEC 9995 way of making more 
characters available in additional groups with dead-key group selectors 
(referred to as remnant modifiers but actually implemented as dead keys). This 
is also the way SIL/Tavultesoftʼs layouts work for African, and notably 
Malian, languages. IME-based keyboarding software may additionally offer a 
transparent input experience.



Re: Non-standard 8-bit fonts still in use

2016-05-05 Thread Marcel Schneider
On Sat, 30 Apr 2016 13:27:02 -0400, Don Osborn  wrote:

> If the latter be the case, that would seem to have implications
> regarding dissemination of information about Unicode. "If you
> standardize it, they will adopt" certainly holds for industry and
> well-informed user communities (such as in open source software), but
> not necessarily for more localized initiatives. This is not to seek to
> assign blame in any way, but rather to point out what seems to be a
> persistent issue with long term costs in terms of usability of text in
> writing systems as diverse as Bambara, Hausa boko, and Chinese pinyin.

The situation Don describes challenges the work already done and on-going in 
Mali, with several keyboard layouts at hand. If widening the range is really 
suitable, one might wish to test a couple of solutions other than those 
already mentioned, which roughly fall into two subsets:

1) Letters on the digits row. Thanks to a kindly shared resource, Iʼm able to 
tell that over one dozen Windows layouts (mainly French, as used in Mali, but 
also Lithuanian, Czech, Slovak, and Vietnamese) have the digits in the Shift or 
AltGr shift states. The latter is the only useful way of mapping letters onto 
digit keys, and it becomes handy if the Kana toggle is added, either alone or in 
synergy with the Kana modifier instead of AltGr. With all bracketing characters 
in group 2, level 1 on the home row, and so on, there is enough room to have all 
characters for Bambara and French directly accessible.

2) Letters through dead keys. This is the ISO/IEC 9995 way of making more 
characters available in additional groups with dead-key group selectors 
(referred to as remnant modifiers but actually implemented as dead keys). This 
is also the way SIL/Tavultesoftʼs layouts work for African, and notably 
Malian, languages. IME-based keyboarding software may additionally offer a 
transparent input experience.
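As a rough illustration of the group/level model referred to above (a sketch only: the key names and character assignments here are hypothetical examples for a Bambara-capable layout, not taken from ISO/IEC 9995 or any shipping layout), each character can be addressed by a (key, level, group) triple, with a dead-key group selector switching the group used for the next keystroke:

```python
# Hypothetical (key, level, group) -> character table.
# Group 1 holds the base layout; group 2 holds the additional letters.
LAYOUT = {
    ("E", 1, 1): "e",        # group 1, level 1 (unshifted)
    ("E", 2, 1): "E",        # group 1, level 2 (Shift)
    ("E", 1, 2): "\u025B",   # group 2, level 1: LATIN SMALL LETTER OPEN E
    ("E", 2, 2): "\u0190",   # group 2, level 2: LATIN CAPITAL LETTER OPEN E
}

def type_key(key, level=1, group=1):
    """Return the character for a key at the given level and group."""
    return LAYOUT.get((key, level, group), "")

# A dead-key group selector simply changes the group for the next key:
assert type_key("E") == "e"
assert type_key("E", group=2) == "\u025B"  # selector pressed, then E
```

The design point is that the base layout (group 1) stays untouched, which is exactly the constraint on familiar key positions discussed earlier in this thread.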


On Mon, 2 May 2016 12:03:58 -0400, Ed Trager  wrote:

> Also with web applications the "software installation" issue is eliminated.
> Remember that while it is easy for technologically savvy folks like members
> of this mailing list to install keyboard drivers on any platform we like,
> this process is somewhat beyond the reach of many people I know, even when
> they are otherwise fairly comfortable using computers. 

I canʼt easily believe that people who are comfortable with computers have 
trouble using the widely automated keyboard layout installation feature. 
Both from my own experience and from what Iʼve had the opportunity to observe 
in other persons I know, there is in fact some kind of reluctance, based on 
the belief (call it a myth or an urban legend) that Windows plus preinstalled 
software plus MS Office come along with everything any user may need until the 
next update. Informing users about Microsoftʼs help to customize the keyboard 
is more complicated, though, in that the display is part of the hardware and 
the functioning behind it is more of a black box.


As Iʼm currently working on such a project for the fr-FR locale, Iʼve already 
got some ideas for Bambara. I hope it can be on-line soon.

Kind regards,

Marcel



Re: Non-standard 8-bit fonts still in use

2016-05-02 Thread Oren Watson
Hm... I don't think that simply search-and-replacing the ASCII characters with
the characters the font uses them for will work, except on .txt files.
Microsoft Word documents, HTML files, and any other non-plaintext files
will almost certainly be corrupted by such a program, because the markup tags
might contain those letters. (In addition, unlike .docx files, .doc files
from Windows XP contain binary data which could include arbitrary bytes.)

Probably, in practical terms, a good solution is to make a Microsoft Word
macro to do the replacement, and post instructions to copy-paste it.
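For plain .txt files, the substitution described above takes only a few lines of Python. The mapping below is hypothetical: which ASCII letters a given hacked font repurposes depends entirely on that font, so the table must be built per font.

```python
# Hypothetical mapping for one hacked 8-bit font: suppose it renders the
# byte for "q" as U+0272 (ɲ) and "x" as U+025B (ɛ). The real mappings have
# to be read off the font itself; this table is an illustrative assumption.
HACKED_TO_UNICODE = str.maketrans({
    "q": "\u0272",  # LATIN SMALL LETTER N WITH LEFT HOOK
    "x": "\u025B",  # LATIN SMALL LETTER OPEN E
})

def convert_plain_text(text):
    """One-to-one character substitution.

    Safe only for plain-text content, per the caveat above: markup tags or
    binary data containing these letters would be corrupted.
    """
    return text.translate(HACKED_TO_UNICODE)

print(convert_plain_text("xq"))  # -> "ɛɲ"
```

This is the "1 byte to multibyte" conversion Martin mentions below: each mapped ASCII byte becomes a multi-byte UTF-8 sequence on output.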

On Mon, May 2, 2016 at 3:34 AM, Martin J. Dürst 
wrote:

> Hello Don,
>
> I agree with Doug that creating a good keyboard layout is a good thing to
> do. Among the people on this list, you probably have the best contacts, and
> can help create some test layouts and see how people react.
>
> Also, creating fonts that have the necessary coverage but are encoded in
> Unicode may help, depending on how well the necessary characters are
> supported out of the box in the OS version in use on the ground (which may
> be quite old).
>
> Also, a conversion program will help. It shouldn't be too difficult,
> because as far as I understand, it's essentially just a few characters that
> need conversion, and it's 1 byte to multibyte. Even in a low level language
> such as C, that's just a few lines, and any of the students in my
> programming course could write that (they just wrote something similar as
> an exercise last week).
>
> On 2016/05/01 02:27, Don Osborn wrote:
>
>> Last October I posted about persistence of old modified/hacked 8-bit
>> fonts, with an example from Mali. This is a quick follow up, with
>> belated thanks to those who responded to that post on and off list, and
>> a set of examples from China and Nigeria. I conclude below with some
>> thoughts about what this says about dissemination of information about
>> Unicode.
>>
>
> I'm not familiar with the actual situation on the ground, which may vary
> in each place, but in general, what will convince people is not theoretical
> information, but practical tools and examples about what works better with
> Unicode (e.g.: if you do it this way, it will show correctly in the Web
> browser on your new smart phone, or if you do it this way, even your
> relative in Europe can read it without installing a special font,...).
>
> Even in the developed world, where most people these days are using
> Unicode, most don't know what it is, and that's just fine, because it just
> works.
>
> Regards,   Martin.
>


Re: Non-standard 8-bit fonts still in use

2016-05-02 Thread Ed Trager
In addition to creating platform-specific keyboard layouts as Doug
suggested, I would also like to point out that it is now also possible —and
possibly even easier— to create web-based keyboard and input method engines
that may allow a greater degree of cross-platform support, reducing
platform-specific work.

Also with web applications the "software installation" issue is eliminated.
Remember that while it is easy for technologically savvy folks like members
of this mailing list to install keyboard drivers on any platform we like,
this process is somewhat beyond the reach of many people I know, even when
they are otherwise fairly comfortable using computers.

As an example, see http://unifont.org/keycurry/, a JavaScript/jQuery-based
web app that I wrote and use myself all the time.

One limitation of keycurry is that currently almost all of the keyboard
maps assume an American QWERTY layout. But honestly it would not be very
difficult to generate variant maps for AZERTY or whatever else one wants. I
just have not bothered myself to do that extra work because I bought my
laptop in the U.S. and the default QWERTY layout works fine for me,
especially now that I can write new keyboard maps for most scripts and
languages in a matter of a few minutes (unifont.org/keycurry now uses
JSON-based keyboard maps with UTF-8, in addition to an older format based
on Yudit; obviously IMEs for scripts like Korean or Chinese take a lot
longer to write, but simple keymaps for Latin and many other scripts are
super easy to make).
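For illustration, a JSON keyboard map of the general kind described above might look like the sketch below. This is a hypothetical format, not keycurry's actual schema; the letter assignments echo the hacked-font convention (q standing for ɛ, etc.) discussed elsewhere in this thread:

```json
{
  "name": "Bambara (QWERTY base)",
  "base": "us-qwerty",
  "map": {
    "q": "ɛ", "Q": "Ɛ",
    "x": "ɲ", "X": "Ɲ",
    "v": "ŋ", "V": "Ŋ"
  }
}
```

Because the file is just UTF-8 text, a map like this can be written in minutes with any editor, which matches the point about simple Latin keymaps being quick to produce.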

In fact, with web-based solutions, users don't even have to download or
install the fonts, as obviously we can just use web fonts to supply
Unicode-based fonts to the web app. (In fact this is exactly what I do for
the Tai Tham keyboards in keycurry, inter alia).

Best - Ed

On Mon, May 2, 2016 at 3:34 AM, Martin J. Dürst 
wrote:

> Hello Don,
>
> I agree with Doug that creating a good keyboard layout is a good thing to
> do. Among the people on this list, you probably have the best contacts, and
> can help create some test layouts and see how people react.
>
> Also, creating fonts that have the necessary coverage but are encoded in
> Unicode may help, depending on how well the necessary characters are
> supported out of the box in the OS version in use on the ground (which may
> be quite old).
>
> Also, a conversion program will help. It shouldn't be too difficult,
> because as far as I understand, it's essentially just a few characters that
> need conversion, and it's 1 byte to multibyte. Even in a low level language
> such as C, that's just a few lines, and any of the students in my
> programming course could write that (they just wrote something similar as
> an exercise last week).
>
> On 2016/05/01 02:27, Don Osborn wrote:
>
>> Last October I posted about persistence of old modified/hacked 8-bit
>> fonts, with an example from Mali. This is a quick follow up, with
>> belated thanks to those who responded to that post on and off list, and
>> a set of examples from China and Nigeria. I conclude below with some
>> thoughts about what this says about dissemination of information about
>> Unicode.
>>
>
> I'm not familiar with the actual situation on the ground, which may vary
> in each place, but in general, what will convince people is not theoretical
> information, but practical tools and examples about what works better with
> Unicode (e.g.: if you do it this way, it will show correctly in the Web
> browser on your new smart phone, or if you do it this way, even your
> relative in Europe can read it without installing a special font,...).
>
> Even in the developed world, where most people these days are using
> Unicode, most don't know what it is, and that's just fine, because it just
> works.
>
> Regards,   Martin.
>


Re: Non-standard 8-bit fonts still in use

2016-05-02 Thread Martin J. Dürst

Hello Don,

I agree with Doug that creating a good keyboard layout is a good thing 
to do. Among the people on this list, you probably have the best 
contacts, and can help create some test layouts and see how people react.


Also, creating fonts that have the necessary coverage but are encoded in 
Unicode may help, depending on how well the necessary characters are 
supported out of the box in the OS version in use on the ground (which 
may be quite old).


Also, a conversion program will help. It shouldn't be too difficult, 
because as far as I understand, it's essentially just a few characters 
that need conversion, and it's 1 byte to multibyte. Even in a low-level 
language such as C, that's just a few lines, and any of the students in 
my programming course could write that (they just wrote something 
similar as an exercise last week).


On 2016/05/01 02:27, Don Osborn wrote:

Last October I posted about persistence of old modified/hacked 8-bit
fonts, with an example from Mali. This is a quick follow up, with
belated thanks to those who responded to that post on and off list, and
a set of examples from China and Nigeria. I conclude below with some
thoughts about what this says about dissemination of information about
Unicode.


I'm not familiar with the actual situation on the ground, which may vary 
in each place, but in general, what will convince people is not 
theoretical information, but practical tools and examples about what 
works better with Unicode (e.g.: if you do it this way, it will show 
correctly in the Web browser on your new smart phone, or if you do it 
this way, even your relative in Europe can read it without installing a 
special font,...).


Even in the developed world, where most people these days are using 
Unicode, most don't know what it is, and that's just fine, because it 
just works.


Regards,   Martin.


Re: Non-standard 8-bit fonts still in use

2016-05-01 Thread Doug Ewell

Don Osborn wrote:


Substituting characters such that the key for an otherwise unused
character yields a hooked letter or a tone-marked vowel may be seen as
sufficient for their purposes and easier than switching to Unicode and
sorting out a new keyboard system.


The myth is that switching to Unicode requires switching to a new and 
{ unfamiliar, complex, hard to adopt } keyboard layout. Even when the 
"new" part is true, the rest need not be.


Assuming they are currently using a Windows U.S. English layout, someone 
could easily provide them with a layout that either:


1. puts the non-ASCII letters on the keys corresponding to the ASCII 
symbols currently repurposed by their font (for example, pressing q 
yields ɛ), or


2. puts them on AltGr combinations (for example, pressing AltGr+e yields 
ɛ).


In the first case, there would be no apparent change for the user, but 
the mapping from q to ɛ would be moved out of the font and into the 
input process.


The second case would allow access to both English and (e.g.) Bambara 
characters, but would require a change for the user typing Bambara, so 
would probably meet with more resistance.


Tools could be easily written to convert existing text like "tqgq" to 
the real spelling, so compatibility with the hacked fonts would become 
less of a concern.


--
Doug Ewell | http://ewellic.org | Thornton, CO  



Re: Non-standard 8-bit fonts still in use

2016-04-30 Thread Andrew Cunningham
Don,

Most of the African communities I work with in the diaspora are using Unicode,
although 8-bit legacy content is still in use.

Probably the heaviest use of legacy encodings I see is among the Karen
languages. Sgaw Karen users seem to still be using 8-bit fonts. There is a
pseudo-Unicode solution, but 8-bit fonts still dominate.

The problem for Karen is that the default rendering of Unicode fonts isn't
suitable, and locl (OpenType localized forms) support in applications has
been lagging.

The ideal Unicode font for the Myanmar script would need somewhere between 8
and 10 language systems. Cross-platform support is lacking, so currently the
best approach is a separate font for each language system.

Andrew

On Friday, 16 October 2015, Don Osborn  wrote:
> I was surprised to learn of continued reference to and presumably use of
8-bit fonts modified two decades ago for the extended Latin alphabets of
Malian languages, and wondered if anyone has similar observations in other
countries. Or if there have been any recent studies of adoption of Unicode
fonts in the place of local 8-bit fonts for extended Latin (or non-Latin)
in local language computing.
>
> At various times in the past I have encountered the idea that local
languages with extended alphabets in Africa require special fonts (that
region being my main geographic area of experience with multilingual
computing), but assumed that this notion was fading away.
>
> See my recent blog post for a quick and by no means complete discussion
about this topic, which of course has to do with more than just the fonts
themselves:
http://niamey.blogspot.com/2015/10/the-secret-life-of-bambara-arial.html
>
> TIA for any feedback.
>
> Don Osborn
>
>
>

-- 
Andrew Cunningham
lang.supp...@gmail.com


Re: Non-standard 8-bit fonts still in use

2015-10-27 Thread Marcel Schneider
I was preparing the following feedback long before the obituary of Michael S. 
Kaplan.


I remain in mourning.


 

Since the discussion has restarted, may I send this today instead of 
tomorrow?

It was initially planned for yesterday, the day when I found Doug Ewell’s and 
the following messages, which brought me the bad news.


 

I’m grateful for Erkki I. Kolehmainen’s advice to complete my best effort prior 
to sending. My apologies for previous shortcomings.


 

On Thu, 15 Oct 2015 20:22:08 -0400, Don Osborn  wrote:

> I was surprised to learn of continued reference to and presumably use of 
> 8-bit fonts modified two decades ago for the extended Latin alphabets of 
> Malian languages
[...]> 
> See my recent blog post for a quick and by no means complete discussion 
> about this topic, which of course has to do with more than just the 
> fonts themselves: 
> http://niamey.blogspot.com/2015/10/the-secret-life-of-bambara-arial.html

Here is another example of legacy font usage less than two years back:
http://csprousers.org/forum/viewtopic.php?f=1&t=753

❏ Legacy fonts offer at least one substantial advantage, which is already 
underscored in the comments on the cited blog post: they allow the use of any 
habitual ASCII-oriented keyboard layout, like the French one in Mali. 
Personally I feel with all the people who keep using the fonts issued by the 
«1990s […] joint project of the Malian Ministry of Education and the French 
Agence de Coopération Culturelle et Technique (ACCT […])», and I wouldnʼt 
throw away a proven work tool either without being sure of getting a better one.

I suppose the cited people in any case didn't use the "clavier unifié 
français-bambara" that you linked to on another blog post:
http://www.mali-pense.net/IMG/pdf/le-clavier_francais-bambara.pdf
cited on:
http://www.mali-pense.net/Ressources-pour-la-pratique-du.html
cited on:
http://niamey.blogspot.fr/2014/11/writing-bambara-right.html

Indeed there is a big oopsie with keyboard layouts on Windows: we cannot 
associate applications with default keyboard layouts the way we can associate 
file extensions with applications. So one working method that avoids the 
bother of switching keyboard layouts is to have appropriate templates in the 
word processor with extra fonts instead of extra layouts.

❏ The glyph issue: To get «"Bambara Arial" to work on the internet», a simple 
macro replacing ù, Ù, q, Q, x, X, v, V with ɔ, Ɔ, ɛ, Ɛ, ɲ, Ɲ, ŋ, Ŋ isnʼt 
enough, because even though it is Arial, it wonʼt be «true Bambara» any more, 
given the inconsistency of all the fonts I could view that use the “n” form 
for uppercase eng, as Arial and Times do, while sticking with the “N” form for 
uppercase palatal n. I believe this is not just «more to it» but the main 
reason, despite the opposite opinion in the comments. I couldnʼt find any 
suitable font, but book printers must have them, possibly with both shapes in 
the same font.

Such fonts seem to be really obscure. In a bilingual Bambara-French book from 
France (1996), the typeface clearly shows that the “n”-shaped uppercase ɲ has 
been emulated with an oversize lowercase letter. Ported to HTML, this 
workaround amounts to replacing every ‘Ɲ’ with a lowercase ɲ in a [font-size: 
135%; line-height: 75%; font-weight: lighter;] span, though it still looks 
semi-bold when no weight below 400 is available. That should make readers 
aware that it doesnʼt render in plain text. And it works best with sans-serif 
fonts. I really donʼt know whether this bears any resemblance to Bambara 
Arial, and I do wish I could check. I note, too, that such a construct is not 
Unicode conformant.

It would be desirable to overcome that system of special fonts, workarounds, 
and limited support. I donʼt know whether some communities really prefer the 
“N” form for uppercase palatal n, or even for eng, or both. Was there a 
problem at the time the current fonts were created? Does anybody know a 
solution? I am inclined to believe that the eventual solution is language 
tagging and the use of modern rendering engines, along with up-to-date fonts 
providing both glyphs.

However, in my opinion, the correct display of so widely spoken and written a 
language as Bamanankan should not have to rely on sophisticated byways.

❏ About Unicode-aware education: Iʼm not inclined to share presumptions about 
a lack of training in Mali any more than in other countries, including 
European ones. Keyboard layout documentation from Europe, last updated after 
fifteen years of Unicode, still targets non-conformant rendering engines 
(where precomposed and decomposed characters display differently) and doesnʼt 
mind using canonical decomposition afterwards to streamline the input 
(actually, for Bambara, by using the French ‘à’, ‘è’, ‘ù’, and ‘é’ keys). 
Well, the existence of decomposition in text processing was about the first 
thing Unicode taught me, as I was too ignorant to point the browser directly 
to TUS and the UAXes to learn about it, while already bothering about creating 
keyboard layouts... That turns out not to be a single 

Re: Non-standard 8-bit fonts still in use

2015-10-20 Thread Frédéric Grosshans

Le 16/10/2015 02:22, Don Osborn a écrit :
I was surprised to learn of continued reference to and presumably use 
of 8-bit fonts modified two decades ago for the extended Latin 
alphabets of Malian languages, and wondered if anyone has similar 
observations in other countries. Or if there have been any recent 
studies of adoption of Unicode fonts in the place of local 8-bit fonts 
for extended Latin (or non-Latin) in local language computing.
A different case where I suspect 8-bit proprietary fonts are used is 
electronic French (Grandjean) stenotypes, which use some non-Unicode 
characters (like an E without the middle bar). They have apparently been used 
with computer software since the 1980s (cf. 
https://hal.archives-ouvertes.fr/jpa-00245165/document [PDF in French]) 
to make live subtitles. But I guess the proprietary nature of these 
characters and their use by a single company (since ~1910) make their 
encoding in Unicode unlikely.


  Frédéric



Re: Non-standard 8-bit fonts still in use

2015-10-17 Thread Richard Wordingham
On Thu, 15 Oct 2015 20:22:08 -0400
Don Osborn  wrote:

> I was surprised to learn of continued reference to and presumably use
> of 8-bit fonts modified two decades ago for the extended Latin
> alphabets of Malian languages, and wondered if anyone has similar
> observations in other countries. Or if there have been any recent
> studies of adoption of Unicode fonts in the place of local 8-bit
> fonts for extended Latin (or non-Latin) in local language computing.

Non-Unicode fonts have been particularly resilient in Indic scripts,
though I'm not sure what the current state of play is.  I'm not sure
that they are particularly '8-bit', but rather, they re-use the more
accessible codes.

Although these font schemes generally have the disadvantage that plain
text is not supported, in the Indic world they do have advantages over
Unicode:

1) What you type is what you get.  Indic rearrangement irritates a lot
of people.  Several Tai scripts have successfully resisted it, but
Indians have been suppressed by the influence of ISCII.

2) They avoid the dependence on a language-specific shaping engine.
Microsoft's USE may now eliminate this advantage.

3) Text is accessible for editing.  Windows provides no cursor
positioning within grapheme clusters, and the one response has been to
prevent editing of grapheme clusters.   As  a slight compensation, the
idea that backward deletion should delete the preceding encoded
character has a lot of implementation support.

I understand that in Cambodia, Unicode was established by government
edict.

Richard.


Re: Tirhuta (linked to: Re: Non-standard 8-bit fonts still in use)

2015-10-17 Thread Marcel Schneider
On Sat, 17 Oct 2015 09:20:13 +0100, Richard Wordingham  wrote:

> On Thu, 15 Oct 2015 20:22:08 -0400
> Don Osborn  wrote:
> 
> > I was surprised to learn of continued reference to and presumably use
> > of 8-bit fonts modified two decades ago for the extended Latin
> > alphabets of Malian languages, and wondered if anyone has similar
> > observations in other countries. Or if there have been any recent
> > studies of adoption of Unicode fonts in the place of local 8-bit
> > fonts for extended Latin (or non-Latin) in local language computing.
> 
> Non-Unicode fonts have been particularly resilient in Indic scripts,
> though I'm not sure what the current state of play is. I'm not sure
> that they are particularly '8-bit', but rather, they re-use the more
> accessible codes.
> 
> Although these font schemes generally have the disadvantage that plain
> text is not supported, in the Indic world they do have advantages over
> Unicode:
> 
> 1) What you type is what you get. Indic rearrangement irritates a lot
> of people. Several Tai scripts have successfully resisted it, but
> Indians have been suppressed by the influence of ISCII.

Does this mean that OpenType fonts are "overscripted" and that glyph reordering 
and glyph substitution are not appreciated?

If so, the best approach seems to me to convert legacy fonts to 
Unicode-conformant fonts without scripting them, or to provide a kind of 
*stable input* option that disables the advanced behaviour.

Marcel 

 

[First in thread: 
http://www.unicode.org/mail-arch/unicode-ml/y2015-m09/0155.html]


[Previous in thread: 
http://www.unicode.org/mail-arch/unicode-ml/y2015-m09/0156.html]


 


Non-standard 8-bit fonts still in use

2015-10-15 Thread Don Osborn
I was surprised to learn of continued reference to and presumably use of 
8-bit fonts modified two decades ago for the extended Latin alphabets of 
Malian languages, and wondered if anyone has similar observations in 
other countries. Or if there have been any recent studies of adoption of 
Unicode fonts in the place of local 8-bit fonts for extended Latin (or 
non-Latin) in local language computing.


At various times in the past I have encountered the idea that local 
languages with extended alphabets in Africa require special fonts (that 
region being my main geographic area of experience with multilingual 
computing), but assumed that this notion was fading away.


See my recent blog post for a quick and by no means complete discussion 
about this topic, which of course has to do with more than just the 
fonts themselves: 
http://niamey.blogspot.com/2015/10/the-secret-life-of-bambara-arial.html


TIA for any feedback.

Don Osborn