Re: Unicode organization is still anti-Serbian and anti-Macedonian

2014-02-17 Thread Kent Karlsson

Den 2014-02-17 10:33, skrev "Gerrit Ansmann" :

>> I don't like the idea, but one possibility would be to define Serbian glyph
>> styles by adding variation selectors.  Variation selectors are already
>> 'defined' for the decimal digits U+0030 to U+0039.  It would, however,
>> mess up string comparison operations that weren't smart enough to ignore
>> variation selectors.

> Also, for the variation selectors to work for the end user, it requires
> the same technologies whose lack of support is why we are discussing this
> in the first place, doesn't it? So, defining the corresponding variation
> selectors would not make the end user see the correct glyphs earlier.

Still, variation selectors would be, in the text, a very localized
indication, independent of (displaying) user's preference settings
or language declaration (from the author, in e.g. XML/HTML formats)
for the text, and variation selectors are indeed more likely to
survive operations like cut-and-paste. There would be a problem of
inserting variation selectors at all places where appropriate, though.
Spell checking functionality could, in principle at least, help with
the latter.

/Kent K



___
Unicode mailing list
Unicode@unicode.org
http://unicode.org/mailman/listinfo/unicode


Re: Unicode organization is still anti-Serbian and anti-Macedonian

2014-02-17 Thread Otto Stolz

Hello,

Крушевљанин Иван had written:

People, do you realize that proper glyphs are needed everywhere and
every time, CONSTANTLY, even when American ordinary user chats with
German ordinary user about Serbian language


Am 2014-02-17 um 00:50 Uhr MEZ schrieb Richard Wordingham:

One issue here that I don't know the solution for is how the right
glyphs should be chosen for displaying plain text communication.  I
don't know any general mechanism for, say, specifying that by
default Cyrillic text should use Serbian glyphs, CJK characters
should use Japanese glyphs and that Cuneiform should use Neo-Assyrian
glyphs.


This boils down to the fact that, in plain-text communication, the
receiver can – and should – choose the appropriate font. This holds,
in particular, for classical e-mail. Thence my recent claim that the
problem posed by Иван is a mere font issue.

In HTML, this is a bit different: The author has control over the
fonts (thence over the glyphic style) used for the display, but the
reader can normally override the author’s choice. Hence, WWW authors
should specify suitable fonts for their respective articles (or even
parts thereof).

On paper, or in PDF and other facsimile formats, the author is
entirely responsible for the glyphic style and appearance, and he
should always choose suitable fonts. This is the realm of the
solution involving that ‘Gentium Plus srp’ font I had mentioned,
recently.

May I humbly remind Иван (and all other readers of this thread) that
the problem manifests itself (mainly or only) with italic style
letters; hence there remains virtually no problem with normal
(non-italic) style.

Best wishes,
  Otto Stolz




Re: Unicode organization is still anti-Serbian and anti-Macedonian

2014-02-17 Thread Gerrit Ansmann

On Mon, 17 Feb 2014 00:50:45 +0100, Richard Wordingham 
 wrote:


I don't like the idea, but one possibility would be to define Serbian glyph 
styles by adding variation selectors.  Variation selectors are already 
'defined' for the decimal digits U+0030 to U+0039.  It would, however, mess up 
string comparison operations that weren't smart enough to ignore variation 
selectors.


Also, for the variation selectors to work for the end user, it requires the 
same technologies whose lack of support is why we are discussing this in the 
first place, doesn’t it? So, defining the corresponding variation selectors 
would not make the end user see the correct glyphs earlier.


Re: Unicode organization is still anti-Serbian and anti-Macedonian

2014-02-16 Thread Tom Gewecke

On Feb 16, 2014, at 4:33 AM, Крушевљанин wrote:

> 
> What is interesting, I know next to nothing about Apple. (Probably because 
> Macintosh computers are expensive as hell.) I have read something about AAT 
> technology, but what about their fonts? Are there Serbian/Macedonian glyphs?

I had a look, and I think the answer is "no".  (Except for two, one of which is 
Chinese, which seem to have the Serbian 'б' by mistake).


Re: Unicode organization is still anti-Serbian and anti-Macedonian

2014-02-16 Thread Tom Gewecke
On Feb 16, 2014, at 4:50 PM, Richard Wordingham wrote:

> 
> One issue here that I don't know the solution for is how the right
> glyphs should be chosen for displaying plain text communication.  I
> don't know any general mechanism for, say, specifying that by
> default Cyrillic text should use Serbian glyphs, CJK characters
> should use Japanese glyphs and that Cuneiform should use Neo-Assyrian
> glyphs.  

In Mac OS X and iOS, this is currently being done for the CJK case by switching 
fonts according to the order of languages in the system level language 
preferences.  If Japanese is higher than Chinese on the list, then by default a 
Japanese font is used for CJK plain text.
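
The preference-order lookup described above can be sketched roughly as follows. This is only an illustration of the idea, not Apple's implementation: the font names and the `pick_cjk_font` helper are hypothetical placeholders.

```python
# Hypothetical font table: these names are placeholders, not real system fonts.
FONTS_BY_LANGUAGE = {
    "ja": "ExampleMincho-JP",
    "zh-Hans": "ExampleSong-SC",
    "zh-Hant": "ExampleSong-TC",
    "ko": "ExampleBatang-KR",
}


def pick_cjk_font(preferred_languages, default="ExampleSong-SC"):
    """Return the font for the first preferred language that has a CJK font.

    preferred_languages is the user's ordered language list, most preferred
    first. Languages without a CJK font (e.g. "en") are skipped.
    """
    for lang in preferred_languages:
        if lang in FONTS_BY_LANGUAGE:
            return FONTS_BY_LANGUAGE[lang]
    return default


# Japanese is listed above Chinese, so the Japanese font wins.
print(pick_cjk_font(["en", "ja", "zh-Hans"]))
```

The same scheme could in principle select a Serbian-style Cyrillic font when Serbian ranks above Russian in the list, provided such a font is installed.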


Re: Unicode organization is still anti-Serbian and anti-Macedonian

2014-02-16 Thread Richard Wordingham
On Sun, 16 Feb 2014 13:57:38 -0800
David Starner  wrote:
 
> > People, do you realize that proper glyphs are needed everywhere and
> > every time, CONSTANTLY, even when American ordinary user chats with
> > German ordinary user about Serbian language

> And if we picked your option and they did use Cyrillic? I'm betting
> American ordinary user and German ordinary user would load up their
> Russian keyboards and type away using Russian letters for Serbian.

American *ordinary* user and German *ordinary* user would not be typing
Serbian.

One issue here that I don't know the solution for is how the right
glyphs should be chosen for displaying plain text communication.  I
don't know any general mechanism for, say, specifying that by
default Cyrillic text should use Serbian glyphs, CJK characters
should use Japanese glyphs and that Cuneiform should use Neo-Assyrian
glyphs.  

> There won't be new Serbian characters invalidating
> every text stored in systems in Serbian today.

I don't like the idea, but one possibility would be to define Serbian
glyph styles by adding variation selectors.  Variation selectors are
already 'defined' for the decimal digits U+0030 to U+0039.  It would,
however, mess up string comparison operations that weren't smart enough
to ignore variation selectors.

Richard.
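
To illustrate the string-comparison concern raised above, here is a minimal sketch. It assumes a hypothetical sequence of Cyrillic б followed by VS1 (no such variation sequence is actually defined), and it only covers the two main variation-selector ranges, not the Mongolian free variation selectors.

```python
def is_variation_selector(ch):
    """True for VS1..VS16 (U+FE00..U+FE0F) and VS17..VS256 (U+E0100..U+E01EF)."""
    cp = ord(ch)
    return 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF


def strip_variation_selectors(text):
    """Remove variation selectors so that comparison ignores glyph-style hints."""
    return "".join(ch for ch in text if not is_variation_selector(ch))


plain = "\u0431"         # CYRILLIC SMALL LETTER BE
tagged = "\u0431\uFE00"  # same letter plus VS1 (hypothetical sequence)

print(plain == tagged)                             # False: naive comparison fails
print(plain == strip_variation_selectors(tagged))  # True
```

Any search, sort or equality routine that does not perform a step like `strip_variation_selectors` would treat the two strings as different text.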


Re: Unicode organization is still anti-Serbian and anti-Macedonian

2014-02-16 Thread Richard Wordingham
On Sun, 16 Feb 2014 18:44:55 +0100
Philippe Verdy  wrote:

> 2014-02-15 19:25 GMT+01:00 Richard Wordingham <
> richard.wording...@ntlworld.com>:
> 
> > On Fri, 14 Feb 2014 02:37:19 -0800
> > Крушевљанин  wrote:Should these combinations
> > be well known?  They're not listed in the
> > CLDR exemplar characters for Serbian.
> >
> > As for input, I would suggest that the solution for the simpler
> > keyboarding techniques is to enter them as base character and then
> > dead key.

> There's another input method where you can press a key for the
> diacritic after a base letter: this key is treated in isolation and
> immediately generates the combining diacritic, independently of the
> characters pressed before.

Sorry, this is what I meant.  I should have written 'diacritic',
not 'dead key'.

> But such input method will not guarantee the NFC form,

Which is an argument for text editors to have normalisation functions,
like the emacs ucs-normalize-NFC-region command.

> and can
> produce broken sequences (in some cases the diacritic may be
> invisible in the generated text).

Something many users of the Thai script currently have to live
with.

Richard.
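
The kind of normalisation function mentioned above (the emacs ucs-normalize-NFC-region command) has a rough equivalent in most languages. A minimal Python sketch using the standard `unicodedata` module; the helper names `to_nfc` and `needs_normalising` are made up for illustration:

```python
import unicodedata


def to_nfc(text):
    """Return text in Normalization Form C."""
    return unicodedata.normalize("NFC", text)


def needs_normalising(text):
    """True if the text is not already in NFC."""
    return to_nfc(text) != text


# What a post-positioned diacritic key would produce: base letter, then mark.
typed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT

print(needs_normalising(typed))   # True
print(to_nfc(typed) == "\u00e9")  # True: composed to LATIN SMALL LETTER E WITH ACUTE
```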


Re: Unicode organization is still anti-Serbian and anti-Macedonian

2014-02-16 Thread David Starner
Every time you attack the only character set that supports various
third-world African languages and various tiny North American
languages  and various small Indian languages and various Philippine
scripts, as it's "easy for you latin-oriented nations (USA,
Germany...) to ignore the rest of the world, especially third-world
countries", people stop listening to you. Unicode is the system
designed to make it possible to write the scripts of all languages.
Microsoft happens to have been one of the largest drivers behind it,
having spent a lot of money on Unicode and OpenType to make this stuff
possible.

> People, do you realize that proper glyphs are needed everywhere and every 
> time, CONSTANTLY, even when American ordinary
> user chats with German ordinary user about Serbian language

They'd use Latin because that's what their keyboards are going to
support. Virtually every recent protocol runs over some sort of XML,
so language tagging comes free, and if they don't, they need to
provide some sort of language tagging.

And if we picked your option and they did use Cyrillic? I'm betting
American ordinary user and German ordinary user would load up their
Russian keyboards and type away using Russian letters for Serbian. It
is an incredibly well-known problem that if you have two similar
looking characters, users will use the more common one even when the
less common one is the correct one.

There won't be new precomposed characters, and there shouldn't be a
need for them. There won't be new Serbian characters invalidating
every text stored in systems in Serbian today. Maybe 15 years ago, a
change could in theory have been done, but not today. Deal with what
you have, because those decisions have been made and written in stone.

-- 
Kie ekzistas vivo, ekzistas espero.


Re: Unicode organization is still anti-Serbian and anti-Macedonian

2014-02-16 Thread Philippe Verdy
2014-02-15 19:25 GMT+01:00 Richard Wordingham <
richard.wording...@ntlworld.com>:

> On Fri, 14 Feb 2014 02:37:19 -0800
> Крушевљанин  wrote:Should these combinations be well
> known?  They're not listed in the
> CLDR exemplar characters for Serbian.
>
> As for input, I would suggest that the solution for the simpler
> keyboarding techniques is to enter them as base character and then dead
> key.
>
"Dead keys" don't work this way. Their name really indicates that these keys
have no action (seem dead) until another key is pressed AFTER them.

So you press the dead key for the diacritic, then the key for the base
letter, to produce EITHER:
- a single precomposed character (where it exists) ; OR
- a canonically equivalent decomposed combining sequence representing the
letter with its diacritic(s) (preferably in NFC form).

Dead keys may be combined in advanced keyboard drivers supporting complex
input states for handling multiple diacritics typed before a base letter ;
but simple keyboard drivers (such as those generated by MS Keyboard Layout
Editor) do not handle these complex states. But nothing prohibits building
such a keyboard driver.

There's another input method where you can press a key for the diacritic
after a base letter: this key is treated in isolation and immediately
generates the combining diacritic, independently of the characters pressed
before. But such an input method will not guarantee the NFC form, and can
produce broken sequences (in some cases the diacritic may be invisible in
the generated text).

For simple alphabetic scripts (like Latin, Greek, Cyrillic), the dead key
input method is generally preferred. The other one is used to enter isolated
combining diacritics which are almost never used in association with other
letters (and notably not in combining sequences equivalent to an existing
precomposed letter).

If you think about the combining diaeresis, as it is already used very
frequently in association with Latin and Cyrillic letters using a dead key
method, it should also be used as a dead key even for less frequent base
letters such as the Cyrillic letter Q. All that is needed is to use an
updated driver adding the mapping for diacritic dead key+letter, in which
it will output the NFC combining sequence if there's no precomposed NFC
equivalent.
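
The dead-key behaviour described above can be sketched as a small state machine. This is an illustration only, not how any real keyboard driver is implemented: the key names in `DEAD_KEYS` and the `type_keys` function are hypothetical, and the sketch also accepts chained dead keys.

```python
import unicodedata

# Hypothetical dead-key names mapped to the combining marks they stand for.
DEAD_KEYS = {
    "dead_diaeresis": "\u0308",
    "dead_acute": "\u0301",
    "dead_grave": "\u0300",
}


def type_keys(keys):
    """Convert a sequence of key names into text.

    Dead keys produce nothing by themselves; they are held as pending state
    until a base letter arrives. The base letter plus the pending marks is
    then normalised to NFC, which yields a precomposed character where one
    exists and a combining sequence otherwise. Dead keys left pending at the
    end of input are dropped in this sketch.
    """
    out = []
    pending = []
    for key in keys:
        if key in DEAD_KEYS:
            pending.append(DEAD_KEYS[key])
        else:
            out.append(unicodedata.normalize("NFC", key + "".join(pending)))
            pending = []
    return "".join(out)


# Cyrillic а has a precomposed form with diaeresis (U+04D3) ...
print(type_keys(["dead_diaeresis", "\u0430"]) == "\u04d3")        # True
# ... whereas Cyrillic qa (U+051B) does not, so a combining sequence results.
print(type_keys(["dead_diaeresis", "\u051b"]) == "\u051b\u0308")  # True
```

Because normalisation happens once the base letter arrives, the user types diacritic first and letter second, yet the stored text has the base letter first, as Unicode requires.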



Unfortunately, the drivers generated by the MS Keyboard Layout Creator
(MSKLC), when it does not find any explicitly predefined mapping for
diacritic dead key+base letter, will generate the mapping for , followed by the base letter, meaning that you won't get
the text , but  !

The second limitation of MSKLC is that it cannot chain dead letters: each
input state must be mapped to a single state represented by a single
character, which is the spacing modifier letter that would be output if you
press the SPACE bar after the diacritic. It incorrectly assumes that
combinations that are not mapped explicitly will always be used followed by
a space bar keystroke to produce a spacing modifier letter, as if all
unmapped sequences were not possible and do not exist in the real world.

The other limitation is that this input state table can only be represented
by a single character in the BMP (but it may be represented by a PUA of the
BMP, even if MSKLC warns that this character may not be supported by fonts
on the native OS or in the Console using the local legacy OEM or "ANSI"
codepage, an 8-bit code page which may be either SBCS or DBCS).

Drivers built by MSKLC do not allow mapping a dead key outside the root
state table (so after pressing a dead key, possibly in combination with
state modifier keys like Shift, Ctrl, Alt, and with the current state of the
CapsLock/ShiftLock), you can only press a single base character (also
possibly in combination with state modifier keys).

Due to these limitations of MSKLC, trying to generate some advanced keymaps
to support extended sets of combining sequences requires using complex key
combinations with state modifiers (for the dead key and for the base
letter), which are very awkward to input when it would be simpler and faster
to enter them if sequences of dead keys were supported.

Dead keys are not very complex, in fact they are quite friendly and have
the advantage of normalizing the input to NFC directly, without needing any
additional support from the external text editor (modifying the text buffer
on the fly). They are natural to users even if the input order of
keystrokes is reversed, compared to the Unicode encoding of the generated
text (something that most users will never see as they have no idea about
how the text will be finally encoded and used in their applications).


Re: Unicode organization is still anti-Serbian and anti-Macedonian

2014-02-16 Thread Крушевљанин
O-kay, I got several on-list and off-list messages, so I'll compile some 
replies here. I receive this mailing list in daily digest, so please excuse my 
style of replying/commenting. Please read this compilation minutely and don't 
take everything as insult.

People, I am perfectly aware of their existence and capable of using fonts like:
- from Microsoft (Windows Vista and above): Calibri, Cambria, Candara, Consolas 
(please make upper part (macron) of italic 'т' longer, it looks stupid now), 
Constantia, Corbel, Sitka (Gabriola has the potential)
- from Adobe: Arno Pro, Baskerville Cyrillic, Excelsior LT, Garamond Premier 
Pro, Helvetica Inserat, Minion Pro, Myriad (currently misses Serbian 'б'), 
Times Ten, Warnock Pro (Sava Pro also fits for this purpose)
- DejaVu family (Sans, Serif, Mono)
- GNU FreeFont family (Sans, Serif, Mono)
- Ubuntu family (Ubuntu, Mono)
- other useful fonts: Gentium Plus (SIL Graphite technology), EB Garamond.

Linux Libertine/Liberation/Biolinum family currently have severe issues and/or 
missing glyphs. And, font developers: please forgive me if I missed some good 
font for Serbian/Macedonian purposes!

I would like Microsoft to alter and provide Serbian/Macedonian support to 
following old (but unfortunately still used as default in many modern programs) 
fonts: Arial, Comic Sans (please provide Serbian 'б' and fix italic 'т') 
Courier New (please provide Serbian 'б'), Georgia, Impact (please provide 
Serbian 'б'), Tahoma (please provide Serbian 'б'), Times New Roman, Verdana 
(please provide Serbian 'б')

Adobe, Microsoft and others: please also note that, to cover both languages, in 
OpenType fonts you need to place both locale tags, language SRB and language 
MKD. (SRB for Serbian, MKD for Macedonian.) Macedonian cyrillic incorporates 
бгдпт from Serbian cyrillic, plus they have separate character 'ѓ' and italic 
glyph for that character rarely looks correct (GNU FreeSerif and EB Garamond 
have it best).

What is interesting, I know next to nothing about Apple. (Probably because 
Macintosh computers are expensive as hell.) I have read something about AAT 
technology, but what about their fonts? Are there Serbian/Macedonian glyphs? I 
saw one old screenshot of some Serbian Wikipedia page viewed from MacOS (and 
Safari?, I don't know exact details) but I didn't see proper glyphs.

* * *

Unicode problems that small countries (like Serbia and Macedonia) have are 
SEVERE, they can not be called "a mere font issue". Please do not insult my 
intelligence quotient. This is because Serbian/Macedonian language and our 
cyrillic script is not used on south Balkan only. People from all around the 
world communicate, and we all have different operating systems, software, 
fonts...

When folks from America, Germany, Russia, China, Japan... exchange mail, 
documents, textual informations on Wikipedia (even on Wikipedia informations 
are not always and everywhere tagged) with folks in Serbia and Macedonia, they 
all encounter problems — they get Russian cyrillic instead of 
Serbian/Macedonian.

People, do you realize that proper glyphs are needed everywhere and every time, 
CONSTANTLY, even when American ordinary user chats with German ordinary user 
about Serbian language, on different OS-es, textual e-mail/chat clients, GUI 
(Graphical User Interface) forms... We must NOT rely on OpenType and similar 
technologies for this! Serbia and Macedonia became "second-class citizens", 
systematically discriminated in computer world! That's why I want Unicode to 
finally fulfill this requirement. To make Serbia and Macedonia "first-class 
citizens"! And you can not use "Private User Areas", that's not reliable. 
Please read further discussion below with employee from Microsoft.

And note that Serbian/Macedonian cyrillic is not just "preferable", this is not 
appropriate term. The correct glyphs are REQUIRED — we can not accept Russian 
glyphs. Especially when in Russian small italic 'п' and 'т' looks *exactly* 
like latin 'n' and 'm'! That's nonsense for Serbian/Macedonian users (because 
we also use latin).

Furthermore, Serbian small 'б' is visually better than Russian counterpart. 
Sure this is my personal opinion, and I say it because Russian version looks like 
digit 6, Serbian doesn't (or, at least, at very low size)! So, Serbian small 
'б' can enter the Unicode as authentic Serbian letter. It resembles Greek 
gamma, but it's not exactly the same — the pronunciation is different and upper 
part of glyph design must be slightly altered, and result would be fine.

(And all Serbian glyphs are visually better than Russian. Yes, I claim it. 
Russian "curvature" italic 'г', for example, is *extremely ugly* for me. 
Serbian "i-macron" style is better. And longer part of cursive/handwritten 'д' 
always goes below, like latin 'g' in some designs, not above.)

* * *

Technologies like OpenType, SIL Graphite and AAT are good. People want 
stylistic alternate shapes, ornaments etc. But these technologie

Re: precomposed characters (was: Unicode organization is still anti-Serbian and anti-Macedonian)

2014-02-16 Thread Richard Wordingham
On Sat, 15 Feb 2014 19:39:59 +0100
"Janusz S. Bien"  wrote:

> Quote/Cytat - Richard Wordingham   
> (Sat 15 Feb 2014 07:25:51 PM CET):
> > Each precomposed character adds a small processing
> > overhead to an extremely large number of computers, not just to the
> > computers that actually use it.

> This is a very strong claim. Would you be so kind as to elaborate?

The following need to be stored simply because the character has been
assigned:

name (typically for character pick-lists)
script (typically for breaking text runs by script)
casing (upper/lower/titlecase)
collation properties (not strictly necessary)

There are many other properties, but many of them will often be covered
by default rules and may not need to be stored explicitly.

The only likely subsetting options I can think of would be to not
support the supplementary planes or to not support CJK characters.
This data will be moved when an operating system is installed, and the
files are liable to be moved or replaced at other times.  I will concede
that it is possible that this information may not need to be moved from
disk to memory - the data is likely to be ordered by codepoint and if
nearby codepoints are never used either it will not need to be loaded.

Some data files are mapped to memory, but unfortunately I can't
comment on the processing overhead of increasing their size if the
additional data is not accessed.

The operation that will be most significantly affected is
composition.  I am assuming that composition information will be used
even in the presence of a composition exclusion, e.g. to select the
best glyph from a font.  (One could optimise this away by potentially
rendering the canonical decomposition of a precomposed character
differently to the precomposed character.)  The composition data,
consisting of the pairs of characters to which precomposed characters
decompose, will be stored in codepoint order of the decomposition.  The
net effect of this is that the existence of unused composition data
will increase the number of cache misses, and thus increase the amount
of processing required.

If there is not a separate store of compositions not subject to
composition exclusion, then the same effect will occur whenever a
composition happens as part of the transform of a character string to
NFC or NFKC, e.g. in the processing of a non-ASCII internet domain name.

If data access is not carefully optimised, there will be many more
occasions when unused decompositions will nevertheless add to the
processing load.

Richard.
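
To make the composition data discussed above concrete, here is a rough sketch that builds the table of decomposition pairs from Python's `unicodedata` module. It is an assumption about how such a table might be laid out, not a description of any particular implementation; `build_composition_table` is a made-up name, and scanning every code point takes a moment.

```python
import unicodedata


def build_composition_table():
    """Map (first, second) decomposition pairs to their precomposed character.

    Only canonical two-character decompositions are kept. Characters that NFC
    does not recompose (composition exclusions) are filtered out by checking
    what NFC actually produces for the pair.
    """
    table = {}
    for cp in range(0x110000):
        ch = chr(cp)
        decomp = unicodedata.decomposition(ch)
        if not decomp or decomp.startswith("<"):
            continue  # no decomposition, or a compatibility decomposition
        parts = decomp.split()
        if len(parts) != 2:
            continue  # singleton decomposition
        pair = (chr(int(parts[0], 16)), chr(int(parts[1], 16)))
        if unicodedata.normalize("NFC", pair[0] + pair[1]) != ch:
            continue  # excluded from composition
        table[pair] = ch
    return table


table = build_composition_table()
# Stored in code point order of the decomposition, as described above.
ordered_pairs = sorted(table)

print(table[("e", "\u0301")] == "\u00e9")  # True
print(("\u0430", "\u030f") in table)       # False: no precomposed Cyrillic а with double grave
```

Every precomposed character added to the standard adds one more entry to a table of this kind, whether or not a given machine ever processes text that uses it.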


Re: Unicode organization is still anti-Serbian and anti-Macedonian

2014-02-15 Thread Richard Wordingham
On Sat, 15 Feb 2014 17:43:05 -0800
"Steven R. Loomis"  wrote:

> Richard, SRB and MKD respectively are both in the page you linked to.

Good.  I made the mistake of thinking the list was sorted by
English language name, rather than tag.

Richard.

> >I do seem to have found a problem, though I find it hard to believe.
> >When I looked for the OpenType language tag for Serbian at
> >http://www.microsoft.com/typography/otspec/languagetags.htm , it
> >wasn't there!


Re: Unicode organization is still anti-Serbian and anti-Macedonian

2014-02-15 Thread Richard Wordingham
On Sat, 15 Feb 2014 18:25:51 +
Richard Wordingham  wrote:

> On Fri, 14 Feb 2014 02:37:19 -0800
> Крушевљанин  wrote:
> 
> > There is still problem with letters бгдпт in italic, and б in
> > regular mode.
> 
> > OpenType support is still very weak (Firefox, LibreOffice on Linux,
> > Adobe's software and that's it, practically). It's also
> > disappointing that Microsoft is still incapable to implement and
> > force this support on system level.
> 
> I'll be interested to know what stops Gentium Plus, suggested by Otto
> Stolz, from working on, say, Windows 7.

I do seem to have found a problem, though I find it hard to believe.
When I looked for the OpenType language tag for Serbian at
http://www.microsoft.com/typography/otspec/languagetags.htm , it
wasn't there!  Now I'm puzzled as to how any flavour of OpenType is
supposed to automatically switch between Russian and Serbian italics as
such. Gentium Plus (italic) has the Serbian italic glyphs, but via the
aalt feature, which I don't think is what one would want for most uses.

> I'm very sure the support is there at a system level

It seems I was wrong!

Richard.



Re: Unicode organization is still anti-Serbian and anti-Macedonian

2014-02-15 Thread Gerrit Ansmann

On Fri, 14 Feb 2014 11:37:19 +0100, Крушевљанин  wrote:


There is still problem with letters бгдпт in italic, and б in regular mode.

OpenType support is still very weak (Firefox, LibreOffice on Linux, Adobe's 
software and that's it, practically). It's also disappointing that Microsoft is 
still incapable to implement and force this support on system level.

I want Unicode organization to change their politics and pay attention to small 
countries like Serbia and Macedonia. We have real-world problems. Thank you.


Just to avoid that I am arguing from a wrong premise: From what I gathered in a 
quick research, the problem is that the upright letter б and the italic letters 
б, г, д, п and т have a different shape in Serbian and Macedonian Cyrillic than 
in other flavours of Cyrillic.

First of all, the lack of support of such features by font creators and the 
support of font standards by a certain software company (who ironically happens 
to have created that standard) are hardly Unicode’s fault. It’s like 
complaining to your government that your favorite merchant still won’t sell 
bananas, though bananas were legalised twenty years ago.

But most importantly: Encoding these characters won’t do your goal any good for 
many reasons:
• Even if Unicode did include these characters today, it would take a long time 
for creators of fonts and other software to catch up – just consider how slow 
support of OpenType (or other intelligent font standards) is growing, despite 
the fact that it concerns a lot of languages and not just two.
• You cannot control every old text to be converted. However, for many such 
texts you can control with which font or font technologies they are rendered. 
The support of working solutions for the latter is likely to grow even slower 
if your request were granted.
• People will be confused as to which characters they should use.
• In the current situation, if a font does support Cyrillic but not the Serbian 
and Macedonian specialties, there is a decent if not identical fallback in many 
cases. If the new characters were used, however, fonts that support Cyrillic 
but not the new characters (which especially includes every font that exists 
today) would not even render the upright versions of the new Serbian/Macedonian 
г, д, п and т correctly, even though they do contain these glyphs.
• If you consider this only a temporary makeshift solution to the problem, it 
works against other temporary solutions (see below).

Actually, the only advantage I see in encoding these letters separately is that 
it makes type designers aware of these specialties of Serbian and Macedonian – 
but neither is this the purpose of Unicode nor is it the best way to do so, and 
moreover does it not compensate the aforementioned disadvantages.

Some suggestions on how to better invest your resources and energy on this issue:
• Make type designers aware of this.
• Enforce support of OpenType (or other intelligent font standards) or work on 
it yourself. (In general, it would be good if people stopped working on 
makeshift solutions for problems specific to their language or complaining 
about their lack of support and started working on the support of global 
standards that will not only solve their problem but benefit many other 
languages too.)
• Improve open-source fonts by adding the special glyphs yourself.
• As a temporary solution: Request and advertise versions of important fonts 
that default to the Serbian/Macedonian versions of said characters instead of 
the others. Or for open-source fonts: Make those versions yourself. (See also 
Otto Stolz’s answer)


In Unicode, Latin scripts are always favored, which is simply not fair to the 
rest of the world. They have space to put glyphs for dominoes, a lot of dead 
languages etc. but they don't have space for real-world issues.


It’s somewhat amazing how you complain about Unicode’s focus on Latin script 
and its encoding of things that are not Latin in one line.


Also, there are Serbian/Macedonian cyrillic vowels with accents (total: 7 types 
× 6 possible letters = 42 combinations) where majority of them don't exist 
precomposed, and is impossible to enter them. A lot of nowadays' fonts (even 
commercial) still have issues with accents.


At least for the 6 accents and 5 vowels I found, using combining diacritical 
marks should work very well even without OpenType, given that the font supports 
these characters (and you can bet that a font which does not even support this, 
would not support your requested precomposed glyphs).
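
As a small illustration of the combining-mark approach above: the sketch below pairs five Cyrillic vowels with a few combining marks and reports which combinations NFC composes into a single code point. The choice of marks is an assumption made for the example and is not taken from the thread; `accented` is a made-up helper name.

```python
import unicodedata

VOWELS = "\u0430\u0435\u0438\u043e\u0443"  # Cyrillic а е и о у

# Marks assumed for illustration only.
MARKS = {
    "grave": "\u0300",
    "acute": "\u0301",
    "macron": "\u0304",
    "double grave": "\u030f",
    "inverted breve": "\u0311",
}


def accented(base, mark):
    """Return the NFC form of base + combining mark."""
    return unicodedata.normalize("NFC", base + mark)


for vowel in VOWELS:
    for name, mark in MARKS.items():
        text = accented(vowel, mark)
        kind = "precomposed" if len(text) == 1 else "combining sequence"
        print(f"{vowel} + {name}: {kind}")
```

Either way the text is valid Unicode; whether the accent is positioned well over the letter depends on the font, not on the encoding.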


precomposed characters (was: Unicode organization is still anti-Serbian and anti-Macedonian)

2014-02-15 Thread Janusz S. Bien
Quote/Cytat - Richard Wordingham   
(Sat 15 Feb 2014 07:25:51 PM CET):

Each precomposed character adds a small processing
overhead to an extremely large number of computers, not just to the
computers that actually use it.


This is a very strong claim. Would you be so kind as to elaborate?

Best regards

Janus

--
Prof. dr hab. Janusz S. Bień -  Uniwersytet Warszawski (Katedra  
Lingwistyki Formalnej)

Prof. Janusz S. Bień - University of Warsaw (Formal Linguistics Department)
jsb...@uw.edu.pl, jsb...@mimuw.edu.pl, http://fleksem.klf.uw.edu.pl/~jsbien/



Re: Unicode organization is still anti-Serbian and anti-Macedonian

2014-02-15 Thread Richard Wordingham
On Fri, 14 Feb 2014 02:37:19 -0800
Крушевљанин  wrote:

> There is still problem with letters бгдпт in italic, and б in regular
> mode.

> OpenType support is still very weak (Firefox, LibreOffice on Linux,
> Adobe's software and that's it, practically). It's also disappointing
> that Microsoft is still incapable to implement and force this support
> on system level.

I'll be interested to know what stops Gentium Plus, suggested by Otto
Stolz, from working on, say, Windows 7.  I'm very sure the support is
there at a system level - the problem (if any) is more likely to be
at an application level.  Does LibreOffice on Windows not support
Serbian italics?   

> Also, there are Serbian/Macedonian cyrillic vowels with accents
> (total: 7 types × 6 possible letters = 42 combinations) where
> majority of them don't exist precomposed, and is impossible to enter
> them. A lot of nowadays' fonts (even commercial) still have issues
> with accents.

Should these combinations be well known?  They're not listed in the
CLDR exemplar characters for Serbian.

As for input, I would suggest that the solution for the simpler
keyboarding techniques is to enter them as base character and then dead
key.  Dead keys could be available for more advanced input systems,
e.g. ibus on Linux and 'text services' on Windows (Vista and above, I
believe).
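
A minimal sketch of what entering these letters as a base character plus a combining mark amounts to at the code point level. The five vowels and five accents below are an assumed sample, since the thread does not list the 7 accent types or the 6 letters; only Python's standard unicodedata module is used.

```python
import unicodedata

# Hypothetical sample: five Cyrillic vowels and five combining accents.
# The thread does not enumerate the 7 accent types or the 6 letters,
# so this selection is an assumption for illustration only.
VOWELS = ["\u0430", "\u0435", "\u0438", "\u043E", "\u0443"]  # а е и о у
ACCENTS = {
    "\u0300": "grave",
    "\u0301": "acute",
    "\u0304": "macron",
    "\u030F": "double grave",
    "\u0311": "inverted breve",
}


def accented(base, mark):
    """Return base + combining mark, normalized to NFC."""
    return unicodedata.normalize("NFC", base + mark)


def has_precomposed(base, mark):
    """True if NFC folds the pair into a single code point."""
    return len(accented(base, mark)) == 1


if __name__ == "__main__":
    for v in VOWELS:
        for m, label in ACCENTS.items():
            status = "precomposed" if has_precomposed(v, m) else "sequence only"
            print(f"{accented(v, m)}  {v} + {label}: {status}")
```

Either way the text is valid Unicode: a pair without a precomposed form simply stays a two-code-point sequence, and whether it displays well depends on the font and rendering system.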

> In Unicode, Latin scripts are always favored, which is simply not
> fair to the rest of the world. They have space to put glyphs for
> dominoes, a lot of dead languages etc. but they don't have space for
> real-world issues.

Precomposed characters are an unpleasant feature in Unicode.  I am
curious as to how the Serbian combinations escaped notice.  When are
they actually used?  Each precomposed character adds a small processing
overhead to an extremely large number of computers, not just to the
computers that actually use it.  By contrast, dominoes can be ignored
when no-one using the computer is using the characters for them.
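
One possible reading of the overhead, sketched with Python's standard unicodedata module: a precomposed character carries a canonical decomposition that every normalizing implementation has to know about, while a domino tile has none. The helper name canonically_equal is made up for this sketch.

```python
import unicodedata


def canonically_equal(a, b):
    """Compare two strings after normalizing both to NFD."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


PRECOMPOSED = "\u0450"     # CYRILLIC SMALL LETTER IE WITH GRAVE
SEQUENCE = "\u0435\u0300"  # CYRILLIC SMALL LETTER IE + COMBINING GRAVE ACCENT
DOMINO = "\U0001F030"      # DOMINO TILE HORIZONTAL BACK

if __name__ == "__main__":
    # The precomposed letter has a decomposition mapping in the character data.
    print(unicodedata.decomposition(PRECOMPOSED))
    # A plain code point comparison treats the two spellings as different...
    print(PRECOMPOSED == SEQUENCE)
    # ...so software has to normalize before comparing.
    print(canonically_equal(PRECOMPOSED, SEQUENCE))
    # The domino tile has no decomposition mapping at all.
    print(repr(unicodedata.decomposition(DOMINO)))
```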

Richard.



Re: Unicode organization is still anti-Serbian and anti-Macedonian

2014-02-15 Thread Otto Stolz

Hello,

Am 14.02.2014 11:37, schrieb Крушевљанин:

There is still problem with letters бгдпт in italic, and б in regular mode.


As has already been said in this thread, this is merely a font issue:
you have to use a particular font in order to display those italic
letters in the Serbian and Macedonian style.

One example:
The ‘Gentium Plus’ font from SIL International, available from

can be configured to display the Serbian/Macedonian style italics
rather than the glyphs used elsewhere. If this configuration is
too cumbersome for you, feel free to ask me privately, for a copy
of the font, configured for Serbian/Macedonian. I can send you
that copy, without any obligation to maintain it or to adapt
forthcoming versions.

Best wishes,
  Otto Stolz





Re: Unicode organization is still anti-Serbian and anti-Macedonian

2014-02-14 Thread Mark Davis ☕
Unicode is not anti-Serbian or Macedonian.

The exact level of Unicode support will depend on your operating system and
font choice. For example, on the Mac there are reasonable results with
arbitrary accents. Here are examples with  and 

q̈

Q̈

Here is an image, in case your emailer or OS doesn't handle these well.

[image: Inline image 1]
See also http://www.unicode.org/standard/where/
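
As a rough illustration of what such examples consist of, the sketch below lists the code points behind q with a diaeresis, using Python's standard unicodedata module. The helper name code_points is made up for this sketch.

```python
import unicodedata


def code_points(text):
    """List each code point in text as 'U+XXXX NAME'."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


if __name__ == "__main__":
    for sample in ("q\u0308", "Q\u0308"):
        print(sample, code_points(sample))
        # NFC leaves the sequence unchanged: there is no precomposed
        # q with diaeresis, so rendering is left to the font and OS.
        print(unicodedata.normalize("NFC", sample) == sample)
```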

As to the italic, that also depends on the font support on your system.



Mark 

*— Il meglio è l’inimico del bene —*


On Fri, Feb 14, 2014 at 2:37 AM, Крушевљанин  wrote:

> There is still problem with letters бгдпт in italic, and б in regular mode.
>
> OpenType support is still very weak (Firefox, LibreOffice on Linux,
> Adobe's software and that's it, practically). It's also disappointing that
> Microsoft is still incapable to implement and force this support on system
> level.
>
> Also, there are Serbian/Macedonian cyrillic vowels with accents (total: 7
> types × 6 possible letters = 42 combinations) where majority of them don't
> exist precomposed, and is impossible to enter them. A lot of nowadays'
> fonts (even commercial) still have issues with accents.
>
> In Unicode, Latin scripts are always favored, which is simply not fair to
> the rest of the world. They have space to put glyphs for dominoes, a lot of
> dead languages etc. but they don't have space for real-world issues.
>
> I want Unicode organization to change their politics and pay attention to
> small countries like Serbia and Macedonia. We have real-world problems.
> Thank you.
>
> If you think these are biases of me, I say — real-world problem for us.
> If you think changes would invalidate existing texts, I say — no, because
> *real* Serbian/Macedonian support still doesn't exist! And we can develop
> converters in the future, so I don't see any "huge cost" problems...
>
> --
> Крушевљанин Иван


Unicode organization is still anti-Serbian and anti-Macedonian

2014-02-14 Thread Крушевљанин
There is still a problem with the letters бгдпт in italic, and with б in
regular mode.

OpenType support is still very weak (Firefox, LibreOffice on Linux, Adobe's
software, and that's practically it). It's also disappointing that Microsoft is
still unable to implement and enforce this support at the system level.

Also, there are Serbian/Macedonian Cyrillic vowels with accents (total: 7 types
× 6 possible letters = 42 combinations), most of which don't exist
precomposed, and it is impossible to enter them. Many of today's fonts (even
commercial ones) still have issues with accents.

In Unicode, Latin scripts are always favored, which is simply not fair to the 
rest of the world. They have space to put glyphs for dominoes, a lot of dead 
languages etc. but they don't have space for real-world issues.

I want the Unicode organization to change its policies and pay attention to
small countries like Serbia and Macedonia. We have real-world problems. Thank you.

If you think this is bias on my part, I say: it is a real-world problem for us.
If you think changes would invalidate existing texts, I say: no, because
*real* Serbian/Macedonian support still doesn't exist! And we can develop
converters in the future, so I don't see any "huge cost" problems...

-- 
Крушевљанин Иван
