RE: Accessing the WG2 document register

2015-06-15 Thread Doug Ewell
Marcel Schneider  wrote:

> I don't measure exactly the implications of a keyboard compliance to
> a given standard when this standard is developed "on the paper" and
> without taking into consideration all needs and preferences of end-
> users.

ISO did not come up with the 2010 revision to 9995-3 on their own. It
originated with the German NB.

> The Ohm sign you mention reminds me that ISO perpetuated on
> keyboard some deprecated legacy characters that end up anyway to be
> replaced with their canonical equivalent, that in this example is
> Greek capital omega. That's another disconnect.

The relationship between U+2126 OHM SIGN and U+03A9 GREEK CAPITAL LETTER
OMEGA is not at issue here. Neither of these characters is present on US
International.
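
As an aside on the canonical-equivalence point in the quoted text, here is a minimal Python sketch (an illustration added here, not something posted in the thread) showing that normalization does fold U+2126 into U+03A9. It uses only the standard `unicodedata` module; the helper name `nfc` is just a convenience.

```python
import unicodedata

OHM_SIGN = "\u2126"        # U+2126 OHM SIGN
CAPITAL_OMEGA = "\u03a9"   # U+03A9 GREEK CAPITAL LETTER OMEGA


def nfc(text):
    """Return the NFC-normalized form of text (hypothetical helper name)."""
    return unicodedata.normalize("NFC", text)


# U+2126 has a singleton canonical decomposition, so every canonical
# normalization form replaces it with the Greek letter.
print(unicodedata.decomposition(OHM_SIGN))
print(nfc(OHM_SIGN) == CAPITAL_OMEGA)
```

None of this bears on what US International offers; it only shows what happens to an ohm sign once text is normalized.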

> And standardizing the dead key registries to exclude all characters
> that are not composed ones, is a counterproductive constraint based on
> the belief that the only way to get aware of the content of a layout
> is to read the keycap labels. This is a way of never getting curly
> quotes and apostrophe.

Dead keys under Windows are not constrained in the way you describe. As
I said earlier today, I use a keyboard on Windows on which all of these
characters are available via dead keys: “ ” ‘ ’ ʼ
 
> I'm very glad to learn there is this good keyboard layout for the USA
> and for the UK, and I wonder very much what's missing for everybody to
> use it.
> Thank you very much, I just downloaded the two drivers and I'm curious
> about how to map nine hundred characters on two levels without
> chaining dead keys!
> Well I didn't look for, because at the beginning I searched for the
> French keyboard.

Since John made the .klc source file available with the download, I'm
sure it would not be too difficult to adapt it to a French-based layout.

> The problem is not about code pages, it is about keeping them vividly
> in users' minds and letting them impact the Unicode Standard while
> since a quarter of a century, Unicode is on.

I'd guess there are very few users who consciously see the use of U+2019
as both apostrophe and right-single-quote as a vestige of code pages, or
as a conscious effort by Evil Microsoft™ to force them into anything.
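
For anyone wondering what the practical difference between the two characters is, the following Python sketch may help. It assumes nothing beyond the standard library: U+2019 is punctuation, U+02BC is a modifier letter, and a naive `\w+` tokenizer therefore treats them differently. The helper name `words` is mine.

```python
import re
import unicodedata

RIGHT_SINGLE_QUOTE = "\u2019"   # U+2019 RIGHT SINGLE QUOTATION MARK
MODIFIER_APOSTROPHE = "\u02bc"  # U+02BC MODIFIER LETTER APOSTROPHE


def words(text):
    """Split text into runs of Unicode word characters (illustrative helper)."""
    return re.findall(r"\w+", text)


for ch in (RIGHT_SINGLE_QUOTE, MODIFIER_APOSTROPHE):
    # Print code point, general category and character name.
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))

# The punctuation character splits the word; the modifier letter does not.
print(words("don" + RIGHT_SINGLE_QUOTE + "t"))
print(words("don" + MODIFIER_APOSTROPHE + "t"))
```

Whether that difference matters enough to change everyday typing habits is exactly what this thread is arguing about.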

> There's so much communication about word processing, that there would
> have been a little place to introduce the difference between an
> apostrophe and a single closing quotation mark, but instead of that,
> Microsoft urged Unicode to remove the recommendation and to restore
> the chaos.

Perhaps a UTC member can confirm whether this is fact or speculation.
Markus Kuhn's comment from 1999 about "couldn't Unicode follow
Microsoft...?" doesn't prove that Unicode was in fact strong-armed by
Microsoft.

--
Doug Ewell | http://ewellic.org | Thornton, CO 🇺🇸




RE: Accessing the WG2 document register

2015-06-15 Thread Marcel Schneider
On Mon, Jun 15, 2015, 18:36, Doug Ewell  wrote:

 

> The Level 3 and "Level 4" (Shift+AltGr) allocations of US International
> do not conform in any way to the common secondary layout of either
> 9995-3:2002 or 9995-3:2010. For example, there is no ohm sign on US
> International in any group or level, either at D01 (2002) or D02 (2010).
> Perhaps we are not talking about the same thing when we say "conforms to
> ISO/IEC 9995."

 

I don't measure exactly the implications of a keyboard compliance to a given
standard when this standard is developed "on the paper" and without taking
into consideration all needs and preferences of end-users. The Ohm sign
you mention reminds me that ISO perpetuated on keyboard some
deprecated legacy characters that end up anyway to be replaced with their
canonical equivalent, that in this example is Greek capital omega. That's
another disconnect.

And standardizing the dead key registries to exclude all characters that are not
composed ones, is a counterproductive constraint based on the belief that
the only way to get aware of the content of a layout is to read the keycap
labels. This is a way of never getting curly quotes and apostrophe.

 

On Mon, Jun 15, 2015, 17:12, Doug Ewell  wrote:

 

> I use John Cowan's Moby Latin keyboard, built with MSKLC, which is 100%
> compatible with the AltGr-less US keyboard and supports almost 900 other
> characters, including all of the apostrophes and quotes and dashes and
> other characters under discussion:

> http://recycledknowledge.blogspot.com/2013/09/us-moby-latin-keyboard-for-windows.html

> I spent years designing and updating my own keyboard layout and studying
> other layouts. I've ended this quest since I started using Moby Latin;
> it's the best I've seen in numerous ways.

 

I'm very glad to learn there is this good keyboard layout for the USA and
for the UK, and I wonder very much what's missing for everybody to use it.
Thank you very much, I just downloaded the two drivers and I'm curious about how
to map nine hundred characters on two levels without chaining dead keys!
Well I didn't look for, because at the beginning I searched for the French
keyboard.

 

 
>> Microsoft’s choice of mashing up apostrophe and close-quote to end up
>> with an unprocessable hybrid was wrong. Very wrong.

 
> Windows-1252 and the other Windows code pages were developed during the
> 1980s, before Unicode, when almost all non-Asian character sets were
> limited to 256 code points. The distinctions between apostrophe and
> right-single-quote, weighed against the confusion caused by encoding two
> identical-looking characters, would never have been sufficient back then
> to justify separate encoding in this limited space.

 

The problem is not about code pages, it is about keeping them vividly
in users' minds and letting them impact the Unicode Standard while
since a quarter of a century, Unicode is on.

The amazing chance of being able to disambiguate apostrophe and
close-quote has been purposely overridden after Unicode had
published clearly that U+02BC is apostrophe. Nothing was simpler
than letting this recommendation as it was, and tackle the job of
implementing Unicode on Windows, on Microsoft Office, and in the
offices. There's so much communication about word processing,
that there would have been a little place to introduce the difference
between an apostrophe and a single closing quotation mark, but
instead of that, Microsoft urged Unicode to remove the recommendation
and to restore the chaos.

I can't believe that was OK. Never, never.

 

Marcel Schneider



RE: Accessing the WG2 document register

2015-06-15 Thread Doug Ewell
Marcel Schneider  wrote:

> The US International keyboard layout indeed conforms to ISO/IEC 9995.
> AFAIK it was preexistent, and was validated for conformance by
> considering that the AltGr and Shift + AltGr shift states contain the
> secondary group.
> I did not think about it as an _implementation_ of ISO/IEC 9995.

"ISO/IEC 9995" is a multi-part standard that covers many different
aspects of keyboards. US International certainly conforms to many of the
parts:

• it has alphanumeric, numeric, and editing zones with keys which can
be referenced by "E01" notation, as per 9995-1

• it has shifting keys which are used to select levels

• the primary layout (Levels 1 and 2) conforms to 9995-2, as does
practically any Latin-script keyboard

• it has Escape and cursor keys in conformance with 9995-5

• and so on.

The Level 3 and "Level 4" (Shift+AltGr) allocations of US International
do not conform in any way to the common secondary layout of either
9995-3:2002 or 9995-3:2010. For example, there is no ohm sign on US
International in any group or level, either at D01 (2002) or D02 (2010).
Perhaps we are not talking about the same thing when we say "conforms to
ISO/IEC 9995."

--
Doug Ewell | http://ewellic.org | Thornton, CO 🇺🇸




Re: Accessing the WG2 document register

2015-06-15 Thread Marcel Schneider
On Mon, Jun 15, 2015, Doug Ewell  wrote:

> At least it was possible to implement the old ISO 9995-3 standard on
> Windows, treating Group 2, Levels 1 and 2 as if they were Group 1,
> Levels 3 and 4 -- in other words, by using AltGr and Shift+AltGr.

The US International keyboard layout indeed conforms to ISO/IEC 9995. 
AFAIK it was preexistent, and was validated for conformance by considering that 
the AltGr and Shift + AltGr shift states contain the secondary group. 
I did not think about it as an _implementation_ of ISO/IEC 9995.

> The new ISO 9995-3 standard isn't implemented anywhere, and can't be as
> long as no specification exists to access the additional groups and
> shift states without adding more physical keys. "Figure it out for
> yourself" is not a specification.

The new German standard keyboard layouts T2 and T3 conform to ISO/IEC 9995.
Other national keyboard layouts before them do, too.
They have exactly a Group 1 with three levels and a Group 2 with two.

Marcel Schneider


Re: Accessing the WG2 document register

2015-06-15 Thread Doug Ewell
Marcel Schneider  wrote:

> This makes me remember the idea I got about ISO when I considered the
> ISO/IEC 9995 standard. This standard specifies that on all keyboards,
> there should be a so-called common secondary group, and that this
> secondary group should contain all the characters that are on the
> keyboard but aren't for a so-called strictly national use.  This
> sounds to me as if it were fascistic or neofascistic.

Please read the history of attempts to standardize keyboard layouts
across national boundaries. National standard bodies have always
insisted on their particular differences in layout (Q/A, W/Z, Y/Z) and
convenient access to characters specific to their languages. This is not
imposed from the outside.

> The way this secondary group is accessed seems rather complicated and
> been engineered in disconnect from actual OSs and keyboard drivers.
> The result was that when it went on to be implemented on Windows, the
> secondary group was not accessed like specified but as Kana levels,
> which is very consistent with a real keyboard. But in the meantime,
> this ISO/IEC 9995 standard wastes a whole shift state by excluding it
> simply from use, on the pretext that you need to press more than two
> keys: Shift + AltGr + another key. This restriction to a maximum
> number of two simultaneously pressed keys was so fancy Microsoft
> didn't bother about. Really, to enter a character from the second
> level of the secondary group, you need to press Shift + Kana + another
> key.  That's all OK, but the ISO/IEC 9995 standard is *not*. 

At least it was possible to implement the old ISO 9995-3 standard on
Windows, treating Group 2, Levels 1 and 2 as if they were Group 1,
Levels 3 and 4 -- in other words, by using AltGr and Shift+AltGr.
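
To make that workaround concrete, here is a small Python sketch of the mapping just described. It is only an illustration under the stated assumption (Group 2 reached through the AltGr shift states); the table and the function name `modifiers_for` are hypothetical, not part of any standard or driver API.

```python
# Assumption taken from the paragraph above: the old 9995-3 secondary group
# (Group 2) is exposed on Windows as if it were Levels 3 and 4 of Group 1.
WINDOWS_SHIFT_STATE = {
    (1, 1): frozenset(),                    # Group 1, Level 1: no modifier
    (1, 2): frozenset({"Shift"}),           # Group 1, Level 2
    (2, 1): frozenset({"AltGr"}),           # Group 2, Level 1 -> "Level 3"
    (2, 2): frozenset({"Shift", "AltGr"}),  # Group 2, Level 2 -> "Level 4"
}


def modifiers_for(group, level):
    """Return the modifier keys to hold for an ISO 9995 (group, level) pair."""
    try:
        return WINDOWS_SHIFT_STATE[(group, level)]
    except KeyError:
        raise ValueError(f"no shift state defined for group {group}, level {level}")


print(sorted(modifiers_for(2, 2)))
```

Anything beyond these four entries raises an error, which mirrors the complaint below: nothing specifies how further groups and levels would be reached.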

The new ISO 9995-3 standard isn't implemented anywhere, and can't be as
long as no specification exists to access the additional groups and
shift states without adding more physical keys. "Figure it out for
yourself" is not a specification.

--
Doug Ewell | http://ewellic.org | Thornton, CO 🇺🇸




Re: Another take on the English Apostrophe in Unicode

2015-06-15 Thread Philippe Verdy
2015-06-15 16:49 GMT+02:00 Marcel Schneider :


> It's indeed very useful to keep two Control modifiers. Because the
> modifiers at the left and right border of the block are acted with the
> little finger and should thus be symmetrical. This does not apply to the Alt
> keys and other keys more or less centered around the space bar, which are
> acted with the thumbs. As Alt is less used than Kana (when there is a Kana
> key), Kana should be on left Alt, symmetrical to the (on many keyboards
> already implemented) AltGr key. The Alt key comes then on the Applications
> key, which is mnemonic because of the contextual menu icon. Internally,
> indeed, the Alt keys (left and right) are called Menu keys (Virtual key
> Left Menu or VK_LMENU, and VK_RMENU). This contextual menu is then invoked
> pressing the right Windows key, which is consistently missing on laptops.
>
Not just laptops. My desktop PC only has a single Windows key, on the left.
Anyway, there's little use for the Windows key, which was a late addition
(and there are still a lot of keyboards that don't have this key). The same
remark applies to the ScrollLock key (which is now frequently remapped to
Fn+Pause/SysAttn or another similar combination using the single Windows key
when there's no Fn key which is typical of notebooks).

However I disagree with your opinion about AltGr+Shift combinations: it
works perfectly including with the ISO 9995 definitions: the unshifted and
shifted position are in the same "group".

However ISO 9995 allows CapsLock to be used to create other groups instead
of just reproducing the shifted/unshifted layout. It can be very useful for
users in India to switch between Latin and local abugidas. It could be used
as well by users writing in Arabic and Hebrew abjads, or with
African (Ethiopic) or North-American syllabary scripts that are complex to
map on a usable keyboard.

But I think that keyboards should all have a dedicated Kana key to easily
map additional groups without sacrificing other shift keys on the last row:
keyboards really don't need two Windows keys and so the space bar can
remain with a comfortable width (as well as the Shift key or Backspace,
which is too narrow on many keyboards).
On the last row there should never be more than 7 keys on both sides of
the space bar, and the most external keys (Ctrl) have to remain wide. If a
Kana key is present, in fact it should be to the right of the right
Control, or to the right of the right Shift.

AltGr needs to keep some width extension compared to letter keys, and in
fact could be larger than the left Alt, because it is used for entering
text. The Application key is too large for me, just like the left Windows
key (its extra width should be better given to the left Control key to make
it a bit more central).

Those who design keyboards almost never test them for real usability: they
prefer selling them with many packed multimedia functions (or buttons for
Calc, Mail, Web or switching windows, which are rarely used). Only
keyboards for gamers get some attention, but only to give them additional
programmable function keys for specific games... Keyboards on notebooks are
extremely poorly designed, a complete nonsense.


Re: Another take on the English Apostrophe in Unicode

2015-06-15 Thread Doug Ewell
Marcel Schneider  wrote:

> A free tool, the Microsoft Keyboard Layout Creator, allows every user
> to add U+02BC on his preferred keyboard layout

I use John Cowan's Moby Latin keyboard, built with MSKLC, which is 100%
compatible with the AltGr-less US keyboard and supports almost 900 other
characters, including all of the apostrophes and quotes and dashes and
other characters under discussion:

http://recycledknowledge.blogspot.com/2013/09/us-moby-latin-keyboard-for-windows.html

I spent years designing and updating my own keyboard layout and studying
other layouts. I've ended this quest since I started using Moby Latin;
it's the best I've seen in numerous ways.

Elsewhere:

> ISO stands for stability

We wish. Several of us on this list have worked on standards and
standard-like activities that correct for, and defend against,
instability in ISO standards.

> Microsoft’s choice of mashing up apostrophe and close-quote to end up
> with an unprocessable hybrid was wrong. Very wrong.

Windows-1252 and the other Windows code pages were developed during the
1980s, before Unicode, when almost all non-Asian character sets were
limited to 256 code points. The distinctions between apostrophe and
right-single-quote, weighed against the confusion caused by encoding two
identical-looking characters, would never have been sufficient back then
to justify separate encoding in this limited space.
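
A short Python sketch of that constraint, for illustration only: Windows-1252 has a single byte for the curly apostrophe/right single quote and no slot at all for U+02BC. The helper name `cp1252_byte` is mine; the codec name `cp1252` is the one Python's standard library uses.

```python
RIGHT_SINGLE_QUOTE = "\u2019"   # U+2019 RIGHT SINGLE QUOTATION MARK
MODIFIER_APOSTROPHE = "\u02bc"  # U+02BC MODIFIER LETTER APOSTROPHE


def cp1252_byte(ch):
    """Return the Windows-1252 encoding of ch, or None if it has none."""
    try:
        return ch.encode("cp1252")
    except UnicodeEncodeError:
        return None


print(cp1252_byte(RIGHT_SINGLE_QUOTE))    # the single byte shared by both uses
print(cp1252_byte(MODIFIER_APOSTROPHE))   # not representable in the code page
print(bytes([0x92]).decode("cp1252") == RIGHT_SINGLE_QUOTE)
```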

--
Doug Ewell | http://ewellic.org | Thornton, CO 🇺🇸




RE: Accessing the WG2 document register

2015-06-15 Thread Peter Constable
I suggest that people on this list that have not personally engaged directly in 
ISO process via their country’s designated standards bodies should stop opining 
and editorializing on that body.

ISO isn’t perfect by any means, but in the many years I have been directly 
involved in ISO process I can’t say I’ve ever seen discrimination other than 
appropriate discrimination of ideas on technical merits.


Peter


From: Unicode [mailto:unicode-boun...@unicode.org] On Behalf Of Marcel Schneider
Sent: Monday, June 15, 2015 1:30 AM
To: wjgo_10...@btinternet.com; pan...@umich.edu; unicode@unicode.org; 
babelst...@gmail.com
Subject: Re: Accessing the WG2 document register


On Mon, Jun 15, 2015, William_J_G Overington wrote:

> I have been thinking about the current discussion in the Unicode mailing
> list about a particular ISO committee no longer being allowed to accept
> proposal documents from individuals, because of a rule change from a higher
> level within ISO.
>
> I am thinking of how the committee meetings might be different from how they 
> would be if the rules had not been changed and what might not get encoded 
> that might have been encoded had the rule change not happened.
>
> In the short term, the individual contributor is hurt, yet in the long term 
> the document encoding process is hurt and the whole world of information 
> technology may be hurt as potentially good content has been ignored due to 
> discrimination, and a standards document produced that is not as good as it 
> could have been had there not been the discrimination.

> ...
> I opine that it is important when deciding what will be considered for 
> encoding that there is no discrimination about considering encoding 
> proposals. Not only does ignoring contributions cause immediate problems but 
> also there can be second order effects and so on as potential later 
> contributions will not be made as they will not have the original 
> contribution to build upon, and many people may not even realize that the 
> second order effects have taken place.
>

I'm shocked that there is still any discrimination, even against individuals, 
in ISO, and worse, that such discrimination has been newly introduced.



This makes me remember the idea I got about ISO when I considered the ISO/IEC 
9995 standard. This standard specifies that on all keyboards, there should be a 
so-called common secondary group, and that this secondary group should contain 
all the characters that are on the keyboard but aren't for a so-called strictly 
national use.  This sounds to me as if it were fascistic or neofascistic. The 
way this secondary group is accessed seems rather complicated and been 
engineered in disconnect from actual OSs and keyboard drivers. The result was 
that when it went on to be implemented on Windows, the secondary group was not 
accessed like specified but as Kana levels, which is very consistent with a 
real keyboard. But in the meantime, this ISO/IEC 9995 standard wastes a whole 
shift state by excluding it simply from use, on the pretext that you need to 
press more than two keys: Shift + AltGr + another key. This restriction to a 
maximum number of two simultaneously pressed keys was so fancy Microsoft didn't 
bother about. Really, to enter a character from the second level of the 
secondary group, you need to press Shift + Kana + another key.  That's all OK, 
but the ISO/IEC 9995 standard is *not*.



I won't repeat what I already wrote on this List. Sincerely I thought that the 
International Organization for Standardization is today a real international 
organization which cares for all nations on the earth, whether the proposals 
come from individuals or collectivities. I dimly recall that in the nineties, 
ISO was even likely to refuse demands made by its own national members. Reports 
and results showed that it did not even consult anybody from the nations whose 
characters it was encoding, except a few people who were not always reliable, 
as ISO 8859-1 showed.



To read such things today makes me furious again. I personally wish that you, 
Mr Pandey, Mr West and Mr Overington, be fully heard at ISO and that *all* 
proposals are treated equally, fully, and successfully.

What are we going to do? What are you going to do? I repeat, I'm shocked, and I 
hate ISO again.





Best regards,

Marcel Schneider



Re: Another take on the English Apostrophe in Unicode

2015-06-15 Thread Marcel Schneider
On Fri, Jun 12, 2015, Philippe Verdy  wrote:

> These are application shortcuts, but these modifier keys combinations are 
> used with base function keys (F1...F12), not with keys on the alphanumeric 
> parts of the keyboard. So there's no conflict.

Thank you for your advice. It'll be very useful.
I was not precise enough, the upper row of the alphanumerical block is used 
with Ctrl, Shift+Ctrl, Shift+Alt by the language bar but optionally only.

> It is normal then to not assign CTRL+keys or CONTROL+shift+keys 
> (independently of the capslock state) with non-control characters if the same 
> keys are used to type non-control ASCII characters in range U+0040..U+005F. 
> This means that 32 positions on the keyboard must not be used for any 
> assignment.
> The same remark applies to ALT+digit and ALT+letter (otherwise keyboard 
> shortcut for application menus or navigation in web forms won't work 
> correctly, or will take the priority when you intended to type a valid 
> character, forcing these application functions instead of accepting your 
> character input).
MSKLC performs this "safety checks" and will issue warnings if you do so.

The Alt shift state is unassignable in the MSKLC. When used for shortcuts with 
Clavier+, these are prioritized and work fine.
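
On the U+0040..U+005F range mentioned in the quoted text: the 32 reserved positions come from the classic ASCII convention that Ctrl clears the upper bits of the character code. A minimal Python sketch of that convention follows; the function name `control_code` is hypothetical, and the sketch says nothing about how MSKLC itself performs its checks.

```python
def control_code(ch):
    """Return the ASCII control character conventionally produced by Ctrl+ch.

    Only defined here for ch in U+0040..U+005F ('@', 'A'..'Z', '[', '\\', ']', '^', '_').
    """
    cp = ord(ch)
    if not 0x40 <= cp <= 0x5F:
        raise ValueError(f"U+{cp:04X} is outside U+0040..U+005F")
    # Masking with 0x1F is the same as subtracting 0x40 in this range.
    return chr(cp & 0x1F)


# Ctrl+@ is NUL, Ctrl+A is SOH, Ctrl+[ is ESC.
for key in "@A[":
    print(repr(key), "->", repr(control_code(key)))
```

Mapping all 32 characters of the range gives the 32 distinct C0 controls, which is where the figure of 32 positions comes from.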

> This is not just "my" advice but documented in the ISO standard.

That depends on which ISO Standard you refer to. If it's ISO/IEC 9995, then 
beware! IMHO this standard isn't to be taken seriously, otherwise you'll have 
to stay away from using the Shift + AltGr shift state, to take just one 
outstanding example.

> Assigning characters to positions defined for application shortcuts is a bad 
> idea. Keyboard layouts should map characters in positions that are 
> independent of applications (but layouts may be specific to an OS if the OS 
> interface defines some standard shortcuts: this is a problem when using 
> virtualized OSes, as there's a conflict with shortcuts used to switch from 
> the guest to the host: personally I have chosen the Application key for this 
> instead of the right control, because the Application key is rarely needed, 
> but I frequently type control with the right hand or two hands, notably 
> CTRL+A, CTRL+C, CTRL+X, CTRL+V).

It's indeed very useful to keep two Control modifiers. Because the modifiers at 
the left and right border of the block are acted with the little finger and 
should thus be symmetrical. This does not apply to the Alt keys and other keys 
more or less centered around the space bar, which are acted with the thumbs. As 
Alt is less used than Kana (when there is a Kana key), Kana should be on left 
Alt, symmetrical to the (on many keyboards already implemented) AltGr key. The 
Alt key comes then on the Applications key, which is mnemonic because of the 
contextual menu icon. Internally, indeed, the Alt keys (left and right) are 
called Menu keys (Virtual key Left Menu or VK_LMENU, and VK_RMENU). This 
contextual menu is then invoked pressing the right Windows key, which is 
consistently missing on laptops. Laptops must however have an Applications key 
to prevent the AltGr key from being positioned too far rightwards, beside of a 
space bar too long, because this hardware layout has some negative impact on 
ergonomics, specialists say.
On the US keyboard layout at http://charupdate.info however, Applications is a 
Kana toggle, while Right Windows is a Compose key. For laptops this shifts 
rightwards to get Compose on Applications, and Kana toggle on, well, Right 
Control. Because there are laptops with nothing between Right Alt and Right 
Control, so I even thought at mapping the Kana toggle on Pause, but this turned 
out to be buggy, besides that keyboards without Applications (Menu) often are 
lacking the Pause key too.
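
For reference, the virtual keys named above can be listed with their numeric codes. The values below are the ones defined in the Windows headers (winuser.h); collecting them in a Python dict and the helper `vk_name` are only an illustration, not part of any remapping tool mentioned in this thread.

```python
# Windows virtual-key codes for the modifier-like keys discussed above.
VIRTUAL_KEYS = {
    "VK_MENU": 0x12,      # Alt (either side)
    "VK_PAUSE": 0x13,     # Pause
    "VK_KANA": 0x15,      # Kana
    "VK_LWIN": 0x5B,      # left Windows key
    "VK_RWIN": 0x5C,      # right Windows key
    "VK_APPS": 0x5D,      # Applications (context menu) key
    "VK_LCONTROL": 0xA2,  # left Control
    "VK_RCONTROL": 0xA3,  # right Control
    "VK_LMENU": 0xA4,     # left Alt, internally "left Menu"
    "VK_RMENU": 0xA5,     # right Alt / AltGr, internally "right Menu"
}


def vk_name(code):
    """Return the symbolic name for a virtual-key code, or None if not listed."""
    for name, value in VIRTUAL_KEYS.items():
        if value == code:
            return name
    return None


print(vk_name(0xA4), vk_name(0xA5))
```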

> On the French keyboard, CONTROL and SHIFT+CONTROL must be reserved on 7 
> successive keys of the first row ("5([", "6-|", "7è`", "8_\", "9ç^", "0à@", 
> "°)]"); they are needed to get ASCII controls.
> However, CONTROL+@ is extremely rarely needed in applications: it enters a NULL 
> control that will almost always be filtered out silently. Only some editors 
> that allow loading and editing binary files will use it, e.g. Emacs or Vim, 
> which have a "binary editing" mode that avoids altering the encoding of 
> newlines, displays all controls explicitly, and does not limit the 
> "line length". (Personally I prefer not to use text editors to edit binary 
> files; this is far too unsafe with their "insertion" working mode, and it is 
> highly preferable and much simpler to use a hexadecimal editor.)
> This means that CONTROL+"0à@" may be assigned something else more useful 
> (even if the MSKLC compiler warns about it).
> But you can assign characters with CONTROL and CONTROL+SHIFT for the 6 other 
> keys of the first row ("²", "1&", "2é~", "3"#", "4'{" on the left side, and 
> "+=}" on the last position to the right).

I ended up assigning no characters on Control shift stat

Re: Another take on the English apostrophe in Unicode

2015-06-15 Thread Philippe Verdy
2015-06-15 15:20 GMT+02:00 QSJN 4 UKR :

> By the way, about smart quotes. I have been using them for a long time. My
> keyboard layout generates two characters on one key-press (so I have
> to enter [«»][←]{sth}[→] instead of [«]{sth}[»]). It's not that good,
>

You could generate three keystrokes [«][»][←] from a single keypress to get
the same effect.

Various editors already do that when you press the first key for the
opening quote, and all you have to type then is the [→] key (instead of the
key for a closing quote) after typing the word.

Such a system is used in many IDEs or text editors for programmers when they
enter an opening parenthesis, or square bracket, or single/double quotes,
or braces, or block comment prefixes, or any paired symbols or keywords
used in the programming language (e.g. "begin | end" in Pascal, "#if
|\n#endif" in C/C++ preprocessor directives: the pipe here marks the
position of the cursor after typing what is just before it; what is after
the pipe is inserted after the cursor position).

If you disagree with those automatic insertions after the cursor, you can
immediately press CTRL+Z to cancel this added suffix but keep what you just
entered. Another CTRL+Z will undo your previous keypress(es) for the
character(s) just before the cursor position. Some editors are even smarter:
what follows the cursor is not just a single position but a selected
range, and as long as you continue typing just before this range, the
selection is preserved; when you press [→] it will skip over this whole
selection, and you can also press the backspace key to delete that
autoinserted selected range. If you move your cursor elsewhere, the
selection is dropped and you get back to the normal insertion cursor
with an empty selection.

Such a system is used for example in Notepad++ (for Windows), or Eclipse (you
can disable this automatic insertion in your preferences).

This editor feature does not depend on the keyboard layout but on
the selected language for matching pairs: it does not have to be limited to
programming languages and can be used as well for natural human languages,
including in advanced word processors. It can also be used to insert
automatically some additional space when you just press an initial quote:
entering only [«] when editing French text, what you would get is
[«][NNBSP]|[NNBSP][»] (with the cursor selection over the last two
characters). These editors normally have a way to edit their automatic
insertion rules (with the text to match before, the text to add just after
it, the new cursor position, and the text to insert just after it,
hopefully preselected in such a way that when you continue entering text
without moving the insertion position, it is not overwritten but
preserved). Such rules can be part of the parameters for
the spell checker.
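
The rule format described above (a trigger, the text placed before the
cursor, and the text placed after it and preselected) could be sketched
roughly as follows. This is only an illustration in Python: the names
PairRule, RULES and apply_rule are hypothetical, and no real editor's
configuration format is implied.

```python
from dataclasses import dataclass

NNBSP = "\u202f"  # NARROW NO-BREAK SPACE


@dataclass(frozen=True)
class PairRule:
    trigger: str  # what the user types
    before: str   # text inserted before the cursor
    after: str    # text inserted after the cursor and preselected


# Hypothetical rule sets keyed by language; not taken from any real editor.
RULES = {
    "fr": [PairRule("«", "«" + NNBSP, NNBSP + "»")],
    "c": [PairRule("(", "(", ")"), PairRule("[", "[", "]")],
}


def apply_rule(text, cursor, typed, language):
    """Return (new_text, selection_start, selection_end) after typing `typed`.

    The selection covers the auto-inserted suffix, so further typing lands
    before it and the right-arrow key can skip over it.
    """
    for rule in RULES.get(language, []):
        if rule.trigger == typed:
            new_text = text[:cursor] + rule.before + rule.after + text[cursor:]
            start = cursor + len(rule.before)
            return new_text, start, start + len(rule.after)
    # No rule matched: plain insertion with an empty selection.
    new_text = text[:cursor] + typed + text[cursor:]
    end = cursor + len(typed)
    return new_text, end, end
```

Calling apply_rule("", 0, "«", "fr") yields the [«][NNBSP]|[NNBSP][»]
result mentioned above, with the selection over the last two characters.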


Re: Another take on the English apostrophe in Unicode

2015-06-15 Thread Marcel Schneider
On Tue Mar 26 2002 - 10:01:43 EST, Mark Davis ☕️  wrote:

http://www.unicode.org/mail-arch/unicode-ml/y2002-m03/0598.html

> Apostrophe, hyphen, and various other punctuation by default continue
> a word, but this behavior may be overridden on a per-language basis.
> Heuristics or more sophisticated engines may be needed when the
> apostrophe is at the end of a word, as in “the peoples' choice”, since
> it is ambiguous. The modifier letter apostrophe, on the other hand, is
> always treated as a letter.

[I replaced '<' '>' with '“' '”' to prevent confusion with a tag by the user 
agent.]

On Tue Mar 26 2002 - 11:44:28 EST, Marco Cimarosti  wrote:

http://www.unicode.org/mail-arch/unicode-ml/y2002-m03/0604.html


 

> Mark Davis wrote: 
>> Apostrophe, hyphen, and various other punctuation by default continue 
>> a word, but this behavior may be overridden on a per-language basis. 

> This may work for things such as finding word boundaries, but not for 
> identifiers. 

> According to the ID_Start and ID_Continue properties in 
> , neither 
> U+0027 (APOSTROPHE) nor U+2019 (RIGHT SINGLE QUOTATION MARK) are allowed in 
> an identifier. And this is not surprising, since they are primarily 
> quotation marks. 

> On the other hand, U+02BC (MODIFIER LETTER APOSTROPHE) is allowed in any 
> position within an identifier. Using U+02BC as the apostrophe, would allow 
> to use words such as: ,  or <'em> in identifiers. 

> But this hits against the fact that Unicode's own suggestion is to use 
> U+2019 for the apostrophe.
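
The identifier claim quoted above can be checked in a small way with Python,
whose identifier syntax is based on the closely related XID_Start and
XID_Continue properties. This is a sketch, not a test of the ID_Start and
ID_Continue data files themselves; the helper names are made up for the
illustration.

```python
import unicodedata

CANDIDATES = {
    "U+0027 APOSTROPHE": "\u0027",
    "U+2019 RIGHT SINGLE QUOTATION MARK": "\u2019",
    "U+02BC MODIFIER LETTER APOSTROPHE": "\u02bc",
}


def allowed_in_identifier(ch):
    """True if `ch` may appear inside a Python identifier."""
    return ("a" + ch + "b").isidentifier()


def report():
    """Map each candidate name to (general category, allowed in identifier)."""
    return {
        name: (unicodedata.category(ch), allowed_in_identifier(ch))
        for name, ch in CANDIDATES.items()
    }
```

Only U+02BC, a modifier letter (general category Lm), passes; the other two
are punctuation and are rejected.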

 


On Tue Mar 26 2002 - 12:08:41 EST , Marco Cimarosti  wrote:

http://www.unicode.org/mail-arch/unicode-ml/y2002-m03/0608.html



> But, as you say, the apostrophe is legitimate and sometimes mandatory in the 
> orthography of English and many other languages. So, it seems to me that its 
> preferred encoding should make it possible to use it in identifiers, 
> filenames, URI(')s, and so on.

 

 

Don't we fall back into the times of all-0x27, and face ongoing confusion, 
when the English apostrophe is conflated with the closing quote? 

As you told us, having both U+02BC and U+2019 in use will need some 
supplemental algorithms. But as you said in 2002, this is also true when 
both are merged into a single character.

I suspect that the cost of using MODIFIER LETTER APOSTROPHE for the English 
apostrophe (and as the apostrophe in general) today would mainly be the cost 
of updating implementations and text files. 

If this cost is too high, we would have to accept that text is neither to be 
quoted nor to be converted between British and US English. I hope people will 
keep communicating and exchanging.


 

Marcel Schneider


Re: Another take on the English apostrophe in Unicode

2015-06-15 Thread QSJN 4 UKR
By the way, about smart quotes. I have been using them for a long time. My
keyboard layout generates two characters on one key-press (so I have
to enter [«»][←]{sth}[→] instead of [«]{sth}[»]). It's not that good,
but I'm afraid neither of losing quotation marks or parentheses nor
of becoming a victim of artificial intelligence :)
About what is one word. Do you know the German prefixes? "... ...
macht ... ... ... ... ... ... auf".
Let me ask whether double quotes are part of a word or not. For example, in
this sentence "not" is a noun, not a particle? Was "Titanic" titanic?



re: ISO (was Re: Accessing the WG2 document register)

2015-06-15 Thread Marcel Schneider
Thank you. That's done.

 

I had ended up seriously thinking that ISO had by now improved itself. 
The way Mr Anshuman Pandey is treated by ISO proves that it has not.
This policy and practice of sheltering documents from access makes ISO appear 
like a sheer moonless night, as I have experienced myself. And as there is no 
transparency, you don't even know what it is about.

I hope that Mr Pandey's work will be fully honored and taken into account. 
Well, I don't understand much of these processes, but it has been clear to me 
for quite a long time that there is a problem with ISO somehow.

 

Best regards,

Marcel 

> Message du 15/06/15 12:51
> De : "Janusz S. Bien" 
> A : "Marcel Schneider" 
> Copie à : unicode@unicode.org
> Objet : ISO (was Re: Accessing the WG2 document register)
> 
> Quote/Cytat - Marcel Schneider  (Mon 15 Jun 2015 
> 10:29:41 AM CEST):
> 
> 
> > What are we going to do? What are you going to do? I repeat, I'm 
> > shocked, and I hate ISO again.
> 
> Please remember that your government supports ISO through your 
> national standard body. So contact AFNOR and persuade them to take an 
> appropriate action.
> 
> Good luck!
> 
> Janusz
> 
> -- 
> Prof. dr hab. Janusz S. Bień - Uniwersytet Warszawski (Katedra 
> Lingwistyki Formalnej)
> Prof. Janusz S. Bień - University of Warsaw (Formal Linguistics Department)
> jsb...@uw.edu.pl, jsb...@mimuw.edu.pl, http://fleksem.klf.uw.edu.pl/~jsbien/
> 
>

ISO (was Re: Accessing the WG2 document register)

2015-06-15 Thread Janusz S. Bien
Quote/Cytat - Marcel Schneider  (Mon 15 Jun 2015  
10:29:41 AM CEST):



> What are we going to do? What are you going to do? I repeat, I'm  
> shocked, and I hate ISO again.


Please remember that your government supports ISO through your  
national standard body. So contact AFNOR and persuade them to take an  
appropriate action.


Good luck!

Janusz

--
Prof. dr hab. Janusz S. Bień -  Uniwersytet Warszawski (Katedra  
Lingwistyki Formalnej)

Prof. Janusz S. Bień - University of Warsaw (Formal Linguistics Department)
jsb...@uw.edu.pl, jsb...@mimuw.edu.pl, http://fleksem.klf.uw.edu.pl/~jsbien/



Re: Another take on the English apostrophe in Unicode

2015-06-15 Thread Marcel Schneider
On Mon, Jun 15, 2015 at 10:19 AM, Mark Davis ☕️  wrote:

> On Mon, Jun 15, 2015 at 9:17 AM, Marcel Schneider  wrote:

>> When we take the topic down again from linguistics to the core mission of 
>> Unicode, that is character encoding and text processing standardisation, 
>> ellipsis and Swedish abbreviation colon differ from the single closing 
>> quotation mark in this, that they are not to be processed.

>> Linguistics, however, delivered the foundation on which Unicode issued its 
>> first recommendation on what character to use for apostrophe. The result was 
>> neither a matter of opinion, nor of probabilities.

>> Actually, the choice is between perpetuating confusion in word processing, 
>> and get people confused for a little time when announcing that U+2019 for 
>> apostrophe was a mistake.


> Quite nice of you to inform me of the core mission of Unicode—I must have 
> somehow missed that.

> More seriously, it is not all so black and white. As we developed Unicode, we 
> considered whether to separate characters by function, e.g., an END OF SENTENCE 
> PERIOD, ABBREVIATION PERIOD, DECIMAL PERIOD, NUMERIC GROUPING PERIOD, etc. Or 
> DIAERESIS vs UMLAUT. We quickly concluded that the costs far, far outweighed 
> the benefits.

> In practice, whenever characters are essentially identical—and by that I mean 
> that the overlap between the acceptable glyphs for each character is very 
> high—people will inevitably mix up the characters on entry. So any processing 
> that depends on that distinction is forced to correct the data anyway. And 
> separating them causes even simple things like searching for a character on a 
> page to get screwed up without having equivalence classes.

> So we only separated essentially identical characters in limited cases: such 
> as letters from different scripts.

 

It was a very good idea to disambiguate apostrophe and single quote as well, and I 
feel the price was not too high, because it greatly simplified the processing of 
quotation marks in English: I mean the replacement of each pair of one kind by 
a pair of another kind. When I search for quotes in a text, I don't want to be 
distracted by apostrophes. Don't worry about equivalence classes: they already 
present to us a word without an apostrophe as equivalent to the same letters with 
an apostrophe/quote in between. It is always better for the computer to know 
exactly what a character is, even when it doesn't need to tell us at output, than 
for it to come up with a useless mixup.


 

You just brought up another good idea too: period-terminated abbreviations are 
listed as exceptions in word processors. Another list could contain all words 
with a leading apostrophe and all words with a trailing apostrophe. This might 
make it possible to filter search results and to separate apostrophes from 
single comma quotation marks for good. And at input, the smart quotes 
algorithms will become even smarter. Say, really smart.
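
Such an exception list could be sketched like this in Python. The word lists
are tiny, hypothetical placeholders, and the function only labels each U+2019
it finds; a trailing apostrophe remains ambiguous in principle (a plural word
may also end a quotation), so this is a heuristic, not a solution.

```python
import re

RSQ = "\u2019"  # RIGHT SINGLE QUOTATION MARK, used today for both purposes

# Hypothetical, deliberately tiny exception lists (lowercase).
LEADING = {"tis", "twas", "em"}   # words written with a leading apostrophe
TRAILING = {"peoples", "boys"}    # words that commonly take a trailing one

LETTERS = r"[^\W\d_]+"  # one or more letters, Unicode-aware


def classify(text):
    """Return a list of (index, label) for every U+2019 in `text`."""
    results = []
    for m in re.finditer(RSQ, text):
        i = m.start()
        before = re.search(LETTERS + "$", text[:i])
        after = re.match(LETTERS, text[i + 1:])
        if before and after:
            label = "apostrophe"  # word-internal, as in don’t
        elif after and after.group().lower() in LEADING:
            label = "apostrophe"
        elif before and before.group().lower() in TRAILING:
            label = "apostrophe"
        else:
            label = "quote"
        results.append((i, label))
    return results
```

With these lists, the mark in “the peoples’ choice” is labelled an
apostrophe, while the one closing ‘hello’ is labelled a quote.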


 

I don't believe working people would mix up letter apostrophe and close-quote 
if both were on the keyboard. And even now that they aren't, people don't, 
because they just hit the apostrophe key, which without any dumb smart quotes 
algorithm always leads to visually satisfying results, as shown in the Unicode 
documentation. For good desktop publishing, people must work hard anyway, so it 
would be nice to give them the means, and not to overburden them with routine 
tasks due to deficient text encoding.


 

The way things are working today is not satisfactory concerning the English 
apostrophe. I still can't believe that the Unicode Committees were wrong when 
recommending U+02BC. Restoring this advantage today will be to the credit of 
all involved parties, and we and future generations will thank you very much. 

If they exist.


 

Best regards,


Marcel Schneider




Re: Accessing the WG2 document register

2015-06-15 Thread Marcel Schneider
On Mon, Jun 15, 2015, William_J_G Overington  wrote:

> I have been thinking about the current discussion in the Unicode mailing 
> list about a particular ISO committee no longer being allowed to accept 
> proposal documents from individuals, because of a rule change from a higher 
> level within ISO.
> 
> I am thinking of how the committee meetings might be different from how they 
> would be if the rules had not been changed and what might not get encoded 
> that might have been encoded had the rule change not happened.
> 
> In the short term, the individual contributor is hurt, yet in the long term 
> the document encoding process is hurt and the whole world of information 
> technology may be hurt as potentially good content has been ignored due to 
> discrimination, and a standards document produced that is not as good as it 
> could have been had there not been the discrimination.

> ...
> I opine that it is important when deciding what will be considered for 
> encoding that there is no discrimination about considering encoding 
> proposals. Not only does ignoring contributions cause immediate problems but 
> also there can be second order effects and so on as potential later 
> contributions will not be made as they will not have the original 
> contribution to build upon, and many people may not even realize that the 
> second order effects have taken place.
>

I'm shocked that there is still any discrimination, even against individuals, 
in ISO, and worse, that such discrimination has been newly introduced.

 

This reminds me of the impression I got of ISO when I studied the ISO/IEC 
9995 standard. This standard specifies that on all keyboards, there should be a 
so-called common secondary group, and that this secondary group should contain 
all the characters that are on the keyboard but aren't for a so-called strictly 
national use.  This sounds to me as if it were fascistic or neofascistic. The 
way this secondary group is accessed seems rather complicated and seems to have 
been engineered in disconnect from actual OSs and keyboard drivers. The result 
was that when it came to be implemented on Windows, the secondary group was not 
accessed as specified but as Kana levels, which is very consistent with a 
real keyboard. But in the meantime, this ISO/IEC 9995 standard wastes a whole 
shift state by simply excluding it from use, on the pretext that you would need 
to press more than two keys: Shift + AltGr + another key. This restriction to a 
maximum of two simultaneously pressed keys was so fanciful that Microsoft didn't 
bother with it. Really, to enter a character from the second level of the 
secondary group, you need to press Shift + Kana + another key.  That's all OK, 
but the ISO/IEC 9995 standard is *not*.
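
To make the shift states concrete, here is a rough Python sketch using the
modifier bit values from the Windows kbd.h header (KBDSHIFT, KBDCTRL, KBDALT,
KBDKANA). The mapping from an ISO/IEC 9995 (group, level) pair to a bitmask
is an assumption based on the description above (secondary group reached
through Kana), and treating level 3 as AltGr in both groups is a
simplification; shift_state and describe are made-up helper names.

```python
# Modifier bit values as defined in the Windows kbd.h header.
KBDSHIFT = 0x01
KBDCTRL = 0x02
KBDALT = 0x04
KBDKANA = 0x08

ALTGR = KBDCTRL | KBDALT  # Windows treats AltGr as Ctrl+Alt


def shift_state(group, level):
    """Modifier bitmask for an ISO/IEC 9995 (group, level) pair.

    Assumption: group 2 (the common secondary group) is reached with Kana.
    """
    if group not in (1, 2) or level not in (1, 2, 3):
        raise ValueError("group must be 1-2 and level 1-3")
    mask = KBDKANA if group == 2 else 0
    if level == 2:
        mask |= KBDSHIFT
    elif level == 3:
        mask |= ALTGR
    return mask


def describe(mask):
    """Human-readable list of the modifiers set in `mask`."""
    names = [(KBDSHIFT, "Shift"), (KBDCTRL, "Ctrl"),
             (KBDALT, "Alt"), (KBDKANA, "Kana")]
    return "+".join(name for bit, name in names if mask & bit) or "(none)"
```

Here shift_state(2, 2) gives the Shift + Kana state, while the Shift + AltGr
state that the standard leaves unused is KBDSHIFT | ALTGR.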


 

I won't repeat what I already wrote on this List. Sincerely, I thought that the 
International Organization for Standardization was today a real international 
organization which cares for all nations on earth, whether the proposals 
come from individuals or collectivities. I dimly recall that in the nineties, 
ISO was even likely to refuse demands made by its own national members. Reports 
and results showed that it did not even consult anybody from the nations whose 
characters it was encoding, except a few people who were not always reliable, 
as ISO 8859-1 showed.


 

To read such things today makes me furious again. I personally wish that you, 
Mr Pandey, Mr West and Mr Overington, be fully heard at ISO and that *all* 
proposals are treated equally, fully, and successfully.


What are we going to do? What are you going to do? I repeat, I'm shocked, and I 
hate ISO again.


 

 

Best regards,


Marcel Schneider
 

> Message du 15/06/15 09:53
> De : "William_J_G Overington" 
> A : pan...@umich.edu, unicode@unicode.org, babelst...@gmail.com
> Copie à : 
> Objet : Re: Accessing the WG2 document register
> 
> I have been thinking about the current discussion in the Unicode mailing list 
> about a particular ISO committee no longer being allowed to accept proposal 
> documents from individuals, because of a rule change from a higher level 
> within ISO.
> 
> I am thinking of how the committee meetings might be different from how they 
> would be if the rules had not been changed and what might not get encoded 
> that might have been encoded had the rule change not happened.
> 
> In the short term, the individual contributor is hurt, yet in the long term 
> the document encoding process is hurt and the whole world of information 
> technology may be hurt as potentially good content has been ignored due to 
> discrimination, and a standards document produced that is not as good as it 
> could have been had there not been the discrimination.
> 
> Thinking of this I remembered that some years ago, possibly on Channel 4 
> television news in the UK, there was an item about a lady who had that year 
> won the Nobel Prize for Literature. I am trying to trace who it was and a 
> particular work by her, t

Re: Another take on the English apostrophe in Unicode

2015-06-15 Thread Mark Davis ☕️
On Mon, Jun 15, 2015 at 9:17 AM, Marcel Schneider 
wrote:

> When we take the topic down again from linguistics to the core mission of
> Unicode, that is character encoding and text processing standardisation,
> ellipsis and Swedish abbreviation colon differ from the single closing
> quotation mark in this, that they are not to be processed.
>
>
>
> Linguistics, however, delivered the foundation on which Unicode issued its
> first recommendation on what character to use for apostrophe. The result
> was neither a matter of opinion, nor of probabilities.
>
>
>
> Actually, the choice is between perpetuating confusion in word processing,
> and get people confused for a little time when announcing that U+2019 for
> apostrophe was a mistake.
>
>
Quite nice of you to inform me of the core mission of Unicode—I must have
somehow missed that.


More seriously, it is not all so black and white. As we developed Unicode,
we considered whether to separate characters by function, e.g., an END OF
SENTENCE PERIOD, ABBREVIATION PERIOD, DECIMAL PERIOD, NUMERIC GROUPING
PERIOD, etc. Or DIAERESIS vs UMLAUT. We quickly concluded that the costs
far, far outweighed the benefits.

In practice, whenever characters are essentially identical—and by that I
mean that the overlap between the acceptable glyphs for each character is
very high—people will inevitably mix up the characters on entry. So any
processing that depends on that distinction is forced to correct the data
anyway. And separating them causes even simple things like searching for a
character on a page to get screwed up without having equivalence classes.

So we only separated essentially identical characters in limited cases:
such as letters from different scripts.
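
As an illustration of what an equivalence class buys for searching, here is a
minimal Python sketch. The class chosen (U+0027, U+2019, U+02BC) and the
helper names are assumptions made for the example; real search
implementations use much larger tables.

```python
# One hypothetical equivalence class: characters a reader sees as an apostrophe.
APOSTROPHE_LIKE = "\u0027\u2019\u02bc"
FOLD = {ord(ch): "'" for ch in APOSTROPHE_LIKE}


def fold(text):
    """Map every apostrophe-like character to U+0027 (length is preserved)."""
    return text.translate(FOLD)


def find_all(haystack, needle):
    """Start indexes of `needle` in `haystack`, regardless of which
    apostrophe-like character was actually typed."""
    h, n = fold(haystack), fold(needle)
    hits = []
    start = h.find(n)
    while start != -1:
        hits.append(start)
        start = h.find(n, start + 1)
    return hits
```

Because folding is one-to-one in length, the returned indexes are valid
positions in the original, unfolded text.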

Mark 

*— Il meglio è l’inimico del bene —*


Re: Another take on the English Apostrophe in Unicode

2015-06-15 Thread Philippe Verdy
2015-06-15 8:23 GMT+02:00 Marcel Schneider :

> On Fri, Jun 12, 2015, Philippe Verdy  wrote:
> Even the Language bar uses the upper row to define shortcuts with Control,
> Shift+Control, Shift+Alt to switch between keyboard layouts, which are
> prioritized.
>
These are application shortcuts, but these modifier keys combinations are
used with base function keys (F1...F12), not with keys on the alphanumeric
parts of the keyboard. So there's no conflict.

It is normal then not to assign CTRL+keys or CONTROL+SHIFT+keys
(independently of the capslock state) to non-control characters if the
same keys are used to type non-control ASCII characters in the range
U+0040..U+005F. This means that 32 positions on the keyboard must not be
used for any assignment.
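
The 32 positions follow from the classic ASCII convention that CONTROL clears
the upper bits of a character in U+0040..U+005F. A short Python sketch of
that convention (whether a given layout or driver actually produces these
codes is a separate matter; control_code and RESERVED are names invented for
the example):

```python
def control_code(ch):
    """ASCII control code conventionally produced by CONTROL+<ch>."""
    cp = ord(ch)
    if not 0x40 <= cp <= 0x5F:
        raise ValueError("no ASCII control is defined for %r" % ch)
    return cp & 0x1F  # '@' -> 0x00, 'A' -> 0x01, ..., '_' -> 0x1F


# The 32 characters whose CONTROL combination is reserved, with their codes.
RESERVED = {chr(cp): control_code(chr(cp)) for cp in range(0x40, 0x60)}
```

Apart from the letters, this covers @ [ \ ] ^ and _, which yield NUL, ESC,
FS, GS, RS and US respectively.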

The same remark applies to ALT+digit and ALT+letter (otherwise keyboard
shortcut for application menus or navigation in web forms won't work
correctly, or will take the priority when you intended to type a valid
character, forcing these application functions instead of accepting your
character input).

MSKLC performs these "safety checks" and will issue warnings if you do so.

This is not just "my" advice but documented in the ISO standard.

> So to test the shortcuts with Clavier+, I must first remove shortcuts in
> the Language bar. Then the way was free to test Mr Overingtonʼs shortcuts
> for curly apostrophes (I will send the result just after). When I deleted
> the shortcuts in Clavier+ to test your advice, I found no application
> shortcuts for Ctrl+4 while the keys 1, 2, 5 and 0 are usually mapped as
> Word shortcut with CONTROL, while the heading formatting is with ALT. But
> indeed among ASCII controls I found eight on the French keyboard:
>
>
> //VirtualKey   |ScanCd |ISO_# |Ctrl
> {VK_ESCAPE  /*T01     */ ,0x001b
> {VK_CANCEL  /*X46     */ ,0x0003
> {VK_BACK    /*T0E E13 */ ,0x007f
> {VK_OEM_6   /*T1A D11 */ ,0x001b
> {VK_OEM_1   /*T1B D12 */ ,0x001d
> {VK_OEM_5   /*T2B C12 */ ,0x001c
> {VK_RETURN  /*T1C C13 */ ,'\n'
> {VK_OEM_102 /*T56 B00 */ ,0x001c
>
> On the alphanumerical block, there are always the same five, three among
> them near the Enter key. The British-American Apostrophe key is exempt of
> Controls too. This is probably why Mr Overington wants to use CONTROL and
> SHIFT+CONTROL for U+2019 and U+02BC, as custom applications shortcuts.
>
Assigning characters to positions defined for application shortcuts is a
bad idea. Keyboard layouts should map characters in positions that are
independent of applications (but layouts may be specific to an OS if the OS
interface defines some standard shortcuts; this is a problem when using
virtualized OSes, as there's a conflict with shortcuts used to switch from
the guest to the host: personally I have chosen the Application key for
this instead of the right control, because the Application key is rarely
needed, but I frequently type control with the right hand or two hands,
notably CTRL+A, CTRL+C, CTRL+X, CTRL+V).

On the French keyboard, CONTROL and SHIFT+CONTROL must be reserved on 7
successive keys of the first row ("5([", "6-|", "7è`", "8_\", "9ç^", "0à@",
"°)]"); they are needed to get ASCII controls.

However, CONTROL+@ is extremely rarely needed in applications: it enters a
NULL control that will almost always be filtered out silently. Only some
editors that allow loading and editing binary files will use it, e.g. Emacs
or Vim, which have a "binary editing" mode that avoids altering the encoding
of newlines, displays all controls explicitly, and does not limit
the "line length". (Personally I prefer not to use text editors to edit
binary files; this is far too unsafe with their "insertion" working mode, and
it is highly preferable and much simpler to use a hexadecimal editor.)
This means that CONTROL+"0à@" may be assigned something else more useful
(even if the MSKLC compiler warns about it).

But you can assign characters with CONTROL and CONTROL+SHIFT for the 6
other keys of the first row ("²", "1&", "2é~", "3"#", "4'{" on the left
side, and "+=}" on the last position to the right).

This means that CONTROL+4 can be safely assigned to U+02BC for the
apostrophe letter, but the most common encoding of the French apostrophe is
U+2019 (the closing single quote), as French normally does not use single
quotation marks, or if it does, the closing quote cannot be followed by a
letter and cannot be confused with a French apostrophe, which is always
followed by a letter (or the number 1).



For now I've not seen any specific need of U+02BC in French (U+2019 is
enough, even if it represents two distinct things in French, but in
distinct non-colliding contexts).
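
The contextual rule stated above for French (an apostrophe is always followed
by a letter or the digit 1, a closing single quote never is) is simple enough
to sketch in Python. The function name is invented for the example, and the
rule is taken as given rather than verified against real French text.

```python
RSQ = "\u2019"  # RIGHT SINGLE QUOTATION MARK


def classify_fr(text):
    """Label each U+2019 in French text as 'apostrophe' or 'quote'."""
    labels = []
    for i, ch in enumerate(text):
        if ch != RSQ:
            continue
        nxt = text[i + 1] if i + 1 < len(text) else ""
        # Apostrophe if followed by a letter or by the digit 1.
        is_apostrophe = nxt.isalpha() or nxt == "1"
        labels.append((i, "apostrophe" if is_apostrophe else "quote"))
    return labels
```

No word list is needed here, unlike for English, because the two uses never
occur in the same context.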

But of course U+02BC is needed for English, which needs the distinction from
single quotes, because English apostrophes are used more permissively,
including at the end of words just before a space, punctuation or end of line.

In French it is not valid to use the apostrophe for elisions at the end of
words; you need to use some abbreviation mark or style instead, or no mark
at all.



The F

Re: Accessing the WG2 document register

2015-06-15 Thread William_J_G Overington
I have been thinking about the current discussion in the Unicode mailing list 
about a particular ISO committee no longer being allowed to accept proposal 
documents from individuals, because of a rule change from a higher level within 
ISO.

I am thinking of how the committee meetings might be different from how they 
would be if the rules had not been changed and what might not get encoded that 
might have been encoded had the rule change not happened.

In the short term, the individual contributor is hurt, yet in the long term the 
document encoding process is hurt and the whole world of information technology 
may be hurt as potentially good content has been ignored due to discrimination, 
and a standards document produced that is not as good as it could have been had 
there not been the discrimination.

Thinking of this I remembered that some years ago, possibly on Channel 4 
television news in the UK, there was an item about a lady who had that year won 
the Nobel Prize for Literature. I am trying to trace who it was and a 
particular work by her, thus far without success.

There was a work, either a poem or a narrative, about what happened differently 
at a railway station because she was not there as a passenger that day, as to 
how what happened was different from what would have happened had she been 
there.

I cannot be sure but I think that Hungary came into it somewhere, either as a 
Hungarian lady or a Hungarian railway station.

I opine that it is important when deciding what will be considered for encoding 
that there is no discrimination about considering encoding proposals. Not only 
does ignoring contributions cause immediate problems but also there can be 
second order effects and so on as potential later contributions will not be 
made as they will not have the original contribution to build upon, and many 
people may not even realize that the second order effects have taken place.

William Overington

15 June 2015



Original message
From : pan...@umich.edu
Date : 10/06/2015 - 11:01 (GMTST)
To : babelst...@gmail.com
Cc : unic...@unicode.org, unicode@unicode.org
Subject : Re: Accessing the WG2 document register

Andrew,

Thank you for this detailed investigation. It is truly informative.

As I am considered an ineligible contributor by ISO, um, standards, I hereby 
withdraw all of my contributions to Unicode, and reflexively to ISO 10646. A 
list of the contributions that I withdraw is given at:

http://linguistics.berkeley.edu/~pandey/

Whoever has the task of coordinating with ISO, is that you Michel?, please 
withdraw all of my contributions.

All the best,
Anshuman





Re: Another take on the English apostrophe in Unicode

2015-06-15 Thread Marcel Schneider
On Sat, Jun 13, 2015, Mark Davis  wrote:

> In particular, I see no need to change our recommendation on the character 
> used in contractions for English and many other languages (U+2019). Similarly, 
> we wouldn't recommend use of anything but the colon for marking abbreviations 
> in Swedish, or propose a new MODIFIER LETTER ELLIPSIS for 
> "supercali...docious".

> (IMO, U+02BC was probably just a mistake; the minor benefit is not worth the 
> confusion.)


When we take the topic down again from linguistics to the core mission of 
Unicode, that is character encoding and text processing standardisation, 
ellipsis and Swedish abbreviation colon differ from the single closing 
quotation mark in this, that they are not to be processed.

 

Linguistics, however, delivered the foundation on which Unicode issued its 
first recommendation on what character to use for apostrophe. The result was 
neither a matter of opinion, nor of probabilities.


 

Actually, the choice is between perpetuating confusion in word processing, and 
get people confused for a little time when announcing that U+2019 for 
apostrophe was a mistake.


 

 

Marcel Schneider



 

 

> Message du 13/06/15 17:36
> De : "Mark Davis ☕️" 
> A : "Peter Constable" 
> Copie à : "verd...@wanadoo.fr" , "Kalvesmaki, Joel" , "Unicode Mailing List" 
> Objet : Re: Another take on the English apostrophe in Unicode
>
> On Sat, Jun 13, 2015 at 5:10 PM, Peter Constable wrote:
>
>> When it comes to orthography, the notion of what comprise words of a language 
>> is generally pure convention. That’s because there isn’t any single 
>> _linguistic_ definition of word that gives the same answer when phonological 
>> vs. morphological or syntactic criteria are applied. There are book-length 
>> works on just this topic, such as this:
>
> In particular, I see no need to change our recommendation on the character used 
> in contractions for English and many other languages (U+2019). Similarly, we 
> wouldn't recommend use of anything but the colon for marking abbreviations in 
> Swedish, or propose a new MODIFIER LETTER ELLIPSIS for "supercali...docious".
>
> (IMO, U+02BC was probably just a mistake; the minor benefit is not worth the 
> confusion.)
>
> Mark
>
> — Il meglio è l’inimico del bene —