Re: Unicode in passwords

2015-10-01 Thread Richard Wordingham
On Thu, 1 Oct 2015 07:01:12 +0200
Mark Davis ☕️  wrote:

> I've heard some concerns, mostly around the UI for people typing in
> passwords; that they get frustrated when they have to type their
> password on different devices:
> 
> 1. A device may not have keyboard mappings with all the keys for
>    their language.

The typographers will probably give English as an example!  Where's
the en dash key?

> 2. The keyboard mappings across devices vary where they put keys,
>    especially for minority script characters using some pattern of
>    shift/alt/option/etc. So the pattern of keys that they use on one
>    may be different than on another.

Even ASCII can have problems.  A password containing '#' and '|' can't
be entered when a physical US keyboard (102 keys) is interpreted using
a mapping for a British keyboard (103 keys).  (There seem to be
different conventions as to which key is missing.)

Richard.



Re: UAX #29, Unicode Text Segmentation, update to improve Mongolian word segmentation

2015-10-01 Thread Richard Wordingham
On Wed, 30 Sep 2015 14:04:45 -0700
announceme...@unicode.org wrote:

> For further background on this issue and possible
> ways to address it, see PRI #308, /Property Change for U+202F
> NARROW NO-BREAK SPACE (NNBSP)/.

Is this the announcement of PRI #308?

Richard.


Re: Unicode in passwords

2015-10-01 Thread Mathias Bynens

> On 1 Oct 2015, at 07:19, Marc Durdin  wrote:
> 
> 2.   The number of dots corresponds to the number of code points, which 
> is misleading with complex scripts or advanced input methods: you won’t 
> necessarily see one dot per keystroke; in some cases, typing a character may 
> replace a dot with another dot or even delete a dot.

Lots of systems have a bug where supplementary code points show up as two dots 
instead of one, due to UTF-16 being used internally. OS X is an example. Demo 
(open in your browser):

data:text/html,
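
A minimal sketch of the mismatch described above, assuming a password field
that draws one dot per UTF-16 code unit rather than one per code point. The
helper names and the sample string are illustrative only and are not taken
from any particular system:

```python
def code_point_count(text: str) -> int:
    """Number of Unicode code points (what a user would expect to see as dots)."""
    return len(text)


def utf16_code_unit_count(text: str) -> int:
    """Number of UTF-16 code units (what a UTF-16-based field might count)."""
    # Each UTF-16 code unit is two bytes; supplementary code points take two units.
    return len(text.encode("utf-16-le")) // 2


# Hypothetical password containing U+1D306, a supplementary code point.
password = "pa\U0001D306s"

print(code_point_count(password))       # count by code point
print(utf16_code_unit_count(password))  # count by UTF-16 code unit
```

The two counts differ by one for every supplementary code point in the
string, which is where the extra dot would come from.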


Re: Unicode in passwords

2015-10-01 Thread Mark Davis ☕️
As to #1, my note needs some clarification. For characters that don't
typically occur on *any* keyboards, people don't typically use those in
their passwords, so switching between different devices doesn't matter.

(One caveat would be where the password dialog permits selection from a
palette. That way it is independent of device.)

The problem comes in when someone uses (as I do) a Mac, a Windows box, a
Chromebook, and an Android tablet & phone. The Mac makes it easy to type an
em-dash—to use your example. It is slightly less easy on Android, a real
pain on Windows, and I haven't even tried on a Chromebook (maybe easy, maybe
not, just haven't tried). So for me to use an em-dash in a password would
just be opening myself up to annoyance.

I just had a quick look, and it appears that on the latest systems we have
data for in CLDR, em-dash is typeable (somehow) on:

   - all of the android keyboards
   - 85% of the osx keyboards
   - 27% of chromeos keyboards
   - 9% of windows keyboards

http://www.unicode.org/cldr/charts/28/keyboards/chars2keyboards.html

It's even somewhat uglier in the case where I'm typing a password on a
borrowed/public computing device (although typing a password on such a
device may not be exactly a great idea from a security standpoint!).

Mark 

*— Il meglio è l’inimico del bene —*



Re: Unicode in passwords

2015-10-01 Thread Andre Schappo

On 1 Oct 2015, at 08:33, Richard Wordingham wrote:
> 
> Even ASCII can have problems.  A password containing '#' and '|' can't
> be entered when a physical US keyboard (102 keys) is interpreted using
> a mapping for a British keyboard (103 keys).  (There seem to be
> different conventions as to which key is missing.)

I used to have a # in one of my passwords. It used to be fun finding where the 
# key was on a computer's default pre-login keyboard mapping which frequently 
did not match what was printed on the physical keys. I became quite adept at it 
and it certainly made for a more secure password because of the challenge of 
finding # on the keyboard.

I, personally, would really like to have a non-ASCII Unicode password. When 
choosing one, I would test to make sure I could enter it on all the devices I 
use.

André Schappo





NNBSP and Word Boundaries

2015-10-01 Thread Richard Wordingham
The background document for PRI #308 (Property Change for NNBSP),
http://www.unicode.org/review/pri308/pri308-background.html , says,

"The only other widely noted use for U+202F NNBSP is for representation
of the thin non-breaking space (espace fine insécable) regularly seen
next to certain punctuation marks in French style typography. However,
the word segmentation change for U+202F should have no impact in that
context, as ExtendNumLet is explicitly for preventing breaks between
letters, but does not prevent the identification of word boundaries
next to punctuation marks."

Unfortunately, this isn't quite true.  In the text fragment
" dit<NNBSP>: ", there would be internal word-boundaries before 'd' and
before and after ':', but the word isolated would be the four characters
"dit<NNBSP>".  One solution would be to replace NNBSP by U+2009 THIN
SPACE, for with untailored line-breaking there would be no line break
between it and the 't' or colon, but there would be a word break
between the 't' and the thin space.

The problem is that characters with property ExtendNumLet can be the
first or last character of a word as well as a character strictly
within a word.  In this respect, such characters differ from characters
with the property MidNumLet.  The problem with using that property
instead is that such characters, such as FULL STOP, may be flanked by
letters or numbers within a word, but not both.  The problem then
arises with the Mongolian analogue of '4th' etc. - it is written digit,
NNBSP, letters, and is a single word.
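
The effect described above can be reproduced with a much-simplified sketch
of the relevant word-boundary rules.  It is not a full UAX #29
implementation: only the letter/number rules (WB5, WB8, WB9, WB10) and the
ExtendNumLet rules (WB13a, WB13b) are modelled, the property assignment is a
toy one, and NNBSP is assumed to have the ExtendNumLet property as the PRI
proposes.  The function names are illustrative, and Latin "4th" stands in
for the Mongolian case.

```python
# Assumption: U+202F NNBSP is treated as ExtendNumLet, alongside LOW LINE.
EXTEND_NUM_LET = {"_", "\u202f"}


def wb_class(ch: str) -> str:
    """Toy word-break classification; real property data is far richer."""
    if ch in EXTEND_NUM_LET:
        return "ExtendNumLet"
    if ch.isalpha():
        return "ALetter"
    if ch.isdigit():
        return "Numeric"
    return "Other"


def no_break(left: str, right: str) -> bool:
    """True if the simplified rules forbid a word break between two classes."""
    core = {"ALetter", "Numeric"}
    if left in core and right in core:                              # WB5, WB8-WB10
        return True
    if left in core | {"ExtendNumLet"} and right == "ExtendNumLet":  # WB13a
        return True
    if left == "ExtendNumLet" and right in core:                    # WB13b
        return True
    return False


def segments(text: str) -> list[str]:
    """Split text at every position where the simplified rules allow a break."""
    out: list[str] = []
    for ch in text:
        if out and no_break(wb_class(out[-1][-1]), wb_class(ch)):
            out[-1] += ch
        else:
            out.append(ch)
    return out


print(segments(" dit\u202f: "))  # French-style fragment with NNBSP before ':'
print(segments("4\u202fth"))     # stand-in for the Mongolian digit + NNBSP + letters
```

Under these assumptions the NNBSP stays attached to "dit" in the first
fragment, while the second fragment remains a single word, which is the
trade-off discussed above.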

Richard.