Re: Unicode in passwords

2015-10-08 Thread Philippe Verdy
They demand such passwords only for their web services, which are accessed
by web browsers, not for booting devices. Being on the web, the protocols
are based on HTML and web browsers. As the vast majority of web content is
now Unicode in UTF-8, those web services are already UTF-8 ready (the web
interfaces used by banks also already require JavaScript support).
So restricting those web passwords to ASCII only is a bad choice: forcing
the inclusion of ASCII capitals and punctuation is not sufficient to
extend the usable character set. There is certainly a better way: extend
the set to include all characters that are supported by browser input
methods for the targeted languages and that are still easy to type on
most devices (this means not adding characters that are not supported by
old versions of Windows or by basic smartphones).

With an extended repertoire (not restricted to ASCII, thanks to UTF-8 on
the web), passwords could remain relatively short and easy to type
and remember.

(The alternative, passphrases, also requires being able to type words
in the local language in its basic orthography. Some compatibility
normalization, as well as case folding, will help to provide good
interoperability across client devices, since typing letters with mixed
case is frequently very inconvenient on touch devices, as well as for
people with disabilities who type with only one finger.)
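
A rough way to quantify the trade-off between repertoire size and length
is to count bits: a password of n characters drawn uniformly at random
from a repertoire of k characters carries n * log2(k) bits. The Python
sketch below is only an illustration. The 1000-character repertoire is an
assumed figure, not one taken from this thread, and passwords chosen by
people are far from uniformly random.

```python
import math

def entropy_bits(repertoire_size, length):
    """Entropy, in bits, of a uniformly random password."""
    return length * math.log2(repertoire_size)

def min_length(repertoire_size, target_bits):
    """Shortest length that reaches at least target_bits."""
    return math.ceil(target_bits / math.log2(repertoire_size))

# 8 decimal digits versus 8 printable ASCII characters (95 of them).
digits_8 = entropy_bits(10, 8)
ascii_8 = entropy_bits(95, 8)

# Length needed to match ascii_8 with a hypothetical repertoire of
# 1000 easily typed characters (assumption for illustration only).
extended_len = min_length(1000, ascii_8)

print(round(digits_8, 1), round(ascii_8, 1), extended_len)
```

Under these assumptions, a larger repertoire reaches the same strength
with fewer characters, which is the point made above.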

Even so, many banks still limit passwords to decimal digits only, and to
at most 8 of them.
As this is not enough, the input forms also request other numbers that
people frequently cannot easily remember. Others use two-factor
authentication with mobile phones and confirmation codes sent by SMS, or
send an additional code by physical letter; they fingerprint the browser
or record the IMEI of the smartphone used and require preapproval
before trusting devices; or they ask for the number of a physical credit
card by processing a ¤0.00 online payment with it; and some ask pseudo
"secret" questions (social security number, identity card/passport/driver
licence number...). But some of these are very weak and ask for something
that is rarely secret, such as the birth date (Facebook initially
published it by default to anyone, without asking, when you created the
account; now it is private by default, except for the birthday
application, enabled by default, that notifies all "friends". But it is
too late for those who created their account years ago: it is now public
for eternity, even if it can be hidden on the current version of
profiles... Similar fake secrets are names of family members and pets,
as all the info is)


2015-10-07 18:10 GMT+02:00 Doug Ewell :

> Philippe Verdy wrote:
>
> > This is a demonstration that using case differences to add more
> > combinations in short passwords is a bad design.
>
> But more and more organizations and banks and supermarket rewards
> programs are demanding it, along with "at least one digit" and "at least
> one 'special' character" and "at least N characters in length" and "must
> change every N days" -- regardless of what Bruce Schneier or anyone else
> says.
>
> --
> Doug Ewell | http://ewellic.org | Thornton, CO 
>
>
>
>


RE: Unicode in passwords

2015-10-08 Thread Doug Ewell
Philippe Verdy wrote:

> They demand such passwords only for their web services, which are
> accessed by web browsers. Not for booting devices.

My company enforces all of the password restrictions I listed, as well
as "ASCII only," for access both to individual PCs and to the company
network.

--
Doug Ewell | http://ewellic.org | Thornton, CO 




Re: Unicode in passwords

2015-10-07 Thread Julian Bradfield
On 2015-10-06, Philippe Verdy  wrote:
> I was speaking of OUTPUT fields : you want to display passwords that are
> stored somewhere (including in a text document stored in some safe place
> such as an external flash drive). People can't remember many passwords.

Again, output fields (such as in the Firefox password manager), in my
experience, display the text that is in them, not a stripped and
compressed version. If they don't, it's a bug.
If you start using passwords including NBSP and EM-DASH, then it's
going to get a bit awkward - but you should know you're doing that,
and take measures accordingly.

> Hiding them on screen is a fake security, what we need is complex passwords
> (difficult to memorize so we need a wallet to store them but people will
> also be **printing** them and not store them in an electronic format), and many

It's questionable whether there is ever a need to print a password,
except in the case of an automatically generated hard-copy password
reset. My digital will (if I'd produced one) would need about half a
dozen passwords, mainly the master password for the password manager,
plus some sensitive finance and system admin ones. That's few enough
to write down by hand (or type by hand into a text file), with
appropriate notes.

> passwords (one for each site or application requiring one). But they also
> want to be able to type them correctly: long passwords hidden on screen

Most of our students seem (when I see them logging in to give
presentations) to have long passwords - 20-30 characters - and they
don't seem to have a problem. This also illustrates why defaulting to
hidden passwords is useful.

> Biometric identification is also another fake security (because it is

Not sure what this has to do with Unicode in passwords.

> immutable, when passwords can be and should be changed regularly) and it is

Bruce Schneier is one of the best known and most respected security
researchers around today, and here's his advice:

  So in general: you don't need to regularly change the password to
  your computer or online financial accounts (including the accounts
  at retail sites); definitely not for low-security accounts. You
  should change your corporate login password occasionally, and you
  need to take a good hard look at your friends, relatives, and
  paparazzi before deciding how often to change your Facebook
  password. But if you break up with someone you've shared a computer
  with, change them all. 

( https://www.schneier.com/blog/archives/2010/11/changing_passwo.html )


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



Re: Unicode in passwords

2015-10-07 Thread Philippe Verdy
2015-10-07 13:16 GMT+02:00 Stephane Bortzmeyer :

> On Tue, Oct 06, 2015 at 10:53:00PM +0200,
>  Philippe Verdy  wrote
>  a message of 72 lines which said:
>
> > it is highly preferable to extend the character repertoire to
> > Unicode and accept letters in NFKC form and unified by case folding
>
> As I said before, "the ship has sailed". RFC 7613 has been published,
> and uses NFC and case preservation. It is IMHO useless to reopen this
> discussion.
>

Reread the RFC: it discusses the case-insensitive profile, which uses NFC
and conversion to lowercase. This is the bug.

>
> > the recent RFC that forgot the issue : its case-insensitive profile
> > based on NFC and conversion to lowercase is definitely broken !)
>
> What is broken is your analysis. RFC 7613 does not convert passwords
> to lowercase. Indeed, it says exactly the opposite, which seems to
> indicate that you did not read it before calling it broken:
>
>    Case-Mapping Rule: Uppercase and titlecase characters MUST NOT be
>    mapped to their lowercase equivalents.
>

You are reading the other section, for the case-sensitive profile (in
SASLprep, section 6.1). Case insensitivity is absolutely not forbidden for
user names, and has been an established practice for decades (email
addresses, local user names in Windows...); this very new RFC will not
change this practice for a very long time.


Re: Unicode in passwords

2015-10-07 Thread Stephane Bortzmeyer
On Tue, Oct 06, 2015 at 10:53:00PM +0200,
 Philippe Verdy  wrote 
 a message of 72 lines which said:

> it is highly preferable to extend the character repertoire to
> Unicode and accept letters in NFKC form and unified by case folding

As I said before, "the ship has sailed". RFC 7613 has been published,
and uses NFC and case preservation. It is IMHO useless to reopen this
discussion.

> the recent RFC that forgot the issue : its case-insensitive profile
> based on NFC and conversion to lowercase is definitely broken !)

What is broken is your analysis. RFC 7613 does not convert passwords
to lowercase. Indeed, it says exactly the opposite, which seems to
indicate that you did not read it before calling it broken:

   Case-Mapping Rule: Uppercase and titlecase characters MUST NOT be
   mapped to their lowercase equivalents.
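
For readers who want to see what these rules amount to in practice, here
is a minimal Python sketch of the password preparation steps as I
understand them from RFC 7613 (map non-ASCII spaces to U+0020, preserve
case, apply NFC, reject empty passwords, encode as UTF-8). The function
name is made up for illustration, and the sketch deliberately omits the
code point validation that the RFC also requires, so it is not a
conforming implementation.

```python
import unicodedata

def opaque_string(password: str) -> bytes:
    """Partial sketch of RFC 7613 password preparation (hypothetical name)."""
    # Additional mapping: space separators (category Zs) become U+0020.
    mapped = "".join(
        " " if unicodedata.category(ch) == "Zs" else ch for ch in password
    )
    # Case mapping: none. Uppercase and titlecase are preserved.
    # Normalization: NFC.
    normalized = unicodedata.normalize("NFC", mapped)
    if not normalized:
        raise ValueError("zero-length passwords are not allowed")
    return normalized.encode("utf-8")

# Precomposed and decomposed spellings end up identical...
nfc_form = opaque_string("Schl\u00fcssel")
nfd_form = opaque_string("Schlu\u0308ssel")
# ...but case differences are kept.
upper = opaque_string("Schl\u00fcssel")
lower = opaque_string("schl\u00fcssel")
```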
   


Re: Unicode in passwords

2015-10-07 Thread Doug Ewell
Philippe Verdy wrote:

> This is a demonstration that using case differences to add more
> combinations in short passwords is a bad design.

But more and more organizations and banks and supermarket rewards
programs are demanding it, along with "at least one digit" and "at least
one 'special' character" and "at least N characters in length" and "must
change every N days" -- regardless of what Bruce Schneier or anyone else
says.

--
Doug Ewell | http://ewellic.org | Thornton, CO 





Re: Unicode in passwords

2015-10-06 Thread Richard Wordingham
On Tue, 6 Oct 2015 11:21:42 +0200
Mark Davis ☕️  wrote:

> While I think that RFC is useful, it has been interesting just how
> many of the problems recounted on this list go far beyond it, often
> having to do with UI issues. It would be useful to have a paper
> somewhere that organizes all of the problems presented here, and
> maybe makes a stab at describing techniques for handling them.

Indeed, there are several different scenarios.  The most prototypical
are:

1) Initial access to a stand-alone computing device, the conventional
logging on. In this case, it is usually risky to use anything but
printable ASCII.

2) Internet passwords for use in privacy.  Basically any non-trivial
combination of characters should be acceptable, provided it will not be
mangled in transmission.  Under the rules of Unicode, this means that
the text should be normalised before becoming a mere sequence of bytes.

Note that in the second scenario, there is normally an 'administrator'
who can put things right.

Richard.



Re: Unicode in passwords

2015-10-06 Thread Philippe Verdy
2015-10-06 21:57 GMT+02:00 Richard Wordingham <
richard.wording...@ntlworld.com>:

> It's an interesting issue for a password that one can't type.  It's by
> no means a guarantee, either.  I once specified a new password that
> changed case in the middle not realising that I had started with caps
> lock on.  Consequently, both copies had the wrong capitalisation.  I
> was using a wireless keyboard, which to conserve battery power doesn't
> have a caps lock indicator.  (In the old days, caps lock would have
> physically locked, but that's not how keyboard drivers work nowadays.)
> It took a little while before it occurred to me that I might have had a
> problem with caps lock.
>

This is a demonstration that using case differences to add more
combinations in short passwords is a bad design. Hiding typed input
is not a good idea either: we need at least a pressable button to
look at and confirm what we are typing.

Instead of lettercase combinations limited to ASCII, it is highly
preferable to extend the character repertoire to Unicode and accept letters
in NFKC form and unified by case folding (NOT conversion to lowercase or
uppercase, as it is not stable across Unicode versions).

So we should define here the usable set of characters (and define the
characters that should be ignored and discarded if present on input). This
should be a profile in UAX #31 (and we should issue a strong warning
against the recent RFC that forgot the issue: its case-insensitive profile
based on NFC and conversion to lowercase is definitely broken!)
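
To make the difference between lowercasing and case folding concrete,
here is a small Python sketch. The helper name fold_key is made up, and
the NFKC / casefold / NFKC sequence is only an approximation of the
Unicode NFKC_Casefold operation, not a reference implementation of it.

```python
import unicodedata

def fold_key(text: str) -> str:
    """Approximate caseless, compatibility-normalized comparison key."""
    step1 = unicodedata.normalize("NFKC", text)
    step2 = step1.casefold()          # full Unicode case folding
    # Folding can undo normalization, so normalize once more.
    return unicodedata.normalize("NFKC", step2)

# Lowercasing keeps the sharp s, so these two spellings stay different.
lowered_differ = "Stra\u00dfe".lower() != "STRASSE".lower()

# Case folding maps both to the same key.
folded_match = fold_key("Stra\u00dfe") == fold_key("STRASSE")

# Compatibility characters (here the "fi" ligature U+FB01) are unified too.
ligature_key = fold_key("\ufb01le")
```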


Re: Unicode in passwords

2015-10-06 Thread Richard Wordingham
On Tue,  6 Oct 2015 20:13:12 +0100 (BST)
Julian Bradfield  wrote:

> On 2015-10-06, Asmus Freytag (t)  wrote:
> > All browsers I use display spaces in input boxes, and put blobs for
> > hidden fields. Do you have evidence for broken input fields?
> > 
> > 
> > Network keys. That interface seems to consistently give people a
> > choice to reveal the key.
> 
> ? That's not broken in the way Philippe was discussing.

No, but if you make the password up as you type it, you might not then
notice that you accidentally typed a double space.

> > Copy-paste works on all my systems, too - do you have evidence of
> > broken copy-paste in this way?
> > 
> > 
> > I've seen input fields where sites don't allow paste on the
> > second copy (the confirmation copy).
> > 
> > Even for non-password things.
> 
> That's not relevantly broken, either - it's a design feature, to make
> sure you can type the password again (from finger memory!).

It's an interesting issue for a password that one can't type.  It's by
no means a guarantee, either.  I once specified a new password that
changed case in the middle not realising that I had started with caps
lock on.  Consequently, both copies had the wrong capitalisation.  I
was using a wireless keyboard, which to conserve battery power doesn't
have a caps lock indicator.  (In the old days, caps lock would have
physically locked, but that's not how keyboard drivers work nowadays.)
It took a little while before it occurred to me that I might have had a
problem with caps lock.

Richard.


Re: Unicode in passwords

2015-10-06 Thread Philippe Verdy
2015-10-06 16:31 GMT+02:00 Julian Bradfield :

> On 2015-10-06, Philippe Verdy  wrote:
> > I don't think it is a good idea for textual passwords to make differences
> > based on the number of spaces. Being plain text they are likely to be
> > displayed in user interfaces in a way that the user will not see.
> Without
>
> This is true of all passwords. Passwords have to be typed by finger
> memory, not by looking at them (unless you're the type who puts them
> on sticky notes, in which case you type by looking at the text on the
> note). One doesn't normally see the characters, at best a count of
> characters.
>
> > trimming, users won't see the initial or final space, and the password
> > input method may not display them as well (e.g. in an HTML input form or
>
> All browsers I use display spaces in input boxes, and put blobs for
> hidden fields. Do you have evidence for broken input fields?
>

I was speaking of OUTPUT fields: you want to display passwords that are
stored somewhere (including in a text document stored in some safe place
such as an external flash drive). People can't remember many passwords.
Hiding them on screen is a fake security; what we need is complex passwords
(difficult to memorize, so we need a wallet to store them, but people will
also be **printing** them and not storing them in an electronic format), and
many passwords (one for each site or application requiring one). But they
also want to be able to type them correctly: long passwords hidden on screen
will not help much (hiding passwords in input forms only avoids some
spying eyes on your screen, but people can still spy on your keystrokes...).

If people are concerned about eyes, they'll need to hide their keyboard input
(notably on touch screens!) but also their screen, by first making sure
there's nobody around to look at what they do. If there's a camera, hiding
the password on screen will not help either: it will also be easy to see
your keystrokes.

Biometric identification is another fake security (because it is
immutable, whereas passwords can be and should be changed regularly), and
it is extremely easy to duplicate a biometric data record. To be more
effective, the physical sensor device should be internally secured and its
internal data instantly flushed in case of intrusion, and this device
should be securely authenticated in addition to performing the biometric
check. The biometric data should not be transmitted; instead, it should be
used to compute a secure hash from the hidden biometric data and from
negotiated and checked unique randomized data from the source requesting
the access. It should use public key encryption with a couple of
public/private key pairs, not symmetric keys, or triple key pairs if using
another independent third party: the private keys will never be exchanged
or duplicated. But at some time you'll need to reset those keys, and the
only tool you'll have will be cleartext pass phrases, even if there's a
physical device identification, encryption with key pairs and the
extremely private biometric data.

Unfortunately biometric data is now shared with governmental third parties,
and even exchanged internationally: it is present on passports, and
biometric passports are now mandatory for anyone taking a plane
to/from/via the United States, and now in many European countries as well;
DNA traces are also very easy to capture. Biometric data is no longer
private property, so it cannot be used as a secret for access
authentication or signatures. There's still nothing to replace pass
phrases, and those need to be user friendly for their legitimate owners.


Re: Unicode in passwords

2015-10-06 Thread Stephane Bortzmeyer
On Tue, Oct 06, 2015 at 12:57:51PM +0900,
 Yoriyuki Yamagata  wrote 
 a message of 33 lines which said:

> FYI, IETF is working on this issue.  See Internet Draft
> https://tools.ietf.org/html/draft-ietf-precis-saslprepbis-17 based
> on PRECIS framework RFC 7564 https://tools.ietf.org/html/rfc7564

As already mentioned on that list, the draft is no longer a draft: it
was published as an RFC, RFC 7613, two months ago.



Re: Unicode in passwords

2015-10-06 Thread Julian Bradfield
On 2015-10-06, Asmus Freytag (t)  wrote:
> All browsers I use display spaces in input boxes, and put blobs for
> hidden fields. Do you have evidence for broken input fields?
> 
> 
> Network keys. That interface seems to consistently give people a
> choice to reveal the key.

? That's not broken in the way Philippe was discussing.

> Copy-paste works on all my systems, too - do you have evidence of
> broken copy-paste in this way?
> 
> 
> I've seen input fields where sites don't allow paste on the second
> copy (the confirmation copy).
> 
> Even for non-password things.

That's not relevantly broken, either - it's a design feature, to make
sure you can type the password again (from finger memory!).




Re: Unicode in passwords

2015-10-06 Thread Mark Davis ☕️
While I think that RFC is useful, it has been interesting just how many of
the problems recounted on this list go far beyond it, often having to do
with UI issues. It would be useful to have a paper somewhere that organizes
all of the problems presented here, and maybe makes a stab at describing
techniques for handling them.


Mark 

*— Il meglio è l’inimico del bene —*

On Tue, Oct 6, 2015 at 10:48 AM, Stephane Bortzmeyer 
wrote:

> On Tue, Oct 06, 2015 at 12:57:51PM +0900,
>  Yoriyuki Yamagata  wrote
>  a message of 33 lines which said:
>
> > FYI, IETF is working on this issue.  See Internet Draft
> > https://tools.ietf.org/html/draft-ietf-precis-saslprepbis-17 based
> > on PRECIS framework RFC 7564 https://tools.ietf.org/html/rfc7564
>
> As already mentioned on that list, the draft is no longer a draft: it
> was published as an RFC, RFC 7613, two months ago.
> 
>


Re: Unicode in passwords

2015-10-06 Thread Julian Bradfield
On 2015-10-06, Philippe Verdy  wrote:
> Finally note that passwords are not necessarily single identifiers
> (whitespaces and word separators are accepted, but whitespaces should
> require special handling with trimming (at both ends) and compression of
> multiple occurrences.

Why would you trim or compress whitespace? Using multiple spaces seems a
perfectly legitimate way of making a password harder to guess.




Re: Unicode in passwords

2015-10-06 Thread Philippe Verdy
I don't think it is a good idea for textual passwords to make differences
based on the number of spaces. Being plain text, they are likely to be
displayed in user interfaces in a way that the user will not see. Without
trimming, users won't see the initial or final space, and the password
input method may not display them either (e.g. in an HTML input form or
when using a button to generate passphrases that users must then copy-paste
to their password manager or to some private text document). Some password
storages will also implicitly trim and compress those strings (e.g. in a
fixed-width column of a table in a database). There's also frequently no
visual hint when entering or displaying those spaces and compression occurs
implicitly, or pass phrases may be line wrapped in the middle where you
won't see the number of spaces.
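
The trimming and compression described here can be sketched in a few
lines of Python. The function name is made up for illustration. Whatever
rule is chosen, it has to be applied identically when the pass phrase is
set and when it is checked.

```python
def squeeze_whitespace(passphrase: str) -> str:
    """Trim both ends and collapse each run of whitespace to one space."""
    # str.split() without arguments splits on any run of Unicode
    # whitespace and drops leading and trailing whitespace.
    return " ".join(passphrase.split())

stored = squeeze_whitespace("correct  horse battery   staple")
typed = squeeze_whitespace("  correct horse  battery staple ")
accepted = stored == typed
```

Note that this reduces the number of distinct pass phrases, which is the
cost of making them tolerant of invisible spacing differences.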

2015-10-06 12:25 GMT+02:00 Julian Bradfield :

> On 2015-10-06, Philippe Verdy  wrote:
> > Finally note that passwords are not necessarily single identifiers
> > (whitespaces and word separators are accepted, but whitespaces should
> > require special handling with trimming (at both ends) and compression of
> > multiple occurrences.
>
> Why would you trim or compress whitespace? Using multiple spaces seems a
> perfectly legitimate way of making a password harder to guess.


Re: Unicode in passwords

2015-10-06 Thread Philippe Verdy
And there are severe issues in this RFC for its case mapping profile: it
requires converting "uppercase" characters to "lowercase", but these
properties are not stable (see for example the history of Cherokee letters,
changed from gc=Lo to gc=Lu when lowercase letters were added, with case
pairs added at the same time; see also the addition of the capital sharp S
for German).

That RFC should have used the Unicode "Case Folding" algorithm, which is
stable (case-folded strings are NOT necessarily all lowercase, they are
just guaranteed to keep a single case variant, and case folding implies the
use of compatibility normalization forms, i.e. NFKC or NFKD, to get the
correct closure: the standard Unicode normalizations are also stable)!

2015-10-06 10:48 GMT+02:00 Stephane Bortzmeyer :

> On Tue, Oct 06, 2015 at 12:57:51PM +0900,
>  Yoriyuki Yamagata  wrote
>  a message of 33 lines which said:
>
> > FYI, IETF is working on this issue.  See Internet Draft
> > https://tools.ietf.org/html/draft-ietf-precis-saslprepbis-17 based
> > on PRECIS framework RFC 7564 https://tools.ietf.org/html/rfc7564
>
> As already mentioned on that list, the draft is no longer a draft: it
> was published as an RFC, RFC 7613, two months ago.
> 
>


Re: Unicode in passwords

2015-10-06 Thread Philippe Verdy
Note that Java strings DO allow the presence of lone surrogates, as well as
non-characters, because Java strings are unrestricted vectors of 16-bit
code units (non-BMP characters are handled as pairs of surrogates).

In those conditions, normalizing the Java string will leave those lone
surrogates (and non-characters) as is, or will throw an exception,
depending on the API used. Java strings do not have any implied encoding
(their "char" members are also unrestricted 16-bit code units, they have
some basic properties but only in BMP, defined in the builtin Character
class API: properties for non-BMP characters require using a library to
provide them, such as ICU4J).

This is essentially the same kind of string as C/C++ "wide" strings using
16-bit wchar_t, except that:
- C/C++ wide strings do not allow the inclusion of U+0000, which is a
terminator, unless you use a string class keeping the actual string length
(and not just the allocated buffer length, which may be larger).
- Java strings, including literals, are immutable, and optionally atomized
into a global dictionary, which includes all string literals, to share the
storage space of multiple instances with equal contents, including across
distinct classes from distinct packages.
- String literals (which are all immutable and atomized) are initialized
from the compiled bytecode of classes using a modified version of UTF-8
that preserves all 16-bit code units (including lone surrogates and
non-characters), but stores U+0000 as <0xC0,0x80>. This modified UTF-8
encoding is also what you get if you use the JNI interface version with
8-bit strings (this internally requires a conversion by JNI, using a
temporary buffer); if you use the JNI interface version with 16-bit
strings, you work directly with the internal 16-bit Java strings and
there's no conversion: you'll also get the lone surrogates and all
non-characters, and you are not restricted to only valid UTF-16.
- Java strings are commonly used for fast initialization of large immutable
binary arrays, because the conversion from Modified UTF-8 to 16-bit strings
does not require running any compiled bytecode (this is not true for other
static arrays, which require large code for array literals and are not
guaranteed to be immutable: the alternative to this large compiled code is
to initialize those large static arrays by I/O from an external stream,
such as a file beside the class in the same package, and possibly packed in
the same JAR).
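
The modified UTF-8 encoding mentioned in the list above can be sketched as
follows. This is a Python illustration written for this thread, not code
taken from any JVM: it encodes a sequence of 16-bit code units one by one,
so surrogates are never combined, and U+0000 takes the two-byte form.

```python
def modified_utf8(code_units):
    """Encode 16-bit code units the way Java's modified UTF-8 does."""
    out = bytearray()
    for unit in code_units:
        if not 0 <= unit <= 0xFFFF:
            raise ValueError("not a 16-bit code unit")
        if 0x0001 <= unit <= 0x007F:
            out.append(unit)                       # one byte
        elif unit <= 0x07FF:                       # includes U+0000
            out.append(0xC0 | (unit >> 6))
            out.append(0x80 | (unit & 0x3F))
        else:                                      # includes surrogates
            out.append(0xE0 | (unit >> 12))
            out.append(0x80 | ((unit >> 6) & 0x3F))
            out.append(0x80 | (unit & 0x3F))
    return bytes(out)

nul = modified_utf8([0x0000])
lone_surrogate = modified_utf8([0xD800])
# U+1F600 as the surrogate pair D83D DE00: six bytes, not four.
pair = modified_utf8([0xD83D, 0xDE00])
```

Because each surrogate is encoded separately, a lone surrogate survives
the round trip, which standard UTF-8 would not allow.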

Java passwords are "strings", but these still allow arbitrary
16-bit code units, even if they violate UTF-16 restrictions. You will not
get much difference if you use byte arrays, the only change being the
size of the code units. Between those two representations you are
free to convert with ANY pair of encodings, not just assuming
UTF-8<>UTF-16.

However, for security reasons, it's best to avoid string literals for
passwords, because they can be enumerated from the global dictionary of
atomized strings, or directly by reading the bytecode of the compiled
class, where they are stored in modified UTF-8 but loaded and used as
arbitrary 16-bit strings (but the same is true if you use a byte array
literal! You can just parse the initialization bytecode to get the list of
bytes). If passwords or authorization keys are stored somewhere (as strings
or as byte arrays), they should be encrypted in a safe storage and not kept
in static string literals or byte array initializers (they will BOTH be
clear text in the bytecode of the compiled class).

In both cases, there is NO normalization applied implicitly or
checked/enforced by the API (the only check that occurs is at class loading
time for the Modified UTF-8 encoding of string literals: if it is wrong,
the class will not load at all and you'll get an invalid class exception;
there's no such check at all for the encoding of byte array initializers,
the only checks are the validity of the Java initializer bytecode and the
bounds of array indexes used by the initializer code).
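
Since nothing is normalized or validated implicitly, the application has
to do both itself before a password becomes a byte array. The following
Python sketch shows the idea in a language-neutral way (it is not Java,
and the function name is made up): normalize the string, then encode it
with one fixed, strict encoding so that ill-formed input such as a lone
surrogate is rejected instead of being passed through.

```python
import unicodedata

def to_password_bytes(text: str) -> bytes:
    """Normalize, then encode with a single well-defined encoding."""
    normalized = unicodedata.normalize("NFC", text)
    # Strict UTF-8 refuses lone surrogates (raises UnicodeEncodeError).
    return normalized.encode("utf-8")

same_bytes = (
    to_password_bytes("Schl\u00fcssel") == to_password_bytes("Schlu\u0308ssel")
)

try:
    to_password_bytes("abc\ud800")
    rejected = False
except UnicodeEncodeError:
    rejected = True
```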



2015-10-06 5:39 GMT+02:00 Martin J. Dürst :

> On 2015/10/01 13:11, Jonathan Rosenne wrote:
>
>> For languages such as Java, passwords should be handled as byte arrays
>> rather than strings. This may make it difficult to apply normalization.
>>
>
> Well, they should be received from the user interface as strings, then
> normalized, then converted to byte arrays using a well-defined single
> encoding. Somewhat tedious, but hopefully not difficult.
>
> Regards,   Martin.
>


Re: Unicode in passwords

2015-10-06 Thread Norbert Lindenberg

> On Oct 6, 2015, at 6:04 , Philippe Verdy  wrote:
> 
> In those conditions, normalizing the Java string will leave those lone 
> surrogates (and non-characters) as is, or will throw an exception, depending 
> on the API used. Java strings do not have any implied encoding (their "char" 
> members are also unrestricted 16-bit code units, they have some basic 
> properties but only in BMP, defined in the builtin Character class API: 
> properties for non-BMP characters require using a library to provide them, 
> such as ICU4J).

The Java Character class was enhanced in J2SE 5.0 to support supplementary 
characters. The String class was specified to be based on UTF-16, and string 
processing throughout the platform was updated to support supplementary 
characters based on UTF-16. These changes have been available to the public 
since 2004. For a summary, see
http://www.oracle.com/technetwork/articles/java/supplementary-142654.html

Norbert


Re: Unicode in passwords

2015-10-06 Thread Asmus Freytag (t)

  
  
On 10/6/2015 7:31 AM, Julian Bradfield wrote:

> All browsers I use display spaces in input boxes, and put blobs for
> hidden fields. Do you have evidence for broken input fields?

Network keys. That interface seems to consistently give people a
choice to reveal the key.

> > when using a button to generate passphrases that users must then copy-paste
> > to their password manager or to some private text document).
>
> Copy-paste works on all my systems, too - do you have evidence of
> broken copy-paste in this way?

I've seen input fields where sites don't allow paste on the second
copy (the confirmation copy).

Even for non-password things.

A./

  



Re: Unicode in passwords

2015-10-06 Thread Julian Bradfield
On 2015-10-06, Philippe Verdy  wrote:
> I don't think it is a good idea for textual passwords to make differences
> based on the number of spaces. Being plain text they are likely to be
> displayed in user interfaces in a way that the user will not see. Without

This is true of all passwords. Passwords have to be typed by finger
memory, not by looking at them (unless you're the type who puts them
on sticky notes, in which case you type by looking at the text on the
note). One doesn't normally see the characters, at best a count of
characters.

> trimming, users won't see the initial or final space, and the password
> input method may not display them as well (e.g. in an HTML input form or

All browsers I use display spaces in input boxes, and put blobs for
hidden fields. Do you have evidence for broken input fields?

> when using a button to generate passphrases that users must then copy-paste
> to their password manager or to some private text document).

Copy-paste works on all my systems, too - do you have evidence of
broken copy-paste in this way?

> Some password
> storages also will implicitly trim and compress those strings (e.g. in a

If it compresses it on setting, but doesn't compress it on testing, or
vice versa, then that's a bug. If it does the same for setting and
testing, it doesn't matter (except to compromise the crack-resistance
of the password).

> fixed-width column of a table in a database). There's also frequently no
> visual hint when entering or displaying those spaces and compression occurs

Evidence? Maybe if you're typing a password into a Word document it's
hard to count spaces, but why would you be doing that?




Re: Unicode in passwords

2015-10-05 Thread Philippe Verdy
NFC is probably not the best choice for passwords. It should probably be
NFKC.

Look also at the recently proposed update for UAX #31, and consider the
special case where an application does not want passwords to be
case-significant but accepts something other than just ASCII letters:
it will then be necessary to apply some closure for NFKC.
Finally, note that passwords are not necessarily single identifiers:
whitespace and word separators are accepted, but whitespace should
require special handling, with trimming (at both ends) and compression of
multiple occurrences. It would also be necessary to make sure that acceptable
passwords at least begin with an XID_Start character.

Maybe all this discussion could become a new section in UAX #31 that takes
into account the possible presence of whitespace (for "pass phrases", which
are not really "identifiers") in "Medial" positions: define a profile as
described in UAX #31 to add whitespace to "Medial" and remove it from the
excluded characters, and possibly extend the set of "Start" to more than
just XID_Start (e.g. you could use some punctuation like '!' or a
mathematical sign like '+', and possibly also accept non-decimal digits
that are preserved after NFKC closure).
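
A rough Python sketch of the handling suggested above, for illustration only. The function name `canonicalize_passphrase` is hypothetical, and the normalize / case-fold / normalize sequence only approximates an NFKC closure; it is not the exact UAX #31 definition. The XID_Start check is left out.

```python
import unicodedata


def canonicalize_passphrase(raw: str) -> str:
    """Hypothetical helper: NFKC, case folding, and whitespace cleanup."""
    # Compatibility normalization: fullwidth letters, ligatures, and
    # non-ASCII spaces such as U+3000 get their plain equivalents.
    text = unicodedata.normalize("NFKC", raw)
    # Case folding for case-insensitive matching, then NFKC again because
    # folding can produce sequences that are no longer normalized.
    text = unicodedata.normalize("NFKC", text.casefold())
    # str.split() with no argument splits on runs of whitespace and drops
    # leading and trailing whitespace, so this both trims and compresses.
    return " ".join(text.split())
```

With this sketch, `canonicalize_passphrase("  Ｓｅｃｒｅｔ \u3000 Straße ")` returns `"secret strasse"`. The same function would have to run both when the password is set and when it is checked.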



2015-10-05 17:12 GMT+02:00 Stephane Bortzmeyer :

> On Wed, Sep 30, 2015 at 04:15:30PM -0700,
>  Clark S. Cox III  wrote
>  a message of 73 lines which said:
>
> > You really wouldn’t want “Schlüssel” and “Schlüssel” being different
> > passwords, would you? (assuming that my mail client and/or OS is not
> > interfering, the first is NFC, while the second is NFD)
>
> Hence the RFC 7613, mentioned already here by Marc Blanchet, that you
> must really read if you're interested in Unicode passwords.
>
> In that case, the RFC is clear: NFC mandatory (and UTF-8 encoding).
>
>    4.  Normalization Rule: Unicode Normalization Form C (NFC) MUST be
>    applied to all characters.
>
>


Re: Unicode in passwords

2015-10-05 Thread Philippe Verdy
Also, some people may now want to use emojis within their passwords or pass
phrases (they are now very common on most smartphones, in layouts for
touch screens, and in instant messaging applications used on desktops,
where mouse clicks or taps select them). But I would not recommend
them for encrypting bootable disks or in BIOS/UEFI boot environments
that lack support for extended input methods and the rich graphics needed
to render them on basic text consoles, unless they are part of a national
encoding standard and supported natively.

For boot environments, you'll be limited by the local hardware support, but
if such support exists (keyboard or font), it may be helpful to include
some extra symbols to block remote access that lacks this native support
(e.g. on Japanese systems, you could use the extra keys found only on
Japanese keyboards, and you won't be able to control the system without the
appropriate device recognized in the boot environment).



Re: Unicode in passwords

2015-10-05 Thread Martin J. Dürst

On 2015/10/01 13:11, Jonathan Rosenne wrote:

For languages such as Java, passwords should be handled as byte arrays rather 
than strings. This may make it difficult to apply normalization.


Well, they should be received from the user interface as strings, then 
normalized, then converted to byte arrays using a well-defined single 
encoding. Somewhat tedious, but hopefully not difficult.
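
A minimal Python sketch of that order of steps (string in, normalize, then bytes in one fixed encoding). The names `password_to_bytes` and `wipe` are hypothetical, and NFC plus UTF-8 are assumed choices. Python strings are immutable and copies may remain in memory, so this only illustrates the sequence, not a guarantee that the secret is erased.

```python
import unicodedata


def password_to_bytes(password: str) -> bytearray:
    """Normalize to NFC, then encode with one fixed encoding (UTF-8)."""
    normalized = unicodedata.normalize("NFC", password)
    # bytearray is mutable, so the caller can overwrite it after use.
    return bytearray(normalized.encode("utf-8"))


def wipe(buffer: bytearray) -> None:
    """Overwrite the buffer in place with zero bytes."""
    for i in range(len(buffer)):
        buffer[i] = 0
```

The byte array would be passed to the hashing step and wiped right afterwards.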


Regards,   Martin.


Re: Unicode in passwords

2015-10-05 Thread Martin J. Dürst

Some additional concerns:

- Input methods for Chinese, Japanese,... need visual feedback to check 
that the correct Han character was selected. That may show (some parts 
of) the password to bystanders.


- Length limitations of 8 bytes are few and far between these days, but 
they still exist. Even where they are gone, they may have been replaced 
with "safe" limitations, say e.g. 50 bytes. That may still be pretty 
restrictive for some languages when using UTF-8.


- There may occasionally be different length limitations for different 
kinds of access with the same password. That can create very difficult 
situations where the length limitation cuts off part of a UTF-8 byte 
sequence.


- Some interfaces try to estimate the 'quality' of a password on 
password creation. Short passwords, or passwords with only lower-case 
Latin may be rejected, others labeled as 'medium safe', and so on. A 
password with lots of bytes may be labeled as 'excellent' even though it 
consists of characters all taken from the same small script, and thus 
has rather low entropy. Of course, there's the effect that at least for 
a while, the bad guys may think it's too bothersome to try non-ASCII 
passwords, so that may temporarily make them somewhat safer.
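
To illustrate the point about byte limits cutting a UTF-8 sequence, here is a small Python sketch. The function name `truncate_utf8` is hypothetical. It backs up to a code point boundary, which avoids invalid UTF-8 but can still split a base letter from its combining marks; rejecting over-long passwords outright may well be the better policy.

```python
def truncate_utf8(text: str, max_bytes: int) -> bytes:
    """Encode as UTF-8 and cut at a code point boundary within max_bytes."""
    data = text.encode("utf-8")
    if len(data) <= max_bytes:
        return data
    cut = max_bytes
    # Continuation bytes have the bit pattern 10xxxxxx. If the first
    # dropped byte is one, the cut is inside a sequence, so back up
    # until the first dropped byte starts a new sequence.
    while cut > 0 and (data[cut] & 0xC0) == 0x80:
        cut -= 1
    return data[:cut]
```

For three Thai letters (three bytes each in UTF-8), a limit of 8 bytes keeps only the first two letters, 6 bytes in total.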


Regards,   Martin.

On 2015/10/01 14:01, Mark Davis ☕️ wrote:

I've heard some concerns, mostly around the UI for people typing in
passwords; that they get frustrated when they have to type their password
on different devices:

1. A device may not have keyboard mappings with all the keys for their
language.
2. The keyboard mappings across devices vary where they put keys,
especially for minority script characters using some pattern of
shift/alt/option/etc.. So the pattern of keys that they use on one may be
different than on another.
3. People are often 'blind' to the characters being entered: they just
see a dot, for example. If the keyboards for their language are not
standard, then that makes it difficult.
4. Even if they see, for an instant, the character they type, if the
device doesn't have a font for their language's characters, it may be just
a box.
5. Even if those are not true, the glyph may not be distinctive enough
if the size is too small.








Re: Unicode in passwords

2015-10-05 Thread Yoriyuki Yamagata
Dear John,

FYI, IETF is working on this issue.  See Internet Draft 
https://tools.ietf.org/html/draft-ietf-precis-saslprepbis-17 based on PRECIS 
framework RFC 7564 https://tools.ietf.org/html/rfc7564

Best,

> On 2015/10/01 1:33, John O'Conner wrote:
> 
> I'm researching potential problems and best practices for password policies 
> that allow non-Latin-1 Unicode characters. My searching of the unicode.org 
> site showed me a general security considerations document (UTR #36) but 
> nothing specific for password policies using Unicode.
> 
> Can you recommend any documents to help me understand potential issues (if 
> any) for password policies and validation methods that allow characters from 
> more "exotic" portions of the Unicode space? 
> 
> Best regards,
> John O'Conner
> 

— 
Yoriyuki Yamagata
National Institute of Advanced Science and Technology (AIST), Senior Researcher
http://staff.aist.go.jp/yoriyuki.yamagata/en/








Re: Unicode in passwords

2015-10-05 Thread Marc Blanchet

On 5 Oct 2015, at 8:14, Shriramana Sharma wrote:


I recently came across this bug report where a filesystem encrypted
with a Cyrillic script password could not be decrypted at boot time:

https://bugzilla.redhat.com/show_bug.cgi?id=681250


And?

From what I understand, this is related to the fact that the OS has two 
levels of boot/console/installation scripts and the first level is very 
basic regarding i18n (i.e. us-ascii only guaranteed to work).


Marc.




--
Shriramana Sharma ஶ்ரீரமணஶர்மா 
श्रीरमणशर्मा


Re: Unicode in passwords

2015-10-05 Thread Shriramana Sharma
I recently came across this bug report where a filesystem encrypted
with a Cyrillic script password could not be decrypted at boot time:

https://bugzilla.redhat.com/show_bug.cgi?id=681250


-- 
Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा



Re: Unicode in passwords

2015-10-05 Thread Shriramana Sharma
I had hoped it would be obvious my reply was not intended to the "best
practices" part of the OP, but to the "potential problems" part of
it... In any case, I have nothing further to say on this topic.

-- 
Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा



Re: Unicode in passwords

2015-10-05 Thread Marc Blanchet



On 5 Oct 2015, at 9:42, Shriramana Sharma wrote:


On 10/5/15, Marc Blanchet  wrote:

On 5 Oct 2015, at 8:14, Shriramana Sharma wrote:


https://bugzilla.redhat.com/show_bug.cgi?id=681250


And?


Well the OP did say:


I'm researching potential problems and best practices for password
policies that allow non-Latin-1 Unicode characters.


The link seemed valid food for the research and was offered FWIW.


Sure. But one could roughly conclude from the bug report that only allowing
US-ASCII is safe, which may not be « best practice »,
depending on the point of view…


Marc.



--
Shriramana Sharma ஶ்ரீரமணஶர்மா 
श्रीरमणशर्मा


Re: Unicode in passwords

2015-10-05 Thread Marc Blanchet

On 5 Oct 2015, at 10:47, Shriramana Sharma wrote:


I had hoped it would be obvious my reply was not intended to the "best
practices" part of the OP, but to the "potential problems" part of
it...


Sure. My comment was also just informative: it was not aimed at your
comment, but at the fact that « best practices » may not mean
« us-ascii » only if you want to support i18n.


Marc.


In any case, I have nothing further to say on this topic.

--
Shriramana Sharma ஶ்ரீரமணஶர்மா 
श्रीरमणशर्मा


Re: Unicode in passwords

2015-10-05 Thread Shriramana Sharma
On 10/5/15, Marc Blanchet  wrote:
> On 5 Oct 2015, at 8:14, Shriramana Sharma wrote:
>
>> https://bugzilla.redhat.com/show_bug.cgi?id=681250
>
> And?

Well the OP did say:


I'm researching potential problems and best practices for password
policies that allow non-Latin-1 Unicode characters.


The link seemed valid food for the research and was offered FWIW.

-- 
Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा



Re: Unicode in passwords

2015-10-05 Thread Stephane Bortzmeyer
On Wed, Sep 30, 2015 at 04:15:30PM -0700,
 Clark S. Cox III  wrote 
 a message of 73 lines which said:

> You really wouldn’t want “Schlüssel” and “Schlüssel” being different
> passwords, would you? (assuming that my mail client and/or OS is not
> interfering, the first is NFC, while the second is NFD)

Hence the RFC 7613, mentioned already here by Marc Blanchet, that you
must really read if you're interested in Unicode passwords.

In that case, the RFC is clear: NFC mandatory (and UTF-8 encoding).

   4.  Normalization Rule: Unicode Normalization Form C (NFC) MUST be
   applied to all characters.
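
A short Python illustration of why the normalization rule matters for the "Schlüssel" example. The helper name `same_after_nfc` is hypothetical. The two spellings are different code point sequences, so they would hash differently unless both are normalized first.

```python
import unicodedata

composed = "Schl\u00fcssel"     # U+00FC LATIN SMALL LETTER U WITH DIAERESIS
decomposed = "Schlu\u0308ssel"  # "u" followed by U+0308 COMBINING DIAERESIS


def same_after_nfc(a: str, b: str) -> bool:
    """Compare two strings after applying NFC to both."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)
```

Here `composed == decomposed` is false (9 code points against 10), while `same_after_nfc(composed, decomposed)` is true.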



Re: Unicode in passwords

2015-10-01 Thread Richard Wordingham
On Thu, 1 Oct 2015 07:01:12 +0200
Mark Davis ☕️  wrote:

> I've heard some concerns, mostly around the UI for people typing in
> passwords; that they get frustrated when they have to type their
> password on different devices:
> 
>1. A device may not have keyboard mappings with all the keys for
> their language.

The typographers will probably give English as an example!  Where's
the en dash key?

>2. The keyboard mappings across devices vary where they put keys,
>especially for minority script characters using some pattern of
>shift/alt/option/etc.. So the pattern of keys that they use on one
> may be different than on another.

Even ASCII can have problems.  A password containing '#' and '|' can't
be entered when a physical US keyboard (102 keys) is interpreted using
a mapping for a British keyboard (103 keys).  (There seem to be
different conventions as to which key is missing.)

Richard.



Re: Unicode in passwords

2015-10-01 Thread Mathias Bynens

> On 1 Oct 2015, at 07:19, Marc Durdin  wrote:
> 
> 2.   The number of dots corresponds to the number of code points, which 
> is misleading with complex scripts or advanced input methods: you won’t 
> necessarily see one dot per keystroke; in some cases, typing a character may 
> replace a dot with another dot or even delete a dot.

Lots of systems have a bug where supplementary code points show up as two dots 
instead of one, due to UTF-16 being used internally. OS X is an example. Demo 
(open in your browser):

data:text/html,
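
The mismatch can be seen by counting the same string two ways. This is only a Python sketch with hypothetical function names; it shows the arithmetic, not how any particular password field is implemented.

```python
def code_points(text: str) -> int:
    """Number of Unicode code points (Python strings index by code point)."""
    return len(text)


def utf16_code_units(text: str) -> int:
    """Number of 16-bit units when the text is stored as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


sample = "a\U0001F511"  # "a" followed by U+1F511, a supplementary code point
```

For `sample`, `code_points` gives 2 and `utf16_code_units` gives 3, which is where the extra dot would come from in a field that draws one dot per UTF-16 unit.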


Re: Unicode in passwords

2015-10-01 Thread Mark Davis ☕️
As to #1, my note needs some clarification. For characters that don't
typically occur on *any* keyboards, people don't typically use those in
their passwords, so switching between different devices doesn't matter.

(One caveat would be where the password dialog permits selection from a
palette. That way it is independent of device.)

The problem comes in where someone uses (as I do), a Mac, a Windows box, a
Chromebook, and an Android tablet & phone. The Mac makes it easy to type an
em-dash—to use your example. It is slightly less easy on Android, a real
pain on Windows, and I haven't even tried on a Chromebook (maybe easy, maybe
not, just haven't tried). So for me to use an em-dash in a password would
just be opening up to annoyance.

I just had a quick look, and it appears that on the latest systems we have
data for in CLDR, em-dash is typeable (somehow) on:

   - all of the android keyboards
   - 85% of the osx keyboards
   - 27% of chromeos keyboards
   - 9% of windows keyboards

http://www.unicode.org/cldr/charts/28/keyboards/chars2keyboards.html

It's even somewhat uglier in the case where I'm typing a password on a
borrowed/public computing device (although typing a password on such a
device may not be exactly a great idea from a security standpoint!).

Mark 

*— Il meglio è l’inimico del bene —*

On Thu, Oct 1, 2015 at 9:33 AM, Richard Wordingham <
richard.wording...@ntlworld.com> wrote:

> On Thu, 1 Oct 2015 07:01:12 +0200
> Mark Davis ☕️  wrote:
>
> > I've heard some concerns, mostly around the UI for people typing in
> > passwords; that they get frustrated when they have to type their
> > password on different devices:
> >
> >1. A device may not have keyboard mappings with all the keys for
> > their language.
>
> The typographers will probably give English as an example!  Where's
> the en dash key?
>
> >2. The keyboard mappings across devices vary where they put keys,
> >especially for minority script characters using some pattern of
> >shift/alt/option/etc.. So the pattern of keys that they use on one
> > may be different than on another.
>
> Even ASCII can have problems.  A password containing '#' and '|' can't
> be entered when a physical US keyboard (102 keys) is interpreted using
> a mapping for a British keyboard (103 keys).  (There seem to be
> different conventions as to which key is missing.)
>
> Richard.
>
>


Re: Unicode in passwords

2015-10-01 Thread Andre Schappo

On 1 Oct 2015, at 08:33, Richard Wordingham wrote:
> 
> Even ASCII can have problems.  A password containing '#' and '|' can't
> be entered when a physical US keyboard (102 keys) is interpreted using
> a mapping for a British keyboard (103 keys).  (There seem to be
> different conventions as to which key is missing.)

I used to have a # in one of my passwords. It used to be fun finding where the 
# key was on a computer's default pre-login keyboard mapping which frequently 
did not match what was printed on the physical keys. I became quite adept at it 
and it certainly made for a more secure password because of the challenge of 
finding # on the keyboard.

I, personally, would really like to have a non-ASCII Unicode password. When 
choosing one, I would test to make sure I could enter it on all the devices 
I use.

André Schappo





Re: Unicode in passwords

2015-09-30 Thread Hans Åberg

> On 30 Sep 2015, at 18:33, John O'Conner  wrote:
> 
> Can you recommend any documents to help me understand potential issues (if 
> any) for password policies and validation methods that allow characters from 
> more "exotic" portions of the Unicode space?

On UNIX computers, one computes a hash (like SHA-256), which is then used to 
authenticate the password up to a high probability. The hash is stored in the 
open, but it is not known how to compute the password from the hash, so knowing 
the hash does not easily allow authentication.

So if the password is encoded in say UTF-8 and then hashed, it would seem to 
take care of most problems.
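
A sketch of "encode in UTF-8, then hash" in Python. Two things here are assumptions that go beyond the message above: the NFC step before encoding, and the use of salted PBKDF2-HMAC-SHA256 from the standard library in place of a single plain SHA-256. The names `derive`, `verify`, and `ITERATIONS` are hypothetical, and the iteration count is illustrative only.

```python
import hashlib
import hmac
import os
import unicodedata

ITERATIONS = 100_000  # illustrative value, not a recommendation


def derive(password: str, salt: bytes) -> bytes:
    """Normalize (assumed NFC), encode as UTF-8, then derive a hash."""
    data = unicodedata.normalize("NFC", password).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha256", data, salt, ITERATIONS)


def verify(password: str, salt: bytes, stored: bytes) -> bool:
    """Recompute the hash and compare it in constant time."""
    return hmac.compare_digest(derive(password, salt), stored)
```

A caller would create the salt once with `os.urandom(16)`, store it next to the result of `derive`, and later call `verify` with the password as typed.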





Re: Unicode in passwords

2015-09-30 Thread Richard Wordingham
On Wed, 30 Sep 2015 16:15:30 -0700
"Clark S. Cox III"  wrote:
 
> You really wouldn’t want “Schlüssel” and “Schlüssel” being different
> passwords, would you?

It'd make them slightly safer to write down!  I trust the tradition of
truncating Unix passwords to 8 bytes is well and truly defunct - that'd
reduce Thai passwords to two characters plus one bit!

Richard.



RE: Unicode in passwords

2015-09-30 Thread Jonathan Rosenne
For languages such as Java, passwords should be handled as byte arrays rather 
than strings. This may make it difficult to apply normalization. 

 

Jonathan Rosenne

 

From: Unicode [mailto:unicode-boun...@unicode.org] On Behalf Of Clark S. Cox III
Sent: Thursday, October 01, 2015 2:16 AM
To: Hans Åberg
Cc: unicode@unicode.org; John O'Conner
Subject: Re: Unicode in passwords

 

 

On 2015/09/30, at 13:29, Hans Åberg <haber...@telia.com> wrote:

 





On 30 Sep 2015, at 18:33, John O'Conner <jsocon...@gmail.com> wrote:

Can you recommend any documents to help me understand potential issues (if any) 
for password policies and validation methods that allow characters from more 
"exotic" portions of the Unicode space?


On UNIX computers, one computes a hash (like SHA-256), which is then used to 
authenticate the password up to a high probability. The hash is stored in the 
open, but it is not known how to compute the password from the hash, so knowing 
the hash does not easily allow authentication.

So if the password is 

 

… normalized and then …





encoded in say UTF-8 and then hashed, it would seem to take care of most 
problems.

 

You really wouldn’t want “Schlüssel” and “Schlüssel” being different passwords, 
would you? (assuming that my mail client and/or OS is not interfering, the 
first is NFC, while the second is NFD)



Re: Unicode in passwords

2015-09-30 Thread Marc Blanchet


On 30 Sep 2015, at 12:33, John O'Conner wrote:

I'm researching potential problems and best practices for password policies
that allow non-Latin-1 Unicode characters. My searching of the unicode.org
site showed me a general security considerations document (UTR #36) but
nothing specific for password policies using Unicode.

Can you recommend any documents to help me understand potential issues (if
any) for password policies and validation methods that allow characters
from more "exotic" portions of the Unicode space?


The IETF has been doing work related to this exact issue. You might 
want to look at RFC 7564 (generic framework) and RFC 7613 (usernames and 
passwords, used in various IETF protocols).


Marc.



Best regards,
John O'Conner


RE: Unicode in passwords

2015-09-30 Thread Marc Durdin
That’s a good list. A few other things I’ve seen:


1.   Even if the user sees the character for an instant, complex script 
characters can be very puzzling as they appear differently and “out of order” 
when isolated.

2.   The number of dots corresponds to the number of code points, which is 
misleading with complex scripts or advanced input methods: you won’t 
necessarily see one dot per keystroke; in some cases, typing a character may 
replace a dot with another dot or even delete a dot.

3.   Directionality can be frustrating.

I’ve had to assist in situations where a user has set a new Windows password 
using a custom keyboard, and then been unable to login, e.g. with Remote 
Desktop, or even with the standard Windows login screen.

iOS, for example, doesn’t even allow the user to select a different input 
method for password boxes – it seems to always be Latin script only (even if 
you’ve removed all your Latin script keyboards from Settings).

Marc




Re: Unicode in passwords

2015-09-30 Thread Mark Davis ☕️
I've heard some concerns, mostly around the UI for people typing in
passwords; that they get frustrated when they have to type their password
on different devices:

   1. A device may not have keyboard mappings with all the keys for their
   language.
   2. The keyboard mappings across devices vary where they put keys,
   especially for minority script characters using some pattern of
   shift/alt/option/etc.. So the pattern of keys that they use on one may be
   different than on another.
   3. People are often 'blind' to the characters being entered: they just
   see a dot, for example. If the keyboards for their language are not
   standard, then that makes it difficult.
   4. Even if they see, for an instant, the character they type, if the
   device doesn't have a font for their language's characters, it may be just
   a box.
   5. Even if those are not true, the glyph may not be distinctive enough
   if the size is too small.



Mark <https://google.com/+MarkDavis>

*— Il meglio è l’inimico del bene —*
