If this seems long and at the start quite boring and familiar and you are really in a hurry, skip to the ###. :)

Let us not forget some lower limits to the number of bits needed to provide sufficient security. The number of bits needed depends on the number of tries an attacker can do per second. Let's start with 128 bits. A 128 bit HMAC, CMAC,... or fingerprint (i.e. hash) is currently thought to provide enough security even against a powerful attacker that can run an attack for years. Now we do not want to use 128 bits because we know that people are bad at comparing a 128 bit fingerprint. So how can we shave off bits?

- make the algorithm slow. The 128-bit starting value is based on the fact that hashes etc. are easy to implement in hardware and can be computed quickly. Making the algorithm n times slower than a plain hash saves ld(n) (i.e. log2(n)) bits. This also means the legit user has to invest more time, but they usually do not need to run the algorithm frequently. Examples that use this approach are:
- PBKDF2
- scrypt
- fingerprints with leading zeros
- fingerprints that are "nicely readable" in some baseXX (*)
- (add more here)
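As a sketch of the slowdown approach (iteration count and inputs are illustrative, not recommendations): Python's standard hashlib exposes PBKDF2, and the bit saving is just the base-2 log of the iteration count.

```python
import hashlib
import math

# Hypothetical parameter: 1,000,000 PBKDF2 iterations versus a single
# hash call. Each factor-of-two slowdown "saves" roughly one bit of
# required secret/fingerprint length.
iterations = 1_000_000
bits_saved = math.log2(iterations)  # ld(n) in the text's notation

# Derive a key; the attacker must pay the same 10^6-hash cost per guess.
dk = hashlib.pbkdf2_hmac("sha256", b"secret", b"salt", iterations)

print(f"~{bits_saved:.1f} bits saved")  # ~19.9 bits saved
```

So a million iterations buy roughly 20 bits; the same ld(n) reasoning applies to leading-zero fingerprints, where the legit party pays the 2^n search cost once up front.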

- only allow a limited number of trials per time unit, or even a limited total number of trials. This reduces the number of bits dramatically. It already leans slightly into the realm of special systems, because you cannot have this in an environment where true offline attacks are possible (that's my theory, surprise me with a counter-example). This is well suited for even the most forgetful people (though I must admit that I was once completely in the dark about my PIN code too... very awkward)
- ATM pin codes ("secure" hardware)
- SIM card pin codes (mobile phones) ("secure" hardware)
- any kind of well-implemented remote login. If you have hundreds of servers that use the same authentication system without telling each other about failed attempts, that of course factors in. (This presumes the impossibility of offline attacks, i.e. the attacker does not have the password's hash and maybe does not even know whether a user exists.) PAKE tries this, but as Trevor mentioned, it is not water-tight, as there are viable possibilities for online and offline brute-forcing.
- (add more here)
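The effect of a trial limit can be put in numbers with a toy model (my own sketch, assuming a uniformly chosen secret and an attacker who never repeats a guess):

```python
# With a hard cap of k distinct guesses against a uniform secret drawn
# from a keyspace of size n, the attacker's success probability is k/n,
# no matter how fast their hardware is.
def success_probability(allowed_tries, keyspace):
    return allowed_tries / keyspace

# A 4-digit ATM PIN (10^4 possibilities) with a 3-try lockout:
p = success_probability(3, 10**4)
print(p)  # 0.0003
```

This is why ~13 bits of PIN can be "enough" behind secure hardware, while the same 13 bits would fall in microseconds to an offline attack.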

- restrict yourself to VERY specific cases:
- people must physically meet each other AND
- bring mobile phones / business cards / ...
- make up and remember (or write down, which is mostly bad, because the value is a secret) bit strings of a certain length, i.e. random words... how do you know whether your random words have the required number of bits? This case really is very shady from a security perspective, even though the procedure is easily understandable, i.e. usability (as in user-friendliness) is high
- you have an existing secure channel (there should not need to be any discussion about this case)
- (add more here)
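On the "how do you know if your random words have the required number of bits" question: the entropy is only well-defined if the words are drawn uniformly from a known list, not made up in one's head. A sketch with a hypothetical diceware-sized list:

```python
import math

# Hypothetical word list of 7776 words (the classic diceware size).
# Drawn uniformly, each word contributes log2(7776) ~ 12.9 bits;
# words a human "thinks up" contribute far less.
WORDLIST_SIZE = 7776
bits_per_word = math.log2(WORDLIST_SIZE)

# Words needed to reach ~84 bits (the entropy figure used further below):
words_needed = math.ceil(84 / bits_per_word)

print(f"{bits_per_word:.1f} bits/word, {words_needed} words for 84 bits")
```

The point is that the bit count comes from the sampling procedure, not from the words looking random.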

- make a weak assumption
- "voice imitation is difficult" (it is not, there are many people who can imitate voices compared to people who can fake hashes) - "people can think up random bitstrings of certain length" (very unlikely. numerous studies have shown that the human being is not a good RNG. E.g. if i knew someone met at a conference and they made up a shared secret there, i'd probably try something with *con*, cake, coffee, napkin... and anything i know about those people personally. Also, people tend to forget these one-time secrets and write them down - subverting security is as easy as stealing their wallets then)
- (add more here)

- more, better ways to shave off bits?

Added complications are extra requirements on the value/bitstring, which basically mean added bits and/or decreased usability, often to the point where people do not know what to do:
- unlinkability
- (add more here)

Anything non-computer-generated with a very low bit count is only applicable in _very_ specific settings, varying depending on what is assumed about the attacker (i.e. are you a high-profile target?). Personally I am more interested in schemes that are as close to unconditionally secure as possible, and I do not trust any of the "weak assumptions" I listed - for me, any scheme that relies on them is just about as good as no scheme. That does not mean it actually IS that way; I am just saying that, for me, the difference between 0.0001 and 0 is negligible.

### Many discussions here have contained several solutions. Some are solutions to very specific settings, where suddenly, under certain assumptions connected with that setting, the number of needed bits is very low. The same discussions sometimes also contain solutions with generalized approaches that do not rely on anything except the presumed hardness of certain problems (ECDLP, prime factoring, "inverting" hashes). More often than not, solutions from both realms need a comparable number of bits for the user to handle - in that case it is obvious that the strong solution is the better one.

I propose making it very clear which case a certain discussion is about, maybe by writing down the assumptions about the system (even the obvious ones) and further requirements (unlinkability etc.). Otherwise a discussion can contain 5 different schemes at the same time that are all secure - each under its own very specific assumptions. This might lead to confusion, and it is doubtful that the discussion can arrive at valuable results if the challenge (system parameters, requirements) is not clear (maybe it was just me to whom the challenge was sometimes not completely clear). Example requirements might be "no business cards, no mobile phones" or "users can easily be tricked into accepting a wrong hash as the right one", both of which would rule out any discussion about fingerprints or random computer-generated values; the discussion would then just be about which scheme provides the most security while being as user-friendly as possible under the given assumptions. There could be discussions that aim for different settings - one allowing business cards, the other not - in which case answers should make clear which of the target scenarios they refer to. These are just ideas :)

(*) Note on the "nicely readable" fingerprints: I am personally not sure how much you actually gain by doing it. Reading such a fingerprint out loud to someone else, you might (as the listener) not be able to distinguish between a's and e's or i's and since the fingerprint is basically an alteration between consonant and vowel, there is a lot of potential for attackers; basically, they just need to get the consonants right, and even there you can swap c with k etc... There need to be studies for this. Of course it is even pretty hard already to just get the consonants right by brute force. Basically what I am saying is that as an attacker you just need to get something that "sounds right", which might be much easier than getting something that is close to the original in a hamming-distance way.

The number of bits you save is easily calculated: as mentioned, the "readable" fingerprint is mostly an alternation between vowels (5 = 2.3 bits) and consonants (21 = 4.4 bits), totaling 6.7 bits per character pair, i.e. 3.4 bits per character (actually slightly more: 1 extra bit for whether we start with a consonant or a vowel, and the alternation is not strict). Comparing that with the theoretical entropy of a base32 string (5 bits per character), we reduce the entropy to 3.4/5 = 67% (of course it doesn't scale that way; if you start with a 256-bit hash you will probably never get close to 67%). A 125-bit value that meets the "niceness" criteria therefore has around 84 bits of entropy. Achieving the same by looking for a hash with leading zeros means finding a 125-bit hash with 41 leading zeros, occurring on average once every 2^41 hashes (2,199,023,255,552, i.e. about 2.2 trillion). However, that hash might be more valuable (depending on the results of the needed "nicely readable hash" study), because you can take the remaining 84 bits and encode them into whatever shape is, coding-theoretically speaking, the shortest one with the most distance by human standards (visual and auditory), therefore providing the lowest error rate at the shortest word length / lowest word count. Of course you need to encode the number of leading zeros somehow, which adds around 6 bits of information to the remaining 84 bits from the hash. Alternatively, find a hash with even more leading zeros to make up for those 6 bits, getting you to 2^47, which is already very uncomfortable to compute on standard hardware: 1 in every 140,737,488,355,328, i.e. about 140 trillion...
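The arithmetic above can be checked in a few lines (a sketch using the same simplification as the text: strict vowel/consonant alternation over 5 vowels and 21 consonants, ignoring the extra ~1 bit for the starting letter):

```python
import math

# Entropy of the "nicely readable" alphabet model.
bits_per_pair = math.log2(5) + math.log2(21)  # ~6.7 bits per vowel+consonant pair
bits_per_char = bits_per_pair / 2             # ~3.4 bits per character

chars = 125 // 5                    # a 125-bit value is 25 base32 characters
entropy = chars * bits_per_char     # ~84 bits survive the "niceness" filter
leading_zeros = 125 - round(entropy)  # zeros needed for the same work factor

print(round(entropy), leading_zeros, 2**leading_zeros)  # 84 41 2199023255552
```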

I hope at least parts of this are useful in the discussion.


Stefan
_______________________________________________
Messaging mailing list
[email protected]
https://moderncrypto.org/mailman/listinfo/messaging
