On Thu, Feb 6, 2014 at 1:40 AM, Daniel Thomas wrote:
> On 06/02/14 01:35, Trevor Perrin wrote:
>> I like the smaller size of the pseudowords, particularly for
>> transcribing these things, spelling out the characters over the phone,
>> or viewing on a small screen.  And a lot of the words are unusual so
>> are going to need to be spelled out.
>>
>> But it would be interesting to see what a better wordlist looks like.
>
> Diceware[0] has a fairly short (7776-word) list in multiple languages
> for the purpose of generating easy-to-remember passphrases.


Hi Daniel,

That's about a 13-bit list, which seems like a good middle ground
between an 8-bit list (like PGPfone) and a 16-bit list.  An 8-bit list
necessitates 16-word fingerprints for 128-bit security, which feels
like too many words.  A 16-bit list contains 65K words, which is more
than most people's vocabulary, meaning a lot of unusual words that
would have to be spelled out.
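
To make the trade-off concrete, here is a small sketch of the arithmetic:
bits per word is log2 of the list size, and the word count is the
128-bit target divided by that, rounded up.  The function name
words_needed is just for illustration.

```python
import math

def words_needed(list_size, security_bits=128):
    """Number of words required to carry at least `security_bits` bits."""
    bits_per_word = math.log2(list_size)
    return math.ceil(security_bits / bits_per_word)

for size in (2**8, 7776, 2**16):
    bits = math.log2(size)
    print(f"{size:6d} words: {bits:5.2f} bits/word, "
          f"{words_needed(size)} words for 128 bits")
```

The 7776-word Diceware list carries about 12.9 bits per word, so 10
words are enough for 128 bits, versus 16 words for an 8-bit list and 8
words for a 16-bit list.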

The Diceware dictionary is designed around short words and word
fragments (it includes numbers, punctuation, and non-words, which is a
bit weird IMO).  I wrote a script to generate 10 random Diceware words
to see what fingerprints might look like:

https://github.com/trevp/keyname


hop - flu - urn - belie - gogo - gravy - mayor - avow - plush - enter

bump - seem - soft - lm - plane - exit - plus - stilt - behind - malta

tract - rude - rhine - ready - climb - fell - fell - reek - cody - kudzu

bunch - sound - adler - galt - signor - glom - soup - on - lund - juju

essay - eave - ef - pro - stung - gn - smash - josef - vetch - busy

dawson - tic - vy - cake - rock - sr - store - ice - plunk - gp

old - swept - win - mike - xy - chill - seethe - allow - alva - jh

grace - curia - coke - rebut - 15 - foray - jaw - weco - anvil - buenos

pn - adair - swelt - faith - slash - berlin - watch - blood - start - santa

grow - del - bon - 99th - kepler - cam - fun - 37th - dryad - prone
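
For reference, lines like the ones above can be produced with a few
lines of Python.  This is only a sketch and not necessarily how the
script in the repository works; SAMPLE_WORDS is a tiny placeholder, and
a real run would load the full 7776-word Diceware list instead.

```python
import secrets

def random_fingerprint(wordlist, count=10):
    """Pick `count` words independently and uniformly at random."""
    return " - ".join(secrets.choice(wordlist) for _ in range(count))

# Placeholder list for illustration only; substitute the real Diceware list.
SAMPLE_WORDS = ["hop", "flu", "urn", "belie", "gogo", "gravy", "mayor", "avow"]

print(random_fingerprint(SAMPLE_WORDS))
```

Each word is drawn independently, so repeats such as "fell - fell" in
the third line above are possible.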


Below are 5 Diceware fingerprints side by side with 5 pseudoword
fingerprints of score=18.  In each pair the Diceware fingerprint comes
first.  The pseudoword fingerprints took an average of ~30 seconds
apiece to generate on a single core of my MacBook Air.  (The max
possible score is 20; a score of 18 means 2 deviations from
vowel/consonant alternation.)


oman - swath - haze - elmer - gouda - admix - feat - afar - reel - for

ukigex - 3kiw - jejod - yvak - rewupa


blitz - teal - emma - bambi - queen - 92 - mecum - om - derek - twa

lijuv7 - woxm - pokoj - cixa - ehajen


op - zomba - 84th - soy - oval - evolve - spook - fk - ghi - magog

syivoh - upim - leewo - hoda - madeso


piotr - vain - david - mk - gasp - buoy - malt - az - hang - rena

bewora - zutm - hirub - ugux - tlezeb


perk - fate - cinch - gulf - jb - marks - wag - canoe - sprig - maw

ripoyu - ime2 - fenef - aqos - lehnof
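
One scoring rule consistent with the description above is sketched
here.  It is a guess, not necessarily the actual implementation: it
counts adjacent character pairs inside each word that switch between
vowel and consonant, treats "y" as a consonant, and treats digits as
neither (so a pair containing a digit never counts).  With word lengths
6-4-5-4-6 there are 20 such pairs, which matches the stated maximum.

```python
VOWELS = set("aeiou")                      # assumption: "y" is a consonant
CONSONANTS = set("bcdfghjklmnpqrstvwxyz")

def kind(ch):
    """Classify a character; digits and anything else return None."""
    if ch in VOWELS:
        return "V"
    if ch in CONSONANTS:
        return "C"
    return None

def alternation_score(fingerprint):
    """Count adjacent in-word pairs that alternate vowel/consonant."""
    words = fingerprint.replace(" ", "").split("-")
    score = 0
    for word in words:
        for a, b in zip(word, word[1:]):
            ka, kb = kind(a), kind(b)
            if ka and kb and ka != kb:
                score += 1
    return score

print(alternation_score("ukigex - 3kiw - jejod - yvak - rewupa"))
```

Under these assumptions, all five pseudoword fingerprints above score
18.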


Both approaches seem pretty decent, not sure which is best.  Choosing
13-bit wordlists for different languages and dealing with
cross-language compatibility seems a hassle, but so is computing tens
of millions of hashes for a fingerprint.
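
The general shape of the pseudoword search is roughly the following
toy sketch: hash the key together with a counter, encode the digest as
pseudowords, score it, and repeat until the score is high enough.
Everything specific here is an assumption made for illustration: the
use of SHA-256, the base32 stand-in encoding, the 8-byte counter, and
the names encode, score and search.  The real scheme may differ, so the
cost of this toy says nothing about the figures quoted above.

```python
import base64
import hashlib

VOWELS = set("aeiou")
CONSONANTS = set("bcdfghjklmnpqrstvwxyz")
GROUPS = (6, 4, 5, 4, 6)   # word lengths matching the examples above

def encode(digest):
    """Stand-in encoding: first 25 base32 characters, split into words."""
    text = base64.b32encode(digest).decode("ascii").lower()[:sum(GROUPS)]
    words, pos = [], 0
    for n in GROUPS:
        words.append(text[pos:pos + n])
        pos += n
    return words

def score(words):
    """Count adjacent in-word pairs that alternate vowel/consonant."""
    total = 0
    for w in words:
        for a, b in zip(w, w[1:]):
            if (a in VOWELS and b in CONSONANTS) or \
               (a in CONSONANTS and b in VOWELS):
                total += 1
    return total

def search(key, target, max_tries=1_000_000):
    """Try counters until the encoded hash reaches the target score."""
    for counter in range(max_tries):
        digest = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        words = encode(digest)
        if score(words) >= target:
            return counter, words
    return None

# A low target keeps the demo fast; higher targets need far more hashes.
print(search(b"example public key", target=8))
```

The verifier would need the counter (or the search would have to be
repeated) to recompute the same fingerprint, which is part of the cost
being weighed against wordlists.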

There's a lot more that could be done here:  e.g. make a better
wordlist than Diceware, or optimize the pseudoword search and do
better scoring.

If anyone wants to do UX research, these would be great projects...


Trevor
_______________________________________________
Messaging mailing list
[email protected]
https://moderncrypto.org/mailman/listinfo/messaging
