Re: Is this secure?
On 03.03.2010 04:51, Lie Ryan wrote:
> import itertools
>
> def gen():
>     valid_chars = 'abcdefghijklmnopqrstuvwxyz'
>     for char in itertools.cycle(valid_chars):
>         yield char
>
> gen = gen()
>
> def gen_rand_string(length):
>     chars = (next(gen) for i in range(length))
>     return ''.join(chars)
>
> since it gives me a perfect distribution of letters,

It does not. Only if not (length % len(valid_chars)), i.e. only when the requested length is a multiple of the alphabet size.

Regards,
Michael
--
http://mail.python.org/mailman/listinfo/python-list
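Michael's objection can be checked directly. The following is a minimal Python 3 sketch, not code from the thread; `cycled_string` and `is_perfectly_uniform` are hypothetical helper names. It cycles through the alphabet and checks whether every letter occurs equally often. The counts are exactly equal only when the length is a multiple of 26, and the output is fully predictable either way, so a "perfect" distribution says nothing about security.

```python
import itertools
from collections import Counter

ALPHABET = 'abcdefghijklmnopqrstuvwxyz'

def cycled_string(length):
    """Return `length` letters taken by cycling through ALPHABET (hypothetical helper)."""
    return ''.join(itertools.islice(itertools.cycle(ALPHABET), length))

def is_perfectly_uniform(s):
    """True if every letter of ALPHABET occurs equally often in s."""
    counts = Counter(s)
    return len({counts[c] for c in ALPHABET}) == 1

print(is_perfectly_uniform(cycled_string(52)))  # length is a multiple of 26
print(is_perfectly_uniform(cycled_string(30)))  # 'a'..'d' occur once more than the rest
print(cycled_string(10))                        # the same string on every run
```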
Re: Is this secure?
On 02/25/2010 06:16 AM, mk wrote:
> On 2010-02-24 20:01, Robert Kern wrote:
>> I will repeat my advice to just use random.SystemRandom.choice() instead
>> of trying to interpret the bytes from /dev/urandom directly.
>
> Out of curiosity:
>
> def gen_rand_string(length):
>     prng = random.SystemRandom()
>     chars = []
>     for i in range(length):
>         chars.append(prng.choice('abcdefghijklmnopqrstuvwxyz'))
>     return ''.join(chars)
>
> if __name__ == "__main__":
>     chardict = {}
>     for i in range(10000):
>         ##w = gen_rand_word(10)
>         w = gen_rand_string(10)
>         count_chars(chardict, w)
>     counts = list(chardict.items())
>     counts.sort(key = operator.itemgetter(1), reverse = True)
>     for char, count in counts:
>         print char, count
>
>
> s 3966
> d 3912
> g 3909
> h 3905
> a 3901
> u 3900
> q 3891
> m 3888
> k 3884
> b 3878
> x 3875
> v 3867
> w 3864
> y 3851
> l 3825
> z 3821
> c 3819
> e 3819
> r 3816
> n 3808
> o 3797
> f 3795
> t 3784
> p 3765
> j 3730
> i 3704
>
> Better, although still not perfect.
>

I give you this:

import itertools

def gen():
    valid_chars = 'abcdefghijklmnopqrstuvwxyz'
    for char in itertools.cycle(valid_chars):
        yield char

gen = gen()

def gen_rand_string(length):
    chars = (next(gen) for i in range(length))
    return ''.join(chars)

since it gives me a perfect distribution of letters, it must be a very secure random password generation scheme.
Re: Is this secure?
Robert Kern wrote:
>On 2010-02-28 01:28 AM, Aahz wrote:
>> Robert Kern wrote:
>>>
>>> If you are storing the password instead of making your user remember
>>> it, most platforms have some kind of keychain secure password
>>> storage. I recommend reading up on the APIs available on your targeted
>>> platforms.
>>
>> Are you sure? I haven't done a lot of research, but my impression was
>> that Windows didn't have anything built in.
>
>You're right, not built-in, but Windows does provide enough crypto
>services for a cross-platform Python implementation to be built:
>
> http://pypi.python.org/pypi/keyring

Thank you! That's a big help!
--
Aahz (a...@pythoncraft.com) <*> http://www.pythoncraft.com/

"Many customs in this life persist because they ease friction and promote productivity as a result of universal agreement, and whether they are precisely the optimal choices is much less important." --Henry Spencer
Re: Is this secure?
On 2010-02-28 01:28 AM, Aahz wrote:
> Robert Kern wrote:
>> If you are storing the password instead of making your user remember
>> it, most platforms have some kind of keychain secure password
>> storage. I recommend reading up on the APIs available on your targeted
>> platforms.
>
> Are you sure? I haven't done a lot of research, but my impression was
> that Windows didn't have anything built in.

You're right, not built-in, but Windows does provide enough crypto services for a cross-platform Python implementation to be built:

http://pypi.python.org/pypi/keyring

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
Re: Is this secure?
a...@pythoncraft.com (Aahz) writes:
> Are you sure? I haven't done a lot of research, but my impression was
> that Windows didn't have anything built in.

I don't know much about Windows, but there is the CAPI and then there is all the TCPA (i.e. DRM) stuff. Maybe it can be used somehow.
Re: Is this secure?
Robert Kern wrote:
>
>If you are storing the password instead of making your user remember
>it, most platforms have some kind of keychain secure password
>storage. I recommend reading up on the APIs available on your targeted
>platforms.

Are you sure? I haven't done a lot of research, but my impression was that Windows didn't have anything built in.
--
Aahz (a...@pythoncraft.com) <*> http://www.pythoncraft.com/

"Many customs in this life persist because they ease friction and promote productivity as a result of universal agreement, and whether they are precisely the optimal choices is much less important." --Henry Spencer
Re: Is this secure?
On Wed, 24 Feb 2010 21:02:07 +0100, mk wrote:
[snip]
>
> rand_str_SystemRandom_seeding
> mean 3845.15384615 std dev 46.2016419186
> char  count  std devs away from mean
> l     3926    1.75
> y     3916    1.53
> d     3909    1.38
> a     3898    1.14
> p     3898    1.14
> c     3889    0.95
> u     3884    0.84
> j     3873    0.60
> n     3873    0.60
> w     3866    0.45
> x     3863    0.39
> r     3855    0.21
> m     3852    0.15
> b     3841   -0.09
> t     3835   -0.22
> o     3829   -0.35
> k     3827   -0.39
> i     3821   -0.52
> s     3812   -0.72
> q     3806   -0.85
> v     3803   -0.91
> g     3799   -1.00
> h     3793   -1.13
> e     3782   -1.37
> f     3766   -1.71
> z     3758   -1.89

Chi2 = 14.43, 25 d.f., prob = 0.046362.
The observed distribution is SIGNIFICANTLY CLOSER to the uniform distribution than reasonable by chance.

> rand_str_SystemRandom_noseeding
> mean 3845.15384615 std dev 55.670522726
> char  count  std devs away from mean
> i     3961    2.08
> r     3911    1.18
> e     3910    1.16
> m     3905    1.08
> a     3901    1.00
> u     3893    0.86
> t     3882    0.66
> w     3872    0.48
> s     3870    0.45
> c     3868    0.41
> n     3866    0.37
> q     3865    0.36
> k     3863    0.32
> y     3848    0.05
> j     3836   -0.16
> v     3830   -0.27
> f     3829   -0.29
> z     3829   -0.29
> g     3827   -0.33
> l     3818   -0.49
> b     3803   -0.76
> d     3803   -0.76
> p     3756   -1.60
> x     3755   -1.62
> h     3744   -1.82
> o     3729   -2.09

Chi2 = 20.96, 25 d.f., prob = 0.304944.
The observed distribution is not significantly different from the uniform distribution.

> rand_str_custom
> mean 3517.15384615 std dev 40.7541336343
> char  count  std devs away from mean
> i     3586    1.69
> a     3578    1.49
> e     3575    1.42
> m     3570    1.30
> q     3562    1.10
> c     3555    0.93
> g     3552    0.86
> w     3542    0.61
> p     3536    0.46
> x     3533    0.39
> s     3528    0.27
> o     3524    0.17
> d     3516   -0.03
> t     3515   -0.05
> h     3511   -0.15
> v     3502   -0.37
> z     3502   -0.37
> b     3500   -0.42
> f     3496   -0.52
> u     3492   -0.62
> l     3486   -0.76
> r     3478   -0.96
> n     3476   -1.01
> j     3451   -1.62
> k     3450   -1.65
> y     3430   -2.14

Chi2 = 12.28, 25 d.f., prob = 0.015815.
The observed distribution is SIGNIFICANTLY CLOSER to the uniform distribution than reasonable by chance.

> It would appear that SystemRandom().choice is indeed best (in terms of
> how much the counts stray from mean in std devs), but only after seeding
> it with os.urandom.

I don't see any reason to worry about any of the three, except perhaps that the first and last are surprisingly uniform.

--
To email me, substitute nowhere->spamcop, invalid->net.
Re: Is this secure?
On Wed, 24 Feb 2010 20:16:24 +0100, mk wrote:
> On 2010-02-24 20:01, Robert Kern wrote:
>> I will repeat my advice to just use random.SystemRandom.choice() instead
>> of trying to interpret the bytes from /dev/urandom directly.
>
> Out of curiosity:
>
> def gen_rand_string(length):
>     prng = random.SystemRandom()
>     chars = []
>     for i in range(length):
>         chars.append(prng.choice('abcdefghijklmnopqrstuvwxyz'))
>     return ''.join(chars)
>
> if __name__ == "__main__":
>     chardict = {}
>     for i in range(10000):
>         ##w = gen_rand_word(10)
>         w = gen_rand_string(10)
>         count_chars(chardict, w)
>     counts = list(chardict.items())
>     counts.sort(key = operator.itemgetter(1), reverse = True)
>     for char, count in counts:
>         print char, count
>
>
> s 3966
> d 3912
> g 3909
> h 3905
> a 3901
> u 3900
> q 3891
> m 3888
> k 3884
> b 3878
> x 3875
> v 3867
> w 3864
> y 3851
> l 3825
> z 3821
> c 3819
> e 3819
> r 3816
> n 3808
> o 3797
> f 3795
> t 3784
> p 3765
> j 3730
> i 3704
>
> Better, although still not perfect.

What would be perfect? Surely one shouldn't be happy if all the tallies come out exactly equal: that would be a blatant indication of something very nonrandom going on.

The tallies given above give a chi-squared value smack in the middle of the range expected for random sampling of a uniform distribution (p = 0.505). So the chi-squared metric of goodness-of-fit to a uniform distribution says you're doing fine.

--
To email me, substitute nowhere->spamcop, invalid->net.
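The chi-squared statistic behind that remark is easy to recompute from the quoted tallies. This is a Python 3 sketch and not the poster's own script; `chi_squared_uniform` is a hypothetical helper. It computes only the statistic, since turning it into a p-value needs a chi-squared distribution function (for example from SciPy, which is assumed here to be unavailable). For a good fit the statistic should land near the number of degrees of freedom, 25.

```python
# Tallies copied from the quoted output above.
tallies = {
    's': 3966, 'd': 3912, 'g': 3909, 'h': 3905, 'a': 3901, 'u': 3900,
    'q': 3891, 'm': 3888, 'k': 3884, 'b': 3878, 'x': 3875, 'v': 3867,
    'w': 3864, 'y': 3851, 'l': 3825, 'z': 3821, 'c': 3819, 'e': 3819,
    'r': 3816, 'n': 3808, 'o': 3797, 'f': 3795, 't': 3784, 'p': 3765,
    'j': 3730, 'i': 3704,
}

def chi_squared_uniform(counts):
    """Pearson chi-squared statistic against equal expected counts."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

chi2 = chi_squared_uniform(list(tallies.values()))
dof = len(tallies) - 1  # 26 categories, one constraint (the total)
print('chi2 = %.2f with %d degrees of freedom' % (chi2, dof))
```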
Re: Is this secure?
On 2010-02-25 09:03 AM, mk wrote:
> 2. The app will have GUI and it will be locally installed; it's not
> going to be web app, it will just be online in the sense of downloading
> data frequently from the net.

If you are storing the password instead of making your user remember it, most platforms have some kind of keychain secure password storage. I recommend reading up on the APIs available on your targeted platforms.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
Re: Is this secure?
On 2010-02-25 02:31, Paul Rubin wrote:
> It might be helpful if you could say what your application does, or
> anyway give an idea of what its actual security requirements are.
> Generating and emailing someone a random password is a fairly standard
> method for (e.g.) web forums to verify that the person has supplied a
> working email address, basically as a first level spam filter. Your
> scheme is probably ok for that. If you're doing something with more
> demanding security requirements, then as mentioned before, there is a
> whole lot of stuff you have to pay attention to, and focusing narrowly
> on password generation isn't that useful.

OK some data:

1. I'm going to probably use HTTPS (I meant HTTP over SSL, but wrote HTTP instead of being precise).

2. The app will have a GUI and it will be locally installed; it's not going to be a web app, it will just be online in the sense of downloading data frequently from the net.

3. I can't disclose the details on what the app will be doing, but it's not going to be terribly security-critical, medium-level at most - what happens in the app itself is not _directly_ related to money.

4. The app will be based on a subscription model, costing $10-$20 per month. It's not really doing online banking or something like that.

5. The worst thing that can happen when the security of some account is compromised is that the user will have to reset the password, resending it to a predefined email (I don't really see a way of organizing it in a different manner).

I'm thinking about optionally passphrase-securing the password saved in the GUI. In that case the 'diceware' approach would be helpful.

I certainly do not want to focus narrowly on password generation: the entire security model will have to be worked out, but the project hasn't got to that stage yet, it's all still in planning stages. I just wanted to have this one part (password generation) researched before I get to other stages so I don't have to implement this later in haste and do something wrong.

Regards,
mk
Re: Is this secure?
On 2010-02-25 02:07, Steven D'Aprano wrote:
> On Wed, 24 Feb 2010 18:23:17 +0100, mk wrote:
>> Anyway, the passwords for authorized users will be copied and pasted
>> from email into the application GUI which will remember it for them,
>> so they will not have to remember and type them in.
>
> So to break your application's security model, all somebody has to do is
> use their PC and they have full access to their account?
>
> Or get hold of the copy and paste buffer?
>
> Or the application's config files?

Yes. There's no way around this, short of forcing them to use a hardware key, which is overkill for this application.

>> So I have little in the way of limitations of password length - even
>> though in *some* cases somebody might have to (or be ignorant enough)
>> to retype the password instead of pasting it in.
>
> Or your users might be sensible enough to not trust a roll-your-own
> security model, and prefer to memorize the password than to trust that
> nobody will get access to their PC.

The app is not that critical, it's about a quarterly subscription to the service, and the users will be able to reset the password anyway. If it were that critical, I'd use hardware keys; if hardware keys are not used, once somebody gains (unconstrained) access to the user's PC, there's not much that the app developer can do.

I've read somewhere a warning from the PuTTY developer that even though the key is (normally) protected by the passphrase, losing even an encrypted key is quite likely to lead to its compromise. There's even some software for that on the net:

http://www.neophob.com/serendipity/index.php?/archives/127-PuTTY-Private-Key-cracker.html

>> The main application will access the data using HTTP (probably), so the
>> main point is that an attacker is not able to guess passwords using
>> brute force.
>
> And why would they bother doing that when they can sniff the wire and get
> the passwords in plain text? You should assume your attackers are
> *smarter* than you, not trust them to be foolish.

I should have written HTTPS.

Regards,
mk
Re: Is this secure?
mk writes:
> Anyway, the passwords for authorized users will be copied and pasted
> from email into the application GUI which will remember it for
> them, so they will not have to remember and type them in.

It occurs to me that you don't even need to mess with letters in that case:

password = os.urandom(5).encode('hex')

will generate a string of 10 hex digits that you can give to the user. (That is for Python 2.x but I think it might be broken in Python 3).

It might be helpful if you could say what your application does, or anyway give an idea of what its actual security requirements are. Generating and emailing someone a random password is a fairly standard method for (e.g.) web forums to verify that the person has supplied a working email address, basically as a first level spam filter. Your scheme is probably ok for that. If you're doing something with more demanding security requirements, then as mentioned before, there is a whole lot of stuff you have to pay attention to, and focusing narrowly on password generation isn't that useful.
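The suspicion about Python 3 is well founded: `bytes` objects there have no `encode` method. A minimal Python 3 equivalent, assuming Python 3.5 or later for `bytes.hex()`, might look like the sketch below; `hex_password` is a hypothetical helper name. Note that 5 random bytes carry 40 bits of entropy.

```python
import os

def hex_password(n_bytes=5):
    """Return 2 * n_bytes random hex digits (hypothetical helper name)."""
    return os.urandom(n_bytes).hex()

password = hex_password()
print(password)  # ten hex digits, different on every run
```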
Re: Is this secure?
On Wed, 24 Feb 2010 18:23:17 +0100, mk wrote:
> Anyway, the passwords for authorized users will be copied and pasted
> from email into the application GUI which will remember it for them,
> so they will not have to remember and type them in.

So to break your application's security model, all somebody has to do is use their PC and they have full access to their account?

Or get hold of the copy and paste buffer?

Or the application's config files?

> So I have little in
> the way of limitations of password length - even though in *some* cases
> somebody might have to (or be ignorant enough) to retype the password
> instead of pasting it in.

Or your users might be sensible enough to not trust a roll-your-own security model, and prefer to memorize the password than to trust that nobody will get access to their PC.

> The main application will access the data using HTTP (probably), so the
> main point is that an attacker is not able to guess passwords using
> brute force.

And why would they bother doing that when they can sniff the wire and get the passwords in plain text? You should assume your attackers are *smarter* than you, not trust them to be foolish.

--
Steven
Re: Is this secure?
On 24.02.2010 21:06, mk wrote:
> I just posted a comparison with calculating std deviations for various
> methods - using os.urandom, SystemRandom.choice with seeding and without
> seeding.

I saw them.

> They all seem to have slightly different distributions.

No they don't. Just run those tests again and you will see that you cannot put them in any order or find any pattern in their behaviour. They are all correct now, except that you cannot seed SystemRandom, as it is *not* a PRNG (at least here, it is a wrapper for /dev/urandom).

Regards,
Michael
Re: Is this secure?
mk writes:
> def rand_str_custom(n):
>     s = os.urandom(n)
>     return ''.join([chr(ord('a') + ord(x) % 26) for x in s if ord(x) < 234])

Note that simply throws away some of the chars. You have to replace them, not throw them away.

> rand_str_SystemRandom_seeding
> mean 3845.15384615 std dev 46.2016419186
> l 3926 1.75 std devs away from mean
> y 3916 1.53 std devs away from mean
...

What do you think you're measuring here? Yes, if you're doing 1000's of draws from a distribution, you'd expect a few of them to be 1.75 sigma from the mean. Since there are 26 letters, you'd expect a multinomial distribution which you can test for with the multinomial test or some approximation from the article:

http://en.wikipedia.org/wiki/Multinomial_test

I wish I knew more statistics than I do, since there is probably some more familiar statistical test (e.g. the T-test) that you can use as the number of trials gets large, since each bin of the multinomial distribution should eventually start to look like a normal distribution due to the central limit theorem. Others here know a lot more about this stuff than I do, and can probably give better advice.

Anyway though, the output of os.urandom should be extremely hard to distinguish from real randomness (that's the whole point of a cryptographic PRNG).
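One way to "replace, not throw away" is to keep drawing bytes until enough have been accepted. The following is a Python 3 sketch under that reading, not the poster's code; `rand_str_rejection` and `LIMIT` are hypothetical names. Iterating over `bytes` in Python 3 yields integers, so no `ord()` call is needed.

```python
import os

ALPHABET = 'abcdefghijklmnopqrstuvwxyz'
# 234: the largest multiple of 26 that fits in a byte; values >= LIMIT are rejected.
LIMIT = 256 - 256 % len(ALPHABET)

def rand_str_rejection(n):
    """Return exactly n letters; rejected bytes are redrawn rather than dropped."""
    letters = []
    while len(letters) < n:
        # Request only as many bytes as are still missing, so the result never overshoots.
        for byte in os.urandom(n - len(letters)):
            if byte < LIMIT:
                letters.append(ALPHABET[byte % len(ALPHABET)])
    return ''.join(letters)

print(rand_str_rejection(10))
```

Unlike the quoted `rand_str_custom`, this always returns a string of the requested length.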
Re: Is this secure?
On 2010-02-24 14:02 PM, mk wrote:
> It would appear that SystemRandom().choice is indeed best (in terms of
> how much the counts stray from mean in std devs), but only after seeding
> it with os.urandom.

Calling random.seed() does not affect SystemRandom() whatsoever. You are getting perfectly acceptable distributions for all three variants.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
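This can be seen in a small experiment. The Python 3 sketch below is not from the thread; `draw` is a hypothetical helper. It contrasts a seeded `random.Random`, which reproduces its output, with `random.SystemRandom`, whose `seed()` is documented as a stub and whose output is unaffected by `random.seed()`.

```python
import random

ALPHABET = 'abcdefghijklmnopqrstuvwxyz'

def draw(rng, n=32):
    """Draw n letters from the given generator (hypothetical helper)."""
    return ''.join(rng.choice(ALPHABET) for _ in range(n))

# A seeded Mersenne Twister reproduces its output exactly.
a = draw(random.Random(12345))
b = draw(random.Random(12345))
print(a == b)

# SystemRandom reads from the OS entropy source; neither random.seed()
# nor its own seed() method changes what it returns.
random.seed(12345)
sys_rng = random.SystemRandom()
sys_rng.seed(12345)  # accepted, but does nothing
c = draw(sys_rng)
random.seed(12345)
sys_rng.seed(12345)
d = draw(sys_rng)
print(c == d)  # a match would require a 1-in-26**32 coincidence
```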
Re: Is this secure?
On 2010-02-24 20:30, Michael Rudolf wrote:
>>> The reason is 256 % 26 != 0
>>> 256 mod 26 equals 22, thus your code is hitting a-v about 10% (256/26
>>> is approx. 10) more often than w-z.
>>
>> writing secure code is hard...
>
> So true. That's why one should stick to standard libs when it comes to
> crypto or security in general. It's just too easy to mess it up. Just ask
> Debian about whether touching OpenSSL was a good idea ;)

That was a brain-dead hiccup, for crying out loud how could they do something so stupid.

>> def gen_rand_word(n):
>>     with open('/dev/urandom') as f:
>>         return ''.join([chr(ord('a') + ord(x) % 26) for x in f.read(n) if ord(x) > 22])
>
> Off-by-one-error: you're skipping len(range(23))==23 hits.

Argh, it's late here.

> Well, I really think that you should use repeated Random.choice on an
> alphabet.
> Or Random.Systemrandom.choice if you don't trust the PRNG.

I just posted a comparison with calculating std deviations for various methods - using os.urandom, SystemRandom.choice with seeding and without seeding.

They all seem to have slightly different distributions.

Regards,
mk
Re: Is this secure?
On 2010-02-24 20:19, Robert Kern wrote:
> On 2010-02-24 13:09 PM, mk wrote:
>> On 2010-02-24 20:01, Robert Kern wrote:
>>> I will repeat my advice to just use random.SystemRandom.choice() instead
>>> of trying to interpret the bytes from /dev/urandom directly.
>>
>> Oh I hear you -- for production use I would (will) certainly consider
>> this. However, now I'm interested in the problem itself: why is the damn
>> distribution not uniform?
>
> You want "< 234", not "< 235". (234 % 26 == 0), so you get some extra 'a's.

Right, this explains the 'a' outlier. Fixed. But still:

import operator
import os
import random
import math

def rand_str_custom(n):
    s = os.urandom(n)
    # bytes >= 234 are dropped, so the result can be shorter than n
    return ''.join([chr(ord('a') + ord(x) % 26) for x in s if ord(x) < 234])

def count_chars(chardict, word):
    for c in word:
        try:
            chardict[c] += 1
        except KeyError:
            # bug: should be 1. The first occurrence of each letter is not
            # counted, so the tallies below sum to 99974 instead of 100000.
            chardict[c] = 0

def rand_str_SystemRandom_seeding(length):
    seed = os.urandom(32)
    random.seed(seed)
    prng = random.SystemRandom()
    chars = []
    for i in range(length):
        chars.append(prng.choice('abcdefghijklmnopqrstuvwxyz'))
    return ''.join(chars)

def rand_str_SystemRandom_noseeding(length):
    prng = random.SystemRandom()
    chars = []
    for i in range(length):
        chars.append(prng.choice('abcdefghijklmnopqrstuvwxyz'))
    return ''.join(chars)

def sd(x):
    # running population standard deviation, state kept in function attributes
    sd.sum += x
    sd.sum2 += x*x
    sd.n += 1.0
    sum, sum2, n = sd.sum, sd.sum2, sd.n
    return math.sqrt(sum2/n - sum*sum/n/n)

def gen_rand_with_fun(fun):
    print fun.__name__
    chardict = {}
    for i in range(10000):
        w = fun(10)
        count_chars(chardict, w)
    counts = list(chardict.items())
    counts.sort(key = operator.itemgetter(1), reverse = True)
    nums = [c[1] for c in counts]
    sd.sum = sd.sum2 = sd.n = 0
    mean = (1.0*sum(nums))/len(nums)
    stddev = map(sd, nums)[-1]
    print 'mean', mean, 'std dev', stddev
    for char, count in counts:
        print char, count, '%.2f' % ((count - mean)/stddev), 'std devs away from mean'

if __name__ == "__main__":
    gen_rand_with_fun(rand_str_SystemRandom_seeding)
    print
    gen_rand_with_fun(rand_str_SystemRandom_noseeding)
    print
    gen_rand_with_fun(rand_str_custom)


rand_str_SystemRandom_seeding
mean 3845.15384615 std dev 46.2016419186
char  count  std devs away from mean
l     3926    1.75
y     3916    1.53
d     3909    1.38
a     3898    1.14
p     3898    1.14
c     3889    0.95
u     3884    0.84
j     3873    0.60
n     3873    0.60
w     3866    0.45
x     3863    0.39
r     3855    0.21
m     3852    0.15
b     3841   -0.09
t     3835   -0.22
o     3829   -0.35
k     3827   -0.39
i     3821   -0.52
s     3812   -0.72
q     3806   -0.85
v     3803   -0.91
g     3799   -1.00
h     3793   -1.13
e     3782   -1.37
f     3766   -1.71
z     3758   -1.89

rand_str_SystemRandom_noseeding
mean 3845.15384615 std dev 55.670522726
char  count  std devs away from mean
i     3961    2.08
r     3911    1.18
e     3910    1.16
m     3905    1.08
a     3901    1.00
u     3893    0.86
t     3882    0.66
w     3872    0.48
s     3870    0.45
c     3868    0.41
n     3866    0.37
q     3865    0.36
k     3863    0.32
y     3848    0.05
j     3836   -0.16
v     3830   -0.27
f     3829   -0.29
z     3829   -0.29
g     3827   -0.33
l     3818   -0.49
b     3803   -0.76
d     3803   -0.76
p     3756   -1.60
x     3755   -1.62
h     3744   -1.82
o     3729   -2.09

rand_str_custom
mean 3517.15384615 std dev 40.7541336343
char  count  std devs away from mean
i     3586    1.69
a     3578    1.49
e     3575    1.42
m     3570    1.30
q     3562    1.10
c     3555    0.93
g     3552    0.86
w     3542    0.61
p     3536    0.46
x     3533    0.39
s     3528    0.27
o     3524    0.17
d     3516   -0.03
t     3515   -0.05
h     3511   -0.15
v     3502   -0.37
z     3502   -0.37
b     3500   -0.42
f     3496   -0.52
u     3492   -0.62
l     3486   -0.76
r     3478   -0.96
n     3476   -1.01
j     3451   -1.62
k     3450   -1.65
y     3430   -2.14

It would appear that SystemRandom().choice is indeed best (in terms of how much the counts stray from mean in std devs), but only after seeding it with os.urandom.
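For anyone repeating the experiment today, the same tally can be written more compactly in Python 3. This is a sketch, not a port blessed by the original poster; `rand_str`, `tally` and `report` are hypothetical names. `collections.Counter` replaces the hand-rolled counting and `statistics.pstdev` gives the same population standard deviation as the running `sd()` helper.

```python
import random
import statistics
from collections import Counter

ALPHABET = 'abcdefghijklmnopqrstuvwxyz'

def rand_str(length, rng=None):
    """Draw `length` letters using SystemRandom unless another generator is given."""
    rng = rng or random.SystemRandom()
    return ''.join(rng.choice(ALPHABET) for _ in range(length))

def tally(words):
    """Count every letter across all words."""
    counts = Counter()
    for word in words:
        counts.update(word)
    return counts

def report(counts):
    """Print each letter's count and its distance from the mean in std devs."""
    values = [counts[c] for c in ALPHABET]
    mean = statistics.mean(values)
    stddev = statistics.pstdev(values)
    print('mean', mean, 'std dev', stddev)
    for char, count in counts.most_common():
        print(char, count, '%.2f' % ((count - mean) / stddev), 'std devs away from mean')
    return mean, stddev

counts = tally(rand_str(10) for _ in range(10000))
mean, stddev = report(counts)
```

Because every first occurrence is counted here, the tallies sum to exactly 100000.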
Re: Is this secure?
On 24.02.2010 19:35, mk wrote:
> On 2010-02-24 18:56, Michael Rudolf wrote:
>> The reason is 256 % 26 != 0
>> 256 mod 26 equals 22, thus your code is hitting a-v about 10% (256/26 is
>> approx. 10) more often than w-z.
>
> writing secure code is hard...

So true. That's why one should stick to standard libs when it comes to crypto or security in general. It's just too easy to mess it up. Just ask Debian about whether touching OpenSSL was a good idea ;)

>> You might want to skip the values 0-22 to achieve a truly uniform
>> distribution.
>
> Hmm perhaps you meant to skip values over 256 - 22 ?

That's the same thing as x mod y equals x+N*y mod y for every natural N.

> def gen_rand_word(n):
>     with open('/dev/urandom') as f:
>         return ''.join([chr(ord('a') + ord(x) % 26) for x in f.read(n) if ord(x) > 22])

Off-by-one-error: you're skipping len(range(23))==23 hits.
OK, I just see that I wrote misleading 0-22 while I meant range(22).

> While with this:
> def gen_rand_word(n):
>     with open('/dev/urandom') as f:
>         return ''.join([chr(ord('a') + ord(x) % 26) for x in f.read(n) if ord(x) < 235])

Same off-by-one.

>> FYI: Electronic Cash PINs in europe (dont know about the rest of the
>> world) were computed the same way (random hexdigit and just mod it when
>> it's too large) leading to a high probability that your first digit was
>> a 1 :)
>
> Schadenfreude is deriving joy from others' misfortunes; what is the
> German word, if any, for deriving solace from others' misfortunes? ;-)

Well - "Schadenfreude" *is* in fact a German word :)
"Schaden" is the event or result of misfortune, "Freude" is joy.

Well, I really think that you should use repeated Random.choice on an alphabet.
Or Random.Systemrandom.choice if you don't trust the PRNG.

Regards,
Michael
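The bias from `byte % 26` can be counted exactly rather than estimated. The short Python 3 sketch below is an illustration and not code from the thread; it tabulates how many of the 256 possible byte values map to each letter. Since 256 = 9 * 26 + 22, the first 22 letters are hit by one extra byte value each.

```python
from collections import Counter

ALPHABET = 'abcdefghijklmnopqrstuvwxyz'

# How many of the 256 byte values land on each letter under byte % 26?
hits = Counter(ALPHABET[b % 26] for b in range(256))

print(hits['a'], hits['v'])  # letters a..v
print(hits['w'], hits['z'])  # letters w..z
```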
Re: Is this secure?
mk writes:
> So I have little in the way of limitations of password length
> ...
> The main application will access the data using HTTP (probably), so
> the main point is that an attacker is not able to guess passwords
> using brute force.

If it's HTTP instead of HTTPS and you're sending the password in the clear, then a serious attacker can simply eavesdrop the connection and pick up the password. Again, if the application is a web forum or something like that, the security requirements probably aren't terribly high. If it's (say) a financial application with potentially motivated attackers, you've got to be a lot more careful than I think you're being right now, and you should really get security specialists involved.

> Using A-z with 10-char password seems to provide 3 orders of magnitude
> more combinations than a-z:

Yes, 2**10 = 1024 so (57/25)**10 is a little more than that.

> Even then I'm not getting completely uniform distribution for some reason:

Exact equality of the counts would be surprising and a sign that something was wrong with the generation process. It would be like flipping a coin 10000 times and getting exactly 5000 heads. The binomial distribution tells you that the number should be close to 5000, but that it's unlikely to be -exactly- 5000.

Also, as Michael Rudolf mentioned, getting a letter by taking n%26 where n is drawn uniformly from [0..255] doesn't give a uniform distribution because 256 is not a multiple of 26. I had thought about making an adjustment for that when I posted, but it didn't seem worth cluttering up the code. Uniformity for its own sake doesn't gain you anything; what matters is entropy. If you compute the entropy difference between the slightly nonuniform distribution and a uniform one, it's very small.

To get a more uniform distribution I usually just take a larger n, rather than conditionalizing the draws. For example, in the diceware-like code I posted, I read 10 random bytes (giving a uniform random number on [0..2**80)) from urandom for each word. That is still not perfectly uniform, but it's closer to the point where the difference would be very hard to detect.

> Aw shucks when will I learn to do the stuff in 3 lines well instead of
> 20, poorly. :-/

Well, that's partly a matter of practice, but I'll mention one way I simplified the code, which was by reading more bytes from /dev/urandom than was really necessary. I read one byte for each random letter (i.e. throwing away about 3 random bits for each letter) while you tried to encode the urandom data cleverly and map 4 random bytes to 5 alphabetic letters. /dev/urandom uses a cryptographic PRNG and it's pretty fast, so reading a few extra bytes from it to simplify your code doesn't really cost you anything.
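The entropy argument can be put into numbers. The Python 3 sketch below is illustrative and not the diceware code referred to above; `entropy_bits` and `letter_from_wide_draw` are hypothetical names. It compares the Shannon entropy of the `byte % 26` distribution (22 letters with probability 10/256, 4 letters with 9/256) against a uniform choice among 26 letters, then shows the "larger n" approach of reducing an 80-bit draw modulo 26.

```python
import math
import os

ALPHABET = 'abcdefghijklmnopqrstuvwxyz'

def entropy_bits(probabilities):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

biased = [10 / 256] * 22 + [9 / 256] * 4   # one byte reduced mod 26
uniform = [1 / 26] * 26
loss = entropy_bits(uniform) - entropy_bits(biased)
print('%.5f bits lost per letter' % loss)

def letter_from_wide_draw(n_bytes=10):
    """Pick a letter from a much wider draw, which shrinks the modulo bias further."""
    n = int.from_bytes(os.urandom(n_bytes), 'big')
    return ALPHABET[n % len(ALPHABET)]

print(letter_from_wide_draw())
```

The loss comes out at roughly a thousandth of a bit per letter, against about 4.7 bits available.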
Re: Is this secure?
On 2010-02-24 13:16 PM, mk wrote:
> On 2010-02-24 20:01, Robert Kern wrote:
>> I will repeat my advice to just use random.SystemRandom.choice() instead
>> of trying to interpret the bytes from /dev/urandom directly.
>
> Out of curiosity:
>
> def gen_rand_string(length):
>     prng = random.SystemRandom()
>     chars = []
>     for i in range(length):
>         chars.append(prng.choice('abcdefghijklmnopqrstuvwxyz'))
>     return ''.join(chars)
>
> if __name__ == "__main__":
>     chardict = {}
>     for i in range(10000):
>         ## w = gen_rand_word(10)
>         w = gen_rand_string(10)
>         count_chars(chardict, w)
>     counts = list(chardict.items())
>     counts.sort(key = operator.itemgetter(1), reverse = True)
>     for char, count in counts:
>         print char, count
>
>
> s 3966
> d 3912
> g 3909
> h 3905
> a 3901
> u 3900
> q 3891
> m 3888
> k 3884
> b 3878
> x 3875
> v 3867
> w 3864
> y 3851
> l 3825
> z 3821
> c 3819
> e 3819
> r 3816
> n 3808
> o 3797
> f 3795
> t 3784
> p 3765
> j 3730
> i 3704
>
> Better, although still not perfect.

This distribution is well within expectations.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
Re: Is this secure?
Robert Kern writes:
> I will repeat my advice to just use random.SystemRandom.choice()
> instead of trying to interpret the bytes from /dev/urandom directly.

SystemRandom is something pretty new so I wasn't aware of it. But yeah, if I were thinking more clearly I would have suggested os.urandom instead of opening /dev/urandom.
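For reference, a minimal Python 3 sketch of the os.urandom route mentioned above. The helper name random_bytes is made up for illustration; os.urandom itself is the standard-library call.

```python
import os

def random_bytes(n):
    """Return n bytes from the operating system's random source."""
    return os.urandom(n)

data = random_bytes(10)
# bytes.hex() gives a printable form of the raw bytes.
print(len(data), data.hex())
```

This avoids opening /dev/urandom by hand and also works on platforms that have no such device file.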
Re: Is this secure?
On 2010-02-24 13:09 PM, mk wrote:
> On 2010-02-24 20:01, Robert Kern wrote:
>> I will repeat my advice to just use random.SystemRandom.choice() instead
>> of trying to interpret the bytes from /dev/urandom directly.
>
> Oh I hear you -- for production use I would (will) certainly consider
> this. However, now I'm interested in the problem itself: why is the damn
> distribution not uniform?

You want "< 234", not "< 235". (234 % 26 == 0), so you get some extra 'a's.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco
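The off-by-one can be checked by counting how many byte values land on each letter for a given cutoff. This is only a sketch; letter_hits is a hypothetical helper, and it uses the same value % 26 mapping as the code in the thread.

```python
from collections import Counter
import string

def letter_hits(cutoff):
    """Count how many byte values below `cutoff` map to each letter
    under the chr(ord('a') + value % 26) mapping."""
    return Counter(string.ascii_lowercase[v % 26] for v in range(cutoff))

for cutoff in (235, 234):
    hits = letter_hits(cutoff)
    print(cutoff, "a:", hits["a"], "z:", hits["z"])
```

With a cutoff of 235 the byte value 234 is still accepted and maps to 'a', giving that letter ten source values against nine for every other letter. With 234 every letter has exactly nine.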
Re: Is this secure?
On 2010-02-24 20:01, Robert Kern wrote:
> I will repeat my advice to just use random.SystemRandom.choice() instead
> of trying to interpret the bytes from /dev/urandom directly.

Out of curiosity:

def gen_rand_string(length):
    prng = random.SystemRandom()
    chars = []
    for i in range(length):
        chars.append(prng.choice('abcdefghijklmnopqrstuvwxyz'))
    return ''.join(chars)

if __name__ == "__main__":
    chardict = {}
    for i in range(10000):
        ##w = gen_rand_word(10)
        w = gen_rand_string(10)
        count_chars(chardict, w)
    counts = list(chardict.items())
    counts.sort(key = operator.itemgetter(1), reverse = True)
    for char, count in counts:
        print char, count

s 3966
d 3912
g 3909
h 3905
a 3901
u 3900
q 3891
m 3888
k 3884
b 3878
x 3875
v 3867
w 3864
y 3851
l 3825
z 3821
c 3819
e 3819
r 3816
n 3808
o 3797
f 3795
t 3784
p 3765
j 3730
i 3704

Better, although still not perfect.

Regards,
mk
Re: Is this secure?
On 2010-02-24 20:01, Robert Kern wrote:
> I will repeat my advice to just use random.SystemRandom.choice() instead
> of trying to interpret the bytes from /dev/urandom directly.

Oh I hear you -- for production use I would (will) certainly consider this. However, now I'm interested in the problem itself: why is the damn distribution not uniform?

Regards,
mk
Re: Is this secure?
On 2010-02-24 12:35 PM, mk wrote:
> While with this:
>
> def gen_rand_word(n):
>     with open('/dev/urandom') as f:
>         return ''.join([chr(ord('a') + ord(x) % 26) for x in f.read(n) if ord(x) < 235])
>
> a 3852
> ...
>
> 1. I'm systematically getting 'a' outlier: have no idea why for now.
>
> 2. This is somewhat better (except 'a') but still not uniform.

I will repeat my advice to just use random.SystemRandom.choice() instead of trying to interpret the bytes from /dev/urandom directly.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco
Re: Is this secure?
On 2010-02-24 18:56, Michael Rudolf wrote:
> The reason is 256 % 26 != 0
> 256 mod 26 equals 22, thus your code is hitting a-v about 10% (256/26 is
> approx. 10) more often than w-z.

writing secure code is hard...

I'm going to switch to PHP: Python world wouldn't lose much, but PHP would gain a lot.

> You might want to skip the values 0-22 to achieve a truly uniform
> distribution.

Hmm perhaps you meant to skip values over 256 - 22 ?

Because I'm getting this (reduced the run to 1000 generated strings):

def gen_rand_word(n):
    with open('/dev/urandom') as f:
        return ''.join([chr(ord('a') + ord(x) % 26) for x in f.read(n) if ord(x) > 22])

z 3609
b 3608
s 3567
e 3559
j 3556
r 3555
g 3548
p 3540
m 3538
q 3532
h 3528
y 3526
v 3524
i 3500
x 3496
c 3488
k 3488
l 3487
u 3487
a 3469
o 3465
d 3455
t 3439
f 3437
n 3417
w 3175

While with this:

def gen_rand_word(n):
    with open('/dev/urandom') as f:
        return ''.join([chr(ord('a') + ord(x) % 26) for x in f.read(n) if ord(x) < 235])

a 3852
w 3630
s 3623
v 3582
y 3569
p 3568
c 3558
k 3558
b 3556
r 3553
x 3546
m 3534
n 3522
o 3515
h 3510
d 3505
u 3487
t 3486
i 3482
f 3477
e 3474
g 3460
q 3453
l 3437
z 3386
j 3382

1. I'm systematically getting 'a' outlier: have no idea why for now.

2. This is somewhat better (except 'a') but still not uniform.

> FYI: Electronic Cash PINs in europe (dont know about the rest of the
> world) were computed the same way (random hexdigit and just mod it when
> it's too large) leading to a high probability that your first digit was
> a 1 :)

Schadenfreude is deriving joy from others' misfortunes; what is the German word, if any, for deriving solace from others' misfortunes? ;-)

Regards,
mk
Re: Is this secure?
On 2010-02-24 18:59, Steve Holden wrote:
>> Aw shucks when will I learn to do the stuff in 3 lines well instead of
>> 20, poorly. :-/
>
> When you've got as much experience as Paul?

And how much experience does Paul have? (this is mostly not a facile question)

For my part, my more serious effort (on and off) with programming in Python is under a year.

Regards,
mk
Re: Is this secure?
On 24.02.2010 18:23, mk wrote:
> Even then I'm not getting completely uniform distribution for some reason:
>
> d 39411
> l 39376
> f 39288
> a 39275
> s 39225
> r 39172
> p 39159
> t 39073
> k 39071
> u 39064
> e 39005
> o 39005
> n 38995
> j 38993
> h 38975
> q 38958
> c 38938
> b 38906
> g 38894
> i 38847
> m 38819
> v 38712
> z 35321
> y 35228
> w 35189
> x 35075
>
> Code:
>
> import operator
>
> def gen_rand_word(n):
>     with open('/dev/urandom') as f:
>         return ''.join([chr(ord('a') + ord(x) % 26) for x in f.read(n)])

The reason is 256 % 26 != 0
256 mod 26 equals 22, thus your code is hitting a-v about 10% (256/26 is approx. 10) more often than w-z.

You might want to skip the values 0-22 to achieve a truly uniform distribution.

FYI: Electronic Cash PINs in europe (dont know about the rest of the world) were computed the same way (random hexdigit and just mod it when it's too large) leading to a high probability that your first digit was a 1 :)

Regards,
Michael
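As a side note on the arithmetic: the range 0-22 contains 23 values, which leaves 233 accepted bytes, and 233 is not a multiple of 26. Discarding exactly 22 values (for instance 234-255) leaves 234 = 9 * 26, so each letter is backed by nine byte values. A minimal Python 3 sketch of that rejection approach follows; it uses os.urandom in place of reading /dev/urandom directly, and the constant names are chosen here for illustration.

```python
import os
import string

LETTERS = string.ascii_lowercase
# Largest multiple of 26 that fits in one byte: 9 * 26 = 234.
LIMIT = 256 - (256 % len(LETTERS))

def gen_rand_word(n):
    """Return n lowercase letters, discarding bytes >= LIMIT so that
    every letter is backed by the same number of byte values."""
    out = []
    while len(out) < n:
        for b in os.urandom(n):  # iterating over bytes yields ints in Python 3
            if b < LIMIT and len(out) < n:
                out.append(LETTERS[b % len(LETTERS)])
    return "".join(out)

print(gen_rand_word(10))
```

The loop keeps drawing fresh bytes until enough have been accepted, so the result always has the requested length.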
Re: Is this secure?
mk wrote:
> On 2010-02-24 03:50, Paul Rubin wrote:
>> The stuff about converting 4 random bytes to a decimal string and then
>> peeling off 2 digits at a time is pretty awful, and notice that since
>> 2**32 is 4294967296, in the cases where you get 10 digits, the first
>> 2-digit pair is never higher than 42.
>
> Yikes! I didn't think about that. This is probably where (some part of)
> probability skewing comes from.
>
> Anyway, the passwords for authorized users will be copied and pasted
> from email into the application GUI which will remember it for them,
> so they will not have to remember and type them in. So I have little in
> the way of limitations of password length - even though in *some* cases
> somebody might have to (or be ignorant enough) to retype the password
> instead of pasting it in.
>
> In that case the "diceware" approach is not necessary, even though I
> will certainly remember this approach for a case when users will have to
> remember & type the passwords in.
>
> The main application will access the data using HTTP (probably), so the
> main point is that an attacker is not able to guess passwords using
> brute force.
>
> Using A-z with 10-char password seems to provide 3 orders of magnitude
> more combinations than a-z:
>
> >>> 57 ** 10
> 362033331456891249L
> >>> 25 ** 10
> 95367431640625L
>
> Even though I'm not sure it is worth it, assuming 1000 brute-force
> guesses per second (which over the web would amount pretty much to DOS),
> this would take # days:
>
> >>> 57 ** 10 / (1000 * 3600 * 24)
> 4190200595L
> >>> 25 ** 10 / (1000 * 3600 * 24)
> 1103789L
>
> Even then I'm not getting completely uniform distribution for some reason:
>
> d 39411
> l 39376
> f 39288
> a 39275
> s 39225
> r 39172
> p 39159
> t 39073
> k 39071
> u 39064
> e 39005
> o 39005
> n 38995
> j 38993
> h 38975
> q 38958
> c 38938
> b 38906
> g 38894
> i 38847
> m 38819
> v 38712
> z 35321
> y 35228
> w 35189
> x 35075
>
> Code:
>
> import operator
>
> def gen_rand_word(n):
>     with open('/dev/urandom') as f:
>         return ''.join([chr(ord('a') + ord(x) % 26) for x in f.read(n)])
>
> def count_chars(chardict, word):
>     for c in word:
>         try:
>             chardict[c] += 1
>         except KeyError:
>             chardict[c] = 0
>
> if __name__ == "__main__":
>     chardict = {}
>     for i in range(100000):
>         w = gen_rand_word(10)
>         count_chars(chardict, w)
>     counts = list(chardict.items())
>     counts.sort(key = operator.itemgetter(1), reverse = True)
>     for char, count in counts:
>         print char, count
>
>> I'd write your code something like this:
>>
>> nletters = 5
>>
>> def randomword(n):
>>     with open('/dev/urandom') as f:
>>         return ''.join([chr(ord('a')+ord(c)%26) for c in f.read(n)])
>>
>> print randomword(nletters)
>
> Aw shucks when will I learn to do the stuff in 3 lines well instead of
> 20, poorly. :-/
>
When you've got as much experience as Paul?

regards
 Steve
--
Steve Holden           +1 571 484 6266   +1 800 494 3119
PyCon is coming! Atlanta, Feb 2010  http://us.pycon.org/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
Re: Is this secure?
On 2010-02-24 03:50, Paul Rubin wrote:
> The stuff about converting 4 random bytes to a decimal string and then
> peeling off 2 digits at a time is pretty awful, and notice that since
> 2**32 is 4294967296, in the cases where you get 10 digits, the first
> 2-digit pair is never higher than 42.

Yikes! I didn't think about that. This is probably where (some part of) probability skewing comes from.

Anyway, the passwords for authorized users will be copied and pasted from email into the application GUI which will remember it for them, so they will not have to remember and type them in. So I have little in the way of limitations of password length - even though in *some* cases somebody might have to (or be ignorant enough) to retype the password instead of pasting it in.

In that case the "diceware" approach is not necessary, even though I will certainly remember this approach for a case when users will have to remember & type the passwords in.

The main application will access the data using HTTP (probably), so the main point is that an attacker is not able to guess passwords using brute force.

Using A-z with 10-char password seems to provide 3 orders of magnitude more combinations than a-z:

>>> 57 ** 10
362033331456891249L
>>> 25 ** 10
95367431640625L

Even though I'm not sure it is worth it, assuming 1000 brute-force guesses per second (which over the web would amount pretty much to DOS), this would take # days:

>>> 57 ** 10 / (1000 * 3600 * 24)
4190200595L
>>> 25 ** 10 / (1000 * 3600 * 24)
1103789L

Even then I'm not getting completely uniform distribution for some reason:

d 39411
l 39376
f 39288
a 39275
s 39225
r 39172
p 39159
t 39073
k 39071
u 39064
e 39005
o 39005
n 38995
j 38993
h 38975
q 38958
c 38938
b 38906
g 38894
i 38847
m 38819
v 38712
z 35321
y 35228
w 35189
x 35075

Code:

import operator

def gen_rand_word(n):
    with open('/dev/urandom') as f:
        return ''.join([chr(ord('a') + ord(x) % 26) for x in f.read(n)])

def count_chars(chardict, word):
    for c in word:
        try:
            chardict[c] += 1
        except KeyError:
            chardict[c] = 0

if __name__ == "__main__":
    chardict = {}
    for i in range(100000):
        w = gen_rand_word(10)
        count_chars(chardict, w)
    counts = list(chardict.items())
    counts.sort(key = operator.itemgetter(1), reverse = True)
    for char, count in counts:
        print char, count

> I'd write your code something like this:
>
> nletters = 5
>
> def randomword(n):
>     with open('/dev/urandom') as f:
>         return ''.join([chr(ord('a')+ord(c)%26) for c in f.read(n)])
>
> print randomword(nletters)

Aw shucks when will I learn to do the stuff in 3 lines well instead of 20, poorly. :-/

Regards,
mk
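One detail in the posted count_chars: the first sighting of a letter stores 0 rather than 1, so every total comes out one low (harmless at these sample sizes, but easy to avoid). A minimal Python 3 sketch with collections.Counter removes the special case. The sample words below are made up for illustration; the runs in the thread fed in words built from /dev/urandom.

```python
from collections import Counter

def count_chars(words):
    """Tally how often each character appears across an iterable of words."""
    counts = Counter()
    for word in words:
        counts.update(word)  # update() on a string counts its characters
    return counts

# Illustrative input only.
counts = count_chars(["abc", "abd", "aaa"])
for char, count in counts.most_common():
    print(char, count)
```

Counter.most_common() also replaces the manual sort on operator.itemgetter(1).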
Re: Is this secure?
Steven D'Aprano writes: > Given a random six character password taken out of an alphabet of 52 > characters, it takes over nine billion attempts to brute force it. > Reducing the alphabet by 50% cuts that down to less than 200 million. To > make up for that loss of 1 bit of entropy, you need two extra characters > in your password. One extra character comes pretty close (within 1.3 bits). Even two extra chars is probably (subjective) easier for a user to deal with than a completely random mixture of upper/lower case. You don't get the extra bit per character if that distribution is anything other than random, of course. For something like a web password (each guess takes a server hit), where the resource guarded is not very valuable, 5 chars is probably enough for most purposes. For something like an encryption key subject to offline attacks, 6 mixed-case characters will barely slow a real attacker down. As before, my suggestion is still diceware. I've used random alphanumerics in the past but they're too big a hassle, they have to be written down, etc. And of course, if you're doing something serious, use a hardware token. -- http://mail.python.org/mailman/listinfo/python-list
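The "within 1.3 bits" figure can be reproduced directly, assuming characters are drawn uniformly and independently. This is a small sketch; password_bits is a name chosen here for illustration.

```python
from math import log2

def password_bits(alphabet_size, length):
    """Entropy in bits of a password of `length` characters drawn
    uniformly and independently from `alphabet_size` symbols."""
    return length * log2(alphabet_size)

mixed = password_bits(52, 6)   # six mixed-case letters
lower7 = password_bits(26, 7)  # seven lowercase letters
lower8 = password_bits(26, 8)  # eight lowercase letters
print(round(mixed, 2), round(lower7, 2), round(lower8, 2))
```

Seven lowercase letters fall about 1.3 bits short of six mixed-case letters, and eight lowercase letters exceed them.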
Re: Is this secure?
On Tue, 23 Feb 2010 18:39:53 -0800, Paul Rubin wrote: > Steven D'Aprano writes: >> Paul, if you were anyone else, I'd be sneering uncontrollably about >> now, but you're not clueless about cryptography, so what have I missed? >> Why is reducing the number of distinct letters by more than 50% >> anything but a disaster? This makes the task of brute-forcing the >> password exponentially easier. > > Reducing the number of distinct letters by 50% decreases the entropy per > character by 1 bit. You say that as if 1 bit of entropy isn't much :) Given a random six character password taken out of an alphabet of 52 characters, it takes over nine billion attempts to brute force it. Reducing the alphabet by 50% cuts that down to less than 200 million. To make up for that loss of 1 bit of entropy, you need two extra characters in your password. > That stuff about mixing letters and digits and > funny symbols just makes the password a worse nuisance to remember and > type, for a small gain in entropy that you can compute and make up for. Well, everybody has their own ways of remembering passwords, and I'd much prefer to remember an eight character password with "funny symbols" that I chose myself, than a six character password with nothing but letters that was chosen for me. Of course, I flatter myself that I know how to choose good passwords, and I hate remembering long random strings even from a reduced alphabet (e.g. I hate memorizing eight digit phone numbers, and am completely incapable of remembering ten digit mobile phone numbers). -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Is this secure?
In message <7xwry39tpi@ruckus.brouhaha.com>, Paul Rubin wrote: > More generally still, passwords regardless of their entropy content are > a sucky way to encapsulate cryptographic secrets. They’re a shared secret. How else would you represent a shared secret, if not with a shared secret? -- http://mail.python.org/mailman/listinfo/python-list
Re: Is this secure?
Lie Ryan writes:
> If an attacker knows that the random number generator has an
> extreme skew and he knows the distribution of the letters, how much
> advantage would it give the attacker? My initial guess is that the more
> skewed the letters are, the better the advantage, since an attacker
> using brute-force can write his program to prefer the most likely letters?

A useable (conservative) estimate is that the attacker's workload is 1/p where p is the probability of the most likely password. That basically says the password strength can be measured by the min-entropy. Cryptographers often use that approach. If you want to be more precise, you can do a conditional probability calculation assuming the attacker works down the list of possible passwords in order of decreasing probability, stopping when they hit the right one.

More generally still, passwords regardless of their entropy content are a sucky way to encapsulate cryptographic secrets. We keep using them because every alternative has drawbacks of its own.
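As a rough illustration of the min-entropy estimate, the sketch below compares one letter produced by reducing a uniform byte with % 26 (the skew discussed elsewhere in the thread) against a uniform letter. The function name is an assumption; for independently chosen characters the per-character figure is multiplied by the password length.

```python
from collections import Counter
from math import log2

def min_entropy(probabilities):
    """Min-entropy in bits: -log2 of the most likely outcome's probability."""
    return -log2(max(probabilities))

# Letter probabilities when a uniform byte is reduced with % 26.
hits = Counter(v % 26 for v in range(256))
skewed = [n / 256 for n in hits.values()]
uniform = [1 / 26] * 26

print(round(min_entropy(skewed), 3), round(min_entropy(uniform), 3))
```

Under this measure the modulo skew costs only a few hundredths of a bit per letter, which fits the point that a mild skew can be made up for with slightly longer passwords.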
Re: Is this secure?
On 02/24/10 14:09, Robert Kern wrote:
> On 2010-02-23 20:43 , Steven D'Aprano wrote:
>> On Wed, 24 Feb 2010 02:40:13 +0000, Steven D'Aprano wrote:
>>
>>> On Tue, 23 Feb 2010 15:36:02 +0100, mk wrote:
>>>
>>>> The question is: is this secure? That is, can the string generated
>>>> this way be considered truly random?
>>>
>>> Putting aside the philosophical question of what "truly random" means, I
>>> presume you mean that the letters are uniformly distributed. The answer
>>> to that is, they don't like uniformly distributed.
>>
>> Er, they don't *look* uniformly distributed.
>>
>> (Of course, being random, perhaps they are and I just got unlucky.)
>
> You'd have to be very, *very* unlucky to get a sample of that size so
> far from uniformly distributed if the generating process actually were
> uniform.
>
> Of course, uniformity isn't really necessary. You just need enough
> entropy in the distribution (amongst other things like protection of the
> seed from being known or guessed). A skewed distribution of characters
> is perfectly fine provided that you had enough characters in the
> password to meet the desired entropy requirement. A skewed distribution
> does require more characters to meet a specified entropy requirement
> than a uniform distribution, of course.
>
> That said, for a naive strategy like "pick an independent random
> character, repeat", you should just use a uniform distribution. It makes
> the analysis easier. Worthwhile generators that give skewed
> distributions usually do so for a good reason, like generating
> pronounceable passwords.

If an attacker knows that the random number generator has an extreme skew and he knows the distribution of the letters, how much advantage would it give the attacker? My initial guess is that the more skewed the letters are, the better the advantage, since an attacker using brute-force can write his program to prefer the most likely letters?
Re: Is this secure?
On 2010-02-23 20:43 , Steven D'Aprano wrote:
> On Wed, 24 Feb 2010 02:40:13 +0000, Steven D'Aprano wrote:
>
>> On Tue, 23 Feb 2010 15:36:02 +0100, mk wrote:
>>
>>> The question is: is this secure? That is, can the string generated
>>> this way be considered truly random?
>>
>> Putting aside the philosophical question of what "truly random" means, I
>> presume you mean that the letters are uniformly distributed. The answer
>> to that is, they don't like uniformly distributed.
>
> Er, they don't *look* uniformly distributed.
>
> (Of course, being random, perhaps they are and I just got unlucky.)

You'd have to be very, *very* unlucky to get a sample of that size so far from uniformly distributed if the generating process actually were uniform.

Of course, uniformity isn't really necessary. You just need enough entropy in the distribution (amongst other things like protection of the seed from being known or guessed). A skewed distribution of characters is perfectly fine provided that you had enough characters in the password to meet the desired entropy requirement. A skewed distribution does require more characters to meet a specified entropy requirement than a uniform distribution, of course.

That said, for a naive strategy like "pick an independent random character, repeat", you should just use a uniform distribution. It makes the analysis easier. Worthwhile generators that give skewed distributions usually do so for a good reason, like generating pronounceable passwords.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco
Re: Is this secure?
mk writes:
>> You might look at the site www.diceware.com for an approach to this,
>> which you can implement with a program. The docs there are pretty
>> thoughtful and may help you understand the relevant issues.
>
> Thanks. But I would also be grateful for indicating what is wrong/ugly
> in my code.

The stuff about converting 4 random bytes to a decimal string and then peeling off 2 digits at a time is pretty awful, and notice that since 2**32 is 4294967296, in the cases where you get 10 digits, the first 2-digit pair is never higher than 42. There are also some effects on the lower digits. The total entropy loss probably isn't fatal but as described, it's ugly.

I'd write your code something like this:

nletters = 5

def randomword(n):
    with open('/dev/urandom') as f:
        return ''.join([chr(ord('a')+ord(c)%26) for c in f.read(n)])

print randomword(nletters)

I wouldn't rely on a 5 letter combination for a high security application, but it might be ok for some low security purposes. Two random 5-letter combinations separated by a hyphen will be much better, and is probably easier to type than a solid block of 10 letters.
Re: Is this secure?
Steven D'Aprano writes: > Putting aside the philosophical question of what "truly random" means, I > presume you mean that the letters are uniformly distributed. The answer > to that is, they don't like uniformly distributed. That is a good point, the way those letters are generated (through the decimal conversion stuff), they won't be all that uniform. -- http://mail.python.org/mailman/listinfo/python-list
Re: Is this secure?
Steven D'Aprano writes: > Paul, if you were anyone else, I'd be sneering uncontrollably about now, > but you're not clueless about cryptography, so what have I missed? Why is > reducing the number of distinct letters by more than 50% anything but a > disaster? This makes the task of brute-forcing the password exponentially > easier. Reducing the number of distinct letters by 50% decreases the entropy per character by 1 bit. That stuff about mixing letters and digits and funny symbols just makes the password a worse nuisance to remember and type, for a small gain in entropy that you can compute and make up for. The main thing you have to make sure is that the min-entropy is sufficient for your purposes, and it's generally more convenient to do that by making the password a little bit longer than by imposing contortions on the person typing it. Ross Anderson's "Security Engineering" chapter about passwords is worth reading too: http://www.cl.cam.ac.uk/~rja14/Papers/SE-03.pdf When I mentioned entropy loss to the OP though, I mostly meant loss from getting rid of the letter z. The (binary) Shannon entropy of the uniform probability distribution on 26 letters is 4.7004397 bits; on 25 letters, it's 4.6438561 bits. The difference isn't enough to give an attacker that much advantage. I like the diceware approach to passphrase generation and I've been using it for years. www.diceware.com explains it in detail and the docs there are quite well-thought-out and informative. Keep in mind that the entropy needed for an online password (attacker has to make a server query for every guess, and hopefully gets locked out after n wrong tries) and an offline one (attacker has something like a hash of the password and can run a completely offline search) are different. 
Here is a program that I use sometimes:

from math import log
dictfile = '/usr/share/dict/words'

def genrandom(nbytes):
    with open('/dev/urandom') as f:
        return int(f.read(nbytes).encode('hex'), 16)

def main():
    wordlist = list(x.strip() for x in open(dictfile) if len(x) < 7)
    nwords = len(wordlist)
    print "%d words, entropy=%.3f bits/word"% (
        nwords, log(nwords, 2))
    print '-'.join(wordlist[genrandom(10)%nwords] for i in xrange(6))

main()
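The program above is Python 2 (print statements, xrange, str.encode('hex')). A rough Python 3 sketch of the same idea is given below, assuming os.urandom as the random source. The helper names load_words and passphrase and the tiny demo word list are made up for illustration; maxlen=5 roughly matches the original len(x) < 7 test, which counted the trailing newline.

```python
import os
from math import log2

def genrandom(nbytes):
    """Return a non-negative integer built from nbytes of os.urandom."""
    return int.from_bytes(os.urandom(nbytes), "big")

def load_words(path="/usr/share/dict/words", maxlen=5):
    """Read short words, one per line (path as in the post; it may not exist everywhere)."""
    with open(path) as f:
        words = (line.strip() for line in f)
        return [w for w in words if 0 < len(w) <= maxlen]

def passphrase(wordlist, nwords=6):
    """Join nwords randomly chosen words with hyphens."""
    n = len(wordlist)
    return "-".join(wordlist[genrandom(10) % n] for _ in range(nwords))

# Tiny stand-in list so the sketch runs anywhere; use load_words() for real output.
demo_words = ["apple", "brick", "cloud", "delta", "ember", "flint", "grape", "heron"]
print("%d words, entropy=%.3f bits/word" % (len(demo_words), log2(len(demo_words))))
print(passphrase(demo_words))
```

With a real dictionary the bits-per-word figure is far higher than for the eight-word demo list, which is there only to keep the example self-contained.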
Re: Is this secure?
On Wed, 24 Feb 2010 02:40:13 +0000, Steven D'Aprano wrote:
> On Tue, 23 Feb 2010 15:36:02 +0100, mk wrote:
>
>> The question is: is this secure? That is, can the string generated this
>> way be considered truly random?
>
> Putting aside the philosophical question of what "truly random" means, I
> presume you mean that the letters are uniformly distributed. The answer
> to that is, they don't like uniformly distributed.

Er, they don't *look* uniformly distributed.

(Of course, being random, perhaps they are and I just got unlucky.)

--
Steven
Re: Is this secure?
On Tue, 23 Feb 2010 15:36:02 +0100, mk wrote:
> The question is: is this secure? That is, can the string generated this
> way be considered truly random?

Putting aside the philosophical question of what "truly random" means, I presume you mean that the letters are uniformly distributed. The answer to that is, they don't like uniformly distributed.

This isn't a sophisticated statistical test, it's the equivalent of a back-of-the-envelope calculation: I generated 100,000 random strings with your code, and counted how often each letter appears.

If the letters are uniformly distributed, you would expect all the numbers to be quite close, but instead they range from 15063 to 25679:

{'a': 15063, 'c': 20105, 'b': 15100, 'e': 25465, 'd': 25458,
 'g': 25597, 'f': 25589, 'i': 25045, 'h': 25679, 'k': 22945,
 'j': 25531, 'm': 16187, 'l': 16252, 'o': 16076, 'n': 16012,
 'q': 16069, 'p': 16119, 's': 16088, 'r': 16087, 'u': 15951,
 't': 16081, 'w': 16236, 'v': 15893, 'y': 15834, 'x': 15956}

Eye-balling it, it looks vaguely two-humped, one hump around 15-16K, the second around 22-25K. Sure enough, here's a quick-and-dirty graph:

[ASCII bar graph of the counts above, one row per letter from a to y]

The mean of the counts is 19056.72, and the mean deviation is 3992.28. While none of this is statistically sophisticated, it does indicate to me that your function is nowhere even close to uniform. It has a very strong bias.

--
Steven
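The two summary figures can be recomputed from the posted counts. The sketch below takes "mean deviation" to be the mean absolute deviation from the mean, which is an interpretation rather than something the post spells out.

```python
# Per-letter counts exactly as posted above (there is no 'z' in the sample).
counts = {
    'a': 15063, 'c': 20105, 'b': 15100, 'e': 25465, 'd': 25458,
    'g': 25597, 'f': 25589, 'i': 25045, 'h': 25679, 'k': 22945,
    'j': 25531, 'm': 16187, 'l': 16252, 'o': 16076, 'n': 16012,
    'q': 16069, 'p': 16119, 's': 16088, 'r': 16087, 'u': 15951,
    't': 16081, 'w': 16236, 'v': 15893, 'y': 15834, 'x': 15956,
}

values = list(counts.values())
mean = sum(values) / len(values)
# Mean absolute deviation: average distance of each count from the mean.
mean_dev = sum(abs(v - mean) for v in values) / len(values)

print(round(mean, 2), round(mean_dev, 2))
```

Under that interpretation the rounded results agree with the 19056.72 and 3992.28 quoted in the post.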
Re: Is this secure?
On Tue, 23 Feb 2010 11:19:59 -0800, Paul Rubin wrote: > mk writes: >> I need to generate passwords and I think that pseudo-random generator >> is not good enough, frankly. So I wrote this function:... The question >> is: is this secure? That is, can the string generated this way be >> considered truly random? (I abstract from not-quite-perfect nature of >> /dev/urandom at the moment; I can always switch to /dev/random which is >> better) > > urandom is fine and the entropy loss from the numeric conversions and > eliminating 'z' in that code before you get letters out is not too bad. What? You're going from a possible alphabet of 62 (excluding punctuation) or 94 (inc punctuation available on an American keyboard) distinct letters down to 25, and you say that's "not too bad"? Paul, if you were anyone else, I'd be sneering uncontrollably about now, but you're not clueless about cryptography, so what have I missed? Why is reducing the number of distinct letters by more than 50% anything but a disaster? This makes the task of brute-forcing the password exponentially easier. Add the fact that the passwords are so short (as little as two characters in my tests) and this is about as far from secure as it is possible to be. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Is this secure?
On Tue, 23 Feb 2010 15:36:02 +0100, mk wrote: > Hello, > > I need to generate passwords and I think that pseudo-random generator is > not good enough, frankly. So I wrote this function: [snip] > (yes I know that this way generated string will not contain 'z' because > 99/4 + 97 = 121 which is 'y') You're worried about the security of the PRNG but then generate a TWO to FIVE character lowercase password with no digits, punctuation or the letter 'z'? That's priceless! Python's PRNG is not suitable for producing cryptographically strong streams of random bytes, but it is perfectly strong enough for generating good passwords. > The question is: is this secure? No. You are wasting your time trying to fix something which isn't a problem, and introducing a much bigger problem instead. You are MUCH MUCH MUCH better off with a six or ten character password taken from upper and lowercase letters, plus digits, plus punctuation, than a four digit password taken from lowercase letters only. Even if the first case has some subtle statistical deviation from uniformity, and the second is "truly random" (whatever that means), it doesn't matter. Nobody is going to crack your password because the password generator is 0.01% more likely to generate a "G" than a "q". But they *will* brute- force your password if you have a four digit password taken from a-y only. > That is, can the string generated this > way be considered truly random? Define truly random. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Is this secure?
In message , mk wrote:
> I need to generate passwords and I think that pseudo-random generator is
> not good enough, frankly. So I wrote this function:

Much simpler:

import subprocess

data, _ = subprocess.Popen \
  (
    args = ("pwgen", "-nc"),
    stdout = subprocess.PIPE
  ).communicate()
print data
Re: Is this secure?
On 2010-02-23 13:59 PM, mk wrote:
> On Feb 23, 7:19 pm, Paul Rubin wrote:
>
>> The code is pretty ugly. The main problem is you end up with a password
>> that's usually 5 letters but sometimes just 4 or fewer.
>
> Well I didn't write the whole thing here, in actual use I'd write a loop
> repeating the function until I have enough characters and then I'd
> select a substring of specified length.
>
> Anything else in the code that is ugly and I should correct?

I would recommend using random.SystemRandom.choice() on a sequence of acceptable characters. E.g. (untested)

import random
import string

characters = string.letters + string.digits + '~...@#$%^&*()-+=,;./\?><|'
# ... or whatever.

def gen_rand_string(length):
    prng = random.SystemRandom()
    chars = []
    for i in range(length):
        chars.append(prng.choice(characters))
    return ''.join(chars)

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco
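A Python 3 rendering of the same suggestion, as a sketch: string.letters exists only in Python 2, so string.ascii_letters is used here, and the punctuation set is an arbitrary stand-in chosen for illustration rather than the one from the post.

```python
import random
import string

# Assumed character set for illustration; adjust to whatever the application accepts.
CHARACTERS = string.ascii_letters + string.digits + "-_.+="

def gen_rand_string(length, characters=CHARACTERS):
    """Build a string of `length` characters chosen with the OS-backed SystemRandom."""
    prng = random.SystemRandom()
    return "".join(prng.choice(characters) for _ in range(length))

print(gen_rand_string(12))
```

SystemRandom.choice() handles the mapping from random bits to an index itself, so the modulo skew discussed elsewhere in the thread does not have to be worked around by hand.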
Re: Is this secure?
On 2010-02-23 13:19 PM, Paul Rubin wrote: I find it's most practical to use a few random words (chosen from a word list like /usr/dict/words) rather than random letters. Words are easier to remember and type. You might look at the site www.diceware.com for an approach to this, which you can implement with a program. The docs there are pretty thoughtful and may help you understand the relevant issues. I like RFC 1751 for this: http://gitweb.pycrypto.org/?p=crypto/pycrypto-2.x.git;a=blob;f=lib/Crypto/Util/RFC1751.py;h=1c98a212c22066adabfee521b495eeb4f9d7232b;hb=HEAD Shortened URL: http://tr.im/Pv9B -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco -- http://mail.python.org/mailman/listinfo/python-list
Re: Is this secure?
On Feb 23, 7:19 pm, Paul Rubin wrote:
> The code is pretty ugly. The main problem is you end up with a password
> that's usually 5 letters but sometimes just 4 or fewer.

Well I didn't write the whole thing here, in actual use I'd write a loop repeating the function until I have enough characters and then I'd select a substring of specified length.

Anything else in the code that is ugly and I should correct?

> You might look at the site www.diceware.com for an approach to this,
> which you can implement with a program. The docs there are pretty
> thoughtful and may help you understand the relevant issues.

Thanks. But I would also be grateful for indicating what is wrong/ugly in my code.

Regards,
mk
Re: Is this secure?
mk writes: > I need to generate passwords and I think that pseudo-random generator > is not good enough, frankly. So I wrote this function:... > The question is: is this secure? That is, can the string generated > this way be considered truly random? (I abstract from > not-quite-perfect nature of /dev/urandom at the moment; I can always > switch to /dev/random which is better) urandom is fine and the entropy loss from the numeric conversions and eliminating 'z' in that code before you get letters out is not too bad. The code is pretty ugly. The main problem is you end up with a password that's usually 5 letters but sometimes just 4 or fewer. Passwords that short are vulnerable to dictionary attacks. Longer passwords made from random letters are difficult to remember. I find it's most practical to use a few random words (chosen from a word list like /usr/dict/words) rather than random letters. Words are easier to remember and type. You might look at the site www.diceware.com for an approach to this, which you can implement with a program. The docs there are pretty thoughtful and may help you understand the relevant issues. -- http://mail.python.org/mailman/listinfo/python-list