I kind of doubt that `readbytes` is all that slow; I wonder whether Python is doing something besides actually reading from /dev/urandom, or whether these are somehow otherwise not really comparable bits of code.
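One way to check that suspicion is to time `os.urandom` against an explicit read from an already-open `/dev/urandom` file object. This is only a rough sketch (absolute timings will vary by platform, and newer CPython versions may service `os.urandom` via the `getrandom()` syscall rather than the device file):

```python
import os
import timeit

N = 10000
NBYTES = 32

# Time os.urandom, which on many platforms reads /dev/urandom
# (newer CPythons may use the getrandom() syscall instead).
t_urandom = timeit.timeit(lambda: os.urandom(NBYTES), number=N)

# Time an explicit read from /dev/urandom, reusing one open file
# object so we measure the read itself, not open()/close().
f = open("/dev/urandom", "rb")
t_devrandom = timeit.timeit(lambda: f.read(NBYTES), number=N)
f.close()

print("os.urandom:   %.6f s for %d calls" % (t_urandom, N))
print("/dev/urandom: %.6f s for %d calls" % (t_devrandom, N))
```

If the two numbers are close, the system-call cost dominates and the benchmarks are comparing the surrounding string/number handling, not the entropy source.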
On Sun, Oct 19, 2014 at 6:57 AM, Tim Holy <[email protected]> wrote:
> If you profile, you'll discover that this is simply measuring the
> performance of `readbytes!`. Can you reduce your example to focus on this
> point and file as an issue?
>
> Best,
> --Tim
>
> On Saturday, October 18, 2014 01:19:50 PM alexander maznev wrote:
> > Apologies for the semi-redundant theme :)
> >
> > I noticed the Julia rand function did not seem to work with BigInt
> > numbers (I don't know if there is a different one somewhere else that
> > does), but it's only a few lines to add one. For example, the python
> > ecdsa library uses the following custom randrange, which is nice and
> > concise, but the implementation is rather naive in that it rounds the
> > range up to a whole number of bytes, meaning a worst-case average of
> > 256 reads from urandom for ranges of the form 256**k + 1.
> >
> > def randrange(order, entropy=None):
> >     if entropy is None:
> >         entropy = os.urandom
> >     assert order > 1
> >     bytes = orderlen(order)
> >     dont_try_forever = 10000
> >     while dont_try_forever > 0:
> >         dont_try_forever -= 1
> >         candidate = string_to_number(entropy(bytes)) + 1
> >         if 1 <= candidate < order:
> >             return candidate, (10000 - dont_try_forever)
> >         continue
> >     raise RuntimeError("randrange() tried hard but gave up, either"
> >                        " something is very wrong or you got realllly"
> >                        " unlucky. Order was %x" % order)
> >
> > Julia (bit-based version):
> >
> > function randrange(order)
> >     upper_2 = length(base(2, order-1))
> >     upper_256 = int(upper_2/8 + .5); tries = 0
> >     f = open("/dev/urandom")
> >     while true
> >         tries += 1
> >         ent_256 = readbytes(f, upper_256)
> >         ent_2 = ""
> >         for x in ent_256
> >             ent_2 = *(ent_2, base(2, x, 8))
> >         end
> >         rand_num = parseint(BigInt, ent_2[1:upper_2], 2)
> >         if rand_num < order
> >             close(f); return rand_num, tries
> >         else
> >             continue
> >         end
> >     end
> > end
> >
> > function timeit(n=1000)
> >     t = time(); tries = 0; order = BigInt(256)^31
> >     for i in range(1, n)
> >         x, tr = randrange(order)
> >         tries += tr
> >     end
> >     @printf("avg run. time: %f, avg num. tries: %f \n",
> >             ((time() - t) / n), tries/n)
> > end
> >
> > Turns out the Julia randrange is slower, even though its worst-case
> > average number of reads is only 2. At a power of 31, ent_256 is 32
> > bytes, so 32 string concatenations and 32 base-2 conversions -- but it
> > still comes out slower than the python randrange (even with its average
> > of 256 os.urandom calls). If one goes up to a bigger number, e.g.
> > 256^751, the smart bit-collecting randrange overtakes the naive python
> > one. On the other hand, below is a bit-based version of the python
> > randrange. It beats the Julia one at all sizes by a factor of about 25x.
> >
> > python (bit-based version, worst-case average 2 reads):
> >
> > import os
> > from numpy import base_repr as base
> >
> > def randrange(order):
> >     upper_2 = (order-1).bit_length()
> >     upper_256 = int(upper_2/8 + 1); tries = 0
> >     while True:
> >         tries += 1
> >         ent_256 = os.urandom(upper_256)
> >         ent_2 = ""
> >         for x in ent_256:
> >             ent_2 += base(x, 2, 8)[-8:]
> >         rand_num = int(ent_2[:upper_2], base=2)
> >         if rand_num < order:
> >             return rand_num, tries
> >         else:
> >             continue
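For what it's worth, both bit-based versions above spend much of their time building a base-2 string one byte at a time; the same top-bit trimming can be done with plain integer arithmetic, which sidesteps that cost entirely. A minimal Python 3 sketch (note it samples [0, order) rather than the [1, order) of the ecdsa version, and `int.from_bytes` is Python 3 only):

```python
import os

def randrange(order):
    """Uniform integer in [0, order) by rejection sampling,
    dropping the excess high bits with a shift instead of
    building a base-2 string per byte."""
    assert order > 1
    upper_2 = (order - 1).bit_length()   # bits needed
    upper_256 = (upper_2 + 7) // 8       # bytes needed, rounded up
    excess = upper_256 * 8 - upper_2     # high bits to discard
    tries = 0
    while True:
        tries += 1
        rand_num = int.from_bytes(os.urandom(upper_256), "big") >> excess
        if rand_num < order:
            return rand_num, tries

# With order = 256**31 + 1, upper_2 is 249 bits, so the acceptance
# probability is just over 1/2 and tries should average around 2.
```

Because the shift keeps exactly `upper_2` bits, the worst-case average stays at 2 reads, and the per-call work is one `os.urandom`, one `int.from_bytes`, and one shift.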
