That might be why Python has a special function for reading from it – specialized caching behavior.
> On Oct 19, 2014, at 12:47 PM, Jameson Nash <[email protected]> wrote: > > The documentation for os.urandom says that it reads /dev/urandom, so that's > probably not it. Is the Julia file layer caching too much? I imagine that > this syscall is expensive on a per byte basis, unlike normal file reads where > block reads are closer to constant cost. > >> On Sunday, October 19, 2014, Stefan Karpinski <[email protected]> wrote: >> I kind of doubt that readbytes is all that slow, I wonder if Python is doing >> something besides actually reading from /dev/urandom or if these are somehow >> otherwise not really comparable bits of code. >> >>> On Sun, Oct 19, 2014 at 6:57 AM, Tim Holy <[email protected]> wrote: >>> If you profile, you'll discover that this is simply measuring the >>> performance >>> of `readbytes!`. Can you reduce your example to focus on this point and >>> file as >>> an issue? >>> >>> Best, >>> --Tim >>> >>> On Saturday, October 18, 2014 01:19:50 PM alexander maznev wrote: >>> > Apologies for the semi-redundant theme :) >>> > >>> > I noticed the Julia rand function did not seem to work with BigInt numbers >>> > (idk if there is a different one somewhere else that does) but it's only a >>> > few lines to add one. For example the python ecdsa library uses the >>> > following custom randrange, which is nice and concise, but the >>> > implementation is rather naive in that it approximates to the top byte, >>> > meaning a worst case mean of 256 reads from urandom for ranges of the type >>> > 256**k + 1. >>> > >>> > def randrange(order, entropy=None): >>> > if entropy is None: >>> > entropy = os.urandom >>> > assert order > 1 >>> > bytes = orderlen(order) >>> > dont_try_forever = 10000 >>> > while dont_try_forever > 0: >>> > dont_try_forever -= 1 >>> > candidate = string_to_number(entropy(bytes)) + 1 >>> > if 1 <= candidate < order: >>> > return candidate, (10000 - dont_try_forever) >>> > continue >>> > raise RuntimeError("randrange() tried hard but gave up, either >>> > something" >>> > " is very wrong or you got realllly unlucky. Order >>> > was" >>> > " %x" % order) >>> > >>> > Julia (bit-based version). >>> > >>> > function randrange(order) >>> > upper_2 = length(base(2, order-1)); >>> > upper_256 = int(upper_2/8 +.5); tries=0; >>> > f= open("/dev/urandom") >>> > while true >>> > tries+=1 >>> > ent_256 = readbytes(f, upper_256) >>> > ent_2 = "" >>> > for x in ent_256 >>> > ent_2 = *(ent_2, base(2,x,8)) >>> > end >>> > rand_num = parseint(BigInt, ent_2[1:upper_2], 2) >>> > if rand_num < order >>> > close(f); return rand_num, tries >>> > else continue; >>> > end >>> > end >>> > end >>> > >>> > function timeit(n=1000) >>> > t = time(); tries = 0; order = BigInt(256)^31; >>> > for i in range(1,n) >>> > x, tr = randrange(order) >>> > tries +=tr >>> > end >>> > @printf("avg run. time: %llf, avg num. tries: %f \n", ((time() -t) / n >>> > ), tries/n) >>> > end >>> > >>> > >>> > >>> > Turns out the Julia randrange is slower, even though the average number of >>> > worst case reads is only 2, at a power of 31, that will be a 32 byte >>> > ent_256, so 32 string concatenations, and 32 base2 conversions, - but it >>> > still comes out slower than the python randrange (even with its average of >>> > 256 os.urandom calls). If one goes up to a bigger number, i.e. 256^751, >>> > the >>> > smart bit collecting randrange overtakes the naive python one. On the >>> > other >>> > hand below is a bit-based version of the python randrange. It beats the >>> > Julia one at all sizes by a factor of about 25x. >>> > >>> > python (bit-based version, worst case average 2 reads) >>> > import os >>> > from numpy import base_repr as base >>> > >>> > def randrange(order): >>> > upper_2 = (order-1).bit_length() >>> > upper_256 = int(upper_2/8 + 1); tries=0 >>> > while True: >>> > tries +=1 >>> > ent_256 = os.urandom(upper_256) >>> > ent_2 = "" >>> > for x in ent_256: >>> > ent_2 += base(x, 2, 8)[-8:] >>> > rand_num = int(ent_2[:upper_2], base=2) >>> > if rand_num < order: >>> > return rand_num, tries >>> > else:continue
