Le dimanche 19 octobre 2014 à 13:14 -0400, Stefan Karpinski a écrit :
> That might be why Python has a special function for reading from it –
> specialized caching behavior.
Note that for those on bleeding-edge Linux distributions, a new system
call has just been added in 3.17 to get random numbers:
http://lwn.net/Articles/606141/

But my understanding is that for scientific applications (as opposed to
cryptographic ones), /dev/urandom isn't the best choice since reading
from it is quite expensive for the system due to security requirements,
and it costs entropy for programs which may really need it.


Regards

> 
> 
> On Oct 19, 2014, at 12:47 PM, Jameson Nash <[email protected]> wrote:
> 
> 
> > The documentation for os.urandom says that it reads /dev/urandom, so
> > that's probably not it. Is the Julia file layer caching too much? I
> > imagine that this syscall is expensive on a per byte basis, unlike
> > normal file reads where block reads are closer to constant cost. 
> > 
> > On Sunday, October 19, 2014, Stefan Karpinski <[email protected]>
> > wrote:
> >         I kind of doubt that readbytes is all that slow, I wonder if
> >         Python is doing something besides actually reading
> >         from /dev/urandom or if these are somehow otherwise not
> >         really comparable bits of code.
> >         
> >         On Sun, Oct 19, 2014 at 6:57 AM, Tim Holy
> >         <[email protected]> wrote:
> >                 If you profile, you'll discover that this is simply
> >                 measuring the performance
> >                 of `readbytes!`. Can you reduce your example to
> >                 focus on this point and file as
> >                 an issue?
> >                 
> >                 Best,
> >                 --Tim
> >                 
> >                 On Saturday, October 18, 2014 01:19:50 PM alexander
> >                 maznev wrote:
> >                 > Apologies for the semi-redundant theme :)
> >                 >
> >                 > I noticed the Julia rand function did not seem to
> >                 work with BigInt numbers
> >                 > (idk if there is a different one somewhere else
> >                 that does) but it's only a
> >                 > few lines to add one. For example the python ecdsa
> >                 library uses the
> >                 > following custom randrange, which is nice and
> >                 concise, but the
> >                 > implementation is rather naive in that it
> >                 approximates to the top byte,
> >                 > meaning a worst case mean of 256 reads from
> >                 urandom for ranges of the type
> >                 > 256**k + 1.
> >                 >
> >                 > def randrange(order, entropy=None):
> >                 >     if entropy is None:
> >                 >         entropy = os.urandom
> >                 >     assert order > 1
> >                 >     bytes = orderlen(order)
> >                 >     dont_try_forever = 10000
> >                 >     while dont_try_forever > 0:
> >                 >         dont_try_forever -= 1
> >                 >         candidate =
> >                 string_to_number(entropy(bytes)) + 1
> >                 >         if 1 <= candidate < order:
> >                 >             return candidate, (10000 -
> >                 dont_try_forever)
> >                 >         continue
> >                 >     raise RuntimeError("randrange() tried hard but
> >                 gave up, either
> >                 > something"
> >                 >                        " is very wrong or you got
> >                 realllly unlucky. Order
> >                 > was"
> >                 >                        " %x" % order)
> >                 >
> >                 > Julia (bit-based version).
> >                 >
> >                 > function randrange(order)
> >                 >     upper_2 = length(base(2, order-1));
> >                 >     upper_256 = int(upper_2/8 +.5); tries=0;
> >                 >     f= open("/dev/urandom")
> >                 >     while true
> >                 >         tries+=1
> >                 >         ent_256 = readbytes(f, upper_256)
> >                 >         ent_2 = ""
> >                 >         for x in ent_256
> >                 >             ent_2 = *(ent_2, base(2,x,8))
> >                 >         end
> >                 >         rand_num = parseint(BigInt,
> >                 ent_2[1:upper_2], 2)
> >                 >         if rand_num < order
> >                 >             close(f); return rand_num, tries
> >                 >         else continue;
> >                 >         end
> >                 >     end
> >                 > end
> >                 >
> >                 > function timeit(n=1000)
> >                 >     t = time(); tries = 0; order = BigInt(256)^31;
> >                 >     for i in range(1,n)
> >                 >         x, tr = randrange(order)
> >                 >         tries +=tr
> >                 >     end
> >                 >     @printf("avg run. time: %llf, avg num. tries:
> >                 %f \n", ((time() -t) / n
> >                 > ), tries/n)
> >                 >     end
> >                 >
> >                 >
> >                 >
> >                 > Turns out the Julia randrange is slower, even
> >                 though the average number of
> >                 > worst case reads is only 2, at a power of 31, that
> >                 will be a 32 byte
> >                 > ent_256, so 32 string concatenations, and 32 base2
> >                 conversions, - but it
> >                 > still comes out slower than the python randrange
> >                 (even with its average of
> >                 > 256 os.urandom calls). If one goes up to a bigger
> >                 number, i.e. 256^751, the
> >                 > smart bit collecting randrange overtakes the naive
> >                 python one. On the other
> >                 > hand below is a bit-based version of the python
> >                 randrange. It beats the
> >                 > Julia one at all sizes by a factor of about 25x.
> >                 >
> >                 > python (bit-based version, worst case average 2
> >                 reads)
> >                 > import os
> >                 > from numpy import base_repr as base
> >                 >
> >                 > def randrange(order):
> >                 >     upper_2 = (order-1).bit_length()
> >                 >     upper_256 = int(upper_2/8 + 1); tries=0
> >                 >     while True:
> >                 >         tries +=1
> >                 >         ent_256 = os.urandom(upper_256)
> >                 >         ent_2 = ""
> >                 >         for x in ent_256:
> >                 >             ent_2 += base(x, 2, 8)[-8:]
> >                 >         rand_num = int(ent_2[:upper_2], base=2)
> >                 >         if rand_num < order:
> >                 >             return rand_num, tries
> >                 >         else:continue
> >                 
> >                 
> >         
> >         

Reply via email to