Re: [julia-users] BigInt randrange 25x slower than in python

Jameson Nash Sun, 19 Oct 2014 09:48:08 -0700

The documentation for os.urandom says that it reads /dev/urandom, so that's
probably not it. Is the Julia file layer caching too much? I imagine that
this syscall is expensive on a per byte basis, unlike normal file reads
where block reads are closer to constant cost.


On Sunday, October 19, 2014, Stefan Karpinski <[email protected]> wrote:

> I kind of doubt that readbytes is all that slow, I wonder if Python is
> doing something besides actually reading from /dev/urandom or if these are
> somehow otherwise not really comparable bits of code.
>
> On Sun, Oct 19, 2014 at 6:57 AM, Tim Holy <[email protected]
> <javascript:_e(%7B%7D,'cvml','[email protected]');>> wrote:
>
>> If you profile, you'll discover that this is simply measuring the
>> performance
>> of `readbytes!`. Can you reduce your example to focus on this point and
>> file as
>> an issue?
>>
>> Best,
>> --Tim
>>
>> On Saturday, October 18, 2014 01:19:50 PM alexander maznev wrote:
>> > Apologies for the semi-redundant theme :)
>> >
>> > I noticed the Julia rand function did not seem to work with BigInt
>> numbers
>> > (idk if there is a different one somewhere else that does) but it's
>> only a
>> > few lines to add one. For example the python ecdsa library uses the
>> > following custom randrange, which is nice and concise, but the
>> > implementation is rather naive in that it approximates to the top byte,
>> > meaning a worst case mean of 256 reads from urandom for ranges of the
>> type
>> > 256**k + 1.
>> >
>> > def randrange(order, entropy=None):
>> >     if entropy is None:
>> >         entropy = os.urandom
>> >     assert order > 1
>> >     bytes = orderlen(order)
>> >     dont_try_forever = 10000
>> >     while dont_try_forever > 0:
>> >         dont_try_forever -= 1
>> >         candidate = string_to_number(entropy(bytes)) + 1
>> >         if 1 <= candidate < order:
>> >             return candidate, (10000 - dont_try_forever)
>> >         continue
>> >     raise RuntimeError("randrange() tried hard but gave up, either
>> > something"
>> >                        " is very wrong or you got realllly unlucky.
>> Order
>> > was"
>> >                        " %x" % order)
>> >
>> > Julia (bit-based version).
>> >
>> > function randrange(order)
>> >     upper_2 = length(base(2, order-1));
>> >     upper_256 = int(upper_2/8 +.5); tries=0;
>> >     f= open("/dev/urandom")
>> >     while true
>> >         tries+=1
>> >         ent_256 = readbytes(f, upper_256)
>> >         ent_2 = ""
>> >         for x in ent_256
>> >             ent_2 = *(ent_2, base(2,x,8))
>> >         end
>> >         rand_num = parseint(BigInt, ent_2[1:upper_2], 2)
>> >         if rand_num < order
>> >             close(f); return rand_num, tries
>> >         else continue;
>> >         end
>> >     end
>> > end
>> >
>> > function timeit(n=1000)
>> >     t = time(); tries = 0; order = BigInt(256)^31;
>> >     for i in range(1,n)
>> >         x, tr = randrange(order)
>> >         tries +=tr
>> >     end
>> >     @printf("avg run. time: %llf, avg num. tries: %f \n", ((time() -t)
>> / n
>> > ), tries/n)
>> >     end
>> >
>> >
>> >
>> > Turns out the Julia randrange is slower, even though the average number
>> of
>> > worst case reads is only 2, at a power of 31, that will be a 32 byte
>> > ent_256, so 32 string concatenations, and 32 base2 conversions, - but it
>> > still comes out slower than the python randrange (even with its average
>> of
>> > 256 os.urandom calls). If one goes up to a bigger number, i.e. 256^751,
>> the
>> > smart bit collecting randrange overtakes the naive python one. On the
>> other
>> > hand below is a bit-based version of the python randrange. It beats the
>> > Julia one at all sizes by a factor of about 25x.
>> >
>> > python (bit-based version, worst case average 2 reads)
>> > import os
>> > from numpy import base_repr as base
>> >
>> > def randrange(order):
>> >     upper_2 = (order-1).bit_length()
>> >     upper_256 = int(upper_2/8 + 1); tries=0
>> >     while True:
>> >         tries +=1
>> >         ent_256 = os.urandom(upper_256)
>> >         ent_2 = ""
>> >         for x in ent_256:
>> >             ent_2 += base(x, 2, 8)[-8:]
>> >         rand_num = int(ent_2[:upper_2], base=2)
>> >         if rand_num < order:
>> >             return rand_num, tries
>> >         else:continue
>>
>>
>

Re: [julia-users] BigInt randrange 25x slower than in python

Reply via email to