Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Chris Angelico
On Thu, 28 Jul 2022 at 05:36, Cecil Westerhof via Python-list
 wrote:
>
> Roel Schroeven  writes:
>
> > Cecil Westerhof via Python-list schreef op 27/07/2022 om 17:43:
> >> "Michael F. Stemper"  writes:
> >>
> >> > This is orthogonal to your question, but might be of some use to you:
> >> >
> >> > The combination of using len(to_try) as an argument to randint() and
> >> > saving the output to a variable named "index" suggests that you might
> >> > be setting up to select a random element from to_try, as in:
> >> >   something = to_try[index]
> >> >
> >> > If that is the case, you might want to consider using random.choice() 
> >> > instead:
> >> >
> >> >   >>> from random import choice
> >> >   >>> to_try = [2,3,5,7,11,13,"seventeen",19]
> >> >   >>> choice(to_try)
> >> >   2
> >> >   >>> choice(to_try)
> >> >   'seventeen'
> >> >   >>> choice(to_try)
> >> >   13
> >> >   >>> choice(to_try)
> >> >   5
> >> >   >>>
> >>
> >> Yes, I try to select a random element, but it has also to be removed,
> >> because an element should not be used more as once.
> >> This is the code I use:
> >>  # index = randbelow(len(to_try))
> >>  index = randrange(len(to_try))
> >>  found = permutation[to_try.pop(index)]
> > Do you know in advance how many items you'll need, or maybe an upper
> > limit on the amount? In that case it might be more efficient to use
> > random.sample(to_try, k=nr_items_needed).
>
> Something else to try. :-)
> And yes: I will be using half of the list.
>

A perfect job for random.sample() then, if that's *exactly* half the list.

But if you don't know for sure how many you'll need, an easy way to
make it more efficient would be to take the n'th element, then move
the last element down to the n'th position, and finally pop off the
last element. That way, you only ever pop the last element away. The
list will get reordered arbitrarily while you do this, but if the only
purpose of it is to remove elements from random consideration, that
won't be a problem.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Alan Bawden
Cecil Westerhof  writes:

   Alan Bawden  writes:

   > Cecil Westerhof  writes:
   >
   >Yes, I try to select a random element, but it has also to be removed,
   >because an element should not be used more as once.
   >
   > Instead of using pop to do that why not something like:
   >
   > def lazy_shuffle(seq):
   > """
   > Generate the elements of the given sequence in a random order.
   > """
   > # Delete the next line if you want to use a different version of
   > # randrange:
   > from random import randrange
   > # Delete the next line if SEQ is already a mutable sequence and you
   > # are willing to have it destroyed by this process:
   > seq = list(seq)
   > n = len(seq)
   > while n:
   > i = randrange(n)
   > yield seq[i]
   > n -= 1
   > if i < n:
   > seq[i] = seq[n]

   That looks interesting.
   But there is on problem. (I think.)
   The list is only partly eaten and I will eat a lot of sequences. So a
   lot of sequences will be left in the runtime.
   Or is there a way to destroy the lazy_shuffle when it is not needed
   anymore?

You don't have to worry about that.  That's what Garbage Collection is
for.  After you drop the last reference to the generator, the GC will
destroy it for you.  Welcome to Python.

BTW, what I wrote might be slightly faster if you replace it with:

while n:
i = randrange(n)
yield seq[i]
n -= 1
seq[i] = seq[n]

That will save you some time spent testing and branching.

-- 
Alan Bawden
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Cecil Westerhof via Python-list
Alan Bawden  writes:

> Cecil Westerhof  writes:
>
>Yes, I try to select a random element, but it has also to be removed,
>because an element should not be used more as once.
>
> Instead of using pop to do that why not something like:
>
> def lazy_shuffle(seq):
> """
> Generate the elements of the given sequence in a random order.
> """
> # Delete the next line if you want to use a different version of
> # randrange:
> from random import randrange
> # Delete the next line if SEQ is already a mutable sequence and you
> # are willing to have it destroyed by this process:
> seq = list(seq)
> n = len(seq)
> while n:
> i = randrange(n)
> yield seq[i]
> n -= 1
> if i < n:
> seq[i] = seq[n]

That looks interesting.
But there is on problem. (I think.)
The list is only partly eaten and I will eat a lot of sequences. So a
lot of sequences will be left in the runtime.
Or is there a way to destroy the lazy_shuffle when it is not needed
anymore?

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Cecil Westerhof via Python-list
MRAB  writes:

>>> When you pop an element from the last, the elements after it need to be
>>> moved down, which takes time.
>>>
>>> Try shuffling the list and then popping the now randomly-ordered
>>> elements off the end.
>> Would shuffling not be a lot more expensive? Especially because I do
>> not eat the whole list.
>> 
> You won't know until you time it.

A first indication is that it doubles the needed time.

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Cecil Westerhof via Python-list
Roel Schroeven  writes:

> Cecil Westerhof via Python-list schreef op 27/07/2022 om 17:43:
>> "Michael F. Stemper"  writes:
>>
>> > This is orthogonal to your question, but might be of some use to you:
>> >
>> > The combination of using len(to_try) as an argument to randint() and
>> > saving the output to a variable named "index" suggests that you might
>> > be setting up to select a random element from to_try, as in:
>> >   something = to_try[index]
>> >
>> > If that is the case, you might want to consider using random.choice() 
>> > instead:
>> >
>> >   >>> from random import choice
>> >   >>> to_try = [2,3,5,7,11,13,"seventeen",19]
>> >   >>> choice(to_try)
>> >   2
>> >   >>> choice(to_try)
>> >   'seventeen'
>> >   >>> choice(to_try)
>> >   13
>> >   >>> choice(to_try)
>> >   5
>> >   >>>
>>
>> Yes, I try to select a random element, but it has also to be removed,
>> because an element should not be used more as once.
>> This is the code I use:
>>  # index = randbelow(len(to_try))
>>  index = randrange(len(to_try))
>>  found = permutation[to_try.pop(index)]
> Do you know in advance how many items you'll need, or maybe an upper
> limit on the amount? In that case it might be more efficient to use 
> random.sample(to_try, k=nr_items_needed).

Something else to try. :-)
And yes: I will be using half of the list.

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread MRAB

On 27/07/2022 18:24, Cecil Westerhof via Python-list wrote:

MRAB  writes:


On 27/07/2022 16:43, Cecil Westerhof via Python-list wrote:

"Michael F. Stemper"  writes:


This is orthogonal to your question, but might be of some use to you:

The combination of using len(to_try) as an argument to randint() and
saving the output to a variable named "index" suggests that you might
be setting up to select a random element from to_try, as in:
  something = to_try[index]

If that is the case, you might want to consider using random.choice() instead:

  >>> from random import choice
  >>> to_try = [2,3,5,7,11,13,"seventeen",19]
  >>> choice(to_try)
  2
  >>> choice(to_try)
  'seventeen'
  >>> choice(to_try)
  13
  >>> choice(to_try)
  5
  >>>

Yes, I try to select a random element, but it has also to be removed,
because an element should not be used more as once.
This is the code I use:
 # index = randbelow(len(to_try))
 index = randrange(len(to_try))
 found = permutation[to_try.pop(index)]



When you pop an element from the last, the elements after it need to be
moved down, which takes time.

Try shuffling the list and then popping the now randomly-ordered
elements off the end.


Would shuffling not be a lot more expensive? Especially because I do
not eat the whole list.


You won't know until you time it.
--
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Alan Bawden
Cecil Westerhof  writes:

   Yes, I try to select a random element, but it has also to be removed,
   because an element should not be used more as once.

Instead of using pop to do that why not something like:

def lazy_shuffle(seq):
"""
Generate the elements of the given sequence in a random order.
"""
# Delete the next line if you want to use a different version of
# randrange:
from random import randrange
# Delete the next line if SEQ is already a mutable sequence and you
# are willing to have it destroyed by this process:
seq = list(seq)
n = len(seq)
while n:
i = randrange(n)
yield seq[i]
n -= 1
if i < n:
seq[i] = seq[n]

-- 
Alan Bawden
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Roel Schroeven

Cecil Westerhof via Python-list schreef op 27/07/2022 om 17:43:

"Michael F. Stemper"  writes:

> This is orthogonal to your question, but might be of some use to you:
>
> The combination of using len(to_try) as an argument to randint() and
> saving the output to a variable named "index" suggests that you might
> be setting up to select a random element from to_try, as in:
>   something = to_try[index]
>
> If that is the case, you might want to consider using random.choice() instead:
>
>   >>> from random import choice
>   >>> to_try = [2,3,5,7,11,13,"seventeen",19]
>   >>> choice(to_try)
>   2
>   >>> choice(to_try)
>   'seventeen'
>   >>> choice(to_try)
>   13
>   >>> choice(to_try)
>   5
>   >>>

Yes, I try to select a random element, but it has also to be removed,
because an element should not be used more as once.
This is the code I use:
 # index = randbelow(len(to_try))
 index = randrange(len(to_try))
 found = permutation[to_try.pop(index)]
Do you know in advance how many items you'll need, or maybe an upper 
limit on the amount? In that case it might be more efficient to use 
random.sample(to_try, k=nr_items_needed).


--
"Honest criticism is hard to take, particularly from a relative, a friend,
an acquaintance, or a stranger."
-- Franklin P. Jones

--
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Cecil Westerhof via Python-list
MRAB  writes:

> On 27/07/2022 16:43, Cecil Westerhof via Python-list wrote:
>> "Michael F. Stemper"  writes:
>> 
>>> This is orthogonal to your question, but might be of some use to you:
>>>
>>> The combination of using len(to_try) as an argument to randint() and
>>> saving the output to a variable named "index" suggests that you might
>>> be setting up to select a random element from to_try, as in:
>>>   something = to_try[index]
>>>
>>> If that is the case, you might want to consider using random.choice() 
>>> instead:
>>>
>>>   >>> from random import choice
>>>   >>> to_try = [2,3,5,7,11,13,"seventeen",19]
>>>   >>> choice(to_try)
>>>   2
>>>   >>> choice(to_try)
>>>   'seventeen'
>>>   >>> choice(to_try)
>>>   13
>>>   >>> choice(to_try)
>>>   5
>>>   >>>
>> Yes, I try to select a random element, but it has also to be removed,
>> because an element should not be used more as once.
>> This is the code I use:
>>  # index = randbelow(len(to_try))
>>  index = randrange(len(to_try))
>>  found = permutation[to_try.pop(index)]
>> 
>
> When you pop an element from the last, the elements after it need to be
> moved down, which takes time.
>
> Try shuffling the list and then popping the now randomly-ordered
> elements off the end.

Would shuffling not be a lot more expensive? Especially because I do
not eat the whole list.

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Cecil Westerhof via Python-list
Dennis Lee Bieber  writes:

> On Wed, 27 Jul 2022 10:45:47 +0200, Cecil Westerhof 
> declaimed the following:
>
>
>>What do you mean with where the python version is from?
>
>   Base Python.org download, ActiveState package download, Anaconda
> package download, native OS install/extra install via OS repository
> download (Debian/Ubuntu: apt install xxx, where xxx is not the native OS
> Python)

Just the default Debian install.

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Barry


> On 27 Jul 2022, at 17:09, Cecil Westerhof via Python-list 
>  wrote:
> 
> Barry  writes:
> 
 On 26 Jul 2022, at 16:07, Cecil Westerhof via Python-list 
  wrote:
>>> 
>>> I need to get a random integer. At first I tried it with:
>>>   from secrets import randbelow
>>>   index = randbelow(len(to_try))
>>> 
>>> This works perfectly, but it took some time. So I thought I try:
>>>   from random  import SystemRandom
>>>   index = SystemRandom().randint(0, len(to_try) - 1)
>>> 
>>> A first indication is that the second version would take about two
>>> times as much time as the first. Is there a reason for this, or should
>>> this not be happening?
>> 
>> What is the OS that you are running on and its version?
>> If it’s linux what is the kernel version?
>> What version of python and where from?
> 
> That is always good information of-course.
> Debian 11.3
> 5.10.0-13-amd64
> 3.9.2
> 
> What do you mean with where the python version is from?

Because random number generator implementation depends on the
OS since it is SystemRandom() that you are comparing with python’s
Implementation.

Barry
> 
> -- 
> Cecil Westerhof
> Senior Software Engineer
> LinkedIn: http://www.linkedin.com/in/cecilwesterhof
> -- 
> https://mail.python.org/mailman/listinfo/python-list

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Mats Wichmann
On 7/27/22 02:45, Cecil Westerhof via Python-list wrote:
> Barry  writes:

>> What version of python and where from?
> 
> That is always good information of-course.
> Debian 11.3
> 5.10.0-13-amd64
> 3.9.2
> 
> What do you mean with where the python version is from?

On Windows, the platform of a large proportion of people asking
questions here, there are many possible Python builds (python.org,
Microsoft Store, Conda, ActiveState, etc.) and it sometimes matters,
thus that tends to be a standard question that gets asked here.  In your
case the implied answer is "distro package".
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Dennis Lee Bieber
On Wed, 27 Jul 2022 10:45:47 +0200, Cecil Westerhof 
declaimed the following:


>What do you mean with where the python version is from?

Base Python.org download, ActiveState package download, Anaconda
package download, native OS install/extra install via OS repository
download (Debian/Ubuntu: apt install xxx, where xxx is not the native OS
Python)


-- 
Wulfraed Dennis Lee Bieber AF6VN
wlfr...@ix.netcom.comhttp://wlfraed.microdiversity.freeddns.org/
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread MRAB

On 27/07/2022 16:43, Cecil Westerhof via Python-list wrote:

"Michael F. Stemper"  writes:


This is orthogonal to your question, but might be of some use to you:

The combination of using len(to_try) as an argument to randint() and
saving the output to a variable named "index" suggests that you might
be setting up to select a random element from to_try, as in:
  something = to_try[index]

If that is the case, you might want to consider using random.choice() instead:

  >>> from random import choice
  >>> to_try = [2,3,5,7,11,13,"seventeen",19]
  >>> choice(to_try)
  2
  >>> choice(to_try)
  'seventeen'
  >>> choice(to_try)
  13
  >>> choice(to_try)
  5
  >>>


Yes, I try to select a random element, but it has also to be removed,
because an element should not be used more as once.
This is the code I use:
 # index = randbelow(len(to_try))
 index = randrange(len(to_try))
 found = permutation[to_try.pop(index)]



When you pop an element from the last, the elements after it need to be 
moved down, which takes time.


Try shuffling the list and then popping the now randomly-ordered 
elements off the end.

--
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Michael F. Stemper

On 26/07/2022 16.47, Cecil Westerhof wrote:

Chris Angelico  writes:


On Wed, 27 Jul 2022 at 06:06, Cecil Westerhof via Python-list
 wrote:


Chris Angelico  writes:


On Wed, 27 Jul 2022 at 01:06, Cecil Westerhof via Python-list
 wrote:


I need to get a random integer. At first I tried it with:
 from secrets import randbelow
 index = randbelow(len(to_try))

This works perfectly, but it took some time. So I thought I try:
 from random  import SystemRandom
 index = SystemRandom().randint(0, len(to_try) - 1)

A first indication is that the second version would take about two
times as much time as the first. Is there a reason for this, or should
this not be happening?



You're setting up a brand new SystemRandom instance just for a single
random number. For a fairer comparison, set up the instance, then
generate far more than just a single number, and see how that goes.


Thanks. I thought I did something wrong and I did.
I will try to implement like you said and look what the result will
be. (And share it.)


Thanks! Don't feel bad; performance testing is *hard*, getting
meaningful results takes a lot of of fiddling with parameters, and
getting interesting AND meaningful results can sometimes seem about
impossible.


(As I understand it both do more, or less the same and should have
comparable performance.)


In normal production work? Yes (the SystemRandom object doesn't have
any significant state - a seeded RNG could have a lot more overhead
here). But for performance testing? The work of instantiating the
class could be completely irrelevant, or it could be dominating your
results. It's hard to say, hence the suggestion to try it without
reinstantiating.


It had a very big influence. Original it took about three times more
time to run my program. (The program was still running when I posted
the original post and the difference was higher as I anticipated.)
Removing that did cut about 45% of the execution time of the program.
(So the initiation is quit expensive.)
But it still takes about 50% more time. So I am still a bit
flabbergasted.

The new code:
 from random  import SystemRandom
 system_random   = SystemRandom()
 index = system_random.randint(0, len(to_try) - 1)


This is orthogonal to your question, but might be of some use to you:

The combination of using len(to_try) as an argument to randint() and
saving the output to a variable named "index" suggests that you might
be setting up to select a random element from to_try, as in:
  something = to_try[index]

If that is the case, you might want to consider using random.choice() instead:

  >>> from random import choice
  >>> to_try = [2,3,5,7,11,13,"seventeen",19]
  >>> choice(to_try)
  2
  >>> choice(to_try)
  'seventeen'
  >>> choice(to_try)
  13
  >>> choice(to_try)
  5
  >>>

--
Michael F. Stemper
This sentence no verb.
--
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Cecil Westerhof via Python-list
Dennis Lee Bieber  writes:

> On Tue, 26 Jul 2022 23:47:59 +0200, Cecil Westerhof 
> declaimed the following:
>
>
>>The new code:
>>from random  import SystemRandom
>>system_random   = SystemRandom()
>>index = system_random.randint(0, len(to_try) - 1)
>>
>>The first two statements are executed once.
>>The last statement I think about 75 * 10 ** 6.
>>
>>So it seems that my first idea of using randbelow was the correct one.
>>But if anyone could explain why SystemRandom is so much more
>>expensive, I would be interested to know it.
>>(Or am I still doing something wrong?)
>
>   What happens with
>
>   system_randint = SystemRandom().randint #no parens
>
>   index = system_randint(...)
>
> which may remove the method lookup from the repetition.

I had already switched to randrange. This went to 15 minutes from 21
minutes.
By removing the method lookup I could shave off another minute. So
certainly noteworthy. (Should have thought about it myself.)

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Cecil Westerhof via Python-list
Chris Angelico  writes:

> Incidentally - if you are actually trying to select a specific item,
> you may want to consider random.choice.

Yes, I try to select a random element, but it has also to be removed.
An element should be used at most once. This is the code I use:
# index = randbelow(len(to_try))
index = randrange(len(to_try))
found = permutation[to_try.pop(index)]

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Cecil Westerhof via Python-list
Chris Angelico  writes:

> On Wed, 27 Jul 2022 at 08:18, Cecil Westerhof via Python-list
>  wrote:
>>
>> Chris Angelico  writes:
>>
>> > On Wed, 27 Jul 2022 at 06:06, Cecil Westerhof via Python-list
>> >  wrote:
>> >>
>> >> Chris Angelico  writes:
>> >>
>> >> > On Wed, 27 Jul 2022 at 01:06, Cecil Westerhof via Python-list
>> >> >  wrote:
>> >> >>
>> >> >> I need to get a random integer. At first I tried it with:
>> >> >> from secrets import randbelow
>> >> >> index = randbelow(len(to_try))
>> >> >>
>> >> >> This works perfectly, but it took some time. So I thought I try:
>> >> >> from random  import SystemRandom
>> >> >> index = SystemRandom().randint(0, len(to_try) - 1)
>> >> >>
>> >> >> A first indication is that the second version would take about two
>> >> >> times as much time as the first. Is there a reason for this, or should
>> >> >> this not be happening?
>> >> >>
>> >> >
>> >> > You're setting up a brand new SystemRandom instance just for a single
>> >> > random number. For a fairer comparison, set up the instance, then
>> >> > generate far more than just a single number, and see how that goes.
>> >>
>> >> Thanks. I thought I did something wrong and I did.
>> >> I will try to implement like you said and look what the result will
>> >> be. (And share it.)
>> >
>> > Thanks! Don't feel bad; performance testing is *hard*, getting
>> > meaningful results takes a lot of of fiddling with parameters, and
>> > getting interesting AND meaningful results can sometimes seem about
>> > impossible.
>> >
>> >> (As I understand it both do more, or less the same and should have
>> >> comparable performance.)
>> >
>> > In normal production work? Yes (the SystemRandom object doesn't have
>> > any significant state - a seeded RNG could have a lot more overhead
>> > here). But for performance testing? The work of instantiating the
>> > class could be completely irrelevant, or it could be dominating your
>> > results. It's hard to say, hence the suggestion to try it without
>> > reinstantiating.
>>
>> It had a very big influence. Original it took about three times more
>> time to run my program. (The program was still running when I posted
>> the original post and the difference was higher as I anticipated.)
>> Removing that did cut about 45% of the execution time of the program.
>> (So the initiation is quit expensive.)
>> But it still takes about 50% more time. So I am still a bit
>> flabbergasted.
>>
>> The new code:
>> from random  import SystemRandom
>> system_random   = SystemRandom()
>> index = system_random.randint(0, len(to_try) - 1)
>>
>> The first two statements are executed once.
>> The last statement I think about 75 * 10 ** 6.
>>
>> So it seems that my first idea of using randbelow was the correct one.
>> But if anyone could explain why SystemRandom is so much more
>> expensive, I would be interested to know it.
>> (Or am I still doing something wrong?)
>
> Hmm. There are still a lot of differences here. Are you able to make
> use of randrange() instead, to make them more consistent?
>
> According to the source code, secrets.randbelow is calling on an
> internal method _randbelow of the SystemRandom object, but randrange
> (if called with only one arg) will go straight into that same method.
> Here's my results:
>
> rosuav@sikorsky:~$ python3 -m timeit -s 'from random import randrange'
> 'randrange(1)'
> 100 loops, best of 5: 322 nsec per loop
> rosuav@sikorsky:~$ python3 -m timeit -s 'from random import
> SystemRandom; r = SystemRandom()' 'r.randint(0, 1)'
> 20 loops, best of 5: 1.92 usec per loop
> rosuav@sikorsky:~$ python3 -m timeit -s 'from random import
> SystemRandom; r = SystemRandom()' 'r.randrange(1)'
> 20 loops, best of 5: 1.87 usec per loop
> rosuav@sikorsky:~$ python3 -m timeit -s 'from secrets import
> randbelow' 'randbelow(1)'
> 20 loops, best of 5: 1.64 usec per loop
>
> (The difference with the first one is that it isn't using the system
> RNG, so it has the limitations of an internal PRNG.)
>
> When you call randint, what happens is (1) the endpoint is incremented
> to transform it from inclusive-inclusive to inclusive-exclusive; (2)
> randrange is called with two args; (3) fast path 1 is skipped, fast
> path 2 is taken, and _randbelow gets called to get an actual random
> number, which gets zero added to it before returning.
>
> If, instead, you use randrange(len(to_try)), what would happen is (1)
> fast path 1 is used, and (2) _randbelow is called to get the random
> number.
>
> In secrets.randbelow, it's even better: (1) _randbelow is called to
> get the random number.
>
> I think that might be what's going on here. You're adding some tiny
> amounts of extra work, but if you were to make them more similar, the
> distinctions would evaporate. Here's a couple of other variants that
> use SystemRandom:
>
> rosuav@sikorsky:~$ python3 -m timeit -s 'from random import
> SystemRandom; randrange = SystemRandom().randrange' 

Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Cecil Westerhof via Python-list
Barry  writes:

>> On 26 Jul 2022, at 16:07, Cecil Westerhof via Python-list 
>>  wrote:
>> 
>> I need to get a random integer. At first I tried it with:
>>from secrets import randbelow
>>index = randbelow(len(to_try))
>> 
>> This works perfectly, but it took some time. So I thought I try:
>>from random  import SystemRandom
>>index = SystemRandom().randint(0, len(to_try) - 1)
>> 
>> A first indication is that the second version would take about two
>> times as much time as the first. Is there a reason for this, or should
>> this not be happening?
>
> What is the OS that you are running on and its version?
> If it’s linux what is the kernel version?
> What version of python and where from?

That is always good information of-course.
Debian 11.3
5.10.0-13-amd64
3.9.2

What do you mean with where the python version is from?

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Cecil Westerhof via Python-list
"Michael F. Stemper"  writes:

> This is orthogonal to your question, but might be of some use to you:
>
> The combination of using len(to_try) as an argument to randint() and
> saving the output to a variable named "index" suggests that you might
> be setting up to select a random element from to_try, as in:
>   something = to_try[index]
>
> If that is the case, you might want to consider using random.choice() instead:
>
>   >>> from random import choice
>   >>> to_try = [2,3,5,7,11,13,"seventeen",19]
>   >>> choice(to_try)
>   2
>   >>> choice(to_try)
>   'seventeen'
>   >>> choice(to_try)
>   13
>   >>> choice(to_try)
>   5
>   >>>

Yes, I try to select a random element, but it has also to be removed,
because an element should not be used more as once.
This is the code I use:
# index = randbelow(len(to_try))
index = randrange(len(to_try))
found = permutation[to_try.pop(index)]

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Barry


> On 26 Jul 2022, at 16:07, Cecil Westerhof via Python-list 
>  wrote:
> 
> I need to get a random integer. At first I tried it with:
>from secrets import randbelow
>index = randbelow(len(to_try))
> 
> This works perfectly, but it took some time. So I thought I try:
>from random  import SystemRandom
>index = SystemRandom().randint(0, len(to_try) - 1)
> 
> A first indication is that the second version would take about two
> times as much time as the first. Is there a reason for this, or should
> this not be happening?

What is the OS that you are running on and its version?
If it’s linux what is the kernel version?
What version of python and where from?

Barry

> 
> -- 
> Cecil Westerhof
> Senior Software Engineer
> LinkedIn: http://www.linkedin.com/in/cecilwesterhof
> -- 
> https://mail.python.org/mailman/listinfo/python-list
> 

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-26 Thread Chris Angelico
On Wed, 27 Jul 2022 at 09:28, Dennis Lee Bieber  wrote:
>
> On Tue, 26 Jul 2022 16:38:38 +0200, Cecil Westerhof 
> declaimed the following:
>
> >I need to get a random integer. At first I tried it with:
> >from secrets import randbelow
> >index = randbelow(len(to_try))
> >
> >This works perfectly, but it took some time. So I thought I try:
> >from random  import SystemRandom
> >index = SystemRandom().randint(0, len(to_try) - 1)
> >
> >A first indication is that the second version would take about two
> >times as much time as the first. Is there a reason for this, or should
> >this not be happening?
>
> Well, off the top of my head...
>
> For one generation of "index" you are first creating an instance of
> SystemRandom(), using it to generate your random integer, and then
> disposing of the instance.
>
> If you only need ONE random integer, the time difference probably
> doesn't matter. OTOH, if you need many during the run, using
>
> sr = SystemRandom()
> #stuff in some loop that generates multiple ints
> index = sr.randint(...)
>
> Hmmm, wonder if there is a speed difference between
> .randint(0, len(to_try) - 1)
> and
> .randint(1, len(to_try)) - 1
>

Probably not significant, since the same amount of arithmetic gets
done either way. But switching to single-arg randrange(len(to_try))
will definitely help, and IMO is clearer as well (since the
implication is selecting one from a group of items).

Incidentally - if you are actually trying to select a specific item,
you may want to consider random.choice.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-26 Thread Dennis Lee Bieber
On Tue, 26 Jul 2022 23:47:59 +0200, Cecil Westerhof 
declaimed the following:


>The new code:
>from random  import SystemRandom
>system_random   = SystemRandom()
>index = system_random.randint(0, len(to_try) - 1)
>
>The first two statements are executed once.
>The last statement I think about 75 * 10 ** 6.
>
>So it seems that my first idea of using randbelow was the correct one.
>But if anyone could explain why SystemRandom is so much more
>expensive, I would be interested to know it.
>(Or am I still doing something wrong?)

What happens with

system_randint = SystemRandom().randint #no parens

index = system_randint(...)

which may remove the method lookup from the repetition.


-- 
Wulfraed Dennis Lee Bieber AF6VN
wlfr...@ix.netcom.comhttp://wlfraed.microdiversity.freeddns.org/
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-26 Thread Dennis Lee Bieber
On Tue, 26 Jul 2022 16:38:38 +0200, Cecil Westerhof 
declaimed the following:

>I need to get a random integer. At first I tried it with:
>from secrets import randbelow
>index = randbelow(len(to_try))
>
>This works perfectly, but it took some time. So I thought I try:
>from random  import SystemRandom
>index = SystemRandom().randint(0, len(to_try) - 1)
>
>A first indication is that the second version would take about two
>times as much time as the first. Is there a reason for this, or should
>this not be happening?

Well, off the top of my head...

For one generation of "index" you are first creating an instance of
SystemRandom(), using it to generate your random integer, and then
disposing of the instance.

If you only need ONE random integer, the time difference probably
doesn't matter. OTOH, if you need many during the run, using

sr = SystemRandom()
#stuff in some loop that generates multiple ints
index = sr.randint(...)

Hmmm, wonder if there is a speed difference between
.randint(0, len(to_try) - 1)
and
.randint(1, len(to_try)) - 1


-- 
Wulfraed Dennis Lee Bieber AF6VN
wlfr...@ix.netcom.comhttp://wlfraed.microdiversity.freeddns.org/
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-26 Thread Chris Angelico
On Wed, 27 Jul 2022 at 08:18, Cecil Westerhof via Python-list
 wrote:
>
> Chris Angelico  writes:
>
> > On Wed, 27 Jul 2022 at 06:06, Cecil Westerhof via Python-list
> >  wrote:
> >>
> >> Chris Angelico  writes:
> >>
> >> > On Wed, 27 Jul 2022 at 01:06, Cecil Westerhof via Python-list
> >> >  wrote:
> >> >>
> >> >> I need to get a random integer. At first I tried it with:
> >> >> from secrets import randbelow
> >> >> index = randbelow(len(to_try))
> >> >>
> >> >> This works perfectly, but it took some time. So I thought I try:
> >> >> from random  import SystemRandom
> >> >> index = SystemRandom().randint(0, len(to_try) - 1)
> >> >>
> >> >> A first indication is that the second version would take about two
> >> >> times as much time as the first. Is there a reason for this, or should
> >> >> this not be happening?
> >> >>
> >> >
> >> > You're setting up a brand new SystemRandom instance just for a single
> >> > random number. For a fairer comparison, set up the instance, then
> >> > generate far more than just a single number, and see how that goes.
> >>
> >> Thanks. I thought I did something wrong and I did.
> >> I will try to implement like you said and look what the result will
> >> be. (And share it.)
> >
> > Thanks! Don't feel bad; performance testing is *hard*, getting
> > meaningful results takes a lot of of fiddling with parameters, and
> > getting interesting AND meaningful results can sometimes seem about
> > impossible.
> >
> >> (As I understand it both do more, or less the same and should have
> >> comparable performance.)
> >
> > In normal production work? Yes (the SystemRandom object doesn't have
> > any significant state - a seeded RNG could have a lot more overhead
> > here). But for performance testing? The work of instantiating the
> > class could be completely irrelevant, or it could be dominating your
> > results. It's hard to say, hence the suggestion to try it without
> > reinstantiating.
>
> It had a very big influence. Original it took about three times more
> time to run my program. (The program was still running when I posted
> the original post and the difference was higher as I anticipated.)
> Removing that did cut about 45% of the execution time of the program.
> (So the initiation is quit expensive.)
> But it still takes about 50% more time. So I am still a bit
> flabbergasted.
>
> The new code:
> from random  import SystemRandom
> system_random   = SystemRandom()
> index = system_random.randint(0, len(to_try) - 1)
>
> The first two statements are executed once.
> The last statement I think about 75 * 10 ** 6.
>
> So it seems that my first idea of using randbelow was the correct one.
> But if anyone could explain why SystemRandom is so much more
> expensive, I would be interested to know it.
> (Or am I still doing something wrong?)

Hmm. There are still a lot of differences here. Are you able to make
use of randrange() instead, to make them more consistent?

According to the source code, secrets.randbelow is calling on an
internal method _randbelow of the SystemRandom object, but randrange
(if called with only one arg) will go straight into that same method.
Here's my results:

rosuav@sikorsky:~$ python3 -m timeit -s 'from random import randrange'
'randrange(1)'
100 loops, best of 5: 322 nsec per loop
rosuav@sikorsky:~$ python3 -m timeit -s 'from random import
SystemRandom; r = SystemRandom()' 'r.randint(0, 1)'
20 loops, best of 5: 1.92 usec per loop
rosuav@sikorsky:~$ python3 -m timeit -s 'from random import
SystemRandom; r = SystemRandom()' 'r.randrange(1)'
20 loops, best of 5: 1.87 usec per loop
rosuav@sikorsky:~$ python3 -m timeit -s 'from secrets import
randbelow' 'randbelow(1)'
20 loops, best of 5: 1.64 usec per loop

(The difference with the first one is that it isn't using the system
RNG, so it has the limitations of an internal PRNG.)

When you call randint, what happens is (1) the endpoint is incremented
to transform it from inclusive-inclusive to inclusive-exclusive; (2)
randrange is called with two args; (3) fast path 1 is skipped, fast
path 2 is taken, and _randbelow gets called to get an actual random
number, which gets zero added to it before returning.

If, instead, you use randrange(len(to_try)), what would happen is (1)
fast path 1 is used, and (2) _randbelow is called to get the random
number.

In secrets.randbelow, it's even better: (1) _randbelow is called to
get the random number.

I think that might be what's going on here. You're adding some tiny
amounts of extra work, but if you were to make them more similar, the
distinctions would evaporate. Here's a couple of other variants that
use SystemRandom:

rosuav@sikorsky:~$ python3 -m timeit -s 'from random import
SystemRandom; randrange = SystemRandom().randrange' 'randrange(1)'
20 loops, best of 5: 1.81 usec per loop
rosuav@sikorsky:~$ python3 -m timeit -s 'from random import
SystemRandom; randrange = SystemRandom()._randbelow'

Re: random.SystemRandom().randint() inefficient

2022-07-26 Thread Cecil Westerhof via Python-list
Chris Angelico  writes:

> On Wed, 27 Jul 2022 at 06:06, Cecil Westerhof via Python-list
>  wrote:
>>
>> Chris Angelico  writes:
>>
>> > On Wed, 27 Jul 2022 at 01:06, Cecil Westerhof via Python-list
>> >  wrote:
>> >>
>> >> I need to get a random integer. At first I tried it with:
>> >> from secrets import randbelow
>> >> index = randbelow(len(to_try))
>> >>
>> >> This works perfectly, but it took some time. So I thought I try:
>> >> from random  import SystemRandom
>> >> index = SystemRandom().randint(0, len(to_try) - 1)
>> >>
>> >> A first indication is that the second version would take about two
>> >> times as much time as the first. Is there a reason for this, or should
>> >> this not be happening?
>> >>
>> >
>> > You're setting up a brand new SystemRandom instance just for a single
>> > random number. For a fairer comparison, set up the instance, then
>> > generate far more than just a single number, and see how that goes.
>>
>> Thanks. I thought I did something wrong and I did.
>> I will try to implement like you said and look what the result will
>> be. (And share it.)
>
> Thanks! Don't feel bad; performance testing is *hard*, getting
> meaningful results takes a lot of of fiddling with parameters, and
> getting interesting AND meaningful results can sometimes seem about
> impossible.
>
>> (As I understand it both do more, or less the same and should have
>> comparable performance.)
>
> In normal production work? Yes (the SystemRandom object doesn't have
> any significant state - a seeded RNG could have a lot more overhead
> here). But for performance testing? The work of instantiating the
> class could be completely irrelevant, or it could be dominating your
> results. It's hard to say, hence the suggestion to try it without
> reinstantiating.

It had a very big influence. Original it took about three times more
time to run my program. (The program was still running when I posted
the original post and the difference was higher as I anticipated.)
Removing that did cut about 45% of the execution time of the program.
(So the initiation is quit expensive.)
But it still takes about 50% more time. So I am still a bit
flabbergasted.

The new code:
from random  import SystemRandom
system_random   = SystemRandom()
index = system_random.randint(0, len(to_try) - 1)

The first two statements are executed once.
The last statement I think about 75 * 10 ** 6.

So it seems that my first idea of using randbelow was the correct one.
But if anyone could explain why SystemRandom is so much more
expensive, I would be interested to know it.
(Or am I still doing something wrong?)

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-26 Thread Chris Angelico
On Wed, 27 Jul 2022 at 06:06, Cecil Westerhof via Python-list
 wrote:
>
> Chris Angelico  writes:
>
> > On Wed, 27 Jul 2022 at 01:06, Cecil Westerhof via Python-list
> >  wrote:
> >>
> >> I need to get a random integer. At first I tried it with:
> >> from secrets import randbelow
> >> index = randbelow(len(to_try))
> >>
> >> This works perfectly, but it took some time. So I thought I try:
> >> from random  import SystemRandom
> >> index = SystemRandom().randint(0, len(to_try) - 1)
> >>
> >> A first indication is that the second version would take about two
> >> times as much time as the first. Is there a reason for this, or should
> >> this not be happening?
> >>
> >
> > You're setting up a brand new SystemRandom instance just for a single
> > random number. For a fairer comparison, set up the instance, then
> > generate far more than just a single number, and see how that goes.
>
> Thanks. I thought I did something wrong and I did.
> I will try to implement like you said and look what the result will
> be. (And share it.)

Thanks! Don't feel bad; performance testing is *hard*, getting
meaningful results takes a lot of of fiddling with parameters, and
getting interesting AND meaningful results can sometimes seem about
impossible.

> (As I understand it both do more, or less the same and should have
> comparable performance.)

In normal production work? Yes (the SystemRandom object doesn't have
any significant state - a seeded RNG could have a lot more overhead
here). But for performance testing? The work of instantiating the
class could be completely irrelevant, or it could be dominating your
results. It's hard to say, hence the suggestion to try it without
reinstantiating.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-26 Thread Cecil Westerhof via Python-list
Chris Angelico  writes:

> On Wed, 27 Jul 2022 at 01:06, Cecil Westerhof via Python-list
>  wrote:
>>
>> I need to get a random integer. At first I tried it with:
>> from secrets import randbelow
>> index = randbelow(len(to_try))
>>
>> This works perfectly, but it took some time. So I thought I try:
>> from random  import SystemRandom
>> index = SystemRandom().randint(0, len(to_try) - 1)
>>
>> A first indication is that the second version would take about two
>> times as much time as the first. Is there a reason for this, or should
>> this not be happening?
>>
>
> You're setting up a brand new SystemRandom instance just for a single
> random number. For a fairer comparison, set up the instance, then
> generate far more than just a single number, and see how that goes.

Thanks. I thought I did something wrong and I did.
I will try to implement like you said and look what the result will
be. (And share it.)

(As I understand it both do more, or less the same and should have
comparable performance.)

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-26 Thread Chris Angelico
On Wed, 27 Jul 2022 at 01:06, Cecil Westerhof via Python-list
 wrote:
>
> I need to get a random integer. At first I tried it with:
> from secrets import randbelow
> index = randbelow(len(to_try))
>
> This works perfectly, but it took some time. So I thought I try:
> from random  import SystemRandom
> index = SystemRandom().randint(0, len(to_try) - 1)
>
> A first indication is that the second version would take about two
> times as much time as the first. Is there a reason for this, or should
> this not be happening?
>

You're setting up a brand new SystemRandom instance just for a single
random number. For a fairer comparison, set up the instance, then
generate far more than just a single number, and see how that goes.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-26 Thread Weatherby,Gerard
Absolutely. The task (“generate a random number”) is ill-defined.

Want a fast random number? Use 552015933 (I just retrieved it from random.org).

Want a true random number, use random.org API. (https://api.random.org/pricing).

Something in between, follow approaches Stefan suggests.



—
Gerard Weatherby | Application Architect NMRbox | NAN | Department of Molecular 
Biology and Biophysics
 UConn Health 263 Farmington Avenue, Farmington, CT 06030-6406 uchc.edu
On Jul 26, 2022, 11:45 AM -0400, Stefan Ram , wrote:
*** Attention: This is an external email. Use caution responding, opening 
attachments or clicking on links. ***

Cecil Westerhof  writes:
I need to get a random integer.

There are all kinds of trade-offs. When you need entropy,
the function sometimes has to collect data from hardware,
which takes some time. Pseudo-random integers sometimes can
be calculated faster. As a compromise, you can get some
entropy and use it to initialize ("seed") a pseudo-random
number generator and then get some pseudo-random numbers
until you seed it again. A linear-congruential generator
should be fast enough, but in some cases might not be random
enough.


--
https://urldefense.com/v3/__https://mail.python.org/mailman/listinfo/python-list__;!!Cn_UX_p3!kjJ7U6kkSW8uOTWJCnHeXn0Acczd619asmIrVMA_NOqvP_JMqqQ1Rpp61sptQucbOuEFGb6ot499_FqaRajiS9s$
-- 
https://mail.python.org/mailman/listinfo/python-list


random.SystemRandom().randint() inefficient

2022-07-26 Thread Cecil Westerhof via Python-list
I need to get a random integer. At first I tried it with:
from secrets import randbelow
index = randbelow(len(to_try))

This works perfectly, but it took some time. So I thought I try:
from random  import SystemRandom
index = SystemRandom().randint(0, len(to_try) - 1)

A first indication is that the second version would take about two
times as much time as the first. Is there a reason for this, or should
this not be happening?

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list