Re: random.SystemRandom().randint() inefficient
On Thu, 28 Jul 2022 at 05:36, Cecil Westerhof via Python-list wrote: > > Roel Schroeven writes: > > > Cecil Westerhof via Python-list schreef op 27/07/2022 om 17:43: > >> "Michael F. Stemper" writes: > >> > >> > This is orthogonal to your question, but might be of some use to you: > >> > > >> > The combination of using len(to_try) as an argument to randint() and > >> > saving the output to a variable named "index" suggests that you might > >> > be setting up to select a random element from to_try, as in: > >> > something = to_try[index] > >> > > >> > If that is the case, you might want to consider using random.choice() > >> > instead: > >> > > >> > >>> from random import choice > >> > >>> to_try = [2,3,5,7,11,13,"seventeen",19] > >> > >>> choice(to_try) > >> > 2 > >> > >>> choice(to_try) > >> > 'seventeen' > >> > >>> choice(to_try) > >> > 13 > >> > >>> choice(to_try) > >> > 5 > >> > >>> > >> > >> Yes, I try to select a random element, but it has also to be removed, > >> because an element should not be used more as once. > >> This is the code I use: > >> # index = randbelow(len(to_try)) > >> index = randrange(len(to_try)) > >> found = permutation[to_try.pop(index)] > > Do you know in advance how many items you'll need, or maybe an upper > > limit on the amount? In that case it might be more efficient to use > > random.sample(to_try, k=nr_items_needed). > > Something else to try. :-) > And yes: I will be using half of the list. > A perfect job for random.sample() then, if that's *exactly* half the list. But if you don't know for sure how many you'll need, an easy way to make it more efficient would be to take the n'th element, then move the last element down to the n'th position, and finally pop off the last element. That way, you only ever pop the last element away. The list will get reordered arbitrarily while you do this, but if the only purpose of it is to remove elements from random consideration, that won't be a problem. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
Cecil Westerhof writes: Alan Bawden writes: > Cecil Westerhof writes: > >Yes, I try to select a random element, but it has also to be removed, >because an element should not be used more as once. > > Instead of using pop to do that why not something like: > > def lazy_shuffle(seq): > """ > Generate the elements of the given sequence in a random order. > """ > # Delete the next line if you want to use a different version of > # randrange: > from random import randrange > # Delete the next line if SEQ is already a mutable sequence and you > # are willing to have it destroyed by this process: > seq = list(seq) > n = len(seq) > while n: > i = randrange(n) > yield seq[i] > n -= 1 > if i < n: > seq[i] = seq[n] That looks interesting. But there is on problem. (I think.) The list is only partly eaten and I will eat a lot of sequences. So a lot of sequences will be left in the runtime. Or is there a way to destroy the lazy_shuffle when it is not needed anymore? You don't have to worry about that. That's what Garbage Collection is for. After you drop the last reference to the generator, the GC will destroy it for you. Welcome to Python. BTW, what I wrote might be slightly faster if you replace it with: while n: i = randrange(n) yield seq[i] n -= 1 seq[i] = seq[n] That will save you some time spent testing and branching. -- Alan Bawden -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
Alan Bawden writes: > Cecil Westerhof writes: > >Yes, I try to select a random element, but it has also to be removed, >because an element should not be used more as once. > > Instead of using pop to do that why not something like: > > def lazy_shuffle(seq): > """ > Generate the elements of the given sequence in a random order. > """ > # Delete the next line if you want to use a different version of > # randrange: > from random import randrange > # Delete the next line if SEQ is already a mutable sequence and you > # are willing to have it destroyed by this process: > seq = list(seq) > n = len(seq) > while n: > i = randrange(n) > yield seq[i] > n -= 1 > if i < n: > seq[i] = seq[n] That looks interesting. But there is on problem. (I think.) The list is only partly eaten and I will eat a lot of sequences. So a lot of sequences will be left in the runtime. Or is there a way to destroy the lazy_shuffle when it is not needed anymore? -- Cecil Westerhof Senior Software Engineer LinkedIn: http://www.linkedin.com/in/cecilwesterhof -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
MRAB writes: >>> When you pop an element from the last, the elements after it need to be >>> moved down, which takes time. >>> >>> Try shuffling the list and then popping the now randomly-ordered >>> elements off the end. >> Would shuffling not be a lot more expensive? Especially because I do >> not eat the whole list. >> > You won't know until you time it. A first indication is that it doubles the needed time. -- Cecil Westerhof Senior Software Engineer LinkedIn: http://www.linkedin.com/in/cecilwesterhof -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
Roel Schroeven writes: > Cecil Westerhof via Python-list schreef op 27/07/2022 om 17:43: >> "Michael F. Stemper" writes: >> >> > This is orthogonal to your question, but might be of some use to you: >> > >> > The combination of using len(to_try) as an argument to randint() and >> > saving the output to a variable named "index" suggests that you might >> > be setting up to select a random element from to_try, as in: >> > something = to_try[index] >> > >> > If that is the case, you might want to consider using random.choice() >> > instead: >> > >> > >>> from random import choice >> > >>> to_try = [2,3,5,7,11,13,"seventeen",19] >> > >>> choice(to_try) >> > 2 >> > >>> choice(to_try) >> > 'seventeen' >> > >>> choice(to_try) >> > 13 >> > >>> choice(to_try) >> > 5 >> > >>> >> >> Yes, I try to select a random element, but it has also to be removed, >> because an element should not be used more as once. >> This is the code I use: >> # index = randbelow(len(to_try)) >> index = randrange(len(to_try)) >> found = permutation[to_try.pop(index)] > Do you know in advance how many items you'll need, or maybe an upper > limit on the amount? In that case it might be more efficient to use > random.sample(to_try, k=nr_items_needed). Something else to try. :-) And yes: I will be using half of the list. -- Cecil Westerhof Senior Software Engineer LinkedIn: http://www.linkedin.com/in/cecilwesterhof -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
On 27/07/2022 18:24, Cecil Westerhof via Python-list wrote: MRAB writes: On 27/07/2022 16:43, Cecil Westerhof via Python-list wrote: "Michael F. Stemper" writes: This is orthogonal to your question, but might be of some use to you: The combination of using len(to_try) as an argument to randint() and saving the output to a variable named "index" suggests that you might be setting up to select a random element from to_try, as in: something = to_try[index] If that is the case, you might want to consider using random.choice() instead: >>> from random import choice >>> to_try = [2,3,5,7,11,13,"seventeen",19] >>> choice(to_try) 2 >>> choice(to_try) 'seventeen' >>> choice(to_try) 13 >>> choice(to_try) 5 >>> Yes, I try to select a random element, but it has also to be removed, because an element should not be used more as once. This is the code I use: # index = randbelow(len(to_try)) index = randrange(len(to_try)) found = permutation[to_try.pop(index)] When you pop an element from the last, the elements after it need to be moved down, which takes time. Try shuffling the list and then popping the now randomly-ordered elements off the end. Would shuffling not be a lot more expensive? Especially because I do not eat the whole list. You won't know until you time it. -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
Cecil Westerhof writes: Yes, I try to select a random element, but it has also to be removed, because an element should not be used more as once. Instead of using pop to do that why not something like: def lazy_shuffle(seq): """ Generate the elements of the given sequence in a random order. """ # Delete the next line if you want to use a different version of # randrange: from random import randrange # Delete the next line if SEQ is already a mutable sequence and you # are willing to have it destroyed by this process: seq = list(seq) n = len(seq) while n: i = randrange(n) yield seq[i] n -= 1 if i < n: seq[i] = seq[n] -- Alan Bawden -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
Cecil Westerhof via Python-list schreef op 27/07/2022 om 17:43: "Michael F. Stemper" writes: > This is orthogonal to your question, but might be of some use to you: > > The combination of using len(to_try) as an argument to randint() and > saving the output to a variable named "index" suggests that you might > be setting up to select a random element from to_try, as in: > something = to_try[index] > > If that is the case, you might want to consider using random.choice() instead: > > >>> from random import choice > >>> to_try = [2,3,5,7,11,13,"seventeen",19] > >>> choice(to_try) > 2 > >>> choice(to_try) > 'seventeen' > >>> choice(to_try) > 13 > >>> choice(to_try) > 5 > >>> Yes, I try to select a random element, but it has also to be removed, because an element should not be used more as once. This is the code I use: # index = randbelow(len(to_try)) index = randrange(len(to_try)) found = permutation[to_try.pop(index)] Do you know in advance how many items you'll need, or maybe an upper limit on the amount? In that case it might be more efficient to use random.sample(to_try, k=nr_items_needed). -- "Honest criticism is hard to take, particularly from a relative, a friend, an acquaintance, or a stranger." -- Franklin P. Jones -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
MRAB writes: > On 27/07/2022 16:43, Cecil Westerhof via Python-list wrote: >> "Michael F. Stemper" writes: >> >>> This is orthogonal to your question, but might be of some use to you: >>> >>> The combination of using len(to_try) as an argument to randint() and >>> saving the output to a variable named "index" suggests that you might >>> be setting up to select a random element from to_try, as in: >>> something = to_try[index] >>> >>> If that is the case, you might want to consider using random.choice() >>> instead: >>> >>> >>> from random import choice >>> >>> to_try = [2,3,5,7,11,13,"seventeen",19] >>> >>> choice(to_try) >>> 2 >>> >>> choice(to_try) >>> 'seventeen' >>> >>> choice(to_try) >>> 13 >>> >>> choice(to_try) >>> 5 >>> >>> >> Yes, I try to select a random element, but it has also to be removed, >> because an element should not be used more as once. >> This is the code I use: >> # index = randbelow(len(to_try)) >> index = randrange(len(to_try)) >> found = permutation[to_try.pop(index)] >> > > When you pop an element from the last, the elements after it need to be > moved down, which takes time. > > Try shuffling the list and then popping the now randomly-ordered > elements off the end. Would shuffling not be a lot more expensive? Especially because I do not eat the whole list. -- Cecil Westerhof Senior Software Engineer LinkedIn: http://www.linkedin.com/in/cecilwesterhof -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
Dennis Lee Bieber writes: > On Wed, 27 Jul 2022 10:45:47 +0200, Cecil Westerhof > declaimed the following: > > >>What do you mean with where the python version is from? > > Base Python.org download, ActiveState package download, Anaconda > package download, native OS install/extra install via OS repository > download (Debian/Ubuntu: apt install xxx, where xxx is not the native OS > Python) Just the default Debian install. -- Cecil Westerhof Senior Software Engineer LinkedIn: http://www.linkedin.com/in/cecilwesterhof -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
> On 27 Jul 2022, at 17:09, Cecil Westerhof via Python-list > wrote: > > Barry writes: > On 26 Jul 2022, at 16:07, Cecil Westerhof via Python-list wrote: >>> >>> I need to get a random integer. At first I tried it with: >>> from secrets import randbelow >>> index = randbelow(len(to_try)) >>> >>> This works perfectly, but it took some time. So I thought I try: >>> from random import SystemRandom >>> index = SystemRandom().randint(0, len(to_try) - 1) >>> >>> A first indication is that the second version would take about two >>> times as much time as the first. Is there a reason for this, or should >>> this not be happening? >> >> What is the OS that you are running on and its version? >> If it’s linux what is the kernel version? >> What version of python and where from? > > That is always good information of-course. > Debian 11.3 > 5.10.0-13-amd64 > 3.9.2 > > What do you mean with where the python version is from? Because random number generator implementation depends on the OS since it is SystemRandom() that you are comparing with python’s Implementation. Barry > > -- > Cecil Westerhof > Senior Software Engineer > LinkedIn: http://www.linkedin.com/in/cecilwesterhof > -- > https://mail.python.org/mailman/listinfo/python-list -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
On 7/27/22 02:45, Cecil Westerhof via Python-list wrote: > Barry writes: >> What version of python and where from? > > That is always good information of-course. > Debian 11.3 > 5.10.0-13-amd64 > 3.9.2 > > What do you mean with where the python version is from? On Windows, the platform of a large proportion of people asking questions here, there are many possible Python builds (python.org, Microsoft Store, Conda, ActiveState, etc.) and it sometimes matters, thus that tends to be a standard question that gets asked here. In your case the implied answer is "distro package". -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
On Wed, 27 Jul 2022 10:45:47 +0200, Cecil Westerhof declaimed the following: >What do you mean with where the python version is from? Base Python.org download, ActiveState package download, Anaconda package download, native OS install/extra install via OS repository download (Debian/Ubuntu: apt install xxx, where xxx is not the native OS Python) -- Wulfraed Dennis Lee Bieber AF6VN wlfr...@ix.netcom.comhttp://wlfraed.microdiversity.freeddns.org/ -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
On 27/07/2022 16:43, Cecil Westerhof via Python-list wrote: "Michael F. Stemper" writes: This is orthogonal to your question, but might be of some use to you: The combination of using len(to_try) as an argument to randint() and saving the output to a variable named "index" suggests that you might be setting up to select a random element from to_try, as in: something = to_try[index] If that is the case, you might want to consider using random.choice() instead: >>> from random import choice >>> to_try = [2,3,5,7,11,13,"seventeen",19] >>> choice(to_try) 2 >>> choice(to_try) 'seventeen' >>> choice(to_try) 13 >>> choice(to_try) 5 >>> Yes, I try to select a random element, but it has also to be removed, because an element should not be used more as once. This is the code I use: # index = randbelow(len(to_try)) index = randrange(len(to_try)) found = permutation[to_try.pop(index)] When you pop an element from the last, the elements after it need to be moved down, which takes time. Try shuffling the list and then popping the now randomly-ordered elements off the end. -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
On 26/07/2022 16.47, Cecil Westerhof wrote: Chris Angelico writes: On Wed, 27 Jul 2022 at 06:06, Cecil Westerhof via Python-list wrote: Chris Angelico writes: On Wed, 27 Jul 2022 at 01:06, Cecil Westerhof via Python-list wrote: I need to get a random integer. At first I tried it with: from secrets import randbelow index = randbelow(len(to_try)) This works perfectly, but it took some time. So I thought I try: from random import SystemRandom index = SystemRandom().randint(0, len(to_try) - 1) A first indication is that the second version would take about two times as much time as the first. Is there a reason for this, or should this not be happening? You're setting up a brand new SystemRandom instance just for a single random number. For a fairer comparison, set up the instance, then generate far more than just a single number, and see how that goes. Thanks. I thought I did something wrong and I did. I will try to implement like you said and look what the result will be. (And share it.) Thanks! Don't feel bad; performance testing is *hard*, getting meaningful results takes a lot of of fiddling with parameters, and getting interesting AND meaningful results can sometimes seem about impossible. (As I understand it both do more, or less the same and should have comparable performance.) In normal production work? Yes (the SystemRandom object doesn't have any significant state - a seeded RNG could have a lot more overhead here). But for performance testing? The work of instantiating the class could be completely irrelevant, or it could be dominating your results. It's hard to say, hence the suggestion to try it without reinstantiating. It had a very big influence. Original it took about three times more time to run my program. (The program was still running when I posted the original post and the difference was higher as I anticipated.) Removing that did cut about 45% of the execution time of the program. (So the initiation is quit expensive.) But it still takes about 50% more time. So I am still a bit flabbergasted. The new code: from random import SystemRandom system_random = SystemRandom() index = system_random.randint(0, len(to_try) - 1) This is orthogonal to your question, but might be of some use to you: The combination of using len(to_try) as an argument to randint() and saving the output to a variable named "index" suggests that you might be setting up to select a random element from to_try, as in: something = to_try[index] If that is the case, you might want to consider using random.choice() instead: >>> from random import choice >>> to_try = [2,3,5,7,11,13,"seventeen",19] >>> choice(to_try) 2 >>> choice(to_try) 'seventeen' >>> choice(to_try) 13 >>> choice(to_try) 5 >>> -- Michael F. Stemper This sentence no verb. -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
Dennis Lee Bieber writes: > On Tue, 26 Jul 2022 23:47:59 +0200, Cecil Westerhof > declaimed the following: > > >>The new code: >>from random import SystemRandom >>system_random = SystemRandom() >>index = system_random.randint(0, len(to_try) - 1) >> >>The first two statements are executed once. >>The last statement I think about 75 * 10 ** 6. >> >>So it seems that my first idea of using randbelow was the correct one. >>But if anyone could explain why SystemRandom is so much more >>expensive, I would be interested to know it. >>(Or am I still doing something wrong?) > > What happens with > > system_randint = SystemRandom().randint #no parens > > index = system_randint(...) > > which may remove the method lookup from the repetition. I had already switched to randrange. This went to 15 minutes from 21 minutes. By removing the method lookup I could shave off another minute. So certainly noteworthy. (Should have thought about it myself.) -- Cecil Westerhof Senior Software Engineer LinkedIn: http://www.linkedin.com/in/cecilwesterhof -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
Chris Angelico writes: > Incidentally - if you are actually trying to select a specific item, > you may want to consider random.choice. Yes, I try to select a random element, but it has also to be removed. An element should be used at most once. This is the code I use: # index = randbelow(len(to_try)) index = randrange(len(to_try)) found = permutation[to_try.pop(index)] -- Cecil Westerhof Senior Software Engineer LinkedIn: http://www.linkedin.com/in/cecilwesterhof -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
Chris Angelico writes: > On Wed, 27 Jul 2022 at 08:18, Cecil Westerhof via Python-list > wrote: >> >> Chris Angelico writes: >> >> > On Wed, 27 Jul 2022 at 06:06, Cecil Westerhof via Python-list >> > wrote: >> >> >> >> Chris Angelico writes: >> >> >> >> > On Wed, 27 Jul 2022 at 01:06, Cecil Westerhof via Python-list >> >> > wrote: >> >> >> >> >> >> I need to get a random integer. At first I tried it with: >> >> >> from secrets import randbelow >> >> >> index = randbelow(len(to_try)) >> >> >> >> >> >> This works perfectly, but it took some time. So I thought I try: >> >> >> from random import SystemRandom >> >> >> index = SystemRandom().randint(0, len(to_try) - 1) >> >> >> >> >> >> A first indication is that the second version would take about two >> >> >> times as much time as the first. Is there a reason for this, or should >> >> >> this not be happening? >> >> >> >> >> > >> >> > You're setting up a brand new SystemRandom instance just for a single >> >> > random number. For a fairer comparison, set up the instance, then >> >> > generate far more than just a single number, and see how that goes. >> >> >> >> Thanks. I thought I did something wrong and I did. >> >> I will try to implement like you said and look what the result will >> >> be. (And share it.) >> > >> > Thanks! Don't feel bad; performance testing is *hard*, getting >> > meaningful results takes a lot of of fiddling with parameters, and >> > getting interesting AND meaningful results can sometimes seem about >> > impossible. >> > >> >> (As I understand it both do more, or less the same and should have >> >> comparable performance.) >> > >> > In normal production work? Yes (the SystemRandom object doesn't have >> > any significant state - a seeded RNG could have a lot more overhead >> > here). But for performance testing? The work of instantiating the >> > class could be completely irrelevant, or it could be dominating your >> > results. It's hard to say, hence the suggestion to try it without >> > reinstantiating. >> >> It had a very big influence. Original it took about three times more >> time to run my program. (The program was still running when I posted >> the original post and the difference was higher as I anticipated.) >> Removing that did cut about 45% of the execution time of the program. >> (So the initiation is quit expensive.) >> But it still takes about 50% more time. So I am still a bit >> flabbergasted. >> >> The new code: >> from random import SystemRandom >> system_random = SystemRandom() >> index = system_random.randint(0, len(to_try) - 1) >> >> The first two statements are executed once. >> The last statement I think about 75 * 10 ** 6. >> >> So it seems that my first idea of using randbelow was the correct one. >> But if anyone could explain why SystemRandom is so much more >> expensive, I would be interested to know it. >> (Or am I still doing something wrong?) > > Hmm. There are still a lot of differences here. Are you able to make > use of randrange() instead, to make them more consistent? > > According to the source code, secrets.randbelow is calling on an > internal method _randbelow of the SystemRandom object, but randrange > (if called with only one arg) will go straight into that same method. > Here's my results: > > rosuav@sikorsky:~$ python3 -m timeit -s 'from random import randrange' > 'randrange(1)' > 100 loops, best of 5: 322 nsec per loop > rosuav@sikorsky:~$ python3 -m timeit -s 'from random import > SystemRandom; r = SystemRandom()' 'r.randint(0, 1)' > 20 loops, best of 5: 1.92 usec per loop > rosuav@sikorsky:~$ python3 -m timeit -s 'from random import > SystemRandom; r = SystemRandom()' 'r.randrange(1)' > 20 loops, best of 5: 1.87 usec per loop > rosuav@sikorsky:~$ python3 -m timeit -s 'from secrets import > randbelow' 'randbelow(1)' > 20 loops, best of 5: 1.64 usec per loop > > (The difference with the first one is that it isn't using the system > RNG, so it has the limitations of an internal PRNG.) > > When you call randint, what happens is (1) the endpoint is incremented > to transform it from inclusive-inclusive to inclusive-exclusive; (2) > randrange is called with two args; (3) fast path 1 is skipped, fast > path 2 is taken, and _randbelow gets called to get an actual random > number, which gets zero added to it before returning. > > If, instead, you use randrange(len(to_try)), what would happen is (1) > fast path 1 is used, and (2) _randbelow is called to get the random > number. > > In secrets.randbelow, it's even better: (1) _randbelow is called to > get the random number. > > I think that might be what's going on here. You're adding some tiny > amounts of extra work, but if you were to make them more similar, the > distinctions would evaporate. Here's a couple of other variants that > use SystemRandom: > > rosuav@sikorsky:~$ python3 -m timeit -s 'from random import > SystemRandom; randrange = SystemRandom().randrange'
Re: random.SystemRandom().randint() inefficient
Barry writes: >> On 26 Jul 2022, at 16:07, Cecil Westerhof via Python-list >> wrote: >> >> I need to get a random integer. At first I tried it with: >>from secrets import randbelow >>index = randbelow(len(to_try)) >> >> This works perfectly, but it took some time. So I thought I try: >>from random import SystemRandom >>index = SystemRandom().randint(0, len(to_try) - 1) >> >> A first indication is that the second version would take about two >> times as much time as the first. Is there a reason for this, or should >> this not be happening? > > What is the OS that you are running on and its version? > If it’s linux what is the kernel version? > What version of python and where from? That is always good information of-course. Debian 11.3 5.10.0-13-amd64 3.9.2 What do you mean with where the python version is from? -- Cecil Westerhof Senior Software Engineer LinkedIn: http://www.linkedin.com/in/cecilwesterhof -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
"Michael F. Stemper" writes: > This is orthogonal to your question, but might be of some use to you: > > The combination of using len(to_try) as an argument to randint() and > saving the output to a variable named "index" suggests that you might > be setting up to select a random element from to_try, as in: > something = to_try[index] > > If that is the case, you might want to consider using random.choice() instead: > > >>> from random import choice > >>> to_try = [2,3,5,7,11,13,"seventeen",19] > >>> choice(to_try) > 2 > >>> choice(to_try) > 'seventeen' > >>> choice(to_try) > 13 > >>> choice(to_try) > 5 > >>> Yes, I try to select a random element, but it has also to be removed, because an element should not be used more as once. This is the code I use: # index = randbelow(len(to_try)) index = randrange(len(to_try)) found = permutation[to_try.pop(index)] -- Cecil Westerhof Senior Software Engineer LinkedIn: http://www.linkedin.com/in/cecilwesterhof -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
> On 26 Jul 2022, at 16:07, Cecil Westerhof via Python-list > wrote: > > I need to get a random integer. At first I tried it with: >from secrets import randbelow >index = randbelow(len(to_try)) > > This works perfectly, but it took some time. So I thought I try: >from random import SystemRandom >index = SystemRandom().randint(0, len(to_try) - 1) > > A first indication is that the second version would take about two > times as much time as the first. Is there a reason for this, or should > this not be happening? What is the OS that you are running on and its version? If it’s linux what is the kernel version? What version of python and where from? Barry > > -- > Cecil Westerhof > Senior Software Engineer > LinkedIn: http://www.linkedin.com/in/cecilwesterhof > -- > https://mail.python.org/mailman/listinfo/python-list > -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
On Wed, 27 Jul 2022 at 09:28, Dennis Lee Bieber wrote: > > On Tue, 26 Jul 2022 16:38:38 +0200, Cecil Westerhof > declaimed the following: > > >I need to get a random integer. At first I tried it with: > >from secrets import randbelow > >index = randbelow(len(to_try)) > > > >This works perfectly, but it took some time. So I thought I try: > >from random import SystemRandom > >index = SystemRandom().randint(0, len(to_try) - 1) > > > >A first indication is that the second version would take about two > >times as much time as the first. Is there a reason for this, or should > >this not be happening? > > Well, off the top of my head... > > For one generation of "index" you are first creating an instance of > SystemRandom(), using it to generate your random integer, and then > disposing of the instance. > > If you only need ONE random integer, the time difference probably > doesn't matter. OTOH, if you need many during the run, using > > sr = SystemRandom() > #stuff in some loop that generates multiple ints > index = sr.randint(...) > > Hmmm, wonder if there is a speed difference between > .randint(0, len(to_try) - 1) > and > .randint(1, len(to_try)) - 1 > Probably not significant, since the same amount of arithmetic gets done either way. But switching to single-arg randrange(len(to_try)) will definitely help, and IMO is clearer as well (since the implication is selecting one from a group of items). Incidentally - if you are actually trying to select a specific item, you may want to consider random.choice. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
On Tue, 26 Jul 2022 23:47:59 +0200, Cecil Westerhof declaimed the following: >The new code: >from random import SystemRandom >system_random = SystemRandom() >index = system_random.randint(0, len(to_try) - 1) > >The first two statements are executed once. >The last statement I think about 75 * 10 ** 6. > >So it seems that my first idea of using randbelow was the correct one. >But if anyone could explain why SystemRandom is so much more >expensive, I would be interested to know it. >(Or am I still doing something wrong?) What happens with system_randint = SystemRandom().randint #no parens index = system_randint(...) which may remove the method lookup from the repetition. -- Wulfraed Dennis Lee Bieber AF6VN wlfr...@ix.netcom.comhttp://wlfraed.microdiversity.freeddns.org/ -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
On Tue, 26 Jul 2022 16:38:38 +0200, Cecil Westerhof declaimed the following: >I need to get a random integer. At first I tried it with: >from secrets import randbelow >index = randbelow(len(to_try)) > >This works perfectly, but it took some time. So I thought I try: >from random import SystemRandom >index = SystemRandom().randint(0, len(to_try) - 1) > >A first indication is that the second version would take about two >times as much time as the first. Is there a reason for this, or should >this not be happening? Well, off the top of my head... For one generation of "index" you are first creating an instance of SystemRandom(), using it to generate your random integer, and then disposing of the instance. If you only need ONE random integer, the time difference probably doesn't matter. OTOH, if you need many during the run, using sr = SystemRandom() #stuff in some loop that generates multiple ints index = sr.randint(...) Hmmm, wonder if there is a speed difference between .randint(0, len(to_try) - 1) and .randint(1, len(to_try)) - 1 -- Wulfraed Dennis Lee Bieber AF6VN wlfr...@ix.netcom.comhttp://wlfraed.microdiversity.freeddns.org/ -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
On Wed, 27 Jul 2022 at 08:18, Cecil Westerhof via Python-list wrote: > > Chris Angelico writes: > > > On Wed, 27 Jul 2022 at 06:06, Cecil Westerhof via Python-list > > wrote: > >> > >> Chris Angelico writes: > >> > >> > On Wed, 27 Jul 2022 at 01:06, Cecil Westerhof via Python-list > >> > wrote: > >> >> > >> >> I need to get a random integer. At first I tried it with: > >> >> from secrets import randbelow > >> >> index = randbelow(len(to_try)) > >> >> > >> >> This works perfectly, but it took some time. So I thought I try: > >> >> from random import SystemRandom > >> >> index = SystemRandom().randint(0, len(to_try) - 1) > >> >> > >> >> A first indication is that the second version would take about two > >> >> times as much time as the first. Is there a reason for this, or should > >> >> this not be happening? > >> >> > >> > > >> > You're setting up a brand new SystemRandom instance just for a single > >> > random number. For a fairer comparison, set up the instance, then > >> > generate far more than just a single number, and see how that goes. > >> > >> Thanks. I thought I did something wrong and I did. > >> I will try to implement like you said and look what the result will > >> be. (And share it.) > > > > Thanks! Don't feel bad; performance testing is *hard*, getting > > meaningful results takes a lot of of fiddling with parameters, and > > getting interesting AND meaningful results can sometimes seem about > > impossible. > > > >> (As I understand it both do more, or less the same and should have > >> comparable performance.) > > > > In normal production work? Yes (the SystemRandom object doesn't have > > any significant state - a seeded RNG could have a lot more overhead > > here). But for performance testing? The work of instantiating the > > class could be completely irrelevant, or it could be dominating your > > results. It's hard to say, hence the suggestion to try it without > > reinstantiating. > > It had a very big influence. Original it took about three times more > time to run my program. (The program was still running when I posted > the original post and the difference was higher as I anticipated.) > Removing that did cut about 45% of the execution time of the program. > (So the initiation is quit expensive.) > But it still takes about 50% more time. So I am still a bit > flabbergasted. > > The new code: > from random import SystemRandom > system_random = SystemRandom() > index = system_random.randint(0, len(to_try) - 1) > > The first two statements are executed once. > The last statement I think about 75 * 10 ** 6. > > So it seems that my first idea of using randbelow was the correct one. > But if anyone could explain why SystemRandom is so much more > expensive, I would be interested to know it. > (Or am I still doing something wrong?) Hmm. There are still a lot of differences here. Are you able to make use of randrange() instead, to make them more consistent? According to the source code, secrets.randbelow is calling on an internal method _randbelow of the SystemRandom object, but randrange (if called with only one arg) will go straight into that same method. Here's my results: rosuav@sikorsky:~$ python3 -m timeit -s 'from random import randrange' 'randrange(1)' 100 loops, best of 5: 322 nsec per loop rosuav@sikorsky:~$ python3 -m timeit -s 'from random import SystemRandom; r = SystemRandom()' 'r.randint(0, 1)' 20 loops, best of 5: 1.92 usec per loop rosuav@sikorsky:~$ python3 -m timeit -s 'from random import SystemRandom; r = SystemRandom()' 'r.randrange(1)' 20 loops, best of 5: 1.87 usec per loop rosuav@sikorsky:~$ python3 -m timeit -s 'from secrets import randbelow' 'randbelow(1)' 20 loops, best of 5: 1.64 usec per loop (The difference with the first one is that it isn't using the system RNG, so it has the limitations of an internal PRNG.) When you call randint, what happens is (1) the endpoint is incremented to transform it from inclusive-inclusive to inclusive-exclusive; (2) randrange is called with two args; (3) fast path 1 is skipped, fast path 2 is taken, and _randbelow gets called to get an actual random number, which gets zero added to it before returning. If, instead, you use randrange(len(to_try)), what would happen is (1) fast path 1 is used, and (2) _randbelow is called to get the random number. In secrets.randbelow, it's even better: (1) _randbelow is called to get the random number. I think that might be what's going on here. You're adding some tiny amounts of extra work, but if you were to make them more similar, the distinctions would evaporate. Here's a couple of other variants that use SystemRandom: rosuav@sikorsky:~$ python3 -m timeit -s 'from random import SystemRandom; randrange = SystemRandom().randrange' 'randrange(1)' 20 loops, best of 5: 1.81 usec per loop rosuav@sikorsky:~$ python3 -m timeit -s 'from random import SystemRandom; randrange = SystemRandom()._randbelow'
Re: random.SystemRandom().randint() inefficient
Chris Angelico writes: > On Wed, 27 Jul 2022 at 06:06, Cecil Westerhof via Python-list > wrote: >> >> Chris Angelico writes: >> >> > On Wed, 27 Jul 2022 at 01:06, Cecil Westerhof via Python-list >> > wrote: >> >> >> >> I need to get a random integer. At first I tried it with: >> >> from secrets import randbelow >> >> index = randbelow(len(to_try)) >> >> >> >> This works perfectly, but it took some time. So I thought I try: >> >> from random import SystemRandom >> >> index = SystemRandom().randint(0, len(to_try) - 1) >> >> >> >> A first indication is that the second version would take about two >> >> times as much time as the first. Is there a reason for this, or should >> >> this not be happening? >> >> >> > >> > You're setting up a brand new SystemRandom instance just for a single >> > random number. For a fairer comparison, set up the instance, then >> > generate far more than just a single number, and see how that goes. >> >> Thanks. I thought I did something wrong and I did. >> I will try to implement like you said and look what the result will >> be. (And share it.) > > Thanks! Don't feel bad; performance testing is *hard*, getting > meaningful results takes a lot of of fiddling with parameters, and > getting interesting AND meaningful results can sometimes seem about > impossible. > >> (As I understand it both do more, or less the same and should have >> comparable performance.) > > In normal production work? Yes (the SystemRandom object doesn't have > any significant state - a seeded RNG could have a lot more overhead > here). But for performance testing? The work of instantiating the > class could be completely irrelevant, or it could be dominating your > results. It's hard to say, hence the suggestion to try it without > reinstantiating. It had a very big influence. Original it took about three times more time to run my program. (The program was still running when I posted the original post and the difference was higher as I anticipated.) Removing that did cut about 45% of the execution time of the program. (So the initiation is quit expensive.) But it still takes about 50% more time. So I am still a bit flabbergasted. The new code: from random import SystemRandom system_random = SystemRandom() index = system_random.randint(0, len(to_try) - 1) The first two statements are executed once. The last statement I think about 75 * 10 ** 6. So it seems that my first idea of using randbelow was the correct one. But if anyone could explain why SystemRandom is so much more expensive, I would be interested to know it. (Or am I still doing something wrong?) -- Cecil Westerhof Senior Software Engineer LinkedIn: http://www.linkedin.com/in/cecilwesterhof -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
On Wed, 27 Jul 2022 at 06:06, Cecil Westerhof via Python-list wrote: > > Chris Angelico writes: > > > On Wed, 27 Jul 2022 at 01:06, Cecil Westerhof via Python-list > > wrote: > >> > >> I need to get a random integer. At first I tried it with: > >> from secrets import randbelow > >> index = randbelow(len(to_try)) > >> > >> This works perfectly, but it took some time. So I thought I try: > >> from random import SystemRandom > >> index = SystemRandom().randint(0, len(to_try) - 1) > >> > >> A first indication is that the second version would take about two > >> times as much time as the first. Is there a reason for this, or should > >> this not be happening? > >> > > > > You're setting up a brand new SystemRandom instance just for a single > > random number. For a fairer comparison, set up the instance, then > > generate far more than just a single number, and see how that goes. > > Thanks. I thought I did something wrong and I did. > I will try to implement like you said and look what the result will > be. (And share it.) Thanks! Don't feel bad; performance testing is *hard*, getting meaningful results takes a lot of of fiddling with parameters, and getting interesting AND meaningful results can sometimes seem about impossible. > (As I understand it both do more, or less the same and should have > comparable performance.) In normal production work? Yes (the SystemRandom object doesn't have any significant state - a seeded RNG could have a lot more overhead here). But for performance testing? The work of instantiating the class could be completely irrelevant, or it could be dominating your results. It's hard to say, hence the suggestion to try it without reinstantiating. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
Chris Angelico writes: > On Wed, 27 Jul 2022 at 01:06, Cecil Westerhof via Python-list > wrote: >> >> I need to get a random integer. At first I tried it with: >> from secrets import randbelow >> index = randbelow(len(to_try)) >> >> This works perfectly, but it took some time. So I thought I try: >> from random import SystemRandom >> index = SystemRandom().randint(0, len(to_try) - 1) >> >> A first indication is that the second version would take about two >> times as much time as the first. Is there a reason for this, or should >> this not be happening? >> > > You're setting up a brand new SystemRandom instance just for a single > random number. For a fairer comparison, set up the instance, then > generate far more than just a single number, and see how that goes. Thanks. I thought I did something wrong and I did. I will try to implement like you said and look what the result will be. (And share it.) (As I understand it both do more, or less the same and should have comparable performance.) -- Cecil Westerhof Senior Software Engineer LinkedIn: http://www.linkedin.com/in/cecilwesterhof -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
On Wed, 27 Jul 2022 at 01:06, Cecil Westerhof via Python-list wrote: > > I need to get a random integer. At first I tried it with: > from secrets import randbelow > index = randbelow(len(to_try)) > > This works perfectly, but it took some time. So I thought I try: > from random import SystemRandom > index = SystemRandom().randint(0, len(to_try) - 1) > > A first indication is that the second version would take about two > times as much time as the first. Is there a reason for this, or should > this not be happening? > You're setting up a brand new SystemRandom instance just for a single random number. For a fairer comparison, set up the instance, then generate far more than just a single number, and see how that goes. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: random.SystemRandom().randint() inefficient
Absolutely. The task (“generate a random number”) is ill-defined. Want a fast random number? Use 552015933 (I just retrieved it from random.org). Want a true random number, use random.org API. (https://api.random.org/pricing). Something in between, follow approaches Stefan suggests. — Gerard Weatherby | Application Architect NMRbox | NAN | Department of Molecular Biology and Biophysics UConn Health 263 Farmington Avenue, Farmington, CT 06030-6406 uchc.edu On Jul 26, 2022, 11:45 AM -0400, Stefan Ram , wrote: *** Attention: This is an external email. Use caution responding, opening attachments or clicking on links. *** Cecil Westerhof writes: I need to get a random integer. There are all kinds of trade-offs. When you need entropy, the function sometimes has to collect data from hardware, which takes some time. Pseudo-random integers sometimes can be calculated faster. As a compromise, you can get some entropy and use it to initialize ("seed") a pseudo-random number generator and then get some pseudo-random numbers until you seed it again. A linear-congruential generator should be fast enough, but in some cases might not be random enough. -- https://urldefense.com/v3/__https://mail.python.org/mailman/listinfo/python-list__;!!Cn_UX_p3!kjJ7U6kkSW8uOTWJCnHeXn0Acczd619asmIrVMA_NOqvP_JMqqQ1Rpp61sptQucbOuEFGb6ot499_FqaRajiS9s$ -- https://mail.python.org/mailman/listinfo/python-list
random.SystemRandom().randint() inefficient
I need to get a random integer. At first I tried it with: from secrets import randbelow index = randbelow(len(to_try)) This works perfectly, but it took some time. So I thought I try: from random import SystemRandom index = SystemRandom().randint(0, len(to_try) - 1) A first indication is that the second version would take about two times as much time as the first. Is there a reason for this, or should this not be happening? -- Cecil Westerhof Senior Software Engineer LinkedIn: http://www.linkedin.com/in/cecilwesterhof -- https://mail.python.org/mailman/listinfo/python-list