Re: Simple TCP proxy

2022-07-27 Thread Martin Di Paola



On Wed, Jul 27, 2022 at 08:32:31PM +0200, Morten W. Petersen wrote:

You're thinking of the backlog argument of listen?


From my understanding, yes, when you set up the "accepter" socket (the
one that you use to listen and accept new connections), you can define
the length of the queue for incoming connections that are not accepted
yet.

This will be the equivalent of your SimpleQueue which basically puts a
limits on how many incoming connections are "accepted" to do a real job.

Using skt.listen(N) the incoming connections are put on hold by the OS
while in your implementation are formally accepted but they are not
allowed to do any meaningful work: they are put on the SimpleQueue and
only when they are popped then they will work (send/recv data).

The difference then between the OS and your impl is minimal. The only
case that I can think is that on the clients' side it may exist a
timeout for the acceptance of the connection so your proxy server will
eagerly accept these connections so no timeout is possible(*)

On a side note, you implementation is too thread-naive: it uses plain
Python lists, integers and boolean variables which are not thread safe.
It is a matter of time until your server will start behave weird.

One option is that you use thread-safe objects. I'll encourage to read
about thread-safety in general and then which sync mechanisms Python
offers.

Another option is to remove the SimpleQueue and the background function
that allows a connection to be "active".

If you think, the handlers are 99% independent except that you want to
allow only N of them to progress (stablish and forward the connection)
and when a handler finishes, another handler "waiting" is activated, "in
a queue fashion" as you said.

If you allow me to not have a strict queue discipline here, you can achieve
the same results coordinating the handlers using semaphores. Once again,
take this email as starting point for your own research.

On a second side note, the use of handlers and threads is inefficient
because while you have N active handlers sending/receiving data, because
you are eagerly accepting new connections you will have much more
handlers created and (if I'm not wrong), each will be a thread.

A more efficient solution could be

1) accept as many connections as you can, saving the socket (not the
handler) in the thread-safe queue.
2) have N threads in the background popping from the queue a socket and
then doing the send/recv stuff. When the thread is done, the thread
closes the socket and pops another from the queue.

So the queue length will be the count of accepted connections but in any
moment your proxy will not activate (forward) more than N connections.

This idea is thread-safe, simpler, efficient and has the queue
discipline (I leave aside the usefulness).

I encourage you to take time to read about the different things
mentioned as concurrency and thread-related stuff is not easy to
master.

Thanks,
Martin.

(*) make your proxy server slow enough and yes, you will get timeouts
anyways.



Well, STP will accept all connections, but can limit how many of the
accepted connections that are active at any given time.

So when I bombed it with hundreds of almost simultaneous connections, all
of them were accepted, but only 25 were actively sending and receiving data
at any given time. First come, first served.

Regards,

Morten

On Wed, Jul 27, 2022 at 8:00 PM Chris Angelico  wrote:


On Thu, 28 Jul 2022 at 02:15, Morten W. Petersen 
wrote:
>
> Hi.
>
> I'd like to share with you a recent project, which is a simple TCP proxy
> that can stand in front of a TCP server of some sort, queueing requests
and
> then allowing n number of connections to pass through at a time:

How's this different from what the networking subsystem already does?
When you listen, you can set a queue length. Can you elaborate?

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list




--
I am https://leavingnorway.info
Videos at https://www.youtube.com/user/TheBlogologue
Twittering at http://twitter.com/blogologue
Blogging at http://blogologue.com
Playing music at https://soundcloud.com/morten-w-petersen
Also playing music and podcasting here:
http://www.mixcloud.com/morten-w-petersen/
On Google+ here https://plus.google.com/107781930037068750156
On Instagram at https://instagram.com/morphexx/
--

I am https://leavingnorway.info

Videos at https://www.youtube.com/user/TheBlogologue
Twittering at http://twitter.com/blogologue

Blogging at http://blogologue.com
Playing music at https://soundcloud.com/morten-w-petersen

Also playing music and podcasting here:
http://www.mixcloud.com/morten-w-petersen/

On Instagram at https://instagram.com/morphexx/
--
https://mail.python.org/mailman/listinfo/python-list

--
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Chris Angelico
On Thu, 28 Jul 2022 at 05:36, Cecil Westerhof via Python-list
 wrote:
>
> Roel Schroeven  writes:
>
> > Cecil Westerhof via Python-list schreef op 27/07/2022 om 17:43:
> >> "Michael F. Stemper"  writes:
> >>
> >> > This is orthogonal to your question, but might be of some use to you:
> >> >
> >> > The combination of using len(to_try) as an argument to randint() and
> >> > saving the output to a variable named "index" suggests that you might
> >> > be setting up to select a random element from to_try, as in:
> >> >   something = to_try[index]
> >> >
> >> > If that is the case, you might want to consider using random.choice() 
> >> > instead:
> >> >
> >> >   >>> from random import choice
> >> >   >>> to_try = [2,3,5,7,11,13,"seventeen",19]
> >> >   >>> choice(to_try)
> >> >   2
> >> >   >>> choice(to_try)
> >> >   'seventeen'
> >> >   >>> choice(to_try)
> >> >   13
> >> >   >>> choice(to_try)
> >> >   5
> >> >   >>>
> >>
> >> Yes, I try to select a random element, but it has also to be removed,
> >> because an element should not be used more as once.
> >> This is the code I use:
> >>  # index = randbelow(len(to_try))
> >>  index = randrange(len(to_try))
> >>  found = permutation[to_try.pop(index)]
> > Do you know in advance how many items you'll need, or maybe an upper
> > limit on the amount? In that case it might be more efficient to use
> > random.sample(to_try, k=nr_items_needed).
>
> Something else to try. :-)
> And yes: I will be using half of the list.
>

A perfect job for random.sample() then, if that's *exactly* half the list.

But if you don't know for sure how many you'll need, an easy way to
make it more efficient would be to take the n'th element, then move
the last element down to the n'th position, and finally pop off the
last element. That way, you only ever pop the last element away. The
list will get reordered arbitrarily while you do this, but if the only
purpose of it is to remove elements from random consideration, that
won't be a problem.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple TCP proxy

2022-07-27 Thread Chris Angelico
On Thu, 28 Jul 2022 at 04:32, Morten W. Petersen  wrote:
>
> Hi Chris.
>
> You're thinking of the backlog argument of listen?

Yes, precisely.

> Well, STP will accept all connections, but can limit how many of the accepted 
> connections that are active at any given time.
>
> So when I bombed it with hundreds of almost simultaneous connections, all of 
> them were accepted, but only 25 were actively sending and receiving data at 
> any given time. First come, first served.
>

Hmm. Okay. Not sure what the advantage is, but sure.

If the server's capable of handling the total requests-per-minute,
then a queueing system like this should help with burst load, although
I would have thought that the listen backlog would do the same. What
happens if the server actually gets overloaded though? Do connections
get disconnected after appearing connected? What's the disconnect
mode?

BTW, you probably don't want to be using the _thread module - Python
has a threading module which is better suited to this sort of work.
Although you may want to consider asyncio instead, as that has far
lower overhead when working with large numbers of sockets.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Alan Bawden
Cecil Westerhof  writes:

   Alan Bawden  writes:

   > Cecil Westerhof  writes:
   >
   >Yes, I try to select a random element, but it has also to be removed,
   >because an element should not be used more as once.
   >
   > Instead of using pop to do that why not something like:
   >
   > def lazy_shuffle(seq):
   > """
   > Generate the elements of the given sequence in a random order.
   > """
   > # Delete the next line if you want to use a different version of
   > # randrange:
   > from random import randrange
   > # Delete the next line if SEQ is already a mutable sequence and you
   > # are willing to have it destroyed by this process:
   > seq = list(seq)
   > n = len(seq)
   > while n:
   > i = randrange(n)
   > yield seq[i]
   > n -= 1
   > if i < n:
   > seq[i] = seq[n]

   That looks interesting.
   But there is on problem. (I think.)
   The list is only partly eaten and I will eat a lot of sequences. So a
   lot of sequences will be left in the runtime.
   Or is there a way to destroy the lazy_shuffle when it is not needed
   anymore?

You don't have to worry about that.  That's what Garbage Collection is
for.  After you drop the last reference to the generator, the GC will
destroy it for you.  Welcome to Python.

BTW, what I wrote might be slightly faster if you replace it with:

while n:
i = randrange(n)
yield seq[i]
n -= 1
seq[i] = seq[n]

That will save you some time spent testing and branching.

-- 
Alan Bawden
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Cecil Westerhof via Python-list
Alan Bawden  writes:

> Cecil Westerhof  writes:
>
>Yes, I try to select a random element, but it has also to be removed,
>because an element should not be used more as once.
>
> Instead of using pop to do that why not something like:
>
> def lazy_shuffle(seq):
> """
> Generate the elements of the given sequence in a random order.
> """
> # Delete the next line if you want to use a different version of
> # randrange:
> from random import randrange
> # Delete the next line if SEQ is already a mutable sequence and you
> # are willing to have it destroyed by this process:
> seq = list(seq)
> n = len(seq)
> while n:
> i = randrange(n)
> yield seq[i]
> n -= 1
> if i < n:
> seq[i] = seq[n]

That looks interesting.
But there is on problem. (I think.)
The list is only partly eaten and I will eat a lot of sequences. So a
lot of sequences will be left in the runtime.
Or is there a way to destroy the lazy_shuffle when it is not needed
anymore?

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Cecil Westerhof via Python-list
MRAB  writes:

>>> When you pop an element from the last, the elements after it need to be
>>> moved down, which takes time.
>>>
>>> Try shuffling the list and then popping the now randomly-ordered
>>> elements off the end.
>> Would shuffling not be a lot more expensive? Especially because I do
>> not eat the whole list.
>> 
> You won't know until you time it.

A first indication is that it doubles the needed time.

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Cecil Westerhof via Python-list
Roel Schroeven  writes:

> Cecil Westerhof via Python-list schreef op 27/07/2022 om 17:43:
>> "Michael F. Stemper"  writes:
>>
>> > This is orthogonal to your question, but might be of some use to you:
>> >
>> > The combination of using len(to_try) as an argument to randint() and
>> > saving the output to a variable named "index" suggests that you might
>> > be setting up to select a random element from to_try, as in:
>> >   something = to_try[index]
>> >
>> > If that is the case, you might want to consider using random.choice() 
>> > instead:
>> >
>> >   >>> from random import choice
>> >   >>> to_try = [2,3,5,7,11,13,"seventeen",19]
>> >   >>> choice(to_try)
>> >   2
>> >   >>> choice(to_try)
>> >   'seventeen'
>> >   >>> choice(to_try)
>> >   13
>> >   >>> choice(to_try)
>> >   5
>> >   >>>
>>
>> Yes, I try to select a random element, but it has also to be removed,
>> because an element should not be used more as once.
>> This is the code I use:
>>  # index = randbelow(len(to_try))
>>  index = randrange(len(to_try))
>>  found = permutation[to_try.pop(index)]
> Do you know in advance how many items you'll need, or maybe an upper
> limit on the amount? In that case it might be more efficient to use 
> random.sample(to_try, k=nr_items_needed).

Something else to try. :-)
And yes: I will be using half of the list.

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread MRAB

On 27/07/2022 18:24, Cecil Westerhof via Python-list wrote:

MRAB  writes:


On 27/07/2022 16:43, Cecil Westerhof via Python-list wrote:

"Michael F. Stemper"  writes:


This is orthogonal to your question, but might be of some use to you:

The combination of using len(to_try) as an argument to randint() and
saving the output to a variable named "index" suggests that you might
be setting up to select a random element from to_try, as in:
  something = to_try[index]

If that is the case, you might want to consider using random.choice() instead:

  >>> from random import choice
  >>> to_try = [2,3,5,7,11,13,"seventeen",19]
  >>> choice(to_try)
  2
  >>> choice(to_try)
  'seventeen'
  >>> choice(to_try)
  13
  >>> choice(to_try)
  5
  >>>

Yes, I try to select a random element, but it has also to be removed,
because an element should not be used more as once.
This is the code I use:
 # index = randbelow(len(to_try))
 index = randrange(len(to_try))
 found = permutation[to_try.pop(index)]



When you pop an element from the last, the elements after it need to be
moved down, which takes time.

Try shuffling the list and then popping the now randomly-ordered
elements off the end.


Would shuffling not be a lot more expensive? Especially because I do
not eat the whole list.


You won't know until you time it.
--
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Alan Bawden
Cecil Westerhof  writes:

   Yes, I try to select a random element, but it has also to be removed,
   because an element should not be used more as once.

Instead of using pop to do that why not something like:

def lazy_shuffle(seq):
"""
Generate the elements of the given sequence in a random order.
"""
# Delete the next line if you want to use a different version of
# randrange:
from random import randrange
# Delete the next line if SEQ is already a mutable sequence and you
# are willing to have it destroyed by this process:
seq = list(seq)
n = len(seq)
while n:
i = randrange(n)
yield seq[i]
n -= 1
if i < n:
seq[i] = seq[n]

-- 
Alan Bawden
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Roel Schroeven

Cecil Westerhof via Python-list schreef op 27/07/2022 om 17:43:

"Michael F. Stemper"  writes:

> This is orthogonal to your question, but might be of some use to you:
>
> The combination of using len(to_try) as an argument to randint() and
> saving the output to a variable named "index" suggests that you might
> be setting up to select a random element from to_try, as in:
>   something = to_try[index]
>
> If that is the case, you might want to consider using random.choice() instead:
>
>   >>> from random import choice
>   >>> to_try = [2,3,5,7,11,13,"seventeen",19]
>   >>> choice(to_try)
>   2
>   >>> choice(to_try)
>   'seventeen'
>   >>> choice(to_try)
>   13
>   >>> choice(to_try)
>   5
>   >>>

Yes, I try to select a random element, but it has also to be removed,
because an element should not be used more as once.
This is the code I use:
 # index = randbelow(len(to_try))
 index = randrange(len(to_try))
 found = permutation[to_try.pop(index)]
Do you know in advance how many items you'll need, or maybe an upper 
limit on the amount? In that case it might be more efficient to use 
random.sample(to_try, k=nr_items_needed).


--
"Honest criticism is hard to take, particularly from a relative, a friend,
an acquaintance, or a stranger."
-- Franklin P. Jones

--
https://mail.python.org/mailman/listinfo/python-list


Re: Simple TCP proxy

2022-07-27 Thread Morten W. Petersen
Hi Chris.

You're thinking of the backlog argument of listen?

Well, STP will accept all connections, but can limit how many of the
accepted connections that are active at any given time.

So when I bombed it with hundreds of almost simultaneous connections, all
of them were accepted, but only 25 were actively sending and receiving data
at any given time. First come, first served.

Regards,

Morten

On Wed, Jul 27, 2022 at 8:00 PM Chris Angelico  wrote:

> On Thu, 28 Jul 2022 at 02:15, Morten W. Petersen 
> wrote:
> >
> > Hi.
> >
> > I'd like to share with you a recent project, which is a simple TCP proxy
> > that can stand in front of a TCP server of some sort, queueing requests
> and
> > then allowing n number of connections to pass through at a time:
>
> How's this different from what the networking subsystem already does?
> When you listen, you can set a queue length. Can you elaborate?
>
> ChrisA
> --
> https://mail.python.org/mailman/listinfo/python-list
>


-- 
I am https://leavingnorway.info
Videos at https://www.youtube.com/user/TheBlogologue
Twittering at http://twitter.com/blogologue
Blogging at http://blogologue.com
Playing music at https://soundcloud.com/morten-w-petersen
Also playing music and podcasting here:
http://www.mixcloud.com/morten-w-petersen/
On Google+ here https://plus.google.com/107781930037068750156
On Instagram at https://instagram.com/morphexx/
--

I am https://leavingnorway.info

Videos at https://www.youtube.com/user/TheBlogologue
Twittering at http://twitter.com/blogologue

Blogging at http://blogologue.com
Playing music at https://soundcloud.com/morten-w-petersen

Also playing music and podcasting here:
http://www.mixcloud.com/morten-w-petersen/

On Instagram at https://instagram.com/morphexx/
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: More efficient code, but slower program

2022-07-27 Thread Cecil Westerhof via Python-list
2qdxy4rzwzuui...@potatochowder.com writes:

> On 2022-07-27 at 17:48:47 +0200,
> Regarding "Re: More efficient code, but slower program,"
> Cecil Westerhof via Python-list  wrote:
>
>> r...@zedat.fu-berlin.de (Stefan Ram) writes:
>> 
>> > Cecil Westerhof  writes:
>> >>values = [*range(100)]
>> >
>> >   In many cases, any iterable is just fine and a list is not
>> >   required, just as peudo-random numbers often are just fine and
>> >   real-world entropy is not required.
>> 
>> In this case both are. I must select (several times) a random element
>> from the list. So I need the list.
>> I also want the randomness to be as good as possible to make the
>> 'simulation' as good as possible.
>
> "[A]s good as possible" for simulations and tests may not require the
> cryptographic quality numbers from SystemRandom.  Many/most pseudo
> random number generators are optimized for statistically normalized
> outputs, and are repeatable as a bonus (again, often a requirement for
> certain types of simulations; YMMV).
>
> Also, what if you shuffled the list first (e.g., with random.shuffle)
> and then iterated through it directly?  Repeatedly removing arbitrary
> elements from the middle of a list is potentially expensive.

Someone else also mentioned that. I think I should put it on my list
of things to do/try.

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Cecil Westerhof via Python-list
MRAB  writes:

> On 27/07/2022 16:43, Cecil Westerhof via Python-list wrote:
>> "Michael F. Stemper"  writes:
>> 
>>> This is orthogonal to your question, but might be of some use to you:
>>>
>>> The combination of using len(to_try) as an argument to randint() and
>>> saving the output to a variable named "index" suggests that you might
>>> be setting up to select a random element from to_try, as in:
>>>   something = to_try[index]
>>>
>>> If that is the case, you might want to consider using random.choice() 
>>> instead:
>>>
>>>   >>> from random import choice
>>>   >>> to_try = [2,3,5,7,11,13,"seventeen",19]
>>>   >>> choice(to_try)
>>>   2
>>>   >>> choice(to_try)
>>>   'seventeen'
>>>   >>> choice(to_try)
>>>   13
>>>   >>> choice(to_try)
>>>   5
>>>   >>>
>> Yes, I try to select a random element, but it has also to be removed,
>> because an element should not be used more as once.
>> This is the code I use:
>>  # index = randbelow(len(to_try))
>>  index = randrange(len(to_try))
>>  found = permutation[to_try.pop(index)]
>> 
>
> When you pop an element from the last, the elements after it need to be
> moved down, which takes time.
>
> Try shuffling the list and then popping the now randomly-ordered
> elements off the end.

Would shuffling not be a lot more expensive? Especially because I do
not eat the whole list.

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: More efficient code, but slower program

2022-07-27 Thread Dennis Lee Bieber
On Wed, 27 Jul 2022 14:58:02 +0200, Cecil Westerhof 
declaimed the following:

>It is not very important, but I am just curious.
>
>Original I had in a program:
>values  = [*range(100)]
>
>But because it is done quite often I expected that initialising:
>range_list  = [*range(100)]
>
>and then use:
>values  = range_list.copy()
>
>Would be more efficient. So I tried:
>timeit('values  = [*range(100)]')
>1.6964535564184189
>
>and:
>timeit('new_values = values.copy()', 'values = [*range(100)]')
>0.6457642465829849
>
Were these done in the same program/session? If so, the first
invocation may be initializing/caching the first 100 integers (Python tends
to keep some number of integers in a permanent cache to speed later access
to common values).

Also rather than * unpacking of the range iterator into a [] list...
just...

>>> list(range(100))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58,
59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77,
78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96,
97, 98, 99]
>>> 

... will do it. 

Also, if your goal is to /remove/ and entry from the list via some
index, you might consider if this is more effective than copying the list
and THEN popping a value.

>>> full = list(range(100))
>>> import random
>>> idx = random.randint(0, len(full))
>>> idx
74
>>> trim = full[:idx] + full[idx+1:]
>>> trim
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58,
59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 75, 76, 77, 78,
79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97,
98, 99]
>>> trim == full
False
>>> 

or

>>> trim2 = full[:]
>>> del trim2[idx]
>>> trim
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58,
59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 75, 76, 77, 78,
79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97,
98, 99]
>>> 

The first does two partial copies skipping the item to be removed, and
joins the results into a new list. The second does a full copy and DELETES
the element to be removed from the copy.

>>> trim3 = full[:]
>>> trim3.remove(idx)
>>> trim3
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58,
59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 75, 76, 77, 78,
79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97,
98, 99]
>>> 

This option works because the list is sequential integers and the index
matches the values in the list (.remove() removes the first MATCHING
element, so if the list can have duplicates is may not remove the one AT
the index position).



-- 
Wulfraed Dennis Lee Bieber AF6VN
wlfr...@ix.netcom.comhttp://wlfraed.microdiversity.freeddns.org/
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Cecil Westerhof via Python-list
Dennis Lee Bieber  writes:

> On Wed, 27 Jul 2022 10:45:47 +0200, Cecil Westerhof 
> declaimed the following:
>
>
>>What do you mean with where the python version is from?
>
>   Base Python.org download, ActiveState package download, Anaconda
> package download, native OS install/extra install via OS repository
> download (Debian/Ubuntu: apt install xxx, where xxx is not the native OS
> Python)

Just the default Debian install.

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Barry


> On 27 Jul 2022, at 17:09, Cecil Westerhof via Python-list 
>  wrote:
> 
> Barry  writes:
> 
 On 26 Jul 2022, at 16:07, Cecil Westerhof via Python-list 
  wrote:
>>> 
>>> I need to get a random integer. At first I tried it with:
>>>   from secrets import randbelow
>>>   index = randbelow(len(to_try))
>>> 
>>> This works perfectly, but it took some time. So I thought I try:
>>>   from random  import SystemRandom
>>>   index = SystemRandom().randint(0, len(to_try) - 1)
>>> 
>>> A first indication is that the second version would take about two
>>> times as much time as the first. Is there a reason for this, or should
>>> this not be happening?
>> 
>> What is the OS that you are running on and its version?
>> If it’s linux what is the kernel version?
>> What version of python and where from?
> 
> That is always good information of-course.
> Debian 11.3
> 5.10.0-13-amd64
> 3.9.2
> 
> What do you mean with where the python version is from?

Because random number generator implementation depends on the
OS since it is SystemRandom() that you are comparing with python’s
Implementation.

Barry
> 
> -- 
> Cecil Westerhof
> Senior Software Engineer
> LinkedIn: http://www.linkedin.com/in/cecilwesterhof
> -- 
> https://mail.python.org/mailman/listinfo/python-list

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple TCP proxy

2022-07-27 Thread Chris Angelico
On Thu, 28 Jul 2022 at 02:15, Morten W. Petersen  wrote:
>
> Hi.
>
> I'd like to share with you a recent project, which is a simple TCP proxy
> that can stand in front of a TCP server of some sort, queueing requests and
> then allowing n number of connections to pass through at a time:

How's this different from what the networking subsystem already does?
When you listen, you can set a queue length. Can you elaborate?

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Mats Wichmann
On 7/27/22 02:45, Cecil Westerhof via Python-list wrote:
> Barry  writes:

>> What version of python and where from?
> 
> That is always good information of-course.
> Debian 11.3
> 5.10.0-13-amd64
> 3.9.2
> 
> What do you mean with where the python version is from?

On Windows, the platform of a large proportion of people asking
questions here, there are many possible Python builds (python.org,
Microsoft Store, Conda, ActiveState, etc.) and it sometimes matters,
thus that tends to be a standard question that gets asked here.  In your
case the implied answer is "distro package".
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Dennis Lee Bieber
On Wed, 27 Jul 2022 10:45:47 +0200, Cecil Westerhof 
declaimed the following:


>What do you mean with where the python version is from?

Base Python.org download, ActiveState package download, Anaconda
package download, native OS install/extra install via OS repository
download (Debian/Ubuntu: apt install xxx, where xxx is not the native OS
Python)


-- 
Wulfraed Dennis Lee Bieber AF6VN
wlfr...@ix.netcom.comhttp://wlfraed.microdiversity.freeddns.org/
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread MRAB

On 27/07/2022 16:43, Cecil Westerhof via Python-list wrote:

"Michael F. Stemper"  writes:


This is orthogonal to your question, but might be of some use to you:

The combination of using len(to_try) as an argument to randint() and
saving the output to a variable named "index" suggests that you might
be setting up to select a random element from to_try, as in:
  something = to_try[index]

If that is the case, you might want to consider using random.choice() instead:

  >>> from random import choice
  >>> to_try = [2,3,5,7,11,13,"seventeen",19]
  >>> choice(to_try)
  2
  >>> choice(to_try)
  'seventeen'
  >>> choice(to_try)
  13
  >>> choice(to_try)
  5
  >>>


Yes, I try to select a random element, but it has also to be removed,
because an element should not be used more as once.
This is the code I use:
 # index = randbelow(len(to_try))
 index = randrange(len(to_try))
 found = permutation[to_try.pop(index)]



When you pop an element from the last, the elements after it need to be 
moved down, which takes time.


Try shuffling the list and then popping the now randomly-ordered 
elements off the end.

--
https://mail.python.org/mailman/listinfo/python-list


Re: More efficient code, but slower program

2022-07-27 Thread 2QdxY4RzWzUUiLuE
On 2022-07-27 at 17:48:47 +0200,
Regarding "Re: More efficient code, but slower program,"
Cecil Westerhof via Python-list  wrote:

> r...@zedat.fu-berlin.de (Stefan Ram) writes:
> 
> > Cecil Westerhof  writes:
> >>values = [*range(100)]
> >
> >   In many cases, any iterable is just fine and a list is not
> >   required, just as peudo-random numbers often are just fine and
> >   real-world entropy is not required.
> 
> In this case both are. I must select (several times) a random element
> from the list. So I need the list.
> I also want the randomness to be as good as possible to make the
> 'simulation' as good as possible.

"[A]s good as possible" for simulations and tests may not require the
cryptographic quality numbers from SystemRandom.  Many/most pseudo
random number generators are optimized for statistically normalized
outputs, and are repeatable as a bonus (again, often a requirement for
certain types of simulations; YMMV).

Also, what if you shuffled the list first (e.g., with random.shuffle)
and then iterated through it directly?  Repeatedly removing arbitrary
elements from the middle of a list is potentially expensive.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Michael F. Stemper

On 26/07/2022 16.47, Cecil Westerhof wrote:

Chris Angelico  writes:


On Wed, 27 Jul 2022 at 06:06, Cecil Westerhof via Python-list
 wrote:


Chris Angelico  writes:


On Wed, 27 Jul 2022 at 01:06, Cecil Westerhof via Python-list
 wrote:


I need to get a random integer. At first I tried it with:
 from secrets import randbelow
 index = randbelow(len(to_try))

This works perfectly, but it took some time. So I thought I try:
 from random  import SystemRandom
 index = SystemRandom().randint(0, len(to_try) - 1)

A first indication is that the second version would take about two
times as much time as the first. Is there a reason for this, or should
this not be happening?



You're setting up a brand new SystemRandom instance just for a single
random number. For a fairer comparison, set up the instance, then
generate far more than just a single number, and see how that goes.


Thanks. I thought I did something wrong and I did.
I will try to implement like you said and look what the result will
be. (And share it.)


Thanks! Don't feel bad; performance testing is *hard*, getting
meaningful results takes a lot of of fiddling with parameters, and
getting interesting AND meaningful results can sometimes seem about
impossible.


(As I understand it both do more, or less the same and should have
comparable performance.)


In normal production work? Yes (the SystemRandom object doesn't have
any significant state - a seeded RNG could have a lot more overhead
here). But for performance testing? The work of instantiating the
class could be completely irrelevant, or it could be dominating your
results. It's hard to say, hence the suggestion to try it without
reinstantiating.


It had a very big influence. Original it took about three times more
time to run my program. (The program was still running when I posted
the original post and the difference was higher as I anticipated.)
Removing that did cut about 45% of the execution time of the program.
(So the initiation is quit expensive.)
But it still takes about 50% more time. So I am still a bit
flabbergasted.

The new code:
 from random  import SystemRandom
 system_random   = SystemRandom()
 index = system_random.randint(0, len(to_try) - 1)


This is orthogonal to your question, but might be of some use to you:

The combination of using len(to_try) as an argument to randint() and
saving the output to a variable named "index" suggests that you might
be setting up to select a random element from to_try, as in:
  something = to_try[index]

If that is the case, you might want to consider using random.choice() instead:

  >>> from random import choice
  >>> to_try = [2,3,5,7,11,13,"seventeen",19]
  >>> choice(to_try)
  2
  >>> choice(to_try)
  'seventeen'
  >>> choice(to_try)
  13
  >>> choice(to_try)
  5
  >>>

--
Michael F. Stemper
This sentence no verb.
--
https://mail.python.org/mailman/listinfo/python-list


More efficient code, but slower program

2022-07-27 Thread Cecil Westerhof via Python-list
It is not very important, but I am just curious.

Original I had in a program:
values  = [*range(100)]

But because it is done quite often I expected that initialising:
range_list  = [*range(100)]

and then use:
values  = range_list.copy()

Would be more efficient. So I tried:
timeit('values  = [*range(100)]')
1.6964535564184189

and:
timeit('new_values = values.copy()', 'values = [*range(100)]')
0.6457642465829849

That showed that it should make a positive difference.
But when changing the program it took a little bit more time.
I find the code with the copy a little bit better, so I kept it.
But I am curious why the effect is the opposite of what I expected.
It does not hurt to understand optimisation better, so I can do a
better job when I need it.

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Cecil Westerhof via Python-list
Dennis Lee Bieber  writes:

> On Tue, 26 Jul 2022 23:47:59 +0200, Cecil Westerhof 
> declaimed the following:
>
>
>>The new code:
>>from random  import SystemRandom
>>system_random   = SystemRandom()
>>index = system_random.randint(0, len(to_try) - 1)
>>
>>The first two statements are executed once.
>>The last statement I think about 75 * 10 ** 6.
>>
>>So it seems that my first idea of using randbelow was the correct one.
>>But if anyone could explain why SystemRandom is so much more
>>expensive, I would be interested to know it.
>>(Or am I still doing something wrong?)
>
>   What happens with
>
>   system_randint = SystemRandom().randint #no parens
>
>   index = system_randint(...)
>
> which may remove the method lookup from the repetition.

I had already switched to randrange. This went to 15 minutes from 21
minutes.
By removing the method lookup I could shave off another minute. So
certainly noteworthy. (Should have thought about it myself.)

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Simple TCP proxy

2022-07-27 Thread Morten W. Petersen
Hi.

I'd like to share with you a recent project, which is a simple TCP proxy
that can stand in front of a TCP server of some sort, queueing requests and
then allowing n number of connections to pass through at a time:

https://github.com/morphex/stp

I'll be developing it further, but the the files committed in this tree
seem to be stable:

https://github.com/morphex/stp/tree/9910ca8c80e9d150222b680a4967e53f0457b465

I just bombed that code with 700+ requests almost simultaneously, and STP
handled it well.

Regards,

Morten

-- 
I am https://leavingnorway.info
Videos at https://www.youtube.com/user/TheBlogologue
Twittering at http://twitter.com/blogologue
Blogging at http://blogologue.com
Playing music at https://soundcloud.com/morten-w-petersen
Also playing music and podcasting here:
http://www.mixcloud.com/morten-w-petersen/
On Google+ here https://plus.google.com/107781930037068750156
On Instagram at https://instagram.com/morphexx/
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Cecil Westerhof via Python-list
Chris Angelico  writes:

> Incidentally - if you are actually trying to select a specific item,
> you may want to consider random.choice.

Yes, I try to select a random element, but it has also to be removed.
An element should be used at most once. This is the code I use:
# index = randbelow(len(to_try))
index = randrange(len(to_try))
found = permutation[to_try.pop(index)]

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Cecil Westerhof via Python-list
Chris Angelico  writes:

> On Wed, 27 Jul 2022 at 08:18, Cecil Westerhof via Python-list
>  wrote:
>>
>> Chris Angelico  writes:
>>
>> > On Wed, 27 Jul 2022 at 06:06, Cecil Westerhof via Python-list
>> >  wrote:
>> >>
>> >> Chris Angelico  writes:
>> >>
>> >> > On Wed, 27 Jul 2022 at 01:06, Cecil Westerhof via Python-list
>> >> >  wrote:
>> >> >>
>> >> >> I need to get a random integer. At first I tried it with:
>> >> >> from secrets import randbelow
>> >> >> index = randbelow(len(to_try))
>> >> >>
>> >> >> This works perfectly, but it took some time. So I thought I try:
>> >> >> from random  import SystemRandom
>> >> >> index = SystemRandom().randint(0, len(to_try) - 1)
>> >> >>
>> >> >> A first indication is that the second version would take about two
>> >> >> times as much time as the first. Is there a reason for this, or should
>> >> >> this not be happening?
>> >> >>
>> >> >
>> >> > You're setting up a brand new SystemRandom instance just for a single
>> >> > random number. For a fairer comparison, set up the instance, then
>> >> > generate far more than just a single number, and see how that goes.
>> >>
>> >> Thanks. I thought I did something wrong and I did.
>> >> I will try to implement like you said and look what the result will
>> >> be. (And share it.)
>> >
>> > Thanks! Don't feel bad; performance testing is *hard*, getting
>> > meaningful results takes a lot of of fiddling with parameters, and
>> > getting interesting AND meaningful results can sometimes seem about
>> > impossible.
>> >
>> >> (As I understand it both do more, or less the same and should have
>> >> comparable performance.)
>> >
>> > In normal production work? Yes (the SystemRandom object doesn't have
>> > any significant state - a seeded RNG could have a lot more overhead
>> > here). But for performance testing? The work of instantiating the
>> > class could be completely irrelevant, or it could be dominating your
>> > results. It's hard to say, hence the suggestion to try it without
>> > reinstantiating.
>>
>> It had a very big influence. Original it took about three times more
>> time to run my program. (The program was still running when I posted
>> the original post and the difference was higher as I anticipated.)
>> Removing that did cut about 45% of the execution time of the program.
>> (So the initiation is quit expensive.)
>> But it still takes about 50% more time. So I am still a bit
>> flabbergasted.
>>
>> The new code:
>> from random  import SystemRandom
>> system_random   = SystemRandom()
>> index = system_random.randint(0, len(to_try) - 1)
>>
>> The first two statements are executed once.
>> The last statement I think about 75 * 10 ** 6.
>>
>> So it seems that my first idea of using randbelow was the correct one.
>> But if anyone could explain why SystemRandom is so much more
>> expensive, I would be interested to know it.
>> (Or am I still doing something wrong?)
>
> Hmm. There are still a lot of differences here. Are you able to make
> use of randrange() instead, to make them more consistent?
>
> According to the source code, secrets.randbelow is calling on an
> internal method _randbelow of the SystemRandom object, but randrange
> (if called with only one arg) will go straight into that same method.
> Here's my results:
>
> rosuav@sikorsky:~$ python3 -m timeit -s 'from random import randrange'
> 'randrange(1)'
> 100 loops, best of 5: 322 nsec per loop
> rosuav@sikorsky:~$ python3 -m timeit -s 'from random import
> SystemRandom; r = SystemRandom()' 'r.randint(0, 1)'
> 20 loops, best of 5: 1.92 usec per loop
> rosuav@sikorsky:~$ python3 -m timeit -s 'from random import
> SystemRandom; r = SystemRandom()' 'r.randrange(1)'
> 20 loops, best of 5: 1.87 usec per loop
> rosuav@sikorsky:~$ python3 -m timeit -s 'from secrets import
> randbelow' 'randbelow(1)'
> 20 loops, best of 5: 1.64 usec per loop
>
> (The difference with the first one is that it isn't using the system
> RNG, so it has the limitations of an internal PRNG.)
>
> When you call randint, what happens is (1) the endpoint is incremented
> to transform it from inclusive-inclusive to inclusive-exclusive; (2)
> randrange is called with two args; (3) fast path 1 is skipped, fast
> path 2 is taken, and _randbelow gets called to get an actual random
> number, which gets zero added to it before returning.
>
> If, instead, you use randrange(len(to_try)), what would happen is (1)
> fast path 1 is used, and (2) _randbelow is called to get the random
> number.
>
> In secrets.randbelow, it's even better: (1) _randbelow is called to
> get the random number.
>
> I think that might be what's going on here. You're adding some tiny
> amounts of extra work, but if you were to make them more similar, the
> distinctions would evaporate. Here's a couple of other variants that
> use SystemRandom:
>
> rosuav@sikorsky:~$ python3 -m timeit -s 'from random import
> SystemRandom; randrange = SystemRandom().randrange' 

Re: More efficient code, but slower program

2022-07-27 Thread Cecil Westerhof via Python-list
r...@zedat.fu-berlin.de (Stefan Ram) writes:

> Cecil Westerhof  writes:
>>values = [*range(100)]
>
>   In many cases, any iterable is just fine and a list is not
>   required, just as peudo-random numbers often are just fine and
>   real-world entropy is not required.

In this case both are. I must select (several times) a random element
from the list. So I need the list.
I also want the randomness to be as good as possible to make the
'simulation' as good as possible.


>   Usually one wants to write code for readability, and thinking
>   too much about runtime efficiency optimizations is in vain,
>   because one might get different results with a different
>   version of Python or on a different machine.

That is why I went for the less efficient code. ;-)

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Cecil Westerhof via Python-list
Barry  writes:

>> On 26 Jul 2022, at 16:07, Cecil Westerhof via Python-list 
>>  wrote:
>> 
>> I need to get a random integer. At first I tried it with:
>>from secrets import randbelow
>>index = randbelow(len(to_try))
>> 
>> This works perfectly, but it took some time. So I thought I try:
>>from random  import SystemRandom
>>index = SystemRandom().randint(0, len(to_try) - 1)
>> 
>> A first indication is that the second version would take about two
>> times as much time as the first. Is there a reason for this, or should
>> this not be happening?
>
> What is the OS that you are running on and its version?
> If it’s linux what is the kernel version?
> What version of python and where from?

That is always good information of-course.
Debian 11.3
5.10.0-13-amd64
3.9.2

What do you mean with where the python version is from?

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Cecil Westerhof via Python-list
"Michael F. Stemper"  writes:

> This is orthogonal to your question, but might be of some use to you:
>
> The combination of using len(to_try) as an argument to randint() and
> saving the output to a variable named "index" suggests that you might
> be setting up to select a random element from to_try, as in:
>   something = to_try[index]
>
> If that is the case, you might want to consider using random.choice() instead:
>
>   >>> from random import choice
>   >>> to_try = [2,3,5,7,11,13,"seventeen",19]
>   >>> choice(to_try)
>   2
>   >>> choice(to_try)
>   'seventeen'
>   >>> choice(to_try)
>   13
>   >>> choice(to_try)
>   5
>   >>>

Yes, I try to select a random element, but it has also to be removed,
because an element should not be used more as once.
This is the code I use:
# index = randbelow(len(to_try))
index = randrange(len(to_try))
found = permutation[to_try.pop(index)]

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: random.SystemRandom().randint() inefficient

2022-07-27 Thread Barry


> On 26 Jul 2022, at 16:07, Cecil Westerhof via Python-list 
>  wrote:
> 
> I need to get a random integer. At first I tried it with:
>from secrets import randbelow
>index = randbelow(len(to_try))
> 
> This works perfectly, but it took some time. So I thought I try:
>from random  import SystemRandom
>index = SystemRandom().randint(0, len(to_try) - 1)
> 
> A first indication is that the second version would take about two
> times as much time as the first. Is there a reason for this, or should
> this not be happening?

What is the OS that you are running on and its version?
If it’s linux what is the kernel version?
What version of python and where from?

Barry

> 
> -- 
> Cecil Westerhof
> Senior Software Engineer
> LinkedIn: http://www.linkedin.com/in/cecilwesterhof
> -- 
> https://mail.python.org/mailman/listinfo/python-list
> 

-- 
https://mail.python.org/mailman/listinfo/python-list