Re: Simple TCP proxy

2022-07-31 Thread Morten W. Petersen
Well, initially I was just curious.

As the name implies, it's a TCP proxy, and different features could go into
that.

I looked at, for example, port knocking for hindering unauthorized access to
the (protected) TCP service SMPS, but there you also have the possibility
of someone eavesdropping and learning the right handshake, if you will.
So it's something that will work until someone gets determined enough to make
a mess.
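
For what it's worth, the knock-watching part is small; here's a minimal,
untested sketch of the idea (the ports, the window and the handoff of the
allowed set to the proxy are all hypothetical):

import selectors
import socket
import time

# Hypothetical knock sequence and timing window -- illustration only.
KNOCK_PORTS = [7000, 8000, 9000]
KNOCK_WINDOW = 10.0   # seconds allowed to complete the whole sequence

progress = {}    # ip -> (index of next expected port, time of first knock)
allowed = set()  # ips that completed the sequence; the proxy consults this

sel = selectors.DefaultSelector()
for port in KNOCK_PORTS:
    srv = socket.socket()
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("0.0.0.0", port))
    srv.listen(16)
    srv.setblocking(False)
    sel.register(srv, selectors.EVENT_READ, data=port)

def knock_loop():  # would run in a background thread
    while True:
        for key, _ in sel.select():
            conn, (ip, _) = key.fileobj.accept()
            conn.close()  # a knock is just a connection attempt
            idx, started = progress.get(ip, (0, time.monotonic()))
            if time.monotonic() - started > KNOCK_WINDOW:
                idx, started = 0, time.monotonic()  # too slow: start over
            idx = idx + 1 if KNOCK_PORTS[idx] == key.data else 0
            if idx == len(KNOCK_PORTS):
                allowed.add(ip)  # correct sequence completed
                progress.pop(ip, None)
            else:
                progress[ip] = (idx, started)

And as said, anyone eavesdropping learns the sequence, so it only raises the
bar a little.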

In short, it will give better control than backlog does, enabling
Python-style code and logic to deal with different situations.

I was about to say "deal with things intelligently", but I think
"intelligent" is a word that doesn't fit here or in many other applications.

Say, for example, this service comes under attack for unknown reasons; it
could be possible to teach the proxy to only accept connections to the
backend server from IP addresses / subnets that have previously completed
n transmissions back and forth.  That's useful if you know that the service
will have at most 50 different clients.
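
The bookkeeping for that could look roughly like this (a sketch only; the
numbers are made up, and record_exchange would be called from the relay loop
whenever a request/response completes):

import ipaddress
import threading

MAX_CLIENTS = 50         # the example cap from above
REQUIRED_EXCHANGES = 3   # hypothetical threshold of prior transmissions

_lock = threading.Lock()
_history = {}            # subnet -> completed exchanges seen so far

def _net(ip):
    # Track /24 subnets rather than individual addresses (IPv4 assumed).
    return ipaddress.ip_network(ip + "/24", strict=False)

def record_exchange(ip):
    with _lock:
        net = _net(ip)
        _history[net] = _history.get(net, 0) + 1

def may_connect(ip, under_attack):
    # Normal operation: accept everyone.  Under attack: only subnets with
    # a track record, and never more than MAX_CLIENTS of them.
    if not under_attack:
        return True
    with _lock:
        known = {n for n, c in _history.items() if c >= REQUIRED_EXCHANGES}
        return _net(ip) in known and len(known) <= MAX_CLIENTS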

Anyway, what Chris said earlier, I think we can file that under "eagerness
to teach others and show what you know".  Right Chris? :)

Regards,

Morten

On Sat, Jul 30, 2022 at 10:31 PM Barry  wrote:

>
>
>
> > On 30 Jul 2022, at 20:33, Morten W. Petersen  wrote:
> > I thought it was a bit much.
> >
> > I just did a bit more testing, and saw that the throughput of wget
> > through regular lighttpd was 1.3 GB/s, while through STP it was 122 MB/s,
> > and using quite a bit of CPU.
> >
> > Then I increased the buffer size 8-fold for reading and writing in
> > run.py, and the CPU usage went way down, and the transfer speed went up
> > to 449 MB/s.
>
> You are trading latency for throughput.
>
> >
> > So it would require well more than a gigabit network interface to max out
> > STP throughput; CPU usage was around 30-40% max, on one processor core.
>
> With how many connections?
>
> >
> > There is good enough, and then there's general practice and/or what is
> > regarded as an elegant solution.  I'm looking for good enough, and in the
> > process I don't mind pushing the envelope on Python threading.
>
> You never did answer my query on why a large backlog is not good enough.
> Why do you need this program at all?
>
> Barry
> >
> > -Morten
> >
> > On Sat, Jul 30, 2022 at 12:59 PM Roel Schroeven 
> > wrote:
> >
> >> Morten W. Petersen schreef op 29/07/2022 om 22:59:
> >>> OK, sounds like sunshine is getting the best of you.
> >> It has to be said: that is uncalled for.
> >>
> >> Chris gave you good advice, with the best of intentions. Sometimes we
> >> don't like good advice if it says something we don't like, but that's no
> >> reason to take it out on the messenger.
> >>
> >> --
> >> "Iceland is the place you go to remind yourself that planet Earth is a
> >> machine... and that all organic life that has ever existed amounts to a
> >> greasy
> >> film that has survived on the exterior of that machine thanks to furious
> >> improvisation."
> >> -- Sam Hughes, Ra
> >>
> >> --
> >> https://mail.python.org/mailman/listinfo/python-list
> >
> >
> > --
> > I am https://leavingnorway.info
> > Videos at https://www.youtube.com/user/TheBlogologue
> > Twittering at http://twitter.com/blogologue
> > Blogging at http://blogologue.com
> > Playing music at https://soundcloud.com/morten-w-petersen
> > Also playing music and podcasting here:
> > http://www.mixcloud.com/morten-w-petersen/
> > On Google+ here https://plus.google.com/107781930037068750156
> > On Instagram at https://instagram.com/morphexx/
> > --
> > https://mail.python.org/mailman/listinfo/python-list
>
>

-- 
I am https://leavingnorway.info
Videos at https://www.youtube.com/user/TheBlogologue
Twittering at http://twitter.com/blogologue
Blogging at http://blogologue.com
Playing music at https://soundcloud.com/morten-w-petersen
Also playing music and podcasting here:
http://www.mixcloud.com/morten-w-petersen/
On Google+ here https://plus.google.com/107781930037068750156
On Instagram at https://instagram.com/morphexx/
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple TCP proxy

2022-07-30 Thread Barry



> On 30 Jul 2022, at 20:33, Morten W. Petersen  wrote:
> I thought it was a bit much.
> 
> I just did a bit more testing, and saw that the throughput of wget through
> regular lighttpd was 1.3 GB/s, while through STP it was 122 MB/s, and using
> quite a bit of CPU.
> 
> Then I increased the buffer size 8-fold for reading and writing in run.py,
> and the CPU usage went way down, and the transfer speed went up to 449 MB/s.

You are trading latency for throughput.

> 
> So it would require well more than a gigabit network interface to max out
> STP throughput; CPU usage was around 30-40% max, on one processor core.

With how many connections?

> 
> There is good enough, and then there's general practice and/or what is
> regarded as an elegant solution.  I'm looking for good enough, and in the
> process I don't mind pushing the envelope on Python threading.

You never did answer my query on why a large backlog is not good enough.
Why do you need this program at all?

Barry
> 
> -Morten
> 
> On Sat, Jul 30, 2022 at 12:59 PM Roel Schroeven 
> wrote:
> 
>> Morten W. Petersen schreef op 29/07/2022 om 22:59:
>>> OK, sounds like sunshine is getting the best of you.
>> It has to be said: that is uncalled for.
>> 
>> Chris gave you good advice, with the best of intentions. Sometimes we
>> don't like good advice if it says something we don't like, but that's no
>> reason to take it out on the messenger.
>> 
>> --
>> "Iceland is the place you go to remind yourself that planet Earth is a
>> machine... and that all organic life that has ever existed amounts to a
>> greasy
>> film that has survived on the exterior of that machine thanks to furious
>> improvisation."
>> -- Sam Hughes, Ra
>> 
>> --
>> https://mail.python.org/mailman/listinfo/python-list
> 
> 
> -- 
> I am https://leavingnorway.info
> Videos at https://www.youtube.com/user/TheBlogologue
> Twittering at http://twitter.com/blogologue
> Blogging at http://blogologue.com
> Playing music at https://soundcloud.com/morten-w-petersen
> Also playing music and podcasting here:
> http://www.mixcloud.com/morten-w-petersen/
> On Google+ here https://plus.google.com/107781930037068750156
> On Instagram at https://instagram.com/morphexx/
> -- 
> https://mail.python.org/mailman/listinfo/python-list

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple TCP proxy

2022-07-30 Thread Morten W. Petersen
I thought it was a bit much.

I just did a bit more testing, and saw that the throughput of wget through
regular lighttpd was 1.3 GB/s, while through STP it was 122 MB/s, and using
quite a bit of CPU.

Then I increased the buffer size 8-fold for reading and writing in run.py,
and the CPU usage went way down, and the transfer speed went up to 449 MB/s.
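
The relay loop is essentially this shape (a simplified sketch, not the
actual run.py code; the original buffer size is assumed to have been 64 KiB):

import socket

BUFFER_SIZE = 8 * 65536  # the 8-fold increase

def relay(src: socket.socket, dst: socket.socket) -> None:
    # Copy bytes from src to dst until src closes.  Fewer, larger
    # recv()/sendall() calls mean fewer syscalls and less CPU per byte,
    # at the cost of up to BUFFER_SIZE of extra latency per chunk.
    while True:
        data = src.recv(BUFFER_SIZE)
        if not data:
            break
        dst.sendall(data)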

So it would require well more than a gigabit network interface to max out
STP throughput; CPU usage was around 30-40% max, on one processor core.

There is good enough, and then there's general practice and/or what is
regarded as an elegant solution.  I'm looking for good enough, and in the
process I don't mind pushing the envelope on Python threading.

-Morten

On Sat, Jul 30, 2022 at 12:59 PM Roel Schroeven 
wrote:

> Morten W. Petersen schreef op 29/07/2022 om 22:59:
> > OK, sounds like sunshine is getting the best of you.
> It has to be said: that is uncalled for.
>
> Chris gave you good advice, with the best of intentions. Sometimes we
> don't like good advice if it says something we don't like, but that's no
> reason to take it out on the messenger.
>
> --
> "Iceland is the place you go to remind yourself that planet Earth is a
> machine... and that all organic life that has ever existed amounts to a
> greasy
> film that has survived on the exterior of that machine thanks to furious
> improvisation."
>  -- Sam Hughes, Ra
>
> --
> https://mail.python.org/mailman/listinfo/python-list
>


-- 
I am https://leavingnorway.info
Videos at https://www.youtube.com/user/TheBlogologue
Twittering at http://twitter.com/blogologue
Blogging at http://blogologue.com
Playing music at https://soundcloud.com/morten-w-petersen
Also playing music and podcasting here:
http://www.mixcloud.com/morten-w-petersen/
On Google+ here https://plus.google.com/107781930037068750156
On Instagram at https://instagram.com/morphexx/
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple TCP proxy

2022-07-30 Thread Barry Scott
Morten,

As Chris remarked, you need to learn a number of networking, Python,
system-performance and other skills to turn your project into production
code.

Using threads does not scale very well. They use a lot of memory and raise
CPU usage just to do the context switches. Also, the GIL means that even if
you are doing blocking I/O, the use of threads does not scale well.

It's rare to see multi-threaded code; rather, what you see is code that
uses async I/O.

At its heart, async code at the low level uses a kernel interface like
epoll (or, on old systems, select). What epoll allows you to do is wait on
a set of FDs for a range of I/O operations: ready to read, ready to write,
and other activity (like the socket closing).

You could write code to use epoll yourself, but while fun to write, you
need to know a lot about networking and Linux to cover all the corner
cases.

Libraries like twisted, trio, uvloop and Python's selectors implement
production-quality versions of the required code with good APIs.

Do not judge these libraries by their size. They are not bloated, and only
as complex as the problem they are solving requires.

There is a simple example of async code using the Python selectors module
here that shows the style of programming:
https://docs.python.org/3/library/selectors.html#examples
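
Condensed, the style is: register sockets with a selector, then loop over
ready events (a sketch along the lines of that example; port and buffer
size arbitrary):

import selectors
import socket

sel = selectors.DefaultSelector()

def accept(srv):
    conn, addr = srv.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, data=echo)

def echo(conn):
    data = conn.recv(4096)
    if data:
        conn.sendall(data)  # fine for a demo; real code buffers writes
    else:
        sel.unregister(conn)
        conn.close()

srv = socket.socket()
srv.bind(("localhost", 12345))
srv.listen(100)
srv.setblocking(False)
sel.register(srv, selectors.EVENT_READ, data=accept)

while True:
    for key, _events in sel.select():
        key.data(key.fileobj)  # dispatch to accept() or echo()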


The issues that you likely need to solve and test for include:
* handling unexpected socket close events.
* buffering and flow control from one socket's read to another socket's
  write.  What if one side is reading slower than the other is writing?
  (See the sketch after this list.)
* timing out sockets that stop sending data, and closing them.
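
For the flow-control point, the usual trick is backpressure: stop watching
the fast peer for readability while the slow peer's outgoing buffer is over
a high-water mark. A sketch of the two callbacks (sizes hypothetical, not
wired to a full event loop):

import selectors

HIGH_WATER = 8 * 65536  # pause reading above this many buffered bytes
CHUNK = 65536

def forward(sel, src, dst, out_buf):
    # Called when src is readable; out_buf is a bytearray of bytes
    # still waiting to be written to dst.
    data = src.recv(CHUNK)
    if not data:
        sel.unregister(src)  # peer closed; real code also flushes/closes dst
        src.close()
        return
    out_buf.extend(data)
    if len(out_buf) >= HIGH_WATER:
        sel.unregister(src)  # backpressure: stop reading until dst drains

def drain(sel, src, dst, out_buf):
    # Called when dst is writable; send what we can without blocking.
    sent = dst.send(bytes(out_buf))
    del out_buf[:sent]
    if len(out_buf) < HIGH_WATER:
        try:
            sel.register(src, selectors.EVENT_READ)  # resume reading
        except KeyError:
            pass  # src was still registered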

At some point you will exceed the capacity for one process to handle the load.
The solution we used is to listen on the socket in a parent process and fork
enough child processes to handle the I/O load. This avoids issues with the GIL
and allows you to scale.
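
In outline (a POSIX-only sketch; the worker count and handler are
placeholders):

import os
import socket

WORKERS = 4  # e.g. one per core

def handle(conn):
    conn.sendall(b"hello\n")  # placeholder for the real per-connection work
    conn.close()

srv = socket.socket()
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("0.0.0.0", 8080))
srv.listen(1024)

for _ in range(WORKERS):
    if os.fork() == 0:  # child inherits the listening socket
        while True:
            conn, addr = srv.accept()  # the kernel balances accepts
            handle(conn)

for _ in range(WORKERS):  # parent just waits for the children
    os.wait()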

But I am still not sure why you need to do anything more than increase the
backlog on your listen socket in the main app. Does setting the backlog to
1,000,000 fix your issue?

On Linux you will need to change kernel limits to allow that size; see
man listen for info on what you need to change.
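
Concretely, something like (numbers illustrative):

import socket

srv = socket.socket()
srv.bind(("0.0.0.0", 8080))
# Ask for a very large queue of not-yet-accepted connections.  On Linux
# the kernel silently caps this at net.core.somaxconn, so raise that too,
# e.g.:  sysctl -w net.core.somaxconn=1000000  (see man 2 listen)
srv.listen(1_000_000)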

Barry

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple TCP proxy

2022-07-30 Thread Roel Schroeven

Morten W. Petersen schreef op 29/07/2022 om 22:59:

OK, sounds like sunshine is getting the best of you.

It has to be said: that is uncalled for.

Chris gave you good advice, with the best of intentions. Sometimes we
don't like good advice if it says something we don't like, but that's no
reason to take it out on the messenger.


--
"Iceland is the place you go to remind yourself that planet Earth is a
machine... and that all organic life that has ever existed amounts to a greasy
film that has survived on the exterior of that machine thanks to furious
improvisation."
-- Sam Hughes, Ra

--
https://mail.python.org/mailman/listinfo/python-list


Re: Simple TCP proxy

2022-07-29 Thread Morten W. Petersen
OK, sounds like sunshine is getting the best of you.

It's working with a pretty heavy load, I see ways of solving potential
problems that haven't become a problem yet, and I'm enjoying it.

Maybe you should tone down the coaching until someone asks for it.

Regards,

Morten

On Fri, Jul 29, 2022 at 10:46 PM Chris Angelico  wrote:

> On Sat, 30 Jul 2022 at 04:54, Morten W. Petersen 
> wrote:
> >
> > OK.
> >
> > Well, I've worked with web hosting in the past, and proxies like squid
> were used to lessen the load on dynamic backends.  There was also a website
> opensourcearticles.com that we had with Firefox, Thunderbird articles
> etc. that got quite a bit of traffic.
> >
> > IIRC, that website was mostly static with some dynamic bits and heavily
> cached by squid.
>
> Yep, and squid almost certainly won't have a thread for every incoming
> connection, spinning and waiting for the back end server. But squid
> does a LOT more than simply queue connections - it'll be inspecting
> headers and retaining a cache of static content, so it's not really
> comparable.
>
> > Most websites don't get a lot of traffic though, and don't have a big
> budget for "website system administration".  So maybe that's where I'm
> partly going with this, just making a proxy that can be put in front and
> deal with a lot of common situations, in a reasonably good way.
> >
> > If I run into problems with threads that can't be managed, then a switch
> to something like the queue_manager function which has data and then
> functions that manage the data and connections is an option.
> >
>
> I'll be quite frank with you: this is not production-quality code. It
> should not be deployed by anyone who doesn't have a big budget for
> "website system administration *training*". This code is good as a
> tool for YOU to learn how these things work; it shouldn't be a tool
> for anyone who actually has server load issues.
>
> I'm sorry if that sounds harsh, but the fact is, you can do a lot
> better by using this to learn more about networking than you'll ever
> do by trying to pitch it to any specific company.
>
> That said though: it's still good to know what your (theoretical)
> use-case is. That'll tell you what kinds of connection spam to throw
> at your proxy (lots of idle sockets? lots of HTTP requests? billions
> of half open TCP connections?) to see what it can cope with.
>
> Keep on playing with this code. There's a lot you can gain from it, still.
>
> ChrisA
> --
> https://mail.python.org/mailman/listinfo/python-list
>


-- 
I am https://leavingnorway.info
Videos at https://www.youtube.com/user/TheBlogologue
Twittering at http://twitter.com/blogologue
Blogging at http://blogologue.com
Playing music at https://soundcloud.com/morten-w-petersen
Also playing music and podcasting here:
http://www.mixcloud.com/morten-w-petersen/
On Google+ here https://plus.google.com/107781930037068750156
On Instagram at https://instagram.com/morphexx/
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple TCP proxy

2022-07-29 Thread Chris Angelico
On Sat, 30 Jul 2022 at 04:54, Morten W. Petersen  wrote:
>
> OK.
>
> Well, I've worked with web hosting in the past, and proxies like squid were 
> used to lessen the load on dynamic backends.  There was also a website 
> opensourcearticles.com that we had with Firefox, Thunderbird articles etc. 
> that got quite a bit of traffic.
>
> IIRC, that website was mostly static with some dynamic bits and heavily 
> cached by squid.

Yep, and squid almost certainly won't have a thread for every incoming
connection, spinning and waiting for the back end server. But squid
does a LOT more than simply queue connections - it'll be inspecting
headers and retaining a cache of static content, so it's not really
comparable.

> Most websites don't get a lot of traffic though, and don't have a big budget 
> for "website system administration".  So maybe that's where I'm partly going 
> with this, just making a proxy that can be put in front and deal with a lot 
> of common situations, in a reasonably good way.
>
> If I run into problems with threads that can't be managed, then a switch to 
> something like the queue_manager function which has data and then functions 
> that manage the data and connections is an option.
>

I'll be quite frank with you: this is not production-quality code. It
should not be deployed by anyone who doesn't have a big budget for
"website system administration *training*". This code is good as a
tool for YOU to learn how these things work; it shouldn't be a tool
for anyone who actually has server load issues.

I'm sorry if that sounds harsh, but the fact is, you can do a lot
better by using this to learn more about networking than you'll ever
do by trying to pitch it to any specific company.

That said though: it's still good to know what your (theoretical)
use-case is. That'll tell you what kinds of connection spam to throw
at your proxy (lots of idle sockets? lots of HTTP requests? billions
of half open TCP connections?) to see what it can cope with.

Keep on playing with this code. There's a lot you can gain from it, still.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple TCP proxy

2022-07-29 Thread Morten W. Petersen
OK.

Well, I've worked with web hosting in the past, and proxies like squid were
used to lessen the load on dynamic backends.  There was also a website
opensourcearticles.com that we had with Firefox, Thunderbird articles etc.
that got quite a bit of traffic.

IIRC, that website was mostly static with some dynamic bits and heavily
cached by squid.

Most websites don't get a lot of traffic though, and don't have a big
budget for "website system administration".  So maybe that's where I'm
partly going with this, just making a proxy that can be put in front and
deal with a lot of common situations, in a reasonably good way.

If I run into problems with threads that can't be managed, then a switch to
something like the queue_manager function, which holds the data and has
functions that manage the data and connections, is an option.

-Morten

On Fri, Jul 29, 2022 at 12:11 AM Chris Angelico  wrote:

> On Fri, 29 Jul 2022 at 07:24, Morten W. Petersen 
> wrote:
> >
> > Forwarding to the list as well.
> >
> > -- Forwarded message -
> > From: Morten W. Petersen 
> > Date: Thu, Jul 28, 2022 at 11:22 PM
> > Subject: Re: Simple TCP proxy
> > To: Chris Angelico 
> >
> >
> > Well, an increase from 0.1 seconds to 0.2 seconds on "polling" in each
> > thread whether or not the connection should become active doesn't seem
> like
> > a big deal.
>
> Maybe, but polling *at all* is the problem here. It shouldn't be
> hammering the other server. You'll quickly find that there are limits
> that simply shouldn't exist, because every connection is trying to
> check to see if it's active now. This is *completely unnecessary*.
> I'll reiterate the advice given earlier in this thread (of
> conversation): Look into the tools available for thread (of execution)
> synchronization, such as mutexes (in Python, threading.Lock) and
> condition variables (in Python, threading.Condition). A poll interval enforces a
> delay before the thread notices that it's active, AND causes inactive
> threads to consume CPU, neither of which is a good thing.
>
> > And there's also some point where it is pointless to accept more
> > connections, and where maybe remedies like accepting known good IPs,
> > blocking IPs / IP blocks with more than 3 connections etc. should be
> > considered.
>
> Firewalling is its own science. Blocking IPs with too many
> simultaneous connections should be decided administratively, not
> because your proxy can't handle enough connections.
>
> > I think I'll be getting closer than most applications to an eventual
> > ceiling for what Python can handle of threads, and that's interesting and
> > could be beneficial for Python as well.
>
> Here's a quick demo of the cost of threads when they're all blocked on
> something.
>
> >>> import threading
> >>> finish = threading.Condition()
> >>> def thrd(cond):
> ... with cond: cond.wait()
> ...
> >>> threading.active_count() # Main thread only
> 1
> >>> import time
> >>> def spawn(n):
> ... start = time.monotonic()
> ... for _ in range(n):
> ... t = threading.Thread(target=thrd, args=(finish,))
> ... t.start()
> ... print("Spawned", n, "threads in", time.monotonic() - start,
> "seconds")
> ...
> >>> spawn(10000)
> Spawned 10000 threads in 7.548425202025101 seconds
> >>> threading.active_count()
> 10001
> >>> with finish: finish.notify_all()
> ...
> >>> threading.active_count()
> 1
>
> It takes a bit of time to start ten thousand threads, but after that,
> the system is completely idle again until I notify them all and they
> shut down.
>
> (Interestingly, it takes four times as long to start 20,000 threads,
> suggesting that something in thread spawning has O(n²) cost. Still,
> even that leaves the system completely idle once it's done spawning
> them.)
>
> If your proxy can handle 20,000 threads, I would be astonished. And
> this isn't even close to a thread limit.
>
> Obviously the cost is different if the threads are all doing things,
> but if you have thousands of active socket connections, you'll start
> finding that there are limitations in quite a few places, depending on
> how much traffic is going through them. Ultimately, yes, you will find
> that threads restrict you and asynchronous I/O is the only option; but
> you can take threads a fairly long way before they are the limiting
> factor.
>
> ChrisA
> --
> https://mail.python.org/mailman/listinfo/python-list
>


-- 
I am https://leavingnorway.info
Videos at https://www.youtube.com/user/TheBlogologue
Twittering at http://twitter.com/blogologue
Blogging at http://blogologue.com
Playing music at https://soundcloud.com/morten-w-petersen
Also playing music and podcasting here:
http://www.mixcloud.com/morten-w-petersen/
On Google+ here https://plus.google.com/107781930037068750156
On Instagram at https://instagram.com/morphexx/
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple TCP proxy

2022-07-29 Thread Morten W. Petersen
OK, that's useful to know. Thanks. :)

-Morten

On Fri, Jul 29, 2022 at 3:43 AM Andrew MacIntyre 
wrote:

> On 29/07/2022 8:08 am, Chris Angelico wrote:
> > It takes a bit of time to start ten thousand threads, but after that,
> > the system is completely idle again until I notify them all and they
> > shut down.
> >
> > (Interestingly, it takes four times as long to start 20,000 threads,
> > suggesting that something in thread spawning has O(n²) cost. Still,
> > even that leaves the system completely idle once it's done spawning
> > them.)
>
> Another cost of threads can be memory allocated as thread stack space,
> the default size of which varies by OS (see e.g.
>
> https://ariadne.space/2021/06/25/understanding-thread-stack-sizes-and-how-alpine-is-different/
> ).
>
> threading.stack_size() can be used to check and perhaps adjust the
> allocation size.
>
> --
> -
> Andrew I MacIntyre "These thoughts are mine alone..."
> E-mail: andy...@pcug.org.au(pref) | Snail: PO Box 370
>  andy...@bullseye.apana.org.au   (alt) |Belconnen ACT 2616
> Web:http://www.andymac.org/   |Australia
> --
> https://mail.python.org/mailman/listinfo/python-list
>


-- 
I am https://leavingnorway.info
Videos at https://www.youtube.com/user/TheBlogologue
Twittering at http://twitter.com/blogologue
Blogging at http://blogologue.com
Playing music at https://soundcloud.com/morten-w-petersen
Also playing music and podcasting here:
http://www.mixcloud.com/morten-w-petersen/
On Google+ here https://plus.google.com/107781930037068750156
On Instagram at https://instagram.com/morphexx/
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple TCP proxy

2022-07-28 Thread Chris Angelico
On Fri, 29 Jul 2022 at 11:42, Andrew MacIntyre  wrote:
>
> On 29/07/2022 8:08 am, Chris Angelico wrote:
> > It takes a bit of time to start ten thousand threads, but after that,
> > the system is completely idle again until I notify them all and they
> > shut down.
> >
> > (Interestingly, it takes four times as long to start 20,000 threads,
> > suggesting that something in thread spawning has O(n²) cost. Still,
> > even that leaves the system completely idle once it's done spawning
> > them.)
>
> Another cost of threads can be memory allocated as thread stack space,
> the default size of which varies by OS (see e.g.
> https://ariadne.space/2021/06/25/understanding-thread-stack-sizes-and-how-alpine-is-different/).
>
> threading.stack_size() can be used to check and perhaps adjust the
> allocation size.
>

Yeah, they do have quite a few costs, and a naive approach of "give a
thread to every client", while very convenient, will end up limiting
throughput. (But I'll be honest: I still have a server that's built on
exactly that model, because it's much much safer than risking one
client stalling out the whole server due to a small bug. But that's a
MUD server.) Thing is, though, it'll most likely limit throughput to
something in the order of thousands of concurrent connections (or
thousands per second if it's something like HTTP where they tend to
get closed again), maybe tens of thousands. So if you have something
where every thread needs its own database connection, well, you're
gonna have database throughput problems WAY before you actually run
into thread count limitations!

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple TCP proxy

2022-07-28 Thread Andrew MacIntyre

On 29/07/2022 8:08 am, Chris Angelico wrote:

It takes a bit of time to start ten thousand threads, but after that,
the system is completely idle again until I notify them all and they
shut down.

(Interestingly, it takes four times as long to start 20,000 threads,
suggesting that something in thread spawning has O(n²) cost. Still,
even that leaves the system completely idle once it's done spawning
them.)


Another cost of threads can be memory allocated as thread stack space, 
the default size of which varies by OS (see e.g. 
https://ariadne.space/2021/06/25/understanding-thread-stack-sizes-and-how-alpine-is-different/).


threading.stack_size() can be used to check and perhaps adjust the 
allocation size.
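
For example (a sketch; 256 KiB is an arbitrary choice and must satisfy the
platform minimum, or ValueError is raised):

import threading

threading.stack_size(256 * 1024)  # applies to threads created after this call
t = threading.Thread(target=lambda: None)
t.start()
t.join()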


--
-
Andrew I MacIntyre "These thoughts are mine alone..."
E-mail: andy...@pcug.org.au(pref) | Snail: PO Box 370
andy...@bullseye.apana.org.au   (alt) |Belconnen ACT 2616
Web:http://www.andymac.org/   |Australia
--
https://mail.python.org/mailman/listinfo/python-list


Re: Simple TCP proxy

2022-07-28 Thread Chris Angelico
On Fri, 29 Jul 2022 at 07:24, Morten W. Petersen  wrote:
>
> Forwarding to the list as well.
>
> -- Forwarded message -
> From: Morten W. Petersen 
> Date: Thu, Jul 28, 2022 at 11:22 PM
> Subject: Re: Simple TCP proxy
> To: Chris Angelico 
>
>
> Well, an increase from 0.1 seconds to 0.2 seconds on "polling" in each
> thread whether or not the connection should become active doesn't seem like
> a big deal.

Maybe, but polling *at all* is the problem here. It shouldn't be
hammering the other server. You'll quickly find that there are limits
that simply shouldn't exist, because every connection is trying to
check to see if it's active now. This is *completely unnecessary*.
I'll reiterate the advice given earlier in this thread (of
conversation): look into the tools available for thread (of execution)
synchronization, such as mutexes (in Python, threading.Lock) and condition
variables (in Python, threading.Condition). A poll interval enforces a
delay before the thread notices that it's active, AND causes inactive
threads to consume CPU, neither of which is a good thing.
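
A sketch of what the blocking alternative looks like on the handler side
(names hypothetical; a manager thread would call activate() when a slot
frees up):

import threading

class Slot:
    def __init__(self):
        self._cond = threading.Condition()
        self._active = False

    def wait_until_active(self):
        # Blocks without consuming any CPU until activate() is called.
        with self._cond:
            while not self._active:
                self._cond.wait()

    def activate(self):
        with self._cond:
            self._active = True
            self._cond.notify()

Each queued connection's thread calls wait_until_active(); there is no poll
interval, and an inactive thread costs nothing but memory.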

> And there's also some point where it is pointless to accept more
> connections, and where maybe remedies like accepting known good IPs,
> blocking IPs / IP blocks with more than 3 connections etc. should be
> considered.

Firewalling is its own science. Blocking IPs with too many
simultaneous connections should be decided administratively, not
because your proxy can't handle enough connections.

> I think I'll be getting closer than most applications to an eventual
> ceiling for what Python can handle of threads, and that's interesting and
> could be beneficial for Python as well.

Here's a quick demo of the cost of threads when they're all blocked on
something.

>>> import threading
>>> finish = threading.Condition()
>>> def thrd(cond):
... with cond: cond.wait()
...
>>> threading.active_count() # Main thread only
1
>>> import time
>>> def spawn(n):
... start = time.monotonic()
... for _ in range(n):
... t = threading.Thread(target=thrd, args=(finish,))
... t.start()
... print("Spawned", n, "threads in", time.monotonic() - start, "seconds")
...
>>> spawn(10000)
Spawned 10000 threads in 7.548425202025101 seconds
>>> threading.active_count()
10001
>>> with finish: finish.notify_all()
...
>>> threading.active_count()
1

It takes a bit of time to start ten thousand threads, but after that,
the system is completely idle again until I notify them all and they
shut down.

(Interestingly, it takes four times as long to start 20,000 threads,
suggesting that something in thread spawning has O(n²) cost. Still,
even that leaves the system completely idle once it's done spawning
them.)

If your proxy can handle 20,000 threads, I would be astonished. And
this isn't even close to a thread limit.

Obviously the cost is different if the threads are all doing things,
but if you have thousands of active socket connections, you'll start
finding that there are limitations in quite a few places, depending on
how much traffic is going through them. Ultimately, yes, you will find
that threads restrict you and asynchronous I/O is the only option; but
you can take threads a fairly long way before they are the limiting
factor.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple TCP proxy

2022-07-28 Thread Morten W. Petersen
Well, it's not just code size in terms of disk space, it is also code
complexity, and the level of knowledge, skill and time it takes to make use
of something.

And if something fails in an unobvious way in Twisted, I imagine that
requires somebody highly skilled, and that costs quite a bit of money. And
people like that might also not always be available.

-Morten

On Thu, Jul 28, 2022 at 2:29 PM Barry  wrote:

>
>
> On 28 Jul 2022, at 10:31, Morten W. Petersen  wrote:
>
> 
> Hi Barry.
>
> Well, I can agree that using backlog is an option for handling bursts. But
> what if that backlog number is exceeded?  How easy is it to deal with such
> a situation?
>
>
> You can make backlog very large, if that makes sense.
> But at some point you will be forced to reject connections,
> once you cannot keep up with the average rate of connections.
>
>
>
> I just cloned twisted, and compared the size:
>
> morphex@morphex-Latitude-E4310:~$ du -s stp; du -s tmp/twisted/
> 464 stp
> 98520 tmp/twisted/
> morphex@morphex-Latitude-E4310:~$ du -sh stp/LICENSE
> 36K stp/LICENSE
>
> >>> 464/98520.0
> 0.004709703613479496
> >>>
>
> It's quite easy to get an idea of what's going on in STP, as opposed to if
> something goes wrong in Twisted with the size of the codebase. I used to
> use emacs a lot, but then I came into a period where it was more practical
> to use nano, and I mostly use nano now, unless I need to for example search
> and replace or something like that.
>
>
> I mentioned twisted for context. Depending on yours need the built in
> python 3 async support may well be sufficient for you needs. Using threads
> is not scalable.
>
> In the places I code disk space of a few MiB is not an issue.
>
> Barry
>
>
> -Morten
>
> On Thu, Jul 28, 2022 at 8:31 AM Barry  wrote:
>
>>
>>
>> > On 27 Jul 2022, at 17:16, Morten W. Petersen  wrote:
>> >
>> > Hi.
>> >
>> > I'd like to share with you a recent project, which is a simple TCP proxy
>> > that can stand in front of a TCP server of some sort, queueing requests
>> and
>> > then allowing n number of connections to pass through at a time:
>> >
>> > https://github.com/morphex/stp
>> >
> >> > I'll be developing it further, but the files committed in this tree
>> > seem to be stable:
>> >
>> >
>> https://github.com/morphex/stp/tree/9910ca8c80e9d150222b680a4967e53f0457b465
>> >
> >> > I just bombed that code with 700+ requests almost simultaneously, and
> >> > STP handled it well.
>>
>> What is the problem that this solves?
>>
>> Why not just increase the allowed size of the socket listen backlog if
>> you just want to handle bursts of traffic.
>>
>> I do not think of this as a proxy, rather a tunnel.
> >> And the tunnel is a lot more expensive than having the kernel keep the
> >> connection in the listen socket backlog.
> >>
> >> I work on a web proxy written in Python that handles huge load, using
> >> backlog for the bursts.
> >>
> >> It's async using twisted, as threads are not practical at scale.
>>
>> Barry
>>
>> >
>> > Regards,
>> >
>> > Morten
>> >
>> > --
>> > I am https://leavingnorway.info
>> > Videos at https://www.youtube.com/user/TheBlogologue
>> > Twittering at http://twitter.com/blogologue
>> > Blogging at http://blogologue.com
>> > Playing music at https://soundcloud.com/morten-w-petersen
>> > Also playing music and podcasting here:
>> > http://www.mixcloud.com/morten-w-petersen/
>> > On Google+ here https://plus.google.com/107781930037068750156
>> > On Instagram at https://instagram.com/morphexx/
>> > --
>> > https://mail.python.org/mailman/listinfo/python-list
>> >
>>
>>
>
> --
> I am https://leavingnorway.info
> Videos at https://www.youtube.com/user/TheBlogologue
> Twittering at http://twitter.com/blogologue
> Blogging at http://blogologue.com
> Playing music at https://soundcloud.com/morten-w-petersen
> Also playing music and podcasting here:
> http://www.mixcloud.com/morten-w-petersen/
> On Google+ here https://plus.google.com/107781930037068750156
> On Instagram at https://instagram.com/morphexx/
>
>

-- 
I am https://leavingnorway.info
Videos at https://www.youtube.com/user/TheBlogologue
Twittering at http://twitter.com/blogologue
Blogging at http://blogologue.com
Playing music at https://soundcloud.com/morten-w-petersen
Also playing music and podcasting here:
http://www.mixcloud.com/morten-w-petersen/
On Google+ here https://plus.google.com/107781930037068750156
On Instagram at https://instagram.com/morphexx/
-- 
https://mail.python.org/mailman/listinfo/python-list


Fwd: Simple TCP proxy

2022-07-28 Thread Morten W. Petersen
Forwarding to the list as well.

-- Forwarded message -
From: Morten W. Petersen 
Date: Thu, Jul 28, 2022 at 11:22 PM
Subject: Re: Simple TCP proxy
To: Chris Angelico 


Well, an increase from 0.1 seconds to 0.2 seconds on "polling" in each
thread whether or not the connection should become active doesn't seem like
a big deal.

And there's also some point where it is pointless to accept more
connections, and where maybe remedies like accepting known good IPs,
blocking IPs / IP blocks with more than 3 connections etc. should be
considered.

I think I'll be getting closer than most applications to an eventual
ceiling for what Python can handle of threads, and that's interesting and
could be beneficial for Python as well.

-Morten

On Thu, Jul 28, 2022 at 2:31 PM Chris Angelico  wrote:

> On Thu, 28 Jul 2022 at 21:01, Morten W. Petersen 
> wrote:
> >
> > Well, I was thinking of following the socketserver / handle layout of
> code and execution, for now anyway.
> >
> > It wouldn't be a big deal to make them block, but another option is to
> increase the sleep period 100% for every 200 waiting connections while
> waiting in handle.
>
> Easy denial-of-service attack then. Spam connections and the queue
> starts blocking hard. The sleep loop seems like a rather inefficient
> way to do things.
>
> > Another thing is that it's nice to see Python handling 500+ threads
> without problems. :)
>
> Yeah, well, that's not all THAT many threads, ultimately :)
>
> ChrisA
> --
> https://mail.python.org/mailman/listinfo/python-list
>


-- 
I am https://leavingnorway.info
Videos at https://www.youtube.com/user/TheBlogologue
Twittering at http://twitter.com/blogologue
Blogging at http://blogologue.com
Playing music at https://soundcloud.com/morten-w-petersen
Also playing music and podcasting here:
http://www.mixcloud.com/morten-w-petersen/
On Google+ here https://plus.google.com/107781930037068750156
On Instagram at https://instagram.com/morphexx/


-- 
I am https://leavingnorway.info
Videos at https://www.youtube.com/user/TheBlogologue
Twittering at http://twitter.com/blogologue
Blogging at http://blogologue.com
Playing music at https://soundcloud.com/morten-w-petersen
Also playing music and podcasting here:
http://www.mixcloud.com/morten-w-petersen/
On Google+ here https://plus.google.com/107781930037068750156
On Instagram at https://instagram.com/morphexx/
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple TCP proxy

2022-07-28 Thread Barry


> On 28 Jul 2022, at 10:31, Morten W. Petersen  wrote:
> 
> 
> Hi Barry.
> 
> Well, I can agree that using backlog is an option for handling bursts. But 
> what if that backlog number is exceeded?  How easy is it to deal with such a 
> situation?

You can make backlog very large, if that makes sense.
But at some point you will be forced to reject connections,
once you cannot keep up with the average rate of connections.


> 
> I just cloned twisted, and compared the size:
> 
> morphex@morphex-Latitude-E4310:~$ du -s stp; du -s tmp/twisted/
> 464 stp
> 98520 tmp/twisted/
> morphex@morphex-Latitude-E4310:~$ du -sh stp/LICENSE 
> 36K stp/LICENSE
> 
> >>> 464/98520.0
> 0.004709703613479496
> >>> 
> 
> It's quite easy to get an idea of what's going on in STP, as opposed to if 
> something goes wrong in Twisted with the size of the codebase. I used to use 
> emacs a lot, but then I came into a period where it was more practical to use 
> nano, and I mostly use nano now, unless I need to for example search and 
> replace or something like that.

I mentioned twisted for context. Depending on your needs, the built-in
Python 3 async support may well be sufficient. Using threads is not
scalable.

In the places I code disk space of a few MiB is not an issue.

Barry

> 
> -Morten
> 
>> On Thu, Jul 28, 2022 at 8:31 AM Barry  wrote:
>> 
>> 
>> > On 27 Jul 2022, at 17:16, Morten W. Petersen  wrote:
>> > 
>> > Hi.
>> > 
>> > I'd like to share with you a recent project, which is a simple TCP proxy
>> > that can stand in front of a TCP server of some sort, queueing requests and
>> > then allowing n number of connections to pass through at a time:
>> > 
>> > https://github.com/morphex/stp
>> > 
>> > I'll be developing it further, but the files committed in this tree
>> > seem to be stable:
>> > 
>> > https://github.com/morphex/stp/tree/9910ca8c80e9d150222b680a4967e53f0457b465
>> > 
>> > I just bombed that code with 700+ requests almost simultaneously, and STP
>> > handled it well.
>> 
>> What is the problem that this solves?
>> 
>> Why not just increase the allowed size of the socket listen backlog if you 
>> just want to handle bursts of traffic.
>> 
>> I do not think of this as a proxy, rather a tunnel.
>> And the tunnel is a lot more expensive than having the kernel keep the
>> connection in the listen socket backlog.
>> 
>> I work on a web proxy written in Python that handles huge load, using
>> backlog for the bursts.
>> 
>> It's async using twisted, as threads are not practical at scale.
>> 
>> Barry
>> 
>> > 
>> > Regards,
>> > 
>> > Morten
>> > 
>> > -- 
>> > I am https://leavingnorway.info
>> > Videos at https://www.youtube.com/user/TheBlogologue
>> > Twittering at http://twitter.com/blogologue
>> > Blogging at http://blogologue.com
>> > Playing music at https://soundcloud.com/morten-w-petersen
>> > Also playing music and podcasting here:
>> > http://www.mixcloud.com/morten-w-petersen/
>> > On Google+ here https://plus.google.com/107781930037068750156
>> > On Instagram at https://instagram.com/morphexx/
>> > -- 
>> > https://mail.python.org/mailman/listinfo/python-list
>> > 
>> 
> 
> 
> -- 
> I am https://leavingnorway.info
> Videos at https://www.youtube.com/user/TheBlogologue
> Twittering at http://twitter.com/blogologue
> Blogging at http://blogologue.com
> Playing music at https://soundcloud.com/morten-w-petersen
> Also playing music and podcasting here: 
> http://www.mixcloud.com/morten-w-petersen/
> On Google+ here https://plus.google.com/107781930037068750156
> On Instagram at https://instagram.com/morphexx/
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple TCP proxy

2022-07-28 Thread Chris Angelico
On Thu, 28 Jul 2022 at 21:01, Morten W. Petersen  wrote:
>
> Well, I was thinking of following the socketserver / handle layout of code 
> and execution, for now anyway.
>
> It wouldn't be a big deal to make them block, but another option is to 
> increase the sleep period 100% for every 200 waiting connections while 
> waiting in handle.

Easy denial-of-service attack then. Spam connections and the queue
starts blocking hard. The sleep loop seems like a rather inefficient
way to do things.

> Another thing is that it's nice to see Python handling 500+ threads without 
> problems. :)

Yeah, well, that's not all THAT many threads, ultimately :)

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple TCP proxy

2022-07-28 Thread Morten W. Petersen
Well, I was thinking of following the socketserver / handle layout of code
and execution, for now anyway.

It wouldn't be a big deal to make them block, but another option is to
increase the sleep period 100% for every 200 waiting connections while
waiting in handle.

Another thing is that it's nice to see Python handling 500+ threads without
problems. :)

-Morten

On Thu, Jul 28, 2022 at 11:45 AM Chris Angelico  wrote:

> On Thu, 28 Jul 2022 at 19:41, Morten W. Petersen 
> wrote:
> >
> > Hi Martin.
> >
> > I was thinking of doing something with the handle function, but just this
> > little tweak:
> >
> >
> https://github.com/morphex/stp/commit/9910ca8c80e9d150222b680a4967e53f0457b465
> >
> > made a huge difference in CPU usage.  Hundreds of waiting sockets are now
> > using 20-30% of CPU instead of 10x that.
>
>  wait, what?
>
> Why do waiting sockets consume *any* measurable amount of CPU? Why
> don't the threads simply block until it's time to do something?
>
> ChrisA
> --
> https://mail.python.org/mailman/listinfo/python-list
>


-- 
I am https://leavingnorway.info
Videos at https://www.youtube.com/user/TheBlogologue
Twittering at http://twitter.com/blogologue
Blogging at http://blogologue.com
Playing music at https://soundcloud.com/morten-w-petersen
Also playing music and podcasting here:
http://www.mixcloud.com/morten-w-petersen/
On Google+ here https://plus.google.com/107781930037068750156
On Instagram at https://instagram.com/morphexx/
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple TCP proxy

2022-07-28 Thread Chris Angelico
On Thu, 28 Jul 2022 at 19:41, Morten W. Petersen  wrote:
>
> Hi Martin.
>
> I was thinking of doing something with the handle function, but just this
> little tweak:
>
> https://github.com/morphex/stp/commit/9910ca8c80e9d150222b680a4967e53f0457b465
>
> made a huge difference in CPU usage.  Hundreds of waiting sockets are now
> using 20-30% of CPU instead of 10x that.

 wait, what?

Why do waiting sockets consume *any* measurable amount of CPU? Why
don't the threads simply block until it's time to do something?

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple TCP proxy

2022-07-28 Thread Morten W. Petersen
Hi Martin.

I was thinking of doing something with the handle function, but just this
little tweak:

https://github.com/morphex/stp/commit/9910ca8c80e9d150222b680a4967e53f0457b465

made a huge difference in CPU usage.  Hundreds of waiting sockets are now
using 20-30% of CPU instead of 10x that.  So for example making the handle
function exit / stop and wait isn't necessary at this point. It also opens
up the possibility of sending a noop that is appropriate for the given
protocol.

I've not done a lot of thread programming before, but yes, locks can be
used and will be used if necessary. I wasn't sure which data types are
thread safe in Python, and it might be acceptable for some variables to be
off by 1 or more, if using <= / >= checks is an option and there is no risk
of the variable containing "garbage".
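
The usual pattern is a lock around the shared state, e.g. this minimal
sketch:

import threading

class Counter:
    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def increment(self):
        with self._lock:  # += on a plain int attribute is not atomic
            self._value += 1

    def value(self):
        with self._lock:
            return self._value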

I think with a simple focus, that the project is aimed at one task, will
make it easier to manage even complex matters such as concurrency and
threads.

-Morten

On Wed, Jul 27, 2022 at 11:00 PM Martin Di Paola 
wrote:

>
> On Wed, Jul 27, 2022 at 08:32:31PM +0200, Morten W. Petersen wrote:
> >You're thinking of the backlog argument of listen?
>
>  From my understanding, yes, when you set up the "accepter" socket (the
> one that you use to listen and accept new connections), you can define
> the length of the queue for incoming connections that are not accepted
> yet.
>
> This will be the equivalent of your SimpleQueue which basically puts a
> limits on how many incoming connections are "accepted" to do a real job.
>
> Using skt.listen(N) the incoming connections are put on hold by the OS
> while in your implementation are formally accepted but they are not
> allowed to do any meaningful work: they are put on the SimpleQueue and
> only when they are popped then they will work (send/recv data).
>
> The difference then between the OS and your impl is minimal. The only
> case that I can think of is that on the clients' side there may exist a
> timeout for the acceptance of the connection, so your proxy server will
> eagerly accept these connections and no timeout is possible(*)
>
> On a side note, your implementation is too thread-naive: it uses plain
> Python lists, integers and boolean variables which are not thread safe.
> It is a matter of time until your server starts behaving weirdly.
> 
> One option is that you use thread-safe objects. I'd encourage you to read
> about thread-safety in general and then which sync mechanisms Python
> offers.
>
> Another option is to remove the SimpleQueue and the background function
> that allows a connection to be "active".
>
> If you think, the handlers are 99% independent except that you want to
> allow only N of them to progress (establish and forward the connection)
> and when a handler finishes, another handler "waiting" is activated, "in
> a queue fashion" as you said.
>
> If you allow me to not have a strict queue discipline here, you can achieve
> the same results coordinating the handlers using semaphores. Once again,
> take this email as starting point for your own research.
>
> On a second side note, the use of handlers and threads is inefficient
> because while you have N active handlers sending/receiving data, because
> you are eagerly accepting new connections you will have much more
> handlers created and (if I'm not wrong), each will be a thread.
>
> A more efficient solution could be
>
> 1) accept as many connections as you can, saving the socket (not the
> handler) in the thread-safe queue.
> 2) have N threads in the background popping from the queue a socket and
> then doing the send/recv stuff. When the thread is done, the thread
> closes the socket and pops another from the queue.
>
> So the queue length will be the count of accepted connections but in any
> moment your proxy will not activate (forward) more than N connections.
>
> This idea is thread-safe, simpler, efficient and has the queue
> discipline (I leave aside the usefulness).
>
> I encourage you to take time to read about the different things
> mentioned as concurrency and thread-related stuff is not easy to
> master.
>
> Thanks,
> Martin.
>
> (*) make your proxy server slow enough and yes, you will get timeouts
> anyways.
>
> >
> >Well, STP will accept all connections, but can limit how many of the
> >accepted connections are active at any given time.
> >
> >So when I bombed it with hundreds of almost simultaneous connections, all
> >of them were accepted, but only 25 were actively sending and receiving
> data
> >at any given time. First come, first served.
> >
> >Regards,
> >
> >Morten
> >
>

Re: Simple TCP proxy

2022-07-28 Thread Morten W. Petersen
Hi Barry.

Well, I can agree that using backlog is an option for handling bursts. But
what if that backlog number is exceeded?  How easy is it to deal with such
a situation?

I just cloned twisted, and compared the size:

morphex@morphex-Latitude-E4310:~$ du -s stp; du -s tmp/twisted/
464 stp
98520 tmp/twisted/
morphex@morphex-Latitude-E4310:~$ du -sh stp/LICENSE
36K stp/LICENSE

>>> 464/98520.0
0.004709703613479496
>>>

It's quite easy to get an idea of what's going on in STP, as opposed to
when something goes wrong in Twisted, given the size of that codebase. I used to
use emacs a lot, but then I came into a period where it was more practical
to use nano, and I mostly use nano now, unless I need to for example search
and replace or something like that.

-Morten

On Thu, Jul 28, 2022 at 8:31 AM Barry  wrote:

>
>
> > On 27 Jul 2022, at 17:16, Morten W. Petersen  wrote:
> >
> > Hi.
> >
> > I'd like to share with you a recent project, which is a simple TCP proxy
> > that can stand in front of a TCP server of some sort, queueing requests
> and
> > then allowing n number of connections to pass through at a time:
> >
> > https://github.com/morphex/stp
> >
> > I'll be developing it further, but the files committed in this tree
> > seem to be stable:
> >
> >
> https://github.com/morphex/stp/tree/9910ca8c80e9d150222b680a4967e53f0457b465
> >
> > I just bombed that code with 700+ requests almost simultaneously, and STP
> > handled it well.
>
> What is the problem that this solves?
>
> Why not just increase the allowed size of the socket listen backlog if you
> just want to handle bursts of traffic.
>
> I do not think of this as a proxy, rather a tunnel.
> And the tunnel is a lot more expensive than having the kernel keep the
> connection in the listen socket backlog.
>
> I work on a web proxy written in Python that handles huge load, using
> backlog for the bursts.
>
> It's async using twisted, as threads are not practical at scale.
>
> Barry
>
> >
> > Regards,
> >
> > Morten
> >
> > --
> > I am https://leavingnorway.info
> > Videos at https://www.youtube.com/user/TheBlogologue
> > Twittering at http://twitter.com/blogologue
> > Blogging at http://blogologue.com
> > Playing music at https://soundcloud.com/morten-w-petersen
> > Also playing music and podcasting here:
> > http://www.mixcloud.com/morten-w-petersen/
> > On Google+ here https://plus.google.com/107781930037068750156
> > On Instagram at https://instagram.com/morphexx/
> > --
> > https://mail.python.org/mailman/listinfo/python-list
> >
>
>

-- 
I am https://leavingnorway.info
Videos at https://www.youtube.com/user/TheBlogologue
Twittering at http://twitter.com/blogologue
Blogging at http://blogologue.com
Playing music at https://soundcloud.com/morten-w-petersen
Also playing music and podcasting here:
http://www.mixcloud.com/morten-w-petersen/
On Google+ here https://plus.google.com/107781930037068750156
On Instagram at https://instagram.com/morphexx/
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple TCP proxy

2022-07-28 Thread Morten W. Petersen
OK, I'll have a look at using something other than _threading.

I quickly saw a couple of points where code could be optimized for speed;
the loop that transfers data back and forth also has low throughput, but
the first priority was getting it working and seeing that it is fairly
stable.

Regards,

Morten
--
I am https://leavingnorway.info
Videos at https://www.youtube.com/user/TheBlogologue
Twittering at http://twitter.com/blogologue
Blogging at http://blogologue.com
Playing music at https://soundcloud.com/morten-w-petersen
Also playing music and podcasting here:
http://www.mixcloud.com/morten-w-petersen/
On Instagram at https://instagram.com/morphexx/



On Wed, Jul 27, 2022 at 9:57 PM Chris Angelico  wrote:

> On Thu, 28 Jul 2022 at 04:32, Morten W. Petersen 
> wrote:
> >
> > Hi Chris.
> >
> > You're thinking of the backlog argument of listen?
>
> Yes, precisely.
>
> > Well, STP will accept all connections, but can limit how many of the
> accepted connections are active at any given time.
> >
> > So when I bombed it with hundreds of almost simultaneous connections,
> all of them were accepted, but only 25 were actively sending and receiving
> data at any given time. First come, first served.
> >
>
> Hmm. Okay. Not sure what the advantage is, but sure.
>
> If the server's capable of handling the total requests-per-minute,
> then a queueing system like this should help with burst load, although
> I would have thought that the listen backlog would do the same. What
> happens if the server actually gets overloaded though? Do connections
> get disconnected after appearing connected? What's the disconnect
> mode?
>
> BTW, you probably don't want to be using the _thread module - Python
> has a threading module which is better suited to this sort of work.
> Although you may want to consider asyncio instead, as that has far
> lower overhead when working with large numbers of sockets.
>
> ChrisA
> --
> https://mail.python.org/mailman/listinfo/python-list
>


-- 
I am https://leavingnorway.info
Videos at https://www.youtube.com/user/TheBlogologue
Twittering at http://twitter.com/blogologue
Blogging at http://blogologue.com
Playing music at https://soundcloud.com/morten-w-petersen
Also playing music and podcasting here:
http://www.mixcloud.com/morten-w-petersen/
On Google+ here https://plus.google.com/107781930037068750156
On Instagram at https://instagram.com/morphexx/
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple TCP proxy

2022-07-27 Thread Barry


> On 27 Jul 2022, at 17:16, Morten W. Petersen  wrote:
> 
> Hi.
> 
> I'd like to share with you a recent project, which is a simple TCP proxy
> that can stand in front of a TCP server of some sort, queueing requests and
> then allowing n number of connections to pass through at a time:
> 
> https://github.com/morphex/stp
> 
> I'll be developing it further, but the files committed in this tree
> seem to be stable:
> 
> https://github.com/morphex/stp/tree/9910ca8c80e9d150222b680a4967e53f0457b465
> 
> I just bombed that code with 700+ requests almost simultaneously, and STP
> handled it well.

What is the problem that this solves?

Why not just increase the allowed size of the socket listen backlog if you
just want to handle bursts of traffic.

I do not think of this as a proxy, rather a tunnel. And the tunnel is a lot
more expensive than having the kernel keep the connection in the listen
socket backlog.

I work on a web proxy written in Python that handles huge load, using
backlog for the bursts.

It's async using twisted, as threads are not practical at scale.

Barry

> 
> Regards,
> 
> Morten
> 
> -- 
> I am https://leavingnorway.info
> Videos at https://www.youtube.com/user/TheBlogologue
> Twittering at http://twitter.com/blogologue
> Blogging at http://blogologue.com
> Playing music at https://soundcloud.com/morten-w-petersen
> Also playing music and podcasting here:
> http://www.mixcloud.com/morten-w-petersen/
> On Google+ here https://plus.google.com/107781930037068750156
> On Instagram at https://instagram.com/morphexx/
> -- 
> https://mail.python.org/mailman/listinfo/python-list
> 

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple TCP proxy

2022-07-27 Thread Martin Di Paola



On Wed, Jul 27, 2022 at 08:32:31PM +0200, Morten W. Petersen wrote:

You're thinking of the backlog argument of listen?


From my understanding, yes, when you set up the "accepter" socket (the
one that you use to listen and accept new connections), you can define
the length of the queue for incoming connections that are not accepted
yet.

This will be the equivalent of your SimpleQueue which basically puts a
limits on how many incoming connections are "accepted" to do a real job.

Using skt.listen(N) the incoming connections are put on hold by the OS
while in your implementation are formally accepted but they are not
allowed to do any meaningful work: they are put on the SimpleQueue and
only when they are popped then they will work (send/recv data).

The difference then between the OS and your impl is minimal. The only
case that I can think of is that on the clients' side there may exist a
timeout for the acceptance of the connection, so your proxy server will
eagerly accept these connections and no timeout is possible(*)

On a side note, your implementation is too thread-naive: it uses plain
Python lists, integers and boolean variables which are not thread safe.
It is a matter of time until your server starts behaving weirdly.

One option is that you use thread-safe objects. I'd encourage you to read
about thread-safety in general and then which sync mechanisms Python
offers.

Another option is to remove the SimpleQueue and the background function
that allows a connection to be "active".

If you think, the handlers are 99% independent except that you want to
allow only N of them to progress (establish and forward the connection)
and when a handler finishes, another handler "waiting" is activated, "in
a queue fashion" as you said.

If you allow me to drop the strict queue discipline, you can achieve the
same result by coordinating the handlers with semaphores. Once again, take
this email as a starting point for your own research.
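
Something along these lines (an illustrative sketch, not STP's code; the
handler body is a placeholder, and a semaphore wakes waiters in no
guaranteed order, hence no strict queue discipline):

    import threading

    MAX_ACTIVE = 25   # illustrative value
    gate = threading.BoundedSemaphore(MAX_ACTIVE)

    def handler(conn):
        with gate:    # blocks until one of the MAX_ACTIVE slots frees up
            try:
                data = conn.recv(4096)   # placeholder for the forwarding loop
                if data:
                    conn.sendall(data)
            finally:
                conn.close()

    def serve(accepter):
        while True:
            conn, _ = accepter.accept()
            threading.Thread(target=handler, args=(conn,), daemon=True).start()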

On a second side note, the use of handlers and threads is inefficient:
while only N active handlers are sending/receiving data, you are eagerly
accepting new connections, so many more handlers will be created, and (if
I'm not wrong) each will be a thread.

A more efficient solution could be:

1) accept as many connections as you can, saving the socket (not the
handler) in a thread-safe queue.
2) have N threads in the background popping a socket from the queue and
then doing the send/recv work. When a thread is done, it closes the
socket and pops another from the queue.

So the queue length will be the count of accepted connections, but at any
moment your proxy will not activate (forward) more than N connections.

This idea is thread-safe, simpler, and efficient, and it keeps the queue
discipline (I leave aside the usefulness). A rough sketch follows.
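
(Illustrative names and numbers; the worker body is a placeholder for the
real send/recv logic:)

    import queue
    import threading

    N = 25                    # illustrative limit
    pending = queue.Queue()   # thread-safe FIFO of accepted sockets

    def worker():
        while True:
            conn = pending.get()   # blocks until a socket is queued
            try:
                data = conn.recv(4096)   # placeholder for the send/recv work
                if data:
                    conn.sendall(data)
            finally:
                conn.close()

    def serve(accepter):
        for _ in range(N):
            threading.Thread(target=worker, daemon=True).start()
        while True:
            conn, _ = accepter.accept()   # accept eagerly...
            pending.put(conn)             # ...but at most N forward at a time

Because queue.Queue is FIFO, the first connection accepted is also the
first one forwarded.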

I encourage you to take the time to read about the different things
mentioned, as concurrency and thread-related topics are not easy to
master.

Thanks,
Martin.

(*) make your proxy server slow enough and yes, you will get timeouts
anyway.



Well, STP will accept all connections, but can limit how many of the
accepted connections are active at any given time.

So when I bombed it with hundreds of almost simultaneous connections, all
of them were accepted, but only 25 were actively sending and receiving data
at any given time. First come, first served.

Regards,

Morten

On Wed, Jul 27, 2022 at 8:00 PM Chris Angelico  wrote:


On Thu, 28 Jul 2022 at 02:15, Morten W. Petersen 
wrote:
>
> Hi.
>
> I'd like to share with you a recent project, which is a simple TCP proxy
> that can stand in front of a TCP server of some sort, queueing requests
and
> then allowing n number of connections to pass through at a time:

How's this different from what the networking subsystem already does?
When you listen, you can set a queue length. Can you elaborate?

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list




--
I am https://leavingnorway.info
Videos at https://www.youtube.com/user/TheBlogologue
Twittering at http://twitter.com/blogologue
Blogging at http://blogologue.com
Playing music at https://soundcloud.com/morten-w-petersen
Also playing music and podcasting here:
http://www.mixcloud.com/morten-w-petersen/
On Google+ here https://plus.google.com/107781930037068750156
On Instagram at https://instagram.com/morphexx/
--
https://mail.python.org/mailman/listinfo/python-list



Re: Simple TCP proxy

2022-07-27 Thread Chris Angelico
On Thu, 28 Jul 2022 at 04:32, Morten W. Petersen  wrote:
>
> Hi Chris.
>
> You're thinking of the backlog argument of listen?

Yes, precisely.

> Well, STP will accept all connections, but can limit how many of the
> accepted connections are active at any given time.
>
> So when I bombed it with hundreds of almost simultaneous connections, all of 
> them were accepted, but only 25 were actively sending and receiving data at 
> any given time. First come, first served.
>

Hmm. Okay. Not sure what the advantage is, but sure.

If the server's capable of handling the total requests-per-minute,
then a queueing system like this should help with burst load, although
I would have thought that the listen backlog would do the same. What
happens if the server actually gets overloaded though? Do connections
get disconnected after appearing connected? What's the disconnect
mode?

BTW, you probably don't want to be using the _thread module - Python
has a threading module which is better suited to this sort of work.
Although you may want to consider asyncio instead, as that has far
lower overhead when working with large numbers of sockets.
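
For what it's worth, a rough asyncio version of the same limit-N idea
(illustrative only, not STP's code; the backend address, port and limit
are made up):

    import asyncio

    MAX_ACTIVE = 25                  # illustrative limit
    BACKEND = ("127.0.0.1", 8000)    # assumed backend address
    gate = asyncio.Semaphore(MAX_ACTIVE)

    async def pipe(src, dst):
        # copy bytes one way until EOF, then half-close the other side
        while data := await src.read(4096):
            dst.write(data)
            await dst.drain()
        dst.write_eof()

    async def handle(reader, writer):
        async with gate:             # at most MAX_ACTIVE forwarded at once
            try:
                b_reader, b_writer = await asyncio.open_connection(*BACKEND)
                await asyncio.gather(pipe(reader, b_writer),
                                     pipe(b_reader, writer))
                b_writer.close()
            finally:
                writer.close()

    async def main():
        server = await asyncio.start_server(handle, "0.0.0.0", 8080)
        async with server:
            await server.serve_forever()

    asyncio.run(main())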

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple TCP proxy

2022-07-27 Thread Morten W. Petersen
Hi Chris.

You're thinking of the backlog argument of listen?

Well, STP will accept all connections, but can limit how many of the
accepted connections are active at any given time.

So when I bombed it with hundreds of almost simultaneous connections, all
of them were accepted, but only 25 were actively sending and receiving data
at any given time. First come, first served.

Regards,

Morten

On Wed, Jul 27, 2022 at 8:00 PM Chris Angelico  wrote:

> On Thu, 28 Jul 2022 at 02:15, Morten W. Petersen 
> wrote:
> >
> > Hi.
> >
> > I'd like to share with you a recent project, which is a simple TCP proxy
> > that can stand in front of a TCP server of some sort, queueing requests
> and
> > then allowing n number of connections to pass through at a time:
>
> How's this different from what the networking subsystem already does?
> When you listen, you can set a queue length. Can you elaborate?
>
> ChrisA
> --
> https://mail.python.org/mailman/listinfo/python-list
>


-- 
I am https://leavingnorway.info
Videos at https://www.youtube.com/user/TheBlogologue
Twittering at http://twitter.com/blogologue
Blogging at http://blogologue.com
Playing music at https://soundcloud.com/morten-w-petersen
Also playing music and podcasting here:
http://www.mixcloud.com/morten-w-petersen/
On Google+ here https://plus.google.com/107781930037068750156
On Instagram at https://instagram.com/morphexx/
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple TCP proxy

2022-07-27 Thread Chris Angelico
On Thu, 28 Jul 2022 at 02:15, Morten W. Petersen  wrote:
>
> Hi.
>
> I'd like to share with you a recent project, which is a simple TCP proxy
> that can stand in front of a TCP server of some sort, queueing requests and
> then allowing n number of connections to pass through at a time:

How's this different from what the networking subsystem already does?
When you listen, you can set a queue length. Can you elaborate?

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Simple TCP proxy

2022-07-27 Thread Morten W. Petersen
Hi.

I'd like to share with you a recent project, which is a simple TCP proxy
that can stand in front of a TCP server of some sort, queueing requests and
then allowing n number of connections to pass through at a time:

https://github.com/morphex/stp

I'll be developing it further, but the files committed in this tree
seem to be stable:

https://github.com/morphex/stp/tree/9910ca8c80e9d150222b680a4967e53f0457b465

I just bombed that code with 700+ requests almost simultaneously, and STP
handled it well.

Regards,

Morten

-- 
I am https://leavingnorway.info
Videos at https://www.youtube.com/user/TheBlogologue
Twittering at http://twitter.com/blogologue
Blogging at http://blogologue.com
Playing music at https://soundcloud.com/morten-w-petersen
Also playing music and podcasting here:
http://www.mixcloud.com/morten-w-petersen/
On Google+ here https://plus.google.com/107781930037068750156
On Instagram at https://instagram.com/morphexx/
-- 
https://mail.python.org/mailman/listinfo/python-list