Hey Everybody,

I have seen a couple of free proxy/VPN providers like:
Urban VPN
NordVPN
ClearVPN

And a couple of other proxy services.

A long time ago I wrote the article:
A Proxy for each Internet user! The future!

https://www1.ngtech.co.il/wpe/2016/05/02/proxy-per-internet-user-is-it-realistic/

And I was just wondering about a thing or two regarding HTTP proxies.

Most VPN services use and support OpenVPN, WireGuard and other VPN
protocols at the routing level.
These are simple, need some kind of "smart" CGNAT to operate, and are
cheaper to run than an HTTP proxy since they work at a lower level of
the connection.
For example, you can give a static private IP to the client in your system,
apply all the relevant routing and NAT rules, and the connection
will automatically go out with the relevant external IP.
Also, if you need another IP address you can just spin up an "exit" node on
any public cloud and add it into the pool of routes.
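As a rough illustration of the route-level approach, here is a minimal
shell sketch (interface names, addresses and the table number are all
made up for the example) that pins one VPN client to a static private IP
and NATs it out through a specific exit interface:

  # client is 10.8.0.42 on wg0, exit interface eth1 owns 203.0.113.10
  # send this client's traffic through a dedicated routing table
  ip rule add from 10.8.0.42 lookup 100
  ip route add default via 203.0.113.1 dev eth1 table 100

  # source-NAT the client behind the exit node's external IP
  iptables -t nat -A POSTROUTING -s 10.8.0.42 -o eth1 \
    -j SNAT --to-source 203.0.113.10

Swapping the client's exit node is then just a matter of changing the
rule and the route.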

But there is another option, the proxy way of doing things:
either SOCKS or a plain HTTP proxy.

But let's start with a plain HTTP proxy to simplify things.

Let's say I want to spin up a couple of squid "exit" nodes, and I would like
to have a frontend that will route traffic based on authentication details.
I have seen an answer, unverified since 2013, at:
https://access.redhat.com/solutions/259903

To make it all work we first need to assume that

  never_direct allow all

will force all CONNECT requests to a cache_peer (since there aren't too many
plain-HTTP services left other than MS updates and a couple of others).

There is also another problem: how do we route clients based on credentials
from a frontend to the backend exit nodes / cache peers?
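A minimal frontend squid.conf sketch of what I have in mind (peer
addresses, user names and the password file path are placeholders, and
basic auth only since this is a lab):

  # credentials are the routing vector
  auth_param basic program /usr/lib/squid/basic_ncsa_auth /etc/squid/passwd
  acl authed proxy_auth REQUIRED

  # one ACL per user, one cache_peer per exit node
  acl user_a proxy_auth alice
  acl user_b proxy_auth bob

  cache_peer 10.200.0.11 parent 3128 0 no-query no-digest name=exit1
  cache_peer 10.200.0.12 parent 3128 0 no-query no-digest name=exit2

  cache_peer_access exit1 allow user_a
  cache_peer_access exit1 deny all
  cache_peer_access exit2 allow user_b
  cache_peer_access exit2 deny all

  # never go direct, so CONNECT is also forwarded to the matching peer
  never_direct allow all

  http_access allow authed
  http_access deny all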

There are a couple of issues with this kind of setup.
Since the client connects to the proxy service in plain text, the connection
can be intercepted, so we will assume that the user has some secure way to
reach the proxy,
i.e. WireGuard, OpenVPN, SSTP or some other IPsec-based solution, or any
other alternative such as a trusted network...

The next step in this setup is securing the connections between the proxies.
For this we need some kind of network of connections between the hub (or
hubs) and the exit nodes.
If both the hub and the exit node have a public IP address (possibly behind
a 1:1 NAT) and can communicate directly, they can use WireGuard or OpenVPN
to secure their connections.
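For example, a minimal WireGuard pair (keys, hostnames and the
10.200.0.0/24 tunnel CIDR are placeholders) could look like this.
On the hub, /etc/wireguard/wg0.conf:

  [Interface]
  Address = 10.200.0.1/24
  ListenPort = 51820
  PrivateKey = <hub-private-key>

  # one [Peer] section per exit node
  [Peer]
  PublicKey = <exit1-public-key>
  AllowedIPs = 10.200.0.11/32

And on the exit node:

  [Interface]
  Address = 10.200.0.11/24
  PrivateKey = <exit1-private-key>

  [Peer]
  PublicKey = <hub-public-key>
  Endpoint = hub.example.net:51820
  AllowedIPs = 10.200.0.0/24
  PersistentKeepalive = 25

Then "wg-quick up wg0" on both sides, and the frontend can reach the
exit nodes on their tunnel addresses.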
There are a couple of other things that need to be sorted out: the
provisioning of the exit nodes, their registration, and a status check for
each of them.
Any of the hubs needs to be able to handle a couple of these tasks with a
bit of automation and a couple of UUID generators.

I wanted to build such a tiny setup but I am missing a couple of things in
the specs for such a system.
I have seen this nice post:
* 
https://www.blackhatworld.com/seo/developer-needed-to-build-scripts-to-create-proxies-using-haproxy-or-squid-advanced-expertise-required.1300167/

So I am trying to mimic a tiny WWW-like net.
The first thing is to have two or three ifconfig.io nodes with a very tiny
footprint that I will use to test the setup.
The next thing is the basic WWW net, i.e. a couple of sites with BGP; each
will have a /24(?) CIDR behind it, and there will be a central /24(?) for
all of them.
Since it's a lab, it's preferable that all of these have a very small
resource footprint.
We can use a simple container network and the next piece of software:
* https://github.com/georgyo/ifconfig.io
* https://hub.docker.com/r/elicro/ifconfig.io
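Something like this should be enough to spin up one test node (assuming
the image listens on 8080, which the upstream project lets you override
with the PORT environment variable):

  docker run -d --name ifconfig1 -p 8080:8080 elicro/ifconfig.io
  curl http://127.0.0.1:8080/ip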

For the tests we might need a root CA, but it's not really relevant since -k
is good enough for most basic tests with curl... we assume the connection is
secured already.

Networks that we can use, private only(?):
192.168.0.0/16
10.0.0.0/8
172.16.0.0/12

We can also use the CGNAT CIDR:
100.64.0.0/10

* https://www.rfc-editor.org/rfc/rfc6598

And just for those who need it:
* https://www.ngtech.co.il/ipcalc/
* https://hub.docker.com/r/elicro/ipcalc


So first we will need one central hub for automation, registration and
management.
It will use a couple of internal CIDRs and a couple of 1:1 NAT address
spaces.

The end result should be a couple of tiny clients that will run a couple of
curl tests with a username and password that will be the routing vector for
the setup.
So we will have one main hub, and this hub will have one port that will
listen for all proxy requests with usernames and passwords.
So basically we need an office and an internet connection, an idea, and all
the automation tools to implement it.
Currently AWS and many other providers have enough automation tools to take
some of the heavy lifting off the table.
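The client-side test would then be a one-liner per user (hub address,
port and credentials are placeholders); each username should come back
with a different external IP:

  # alice should exit via one node, bob via another
  curl -s -x http://alice:secret1@hub.lab:10000 http://ifconfig.io/ip
  curl -s -x http://bob:secret2@hub.lab:10000 http://ifconfig.io/ip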
So now for the DB and registration system.
For each exit node we need a UUID and a couple of specific services (a
minimal agent sketch follows the list):
* health check
* external IP verification
* registration against the hub
* VPN to the central hub? (complexity... but flexibility around the NAT
connection-tracking limit of the office/proxy IP)
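A minimal sketch of the node-local agent, assuming a hypothetical HTTP
registration endpoint on the hub's tunnel address:

  #!/bin/sh
  # hypothetical exit-node agent: verify the external IP, then
  # register/heartbeat against the hub (the endpoint name is made up)
  UUID="$(cat /etc/exit-node/uuid)"          # provisioned via cloud-init
  EXT_IP="$(curl -s http://ifconfig.io/ip)"  # external IP verification

  curl -s -X POST "http://10.200.0.1:8080/register" \
    -d "uuid=${UUID}" -d "ip=${EXT_IP}"

Run it from cron (or a systemd timer / a supervised loop) and the hub gets
both the registration and a periodic health signal.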

In the central office we need, let's say, an HTTP proxy on port 10000, which
will be port-forwarded to a single squid proxy server with a floating IP and
a redundant server.
If we had a secure channel between the proxies and the central office it
would be much simpler to register new proxies
(assuming each proxy receives its UUID, registration and VPN details in its
cloud-init or any other initialization method).

So we would have a DB which holds a UUID and configuration details, prepared
in advance, for registration, health checks and status.

The squid.conf of the proxy should be created dynamically, since there are
changes in the network...
Unless we assume a specific capacity and an internal connection between the
hub and the proxy.
If we assume an internal connection between the hub and the proxies, we can
dedicate a CIDR for the proxies.
Then we can create a pretty "static" squid.conf (a big one...) and change
the configuration in the DB, so
helpers will help us decide which proxy is up or down and which of the
static cache_peers a username and password will use.
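For the helper part, squid's external ACL interface fits: a helper that
checks a user-to-peer map (a flat file here instead of a real DB; the
helper name and map path are made up) can decide which of the static
cache_peers a username is allowed to use:

  external_acl_type peermap ttl=60 %LOGIN /usr/local/bin/peer_lookup.sh
  acl use_exit1 external peermap exit1
  acl use_exit2 external peermap exit2
  cache_peer_access exit1 allow use_exit1
  cache_peer_access exit2 allow use_exit2

And /usr/local/bin/peer_lookup.sh:

  #!/bin/sh
  # squid sends "<username> <peer>" per lookup; answer OK or ERR
  while read user peer; do
    if grep -q "^${user} ${peer}$" /etc/squid/user_peers.map; then
      echo OK
    else
      echo ERR
    fi
  done

Flipping a user to another exit node is then just an edit of the map (or
the DB behind it), without touching the "static" squid.conf.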

What do you think about this? How would it work?
Squid can handle this kind of load with a couple of workers and a couple of
scripts, but creating such a setup is a bit of a job.
Let's say I assume a network of 10 proxies which will spin up and down; how
would it work?
How many resources are required to run and test such a setup?

I believe a demo can all be done with Linux network namespaces on a
single-node setup, but it's not like the real world...
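For example, wiring one "exit node" namespace to the host with a veth pair
(names and addressing are arbitrary) takes only a few commands:

  ip netns add exit1
  ip link add veth-exit1 type veth peer name eth0 netns exit1
  ip addr add 10.200.1.1/30 dev veth-exit1
  ip link set veth-exit1 up
  ip netns exec exit1 ip addr add 10.200.1.2/30 dev eth0
  ip netns exec exit1 ip link set eth0 up
  ip netns exec exit1 ip link set lo up
  ip netns exec exit1 ip route add default via 10.200.1.1

Repeat per node, add NAT on the host, and you have the whole topology on
one machine.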
What OS would you use in such a setup?
These days any Linux OS needs at least 512 MB of RAM to run nicely, so I
assume an Alpine-based setup would be nice, but...
it's not like RHEL systems: there are scripts that would need to be written
and supervised (compared to just using systemd) etc...

Let me know if the plan I wrote up seems reasonable enough.

( 6.0.3 here I'm coming, here since 3.2 beta )

Eliezer

