Re: filtering HTTPS/CONNECT (summary and continuation of discussion)

Marcus Kool Fri, 16 Mar 2012 14:06:06 -0700

There were 4 threads about 'filtering HTTPS' and I will try to
summarise here.


Current situation with Squid 3.1.19:
What happens inside a CONNECT is practically not filterable because
1) sslBump is not used, or
2) sslBump is used and SSL+HTTP can be filtered, but it breaks the
   other data streams for Skype et al.  Using the unsafe options
   'sslproxy_cert_error allow all' and 'sslproxy_flags DONT_VERIFY_PEER'
   to circumvent the latter problem is far from desirable.

The wiki features pages say that Alex Rousskov is working on BumpSslServerFirst
and MimicSslServerCert but unfortunately Alex has not (yet) participated in the
discussion.

What I consider as the "desired situation":
*all* traffic will be filterable, since if there is an exception for
one category of data, one can write an application that makes a tunnel
using this particular category of data and hence is able to circumvent
all efforts to filter traffic.

To filter HTTP is trivial. To filter HTTPS there are two options:
1) to filter without sslBump and then the filter only receives
   "CONNECT <endpoint>:443" on which it has to make a decision to block
   or not.  This cripples the filter since it does not have access to the
   content and in many cases can not detect which application sends
   what (type of) data.
   An additional drawback is that a connection can be blocked but an
   understandable error message cannot be presented to the end user.
2) use sslBump. The filter will receive "CONNECT <endpoint>:443" as well as
   "https://endpoint/path" (and content for RESPMOD) for SSL+HTTP based
   connections so this is optimal for filtering SSL+HTTP connections.
   The discussion was much around what to do with data streams that are not
   SSL+HTTP.  This can be any protocol encapsulated by SSL or simply any
   protocol.
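
To illustrate how little option 1 gives the filter to work with, here is a
minimal Python sketch of a filter that only sees the CONNECT request line.
The function name decide_connect, the blocklist and the choice to block
malformed lines are my assumptions for illustration; none of this is Squid
or redirector API.

```python
import re

# Hypothetical blocklist; a real filter would consult a category database.
BLOCKED_HOSTS = {"vpn.example.com"}

# "CONNECT host:port" with an optional HTTP version, host may be an IPv6 literal.
CONNECT_RE = re.compile(
    r"^CONNECT\s+(\[[0-9A-Fa-f:.]+\]|[^\s:]+):(\d+)(?:\s+HTTP/\d\.\d)?$"
)


def decide_connect(request_line, blocked_hosts=BLOCKED_HOSTS):
    """Return 'allow' or 'block' using only the endpoint name and port."""
    match = CONNECT_RE.match(request_line.strip())
    if match is None:
        # Assumption: anything that is not a well-formed CONNECT is refused.
        return "block"
    host = match.group(1).lower()
    # No path, no content, no application type: the host is all we have.
    return "block" if host in blocked_hosts else "allow"
```

The decision rests on the host name alone, which is exactly the limitation
described under 1).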

To be able to filter all data, Squid needs a modification to present raw data
about the non-SSL+HTTP data streams to a filter (URL redirector or ICAP).
To keep the discussion focussed on one type of filter I will assume that
an ICAP server is used as the filter.

The ICAP protocol has a considerable overhead (CPU processing) and extending
the ICAP protocol for data stream filtering is not the first choice.
Amos and Henrik were "optimistic" about implementing a new pipe filter.

The data streams for a bidirectional pipe have a different behavior than
HTTP and SSL+HTTP. Both client and server can send data at any time. And
for some, the server initiates the protocol and for others, the client
initiates.  OpenVPN is a chameleon and can pretend to be an SSL+HTTP server
but is also a VPN server.

In all cases where Squid sends a request to a filter, it would be
a *big* plus if it informs the filter what it already knows about the
CONNECT endpoint, e.g. whether it uses SSL/TLS or not.

Since sslBump is being rewritten for 3.3 it is a good opportunity
to make Squid suitable for filtering *all* data streams.

The new sslBump flow could be something like this:

A) open socket to server. If error, close socket to client.
B) do the logic for ICAP REQMOD CONNECT endpoint:443
C) start SSL handshake to server and take care of all certificate issues.
   If the SSL handshake fails with a PROTOCOL error, the socket must be closed,
   a new socket must be opened, and Squid will assume that the endpoint
   uses a protocol other than SSL. Squid goes into tunnel mode and all
   filtering will be done by the new pipe filter.
   Squid may get a new option to define its behaviour in case the SSL handshake
   fails. The option could be called sslBumpForNoneSSL with values
   prohibitNoneSSL (terminate connection), passNoneSSL (always allow),
   filterNoneSSL (default value - let new pipe filter decide).
D) Squid now knows that the connection has an SSL/TLS wrapper but does not know
   yet if inside the wrapper HTTP is used.
   Squid monitors what the client *and* the server send on the pipe. If the
   client sends first and sends a valid HTTP command, Squid assumes that the
   connection has SSL+HTTP.
   If there is no SSL+HTTP Squid goes into tunnel mode and all filtering will be
   done with the new pipe filter.
E) do the "normal processing" and ICAP REQMOD/RESPMOD for https://endpoint/path
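
The decisions in steps A, C and D can be sketched in Python as below. This
is only a model of the proposed flow, not Squid code: the names Mode,
classify and looks_like_http_request are made up for illustration, the
list of HTTP methods is my own choice, and step B (the ICAP REQMOD for the
CONNECT itself) is left out. The policy values are the ones proposed for
sslBumpForNoneSSL.

```python
from enum import Enum


class Mode(Enum):
    CLOSED = "connection closed"
    TUNNEL_FILTERED = "tunnel mode, pipe filter decides"
    TUNNEL_UNFILTERED = "tunnel mode, always allowed"
    BUMPED_HTTP = "SSL+HTTP, normal REQMOD/RESPMOD"


# Assumption: a fixed list of methods counts as "a valid HTTP command".
HTTP_METHODS = (b"GET", b"HEAD", b"POST", b"PUT", b"DELETE",
                b"OPTIONS", b"TRACE", b"CONNECT", b"PATCH")


def looks_like_http_request(data):
    """True if the first line has the shape 'METHOD target HTTP/x.y'."""
    first_line = data.split(b"\r\n", 1)[0]
    parts = first_line.split(b" ")
    return (len(parts) == 3 and parts[0] in HTTP_METHODS
            and parts[2].startswith(b"HTTP/"))


def classify(server_reachable, ssl_handshake_ok, first_sender, first_bytes,
             non_ssl_policy="filterNoneSSL"):
    """Map what Squid has observed to the way the connection is handled."""
    if not server_reachable:                      # step A
        return Mode.CLOSED
    if not ssl_handshake_ok:                      # step C, protocol error
        if non_ssl_policy == "prohibitNoneSSL":
            return Mode.CLOSED
        if non_ssl_policy == "passNoneSSL":
            return Mode.TUNNEL_UNFILTERED
        return Mode.TUNNEL_FILTERED               # filterNoneSSL (default)
    # Step D: SSL wrapper present, check whether HTTP is inside it.
    if first_sender == "client" and looks_like_http_request(first_bytes):
        return Mode.BUMPED_HTTP                   # continue with step E
    return Mode.TUNNEL_FILTERED
```

A server that speaks first inside the SSL wrapper (for example a mail
server greeting) ends up in Mode.TUNNEL_FILTERED, a client that opens with
"GET /path HTTP/1.1" ends up in Mode.BUMPED_HTTP.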

The total work of Squid+filter can be reduced if B) is done after C) since
Squid can inform the filter about the SSL handshake and the filter does
not have to do its own probe.

There was a suggestion for a connection cache which allows Squid to skip checks
and make assumptions about a new CONNECT to an endpoint that was CONNECTed
before.
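
Such a cache could look roughly like the Python sketch below. The class
name EndpointCache, the shape of the stored facts and the expiry time are
all my assumptions; the discussion did not settle what is cached or for
how long.

```python
import time


class EndpointCache:
    """Remembers what was learned about an endpoint during earlier CONNECTs."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        # Assumption: entries expire, since an endpoint can change behaviour.
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}

    def remember(self, host, port, facts):
        """Store facts (e.g. {'uses_ssl': True}) for host:port."""
        key = (host.lower(), port)
        self._entries[key] = (self._clock() + self._ttl, dict(facts))

    def lookup(self, host, port):
        """Return the cached facts, or None if unknown or expired."""
        key = (host.lower(), port)
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, facts = entry
        if self._clock() >= expires_at:
            del self._entries[key]
            return None
        return facts
```

On a cache hit Squid could skip the probe and pass the cached facts to the
filter; on a miss it would do the full checks and call remember()
afterwards.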

The new pipe filter requires a new protocol yet to be defined.
Squid initially tells the filter what it already knows about the endpoint.
That is: whether it uses SSL or not, time to CONNECT, endpoint address, cached
information.
The Squid pipe sends copies of all data to the filter and the filter can reply
with one of the following: OK (proceed with this data), REPLACE-CONTENT (content
and a flag to optionally also terminate the connection), TERMINATE (just close
sockets), OK-FOR-ALL (proceed and do not consult me any more for this 
connection).
Squid also informs the filter when the connection is terminated by the
client or the server.
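
Since the protocol is not defined yet, the following Python sketch only
models how Squid could act on the four replies. The names Verdict, Reply
and run_pipe are hypothetical, and the filter is an ordinary function here
rather than a separate process speaking a wire protocol.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    OK = "OK"
    REPLACE_CONTENT = "REPLACE-CONTENT"
    TERMINATE = "TERMINATE"
    OK_FOR_ALL = "OK-FOR-ALL"


@dataclass
class Reply:
    verdict: Verdict
    content: bytes = b""      # only used with REPLACE-CONTENT
    terminate: bool = False   # optional flag of REPLACE-CONTENT


def run_pipe(chunks, filter_fn):
    """Pass chunks through filter_fn; return (forwarded_chunks, terminated)."""
    forwarded = []
    consult = True
    for chunk in chunks:
        if not consult:
            forwarded.append(chunk)       # OK-FOR-ALL was given earlier
            continue
        reply = filter_fn(chunk)
        if reply.verdict is Verdict.TERMINATE:
            return forwarded, True        # just close the sockets
        if reply.verdict is Verdict.REPLACE_CONTENT:
            forwarded.append(reply.content)
            if reply.terminate:
                return forwarded, True
            continue
        if reply.verdict is Verdict.OK_FOR_ALL:
            consult = False               # do not ask the filter again
        forwarded.append(chunk)           # OK and OK-FOR-ALL both proceed
    return forwarded, False
```

A real pipe would run one such loop per direction, since both client and
server can send at any time, and would also notify the filter when either
side closes the connection.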

How do we go on from here?
