[squid-dev] Squid's mailman
Hi all, I posted a message today on squid-us...@lists.squid-cache.org and got 2 DMARC reports back (more may follow in the next 24 hours) indicating that the message that mailman forwards to list members is being rejected for some of them - and there could be many more rejections from servers that block without sending a DMARC report. It seems that the mailman software that lists.squid-cache.org uses does not follow current best practices: when a subscriber's mail server does SPF checks, it notices that the From header does not match the IP address of Squid's mailman server and rejects/quarantines the message, so the message never reaches that subscriber. It could be that many messages (not just mine) do not reach the mailboxes of list subscribers. Marcus ___ squid-dev mailing list squid-dev@lists.squid-cache.org https://lists.squid-cache.org/listinfo/squid-dev
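The failure mode described above can be illustrated with a small sketch. This is not how mailman or any receiving MTA is implemented - just a minimal, hypothetical SPF-style check (record contents and IP addresses are made up) showing why a forwarded message fails when the list server's IP is not in the original From domain's SPF record:

```python
import ipaddress

def spf_permits(spf_record, sender_ip):
    """Minimal check of ip4:/-all mechanisms in an SPF TXT record.
    Ignores include:, a:, mx:, etc. - illustration only."""
    ip = ipaddress.ip_address(sender_ip)
    for term in spf_record.split():
        if term.startswith("ip4:"):
            if ip in ipaddress.ip_network(term[4:]):
                return True
        elif term == "-all":
            return False          # hard fail: reject/quarantine
    return False

# Hypothetical SPF record of the original author's domain:
record = "v=spf1 ip4:203.0.113.0/24 -all"

# Direct delivery from the author's own server: permitted.
print(spf_permits(record, "203.0.113.10"))   # True

# Same From: header, but forwarded by the mailing list server
# (a different IP): SPF hard-fails and the message is rejected.
print(spf_permits(record, "198.51.100.7"))   # False
```

This is also why mailing lists commonly rewrite the From header or rely on ARC to survive DMARC enforcement.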
Re: [squid-dev] squid website certificate error
I typed the name of the website without https in the address bar. I am not sure how it got redirected to the https address (could be my browser history or the web server). In Firefox and Vivaldi I get the correct site. When I type 'www.squid-cache.org' in the address bar of Chrome it goes very wrong, showing the contents of https://grass.osgeo.org/. Maybe Chrome tries https first and then http. Marcus On 29/08/2022 18:52, Francesco Chemolli wrote: The squid website is not supposed to be served over https, because it is served by multiple mirrors not necessarily under the project's control. We have some ideas on how to change this but need the developer time to do it. Help is welcome :) On Mon, 29 Aug 2022 at 15:25, Marcus Kool wrote: Has anybody already complained that the certificates for squid-cache.org and www.squid-cache.org are messed up? Marcus
[squid-dev] squid website certificate error
Has anybody already complained that the certificates for squid-cache.org and www.squid-cache.org are messed up? Marcus
[squid-dev] TLS 1.3 0rtt
After reading https://www.privateinternetaccess.com/blog/2018/11/supercookey-a-supercookie-built-into-tls-1-2-and-1-3/ I am wondering whether the TLS 1.3 implementation in Squid will have an option to disable the 0-RTT (early data) feature so that user tracking is reduced. Marcus
Re: [squid-dev] Block users dynamically
On 28/05/18 15:10, dean wrote: I am implementing modifications to Squid 3.5.27 for a thesis project. At some point in the code, I need to block a user. What I'm doing is writing to an external file that is used in the configuration, like Squish does. But it does not block the user; however, when I reconfigure Squid, it does block it. Is there something I do not know? When I change the file, should I reconfigure Squid? Is there another way to block users dynamically from the Squid code? You can use ufdbGuard for this purpose. ufdbGuard is a free URL redirector for Squid which can be configured to re-read lists of usernames or lists of IP addresses every X minutes (the default for X is 15). So if you maintain a blacklist of usernames and write the name of the user to the defined file, ufdbguardd will block these users. If the user must be blocked immediately you need to reload ufdbguardd; otherwise you wait until the configured interval for re-reading the user list expires, and after a few minutes the user gets blocked. Note that reloading ufdbguardd does not interfere with Squid, and all activity by browsers and Squid continues normally. Marcus
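The workflow described above can be sketched in a few lines of shell. The file path, list name, and reload mechanism below are assumptions for illustration - consult the ufdbGuard documentation for the real ones:

```shell
# Hypothetical path of the username blacklist that ufdbguardd re-reads:
BLOCKLIST=/tmp/blockedusers

# Add a user to the blacklist; within the configured interval
# (default 15 minutes) ufdbguardd re-reads the file and blocks the user.
echo "baduser" >> "$BLOCKLIST"

# To apply the change immediately, reload ufdbguardd instead of waiting.
# (The reload mechanism is an assumption; shown here as a comment only.)
# kill -HUP "$(pidof ufdbguardd)"

grep -q '^baduser$' "$BLOCKLIST" && echo "baduser is on the blocklist"
```

The key point is that Squid itself is never reconfigured: only the external file changes, and the redirector picks it up.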
[squid-dev] wiki.squid-cache.org has an expired certificate
Re: [squid-dev] [PATCH] Bug 4662 adding --with-libressl build option
Do you think we can compromise and call it USE_OPENSSL_OR_LIBRESSL? Or call it USE_OPENSSL_API, and then the code will eventually have few or no occurrences of USE_OPENSSL and USE_LIBRESSL to deal with OpenSSL and LibreSSL specifics. Marcus
Re: [squid-dev] g++ 4.8.x and std::regex problems
On 11/28/2016 07:46 PM, Alex Rousskov wrote: Please undo that commit and let's discuss whether switching from libregex to std::regex now is a good idea. Thank you, Alex. Has anybody considered using RE2? It is a regex library that is fast, written in C++, high quality, public domain, and supported by older compilers. Marcus
Re: [squid-dev] [RFC] simplifying ssl_bump complexity
On 11/27/2016 11:20 PM, Alex Rousskov wrote: On 11/19/2016 07:06 PM, Amos Jeffries wrote: On 20/11/2016 12:08 p.m., Marcus Kool wrote: The current ssl_bump steps allow problematic configs where Squid bumps or stares in one step and splices in another step, which can be resolved (made impossible) in a new configuration syntax. It would be nice to prohibit truly impossible actions at the syntax level, but I suspect that the only way to make that possible is to focus on final actions [instead of steps] and require at *most* one ssl_bump rule for each of the supported final actions: ssl_bump splice ...rules that define when to splice... ssl_bump bump ...rules that define when to bump... ssl_bump terminate ...rules that define when to terminate... # no other ssl_bump lines allowed! The current intermediate actions (peek and stare) would have to go into the ACLs. There will be no ssl_bump rules for them at all. In other words, the admin would be required to _always_ write an equivalent of: if (a1() && a2() && ...) then splice elsif (b1() && b2() && ...) then bump elsif (c1() && c2() && ...) then terminate else splice or bump, depending on state (or some other default; this decision is secondary) endif where a1(), b2(), and other functions/ACLs may peek or stare as needed to get the required information. The above if-then-else tree is clear. I like your suggestion to drop steps in the configuration and make Squid intelligent enough to take decisions at the appropriate moments (steps). You mentioned admins being surprised about Squid bumping to deliver an error notification, and one way to improve that is to replace 'terminate' with 'terminate_with_error' (with bumping) and 'quick_terminate' (no bumping, just close the fd). quick_terminate, if used, is also faster, which is an added benefit. I am not sure such a change is desirable, but it is worth considering, I guess. Please note that I am ignoring the directives/actions naming issue for now.
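As an illustration of the 'final actions only' scheme sketched above, a hypothetical squid.conf fragment might look like this. The ACL names are invented and the rules are a proposal being discussed, not existing Squid syntax:

```
# At most one ssl_bump rule per final action;
# peek/stare decisions live inside the ACLs, not in the rules.
acl finance_sites ssl::server_name .bank.example.com   # hypothetical ACL
acl bad_protocol  ssl::handshake_failure               # hypothetical ACL

ssl_bump splice    finance_sites   # never decrypt banking traffic
ssl_bump terminate bad_protocol    # drop non-TLS traffic on TLS ports
ssl_bump bump      all             # decrypt everything else
# no other ssl_bump lines allowed
```

Because each final action appears at most once, the "splice in one step, bump in another" class of nonsense configs cannot be expressed at all.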
AFAICT, the syntax proposed by Amos (i.e., making stepN mandatory) does not solve this particular problem at all: # Syntactically valid nonsense: ssl_bump_step1 splice all ssl_bump_step2 bump all and neither does yours: # Syntactically valid nonsense? tls_server_hello passthrough all tls_client_hello terminate all Correct. It would be nice to have a better configuration syntax where impossible rules are easier to avoid and/or Squid has the intelligence to detect nonsense rules and produce an error. Below is a new proposal that attempts to make the configuration more intuitive and less prone to admin misunderstandings. First the admin must define if there is any bumping at all. This could be done with https_decryption on|off This is similar to tls_new_connection peek|splice but much more intuitive. I do not see why https_decryption off is more intuitive (or more precise) than ssl_bump splice all For me this is because of the terminology used, and because with 'https_decryption off' one does not write anything that has 'bump' in it, so the admin does not even have to read the documentation to learn that 'ssl_bump splice all' means 'no decryption'. especially after you consider the order of directives. Again, I am ignoring the naming issue for now. You may assume any name you want for any directive or ACL. All right for now. The only comment that I want to make without starting a new thread is that I think that conceptual terms are better than technical terms (hence my preference for 'passthrough' instead of 'splice'). But let's save this discussion for later. Iff https_decryption is on: 1) the "connection" step: When a browser uses "CONNECT <FQDN>" Squid does not need to make peek or splice decisions. When Squid intercepts a connection to "port 443 of <some IP>", no peek or splice decision is made here any more. This step becomes obsolete in the proposed configuration. I am still hoping that will happen.
But also still getting pushback that people want to terminate or splice without even looking at the clear-text hello details. I suspect that not looking at some SSL Hellos will always be needed, because some of those Hellos are not Hellos at all and it takes too much time/resources for the SSL Hello parser to detect some non-SSL Hellos. Besides that, it is always nice to be able to selectively bypass the complex Hello parser code in emergencies. Perhaps fewer resources are used if there is a two-stage parser: 1) a quick scan of the input for the data layout of a ClientHello without semantically parsing the content, e.g. look at the CipherSuite field and verify that the whole field has legal bytes without verifying that it is a valid list of cipher suites. 2) do the complex parsing. Stage 1 should be fast and can separate SSL ClientHellos from other protocols. 3) the "TLS server
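A stage-1 scan as proposed above can be as cheap as checking the fixed record layout, with no semantic parsing at all. A minimal sketch (the checks Squid would actually need are of course more involved):

```python
def looks_like_client_hello(data: bytes) -> bool:
    """Stage 1: cheap layout check for a TLS ClientHello record.
    No semantic parsing - just enough to separate TLS from other protocols."""
    if len(data) < 6:
        return False
    # TLS record header: content type 0x16 (handshake),
    # legacy version 3.x, then a 2-byte record length.
    if data[0] != 0x16 or data[1] != 0x03 or data[2] > 0x04:
        return False
    # First handshake message must be a ClientHello (type 0x01).
    return data[5] == 0x01

# A plausible start of a TLS 1.2 ClientHello record:
tls = bytes([0x16, 0x03, 0x03, 0x00, 0xc0, 0x01])
print(looks_like_client_hello(tls))                      # True
print(looks_like_client_hello(b"GET / HTTP/1.1\r\n"))    # False
```

Anything that fails this layout test can immediately be handed to the passthrough/terminate logic without ever entering the expensive stage-2 parser.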
Re: [squid-dev] [RFC] simplifying ssl_bump complexity
Hi Amos, Can you share your thoughts? Thanks Marcus On 11/20/2016 10:55 AM, Marcus Kool wrote: On 11/20/2016 12:06 AM, Amos Jeffries wrote: On 20/11/2016 12:08 p.m., Marcus Kool wrote: [snip] I like the intent of the proposal and the new tls_* directives. What currently makes configuration in Squid 3/4 difficult is the logic of 'define in step x what to do in the next step', and IMO this logic is the main cause of misunderstandings and incorrect configurations. Also the terms 'bump' and 'splice' do not help ease of understanding. Since Squid evolved and bumping changed from 3.3 - 3.4 - 3.5 to 4.x, and likely will change again in 5.x, there is an opportunity to improve things more than is proposed. There is also a difference in dealing with transparently intercepted connections and direct connections (browsers doing a CONNECT) which also causes some misunderstandings. The current ssl_bump steps allow problematic configs where Squid bumps or stares in one step and splices in another step, which can be resolved (made impossible) in a new configuration syntax. I propose to use a new logic for the configuration directives where 'define in step x what to do in the next step' is replaced with a new logic: 'define in step x what to do _now_'. From reading the below I think you are mistaking what "now" means to Squid. Input access control directives in squid.conf make a decision about what action to take based on some state that just arrived. Maybe it is necessary to redefine 'now', but my point remains that 'define in step x what to do in the next step' is the cause of most misunderstandings. For example: HTTP message just finished parsing -> check http_access what to do with it. HTTP reply message just arrived -> check http_reply_access what to do with it. Thus my proposal was along the lines of: client hello received -> check tls_client_hello what to do with it. server hello received -> check tls_server_hello what to do with it.
For both hello messages: is the decision moment the moment where it has been peeked at? Below is a new proposal that attempts to make the configuration more intuitive and less prone to admin misunderstandings. First the admin must define if there is any bumping at all. This could be done with https_decryption on|off This is similar to tls_new_connection peek|splice but much more intuitive. Iff https_decryption is on: 1) the "connection" step: When a browser uses "CONNECT <FQDN>" Squid does not need to make peek or splice decisions. When Squid intercepts a connection to "port 443 of <some IP>", no peek or splice decision is made here any more. This step becomes obsolete in the proposed configuration. I am still hoping that will happen. But also still getting pushback that people want to terminate or splice without even looking at the clear-text hello details. We must know the reasons behind this pushback. Only then can sane decisions be made. 2) the "TLS client hello" step: When a browser uses CONNECT, Squid has a FQDN and does not need to peek at a TLS client hello message. It can use the tls_client_hello directives given below. Sadly this is not correct. Squid still needs to get the client hello details at this point. They are needed to perform bump before the server hello is received, and to "terminate with an error message" without contacting a server. Yes, correct. Squid must do this. But does it have to be configured? When Squid intercepts a connection, Squid always peeks to retrieve the SNI, which is the equivalent of the FQDN used by a CONNECT. In this step admins may want to define what Squid must do, e.g. tls_client_hello passthrough aclfoo Note that the acl 'aclfoo' can use tls::client_servername, and tls::client_servername should always have a FQDN if the connection is https. tls::client_servername expands to the IP address if the SNI of an intercepted connection could not be retrieved. What if the SNI contradicts the CONNECT message FQDN?
What if a raw IP in the CONNECT message (or TCP SYN) does not belong to the server named in the SNI? :-) I left this out on purpose to not make the post even larger than it was. There is of course a lot of error checking. The question is if we have to configure it. If yes, can we get away with one directive based on an acl that uses tls::handshake_failure? Squid would now be diverting the client transparently to a server other than the one it expects, and caching under that FQDN. But the server cert would still authenticate as being the SNI host, so TLS cannot detect the diversion. The fake CONNECTs are a bit messy, but IMHO we can only get rid of the first one done for intercepted connections. Although that alone would make both cases handle the same way. I do not know anything about the code that generates the fake CONNECT of a transparently intercepted connection, but logically there should not be a fake CONNECT for true
Re: [squid-dev] [RFC] simplifying ssl_bump complexity
On 11/20/2016 12:06 AM, Amos Jeffries wrote: On 20/11/2016 12:08 p.m., Marcus Kool wrote: [snip] I like the intent of the proposal and the new tls_* directives. What currently makes configuration in Squid 3/4 difficult is the logic of 'define in step x what to do in the next step', and IMO this logic is the main cause of misunderstandings and incorrect configurations. Also the terms 'bump' and 'splice' do not help ease of understanding. Since Squid evolved and bumping changed from 3.3 - 3.4 - 3.5 to 4.x, and likely will change again in 5.x, there is an opportunity to improve things more than is proposed. There is also a difference in dealing with transparently intercepted connections and direct connections (browsers doing a CONNECT) which also causes some misunderstandings. The current ssl_bump steps allow problematic configs where Squid bumps or stares in one step and splices in another step, which can be resolved (made impossible) in a new configuration syntax. I propose to use a new logic for the configuration directives where 'define in step x what to do in the next step' is replaced with a new logic: 'define in step x what to do _now_'. From reading the below I think you are mistaking what "now" means to Squid. Input access control directives in squid.conf make a decision about what action to take based on some state that just arrived. Maybe it is necessary to redefine 'now', but my point remains that 'define in step x what to do in the next step' is the cause of most misunderstandings. For example: HTTP message just finished parsing -> check http_access what to do with it. HTTP reply message just arrived -> check http_reply_access what to do with it. Thus my proposal was along the lines of: client hello received -> check tls_client_hello what to do with it. server hello received -> check tls_server_hello what to do with it. For both hello messages: is the decision moment the moment where it has been peeked at?
Below is a new proposal that attempts to make the configuration more intuitive and less prone to admin misunderstandings. First the admin must define if there is any bumping at all. This could be done with https_decryption on|off This is similar to tls_new_connection peek|splice but much more intuitive. Iff https_decryption is on: 1) the "connection" step: When a browser uses "CONNECT <FQDN>" Squid does not need to make peek or splice decisions. When Squid intercepts a connection to "port 443 of <some IP>", no peek or splice decision is made here any more. This step becomes obsolete in the proposed configuration. I am still hoping that will happen. But also still getting pushback that people want to terminate or splice without even looking at the clear-text hello details. We must know the reasons behind this pushback. Only then can sane decisions be made. 2) the "TLS client hello" step: When a browser uses CONNECT, Squid has a FQDN and does not need to peek at a TLS client hello message. It can use the tls_client_hello directives given below. Sadly this is not correct. Squid still needs to get the client hello details at this point. They are needed to perform bump before the server hello is received, and to "terminate with an error message" without contacting a server. Yes, correct. Squid must do this. But does it have to be configured? When Squid intercepts a connection, Squid always peeks to retrieve the SNI, which is the equivalent of the FQDN used by a CONNECT. In this step admins may want to define what Squid must do, e.g. tls_client_hello passthrough aclfoo Note that the acl 'aclfoo' can use tls::client_servername, and tls::client_servername should always have a FQDN if the connection is https. tls::client_servername expands to the IP address if the SNI of an intercepted connection could not be retrieved. What if the SNI contradicts the CONNECT message FQDN? What if a raw IP in the CONNECT message (or TCP SYN) does not belong to the server named in the SNI?
:-) I left this out on purpose to not make the post even larger than it was. There is of course a lot of error checking. The question is if we have to configure it. If yes, can we get away with one directive based on an acl that uses tls::handshake_failure? Squid would now be diverting the client transparently to a server other than the one it expects, and caching under that FQDN. But the server cert would still authenticate as being the SNI host, so TLS cannot detect the diversion. The fake CONNECTs are a bit messy, but IMHO we can only get rid of the first one done for intercepted connections. Although that alone would make both cases handle the same way. I do not know anything about the code that generates the fake CONNECT of a transparently intercepted connection, but logically there should not be a fake CONNECT for true HTTPS (TLS+HTTP) since a browser does not do a CONNECT, so why fake one? Was the fake CONNECT introduced
Re: [squid-dev] [RFC] simplifying ssl_bump complexity
On 11/19/2016 08:07 AM, Amos Jeffries wrote: Since the ssl_bump directive went in, my original opinion of it as being too complicated and confusing has pretty much been demonstrated as correct by the vast number of misconfigurations and failed attempts of people to use it without direct assistance from those of us involved with its design. Since we are also transitioning to a world where 'SSL' does not exist any longer, I think v5 is a good time to rename and redesign the directive a bit. I propose going back to the older config style where each step has its own directive name which self-documents what it does. That will reduce the confusion about what is going on at each 'step', and allow us a chance to have clearly documented default actions for each step. For example:

tls_new_connection - default: peek all - or run ssl_bump check if that directive exists
tls_client_hello - default: splice all - or run ssl_bump check if that directive exists
tls_server_hello - default: terminate all - or run ssl_bump check if that directive exists

I like the intent of the proposal and the new tls_* directives. What currently makes configuration in Squid 3/4 difficult is the logic of 'define in step x what to do in the next step', and IMO this logic is the main cause of misunderstandings and incorrect configurations. Also the terms 'bump' and 'splice' do not help ease of understanding. Since Squid evolved and bumping changed from 3.3 - 3.4 - 3.5 to 4.x, and likely will change again in 5.x, there is an opportunity to improve things more than is proposed. There is also a difference in dealing with transparently intercepted connections and direct connections (browsers doing a CONNECT) which also causes some misunderstandings. The current ssl_bump steps allow problematic configs where Squid bumps or stares in one step and splices in another step, which can be resolved (made impossible) in a new configuration syntax.
I propose to use a new logic for the configuration directives where 'define in step x what to do in the next step' is replaced with a new logic: 'define in step x what to do _now_'. Below is a new proposal that attempts to make the configuration more intuitive and less prone to admin misunderstandings. First the admin must define if there is any bumping at all. This could be done with https_decryption on|off This is similar to tls_new_connection peek|splice but much more intuitive. Iff https_decryption is on: 1) the "connection" step: When a browser uses "CONNECT <FQDN>" Squid does not need to make peek or splice decisions. When Squid intercepts a connection to "port 443 of <some IP>", no peek or splice decision is made here any more. This step becomes obsolete in the proposed configuration. 2) the "TLS client hello" step: When a browser uses CONNECT, Squid has a FQDN and does not need to peek at a TLS client hello message. It can use the tls_client_hello directives given below. When Squid intercepts a connection, Squid always peeks to retrieve the SNI, which is the equivalent of the FQDN used by a CONNECT. In this step admins may want to define what Squid must do, e.g. tls_client_hello passthrough aclfoo Note that the acl 'aclfoo' can use tls::client_servername, and tls::client_servername should always have a FQDN if the connection is https. tls::client_servername expands to the IP address if the SNI of an intercepted connection could not be retrieved. For https connections with a client hello without the SNI extension: tls_client_hello passthrough|terminate aclbar where aclbar can contain tls::client_hello_missing_sni For connections that do not use TLS (i.e. no valid TLS client hello message was seen): tls_client_hello passthrough|terminate aclbar2 where aclbar2 may contain tls::handshake_failure To define that the TLS handshake continues, the config can contain tls_client_hello continue This is basically a no-op and not required, but enhances readability of a configuration.
3) the "TLS server hello" step: Usually no directives are needed, since actions are rarely taken based on the server hello message, so the default is tls_server_hello continue The tls_server_hello directive can be used to terminate specific connections. In this step many types of certificate errors can be detected, and in the Squid configuration there must be a way to define what to do for specific errors and optionally for which FQDN. E.g. allow the admin to define that connections with self-signed certificates are terminated, but the self-signed cert for domain foo.example.com is allowed. See also the example config below and the use of tls::server_servername. What is left is a configuration directive for connections that use TLS as an encryption wrapper but do not use HTTP inside the TLS wrapper: tls_no_http passthrough|terminate # similar to on_unsupported_protocol An example configuration looks like this: https_decryption on acl banks tls::client_servername .bank1.example.org acl no_sni tls::client_hello_missing_sni acl no_handshake
Re: [squid-dev] [PATCH] Support tunneling of bumped non-HTTP traffic. Other SslBump fixes.
I started testing this patch and observed one unwanted side effect: when a client connects to mtalk.google.com, Squid sends the following line to the URL rewriter: (unknown)://173.194.76.188:443 / - NONE Marcus Quoting Christos Tsantilas: Use case: Skype groups appear to use TLS-encrypted MSNP protocol instead of HTTPS. This change allows Squid admins using SslBump to tunnel Skype groups and similar non-HTTP traffic bytes via "on_unsupported_protocol tunnel all". Previously, the combination resulted in encrypted HTTP 400 (Bad Request) messages sent to the client (which does not speak HTTP). This patch also:
* fixes bug 4529: !EBIT_TEST(entry->flags, ENTRY_FWD_HDR_WAIT) assertion in FwdState.cc;
* avoids access-logging an extra record when splicing transparent connections during SslBump step 1, and logs %ssl::bump_mode as the expected "splice", not "none";
* handles an XXX comment inside clientTunnelOnError for a possible memory leak of client-streams-related objects;
* fixes TunnelStateData logging in the case of splicing after peek.
This is a Measurement Factory project.
Re: [squid-dev] Benchmarking Performance with reuseport
This article better explains the benefits of SO_REUSEPORT: https://lwn.net/Articles/542629/ A key paragraph is this: The problem with this technique, as Tom pointed out, is that when multiple threads are waiting in the accept() call, wake-ups are not fair, so that, under high load, incoming connections may be distributed across threads in a very unbalanced fashion. At Google, they have seen a factor-of-three difference between the thread accepting the most connections and the thread accepting the fewest connections; that sort of imbalance can lead to underutilization of CPU cores. By contrast, the SO_REUSEPORT implementation distributes connections evenly across all of the threads (or processes) that are blocked in accept() on the same port. So using SO_REUSEPORT seems very beneficial for SMP-based Squid. Marcus On 08/09/2016 09:19 PM, Henrik Nordström wrote: tor 2016-08-04 klockan 23:12 +1200 skrev Amos Jeffries: I imagine that Nginx is seeing latency reduction due to no longer needing a central worker that receives the connection and then spawns a whole new process to handle it. The behaviour sort of makes sense for a web server (which Nginx is at heart still, a copy of Apache) spawning CGI processes to handle each request. But kind of daft in these HTTP/1.1 multiplexed performance-centric days. No, it's only about accepting new connections on existing workers. Many high-load sites still run with non-persistent connections to keep the worker count down, and these benefit a lot from this change. Sites using persistent connections only benefit marginally. But the larger the worker count, the higher the benefit, as the load from new connections gets distributed by the kernel instead of by a thundering herd of workers. Regards Henrik
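At the socket level, what SO_REUSEPORT does is simple: when the option is set before bind(), several sockets - typically one per worker process - can bind the same address and port, and the kernel distributes incoming connections among them. A generic sketch (unrelated to Squid's own listener code; requires Linux 3.9 or later):

```python
import socket

def make_listener(host, port):
    # SO_REUSEPORT must be set before bind(); each worker process
    # would create its own listening socket exactly like this one.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind((host, port))
    s.listen(128)
    return s

# The first listener picks an ephemeral port...
a = make_listener("127.0.0.1", 0)
port = a.getsockname()[1]
# ...and a second listener can bind the very same port, which
# would fail with EADDRINUSE without SO_REUSEPORT on both sockets.
b = make_listener("127.0.0.1", port)
print("both listeners bound to port", port)
a.close()
b.close()
```

With SMP Squid, each worker binding its own SO_REUSEPORT socket would replace the pattern where all workers block in accept() on one shared socket.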
[squid-dev] Benchmarking Performance with reuseport
https://www.nginx.com/blog/socket-sharding-nginx-release-1-9-1/ is an interesting short article about using the SO_REUSEPORT socket option, which increased the performance of nginx and gave better balancing of connections across the sockets of workers. Since Squid has the issue that load is not very well balanced between workers, I thought it would be interesting to look at. Marcus
Re: [squid-dev] HTTP meetup in Stockholm
On 07/12/2016 06:53 AM, Henrik Nordström wrote: tis 2016-07-12 klockan 18:34 +1200 skrev Amos Jeffries: I'm much more in favour of binary formats. The HTTP/2 HPACK design lends itself very easily to binary header values (i.e. sending integers as integer-encoded values). Following PHK's lead on those. json is very ambiguous, with no defined schema or type restrictions. It's up to the receiver to guess type information from the format while parsing, which in itself is a mess from a security point of view. The beauty of json is that it is trivially extensible with new data, and has all the basic data constructs you need for arbitrary data (name tags, strings, integers, floats, booleans, arrays, dictionaries and maybe something more). But for the same reason it's also unsuitable for HTTP header information, which should be concise, terse and unambiguous, with little room for syntax errors. Regards Henrik Extensible json headers seem to lend themselves to putting a lot of application-specific stuff in headers instead of in the payload. The headers should be used for the protocol only. Squid has had many issues in the past with non-conformity to standards. The Squid developers obviously want to stick with the standards but are forced by non-conformant apps and servers to support non-conformity. Can this workshop be used to address this? Marcus
Re: [squid-dev] [RFC] on_crash
On 12/09/2015 09:20 PM, Alex Rousskov wrote: On 12/09/2015 02:28 PM, Amos Jeffries wrote: The above considerations are all good reasons for us not to be bundling by default IMO. I agree. Alex. I did not get what the script does; does it call gdb? A script/executable that calls gdb and produces a readable stack trace of all squid processes is a powerful tool which makes debugging an issue much easier for many admins. So I suggest releasing the binaries and scripts that you have, installing them by default in a new subdirectory, e.g. .../debugbin or .../sbin/debug, and _not_ configuring them in the default squid.conf, to prevent them being used accidentally. If you do not want to bundle, then what is the alternative? Make a download area on squid-cache.org for the binaries and scripts? Marcus
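For reference, the kind of script discussed above can be very small. This is a hypothetical sketch (not the script Alex mentioned): it only prints the gdb command it would run for each running squid process, so nothing touches a live proxy; remove the echo to actually collect the backtraces.

```shell
# Dump a backtrace of every thread of every squid process (dry run).
for pid in $(pidof squid 2>/dev/null); do
    echo gdb -p "$pid" -batch -ex 'thread apply all bt'
done
echo "scan complete"
```

Running gdb in batch mode this way attaches, prints all thread stacks, and detaches, which is usually safe enough for a hung but still-serving process.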
Re: [squid-dev] Fake CONNECT requests during SSL Bump
On 09/24/2015 02:13 AM, Eliezer Croitoru wrote: On 23/09/2015 04:52, Amos Jeffries wrote: Exactly. They are processing steps. Not messages to be adapted. Amos +1 for that. [...] In any case the bottom line from me is that for now ICAP and eCAP are called ADAPTATION services and not ACL services. They can be extended to do so; it's not a part of the RFCs or definitions, and it might be the right way to do things, but it will require libraries simple enough to let most admins (if not all) implement their ACL logic using these protocols/implementations. Eliezer ICAP is an adaptation protocol that almost everybody uses for access control. The ICAP server must be able to see all traffic going through Squid so that it can do what it was designed for and block (parts of) websites and other data streams. Other data streams may not be HTTP(S)-based and hence are not bumped, but for the ICAP server to be able to do its thing, it still needs a (fake) CONNECT. Going back to Steve's original message, I think that it is not necessary to generate a (fake) CONNECT for each bump step, but to send exactly one CONNECT at the moment that Squid makes a decision, i.e. when Squid decides to bump or splice. Marcus
[squid-dev] download squid 3.5.8 fails
The download of the 3.5.8 sources fails :-( wget -vvv http://www.squid-cache.org/Versions/v3/3.5/squid-3.5.8.tar.gz --2015-09-02 17:16:43-- http://www.squid-cache.org/Versions/v3/3.5/squid-3.5.8.tar.gz Resolving www.squid-cache.org (www.squid-cache.org)... 92.223.231.190, 209.169.10.131 Connecting to www.squid-cache.org (www.squid-cache.org)|92.223.231.190|:80... connected. HTTP request sent, awaiting response... 404 Not Found 2015-09-02 17:16:43 ERROR 404: Not Found. Best regards, Marcus ___ squid-dev mailing list squid-dev@lists.squid-cache.org http://lists.squid-cache.org/listinfo/squid-dev
Re: [squid-dev] download squid 3.5.8 fails
The normal download URL now works too. Thanks Marcus On 09/02/2015 04:14 PM, Amos Jeffries wrote: On 3/09/2015 3:23 a.m., Marcus Kool wrote: The download of the 3.5.8 sources fails :-( wget -vvv http://www.squid-cache.org/Versions/v3/3.5/squid-3.5.8.tar.gz --2015-09-02 17:16:43-- http://www.squid-cache.org/Versions/v3/3.5/squid-3.5.8.tar.gz Resolving www.squid-cache.org (www.squid-cache.org)... 92.223.231.190, 209.169.10.131 Connecting to www.squid-cache.org (www.squid-cache.org)|92.223.231.190|:80... connected. HTTP request sent, awaiting response... 404 Not Found 2015-09-02 17:16:43 ERROR 404: Not Found. Yeah. I'm having trouble with one of the mirrors too. Try west.squid-cache.org as the domain. That one I know works. Amos
Re: [squid-dev] bug 4303
On 08/18/2015 12:36 PM, Amos Jeffries wrote: On 19/08/2015 12:56 a.m., Marcus Kool wrote: Amos, Christos, Christos' patch seems not to work for plain 3.5.7 sources. What do you suggest to try? Will there be a snapshot release that is suitable for testing? Christos now has it in trunk, but the last snapshot refused to build due to a compiler issue in the build farm which is now resolved. Tomorrow's trunk snapshot should be r14229 or later with it in. The next round of backports to 3.5 should include it there in 2-3 days as well, unless something goes wrong in the portage. Thanks, I will wait for the 3.5 backport. Will the patch be announced on the list? Marcus Amos
Re: [squid-dev] bug 4303
Amos, I tried the patch but several hunks failed. It seems that the patch is not compatible with the 3.5.7 release code, or I am doing something wrong (see below). Marcus [root@srv018 squid-3.5.7]# patch -b -p0 --dry-run < ../squid-sslbump-patch checking file src/acl/Acl.h Hunk #1 succeeded at 150 (offset 1 line). checking file src/acl/BoolOps.cc checking file src/acl/BoolOps.h Hunk #1 FAILED at 45. 1 out of 1 hunk FAILED checking file src/acl/Checklist.cc checking file src/acl/Checklist.h checking file src/acl/Tree.cc Hunk #2 FAILED at 69. 1 out of 2 hunks FAILED checking file src/acl/Tree.h Hunk #1 FAILED at 23. 1 out of 1 hunk FAILED checking file src/client_side.cc Hunk #1 FAILED at 4181. Hunk #2 FAILED at 4247. 2 out of 2 hunks FAILED checking file src/ssl/PeerConnector.cc Hunk #1 FAILED at 214. 1 out of 1 hunk FAILED On 08/12/2015 10:25 AM, Amos Jeffries wrote: On 13/08/2015 12:48 a.m., Marcus Kool wrote: yesterday I filed bug 4303 - assertion failed in PeerConnector:743 squid 3.5.7 I am not sure if it is a duplicate of bug 4259 since that bug description has almost no info to compare against. I enclosed a small fragment of cache.log in the bug report but the debug setting was ALL,1 93,3 61,9 so cache.log is very large. In case you need a larger fragment of cache.log, I can provide it. Thanks Marcus. I was about to reply to the bug report, but this is better. I suspect this is a case of Squid going the wrong way in ssl_bump interpretation. Specifically the peek action at stage 3. Would you be able to try Christos' patch at the end of the mail here: http://lists.squid-cache.org/pipermail/squid-dev/2015-August/002981.html Amos
[squid-dev] bug 4303
Yesterday I filed bug 4303 - assertion failed in PeerConnector:743 squid 3.5.7 I am not sure if it is a duplicate of bug 4259 since that bug description has almost no info to compare against. I enclosed a small fragment of cache.log in the bug report but the debug setting was ALL,1 93,3 61,9 so cache.log is very large. In case you need a larger fragment of cache.log, I can provide it. Best regards, Marcus
Re: [squid-dev] [PATCH] Temporary fix to restore compatibility with Amazon
On 06/24/2015 05:24 PM, Kinkie wrote: My 2c: I vote for reality; possibly with a shaming announce message; I wouldn't even recommend logging the violation: there is nothing the average admin can do about it. I would consider adding a shaming comment in the release notes. more cents: correct. A standard can be considered a strong guideline, but if important sites violate the standard (i.e. users/admins complain) then Squid should be able to cope with it, or it risks being abandoned because Squid cannot cope with traffic of sites that otherwise work without Squid. For an admin it is irrelevant whether the problem is caused by Squid or by a website. And the admin who dares to tell its users "only visit sites that comply with the standards" probably gets fired. On Wed, Jun 24, 2015 at 10:12 PM, Alex Rousskov rouss...@measurement-factory.com wrote: On 06/24/2015 05:26 AM, Amos Jeffries wrote: On 24/06/2015 5:55 p.m., Alex Rousskov wrote: This temporary trunk fix adds support for request URIs containing '|' characters. Such URIs are used by popular Amazon product (and probably other) sites: /images/I/ID1._RC|ID2.js,ID3.js,ID4.js_.js Without this fix, all requests for affected URIs time out while Squid waits for the end of request headers it has already received(*). This is not right. Squid should be identifying the message as non-HTTP/1.x (which it isn't, due to the URI syntax violation) and treating it as such. I agree that Amazon violates URI syntax. On the other hand, the message can be interpreted as HTTP/1.x for all practical purposes AFAICT. If you want to implement a different fix, please do so. Meanwhile, folks suffering from this serious regression can try the temporary fix I posted. The proper long-term fix is to allow any character in a URI as long as we can reliably parse the request line (and, later, URI components). 
There is no point in hurting users by rejecting requests while slowly accumulating the list of benign characters used by web sites but prohibited by some RFC. The *proper* long-term fix is to obey the standards in regard to message syntax, so applications stop using these invalid (when un-encoded) characters and claiming HTTP/1.1 support. We have had "standards vs reality" and "policing traffic" discussions several times in the past, with no signs of convergence towards a single approach, so I am not going to revisit that discussion now. We continue to disagree [while Squid users continue to suffer]. Thank you, Alex.
Re: [squid-dev] Death of SSLv3
On 05/07/2015 07:03 AM, Amos Jeffries wrote: It's done. SSLv3 is now a MUST NOT use protocol per RFC 7525 (http://tools.ietf.org/html/rfc7525) good decision. It's time for us to start ripping out from trunk all features and hacks supporting its use. Over the coming days I will be submitting patches to remove the squid.conf settings, similar to the SSLv2 removal earlier. The exceptions which may remain are SSLv3 features which are used by the still-supported TLS versions. Such as session resume, and the SSLv3 format of the Hello message (though not the SSLv3 protocol IDs). are you sure you want to do this _now_? It is predictable that users will complain with "I know this provider is stupid and uses SSLv3 but I _need_ to access that site for our business" and use this as a reason not to upgrade, or blame squid. It may not be that much extra work to have a new option use_sslv3 with the default set to OFF and not rip the SSLv3 code yet. Also, if you do not rip out SSLv3, Squid can detect that a site uses SSLv3 and give a useful error message like "this site insists on using the unsafe SSLv3 protocol" instead of a confusing "unknown protocol". Marcus Christos, if you can keep this in mind for all current / pending, and future SSL work. Amos
Re: [squid-dev] [PATCH] Non-HTTP bypass
On 12/31/2014 02:31 PM, Alex Rousskov wrote: On 12/31/2014 03:33 AM, Marcus Kool wrote: On 12/31/2014 05:54 AM, Alex Rousskov wrote: What would help is to decide whether we want to focus on A) multiple conditions for establishing a TCP tunnel; B) multiple ways to handle an unrecognized protocol error; OR C) multiple ways to handle multiple errors. IMO, we want (B) or perhaps (C) while leaving (A) as a separate out-of-scope feature. The proposed patch implements (B). To implement (C), the patch needs to add an ACL type to distinguish an unrecognized protocol error from other errors. From an administrator's point of view, the admins that want Squid to filter internet access definitely want (B). They want (B) to block audio, video, SSH tunnels, VPNs, chat, file sharing, webdisks and all sorts of applications (but not all!) that use port 443. Agreed, except this is not limited to port 443. The scope includes intercepted port 80 connections and even CONNECT tunnels. If CONNECT tunnels are in scope, then so are all the applications that use it, including webdisk, audio, video, SSH etc. I think it was Amos who said that application builders should use application-specific ports, but the reality is that all firewalls block those ports by default. Skype was one of the first applications that worked everywhere, even behind a corporate firewall, and it was done using CONNECT to the web proxy. And from a security point of view I think that administrators prefer that applications use CONNECT to the web proxy, to have more control and logging about what traffic is going from a LAN to the internet. Basically this means that admins desire more fine-grained control over what to do with each tunnel. There are two different needs here, actually: 1. A choice of actions (i.e., what to do) when dealing with an unsupported protocol. Currently, there is only one action: Send an HTTP error response. 
The proposed feature adds another action (tunnel) and, more importantly, adds a configuration interface to support more actions later. Sending an HTTP error to an application that does not speak HTTP is not very useful. Skype, SSH, video players etc. only get confused at best. Simply closing the tunnel may be better and may result in an end-user message 'cannot connect to ...' instead of 'server sends garbage' or 'undefined protocol'. Marcus 2. A way to further classify an unsupported protocol (i.e., fine-grained control). I started a new thread on this topic as it is not about the proposed bypass feature. Cheers, Alex.
Re: [squid-dev] [PATCH] Non-HTTP bypass
On 12/31/2014 05:54 AM, Alex Rousskov wrote: [...] What would help is to decide whether we want to focus on A) multiple conditions for establishing a TCP tunnel; B) multiple ways to handle an unrecognized protocol error; OR C) multiple ways to handle multiple errors. IMO, we want (B) or perhaps (C) while leaving (A) as a separate out-of-scope feature. The proposed patch implements (B). To implement (C), the patch needs to add an ACL type to distinguish an unrecognized protocol error from other errors. From an administrator's point of view, the admins that want Squid to filter internet access definitely want (B). They want (B) to block audio, video, SSH tunnels, VPNs, chat, file sharing, webdisks and all sorts of applications (but not all!) that use port 443. Basically this means that admins desire more fine-grained control over what to do with each tunnel. The current functionality of filtering is divided between Squid itself and 3rd party software (ICAP daemons and URL redirectors). I plead for an interface where an external helper can decide what to do with an unknown protocol inside a tunnel, because it is much more flexible than using ACLs and extending Squid with detection of (many) protocols. A while back, when we discussed the older sslBump not being able to cope with Skype, I suggested using ICAP so that the ICAP daemon receives a REQMOD/RESPMOD message with CONNECT and intercepted content, which also is a valid option for me. I wish to all a Blissful and Happy New Year! Marcus [...] Thank you, Alex.
Re: [squid-dev] unsupported protocol classification
On 12/31/2014 02:23 PM, Alex Rousskov wrote: [ I am changing the Subject line for this sub-thread because this new discussion is not really relevant to the unsupported protocol bypass feature, even though that bypass feature will be used by those who need to classify unsupported protocols. ] On 12/31/2014 03:33 AM, Marcus Kool wrote: The current functionality of filtering is divided between Squid itself and 3rd party software (ICAP daemons and URL redirectors). ... as well as external ACLs and eCAP adapters. I plead for an interface where an external helper can decide what to do with an unknown protocol inside a tunnel, because it is much more flexible than using ACLs and extending Squid with detection of (many) protocols. I doubt pleading will be enough, unfortunately, because a considerable amount of coding and design expertise is required to fulfill your dream. IMO, a quality implementation would involve: It is clear to me that this functionality will not be implemented next week, but for me it is not a dream. It is a reality that filtering is becoming more important; just wait until a headline in the news comes along like "secret document stolen via a web tunnel" and everybody wants it. The risk is real, and it is so simple to abuse CONNECT on port 443 for anything that it is extremely likely that it is already being used for illegal actions and will continue to be used for illegal actions. There is also not much point in having a web proxy that can filter 50% or 99% of what you want to filter. If you cannot filter everything, and especially cannot filter known security risks, the filter solution is very weak. That is why ufdbGuard currently sends probes to sites that an application CONNECTs to. The probes tell ufdbGuard what type of traffic is to be expected, but are also not 100% reliable since a probe is not the same as an inspection of the real traffic. 1. 
Encoding the tunnel information (including traffic) in [small] HTTP-like messages to be passed to ICAP/eCAP services. It is important to get this API design right while anticipating complications like servers that speak first, agents that do not send Hellos until they hear the other agent's Hello, and fragmented Hellos. Most likely, the design will involve two tightly linked but concurrent streams of adaptation messages: user-Squid-origin and origin-Squid-user. Let's call that TUNMOD, as opposed to the existing REQMOD and RESPMOD. Getting the design right is definitely important. Therefore I like to bring this issue up once in a while so that, with the design decisions made today for related parts, it will be easier to implement TUNMOD in the future. 2. Writing adaptation hooks to pass tunnel information (using the TUNMOD design above) to adaptation services. The primary difficulty here is handling incremental "give me more" and "give them more" decisions while shoveling tunneled bytes. The current tunneling code does not do any adaptation at all, so the developers would be starting from scratch (albeit with good examples available from non-tunneling code dealing with HTTP/FTP requests and HTTP/FTP responses). It can be simpler. TUNMOD replies can be limited to:
DONTKNOW - continue with what is happening and keep the TUNMOD server informed
ACCEPT - continue and do not inform the TUNMOD server any more about this tunnel
BLOCK - close the tunnel
I think there is no need for adaptation, since one accepts a webdisk, voice chat, VPN or whatever, or one does not accept it. So adaptation, as used for HTTP, is not an important feature. Sending an HTTP error on a tunnel is only useful if the tunnel uses SSL-encapsulated HTTP. 3. Implementing more actions than the already implemented "start a blind tunnel" and "respond with an error". The "shovel this to the other side and then come back to me with the newly received bytes" action would be essential in many production cases, for example. 
The above is a large project. I do not recall any projects of that size and complexity implemented without sponsors in recent years, but YMMV. We will see. Maybe there will be a sponsor to do this. It is 15:38 local time and my last post of the year. Happy New Year to all. Marcus Please note that modern Squid already has an API that lets 3rd party software pick one of the supported actions. It is called annotations: External software sends Squid an annotation and the admin configures Squid to do X when annotation Y is received in context Z. A while back, when we discussed the older sslBump not being able to cope with Skype, I suggested using ICAP so that the ICAP daemon receives a REQMOD/RESPMOD message with CONNECT and intercepted content, which also is a valid option for me. Yes, ICAP/eCAP is the right direction here IMO, but there are several challenges on that road. I tried to detail them above. HTH, Alex.
Re: Possible memory leak.
Eliezer, It is important to know what implementation of malloc is used. So it is important to know which OS/distro is used and which version of glibc/malloc. malloc on 64bit CentOS 6.x uses memory-mapped memory for allocations of 128 KB or larger and uses multiple (can't find how many) 64MB segments, and many more when threads are used. I also suggest collecting total memory size _and_ resident memory size. The resident memory size is usually significantly smaller than the total memory size, which can be explained by the 64MB segments that are only used for a low percentage. If you use CentOS, I recommend export MALLOC_ARENA_MAX=1   # should work well and/or export MMAP_THRESHOLD=4100100100   # no experience whether this works and run the test again. Marcus On 07/20/2014 12:27 PM, Eliezer Croitoru wrote: I want to verify the issue I have seen: Now the server is on about 286 MB of resident memory. The issue is that the server memory usage was more than 800MB, with these things in mind: 1 - The whole web server is 600 MB 2 - 150MB is the maximum object size in memory (there is no disk cache) 3 - the cache memory of the server is the default of 256MB. I cannot think about an option that will lead this server to consume more than 400MB, even if one 10-byte file is being fetched with a query term every time with a different parameter. If the sum of all the requests to the proxy is 30k, I do not see how it would still lead to 900MB of RAM used by squid. If I am mistaken (which could very simply be the case) then I want to understand what to look for in the mgr interface to see if there is a reasonable usage of memory or not. (I know it's a lot to ask but still) Thanks, Eliezer On 07/10/2014 09:10 PM, Eliezer Croitoru wrote: OK so I started this reverse proxy for a bandwidth testing site and it seems odd that it is using more than 400MB when the only difference in the config is maximum_object_size_in_memory to 150MB and StoreID SNIP Eliezer
Squid 3.4.5 warning about MGR_INDEX
Using Squid 3.4.5 I observed in cache.log the following warning: 2014/06/08 09:50:42.804 kid1| disk.cc(92) file_open: file_open: error opening file /local/squid34/share/errors/templates/MGR_INDEX: (2) No such file or directory 2014/06/08 09:50:42.805 kid1| errorpage.cc(307) loadDefault: WARNING: failed to find or read error text file MGR_INDEX For other error template files all is well. Marcus
Re: issue with ICAP message for redirecting HTTPS/CONNECT
Thanks Nathan, that helped. Sometimes it is frustrating to just not see the small error... Marcus On 06/08/2014 01:02 PM, Nathan Hoad wrote: Hi Marcus, There's a bug in your ICAP server with how it's handling the Encapsulated header that it sends back to Squid. This is what your server sent back to Squid for a REQMOD request: Encapsulated: res-hdr=0, null-body=1930d X-Next-Services: 0d 0d CONNECT blockedhttps.urlfilterdb.com:443 HTTP/1.00d -- NOTE: also fails: CONNECT https://blockedhttps.urlfilterdb.com HTTP/1.00d snipped for brevity The Encapsulated header says that the HTTP object that has been sent back contains HTTP response headers, and no body. This leads Squid to believe it should be parsing an HTTP response, which expects the first token of the first line to begin with "HTTP/", which is failing because the server has actually sent back an HTTP request. This explains the error in the logs, and why it's working for your GET and POST responses, which do indeed contain HTTP response objects. So for this particular example, the correct Encapsulated header value would be 'req-hdr=0, null-body=193'. I hope that helps, Nathan. -- Nathan Hoad Software Developer www.getoffmalawn.com On 9 June 2014 00:22, Marcus Kool marcus.k...@urlfilterdb.com wrote: I ran into an issue with the ICAP interface. The issue is that a GET/HTTP-based URL can be successfully rewritten but a CONNECT/HTTPS-based URL cannot. I used debug_options ALL,9 to find out what is going wrong but I fail to understand Squid. 
GET/HTTP to http://googleads.g.doubleclick.net works: Squid writes: REQMOD icap://127.0.0.1:1344/reqmod_icapd_squid34 ICAP/1.00d Host: 127.0.0.1:13440d Date: Sun, 08 Jun 2014 13:54:09 GMT0d Encapsulated: req-hdr=0, null-body=1350d Preview: 00d Allow: 2040d X-Client-IP: 127.0.0.10d 0d GET http://googleads.g.doubleclick.net/ HTTP/1.00d User-Agent: Wget/1.12 (linux-gnu)0d Accept: */*0d Host: googleads.g.doubleclick.net0d 0d ICAP daemon responds: ICAP/1.0 200 OK0d Server: ufdbICAPd/1.00d Date: Sun, 08 Jun 2014 13:54:09 GMT0d ISTag: 5394572c-45670d Connection: keep-alive0d Encapsulated: res-hdr=0, null-body=2330d X-Next-Services: 0d 0d HTTP/1.0 200 OK0d Date: Sun, 08 Jun 2014 13:54:09 GMT0d Server: ufdbICAPd/1.00d Last-Modified: Sun, 08 Jun 2014 13:54:09 GMT0d ETag: 498a-0001-5394572c-45670d Cache-Control: max-age=100d Content-Length: 00d Content-Type: text/html0d 0d 00d 0d CONNECT/HTTPS does not work: Squid writes: REQMOD icap://127.0.0.1:1344/reqmod_icapd_squid34 ICAP/1.00d Host: 127.0.0.1:13440d Date: Sun, 08 Jun 2014 12:29:32 GMT0d Encapsulated: req-hdr=0, null-body=870d Preview: 00d Allow: 2040d X-Client-IP: 127.0.0.10d 0d CONNECT googleads.g.doubleclick.net:443 HTTP/1.00d User-Agent: Wget/1.12 (linux-gnu)0d 0d ICAP daemon responds: ICAP/1.0 200 OK0d Server: ufdbICAPd/1.00d Date: Sun, 08 Jun 2014 12:29:32 GMT0d ISTag: 5394572c-45670d Connection: keep-alive0d Encapsulated: res-hdr=0, null-body=1930d X-Next-Services: 0d 0d CONNECT blockedhttps.urlfilterdb.com:443 HTTP/1.00d-- NOTE: also fails: CONNECT https://blockedhttps.urlfilterdb.com HTTP/1.00d Host: blockedhttps.urlfilterdb.com0d User-Agent: Wget/1.12 (linux-gnu)0d X-blocked-URL: googleads.g.doubleclick.net0d X-blocked-category: ads0d 0d 00d 0d and Squid in the end responds to wget: HTTP/1.1 500 Internal Server Error Server: squid/3.4.5 Mime-Version: 1.0 Date: Sun, 08 Jun 2014 13:59:27 GMT Content-Type: text/html Content-Length: 2804 X-Squid-Error: ERR_ICAP_FAILURE 0 Vary: Accept-Language Content-Language: en 
X-Cache: MISS from XXX X-Cache-Lookup: NONE from XXX:3128 Via: 1.1 XXX (squid/3.4.5) Connection: close A fragment of cache.log is below. I think that the line HttpReply.cc(460) sanityCheckStartLine: HttpReply::sanityCheckStartLine: missing protocol prefix (HTTP/) in 'CONNECT blockedhttps.urlfilterdb.com:443 HTTP/1.00d indicates where the problem is. Questions: The ICAP reply has a HTTP/ protocol prefix so does Squid have a problem parsing the reply? What is the issue with the reply of the ICAP daemon? Not directly related, but interesting: why does Squid sends CONNECT googleads.g.doubleclick.net:443 HTTP/1.0 to the ICAP daemon instead of CONNECT https://googleads.g.doubleclick.net HTTP/1.0 Thanks Marcus cache.log: - 2014/06/08 09:29:32.224 kid1| Xaction.cc(413) noteCommRead: read 384 bytes 2014/06/08 09:29:32.224 kid1| Xaction.cc(73) disableRetries: Adaptation::Icap::ModXact from now on cannot be retried [FD 12;rG/RwP(ieof) job9] 2014/06/08 09:29:32.224 kid1| ModXact.cc(646) parseMore: have 384 bytes to parse [FD 12;rG/RwP(ieof) job9] 2014/06/08 09:29:32.224 kid1| ModXact.cc(647) parseMore: ICAP/1.0 200 OK0d Server: ufdbICAPd/1.00d Date: Sun, 08 Jun 2014 12:29:32 GMT0d ISTag: 5394572c-45670d Connection: keep-alive0d Encapsulated: res-hdr=0, null-body=1930d X-Next-Services: 0d 0d CONNECT blockedhttps.urlfilterdb.com:443 HTTP/1.00d Host
Re: issue with ICAP message for redirecting HTTPS/CONNECT
No, no SSL-Bump is used. On 06/08/2014 11:57 AM, Eliezer Croitoru wrote: Are you using SSL-BUMP? Eliezer On 06/08/2014 05:22 PM, Marcus Kool wrote: I ran into an issue with the ICAP interface. The issue is that a GET/HTTP-based URL can be successfully rewritten but a CONNECT/HTTPS-based URL cannot. I used debug_options ALL,9 to find out what is going wrong but I fail to understand Squid.
Re: issue with ICAP message for redirecting HTTPS/CONNECT
On 06/08/2014 04:20 PM, Alex Rousskov wrote: On 06/08/2014 10:02 AM, Nathan Hoad wrote: There's a bug in your ICAP server with how it's handling the Encapsulated header that it sends back to Squid. ... The Encapsulated header says that the HTTP object that has been sent back contains HTTP response headers, and no body. This leads Squid to believe it should be parsing an HTTP response Hello Marcus, In addition to the Encapsulated header wrongly promising an HTTP response, the ICAP response also contains an encapsulated HTTP body chunk (of zero size) when the Encapsulated header promised no body at all. That ICAP server bug is present in both GET and CONNECT adaptation transactions (but the correct behavior would be different in each of those two cases). Thanks for pointing that out. If you are writing yet another ICAP server, please note that free and commercial ICAP servers are available. Are you sure you want to go through the pains of writing yet another broken one? And that you actually need ICAP? For this project I indeed need ICAP. I was not satisfied with the free ICAP servers, and I will make the ICAP server public domain so a commercial one is not an option. Finally, please note that rewriting and even satisfying CONNECT requests is difficult because the browser has certain expectations about the origin server, and the browser's security model prevents many CONNECT request and response manipulations. Yes, I am aware of all the troubles with certificates and how browsers deal with them. ICAP was designed for HTTP, not HTTPS, but ICAP is all we got for content filtering. I am aware that eCAP exists, but because eCAP sits inside the Squid process and has no support for multithreading, which is a must-have for this project, eCAP is not suitable for technical reasons. Thanks Marcus Cheers, Alex. On 9 June 2014 00:22, Marcus Kool marcus.k...@urlfilterdb.com wrote: I ran into an issue with the ICAP interface. 
The issue is that a GET/HTTP-based URL can be successfully rewritten but a CONNECT/HTTPS-based URL cannot. I used debug_options ALL,9 to find out what is going wrong but I fail to understand Squid. GET/HTTP to http://googleads.g.doubleclick.net works: Squid writes: REQMOD icap://127.0.0.1:1344/reqmod_icapd_squid34 ICAP/1.00d Host: 127.0.0.1:13440d Date: Sun, 08 Jun 2014 13:54:09 GMT0d Encapsulated: req-hdr=0, null-body=1350d Preview: 00d Allow: 2040d X-Client-IP: 127.0.0.10d 0d GET http://googleads.g.doubleclick.net/ HTTP/1.00d User-Agent: Wget/1.12 (linux-gnu)0d Accept: */*0d Host: googleads.g.doubleclick.net0d 0d ICAP daemon responds: ICAP/1.0 200 OK0d Server: ufdbICAPd/1.00d Date: Sun, 08 Jun 2014 13:54:09 GMT0d ISTag: 5394572c-45670d Connection: keep-alive0d Encapsulated: res-hdr=0, null-body=2330d X-Next-Services: 0d 0d HTTP/1.0 200 OK0d Date: Sun, 08 Jun 2014 13:54:09 GMT0d Server: ufdbICAPd/1.00d Last-Modified: Sun, 08 Jun 2014 13:54:09 GMT0d ETag: 498a-0001-5394572c-45670d Cache-Control: max-age=100d Content-Length: 00d Content-Type: text/html0d 0d 00d 0d CONNECT/HTTPS does not work: Squid writes: REQMOD icap://127.0.0.1:1344/reqmod_icapd_squid34 ICAP/1.00d Host: 127.0.0.1:13440d Date: Sun, 08 Jun 2014 12:29:32 GMT0d Encapsulated: req-hdr=0, null-body=870d Preview: 00d Allow: 2040d X-Client-IP: 127.0.0.10d 0d CONNECT googleads.g.doubleclick.net:443 HTTP/1.00d User-Agent: Wget/1.12 (linux-gnu)0d 0d ICAP daemon responds: ICAP/1.0 200 OK0d Server: ufdbICAPd/1.00d Date: Sun, 08 Jun 2014 12:29:32 GMT0d ISTag: 5394572c-45670d Connection: keep-alive0d Encapsulated: res-hdr=0, null-body=1930d X-Next-Services: 0d 0d CONNECT blockedhttps.urlfilterdb.com:443 HTTP/1.00d-- NOTE: also fails: CONNECT https://blockedhttps.urlfilterdb.com HTTP/1.00d Host: blockedhttps.urlfilterdb.com0d User-Agent: Wget/1.12 (linux-gnu)0d X-blocked-URL: googleads.g.doubleclick.net0d X-blocked-category: ads0d 0d 00d 0d and Squid in the end responds to wget: HTTP/1.1 500 Internal Server Error 
Server: squid/3.4.5 Mime-Version: 1.0 Date: Sun, 08 Jun 2014 13:59:27 GMT Content-Type: text/html Content-Length: 2804 X-Squid-Error: ERR_ICAP_FAILURE 0 Vary: Accept-Language Content-Language: en X-Cache: MISS from XXX X-Cache-Lookup: NONE from XXX:3128 Via: 1.1 XXX (squid/3.4.5) Connection: close A fragment of cache.log is below. I think that the line HttpReply.cc(460) sanityCheckStartLine: HttpReply::sanityCheckStartLine: missing protocol prefix (HTTP/) in 'CONNECT blockedhttps.urlfilterdb.com:443 HTTP/1.00d indicates where the problem is.
Re: How long is a domain or url can be?
On 05/01/2014 12:50 AM, Eliezer Croitoru wrote: On 05/01/2014 02:52 AM, Marcus Kool wrote: Eliezer, It is not clear what you want to achieve... If you just want to use a URL filter I suggest to use ufdbGuard. I am the author, I give support, there are regular updates, it is multithreaded and holds only one copy in memory, and it has a documented proprietary database format which is 3-4 times faster than squidGuard. Marcus Thanks Marcus, I am looking at a couple of things: I want to understand how SquidGuard was filtering data and doing policy stuff (since I am not able to think alone). I will try to look at ufdbGuard, but now I know I can ask you if not the SquidGuard team. Is it possible with ufdbGuard to update the DB without the need to reload or do anything? No, but ufdbGuard reloads very fast and it has configuration options on how to behave during reload: - block all traffic - allow all traffic - allow and slow down all traffic (to reduce the number of unfiltered URLs) (is it ok to ask you in private?) Sure, but if questions are not related to squid it is better not to use the squid list. Marcus Thanks All, Eliezer
Re: atomic ops on i386
gcc defines the symbols __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 and __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8 and I use code that looks like this:

#if defined(__GNUC__) && defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4) && __SIZEOF_LONG_LONG__ == 4
   (void) __sync_add_and_fetch( &longLongVar, 1 );
#elif defined(__GNUC__) && defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_8) && __SIZEOF_LONG_LONG__ == 8
   (void) __sync_add_and_fetch( &longLongVar, 1 );
#else
   pthread_mutex_lock( &counterMutex );    // or other mutex lock not based on pthread
   longLongVar++;
   pthread_mutex_unlock( &counterMutex );
#endif

I think that the root cause of the problem is in src/ipc/AtomicWord.h where it is assumed that if HAVE_ATOMIC_OPS is defined, atomic ops are defined for all types (and all sizes), which is an incorrect assumption. Changing the configure script to detect atomic ops for long long instead of int is a workaround, but this prevents the use of atomic ops for 4-byte types on systems that support it. Marcus On 04/14/2014 11:45 AM, Alex Rousskov wrote: On 04/13/2014 03:19 PM, Stuart Henderson wrote: On 2014-04-13, Alex Rousskov rouss...@measurement-factory.com wrote: On 04/13/2014 06:36 AM, Stuart Henderson wrote: I'm just trying to build 3.5-HEAD on OpenBSD/i386 (i.e. 32-bit mode) for the first time.
It fails due to use of 64-bit atomic ops:

MemStore.o(.text+0xc90): In function `MemStore::anchorEntry(StoreEntry&, int, Ipc::StoreMapAnchor const&)':
: undefined reference to `__sync_fetch_and_add_8'
MemStore.o(.text+0x3aa3): In function `MemStore::copyFromShm(StoreEntry&, int, Ipc::StoreMapAnchor const&)':
: undefined reference to `__sync_fetch_and_add_8'
MemStore.o(.text+0x3cce): In function `MemStore::copyFromShm(StoreEntry&, int, Ipc::StoreMapAnchor const&)':
: undefined reference to `__sync_fetch_and_add_8'
MemStore.o(.text+0x4040): In function `MemStore::copyFromShm(StoreEntry&, int, Ipc::StoreMapAnchor const&)':
: undefined reference to `__sync_fetch_and_add_8'
MemStore.o(.text+0x435f): In function `MemStore::copyFromShm(StoreEntry&, int, Ipc::StoreMapAnchor const&)':
: undefined reference to `__sync_fetch_and_add_8'
MemStore.o(.text+0x473d): more undefined references to `__sync_fetch_and_add_8' follow
collect2: error: ld returned 1 exit status

I am not an expert on this, but googling suggests building with -march=i586 or a similar GCC option may solve your problem. More possibly relevant details at http://www.squid-cache.org/mail-archive/squid-dev/201308/0103.html specifically because swap_file_sz that they need to keep in sync across Squid kids is 64 bits. That does fix the problem building, but I need this for package builds which are supposed to still work on 486, so I can't rely on users having 586 (cmpxchg8b) - so I think fixing the autoconf check is probably what's needed then. Probably, assuming your users do not care about SMP-shared caching (memory or disk). Should the autoconf test be changed to check for working 64-bit ops, or is something more involved wanted? Filing a bug report may be a good idea, especially if you cannot make this work.
I suppose the simplest fix would be something like this,

--- configure.ac.orig  Fri Apr  4 21:31:38 2014
+++ configure.ac       Sun Apr 13 15:12:37 2014
@@ -416,7 +416,7 @@
 dnl Check for atomic operations support in the compiler
 dnl
 AC_MSG_CHECKING([for GNU atomic operations support])
 AC_RUN_IFELSE([AC_LANG_PROGRAM([[
-int n = 0;
+long long n = 0;
 ]],[[
 __sync_add_and_fetch(&n, 10); // n becomes 10
 __sync_fetch_and_add(&n, 20); // n becomes 30

Nitpick: s/long long/long long int/ but I think it would be safer to test both 32- and 64-bit sizes (using two different variables instead of a single n). Also, ideally, we should use int32_t and uint64_t types if possible, but that probably requires #inclusion of other headers and may become difficult unless there are already working ./configure test cases using those types that you can copy from. Happy to open a bug report if that's preferred, I thought I'd ask here first to work out the best direction. Again, I am not an expert on this, but I think you are on the right track. If you can test your patch and post here, somebody will probably commit it, especially if you can polish it based on the above suggestions. A bug report just makes it easier to remember to do it (and to point others to the fix). Thank you, Alex.
Re: c++0x and RHEL5.X
On 12/30/2013 10:16 AM, Amos Jeffries wrote: On 30/12/2013 11:21 p.m., Kinkie wrote: Hi all, we have been talking about mandating c++11 some time in the next few months. Today I was trying to rely on a c++0x feature, and I realized that RHEL5.X ships gcc 4.1.2, which doesn't support c++0x. RHEL6 ships g++ 4.4.7, which supports c++0x but not c++11. Now, what I need to do here is mostly convenience, I can work around it. However I am annoyed; we will need to make a decision and this fact complicates things. RHEL cannot be a blocker for us. They will be stuck with that half-working GCC version until around 2020 unless they bump it up in a service pack release. CentOS has followed that, but is a bit more flexible with compiler packages. And those or Fedora compiler packages are usually okay for RHEL as well. I am running CentOS 5.x, which has a package called 'gcc44' that installs gcc 4.4.7 next to gcc 4.1.2. Does RHEL have the gcc44 package? I was intending to start the serious decision talk late next year, probably after 3.5 has gone beta or stable. So that we take a good look at it for the 3.6 timeframe. That will give us at least half of the major distros fully on the preferred GCC versions and some like CentOS etc only a short few years away from EOL on the non-working versions (probably with packages available for the preferred compiler versions). PPS. Anything we do in the C++11 direction before 2015 will probably still require macros and wrappings. So look carefully at the features in regards to whether there is a non-C++11 equivalent and how messy the wrappers would make the code. Amos
Re: SLES build error, what to do?
On 12/23/2013 05:30 PM, Eliezer Croitoru wrote: Thanks Amos, On 23/12/13 05:33, Amos Jeffries wrote: Inquirer.cc:90: error: 'auto_ptr' is deprecated (declared at /usr/include/c++/4.3/backward/auto_ptr.h:91) This is a GCC bug. For a couple of releases the STL library required more advanced C++11 support than the compiler provided. It can only be worked around by upgrading GCC. OK, so SLES has that specific 4.3 version and they probably will not upgrade for who knows how long.. https://www.suse.com/releasenotes/x86_64/SUSE-SLES/11-SP3/ has the information that you need: SLES 11 SP3 has an optional SDK with gcc 4.7.2 Marcus The suggestion is to compile and use it only as one process?? What I mean is: if there is a pointer error, how severe can the runtime error be? Thanks again, Eliezer SNIP But after using the mentioned option the results seem like: # tail -f /usr/local/squid/var/logs/access.log 1387753634.690305 192.168.10.100 TCP_REFRESH_UNMODIFIED/304 306 GET http://docs.fedoraproject.org/en-US/index.html - HIER_DIRECT/80.239.156.215 - and it seems like it works but will maybe have some problems at runtime? There will be pointer problems in SMP mode. Amos
Re: [PATCH] Re: URL redirection with Squid 3.4
On 12/16/2013 01:46 PM, Alex Rousskov wrote: On 12/14/2013 06:28 AM, Amos Jeffries wrote: On 14/12/2013 6:59 a.m., Marcus Kool wrote: all, as discussed in a previous thread, the URL rewriter protocol of Squid 3.4 is different from that of previous versions of Squid. Despite Amos' belief, I found out yesterday that there is no backward compatibility since a typical redirection URL is www.example.com/foo.cgi?category=adult&url=http://www.example.com/foo/bar and Squid 3.4 has a parser that splits tokens at '=' and then complains that it does not understand the answer of the URL redirector. Ouch. Thank you for finding this one. The fix appears to be limiting the character set we accept for key names such that it does not match any valid URL. I have now applied a patch to trunk as rev.13181 which limits characters in kv-pair key names to alphanumeric, hyphen and underscore. Based on icap_service experience that had a similar chain of developments/bugs/fixes, the best fix may be to require uri=... or a similar key=value pair for communicating URLs. There is an issue of backward compatibility which might be addressed by prohibiting bare URLs when newer key=value support is enabled (and honoring them otherwise). The presence of a uri=... key=value pair can be used to distinguish the two cases more-or-less reliably. In other words: * Want to use the newer key=value format? Use uri=... Does this mean that a response like http://www.example.com/cgi?cat=adult&uri=foo will be parsed correctly? (note the uri= at the end) * Otherwise, you may continue to use bare URIs. What would work great for the URL redirector is: if the response starts with OK, ERR or BH, it can be parsed as the new 3.4 protocol with kv-pairs, and if not, it is the old pre-3.4 protocol and should be parsed as such. I wonder if the same logic works for the other interfaces. disclaimer: I am not very familiar with all interface changes, only the changes for the URL redirector.
Marcus Please do _not_ interpret the above as a vote against restricting key name characters. However, we should probably restrict them the same way for key names _everywhere_ (in all key=value pairs) and not just in helper responses. Cheers, Alex.
Re: url_rewrite_program in Squid 3.4
and must return OK [status] url=newurl for a URL that needs to be redirected. One would expect that ERR is used for an error, not for something that is the opposite of an error. The error is that the re-writer could not or would not re-write the URL. You can return OK without the url=, status= or rewrite-url= keys. url= is only required *if* the URL is being redirected. rewrite-url= is only required *if* the URL is being rewritten. Thanks for the explanation. This means that the information on http://www.squid-cache.org/Versions/v3/3.4/cfgman/url_rewrite_program.html is not correct since url= and rewrite-url= are not optional. I suggest updating this page to include that the result OK is meant for no URL modification / PASS. Thanks Marcus
Re: url_rewrite_program in Squid 3.4
On 11/12/2013 09:41 AM, Amos Jeffries wrote: You can return OK without the url=, status= or rewrite-url= keys. url= is only required *if* the URL is being redirected. rewrite-url= is only required *if* the URL is being rewritten. Thanks for the explanation. This means that the information on http://www.squid-cache.org/Versions/v3/3.4/cfgman/url_rewrite_program.html is not correct since url= and rewrite-url= are not optional. I suggest updating this page to include that the result OK is meant for no URL modification / PASS. Ah, them docs. I keep looking at the wiki docs for helper protocol: http://wiki.squid-cache.org/Features/AddonHelpers Okay, config manual updated. Amos I hit refresh in the browser, but did not see an update for http://www.squid-cache.org/Versions/v3/3.4/cfgman/url_rewrite_program.html Is there a delay in the update? Reading http://wiki.squid-cache.org/Features/AddonHelpers I observed another inconsistency between http://wiki.squid-cache.org/Features/AddonHelpers and http://www.squid-cache.org/Versions/v3/3.4/cfgman/url_rewrite_program.html which is that the spec of AddonHelpers states to use a bare URL on a rewrite while 3.4/cfgman states to use a kv url=URL. Since you looked at the wiki I assume that the bare URL is correct. Can you confirm this? Thanks Marcus
Re: url_rewrite_program in Squid 3.4
On 11/12/2013 06:19 PM, Amos Jeffries wrote: On 2013-11-13 01:02, Marcus Kool wrote: On 11/12/2013 09:41 AM, Amos Jeffries wrote: You can return OK without the url=, status= or rewrite-url= keys. url= is only required *if* the URL is being redirected. rewrite-url= is only required *if* the URL is being rewritten. Thanks for the explanation. This means that the information on http://www.squid-cache.org/Versions/v3/3.4/cfgman/url_rewrite_program.html is not correct since url= and rewrite-url are not optional. I suggest to update this page to include that the result OK is meant for no URL modification / PASS. Ah, them docs. I keep looking at the wiki docs for helper protocol: http://wiki.squid-cache.org/Features/AddonHelpers Okay, config manual updated. Amos I'm not sure exactly what you mean by this. I assume its the [URL] entry in the AddonHelper response syntax? The old response syntax is still supported. That had a bare URL for rewrite and a status:URL pair for redirect. That has been changed to either a status=N url=X pair for redirect or a rewrite-url=X for rewrite in the new syntax. AddonHelpers mentions the two syntaxes since it covers all supported versions. The config manual only mentions the preferred syntax for the latest version unless you drill down to older release series manuals. Amos I hit refresh in the browser, but did not see an update for http://www.squid-cache.org/Versions/v3/3.4/cfgman/url_rewrite_program.html Is there a delay in the update? Yes, those docs are generated from the release code. So when 3.4.0.3 comes out the site will change. Reading http://wiki.squid-cache.org/Features/AddonHelpers I observed another inconsistency between http://wiki.squid-cache.org/Features/AddonHelpers and http://www.squid-cache.org/Versions/v3/3.4/cfgman/url_rewrite_program.html which is that in the spec of AddonHelpers states to use a bare URL on a rewrite while 3.4/cfgman states to use a kv url=URL. 
Since you looked at the wiki I assume that the bare URL is correct. Can you confirm this? I'm not sure exactly what you mean by this. I assume it's the [URL] entry in the AddonHelper response syntax? The old response syntax is still supported. That had a bare URL for rewrite and a status:URL pair for redirect. That has been changed to either a status=N url=X pair for redirect or a rewrite-url=X for rewrite in the new syntax. AddonHelpers mentions the two syntaxes since it covers all supported versions. The config manual only mentions the preferred syntax for the latest version unless you drill down to older release series manuals. Amos OK, I understand the syntax now. The only thing is that the spec in cfgman/3.4 states that 'result' is a large spec but does not include a simple OK without anything else. Marcus
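To summarize the two redirector reply syntaxes as Amos describes them (illustrative lines only; the URLs are made up, and the no-change row is my reading of this thread):

```
# pre-3.4 (old) syntax                  # 3.4 (new) syntax
http://new.example.com/page             OK rewrite-url=http://new.example.com/page     (rewrite)
302:http://new.example.com/page         OK status=302 url=http://new.example.com/page  (redirect)
(URL unchanged / empty line)            ERR, or a bare OK                              (no change)
```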
url_rewrite_program in Squid 3.4
Hi, I noticed that 3.4.0.2 uses a new protocol for the url_rewrite_program that is incompatible with previous versions of Squid. I am updating ufdbGuard, a URL redirector for Squid, for Squid 3.4 to support the new protocol of Squid version 3.4. I read http://www.squid-cache.org/Versions/v3/3.4/cfgman/url_rewrite_program.html and was utterly surprised to read that a URL redirector must return ERR to indicate that the URL is fine and does not need to be redirected, and must return OK [status] url=newurl for a URL that needs to be redirected. One would expect that ERR is used for an error, not for something that is the opposite of an error. Is there a chance that the protocol reply ERR can be changed into something logical like PASS or UNCHANGED ? Furthermore, I suggest that the BH status code gets a parameter, a quoted string explaining what is happening with the URL redirector. Thanks Marcus
Re: [RFC] Peek and Splice
On 02/01/2013 03:00 PM, Alex Rousskov wrote: I agree with the general "everything we proxy should be available for analysis" principle. Getting to that point would be difficult because protocols and APIs such as ICAP, eCAP, external ACL helper, and url_rewriter were not designed to deal with everything. They need to be tweaked or extended to work with non-HTTP traffic. We already do that in some cases (e.g., FTP) but more is needed to handle everything. And that is exactly why I try to encourage you to think about it now, since doing this together with the planned change is less work than moving it to a future project. As a bonus it will make Squid one of the very few proxies which take virus scanning and content filtering really seriously. Marcus
Re: [RFC] Peek and Splice
On 02/01/2013 02:17 AM, Alex Rousskov wrote: Hello, Many SslBump deployments try to minimize potential damage by _not_ bumping sites unless the local policy demands it. Unfortunately, this decision must currently be made based on very limited information: A typical HTTP CONNECT request does not contain many details and intercepted TCP connections are even worse. We would like to give admins a way to make the bumping decision later in the process, when the SSL server certificate is available (or when it becomes clear that we are not dealing with an SSL connection at all!). The project is called Peek and Splice. The idea is to peek at the SSL client Hello message (if any), send a similar (to the extent possible) Hello message to the SSL server, peek at the SSL server Hello message, and then decide whether to bump. If the decision is _not_ to bump, the server Hello message is forwarded to the client and the two TCP connections are spliced at the TCP level, with Squid shoveling TCP bytes back and forth without any decryption. If we succeed, the project will also pave the way for SSL SNI support because Squid will be able to send client SNI info to the SSL server, something that cannot be done today without modifying OpenSSL. I will not bore you with low-level details, but we think there is a good chance that Peek and Splice is possible to implement without OpenSSL modifications. In short, we plan to use the OpenSSL BIO level to prevent OpenSSL from prematurely negotiating secure connections on behalf of Squid (before Squid decides whether to bump or splice). We have started writing BIO code, and basic pieces appear to work, but the major challenges are still ahead of us so the whole effort might still fail. There are a few high-level things in this project that are not clear to me. I hope you can help find the best solutions: 1. Should other bumping modes switch to using the SSL BIO that is required for Peek and Splice? Pros: Supporting one low-level SSL I/O model keeps code simpler.
Cons: Compared to OpenSSL native implementation, our BIO code will probably add overheads (not to mention bugs). Is overall code simplification worth adding those overheads and dangers? 2. How to configure two ssl_bump decisions per transaction? When Peek and Splice is known to cause problems, the admin should be able to disable peeking using CONNECT/TCP level info alone. Thus, we probably have to keep the current ssl_bump option. We can add a peek action that will tell Squid to enable Peek and Splice: Peek at the certificates without immediately bumping the client or server connection (the current code does bump one or the other immediately). However, many (most?) bumping decisions should be done when the server certificate is known -- the whole point behind Peek and Splice. We can add ssl_bump2 or ssl_bump_peeked that will be applied to peeked transactions only:

ssl_bump peek safeToPeek
ssl_bump none all
ssl_bump_peeked server-first safeToBump
ssl_bump_peeked splice all

Is that the best configuration approach, or am I missing a more elegant solution? If there are any other Peek and Splice suggestions or concerns, please let me know. Thank you, Alex. This PeekSplice feature will make ssl_bump a useful feature since without PeekSplice ssl_bump aborts all non-SSL CONNECTs from Skype and other applications, so the user community will certainly welcome this. Currently Squid only sends to the ICAP server a REQMOD CONNECT www.example.com:443 (without content) and there is never a RESPMOD. I, as author of ufdbGuard and the (yet unpublished) new ICAP content filter, would welcome very much if the data of the peeks (client and server) is encapsulated into ICAP requests for the obvious purpose of content filtering. Thanks Marcus
Re: [RFC] Peek and Splice
On 02/01/2013 01:48 PM, Alex Rousskov wrote: On 02/01/2013 06:47 AM, Marcus Kool wrote: This PeekSplice feature will make ssl_bump a useful feature since without PeekSplice ssl_bump aborts all non-SSL CONNECTs from Skype and other applications, so the user community will certainly welcome this. Well, SslBump is already useful in environments where non-SSL CONNECTs are either prohibited or can be detected and bypassed using CONNECT or TCP-level information. Peek and Splice will allow bypass of non-SSL tunnels without building complicated white lists. While not in this project scope, Peek and Splice would probably make it possible (with some additional work) to allow Squid to detect and block non-SSL tunnels without bumping SSL tunnels. That could be useful in environments where HTTPS is allowed (and does not need to be bumped) but other tunnels are prohibited. Yes, I think it is useful to have an option allowed_protocols_for_connect: any|ssl Currently Squid only sends to the ICAP server a REQMOD CONNECT www.example.com:443 (without content) and there is never a RESPMOD. I, as author of ufdbGuard and the (yet unpublished) new ICAP content filter, would welcome very much if the data of the peeks (client and server) is encapsulated into ICAP requests for the obvious purpose of content filtering. Squid already sends bumped (i.e., decrypted) HTTP messages to ICAP and eCAP. If that does not happen in your SslBump tests, it is a bug or misconfiguration. Squid cannot send encrypted HTTP messages to ICAP or eCAP -- you must use SslBump if you want to filter encrypted traffic. There is no way around that. Yes, correct. I mixed up the behaviour of Squid with sslbump (decrypted messages go to the ICAP server) and Squid without sslbump (ICAP server only receives a REQMOD). Or are you thinking about sending SSL Hello messages to ICAP and eCAP services? If Peek and Splice succeeds, that will be technically possible as well, but will require more work and would be a separate project.
I was thinking about this: when Squid peeks at the data and finds that it is non-SSL, send it to the ICAP server to ask its opinion. This is obviously more work, but also extremely useful, since a content filter is only useful if it is able to inspect _all_ content, and consequently the feature of Squid to connect to content filters is only useful if Squid sends _all_ data to the content filter for analysis. Perhaps needless to say: viruses like to communicate in non-standard ways, so Squid would be considered much more secure if it sends _all_ data to an ICAP server for analysis. Marcus
Re: Spaces in ACL values
On 09/13/2012 07:16 PM, Alex Rousskov wrote: 2) Add squid.conf directives to turn the new parsing behavior on and off for a section of the configuration file. This is also 100% backward compatible but difficult to introduce gradually -- admins will expect everything inside a quoted strings section to support quoted strings, and I am not 100% sure we can easily support that because different options use different token parsers.

# start new quoting support section
configuration_value_parser quoted_strings
# now just use the new quoting support
acl badOne1 user_cert CN "Bad Guy"
acl badOne2 ext_user "Bad Guy"
# restore backward-compatible mode
configuration_value_parser bare_tokens
acl oldOne user_cert CN One Two and Four

2b) Add squid.conf directives _at the beginning_ of the conf file to specify the parser behavior. So do not toggle, and force the admin to be aware of quoted strings and to check the whole config file himself. The default value of config_used_quoted_strings is off. This is still 100% backwards compatible without doing lots (?) of effort to please everybody and every situation. Marcus
Re: processing of ICAP Transfer-Ignore options
On 04/16/2012 04:34 AM, Henrik Nordström wrote: Sun 2012-04-15 22:07 -0300, Marcus Kool wrote: Are you saying that you want to use the Content-Type header as the main guide for determining the file extension? Yes, when there is a usable content-type. The idea itself is good. The problem is that it is very different from what the ICAP RFC states. I think that the negation of filtering based on Content-Type should use a new parameter, e.g. Ignore-Content-Type. And let's not forget that Transfer-Ignore based on a part of the URL can be used for REQMOD and RESPMOD while Ignore-Content-Type can only be used for RESPMOD. This has a small performance impact: there will be more ICAP traffic since less can be ignored. Anyway, clarity is the most important thing here and I suggest to move this discussion to the ICAP discussion forum. Clarity in a pile of mud... Hahaha. Do you propose to make an ICAP2 standard that is not backwards compatible? Regards Marcus
Re: processing of ICAP Transfer-Ignore options
I have a number of clients dealing with this most common case among the popular CMS systems today...

GET http://example.com/index.php?some/file.jpg HTTP/1.1
...
HTTP/1.1 200 Okay
Content-Type: text/xml
...

GET http://example.com/imagews/1861245634-230is86 HTTP/1.1
...
HTTP/1.1 200 Okay
Content-Type: image/jpeg
...

If one is lucky the CMS *may* put .jpg on the second URI name. Amos Do the referred clients use a reverse proxy or a forward proxy? Yeah, the more we talk about this issue, the more I think the existing Transfer-Ignore and a new Ignore-Content-Type are bogus since there are still too many web servers and CMSs that do things wrong. Both the file extension and the Content-Type are too often wrong. For RESPMOD, icapd uses content sniffing since when it blocks an object it insists on sending new content with the correct Content-Type. I came up with the original question since I try to optimize performance and ignore irrelevant content by not sending it to the ICAP server. I think we are stuck with processing all traffic without ignoring any content. Marcus
Re: processing of ICAP Transfer-Ignore options
On 04/16/2012 11:58 AM, Henrik Nordström wrote: Mon 2012-04-16 09:40 -0300, Marcus Kool wrote: The idea itself is good. The problem is that it is very different from what the ICAP RFC states. Is it? A list of file extensions that ... It says file extensions. What is a file? In my mind the closest to a file is what you get on your harddrive when you download something, and there is no direct URL-to-file-extension mapping. The file extension is derived from a combination of content-type, content-disposition and URL. I think we agree that file extension is an inappropriate term in this context. I agree that it would be more suitable to ignore transfers to the ICAP server based on Content-Type. However, looking at the RFC where the example uses asp, bat, exe, com, ole it seems that the authors of the RFC were thinking of a URL-based suffix, not content-type. I think that the negation of filtering based on Content-Type should use a new parameter, e.g. Ignore-Content-Type. That's a useful replacement. Regards Marcus
Re: processing of ICAP Transfer-Ignore options
On 04/15/2012 02:33 PM, Henrik Nordström wrote: Sat 2012-04-14 19:11 -0600, Alex Rousskov wrote: Sure, I am just trying to find a way to improve compatibility of ICAP agents, even though the ICAP protocol itself is using wrong concepts when defining what was meant as a pretty useful feature. I'd propose the following algorithm: 1. Look up content-type in the mime table and deduce file extension from there unless the content-type is application/octet-stream. Limited to mime table expressions of the form \.ext$ where ext does not contain any special regex patterns. Are you saying that you want to use the Content-Type header as the main guide for determining the file extension? I think that any change should stay close to the vague definitions of the ICAP RFC. The text explaining Transfer-Complete gives an example of bat which is probably the old Windows .BAT command file, which probably has a Content-Type of text/plain. IMO using the Content-Type will not have the desired behavior. At the time that the ICAP RFC was written there were hardly any CGI scripts and I believe that the intention was that the suffix of the URL was the file extension. Today, with the CGI parameters one could argue that they should be stripped before determining the file extension. Anyway, clarity is the most important thing here and I suggest to move this discussion to the ICAP discussion forum. Marcus 2. For application/octet-stream or when the file extension is otherwise uncertain, identify the filename and derive the file extension from there, in priority order a) Content-Disposition filename parameter b) URL-path c) last part of query parameters. With some handwaving and juggling to determine priority of b & c... Regards Henrik
Re: processing of ICAP Transfer-Ignore options
Yes, the file extension is vague, hence my original question. However, Squid 3.1 thinks that the file extension is the last bit of the URL after the last dot (as if the URL had a filename suffix). It seems logical to strip the CGI parameters before evaluating the file extension. I studied a lot of URLs, the file extension and Content-Type and it turns out that the file extension is far more reliable as an indicator of the content type than the Content-Type itself. Best regards, Marcus On 04/13/2012 06:42 PM, Henrik Nordström wrote: Fri 2012-04-13 13:21 -0600, Alex Rousskov wrote: Yes, but primarily because the extension is not clearly defined. This is something we can address in ICAP Errata, I guess: provide a definition of what should be considered a file extension, with a disclaimer that not all agents will use the definition provided. It would not solve all the problems but would be better than doing nothing. ICAP was designed for HTTP. HTTP does not have file name extensions, HTTP has content types. Regards Henrik
processing of ICAP Transfer-Ignore options
I am testing the ICAP interface of Squid 3.1.18 and noticed the following. The OPTIONS response for RESPMOD is this:

ICAP/1.0 200 OK0d
Methods: RESPMOD0d
Preview: 81920d
Transfer-Preview: *0d
Transfer-Ignore: bmp,ico,gif,jpg,jpe,jpeg,png,tiff,crl,avi,divx,flv,h264,mp4,mpg,mpeg,swf,wmv,mp3,wav,ttf,pdf,rar,tar,zip,gz,bz2,jar,js,json,htm,html,dhtml,shtml,css,rss,xml0d
Service: ICAPD 0.9.1 ICAP server by URLfilterDB0d
Service-ID: URLfilterDB0d
ISTag: 4f883424-d44b0d
Connection: keep-alive0d
Encapsulated: null-body=00d
Max-Connections: 5000d
Options-TTL: 6000d
Allow: 2040d
Allow: 2060d
X-Include: X-Client-IP, X-Server-IP, X-Forwarded-For, X-Subscriber-ID, X-Client-Username, X-Authenticated-Groups0d

and the Transfer-Ignore processing works as expected for .gif etc. (e.g. the ICAP server does not receive the previews) _except_ for http://zzz.com/1409303.mp4?p1=2012-xxx where the ICAP server unexpectedly receives the preview. There is no formal definition in the RFC of what a file extension is. So the question is: is the file extension of http://zzz.com/1409303.mp4?p1=2012-xxx 'mp4'? If yes, I will file a bug report. Marcus
Re: filtering HTTPS/CONNECT (summary and continuation of discussion)
Well, 'herd of cats' is a term I've seen recently to describe FOSS project dev teams. Pretty accurate. You yourself are already part of the team simply by dint of your contribution pushing this discussion far enough forward to get a work plan out of it. With the work plan it should be easy to make up quotes and try to get sponsorship for all or parts of it. Some parts can be crossed between projects and prioritized by those of us interested in general code cleanups or proposed to a wider audience of sponsors than would support the feature you are asking for. Amos Amos, I am not a native speaker and do not get the hint 'herd of cats'. I can contribute in all areas except modifying the code of Squid. Best regards, Marcus
Re: filtering HTTPS/CONNECT (summary and continuation of discussion)
On 03/17/2012 09:06 PM, Henrik Nordström wrote: Sat 2012-03-17 11:10 -0600, Alex Rousskov wrote: No, it will not by default. One would have to maintain a white list of destinations that should not be bumped. Which you can't for things like Skype as they connect pretty much anywhere (peer-to-peer network). This is just one example. There is a growing list of services that use CONNECT: Citrixonline, videoconferencing, and other chat applications. Regards Henrik
Re: filtering HTTPS/CONNECT (summary and continuation of discussion)
On 03/19/2012 01:48 PM, Henrik Nordström wrote: mån 2012-03-19 klockan 11:35 -0300 skrev Marcus Kool: An unfiltered CONNECT (default for Squid) allows (SSH) tunnels. Squid's standard configuration only allows port 443, which restricts this to those who intentionally want to pierce any network usage policy. I foresee a change. I foresee an increasing desire to be able to filter everything because of the need to remove the existing holes in security. There are undoubtedly such environments. The question is whether Squid is the right tool for this, or whether it is within Squid's scope. This is an important point. It is the development team that decides which features will be implemented. Surely there is some common idea about which direction Squid will go in, but it is not clear to me. I read the roadmap but it is sort of a wishlist, and therefore I started this discussion. As Alex stated, there is no use in starting work on the filter side of a pipe filter if there is no Squid developer interested in doing the work on Squid. I am not in the position to actively support pipe filtering, so the only thing that I can do is ask for it. Best regards Marcus
Re: filtering HTTPS/CONNECT (summary and continuation of discussion)
Alex Rousskov wrote: On 03/16/2012 03:05 PM, Marcus Kool wrote: How do we go on from here? I recommend splitting this big problem into several smaller areas: Tunnel classification: As Henrik noted, Squid should wait for the client (or server!) handshake before starting the SSL handshake with the server. Waiting for one of the sides to speak first (i.e., before Squid) allows us to categorize the tunnel intent: SSL, HTTP, Other. This step is critical for the other projects below. Indeed this step is critical. Squid must not guess (wrongly) and unintentionally cause problems for applications that use CONNECT. HTTP tunnel: Either go to tunnel.cc or process almost as a regular request stream. Make the choice configurable. Not sure what you mean by HTTP tunnel (see below). SSL tunnel: Use bump-server-first. Add SNI forwarding support. If the SSL handshake with the server fails (there are many broken and weird servers out there!), bump-server-first returns a secure error to the client. In some cases, it may be better to re-tunnel the server end (without bumping) or just close the client connection immediately. The former requires serious coding effort; the latter does not, but both are pretty straightforward. And make the choice configurable. Only after detection of SSL and after a successful SSL handshake can Squid detect what happens inside the SSL-wrapped data stream. Again Squid needs to monitor the server and the client and detect what is inside the SSL-wrapped data stream: 'regular HTTP' or 'something else'. When you refer to HTTP tunnel, do you mean SSL-wrapped HTTP? Squid should switch to tunnel mode for an SSL-wrapped non-HTTP stream. Other tunnel: When non-HTTP traffic is encountered at the beginning of a tunnel, switch to tunneling mode or terminate both connections. Make the choice configurable. There are too many applications, and of course, a Squid admin wants to block some and allow others. A single switch for all applications does not seem very useful.
Currently, only filters detect the various applications and can do the selective blocking. Since not all Squid installations have an (ICAP) filter, it is probably a good thing to have the switch anyway. Filterable Other tunnels (bumped or not!): Define a protocol and/or API to adapt tunnel.cc (or similar) I/O. Learn from ICAP mistakes. Implement the client/hosting side of that protocol/API in Squid. 3rd parties will implement the service/adapter sides. Also a must-have to satisfy the idea that all data must be filterable. Did I miss any big cases? As you can see, all of the above are pretty much independent projects. Are _you_ interested in all or just some of them? My point of view is filter-based and I think that you could already read between the lines that I think that Squid should have it all to make all data filterable. Filtering is done for security: to block Skype but allow other, safer chat and VOIP applications; to block HTTPS proxies; to prevent (accidental) leaks of documents to public document-sharing sites; and of course to block viruses. Filtering is also done to force employees to pay more attention to their work and less to the sports comments on the internet, but filtering for security is more important. As the web changes and more servers use HTTPS and more applications use CONNECT, I think Squid should have it all to remain a fully featured and safe web proxy. Will you do any work on Squid itself or are you looking for a volunteer on our end? If it is the former, would you like to create dedicated wiki pages for those projects you are interested in and start nailing down the details? I know very little of the internals of Squid and do not have the time to get to know the code well enough to make these types of changes. I am willing to write feature pages and assist in writing a detailed document for the new pipe filter protocol. ufdbGuard is GPLv2 and the new ICAP filter will also be GPLv2.
The pipe filter module for the server will also be GPLv2 so that others (I am thinking of antivirus) can benefit from it. If you are looking for volunteers to work on the Squid side, then I would not recommend doing much on your end until you secure at least one such person. Otherwise, you may end up with a filter that you cannot attach to Squid. I do not want to appear to try to push the Squid team to do things. I know that you all are busy and that new features will have priorities and will be queued. I can only hope that the development team shares the view that to remain a safe proxy, Squid needs the ability to filter *all* data. At this moment my #1 priority wish is that Squid 3.1.x or 3.2.x can be used with the sslBump feature turned on without breaking Skype and other applications that use CONNECT. Will the new bump-server-first do this? Thank you, Alex.
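The tunnel classification step discussed above can be sketched as a peek at the first bytes sent on the tunnel. This is an illustrative Python sketch under simple assumptions (a TLS handshake record starts with content type 0x16 and a 0x03 major-version byte; SSLv2-style hellos are ignored here), not Squid's implementation:

```python
def classify_tunnel(first_bytes: bytes) -> str:
    """Hedged sketch of the tunnel classification step (not Squid code).
    Assumption: a TLS/SSL handshake record starts with content type 0x16
    and a 0x03 major-version byte; a plain-HTTP tunnel starts with a
    request method token."""
    if len(first_bytes) >= 2 and first_bytes[0] == 0x16 and first_bytes[1] == 0x03:
        return "SSL"
    methods = (b"GET ", b"HEAD ", b"POST ", b"PUT ", b"DELETE ", b"OPTIONS ")
    if first_bytes.startswith(methods):
        return "HTTP"
    return "Other"   # e.g. SSH, Skype, OpenVPN -- hand off to tunnel mode

print(classify_tunnel(b"\x16\x03\x01\x00\xc8"))           # SSL
print(classify_tunnel(b"GET /index.html HTTP/1.1\r\n"))   # HTTP
print(classify_tunnel(b"SSH-2.0-OpenSSH_8.9\r\n"))        # Other
```

Waiting for the first bytes from either side (rather than guessing) is exactly what makes the SSL/HTTP/Other split possible without breaking applications that expect a transparent tunnel.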
Re: filtering HTTPS/CONNECT (summary and continuation of discussion)
There were 4 threads about 'filtering HTTPS' and I will try to summarise them here. Current situation with Squid 3.1.19: what happens inside a CONNECT is practically not filterable because 1) sslBump is not used, or 2) sslBump is used and SSL+HTTP can be filtered, but it breaks the other data streams for Skype et al. Using the unsafe options 'sslproxy_cert_error allow all' and 'sslproxy_flags DONT_VERIFY_PEER' to circumvent the latter problem is far from desirable. The wiki feature pages say that Alex Rousskov is working on BumpSslServerFirst and MimicSslServerCert but unfortunately Alex has not (yet) participated in the discussion. What I consider the desired situation: *all* traffic will be filterable, since if there is an exception for one category of data, one can write an application that makes a tunnel using this particular category of data and hence is able to circumvent all efforts to filter traffic. To filter HTTP is trivial. To filter HTTPS there are two options: 1) filter without sslBump, and then the filter only receives CONNECT endpoint:443, on which it has to make a decision to block or not. This cripples the filter since it does not have access to the content and in many cases cannot detect which application sends what (type of) data. An additional drawback is that the connection can be blocked but an understandable error message cannot be presented to the end user. 2) use sslBump. The filter will receive CONNECT endpoint:443 as well as https://endpoint/path (and content for RESPMOD) for SSL+HTTP based connections, so this is optimal for filtering SSL+HTTP connections. The discussion was largely about what to do with data streams that are not SSL+HTTP. This can be any protocol encapsulated by SSL or simply any protocol. To be able to filter all data, Squid needs a modification to present raw data about the non-SSL+HTTP data streams to a filter (URL redirector or ICAP).
To keep the discussion focussed on one type of filter I will assume that an ICAP server is used as the filter. The ICAP protocol has considerable overhead (CPU processing) and extending the ICAP protocol for data stream filtering is not the first choice. Amos and Henrik were optimistic about implementing a new pipe filter. The data streams of a bidirectional pipe behave differently from HTTP and SSL+HTTP: both client and server can send data at any time, and for some protocols the server initiates while for others the client initiates. OpenVPN is a chameleon and can pretend to be an SSL+HTTP server but is also a VPN server. In all cases where Squid sends a request to a filter, it would be a *big* plus if it informs the filter of what it already knows about the CONNECT endpoint, e.g. whether it uses SSL/TLS or not. Since sslBump is being rewritten for 3.3 it is a good opportunity to make Squid suitable for filtering *all* data streams. The new sslBump flow could be something like this: A) open a socket to the server. If error, close the socket to the client. B) do the logic for ICAP REQMOD CONNECT endpoint:443 C) start the SSL handshake to the server and take care of all certificate issues. If the SSL handshake fails with a PROTOCOL error, the socket must be closed, a new socket must be opened, and Squid will assume that the endpoint uses a protocol other than SSL. Squid goes into tunnel mode and all filtering will be done by the new pipe filter. Squid may get a new option to define its behaviour in case the SSL handshake fails. The option could be called sslBumpForNoneSSL with values prohibitNoneSSL (terminate connection), passNoneSSL (always allow), filterNoneSSL (default value - let the new pipe filter decide). D) Squid now knows that the connection has an SSL/TLS wrapper but does not know yet whether HTTP is used inside the wrapper. Squid monitors what the client *and* the server send on the pipe.
If the client sends first and sends a valid HTTP command, Squid assumes that the connection has SSL+HTTP. If there is no SSL+HTTP, Squid goes into tunnel mode and all filtering will be done with the new pipe filter. E) do the normal processing and ICAP REQMOD/RESPMOD for https://endpoint/path The total work of Squid+filter can be reduced if B) is done after C), since Squid can then inform the filter about the SSL handshake and the filter does not have to do its own probe. There was a suggestion for a connection cache which allows Squid to skip checks and make assumptions about a new CONNECT to an endpoint that was CONNECTed to before. The new pipe filter requires a new protocol, yet to be defined. Squid initially tells the filter what it already knows about the endpoint, i.e. uses SSL or not, time to CONNECT, endpoint address, cached information. The Squid pipe sends copies of all data to the filter and the filter can reply with one of the following: OK (proceed with this data), REPLACE-CONTENT (content and a flag to optionally also terminate the connection), TERMINATE
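The filter side of such a pipe protocol could be sketched as follows. Everything here is hypothetical (the protocol does not exist yet): the verdict names follow the proposal above, and the matching rules are made-up examples:

```python
# Hypothetical sketch of the filter side of the proposed pipe-filter
# protocol. The verdict names follow the proposal above; the matching
# rules are made-up examples, not part of any real protocol.

OK, REPLACE_CONTENT, TERMINATE = "OK", "REPLACE-CONTENT", "TERMINATE"

def filter_chunk(direction: str, chunk: bytes):
    """Inspect one copied chunk; return (verdict, payload).
    'direction' is 'client' or 'server'."""
    if chunk.startswith(b"SSH-"):            # example: block SSH tunnels
        return TERMINATE, b""
    if direction == "server" and b"secret" in chunk:
        # example of content replacement
        return REPLACE_CONTENT, chunk.replace(b"secret", b"******")
    return OK, chunk

print(filter_chunk("client", b"SSH-2.0-OpenSSH_8.9"))  # ('TERMINATE', b'')
```

The key design point is that Squid only copies data and acts on verdicts; all protocol knowledge (what an SSH banner looks like, what content to replace) stays in the filter.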
Re: filtering HTTPS/CONNECT (summary and continuation of discussion)
Alex Rousskov wrote: On 03/16/2012 03:05 PM, Marcus Kool wrote: There were 4 threads about 'filtering HTTPS' and I will try to summarise here. Current situation with Squid 3.1.19: What happens inside a CONNECT is practically not filterable because 1) sslBump is not used, or 2) sslBump is used and SSL+HTTP can be filtered, but it breaks the other data streams for Skype et al. Using the unsafe options 'sslproxy_cert_error allow all' and 'sslproxy_flags DONT_VERIFY_PEER' to circumvent the latter problem is far from desirable. The wiki feature pages say that Alex Rousskov is working on BumpSslServerFirst and MimicSslServerCert but unfortunately Alex has not (yet) participated in the discussion. Sorry, I was on a business trip when the discussion started and could not respond until now (I tried!). ok, no need to apologise. To filter HTTP is trivial. To filter HTTPS there are two options: 1) filter without sslBump, and then the filter only receives CONNECT endpoint:443, on which it has to make a decision to block or not. This cripples the filter since it does not have access to the content and in many cases cannot detect which application sends what (type of) data. An additional drawback is that the connection can be blocked but an understandable error message cannot be presented to the end user. I believe this is already supported. Yes. Technically it works, but the issue of not being able to give the end user a different error than 'cannot connect to server' is annoying to users. 2) use sslBump. The filter will receive CONNECT endpoint:443 as well as https://endpoint/path (and content for RESPMOD) for SSL+HTTP based connections, so this is optimal for filtering SSL+HTTP connections. The discussion was largely about what to do with data streams that are not SSL+HTTP. This can be any protocol encapsulated by SSL or simply any protocol.
To be able to filter all data, Squid needs a modification to present raw data about the non-SSL+HTTP data streams to a filter (URL redirector or ICAP). or eCAP. I read about eCAP, but when I decided to make a new URL filter (I already wrote ufdbGuard, a URL redirector), I decided on ICAP since it is more widespread and eCAP has not yet matured. My new ICAP server (no better name yet than ufdbicapd) is multithreaded, loads a 200 MB URL database into memory, and is not that straightforward to put inside Squid as a loadable module. I do not want to judge eCAP since I know little about it, also because there is not that much documentation. I think I will look at it again to see if a hybrid solution is feasible. To keep the discussion focussed on one type of filter I will assume that an ICAP server is used as the filter. The ICAP protocol has considerable overhead (CPU processing) and extending the ICAP protocol for data stream filtering is not the first choice. Amos and Henrik were optimistic about implementing a new pipe filter. The data streams of a bidirectional pipe behave differently from HTTP and SSL+HTTP: both client and server can send data at any time, and for some protocols the server initiates while for others the client initiates. OpenVPN is a chameleon and can pretend to be an SSL+HTTP server but is also a VPN server. In all cases where Squid sends a request to a filter, it would be a *big* plus if it informs the filter of what it already knows about the CONNECT endpoint, e.g. whether it uses SSL/TLS or not. Since sslBump is being rewritten for 3.3 it is a good opportunity to make Squid suitable for filtering *all* data streams. Sure, although please keep in mind that the bump-server-first and certificate mimicking code is pretty much complete. We are going through beta testing and code polishing cycles now. I hope I would not have to rewrite a lot of stuff that already works! well, always good to hear that a project is almost done.
The new sslBump flow could be something like this: A) open a socket to the server. If error, close the socket to the client. If there is an error, bump-ssl-server-first returns an error to the client, after establishing a secure connection with it. Closing the connection can sometimes be a good option as well, of course. Yeah, this depends on the error. When Squid cannot make a connection to the server, it could simply close the socket to the client. Just an idea. But doing a full handshake with a client and giving a user-friendly error message is very nice. B) do the logic for ICAP REQMOD CONNECT endpoint:443 Bump-ssl-server-first does not change the order of ICAP processing and server connection establishment. And it would be wrong to change it, IMO. In other words, your (B) should come before (A) because (B) may change where we are connecting or even prohibit the CONNECT request (among other things): 1. Receive CONNECT. 2. Authenticate/etc. 3. Adapt/redirect/etc. 4. Bump. You are right. I totally forgot about the REQMOD post-cache vectoring point and what I
Re: filtering HTTPS
Henrik Nordström wrote: tis 2012-03-13 klockan 19:27 -0300 skrev Marcus Kool: Squid is not the tool for filtering non-http(s) traffic beyond the requested hostname. I agree. Squid is not. This task is for the URL rewriters and ICAP servers. One way or another, Squid should offer all data that passes through it (1) to a filter. I like ICAP, but ICAP is designed for HTTP and not HTTPS and certainly not for non-HTTP, non-HTTPS data streams. non-HTTP traffic does not fit URLs or ICAP either. How would you map an SSH session? Sorry, I know virtually nothing about the internals of Squid, so how to map it... I don't know. The only thing that I can say at this moment is that Squid should give a filter the opportunity to inspect the content. If for whatever reason Squid cannot provide the content of a data stream, it should at least signal the filter that it does a CONNECT to a non-SSL+HTTP address so that the filter can probe/analyse/decide what to do. A filter pipe is interesting. The question is how to implement it. ICAP has no support for it and in my opinion ICAP should be extended to support this. I know it is a long way to extend existing protocols but maybe it works by just doing it and making it a de facto standard. ICAP is designed for HTTP and is very message-at-a-time oriented, treating request and response as separate entities. For HTTP(s) it does support piped operation where the request/response is being filtered as it is being forwarded. It is only a matter of the ICAP server starting its response before the whole request/response has been seen. But I do not think ICAP is suitable for general data stream filtering/adaptation. The protocol is simply not designed for it. In a data stream filter/adaptation you want to operate on the bidirectional data stream as a whole. Regards Henrik
Re: filtering HTTPS
Tsantilas Christos wrote: On 03/13/2012 05:12 PM, Marcus Kool wrote: Henrik Nordström wrote: And if both sides are monitored for traffic then detection does not need to rely on a timeout. If any message is seen from the server, or if something that does not look like an ssl hello is seen from the client, then enter tunnel mode. There is still one 'but': non-http protocols over ssl/tls, not just CONNECT but actual ssl/tls. Those need an ssl/tls tunnel mode where the application protocol is tunneled between the client and server ssl connections. And maybe a dynamic ssl-bump blacklist. Where does the filtering get involved? Also NoneSSL sites (aka tunnel mode) need to be filtered/blocked and/or scanned for viruses. Is it a good idea to try filtering any(?) protocol (e.g. skype, streaming servers etc.) using HTTP proxies and the ICAP protocol implemented to filter HTTP content? Yes. Skype is not just a simple chat. It does file transfers and remote desktop viewing. There are lots of sites that block Skype and allow ebuddy. Others only allow Yahoo IM and block all other chats. It is not up to us to decide what can be blocked. That is up to the administrator of Squid and the filters. If Squid filters 95% but intentionally does not filter some type of data, you will have in no time a new application that uses this unfiltered type of data to build a tunnel circumventing all filters. An sslbump whitelist is probably desired as well, skipping ssl/tls verification if it is already known the server is an https server. A whitelist has a security issue: www.mybank.com can be safe today and hacked tomorrow. I agree with Henrik here. The whitelist is a list saying that sslbump cannot be used for some sites. There was some confusion about what is meant by 'whitelist'. Another thread clarified this. I agree with a cache for already verified endpoints, but be careful: OpenVPN uses a trick to divert HTTPS traffic to a webserver while the other data streams are used for the VPN. Skipping certificate verification is unsafe.
One should be extremely careful about skipping it. A certificate cache seems better: one caches the certificates of www.mybank.com, and on the next CONNECT (the SSL handshake has to be done anyway) Squid can bypass the certificate checking rules if the certificates sent were used in previous CONNECTs. This is a security issue. The server certificate may change for many reasons, e.g. when it is considered unsafe because of a bad private/public key. You should always check the server certificate. One does not need to re-check if a new connection receives the same certificates. I see, for example, thousands of CONNECTs in a short time to http://plus.google.com One user on one webpage can cause several CONNECTs. I think it is safe to use a time-limited cache. And maybe also a CONNECT cache, so that Squid remembers to go into tunnel mode directly without trying to do an SSL handshake for every Skype connection.
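A time-limited cache keyed on the presented certificate itself would satisfy both concerns: a changed certificate misses the cache and is re-verified, while repeated CONNECTs presenting the same certificate skip re-verification until the TTL expires. A minimal sketch (illustrative, not Squid's design; the TTL value is arbitrary):

```python
import hashlib
import time

class CertCache:
    """Illustrative time-limited cache of already-verified server
    certificates (a sketch of the idea above, not Squid's design).
    Keyed by a digest of the presented certificate, so a changed
    certificate misses the cache and is verified again."""

    def __init__(self, ttl_seconds: int = 600):   # TTL value is arbitrary
        self.ttl = ttl_seconds
        self.expiry = {}                          # digest -> expiry time

    def _key(self, der_cert: bytes) -> str:
        return hashlib.sha256(der_cert).hexdigest()

    def mark_verified(self, der_cert: bytes) -> None:
        self.expiry[self._key(der_cert)] = time.time() + self.ttl

    def is_verified(self, der_cert: bytes) -> bool:
        return self.expiry.get(self._key(der_cert), 0) > time.time()

cache = CertCache()
cache.mark_verified(b"<DER bytes of the www.mybank.com certificate>")
print(cache.is_verified(b"<DER bytes of the www.mybank.com certificate>"))  # True
print(cache.is_verified(b"<a different certificate>"))                      # False
```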
Re: filtering HTTPS
Tsantilas Christos wrote: Issue 1: one cannot block a CONNECT in an elegant way. I.e. a CONNECT to an undesired site cannot be redirected or anything, since the application (possibly a browser) wants to do an SSL handshake, and if it fails it displays the vague error 'cannot connect to site www.example.com', which is indeed vague for an end user who usually only understands messages like 'you are not authorised to go to www.example.com'. For true SSL+HTTP (https) sites, issue 1 can be resolved by *not* blocking the CONNECT and waiting for the next 'GET https://www.example.com/index.html' and blocking/redirecting this object. Let's call this a 'postponed SSL+HTTP block'. But for sites which do not use SSL+HTTP there is no good solution, since Squid and the URL redirector only see a CONNECT and never see a GET/HEAD/POST. I think you are describing the Bump-Server-First feature which is currently under development: http://wiki.squid-cache.org/Features/BumpSslServerFirst I read about this feature before posting. The feature description only talks about CONNECT to an SSL+HTTP endpoint, while I would like to extend the scope to CONNECT to any endpoint. The data protocols that Squid can encounter are endless: - SSL+HTTP - SSL+ANYTHING - ANYTHING Issue 2: Skype does not work any more with sslBump. SSH tunnels, VPNs and other chat applications also stop working with sslBump, since the sslBump feature does its SSL certificate checking and if this fails, the CONNECT fails. Using the options 'sslproxy_cert_error allow all' and 'sslproxy_flags DONT_VERIFY_PEER' is not considered useful since they are truly very unsafe and I recommend never using them. A way is to use ACLs to select sites and maybe applications which can be sslBumped or not. Eh, yeah. Not so easy. I do not much like that Squid administrators must make these lists. Not all admins have sufficient knowledge or complete understanding of all pitfalls and risks. ufdbGuard probes CONNECT endpoints and caches the result.
It has a table of known Skype login servers, but there are many Skype nodes (Skype users) that can only be detected dynamically. For a CONNECT to a Skype endpoint, ufdbGuard has the knowledge to signal Squid to sslBump or not. And if the configuration says to block Skype, ufdbGuard can signal Squid not to bother with an SSL handshake and to terminate the connection. But let's not focus on just Skype; there is a growing number of applications that use CONNECT, and they may use any protocol. More background information: the URL redirector ufdbGuard has a feature to probe HTTPS connections. It does an SSL handshake; if this works, it is followed by a 'GET / HTTP/1.0'. If the SSL handshake does not work, it probes for SSH, Skype and other chat protocols to find out what the application CONNECTs to. ufdbGuard can block CONNECTs to IP addresses but make exceptions for the CONNECTs which are used by allowed chat protocols. SSH and VPNs are blocked by ufdbGuard if the administrator has configured it to block proxies. HTTPS is used more and more. Even Google uses it for their search engine. It is necessary to have a safe HTTPS proxy with content filtering done in an absolutely safe and efficient way. Proposal: to have a good combination of web proxy and content filtering, I propose the following: A) Squid's behaviour is modified for sslBump: after an unsuccessful SSL handshake, the CONNECT does not fail any more by default. This is to ensure that Skype et al. remain functional. Correct. This behaviour is already partially implemented under the Bump-Server-First feature. I am saying partially because it currently works only with the HTTPS protocol (it requires that both the client-to-squid and squid-to-server connections support SSL) but can easily be extended to support other protocols. It is easy to extend Bump-Server-First to not initiate an SSL connection with the client if the server is not an SSL server.
But again, why not use ACLs to avoid applying sslBump to applications like skype? What ACL do you have in mind? B) Squid gets a new option to define its behaviour in case the SSL handshake fails. The option could be called sslBumpForNoneSSL with values prohibitNoneSSL (terminate connection), passNoneSSL (always allow), filterNoneSSL (default value - let the ICAP server or URL rewriter decide). Yep, it looks like you are right here, something like that is required... C) Squid notifies the URL rewriter and ICAP server about the result of the SSL handshake. This is to optimise the filters and not do things twice. Web servers do not like probes and may temporarily block sites that use Squid if they receive too many probes, so the fewer probes the better. I.e. the line sent to the URL redirector is extended with a new flag like SSLhandshake=(verified|noSSL). This should not break existing URL redirectors since the line already has the variable-length urlgroup and most URL redirectors will consider the new flag part of the urlgroup. Probably a few URL redirectors need a minor modification.
Re: filtering HTTPS
On 03/14/2012 01:33 AM, Amos Jeffries wrote: It does. http://www.squid-cache.org/Doc/config/icap_206_enable/ The 206 responses are similar to 204 responses (inside or outside preview) but also allow modifying the headers or the head of the data. Data streams come in parts. Maybe a filter wants to see the first data chunk from the client, followed by the first data chunk from the server, followed by the second data chunk from the client, to finally decide: block (close sockets) or say 'I am not interested anymore'. So the filter receives all data chunks of the data stream until it signals the proxy about its decision. For all chunks, when there is not yet a decision, the filter needs to respond with something like Continue. The ICAP protocol is not able to handle such cases. Just extending the ICAP protocol is not enough. Also, my opinion is that an HTTP proxy is not the correct tool to handle this type of filtering... Maybe it can be implemented in squid, but it requires a completely new interface/module. You cannot just extend the ICAP/eCAP filtering subsystems. Yes, I understood from Henrik's reply that his thoughts go to a new type of data stream filter. There is no industry standard for filtering data streams. So there is an important decision to make: extend an existing standard, or make a new protocol that only works between Squid on one hand and ufdbGuard, my new ICAP server and possibly a few other tools on the other.
Re: filtering HTTPS
sslBumpForNoneSSL with values prohibitNoneSSL (terminate connection), passNoneSSL (always allow), filterNoneSSL (default value - let the ICAP server or URL rewriter decide). C) Squid notifies the URL rewriter and ICAP server about the result of the SSL handshake. This is to optimise the filters and not do things twice. Web servers do not like probes and may temporarily block sites that use Squid if they receive too many probes, so the fewer probes the better. I.e. the line sent to the URL redirector is extended with a new flag like SSLhandshake=(verified|noSSL). This should not break existing URL redirectors since the line already has the variable-length urlgroup and most URL redirectors will consider the new flag part of the urlgroup. Probably a few URL redirectors need a minor modification. For ICAP, Squid could send a new header called X-Squid-SSLhandshakeResult. D) squid.conf.documented, the wiki and other documentation are updated to say that 'sslproxy_flags DONT_VERIFY_PEER' and 'sslproxy_cert_error allow all' are unsafe and not recommended. E) the option 'squid-uses-ssl-bump' is introduced to ufdbGuard. If set to 'yes' it will not verify the use of proper SSL certificates. If Squid can send the new flag SSLhandshake (URL redirector) or X-Squid-SSLhandshakeResult (ICAP server), the URL redirector and ICAP servers can be optimised further. Marcus Kool
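Because the flag would be appended as a key=value token, an existing redirector that splits the helper line on whitespace could pick it up with a small change. A hypothetical sketch of the parsing (the first four fields follow the classic redirector input line; the SSLhandshake key is only the proposal above, not an existing Squid feature):

```python
def parse_rewriter_line(line: str) -> dict:
    """Hedged sketch of how a URL redirector could tolerate the proposed
    SSLhandshake= flag appended to Squid's helper input line. The first
    four fields follow the classic redirector interface (URL, client,
    ident, method); the SSLhandshake key is only a proposal."""
    fields = line.split()
    request = {"url": fields[0], "client": fields[1],
               "ident": fields[2], "method": fields[3]}
    for token in fields[4:]:
        if "=" in token:                 # e.g. SSLhandshake=verified
            key, _, value = token.partition("=")
            request[key] = value
    return request

req = parse_rewriter_line(
    "https://www.example.com/ 10.0.0.5/- - CONNECT SSLhandshake=verified")
print(req["SSLhandshake"])  # verified
```

A redirector written this way keeps working on lines without the flag, which is the backward-compatibility property claimed above.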
Re: filtering HTTPS
Henrik Nordström wrote: tis 2012-03-13 klockan 12:12 -0300 skrev Marcus Kool: Where does the filtering get involved? Also NoneSSL sites (aka tunnel mode) need to be filtered/blocked and/or scanned for viruses. Squid is not the tool for filtering non-http(s) traffic beyond the requested hostname. I agree. Squid is not. This task is for the URL rewriters and ICAP servers. One way or another, Squid should offer all data that passes through it (1) to a filter. I like ICAP, but ICAP is designed for HTTP and not HTTPS and certainly not for non-HTTP, non-HTTPS data streams. (1) a virus scanner and a URL filter do not need the *whole* data stream. The first max-64K upload and the first max-64K download are most likely sufficient to determine what to do, pass or block. The protocol should have a feature by which the filter is able to tell Squid 'continue with this data stream, but I am not interested in it any more'. But it would be trivial to extend tunnel mode with a filter pipe, both in normal tunnel mode and in SSL relay mode (decrypting and re-encrypting, tunneling between two SSL connections). A filter pipe is interesting. The question is how to implement it. ICAP has no support for it and in my opinion ICAP should be extended to support this. I know it is a long way to extend existing protocols but maybe it works by just doing it and making it a de facto standard. The question is what works best: A) use extended ICAP for regular HTTP(S) and data streams B) use ICAP for regular HTTP(S) and a new data stream protocol for data streams Regards Henrik
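The 'first max-64K plus a not-interested signal' idea can be sketched as a small stateful chunk filter. Names and verdict strings here are hypothetical, since no such protocol exists yet:

```python
def make_stream_filter(limit: int = 64 * 1024):
    """Sketch of the idea above (verdict names are hypothetical): inspect
    interleaved client/server chunks, answer CONTINUE while undecided,
    BLOCK on a match, and NOT-INTERESTED once `limit` bytes have been
    seen without a match, so the proxy can stop copying data."""
    seen = {"total": 0}

    def on_chunk(direction: str, chunk: bytes) -> str:
        if b"forbidden" in chunk:        # made-up blocking rule
            return "BLOCK"
        seen["total"] += len(chunk)
        if seen["total"] >= limit:
            return "NOT-INTERESTED"
        return "CONTINUE"

    return on_chunk

f = make_stream_filter(limit=10)
print(f("client", b"hello"))    # CONTINUE
print(f("server", b"world!"))   # NOT-INTERESTED
```

The NOT-INTERESTED verdict is what keeps the overhead bounded: after the inspection budget is spent, the proxy tunnels the rest of the stream without copying it to the filter.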
Fwd: subscription
Hi, I would like to subscribe to squid-dev. I tried to subscribe on June 27 but got no response. Thanks Marcus Kool Original Message From: - Mon Jun 27 16:29:03 2011 Message-ID: 4e08d9fc.5060...@urlfilterdb.com Date: Mon, 27 Jun 2011 16:29:00 -0300 From: Marcus Kool marcus.k...@urlfilterdb.com User-Agent: Thunderbird 2.0.0.24 (X11/20110404) MIME-Version: 1.0 To: squid-dev@squid-cache.org Subject: subscription I am Marcus Kool, author of ufdbguardd, a URL filter for Squid, and author of a recent patch for Squid for Regular Expression Optimisation. I am also writing a new URL filter based on ICAP and use Squid for testing. I would like to be part of squid-dev. The main reasons for wanting to join squid-dev are to monitor issues with url_rewriter and ICAP and to talk directly to the developers of Squid for questions regarding the ICAP module. I do not intend to write much code for the Squid project. Thanks Marcus
[PATCH] regular expression optimisation patch for squid 3.1.14
Attached is a patch for the optimisation of REs. This is the second submission of the patch; the comments from Amos' review are addressed. This patch is inspired by the work that I did for ufdbGuard and a few emails with Amos. The new code optimises lists of regular expressions. The optimisations are: * an initial .* is stripped * RE-1 RE-2 ... RE-n are joined into one large RE: (RE-1)|(RE-2)|...|(RE-n) * -i ... -i options are optimised: the second one is ignored, same for +i The only modified file is src/acl/RegexData.cc. Attached are the patch (RegexData.cc.patch) and files for a unit test: squidtest.conf re.4lines - used in squidtest.conf; contains REs re.200lines - used in squidtest.conf; contains REs unittest_re_optim_wget - script with wget commands to trigger squid to evaluate REs unittest_re_optim_wget contains instructions on how to set up and perform a unit test I tried to become a member of the squid-dev mailing list but am not yet, so comments should also go to my email address directly. Marcus Kool Marcus Kool wrote: Amos Jeffries wrote: Amos Jeffries wrote: Hi Marcus, Did my audit feedback on this make it to you? I've just noticed my mailer has not marked the thread as responded. On 01/07/11 00:52, Marcus Kool wrote: No, it did not. Okay. My mailer seems to have screwed up badly. There were a few little minor bits. * the patch being reversed. Just order the files the other way around in the next patch. compileOptimisedREs/compileUnoptimisedREs have duplicate code checking for the (RElen > BUFSIZ+1) case on the wordlist key. They are already checked for that criterion by aclParseRegexList before adding. The debugs() WARNING to the user should be DBG_IMPORTANT in the second parameter. The major problem: debugs() needs DBG_CRITICAL in parameter #2 and ERROR: instead of the function name. The 100 messages only need to be shown when checking the config for problems, i.e. debugs(28, (opt_parse_cfg_only?DBG_IMPORTANT:2), Thanks for the feedback, I will make a new patch.
I was not able to do it in time for the next releases, but it will be done soon.

No one else has mentioned anything, so with these style tweaks it can go in. The next releases are planned to happen tomorrow. If you want to submit a new patch in the next 12hrs I'll use that.

I tried to subscribe to the squid-dev mailing list the other day but got no reply yet. And in the list archives I did not see any response/feedback either.

I saw that arrive. So whoever was moderating this week appears to have okayed you for posting. If you went through the regular ezmail subscription process (mail to squid-dev-subscr...@squid-cache.org) you should have been receiving list mail for a few days?

I have not yet received emails from squid-dev. Should I resend the application?

Amos
Marcus
patch-RE-optimisation-squid-3-1-14.tar.gz Description: GNU Zip compressed data
subscription
I am Marcus Kool, author of ufdbGuard, a URL filter for Squid, and author of a recent patch for Squid for regular expression optimisation. I am also writing a new URL filter based on ICAP and use Squid for testing. I would like to be part of squid-dev. The main reasons for wanting to join squid-dev are to monitor issues with url_rewriter and ICAP and to talk directly to the developers of Squid about questions regarding the ICAP module. I do not intend to write much code for the Squid project. Thanks, Marcus
[PATCH] regular expression optimisation patch for squid 3.1.12
This patch is inspired by the work that I did for ufdbGuard and a few emails with Amos. Attached is a patch for squid 3.1.12 to optimise lists of regular expressions. The optimisations are:
* an initial .* is stripped
* RE-1 RE-2 ... RE-n are joined into one large RE: (RE-1)|(RE-2)|...|(RE-n)
* -i ... -i options are optimised: the second one is ignored; the same goes for +i
The only modified file is src/acl/RegexData.cc. Attached are the patch (RegexData.cc.patch) and files for a unit test:
* squidtest.conf
* re.4lines - used in squidtest.conf; contains REs
* re.200lines - used in squidtest.conf; contains REs
* unittest_re_optim_wget - script with wget commands to trigger squid to evaluate REs; it also contains instructions on how to set up and perform a unit test
I am not subscribed to the squid-dev mailing list. Please reply to my email address also.
Marcus Kool
marcus.k...@urlfilterdb.com

Amos Jeffries wrote:
On 01/06/11 09:18, Marcus Kool wrote:
Hi, after some emails with Amos I agreed to make a patch for squid to optimise lists of regular expressions. The optimisations are: an initial .* is stripped; RE-1 RE-2 ... RE-n are joined into one large RE: (RE-1)|(RE-2)|...|(RE-n); -i ... -i options are optimised: the second one is ignored, same for +i. The only modified file is src/acl/RegexData.cc.
My question for submitting the patch: how do you want the patch? Is the output of the following command OK?
LC_ALL=C TZ=UTC0 diff -Naur src/acl/RegexData.cc src/acl/RegexData.cc.orig
That should be fine.
I used a test set: a squid.conf, two files with regular expressions and a file with wget commands to test URLs. Do you want/need these?
That would be helpful for unit-tests. So yes, thank you.
How do I post the patch?
As an attachment please, with a [PATCH] subject prefix and a description suitable for a commit message, from an email address you are happy adding permanently to the credits records.
I am not subscribed to the squid-dev mailing list. Please reply to my email address also.
Thanks Marcus Kool
Amos

[attachment: test regular expressions for the unit test (re.4lines / re.200lines), e.g.:
abc.com
urlfilterdb.com/secret
xs4all.nl/verysecret
cnn.com/public
-i abc.example.com/scripts/cgi-bin/40example.cgi
-i foo\.example\.com/html/index\.php
01john.*doe.example.com/.*/index.php
foo\.example\.com/html/another-very-long-url-to-test-buffers-of-the-re-optimisation-algorithm\.php
... — the remaining ~200 similar numbered test entries are truncated in the archive]
regular expression optimisation patch
Hi, after some emails with Amos I agreed to make a patch for squid to optimise lists of regular expressions. The optimisations are:
* an initial .* is stripped
* RE-1 RE-2 ... RE-n are joined into one large RE: (RE-1)|(RE-2)|...|(RE-n)
* -i ... -i options are optimised: the second one is ignored; the same goes for +i
The only modified file is src/acl/RegexData.cc.
My question for submitting the patch: how do you want the patch? Is the output of the following command OK?
LC_ALL=C TZ=UTC0 diff -Naur src/acl/RegexData.cc src/acl/RegexData.cc.orig
I used a test set: a squid.conf, two files with regular expressions and a file with wget commands to test URLs. Do you want/need these? How do I post the patch?
I am not subscribed to the squid-dev mailing list. Please reply to my email address also.
Thanks,
Marcus Kool
debugging Squid ICAP interface
Hello, my name is Marcus Kool, author of ufdbGuard - a URL redirector for Squid - and I have started development of an ICAP-based URL filter. As with all new developments, the code of the ICAP server undoubtedly has some bugs that need to be investigated and fixed. I have also seen Squid behaving unexpectedly (2-minute timeouts during which it does not seem to handle any requests from a browser, and an assertion failure). I have various observations and questions about the Squid ICAP interface and would like to discuss these with the people who wrote or know much about the ICAP client part of Squid. I would like to know with whom I can discuss this and which mailing list to use. Thanks, Marcus