I think I've found the issue, or at least a clue to what is going on.

2018/10/15 15:05:45.083 kid1| RegexData.cc(71) match: aclRegexData::match: checking 'wiki.squid-cache.org:443'
2018/10/15 15:05:45.084 kid1| RegexData.cc(82) match: aclRegexData::match: looking for '(^ https://wiki.squid-cache.org/SquidFaq/SquidAcl.*)|(squid-cache.org/SquidFaq/SquidAcl.* )'
2018/10/15 15:05:45.084 kid1| Acl.cc(321) checklistMatches: ACL::ChecklistMatches: result for 'whitelist' is 0

The above seems to be applying the regex to "wiki.squid-cache.org:443" instead of to "https://wiki.squid-cache.org/SquidFaq/SquidAcl".

I added the regex ".*squid-cache.org.*" to my list of regular expressions and now I see this.

2018/10/15 15:16:03.641 kid1| RegexData.cc(71) match: aclRegexData::match: checking 'wiki.squid-cache.org:443'
2018/10/15 15:16:03.641 kid1| RegexData.cc(82) match: aclRegexData::match: looking for '(^https?://[^/]+/ wiki.squid-cache.org/SquidFaq/SquidAcl.*)|(squid-cache.org.*)'
2018/10/15 15:16:03.641 kid1| RegexData.cc(93) match: aclRegexData::match: match '(^https?://[^/]+/ wiki.squid-cache.org/SquidFaq/SquidAcl.*)|(squid-cache.org.*)' found in ' wiki.squid-cache.org:443'
2018/10/15 15:16:03.641 kid1| Acl.cc(321) checklistMatches: ACL::ChecklistMatches: result for 'whitelist' is 1

Any idea why url_regex wouldn't try to match the full URL and instead only matches on the subdomain, host domain, and port?

The Squid FAQ <https://wiki.squid-cache.org/SquidFaq/SquidAcl> says the following:

*url_regex*: URL regular expression pattern matching
*urlpath_regex*: URL-path regular expression pattern matching, leaves out the protocol and hostname

with this example given

acl special_url url_regex ^http://www.squid-cache.org/Doc/FAQ/$

This seems to be the case between 3.3.8 (default on Ubuntu 14.04) and 3.5.12 (default on Ubuntu 16.04).

Is there another configuration that forces url_regex to match the entire URL? Or should I use a different acl type?

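To make the mismatch concrete, here is a minimal Python sketch of what I think is happening. It is not Squid code: Python's `re` module stands in for Squid's regex matching, and the helper name `matches_whitelist` is made up for illustration. It applies the patterns from this thread to both the string the debug log shows being checked and the full URL I expected to be checked.

```python
import re

# Patterns taken from the messages in this thread.
STRICT_PATTERNS = [
    r"^https://wiki.squid-cache.org/SquidFaq/SquidAcl.*",
    r".*squid-cache.org/SquidFaq/SquidAcl.*",
]
LOOSE_PATTERN = r".*squid-cache.org.*"


def matches_whitelist(target, patterns):
    """Hypothetical helper: True if any pattern is found anywhere in target."""
    return any(re.search(p, target) for p in patterns)


# What the debug log shows url_regex being checked against.
connect_target = "wiki.squid-cache.org:443"
# What I expected url_regex to be checked against.
full_url = "https://wiki.squid-cache.org/SquidFaq/SquidAcl"

results = {
    "strict vs CONNECT target": matches_whitelist(connect_target, STRICT_PATTERNS),
    "strict vs full URL": matches_whitelist(full_url, STRICT_PATTERNS),
    "loose vs CONNECT target": matches_whitelist(
        connect_target, STRICT_PATTERNS + [LOOSE_PATTERN]
    ),
}

for label, matched in results.items():
    print(f"{label}: {int(matched)}")
```

The strict patterns only match the full URL, while the loose pattern also matches the bare host:port. That lines up with the "result for 'whitelist' is 0" and "is 1" lines in the log, assuming Squid really is only given "wiki.squid-cache.org:443" for this request.
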
Best,

On Mon, Oct 15, 2018 at 11:11 AM RB <[email protected]> wrote:

> Hi Matus,
>
> Thanks for responding so quickly. I uploaded my configurations here if
> that is more helpful: https://bit.ly/2NF4zNb
>
> The config that I previously shared is called squid_corp.conf. I also
> noticed that if I don't use regular expressions and instead use domains, it
> works correctly:
>
> # acl whitelist url_regex "/vagrant/squid_sites.txt"
> acl whitelist url_regex .squid-cache.org
>
>
> Every time my squid.conf or my squid_sites.txt is modified, I restart the
> squid service
>
> sudo service squid3 restart
>
>
> Then I use curl to test and now the url works.
>
> $ curl -sSL --proxy localhost:3128 -D -
> https://wiki.squid-cache.org/SquidFaq/SquidAcl -o /dev/null 2>&1
> HTTP/1.1 200 Connection established
>
> HTTP/1.1 200 OK
> Date: Mon, 15 Oct 2018 14:47:33 GMT
> Server: Apache/2.4.7 (Ubuntu)
> Vary: Cookie,User-Agent,Accept-Encoding
> Content-Length: 101912
> Cache-Control: max-age=3600
> Expires: Mon, 15 Oct 2018 15:47:33 GMT
> Content-Type: text/html; charset=utf-8
>
>
> But this does not allow me to get more granular. I can only allow all
> subdomains and paths for the domain squid-cache.org but I'm unable to
> only allow the regular expressions if I put them inline or put them in
> squid_sites.txt.
>
> # acl whitelist url_regex "/vagrant/squid_sites.txt"
> acl whitelist url_regex ^https://wiki.squid-cache.org/SquidFaq/SquidAcl.*
> acl whitelist url_regex .*squid-cache.org/SquidFaq/SquidAcl.*
>
>
> If I put them inline like I have above, when I restarted squid it says the
> following
>
> 2018/10/15 14:54:48 kid1| strtokFile: .*
> squid-cache.org/SquidFaq/SquidAcl.* not found
>
>
> If I put the expressions in the squid_sites.txt the above "not found"
> message isn't shown and this is the debug output in
> /var/log/squid3/cache.log (full output https://pastebin.com/NVwRxVmQ).
>
> 2018/10/15 15:05:45.083 kid1| Checklist.cc(275) matchNode: 0x7fb0068da2b8
> matched=1 async=0 finished=0
> 2018/10/15 15:05:45.083 kid1| Acl.cc(336) matches: ACLList::matches:
> checking whitelist
> 2018/10/15 15:05:45.083 kid1| Acl.cc(319) checklistMatches:
> ACL::checklistMatches: checking 'whitelist'
> 2018/10/15 15:05:45.083 kid1| RegexData.cc(71) match: aclRegexData::match:
> checking 'wiki.squid-cache.org:443'
> 2018/10/15 15:05:45.084 kid1| RegexData.cc(82) match: aclRegexData::match:
> looking for '(^
> https://wiki.squid-cache.org/SquidFaq/SquidAcl.*)|(squid-cache.org/SquidFaq/SquidAcl.*
> )'
> 2018/10/15 15:05:45.084 kid1| Acl.cc(321) checklistMatches:
> ACL::ChecklistMatches: result for 'whitelist' is 0
> 2018/10/15 15:05:45.084 kid1| Acl.cc(349) matches: whitelist mismatched.
> 2018/10/15 15:05:45.084 kid1| Acl.cc(354) matches: whitelist result is
> false
>
>
> So it's failing the regular expression check. If I use grep to verify if
> the regex works, it does.
>
> $ echo https://wiki.squid-cache.org/SquidFaq/SquidAcl | grep "^
> https://wiki.squid-cache.org/SquidFaq/SquidAcl.*"
> https://wiki.squid-cache.org/SquidFaq/SquidAcl
>
>
> > are you aware that you can only see CONNECT in https requests, unless using
> > ssl_bump?
>
> Ah interesting. Are you saying that my https connections will always fail
> unless I use ssl_bump to decrypt https to http connections? How would this
> work correctly in production? Does squid proxy only block urls if it
> detects http? How do you configure ssl_bump to work in this case? and is
> that viable in production?
>
> > of course it matches all, everything should match "all".
> > I more wonder why doesn't it match "http_access allow localhost"
>
> > have you reloaded squid config after changing it?
> > Did squid confirm it?
>
> Would you have an example of one entire config file that would work to
> whitelist an http/https url using a regular expression?
>
> Best,
>
>
> On Mon, Oct 15, 2018 at 4:49 AM Matus UHLAR - fantomas <[email protected]>
> wrote:
>
>> On 15.10.18 01:04, RB wrote:
>> >I'm trying to deny all urls except for only whitelisted regular
>> >expressions. I have only this regular expression in my file
>> >"squid_sites.txt"
>> >
>> >^https://wiki.squid-cache.org/SquidFaq/SquidAcl.*
>>
>> are you aware that you can only see CONNECT in https requests, unless using
>> ssl_bump?
>>
>>
>> >acl bastion src 10.5.0.0/1
>> >acl whitelist url_regex "/vagrant/squid_sites.txt"
>> [...]
>> >http_access allow manager localhost
>> >http_access deny manager
>> >http_access deny !Safe_ports
>> >http_access allow localhost
>> >http_access allow purge localhost
>> >http_access deny purge
>> >http_access deny CONNECT !SSL_ports
>> >
>> >http_access allow bastion whitelist
>> >http_access deny bastion all
>>
>> >I tried enabling debugging and tailing /var/log/squid3/cache.log but my
>> >curl statement keeps matching "all".
>>
>> of course it matches all, everything should match "all".
>>
>> I more wonder why doesn't it match "http_access allow localhost"
>>
>> >$ curl -sSL --proxy localhost:3128 -D - "
>> >https://wiki.squid-cache.org/SquidFaq/SquidAcl" -o /dev/null 2>&1 | grep
>> >Squid
>> >X-Squid-Error: ERR_ACCESS_DENIED 0
>>
>> >Any ideas what I'm doing wrong?
>>
>> have you reloaded squid config after changing it?
>> Did squid confirm it?
>>
>> --
>> Matus UHLAR - fantomas, [email protected] ; http://www.fantomas.sk/
>> Warning: I wish NOT to receive e-mail advertising to this address.
>> Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
>> It's now safe to throw off your computer.
>> _______________________________________________
>> squid-users mailing list
>> [email protected]
>> http://lists.squid-cache.org/listinfo/squid-users
>>
>
