Hi,

I am using Squid 4.5 with WCCP. Intercepting SSL by peeking at step1 and then 
deciding to either splice or bump upon the SNI.
I am noticing a weird behavior for some of my TCP connections.  Squid is taking 
over 20s to decide what do with the ClientHello sent by the browser. It is only 
after 20s that it decides to send out a ClientHello to the origin server and at 
the same time reply to the client with a ServerHello.
This behavior is hard to reproduce and only some clients are affected.

I will try to summarize what I see in cache.log with ALL, 6 debug options.


1)      Squid's INTERCEPTION thread/program receives a TCP SYN from workstation.
2019/02/06 17:23:19.070 kid1| 89,5| Intercept.cc(405) Lookup: address BEGIN: 
me/client= <SQUID_IP>:23129, destination/me= <CLIENT/BROWSER_IP>:58232


2)      Squid becomes the origin server and sets up the TCP connection.
2019/02/06 17:23:19.070 kid1| 5,5| AsyncCall.cc(93) ScheduleCall: 
TcpAcceptor.cc(339) will call 
httpsAccept(local=<ORIGIN_SERVER_ON_THE_INTERNET>:443 
remote=<CLIENT/BROWSER_IP>:58232 FD 40 flags=33, MXID_1101703) [call34733258]
2019/02/06 17:23:19.070 kid1| 5,5| AsyncCall.cc(38) make: make call httpsAccept 
[call34733258]
       2019/02/06 17:23:19.070 kid1| 33,4| client_side.cc(2776) httpsAccept: 
local=<ORIGIN_SERVER_ON_THE_INTERNET>:443 remote=<CLIENT/BROWSER_IP>:58232 FD 
40 flags=33 accepted, starting SSL negotiation.


3)      Squid checks the SSL ACLs for the destination IP.
2019/02/06 17:23:19.071 kid1| 28,5| Acl.cc(124) matches: checking (ssl_bump 
rules)
2019/02/06 17:23:19.071 kid1| 28,5| Checklist.cc(397) bannedAction: Action 
'ALLOWED/6' is not banned
2019/02/06 17:23:19.071 kid1| 28,5| Acl.cc(124) matches: checking (ssl_bump 
rule)
2019/02/06 17:23:19.071 kid1| 28,5| Acl.cc(124) matches: checking 
no_ssl_bump_src_ip
2019/02/06 17:23:19.071 kid1| 28,3| Ip.cc(538) match: aclIpMatchIp: 
'<CLIENT/BROWSER_IP>:58232' NOT found
2019/02/06 17:23:19.071 kid1| 28,3| Acl.cc(151) matches: checked: 
no_ssl_bump_src_ip = 0


4)      Squid decides to allow connections to the remote IP i.e 
<ORIGIN_SERVER_ON_THE_INTERNET> and decides to peek at the SNI (will accept 
ClientHello), hence fakes a CONNECT request
2019/02/06 17:23:19.071 kid1| 28,3| Checklist.cc(163) checkCallback: 
ACLChecklist::checkCallback: 0x1fb19fd8 answer=ALLOWED
2019/02/06 17:23:19.071 kid1| 33,2| client_side.cc(2744) 
httpsSslBumpAccessCheckDone: sslBump action peekneeded for 
local=<ORIGIN_SERVER_ON_THE_INTERNET>:443 remote=<CLIENT/BROWSER_IP>:58232 FD 
40 flags=33
2019/02/06 17:23:19.071 kid1| 33,2| client_side.cc(3395) fakeAConnectRequest: 
fake a CONNECT request to force connState to tunnel for ssl-bump


5)      The FAKE connect requests again runs through ACLs.  20ms are spent for 
DNS PTR lookup. A total of 30ms is spent parsing ACLs.
2019/02/06 17:23:19.103 kid1| 85,2| client_side_request.cc(758) 
clientAccessCheckDone: The request CONNECT <ORIGIN_SERVER_ON_THE_INTERNET> is 
ALLOWED; last ACL checked: localnet
2019/02/06 17:23:19.103 kid1| 93,4| AccessCheck.cc(145) checkCandidates: NO 
candidates left
2019/02/06 17:23:19.103 kid1| 93,3| AccessCheck.cc(196) callBack: NULL
2019/02/06 17:23:19.103 kid1| 93,5| AsyncCall.cc(26) AsyncCall: The AsyncCall 
Adaptation::Initiator::noteAdaptationAclCheckDone constructed, this=0x1552b0b0 
[call34733267]


6)      FAKE connect is processed and Squid reads the TCP connection, gets the 
ClientHello and reads the SNI.
2019/02/06 17:23:19.104 kid1| 33,5| AsyncCall.cc(38) make: make call 
Server::doClientRead [call34733270]
2019/02/06 17:23:19.104 kid1| 33,5| AsyncJob.cc(123) callStart: Http1::Server 
status in: [ job2971436]
2019/02/06 17:23:19.104 kid1| 33,5| Server.cc(104) doClientRead: 
local=<ORIGIN_SERVER_ON_THE_INTERNET>:443 remote=<CLIENT/BROWSER_IP>:58232 FD 
40 flags=33
2019/02/06 17:23:19.104 kid1| 5,3| Read.cc(92) ReadNow: 
local=<ORIGIN_SERVER_ON_THE_INTERNET>:443 remote=<CLIENT/BROWSER_IP>:58232 FD 
40 flags=33, size 4096, retval 203, errno 0
2019/02/06 17:23:19.104 kid1| 33,5| AsyncCall.cc(26) AsyncCall: The AsyncCall 
ConnStateData::requestTimeout constructed, this=0x3d42d40 [call34733271]
2019/02/06 17:23:19.104 kid1| 5,3| comm.cc(559) commSetConnTimeout: 
local=<ORIGIN_SERVER_ON_THE_INTERNET>:443 remote=<CLIENT/BROWSER_IP>:58232 FD 
40 flags=33 timeout 300
2019/02/06 17:23:19.104 kid1| 83,5| Handshake.cc(404) parseExtensions: first 
unsupported extension: 19018
2019/02/06 17:23:19.104 kid1| 83,3| Handshake.cc(497) parseSniExtension: 
host_name=<ORIGIN_SERVER_SNI>


7)      This is followed by another round of ACL processing now that we have 
the SNI.

2019/02/06 17:23:19.106 kid1| 33,3| Pipeline.cc(35) front: Pipeline 0xc2d7b60 
front 0x1ae2e730*2
2019/02/06 17:23:19.106 kid1| 33,3| Pipeline.cc(35) front: Pipeline 0xc2d7b60 
front 0x1ae2e730*3
2019/02/06 17:23:19.107 kid1| 83,5| Session.cc(103) NewSessionObject: SSL_new 
session=0x78136a0
2019/02/06 17:23:19.107 kid1| 83,5| bio.cc(616) squid_bio_ctrl: 0xcd0fe80 
104(6000, 0x7ffc32a6e6b4)
2019/02/06 17:23:19.107 kid1| 83,5| Session.cc(162) CreateSession: link FD 40 
to TLS session=0x78136a0
2019/02/06 17:23:19.107 kid1| 33,5| client_side.cc(2535) httpsCreate: will 
negotiate TLS on local=<ORIGIN_SERVER_ON_THE_INTERNET>:443 
remote=<CLIENT/BROWSER_IP>:58232 FD 40 flags=33
2019/02/06 17:23:19.107 kid1| 5,5| ModEpoll.cc(117) SetSelect: FD 40, type=1, 
handler=0, client_data=0, timeout=0
2019/02/06 17:23:19.107 kid1| 83,5| client_side.cc(3324) startPeekAndSplice: 
Peek and splice at step2 done. Start forwarding the request!!!
2019/02/06 17:23:19.107 kid1| 17,3| FwdState.cc(340) Start: 
'<ORIGIN_SERVER_ON_THE_INTERNET>:443'



8)      No ServerHello has been sent back to the client yet, Squid starts a TCP 
connection with the origin server
2019/02/06 17:23:19.110 kid1| 5,4| AsyncJob.cc(123) callStart: Comm::ConnOpener 
status in: [ job2971439]
2019/02/06 17:23:19.110 kid1| 5,5| ConnOpener.cc(350) doConnect: local=0.0.0.0 
remote=<ORIGIN_SERVER_ON_THE_INTERNET>:443 flags=1: Comm::OK - connected
2019/02/06 17:23:19.110 kid1| 5,4| ConnOpener.cc(155) cleanFd: local=0.0.0.0 
remote=<ORIGIN_SERVER_ON_THE_INTERNET>:443 flags=1 closing temp FD 50


9)      Squid starts a TLS session with the remote/origin server, sends the 
ClientHello. A total of 0.4 seconds in Squid sending clienthello to origin 
server. This is probably when Squid decides to send back the ServerHello to the 
browser.
2019/02/06 17:23:19.110 kid1| 83,5| Session.cc(103) NewSessionObject: SSL_new 
session=0x14899390
2019/02/06 17:23:19.111 kid1| 83,5| bio.cc(616) squid_bio_ctrl: 0x1492ef80 
104(6001, 0x7ffc32a6e884)
2019/02/06 17:23:19.111 kid1| 83,5| Session.cc(162) CreateSession: link FD 50 
to TLS session=0x14899390
2019/02/06 17:23:19.111 kid1| 83,5| PeerConnector.cc(123) initialize: 
local=<SQUID_IP>:44498 remote=<ORIGIN_SERVER_ON_THE_INTERNET>:443 FD 50 
flags=1, session=0x14899390

So somewhere between Step 8 and Step 9, Squid is taking over 20s.

What could possibly be keeping it busy ?
I have external ACL helpers but they work just fine. Average service time is 
1ms. Squid has not even spawning all helpers that it has been configured to do. 
(not exhausted the upper limit).
DNS resolution is also good. All CPU/MEM resources look just fine and again 
this affects only a subset of the traffic.  I don't have the failure logs from 
when this actually happens.


UPDATE: This problem statement seems local to a few websites. Outside of the 
proxy, those websites quite quickly as is expected.

Any thoughts on where to look ?  other bits and pieces I could check ?  I have 
jumbo frames enabled (9000 bytes) but am running the proxies at L2 1500 MTU.

-Ahmad



_______________________________________________
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users

Reply via email to