On 14/12/2011, at 10:33 PM, James Peach wrote:
> Hi all,
>
> I've been looking at an issue where TSFetchUrl does not deliver the EOS event
> to the caller until long after the origin server closes the read socket. I
> made a small change to read_from_net() so that it notices when read(2)
> returns 0 even if it returned > 0 on a previous pass through the loop. This
> lets it deliver the EOS event up to the HTTP state machine.
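>
> Roughly, the idea is this (a hypothetical distillation, not the actual
> patch -- the real read_from_net() drives readv() over MIOBuffer blocks,
> which I've collapsed into a plain read() here, and drain_socket() is a
> name I made up):
>
>   #include <unistd.h>
>   #include <cstdint>
>
>   // Drain the socket, but remember an EOF (read() == 0) even when an
>   // earlier pass already returned data, so the caller can raise EOS on
>   // the same wakeup instead of waiting for the next one.
>   static int64_t drain_socket(int fd, char *buf, int64_t len, bool &saw_eof)
>   {
>     int64_t total = 0;
>     saw_eof = false;
>     while (total < len) {
>       ssize_t r = read(fd, buf + total, len - total);
>       if (r > 0) {
>         total += r;
>         continue;
>       }
>       if (r == 0)
>         saw_eof = true; // peer closed; don't lose this because total > 0
>       break;            // r < 0: EAGAIN/error handling elided
>     }
>     return total;
>   }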
>
> In the end, however, the EOS event is not delivered to the FetchSM. Here's
> what happens …
>
> We receive the response from the origin server:
>
> [Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (http_tunnel) [0]
> producer_handler [http server VC_EVENT_READ_READY]
> [Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (http_tunnel) [0]
> producer_handler_dechunked [http server VC_EVENT_READ_READY]
> [Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (http_chunk) creating a
> chunk of size 3296 bytes
> [Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (http_redirect)
> [HttpTunnel::producer_handler] enable_redirection: [1 0 0] event: 100
> [Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive:
> reenable Write
> [Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (read_from_net)
> attempting to read 32768 bytes
> [Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (read_from_net) read 0
> bytes
> [Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (read_from_net)
> triggering EOS
> [Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (http_tunnel) [0]
> producer_handler [http server VC_EVENT_EOS]
> [Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (http_tunnel) [0]
> producer_handler_dechunked [http server VC_EVENT_EOS]
> [Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (http_redirect)
> [HttpTunnel::producer_handler] enable_redirection: [1 0 0] event: 104
> [Dec 14 20:36:30.192] Server {0x7fff7b5f9960} DEBUG: (http) [0]
> [&HttpSM::tunnel_handler_server, VC_EVENT_EOS]
>
> At this point, we know that the read socket closed and we are trying to
> deliver the EOS event up the stack. Then this happens:
>
> [Dec 14 20:36:31.338] Server {0x7fff7b5f9960} DEBUG: (http_ss) [0] session
> closed
> [Dec 14 20:36:31.362] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive:
> reenable Write
> [Dec 14 20:36:31.362] Server {0x7fff7b5f9960} DEBUG: (pvc_event) [0] Passive:
> Received event 1
> [Dec 14 20:36:31.362] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive:
> process_write_side
> [Dec 14 20:36:31.363] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive:
> process_write_side; act_on 3499
> [Dec 14 20:36:31.386] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive:
> process_write_side; added 3499
> [Dec 14 20:36:31.386] Server {0x7fff7b5f9960} DEBUG: (http_tunnel) [0]
> consumer_handler [user agent VC_EVENT_WRITE_COMPLETE]
> [Dec 14 20:36:31.387] Server {0x7fff7b5f9960} DEBUG: (http) [0]
> [&HttpSM::tunnel_handler_ua, VC_EVENT_WRITE_COMPLETE]
> [Dec 14 20:36:31.387] Server {0x7fff7b5f9960} DEBUG: (http_cs) [0] session
> released by sm [0]
> [Dec 14 20:36:31.387] Server {0x7fff7b5f9960} DEBUG: (http_cs) [0] initiating
> io for next header
> [Dec 14 20:36:31.387] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive:
> do_io_read for -1 bytes
> [Dec 14 20:36:31.387] Server {0x7fff7b5f9960} DEBUG: (http) [0]
> [HttpSM::main_handler, HTTP_TUNNEL_EVENT_DONE]
> [Dec 14 20:36:31.387] Server {0x7fff7b5f9960} DEBUG: (http) [0]
> [&HttpSM::tunnel_handler, HTTP_TUNNEL_EVENT_DONE]
> [Dec 14 20:36:31.387] Server {0x7fff7b5f9960} DEBUG: (http_redirect)
> [HttpTunnel::deallocate_postdata_copy_buffers]
> [Dec 14 20:36:31.396] Server {0x7fff7b5f9960} DEBUG: (http_seq)
> [HttpStateMachineGet::update_stats] Logging transaction
> [Dec 14 20:36:31.396] Server {0x7fff7b5f9960} DEBUG: (http) [0] dellocating sm
>
> So at this point the HttpSM considers the transaction finished; it is
> deallocating itself in HttpSM::kill_this().
>
> [Dec 14 20:36:31.396] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Active:
> process_read_side
> [Dec 14 20:36:31.396] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Active:
> process_read_side; act_on 3499
> [Dec 14 20:36:31.396] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Active:
> process_read_side; added 3499
> [Dec 14 20:36:31.396] Server {0x7fff7b5f9960} DEBUG: (FetchSM)
> [fetch_handler] calling fetch_plugin
> [Dec 14 20:36:31.396] Server {0x7fff7b5f9960} DEBUG: (FetchSM)
> [process_fetch_read] I am here read
> [Dec 14 20:36:31.396] Server {0x7fff7b5f9960} DEBUG: (FetchSM)
> [process_fetch_read] number of bytes in read ready 3499
> [Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Active:
> reenable Read
> [Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive:
> reenable Write
> [Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc_event) [0] Passive:
> Received event 1
> [Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive:
> process_read_side
> [Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive:
> process_read_side; act_on 0
> [Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive:
> closed? 0 shutdown? 0
> [Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Passive:
> process_write_side
> [Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc_event) [0] Active:
> Received event 1
> [Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Active:
> process_read_side
> [Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Active:
> process_read_side; act_on 0
> [Dec 14 20:36:31.397] Server {0x7fff7b5f9960} DEBUG: (pvc) [0] Active:
> closed? 0 shutdown? 0
>
> proxy/PluginVC.cc (around line 624, in process_read_side):
>
>   if (act_on <= 0) {
>     Debug("pvc", "[%u] %s: closed? %u shutdown? %u", PVC_ID, PVC_TYPE,
>           other_side->closed, other_side->write_state.shutdown);
>     if (other_side->closed || other_side->write_state.shutdown) {
>       read_state.vio._cont->handleEvent(VC_EVENT_EOS, &read_state.vio);
>     }
>     return;
>   }
>
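> In other words, the read side only hands VC_EVENT_EOS to its continuation
> once the *other* half of the pair is closed or write-shutdown. A compilable
> distillation of that gate (names mirror the excerpt, but HalfVC is a toy
> model of mine, not the real class):
>
>   #include <cstdint>
>
>   struct HalfVC {
>     bool closed = false;
>     bool write_shutdown = false;  // stands in for write_state.shutdown
>     HalfVC *other_side = nullptr;
>
>     // Mirrors the act_on <= 0 branch quoted above.
>     bool should_deliver_eos(int64_t act_on) const {
>       return act_on <= 0 &&
>              (other_side->closed || other_side->write_shutdown);
>     }
>   };
>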
> So the PluginVC does not think its other_side is closed (because no one has
> called do_io_close()?). But eventually an HttpClientSession timeout fires,
> which closes the right PluginVC and wakes it. Now the FetchSM is kicked and
> the TSFetchUrl event triggers:
>
> [Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (pvc_event) [0] Passive:
> Received event 2
> [Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (http_cs) [0]
> [&HttpClientSession::state_keep_alive, VC_EVENT_ACTIVE_TIMEOUT]
> [Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (pvc) [0] Passive:
> do_io_close
> [Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (http_cs) [0] session closed
> [Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (http_cs) [0] session
> destroy
> [Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (pvc) [0] Passive:
> process_close
> [Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (pvc_event) [0] Active:
> Received event 1
> [Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (pvc) [0] Active:
> process_read_side
> [Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (pvc) [0] Active:
> process_read_side; act_on 0
> [Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (pvc) [0] Active: closed? 1
> shutdown? 0
> [Dec 14 20:38:26.380] Server {0x102602000} DEBUG: (FetchSM) [fetch_handler]
> calling fetch_plugin
>
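> The timeout unsticks things because state_keep_alive finally calls
> do_io_close() on the passive half, flipping the flag that the gate above
> tests. In terms of the toy model (again hand-simplified):
>
>   // After this, the peer's should_deliver_eos() returns true as soon as
>   // its read buffer drains, and the FetchSM finally gets VC_EVENT_EOS.
>   void do_io_close(HalfVC &half)
>   {
>     half.closed = true;
>   }
>
> Until that call happens, the FetchSM side just sees an empty read buffer,
> which accounts for the near two-minute gap between the traces above.
>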
> After beating my head against this for a while, I still don't see the right
> way to deliver the EOS event from the HTTP layer up to the PluginVC. Can
> anyone hold my hand a little and give me some clues about how this is
> supposed to work?

This appears to do the trick:
diff --git a/proxy/http/HttpSM.cc b/proxy/http/HttpSM.cc
index 624637b..b450755 100644
--- a/proxy/http/HttpSM.cc
+++ b/proxy/http/HttpSM.cc
@@ -2842,6 +2842,7 @@ HttpSM::tunnel_handler_server(int event, HttpTunnelProducer *
   if (close_connection) {
     p->vc->do_io_close();
     p->read_vio = NULL;
+    t_state.client_info.keep_alive = HTTP_NO_KEEPALIVE;
   } else {
     server_session->attach_hostname(t_state.current.server->name);
     server_session->server_trans_stat--;
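
My reading of why the one-liner helps (a sketch under my own assumptions, not
a paraphrase of the real HttpSM code): once the server connection is being
torn down, marking the client side HTTP_NO_KEEPALIVE steers the client-side
teardown into do_io_close() instead of parking the session in
state_keep_alive, so the PluginVC gate sees other_side->closed without
waiting for the timeout:

  enum KeepAlive { HTTP_KEEPALIVE, HTTP_NO_KEEPALIVE };

  // Toy view of the client-side choice after the tunnel finishes.
  void finish_client(KeepAlive ka, HalfVC &client_vc)
  {
    if (ka == HTTP_KEEPALIVE) {
      // session re-armed for the next request; the PluginVC peer never
      // sees a close, so the FetchSM waits out VC_EVENT_ACTIVE_TIMEOUT
    } else {
      do_io_close(client_vc);  // FetchSM gets VC_EVENT_EOS right away
    }
  }
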
I don't understand the HTTP state machine well enough to know whether this is
a generally correct fix, so I'd love it if anyone could give this a thumbs up
or down.
J