[ 
https://issues.apache.org/jira/browse/TS-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15254158#comment-15254158
 ] 

Bryan Call commented on TS-4372:
--------------------------------

Which VC did the assert fail on?  The client VC shouldn't migrate 
across threads.

I will take a look at TS-3612 and see if there are any issues with the active 
queue.

> Traffic server heart beat fails with 6.1
> ----------------------------------------
>
>                 Key: TS-4372
>                 URL: https://issues.apache.org/jira/browse/TS-4372
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Cop, Manager
>            Reporter: Susan Hinrichs
>            Assignee: Susan Hinrichs
>         Attachments: ts-4372-example.pcap
>
>
> When running 6.1 in a loaded production environment, traffic_server will run 
> for a while (30 minutes or so), then server heartbeats will start failing 
> intermittently.  Eventually two will fail in a row, causing traffic_cop to 
> restart traffic_server (or traffic_manager and then traffic_server; I'm still 
> a bit unclear on that).
> {code}
> traffic_cop[18078]: (test) read failed [104 'Connection reset by peer']
> {code}
> There are no particular resource limitations on the production machine in 
> this state.  The number of open sockets is around 50-60K, which is consistent 
> with its 5.3.x peer.  The memory usage is nowhere near the limit.  The CPU 
> usage is high, but again, not near the limit (perhaps half of the machine's 
> total capacity).
> If we look at the packets exchanged on the loopback interface during an 
> interval in which heartbeats fail, we see some interesting things.  I'll 
> attach an example pcap file.  The interesting traffic is on ports 8084 and 
> 8083.  Traffic_cop sends a GET http://127.0.0.1:8083/synthetic.txt request to 
> traffic_server over port 8084.  Traffic_server should proxy the request and 
> send the request GET /synthetic.txt to traffic_manager listening on port 8083.  
> Traffic_manager returns a 200 response with some data.  Traffic_server relays 
> that response to traffic_cop.
> However, in the failure cases, traffic_cop sends the request and 
> traffic_manager sends a RESET after the connection has been established and 
> the request has been sent to it.  I'm guessing that there is logic in 
> traffic_server that closes the socket before reading the GET request, causing 
> the reset to be sent.
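
For anyone who wants to reproduce the heartbeat exchange by hand, the probe described in the quoted report can be approximated with a few lines of Python.  This is only a rough sketch, not traffic_cop's actual code: the ports (8084, 8083) and the URL come from the description, while the HTTP version, the Host header, the timeout, and the function names build_probe and heartbeat are assumptions made for illustration.

```python
import socket


def build_probe(manager_host="127.0.0.1", manager_port=8083, path="/synthetic.txt"):
    """Build a proxy-style GET (absolute URI) like the one in the report.

    The HTTP/1.0 version and Host header are assumptions; the real
    traffic_cop request may differ.
    """
    url = f"http://{manager_host}:{manager_port}{path}"
    return (
        f"GET {url} HTTP/1.0\r\n"
        f"Host: {manager_host}:{manager_port}\r\n"
        "\r\n"
    ).encode("ascii")


def heartbeat(proxy_host="127.0.0.1", proxy_port=8084, timeout=5.0):
    """Send one probe through the proxy port and return (ok, detail)."""
    try:
        with socket.create_connection((proxy_host, proxy_port), timeout=timeout) as s:
            s.sendall(build_probe())
            chunks = []
            while True:
                data = s.recv(4096)
                if not data:  # peer closed the connection cleanly
                    break
                chunks.append(data)
    except ConnectionResetError as e:
        # Same condition as "read failed [104 'Connection reset by peer']"
        return False, f"read failed [{e.errno} '{e.strerror}']"
    except OSError as e:
        # Refused connection, timeout, or any other socket-level failure
        return False, f"probe failed: {e}"

    response = b"".join(chunks)
    status_line = response.split(b"\r\n", 1)[0].decode("latin-1")
    parts = status_line.split()
    ok = len(parts) >= 2 and parts[1] == "200"
    return ok, status_line


if __name__ == "__main__":
    print(heartbeat())
```

Running this in a loop on an affected machine while capturing loopback traffic should show whether the reset arrives before or after the request is forwarded to port 8083.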



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
