Github user shinrich commented on the issue:

    https://github.com/apache/trafficserver/issues/1476
  
    I have been adding member variables to NetVConnection and an assert to
    HttpSM::state_api_callout.
    
    In the cores, the crash always seemed to happen in the first hook called
    from HttpSM::state_read_server_response_header.  As noted in the previous
    comment, the event came in on the server-side VC, and
    state_read_server_response_header is getting ready to send the response on
    the UA-side VC.  I added allocate thread, free_thread, and last_alloc time
    to the netvc, and added the following assert at the beginning of
    HttpSM::state_api_callout:
    
    {code}
    ink_release_assert(milestones[TS_MILESTONE_UA_CLOSE] > 0 ||
                       ua_session->get_netvc() == NULL ||
                       (ua_session->get_create_time() != 0 &&
                        ink_hrtime_to_msec(ua_session->get_create_time()) <=
                          ink_hrtime_to_msec(milestones[TS_MILESTONE_UA_BEGIN])));
    {code}
    
    This will trigger if there is no last_alloc_time set (the VC has been
    freed) or if the create time of the VC is newer than the start of the
    transaction (the VC has been reallocated).  The assert triggered before we
    called the hooks, implying that the client_vc had been freed in a previous
    stack but the ua_session reference had not been cleaned up.  The only place
    I could see this happening is in read_signal_and_update and
    write_signal_and_update, if the corresponding vio._cont is NULL.  I added
    warnings in the NULL path, and they triggered a handful of times with
    event=EVENT_ERROR (3).
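
The three conditions in the assert can be read as a small predicate.  The sketch below is only an illustration of that logic, not the actual HttpSM code: TxnSnapshot, hrtime_ns, to_msec, and client_vc_looks_valid are hypothetical stand-ins, and it assumes timestamps are 64-bit nanosecond counts.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration only. Assumption: timestamps are
// signed 64-bit nanosecond counts, and 0 means "not set".
using hrtime_ns = std::int64_t;
constexpr hrtime_ns NS_PER_MSEC = 1000000;

constexpr std::int64_t to_msec(hrtime_ns t) { return t / NS_PER_MSEC; }

// Flattened view of the state the assert inspects.
struct TxnSnapshot {
  hrtime_ns ua_close;        // UA close milestone; > 0 once the client side closed
  bool      has_netvc;       // false if the session no longer holds a netvc
  hrtime_ns vc_create_time;  // 0 if the VC was freed (no allocation time set)
  hrtime_ns ua_begin;        // UA begin milestone (start of the transaction)
};

// True when the client VC still plausibly belongs to this transaction:
//   - the client side is already closed, or
//   - there is no netvc to worry about, or
//   - the VC has a create time that is not newer than the transaction start
//     (compared at millisecond granularity).
constexpr bool client_vc_looks_valid(const TxnSnapshot &s)
{
  return s.ua_close > 0 || !s.has_netvc ||
         (s.vc_create_time != 0 && to_msec(s.vc_create_time) <= to_msec(s.ua_begin));
}
```

With this shape, a freed VC (vc_create_time == 0) and a reallocated VC (create time later than ua_begin) both make the predicate false, which are the two cases the assert is meant to catch.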
    
    The error bubble addition (which we were concerned about in issue #1401)
    adds calls to read_signal_error and write_signal_error.  So I reverted that
    commit (c1ac5f8bf87fd4bc3a8e06507219970d83965acd) and removed my
    workarounds for issue #1401.  I let it run overnight and it crashed twice
    on my assert, but in the accept stack, not the send response stack.  The
    timestamps differed by only a few microseconds, so I doubt the accuracy of
    our timestamping at the microsecond and nanosecond level.  I added
    ink_hrtime_to_msec to the asserts and kicked off another run today.
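
To make the effect of the millisecond conversion concrete, here is a small sketch comparing a raw comparison with a truncated one.  The names (ticks_ns, truncate_to_msec, not_newer_raw, not_newer_msec) are hypothetical, and nanosecond ticks are an assumption.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration. Assumption: nanosecond ticks.
using ticks_ns = std::int64_t;
constexpr ticks_ns TICKS_PER_MSEC = 1000000;

constexpr std::int64_t truncate_to_msec(ticks_ns t) { return t / TICKS_PER_MSEC; }

// Raw comparison: any skew, even a couple of microseconds, flips the result.
constexpr bool not_newer_raw(ticks_ns create, ticks_ns begin)
{
  return create <= begin;
}

// Truncated comparison: skew inside the same millisecond bucket is ignored.
constexpr bool not_newer_msec(ticks_ns create, ticks_ns begin)
{
  return truncate_to_msec(create) <= truncate_to_msec(begin);
}
```

For example, a create time 2 microseconds after the begin time fails the raw check but passes the truncated one when both fall in the same millisecond.  Truncation does not remove the problem entirely: two timestamps a few microseconds apart that straddle a millisecond boundary still compare as different.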

