[ 
https://issues.apache.org/jira/browse/DISPATCH-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17333406#comment-17333406
 ] 

ASF GitHub Bot commented on DISPATCH-2046:
------------------------------------------

ganeshmurthy commented on a change in pull request #1142:
URL: https://github.com/apache/qpid-dispatch/pull/1142#discussion_r621440564



##########
File path: src/server.c
##########
@@ -1372,6 +1391,7 @@ void qd_server_free(qd_server_t *qd_server)
     if (!qd_server) return;
 
     qd_http_server_free(qd_server->http);
+    qd_timer_visit(true);

Review comment:
       Take this scenario where a client closes its connection, the 
qd_connection_free() is called which schedules a timer here - 
https://github.com/apache/qpid-dispatch/blob/main/src/server.c#L917
   This newly scheduled timer is set to go off in ctx->connector->delay seconds 
(maybe 5 seconds or so). This timer gets added to the scheduled_timers in 
timer.c. The router shuts down before the timer can fire. We need to step thru 
the scheduled_timers and take care of all the timers which is what I am trying 
to do here. Once the timers are scheduled, how can the owner access the timer? 
I am sorry if that is something rudimentary I missed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Panic in handle due to deleted connector.
> -----------------------------------------
>
>                 Key: DISPATCH-2046
>                 URL: https://issues.apache.org/jira/browse/DISPATCH-2046
>             Project: Qpid Dispatch
>          Issue Type: Bug
>          Components: Routing Engine
>    Affects Versions: 1.15.0
>            Reporter: Alex Ward
>            Priority: Major
>             Fix For: 1.17.0
>
>
> I am seeing a segv at line 1063 due to ctx->connector being 0 in 
> server.c:handle(). 
> {code:java}
> (ugdb-x86_64-7.11-261) p ctx
> $6 = (qd_connection_t *) 0x802e55890
> (ugdb-x86_64-7.11-261) p *ctx
> $7 = {prev = 0x8023df490, next = 0x802e7a890, name = 0x0, server = 
> 0x80220d0c0, opened = false, closed = false, closed_locally = false, enqueued 
> = 0, timer = 0x0, pn_conn = 0x8023ec050, pn_sessions = {0x0 <repeats 15 
> times>}, ssl = 0x0, listener = 0x0, connector = 0x0, connector_lock = 
> 0x8023e6600, context = 0x0, user_context = 0x0, link_context = 0x0, 
> connection_id = 12, user_id = 0x0, free_user_id = false, policy_settings = 
> 0x0, n_sessions = 0, n_senders = 0, n_receivers = 0, open_container = 0x0, 
> deferred_calls = {head = 0x0, tail = 0x0, scratch = 0x0, size = 0}, 
> deferred_call_lock = 0x8023e65c0, policy_counted = false, role = 0x801a17660 
> "route-container", free_link_session_list = {head = 0x0, tail = 0x0, scratch 
> = 0x0, size = 0}, strip_annotations_in = false, strip_annotations_out = 
> false, wake = 0x8005d6e20 <connection_wake>, rhost = '\000' <repeats 1024 
> times>, rhost_port = '\000' <repeats 1056 times>}
> {code}
> As the comment at the start of handle() states, there is only one event being 
> handled at a time, but it doesn’t look like there is anything to stop someone 
> clearing ctx->connector between lines 1055 and 1063. The python code seems to 
> be coming in and deleting the connector while handle() is using it.
> [https://github.com/apache/qpid-dispatch/blob/main/src/server.c]
> {code:java}
>  989 /* Events involving a connection or listener are serialized by the 
> proactor so                            
>  990  * only one event per connection / listener will be processed at a time. 
>                                  
>  991  */          
>  992 static bool handle(qd_server_t *qd_server, pn_event_t *e, 
> pn_connection_t *pn_conn, qd_connection_t *ctx) 
>  993 {                                                                        
>                                  
>  994     if (pn_conn && qdr_is_authentication_service_connection(pn_conn)) {  
>                                  
>  995         qdr_handle_authentication_service_connection_event(e);           
>                                  
>  996         return true;                                                     
>                                  
>  997     }                                                                    
>                                  
>  998                                                                          
>                                   
>  999     switch (pn_event_type(e)) {         
>  …
> 1051     case PN_TRANSPORT_ERROR:                                             
>                                  
> 1052         {                                                                
>                                  
> 1053             pn_transport_t *transport = pn_event_transport(e);           
>                                  
> 1054             pn_condition_t* condition = transport ? 
> pn_transport_condition(transport) : NULL;             
> 1055             if (ctx && ctx->connector) { /* Outgoing connection */       
>                                  
> 1056                 qd_increment_conn_index(ctx);                            
>                                  
> 1057                 const qd_server_config_t *config = 
> &ctx->connector->config;                               
> 1058                 ctx->connector->state = CXTR_STATE_FAILED;               
>                                  
> 1059                 char conn_msg[300];                                      
>                                  
> 1060                 if (condition  && pn_condition_is_set(condition)) {      
>                                  
> 1061                     qd_format_string(conn_msg, 300, "[C%"PRIu64"] 
> Connection to %s failed: %s %s", ctx->connection_id, config->host_port,       
>                                                                     
> 1062                             pn_condition_get_name(condition), 
> pn_condition_get_description(condition));   
> 1063                     strcpy(ctx->connector->conn_msg, conn_msg);     
> {code}
> Another thread is at line 1063 in qd_connection_manager_delete_connector so 
> it looks like this thread just cleared ctx->connector that is being used in 
> handle. What is meant to be preventing this happening?
> [https://github.com/apache/qpid-dispatch/blob/main/src/connection_manager.c] 
> {code:java}
> 1055 void qd_connection_manager_delete_connector(qd_dispatch_t *qd, void 
> *impl)
> 1056 {                                                                        
>                                  
> 1057     qd_connector_t *ct = (qd_connector_t*) impl;                         
>                                  
> 1058     if (ct) {                                                            
>                                  
> 1059         sys_mutex_lock(ct->lock);                                        
>                                  
> 1060         if (ct->ctx) {                                                   
>                                  
> 1061             ct->ctx->connector = 0;                                      
>                                  
> 1062             if(ct->ctx->pn_conn)                                         
>                                  
> 1063                 qd_connection_invoke_deferred(ct->ctx, deferred_close, 
> ct->ctx->pn_conn);
> 1064                                                                          
>                                  
> 1065         }                                                                
>                                  
> 1066         sys_mutex_unlock(ct->lock);
> 1067         DEQ_REMOVE(qd->connection_manager->connectors, ct);              
>                                  
> 1068         qd_connector_decref(ct);                                         
>                                  
> 1069     }                                                                    
>                                  
> 1070 } 
> {code}
>  Does anyone have any suggestions for how to prevent this? Doesn't look like 
> I could use ct->lock in handle().



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org

Reply via email to