[ 
https://issues.apache.org/jira/browse/TS-4910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15536056#comment-15536056
 ] 

Alan M. Carroll commented on TS-4910:
-------------------------------------

I think that all the NetVCs in the server pool have as their VIO mutex the 
server pool lock. Therefore if you have the pool locked you have the required 
NetVC locks.

> We should get vc->mutex before do_io when the vc is acquired from session pool
> ------------------------------------------------------------------------------
>
>                 Key: TS-4910
>                 URL: https://issues.apache.org/jira/browse/TS-4910
>             Project: Traffic Server
>          Issue Type: Bug
>            Reporter: Oknet Xu
>
> Component: ServerSessionPool
> (I cound not found ServerSessionPool from JIRA)
> source: proxy/http/HttpSessionManager.cc
> {code}
> 309     // Now check to see if we have a connection in our shared connection 
> pool
> 310     EThread *ethread       = this_ethread();
> 311     ProxyMutex *pool_mutex = (TS_SERVER_SESSION_SHARING_POOL_THREAD == 
> sm->t_state.http_config_param->server_session_sharing_pool) ?
> 312                                ethread->server_session_pool->mutex.get() :
> 313                                m_g_pool->mutex.get();
> 314     MUTEX_TRY_LOCK(lock, pool_mutex, ethread);
> 315     if (lock.is_locked()) {
> 316       if (TS_SERVER_SESSION_SHARING_POOL_THREAD == 
> sm->t_state.http_config_param->server_session_sharing_pool) {
> 317         retval = ethread->server_session_pool->acquireSession(ip, 
> hostname_hash, match_style, sm, to_return);
> 318         Debug("http_ss", "[acquire session] thread pool search %s", 
> to_return ? "successful" : "failed");
> 319       } else {
> 320         retval = m_g_pool->acquireSession(ip, hostname_hash, match_style, 
> sm, to_return);
> 321         Debug("http_ss", "[acquire session] global pool search %s", 
> to_return ? "successful" : "failed");
> 322         // At this point to_return has been removed from the pool. Do we 
> need to move it
> 323         // to the same thread?
> 324         if (to_return) {
> 325           UnixNetVConnection *server_vc = dynamic_cast<UnixNetVConnection 
> *>(to_return->get_netvc());
> 326           if (server_vc) {
> 327             UnixNetVConnection *new_vc = 
> server_vc->migrateToCurrentThread(sm, ethread);
> {code}
> As the code above:
> 1. we get pool_mutex first
> 2. then acquire a vc from session pool
> 3. then migrate the vc to current thread without get vc->mutex
> Depend on the comments, a SM only access VIO & VC that returned with callback.
> The mutex of ServerSession may be different from server_vc while it is 
> acquired from ServerSessionPool and attached to HttpSM.
> HttpSM should get the server_vc->nh->mutex and server_vc->mutex first before 
> call ServerSession->do_io(). Currently HttpSM does not.
> The mutex create & usage list at below by timeline:
> 1. ClientVC is accepted from NetAccept with a new allocated mutex.
> 2. ClientSession is created and share the same mutex with ClientVC.
> 3. HttpSM is created and share the same mutex with ClientVC.
> 4. NetHandler get HttpSM->mutex and callback READ_READY to HttpSM by 
> ClientVC->read.vio._cont.
> {code}
> ClientVC->nh->mutex is locked by EventSystem
> HttpSM->mutex is locked by NetHandler
> ClientVC->mutex is locked due share the same mutex with HttpSM
> To access & modify a NetVC should get vc->nh->mutex and vc->mutex locked 
> simultaneously.
> {code}
> 5. Scenes1:
> HttpSM create ServerVC with HttpSM->mutex by netProcessor.connect_re()
> Then HttpSM create ServerSession with ServerVC and share the same mutex with 
> HttpSM.
> 5. Scenes2:
> HttpSM acquire a ServerSession from SessionPool and the ServerVC->thread is 
> not current EThread.
> The ServerSession->mutex is set to HttpSM->mutex while it is attached to 
> HttpSM.
> But the ServerVC->mutex is old one.
> The first bug:
> Before VC Migration merged:
> - ServerSession->do_io() is called and directly call ServerVC->do_io() 
> without get ServerVC->mutex first.
> After VC Migration merged:
> - Migrate ServerVC into current thread without get ServerVC->mutex first.
> The second bug:
> Before VC Migration merged:
> - Any do_io() should lock vc->nh->mutex and vc->mutex simultaneously.
> Suggestion:
> 1. Recall VC Migration
> 2. Re-design ServerSession
> To re-design ServerSession:
> 1. Add NetHandler *servervc_nh to save ServerVC->nh while ServerVC is 
> callbacked VC_EVENT_NET_OPEN to HttpSM.
> 2. For do_io(), check servervc_nh ==? get_NetHandler(this_ethread())
> 3a. equal, directly call ServerVC->do_io() and return ServerVC->VIO
> 3b. not equal, try lock servervc_nh->mutex and ServerVC->mutex
> 4a. if locked, directly call ServerVC->do_io() and return ServerVC->VIO
> 4b. if not, create a Cont and schedule it into 
> servervc_nh->trigger_event->thread. The Cont will call ServerVC->do_io() 
> later.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to