Oknet Xu created TS-4910:
----------------------------
Summary: We should get vc->mutex before do_io when the vc is
acquired from session pool
Key: TS-4910
URL: https://issues.apache.org/jira/browse/TS-4910
Project: Traffic Server
Issue Type: Bug
Reporter: Oknet Xu
Component: ServerSessionPool
(I cound not found ServerSessionPool from JIRA)
source: proxy/http/HttpSessionManager.cc
```
309 // Now check to see if we have a connection in our shared connection
pool
310 EThread *ethread = this_ethread();
311 ProxyMutex *pool_mutex = (TS_SERVER_SESSION_SHARING_POOL_THREAD ==
sm->t_state.http_config_param->server_session_sharing_pool) ?
312 ethread->server_session_pool->mutex.get() :
313 m_g_pool->mutex.get();
314 MUTEX_TRY_LOCK(lock, pool_mutex, ethread);
315 if (lock.is_locked()) {
316 if (TS_SERVER_SESSION_SHARING_POOL_THREAD ==
sm->t_state.http_config_param->server_session_sharing_pool) {
317 retval = ethread->server_session_pool->acquireSession(ip,
hostname_hash, match_style, sm, to_return);
318 Debug("http_ss", "[acquire session] thread pool search %s",
to_return ? "successful" : "failed");
319 } else {
320 retval = m_g_pool->acquireSession(ip, hostname_hash, match_style,
sm, to_return);
321 Debug("http_ss", "[acquire session] global pool search %s",
to_return ? "successful" : "failed");
322 // At this point to_return has been removed from the pool. Do we
need to move it
323 // to the same thread?
324 if (to_return) {
325 UnixNetVConnection *server_vc = dynamic_cast<UnixNetVConnection
*>(to_return->get_netvc());
326 if (server_vc) {
327 UnixNetVConnection *new_vc =
server_vc->migrateToCurrentThread(sm, ethread);
```
As the code above:
1. we get pool_mutex first
2. then acquire a vc from session pool
3. then migrate the vc to current thread without get vc->mutex
Depend on the comments, a SM only access VIO & VC that returned with callback.
The mutex of ServerSession may be different from server_vc while it is acquired
from ServerSessionPool and attached to HttpSM.
HttpSM should get the server_vc->nh->mutex and server_vc->mutex first before
call ServerSession->do_io(). Currently HttpSM does not.
ClientVC is accepted from NetAccept with a new allocated mutex.
Then ClientSession is created and share the same mutex with ClientVC.
Then HttpSM is created and share the same mutex with ClientVC.
Then NetHandler get HttpSM->mutex and callback READ_READY to HttpSM by
ClientVC->read.vio._cont.
** ClientVC->nh->mutex is locked by EventSystem
** HttpSM->mutex is locked by NetHandler
** ClientVC->mutex is locked due share the same mutex with HttpSM
Scenes1:
HttpSM create ServerVC with HttpSM->mutex by netProcessor.connect_re()
Then HttpSM create ServerSession with ServerVC and share the same mutex with
HttpSM.
Scenes2:
HttpSM acquire a ServerSession from SessionPool and the ServerVC->thread is not
current EThread.
The ServerSession->mutex is set to HttpSM->mutex while it is attached to HttpSM.
But the ServerVC->mutex is old one.
The first bug:
Before VC Migration merged:
- ServerSession->do_io() is called and directly call ServerVC->do_io() without
get ServerVC->mutex first.
After VC Migration merged:
- Migrate ServerVC into current thread without get ServerVC->mutex first.
The second bug:
Before VC Migration merged:
- Any do_io() should lock vc->nh->mutex and vc->mutex simultaneously.
Suggestion:
1. Recall VC Migration
2. Re-design ServerSession
To re-design ServerSession:
1. Add NetHandler *servervc_nh to save ServerVC->nh while ServerVC is
callbacked VC_EVENT_NET_OPEN to HttpSM.
2. For do_io(), check servervc_nh ==? get_NetHandler(this_ethread())
3a. equal, directly call ServerVC->do_io() and return ServerVC->VIO
3b. not equal, try lock servervc_nh->mutex and ServerVC->mutex
4a. if locked, directly call ServerVC->do_io() and return ServerVC->VIO
4b. if not, create a Cont and schedule it into
servervc_nh->trigger_event->thread. The Cont will call ServerVC->do_io() later.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)