[jira] [Commented] (TS-1158) Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent
[ https://issues.apache.org/jira/browse/TS-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235725#comment-13235725 ]

John Plevyak commented on TS-1158:

I am not sure either, hence the new JIRA issue.

Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent

Key: TS-1158
URL: https://issues.apache.org/jira/browse/TS-1158
Project: Traffic Server
Issue Type: Bug
Components: Core
Affects Versions: 3.0.3
Environment: ALL
Reporter: John Plevyak
Assignee: John Plevyak
Fix For: 3.1.4
Attachments: ts-1158-jp1.patch

Because of the way session management works, the vio.mutex must be re-verified, after the lock is acquired, to be identical to the one the lock was taken on. Otherwise there is a race when the mutex is switched, such that the old lock is held while the new lock is not held.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (TS-1158) Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent
[ https://issues.apache.org/jira/browse/TS-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234171#comment-13234171 ]

weijin commented on TS-1158:

I see that the read_from_net and write_to_net_io functions also have such a mechanism to prevent the race condition. I have read the code again and again, but still cannot figure out how the mutex is switched. Can you explain it in more detail? I also want to know what the consequences of the race are. Thanks very much.
[jira] [Commented] (TS-1158) Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent
[ https://issues.apache.org/jira/browse/TS-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234393#comment-13234393 ]

John Plevyak commented on TS-1158:

The mutex switch occurs in the HttpSessionManager. When a session is passed to it, the read.vio.mutex and write.vio.mutex from the old controlling HttpSM are replaced with the mutex of a hash bucket of sessions in the Manager (a hash is used to reduce contention on this globally shared data structure). When a session is requested from the HttpSessionManager, they are replaced with those of the new HttpSM which will be using that OS connection.

During the swap, both the previous and the new mutexes are held. Nevertheless, a race is possible: a thread grabs a reference to the old (pre-substitution) mutex; then a context switch occurs, the mutexes are swapped, and the lock on the old mutex is released; then the first thread resumes and locks the old mutex. Now two threads are running, each thinking it holds the mutex for the NetVC.

The solution is to ensure, after the lock has been taken, that the mutex we have locked is the same one that is protecting the NetVC. If it is not, we back out and retry later.
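The lock-then-verify step described in this comment can be sketched in standalone C++ roughly as follows. This is a minimal illustration, not the code from ts-1158-jp1.patch: `Conn`, `handle_event_with`, and `handle_event` are hypothetical names, `std::mutex` stands in for Traffic Server's own mutex type, and a blocking lock is used for simplicity.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a NetVC: the mutex that protects it can be
// replaced at run time, so the pointer to it is stored atomically.
struct Conn {
    explicit Conn(std::mutex& m) : mutex(&m) {}
    std::atomic<std::mutex*> mutex;  // mutex currently protecting this connection
    int events_handled = 0;          // state guarded by *mutex
};

// Handle one event using a previously read mutex pointer (`snapshot`).
// Returns false, meaning "back out and retry later", if the connection's
// mutex was switched after `snapshot` was read.
bool handle_event_with(Conn& c, std::mutex* snapshot) {
    assert(snapshot != nullptr);
    std::lock_guard<std::mutex> lock(*snapshot);
    // Re-verify after locking: is the mutex we hold still the one that
    // protects the connection?
    if (c.mutex.load() != snapshot) {
        return false;  // we locked a stale mutex; do not touch the connection
    }
    // Assumes any switcher must hold the current mutex to replace it, so the
    // pointer cannot change while we hold the lock.
    ++c.events_handled;
    return true;
}

// Normal entry point: read the current mutex pointer, then lock and verify.
bool handle_event(Conn& c) {
    return handle_event_with(c, c.mutex.load());
}
```

Passing the snapshot explicitly makes the stale case easy to reproduce in a single thread: read the pointer, switch `c.mutex` to another mutex, then call `handle_event_with` with the old pointer and observe that it backs out without modifying the connection.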
[jira] [Commented] (TS-1158) Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent
[ https://issues.apache.org/jira/browse/TS-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234396#comment-13234396 ]

John Plevyak commented on TS-1158:

Note that when replacing a mutex, both the new and old mutexes must be held. Also note that this protection (double checking) is only provided in the NetProcessor as it is the only Processor whose VC mutexes are switched. Any virtualization would need to provide the same protection.
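The rule that both mutexes must be held while the replacement happens might look like this in standalone C++. It is a sketch under assumptions: `SwitchableConn` and `switch_mutex` are hypothetical names, `std::mutex` stands in for Traffic Server's mutex type, and the real HttpSessionManager code may acquire the locks differently.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical connection whose protecting mutex can be replaced.
struct SwitchableConn {
    explicit SwitchableConn(std::mutex& m) : mutex(&m) {}
    std::atomic<std::mutex*> mutex;  // mutex currently protecting this connection
};

// Replace the mutex protecting `c` with `next`, holding both the old and the
// new mutex while the pointer is changed. Returns false if another thread
// switched the mutex first, in which case the caller may retry.
bool switch_mutex(SwitchableConn& c, std::mutex& next) {
    std::mutex* old = c.mutex.load();
    assert(old != nullptr);
    if (old == &next) {
        return true;  // already protected by `next`; nothing to do
    }
    // std::lock acquires both mutexes without risking deadlock; the guards
    // adopt ownership and release both locks on return.
    std::lock(*old, next);
    std::lock_guard<std::mutex> hold_old(*old, std::adopt_lock);
    std::lock_guard<std::mutex> hold_new(next, std::adopt_lock);
    if (c.mutex.load() != old) {
        return false;  // the mutex changed before we locked it; back out
    }
    c.mutex.store(&next);
    return true;
}
```

Holding the old mutex keeps out any thread that is legitimately working on the connection, and holding the new one keeps a thread from starting work under it before the switch is complete. It does not help a thread that read the old pointer earlier and locks it afterwards, which is why the double check on the locking side is still needed.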
Re: [jira] [Commented] (TS-1158) Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent
excellent, thanks again.

On Wed, 2012-03-21 at 14:59 +, John Plevyak (Commented) (JIRA) wrote:
[ https://issues.apache.org/jira/browse/TS-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234396#comment-13234396 ]
[jira] [Commented] (TS-1158) Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent
[ https://issues.apache.org/jira/browse/TS-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234459#comment-13234459 ]

taorui commented on TS-1158:

excellent, thanks again.

On Wed, 2012-03-21 at 14:59 +, John Plevyak (Commented) (JIRA)
Re: [jira] [Commented] (TS-1158) Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent
I am afraid the race is (or may be one of) the root causes of TS-857, but I am not sure.

On Wed, 2012-03-21 at 14:53 +, John Plevyak (Commented) (JIRA) wrote:
[ https://issues.apache.org/jira/browse/TS-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234393#comment-13234393 ]
[jira] [Commented] (TS-1158) Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent
[ https://issues.apache.org/jira/browse/TS-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235328#comment-13235328 ]

taorui commented on TS-1158:

I am afraid the race is (or may be one of) the root causes of TS-857, but I am not sure.

On Wed, 2012-03-21 at 14:53 +, John Plevyak (Commented) (JIRA)