[jira] [Commented] (TS-1158) Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent

2012-03-22 Thread John Plevyak (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235725#comment-13235725
 ] 

John Plevyak commented on TS-1158:
--

I am not sure either, hence the new jira issue.

 Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent
 

 Key: TS-1158
 URL: https://issues.apache.org/jira/browse/TS-1158
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Affects Versions: 3.0.3
 Environment: ALL
Reporter: John Plevyak
Assignee: John Plevyak
 Fix For: 3.1.4

 Attachments: ts-1158-jp1.patch


 Because of the way session management works, the vio.mutex must be 
 re-verified, after the lock is acquired, to be identical to the one the lock 
 was taken on.  Otherwise there is a race when the mutex is switched, such 
 that the old lock is held while the new lock is not held.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (TS-1158) Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent

2012-03-21 Thread weijin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13234171#comment-13234171
 ] 

weijin commented on TS-1158:


I see that the read_from_net and write_to_net_io functions also have such a 
mechanism to prevent the race condition. I have read the code again and again, 
but still cannot figure out how the mutex is switched. Can you explain it in 
more detail? I would also like to know what the consequences of the race are. 
Thanks very much. 





[jira] [Commented] (TS-1158) Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent

2012-03-21 Thread John Plevyak (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13234393#comment-13234393
 ] 

John Plevyak commented on TS-1158:
--

The mutex switch occurs in the HttpSessionManager.  When a session is passed to 
it, the read.vio.mutex and write.vio.mutex from the old controlling HttpSM are 
replaced with that of a hash bucket of sessions in the Manager (a hash to 
reduce contention on this globally shared data structure).  When a session is 
requested from the HttpSessionManager, they are replaced with those of the new 
HttpSM which will be using that OS connection.

During the swap, the previous and new mutexes are both held.  Nevertheless, a 
race is possible.  A thread grabs a pointer to the old (pre-substitution) 
mutex but has not yet locked it.  A context switch then occurs, the mutexes 
are swapped, and the old (pre-substitution) mutex is released.  The first 
thread then resumes and locks the old (pre-substitution) mutex, and now two 
threads are running, each thinking it holds the mutex for the NetVC.

The solution is to ensure, after the lock has been taken, that the mutex we 
have locked is the same one that is protecting the NetVC.  If it is not, we 
back out and retry later.
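
The lock-then-recheck idea described above could be sketched roughly as 
follows.  This is not the code from ts-1158-jp1.patch and does not use the 
Traffic Server locking macros; Conn, handle_event and the `between` hook are 
hypothetical names, and std::mutex stands in for the real mutex type.  The 
hook only exists so that the "context switch" can be simulated 
deterministically.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <mutex>

// Hypothetical stand-in for a NetVC whose protecting mutex can be replaced.
struct Conn {
  std::atomic<std::mutex *> mutex; // the mutex currently protecting this object
  int events_handled = 0;          // state guarded by that mutex
  explicit Conn(std::mutex *m) : mutex(m) {}
};

// Returns true if the event was handled, false if the caller should retry.
// `between` is a test hook that runs where a context switch could occur.
bool handle_event(Conn &c, const std::function<void()> &between = nullptr) {
  std::mutex *m = c.mutex.load(); // snapshot the pointer; nothing is locked yet
  assert(m != nullptr);
  if (between)
    between();
  std::unique_lock<std::mutex> lock(*m, std::try_to_lock);
  if (!lock.owns_lock())
    return false; // busy: back out and retry later
  if (c.mutex.load() != m)
    return false; // switched meanwhile: we hold a stale mutex (released on return)
  ++c.events_handled; // safe: the mutex we hold is the one protecting c
  return true;
}
```

Calling handle_event(c, [&] { c.mutex.store(&other); }) simulates the switch 
happening between the snapshot and the lock; the call returns false without 
touching the connection, and a later retry succeeds under the new mutex.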





[jira] [Commented] (TS-1158) Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent

2012-03-21 Thread John Plevyak (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13234396#comment-13234396
 ] 

John Plevyak commented on TS-1158:
--

Note that when replacing a mutex, both the new and old mutexes must be held.   
Also note that this protection (double checking) is only provided in the 
NetProcessor as it is the only Processor whose VC mutexes are switched.  Any 
virtualization would need to provide the same protection.
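
As an illustration of the first note, a switch that holds both mutexes while 
replacing the pointer might look like the sketch below.  It is only a model 
under stated assumptions: Conn and switch_mutex are hypothetical names, 
std::mutex stands in for the real mutex type, and the blocking std::lock call 
is used for brevity where an event-driven system would more likely use try 
locks and reschedule.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a VC whose protecting mutex can be replaced.
struct Conn {
  std::atomic<std::mutex *> mutex; // the mutex currently protecting this object
  explicit Conn(std::mutex *m) : mutex(m) {}
};

// Replace the mutex protecting `c` with `new_mutex`, holding both the old
// and the new mutex while the pointer is changed.
void switch_mutex(Conn &c, std::mutex &new_mutex) {
  for (;;) {
    std::mutex *old_mutex = c.mutex.load();
    assert(old_mutex != nullptr);
    if (old_mutex == &new_mutex)
      return; // already protected by new_mutex, nothing to do
    std::lock(*old_mutex, new_mutex); // acquire both without deadlock
    std::lock_guard<std::mutex> hold_old(*old_mutex, std::adopt_lock);
    std::lock_guard<std::mutex> hold_new(new_mutex, std::adopt_lock);
    if (c.mutex.load() != old_mutex)
      continue; // someone else switched first: release both and start over
    c.mutex.store(&new_mutex);
    return; // both guards release here
  }
}
```

Holding both mutexes during the store is what the switcher must do; readers 
of the pointer still need the double check after locking, because they may 
have read the pointer before the store.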





Re: [jira] [Commented] (TS-1158) Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent

2012-03-21 Thread taorui
excellent, thanks again. 

 





[jira] [Commented] (TS-1158) Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent

2012-03-21 Thread taorui (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13234459#comment-13234459
 ] 

taorui commented on TS-1158:


excellent, thanks again. 

On Wed, 2012-03-21 at 14:59 +, John Plevyak (Commented) (JIRA)









Re: [jira] [Commented] (TS-1158) Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent

2012-03-21 Thread taorui
I am afraid the race is (may be one of) the root cause of TS-857, but I
am not sure.

 






[jira] [Commented] (TS-1158) Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent

2012-03-21 Thread taorui (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235328#comment-13235328
 ] 

taorui commented on TS-1158:


I am afraid the race is (may be one of) the root cause of TS-857, but I
am not sure.

On Wed, 2012-03-21 at 14:53 +, John Plevyak (Commented) (JIRA)




