[web2py] Re: Bug? Opening session files from multiple processes
Yes, I have a working solution now. Thanks for all the help :)
Re: [web2py] Re: Bug? Opening session files from multiple processes
On Mar 16, 2011, at 11:01 AM, Corne wrote: Yes, I have a working solution now. Thanks for all the help :) The fix is in the trunk, and will be in the next stable version.
[web2py] Re: Bug? Opening session files from multiple processes
We (again) looked deeper into what is really happening; and it is yet different. What we ran into is the following: We tried to set a session_id our self based on information in the url, which in this case resulted in calling the session connect code (where it went wrong) twice per request. In case a cookie was send; there is no problem at all. Session is handled by web2py like always (except for the fact that it's done twice). In case there is no cookie send; there is a problem. The first call to connect (web2py internal) has no session_id, so a new one is generated. The second call to connect (our plugin) has a session id so it's handled ok. In the end of the request, the session changes are written. But in our case (without cookie) the var session_new is True (and the session file is (re)opened with 'wb'). Opening with 'wb' does seem to change the file handle. The request that is handled by a differend process at the same time will now have an invallid session. This also explains the fact that reopening the session file seemed to solve the problem except for the fact that the real problem is somewhere else. I guess that using connect is something that is / should be allowed (it's in the book), this is also the way to, for example use sessions from an other application. and there the same issue could apply: in case someone uses connect to just use a session from a different application. The first connect from web2py might result in creating a new session. While the connect which is issued later by the user does result in an existing session.
Re: [web2py] Re: Bug? Opening session files from multiple processes
On Mar 15, 2011, at 7:15 AM, Corne wrote: We (again) looked deeper into what is really happening; and it is yet different. What we ran into is the following: We tried to set a session_id our self based on information in the url, which in this case resulted in calling the session connect code (where it went wrong) twice per request. In case a cookie was send; there is no problem at all. Session is handled by web2py like always (except for the fact that it's done twice). In case there is no cookie send; there is a problem. The first call to connect (web2py internal) has no session_id, so a new one is generated. The second call to connect (our plugin) has a session id so it's handled ok. In the end of the request, the session changes are written. But in our case (without cookie) the var session_new is True (and the session file is (re)opened with 'wb'). Opening with 'wb' does seem to change the file handle. The request that is handled by a differend process at the same time will now have an invallid session. This also explains the fact that reopening the session file seemed to solve the problem except for the fact that the real problem is somewhere else. I guess that using connect is something that is / should be allowed (it's in the book), this is also the way to, for example use sessions from an other application. and there the same issue could apply: in case someone uses connect to just use a session from a different application. The first connect from web2py might result in creating a new session. While the connect which is issued later by the user does result in an existing session. When you set your own session_id, does the corresponding session file always exist? If it does not, session.connect is going to discard that session_id and generate a new uuid. I don't see a way to force session.connect to create a session file with a predetermined id. If that's not an issue, you could try setting response.session_new = False before calling session.connect; session.connect probably ought to do that itself at the beginning of the not-db logic branch. We could also add a session-id argument to allow the caller to force an id, I suppose, but I'd like to be a little clearer on your use case. Doesn't creating a session id based on the request url open up a session to hijacking?
[web2py] Re: Bug? Opening session files from multiple processes
Right now the session is a combination of client ip and a uuid. The uuid prevents session hijacking. The ip serves two purposes: - the server can find expired sessions more easily - the apps the reject sessions coming from wrong ip thus further protecting against hijacking. On Mar 15, 11:55 am, Jonathan Lundell jlund...@pobox.com wrote: On Mar 15, 2011, at 7:15 AM, Corne wrote: We (again) looked deeper into what is really happening; and it is yet different. What we ran into is the following: We tried to set a session_id our self based on information in the url, which in this case resulted in calling the session connect code (where it went wrong) twice per request. In case a cookie was send; there is no problem at all. Session is handled by web2py like always (except for the fact that it's done twice). In case there is no cookie send; there is a problem. The first call to connect (web2py internal) has no session_id, so a new one is generated. The second call to connect (our plugin) has a session id so it's handled ok. In the end of the request, the session changes are written. But in our case (without cookie) the var session_new is True (and the session file is (re)opened with 'wb'). Opening with 'wb' does seem to change the file handle. The request that is handled by a differend process at the same time will now have an invallid session. This also explains the fact that reopening the session file seemed to solve the problem except for the fact that the real problem is somewhere else. I guess that using connect is something that is / should be allowed (it's in the book), this is also the way to, for example use sessions from an other application. and there the same issue could apply: in case someone uses connect to just use a session from a different application. The first connect from web2py might result in creating a new session. While the connect which is issued later by the user does result in an existing session. When you set your own session_id, does the corresponding session file always exist? If it does not, session.connect is going to discard that session_id and generate a new uuid. I don't see a way to force session.connect to create a session file with a predetermined id. If that's not an issue, you could try setting response.session_new = False before calling session.connect; session.connect probably ought to do that itself at the beginning of the not-db logic branch. We could also add a session-id argument to allow the caller to force an id, I suppose, but I'd like to be a little clearer on your use case. Doesn't creating a session id based on the request url open up a session to hijacking?
Re: [web2py] Re: Bug? Opening session files from multiple processes
On Mar 15, 2011, at 11:33 AM, Massimo Di Pierro wrote: Right now the session is a combination of client ip and a uuid. The uuid prevents session hijacking. The ip serves two purposes: - the server can find expired sessions more easily - the apps the reject sessions coming from wrong ip thus further protecting against hijacking. Right, but I was referring to Corne's mechanism: We tried to set a session_id our self based on information in the url, which in this case resulted in calling the session connect code (where it went wrong) twice per request. Unless there's a secret key or hash or something in the URL (like the newish hash logic in URL()), this seems open to hijacking. On Mar 15, 11:55 am, Jonathan Lundell jlund...@pobox.com wrote: On Mar 15, 2011, at 7:15 AM, Corne wrote: We (again) looked deeper into what is really happening; and it is yet different. What we ran into is the following: We tried to set a session_id our self based on information in the url, which in this case resulted in calling the session connect code (where it went wrong) twice per request. In case a cookie was send; there is no problem at all. Session is handled by web2py like always (except for the fact that it's done twice). In case there is no cookie send; there is a problem. The first call to connect (web2py internal) has no session_id, so a new one is generated. The second call to connect (our plugin) has a session id so it's handled ok. In the end of the request, the session changes are written. But in our case (without cookie) the var session_new is True (and the session file is (re)opened with 'wb'). Opening with 'wb' does seem to change the file handle. The request that is handled by a differend process at the same time will now have an invallid session. This also explains the fact that reopening the session file seemed to solve the problem except for the fact that the real problem is somewhere else. I guess that using connect is something that is / should be allowed (it's in the book), this is also the way to, for example use sessions from an other application. and there the same issue could apply: in case someone uses connect to just use a session from a different application. The first connect from web2py might result in creating a new session. While the connect which is issued later by the user does result in an existing session. When you set your own session_id, does the corresponding session file always exist? If it does not, session.connect is going to discard that session_id and generate a new uuid. I don't see a way to force session.connect to create a session file with a predetermined id. If that's not an issue, you could try setting response.session_new = False before calling session.connect; session.connect probably ought to do that itself at the beginning of the not-db logic branch. We could also add a session-id argument to allow the caller to force an id, I suppose, but I'd like to be a little clearer on your use case. Doesn't creating a session id based on the request url open up a session to hijacking?
[web2py] Re: Bug? Opening session files from multiple processes
When you set your own session_id, does the corresponding session file always exist? Yes, it does. I'm sure.. If that's not an issue, you could try setting response.session_new = False before calling session.connect I already tried that, and then it does work. The problem we are running into is exactly this: The response.session_new = True, and is never reset. (I guess it should do that). Doesn't creating a session id based on the request url open up a session to hijacking? The sessions are created by web2py as usual, in the request we only add a hash to find the session back (in our case the client might have no cookies). Internally we do also check the ip to avoid session hijacking. What is stored on the disk are the normal web2py sessions. In case that a client supports cookies it all works the normal way. In case there are no cookies, we use the hash from the url to set the session (using session.connect())
Re: [web2py] Re: Bug? Opening session files from multiple processes
On Mar 15, 2011, at 12:17 PM, Corne wrote: When you set your own session_id, does the corresponding session file always exist? Yes, it does. I'm sure.. If that's not an issue, you could try setting response.session_new = False before calling session.connect I already tried that, and then it does work. The problem we are running into is exactly this: The response.session_new = True, and is never reset. (I guess it should do that). I don't offhand see any harm in setting it properly in each connect call. I'm a little unsure about the logic if the session file doesn't exist, but as long as it does it should work fine. I'll suggest a patch to Massimo, but setting it to False before your call should continue to work regardless. I'm leaning in the direction of passing in a session id argument as well, but I'd like to think about it a little more. In your case (session file always exists) it doesn't matter so much, but if someone wanted to force the creation of a session file with a particular id, the logic you're using wouldn't work, because the id gets rewritten. At any rate: you have a working solution, right? Doesn't creating a session id based on the request url open up a session to hijacking? The sessions are created by web2py as usual, in the request we only add a hash to find the session back (in our case the client might have no cookies). Internally we do also check the ip to avoid session hijacking. What is stored on the disk are the normal web2py sessions. In case that a client supports cookies it all works the normal way. In case there are no cookies, we use the hash from the url to set the session (using session.connect())
[web2py] Re: Bug? Opening session files from multiple processes
I have similar behavior. Same long time generating page opened same time in few tabs give me some tabs with erased(overwrited?) session file and as a result login page. No exception or error. Is using session in db solve this issue? On 12 мар, 00:06, ron_m ron.mco...@gmail.com wrote: You should only get a portalocker.LockException if the no blocking option is set on the lock attempt which it is not the case in the original web2py code. However, the exception handler inside portalocker.lock will re-raise any exception if it is not IOError for Errno 11 It would be very helpful if you posted the stack trace that shows the exception type and details.
[web2py] Re: Bug? Opening session files from multiple processes
Which code are you using. Jonathan and I agree that Corne's patch in this thread does not lock the session. web2py does lock sessions unless you do session._unlock(request) On Mar 14, 6:17 am, KMax mkostri...@gmail.com wrote: I have similar behavior. Same long time generating page opened same time in few tabs give me some tabs with erased(overwrited?) session file and as a result login page. No exception or error. Is using session in db solve this issue? On 12 мар, 00:06, ron_m ron.mco...@gmail.com wrote: You should only get a portalocker.LockException if the no blocking option is set on the lock attempt which it is not the case in the original web2py code. However, the exception handler inside portalocker.lock will re-raise any exception if it is not IOError for Errno 11 It would be very helpful if you posted the stack trace that shows the exception type and details.
[web2py] Re: Bug? Opening session files from multiple processes
We've looked into the problem a bit deeper.. There is a problem but it's different than what was in the first post. The problem seems to be in the locking of the session file: Two processes/requests open the same session file for write and thus get a handle on it One request gets the portalocker lock on the session file, the other one blocks and waits for the first request to release the lock. The first request writes the changes to the session file, and releases the lock. The second request then gets the lock, but still uses the old file handle and results in an EOFError exception The first patch improved the situation, because the file is re-opened after the file is released. However, a proper way to handle this would be to use a lock file, which is used only for locking and not for storing data at the same time, as per attached patch --- globals.py 2011-03-14 16:17:19.466604041 +0100 +++ globals.py_new 2011-03-11 16:55:33.0 +0100 @@ -269,10 +269,12 @@ response.session_id = None if response.session_id: try: +response.session_lock = \ +open(response.session_filename+'.lock', 'wb') +portalocker.lock(response.session_lock, +portalocker.LOCK_EX) response.session_file = \ open(response.session_filename, 'rb+') -portalocker.lock(response.session_file, -portalocker.LOCK_EX) self.update(cPickle.load(response.session_file)) response.session_file.seek(0) oc = response.session_filename.split('/') [-1].split('-')[0] @@ -395,21 +397,23 @@ session_folder = os.path.dirname(response.session_filename) if not os.path.exists(session_folder): os.mkdir(session_folder) +response.session_lock = open(response.session_filename +'.lock', 'wb') +portalocker.lock(response.session_lock, portalocker.LOCK_EX) response.session_file = open(response.session_filename, 'wb') -portalocker.lock(response.session_file, portalocker.LOCK_EX) if response.session_file: cPickle.dump(dict(self), response.session_file) response.session_file.truncate() try: -portalocker.unlock(response.session_file) +portalocker.unlock(response.session_lock) response.session_file.close() +del response.session_lock del response.session_file except: ### this should never happen but happens in Windows pass def _unlock(self, response): -if response and response.session_file: +if response and response.session_file and response.session_lock: try: -portalocker.unlock(response.session_file) +portalocker.unlock(response.session_lock) except: ### this should never happen but happens in Windows pass
Re: [web2py] Re: Bug? Opening session files from multiple processes
On Mar 14, 2011, at 8:55 AM, Corne wrote: We've looked into the problem a bit deeper.. There is a problem but it's different than what was in the first post. The problem seems to be in the locking of the session file: Two processes/requests open the same session file for write and thus get a handle on it One request gets the portalocker lock on the session file, the other one blocks and waits for the first request to release the lock. The first request writes the changes to the session file, and releases the lock. The second request then gets the lock, but still uses the old file handle and results in an EOFError exception Something else (or something more) is going on here, I think. There's nothing wrong with using the old file handle, at least not that I can see. I wrote a simplified test case to verify that (which I could have wrong, but I don't think so). Something that bothers me a bit (though I don't think it's directly relevant to this problem) is the lazy creation of the session file. If the session file doesn't exist, it doesn't get created until the end of the request, when the session file is to be updated. If the current request calls session.forget, then it doesn't get created, but it sends back a cookie with the new session id. The next request sees that the file is missing and generates a new session id. So until we get a request that doesn't forget the session, we keep creating new session ids and sending them back. Also, if a client (presumably not a browser, which I can't see doing this) sends two simultaneous requests before the session file gets created, there's a race condition such that they'll get different session ids. Now, that'd be fairly strange behavior, but (I think) possible. The liberal use of session.forget could make it more likely. Maybe it's not worth worrying about. The first patch improved the situation, because the file is re-opened after the file is released. However, a proper way to handle this would be to use a lock file, which is used only for locking and not for storing data at the same time, as per attached patch --- globals.py 2011-03-14 16:17:19.466604041 +0100 +++ globals.py_new 2011-03-11 16:55:33.0 +0100 @@ -269,10 +269,12 @@ response.session_id = None if response.session_id: try: +response.session_lock = \ +open(response.session_filename+'.lock', 'wb') +portalocker.lock(response.session_lock, +portalocker.LOCK_EX) response.session_file = \ open(response.session_filename, 'rb+') -portalocker.lock(response.session_file, -portalocker.LOCK_EX) self.update(cPickle.load(response.session_file)) response.session_file.seek(0) oc = response.session_filename.split('/') [-1].split('-')[0] @@ -395,21 +397,23 @@ session_folder = os.path.dirname(response.session_filename) if not os.path.exists(session_folder): os.mkdir(session_folder) +response.session_lock = open(response.session_filename +'.lock', 'wb') +portalocker.lock(response.session_lock, portalocker.LOCK_EX) response.session_file = open(response.session_filename, 'wb') -portalocker.lock(response.session_file, portalocker.LOCK_EX) if response.session_file: cPickle.dump(dict(self), response.session_file) response.session_file.truncate() try: -portalocker.unlock(response.session_file) +portalocker.unlock(response.session_lock) response.session_file.close() +del response.session_lock del response.session_file except: ### this should never happen but happens in Windows pass def _unlock(self, response): -if response and response.session_file: +if response and response.session_file and response.session_lock: try: -portalocker.unlock(response.session_file) +portalocker.unlock(response.session_lock) except: ### this should never happen but happens in Windows pass
[web2py] Re: Bug? Opening session files from multiple processes
This is a strange behavior of open. Which linux version? Anyway, I will test your solution, it seems the right one. Massimo On Mar 11, 3:30 am, Corne i...@cobra-scripters.nl wrote: Hello, We were running into the following: (os: Linux - Apache with mod_wsgi) We have client code which does a number of requests: what happened is that some of these requests got 401-not authorized as response. The problem is that in globals.py the session file is opened (rb+) and the file is locked. The second request (which is handled at the same time from a different process) tries to do the same; but the file is still locked at that time. Expected result would be that we have to wait for the lock to free, but instead open throws an exception. As solution we now first open the file for reading, than acquire the lock; and finally open the file for reading/appending (rb+) Opening the file for reading will succeed, than the lock is acquired (or we wait until we can get the lock). When we have the lock there is no problem opening the file for reading and writing anymore.. Patch: --- globals.py 2011-03-11 10:13:22.634740461 +0100 +++ globals.py_new 2011-03-10 15:43:55.0 +0100 @@ -270,9 +270,11 @@ if response.session_id: try: response.session_file = \ - open(response.session_filename, 'rb+') + open(response.session_filename, 'r') portalocker.lock(response.session_file, portalocker.LOCK_EX) + response.session_file = \ + open(response.session_filename, 'rb+') self.update(cPickle.load(response.session_file)) response.session_file.seek(0) oc = response.session_filename.split('/') [-1].split('-')[0]
[web2py] Re: Bug? Opening session files from multiple processes
This is a strange behavior of open. Which linux version? Slackware 13.0 (with kernel 2.6.33.3)
[web2py] Re: Bug? Opening session files from multiple processes
I do not believe the file is locked using the code above because the second open closes the previous file pointer and unlocks the file. Can you show the traceback of the problem you are having? On Mar 11, 3:30 am, Corne i...@cobra-scripters.nl wrote: Hello, We were running into the following: (os: Linux - Apache with mod_wsgi) We have client code which does a number of requests: what happened is that some of these requests got 401-not authorized as response. The problem is that in globals.py the session file is opened (rb+) and the file is locked. The second request (which is handled at the same time from a different process) tries to do the same; but the file is still locked at that time. Expected result would be that we have to wait for the lock to free, but instead open throws an exception. As solution we now first open the file for reading, than acquire the lock; and finally open the file for reading/appending (rb+) Opening the file for reading will succeed, than the lock is acquired (or we wait until we can get the lock). When we have the lock there is no problem opening the file for reading and writing anymore.. Patch: --- globals.py 2011-03-11 10:13:22.634740461 +0100 +++ globals.py_new 2011-03-10 15:43:55.0 +0100 @@ -270,9 +270,11 @@ if response.session_id: try: response.session_file = \ - open(response.session_filename, 'rb+') + open(response.session_filename, 'r') portalocker.lock(response.session_file, portalocker.LOCK_EX) + response.session_file = \ + open(response.session_filename, 'rb+') self.update(cPickle.load(response.session_file)) response.session_file.seek(0) oc = response.session_filename.split('/') [-1].split('-')[0]
Re: [web2py] Re: Bug? Opening session files from multiple processes
On Mar 11, 2011, at 5:13 AM, Massimo Di Pierro wrote: This is a strange behavior of open. Which linux version? Anyway, I will test your solution, it seems the right one. No. The second open unlocks the file (presumably because f is closed with f gets rebound) and it's not getting locked again. Massimo On Mar 11, 3:30 am, Corne i...@cobra-scripters.nl wrote: Hello, We were running into the following: (os: Linux - Apache with mod_wsgi) We have client code which does a number of requests: what happened is that some of these requests got 401-not authorized as response. The problem is that in globals.py the session file is opened (rb+) and the file is locked. The second request (which is handled at the same time from a different process) tries to do the same; but the file is still locked at that time. Expected result would be that we have to wait for the lock to free, but instead open throws an exception. As solution we now first open the file for reading, than acquire the lock; and finally open the file for reading/appending (rb+) Opening the file for reading will succeed, than the lock is acquired (or we wait until we can get the lock). When we have the lock there is no problem opening the file for reading and writing anymore.. Patch: --- globals.py 2011-03-11 10:13:22.634740461 +0100 +++ globals.py_new 2011-03-10 15:43:55.0 +0100 @@ -270,9 +270,11 @@ if response.session_id: try: response.session_file = \ -open(response.session_filename, 'rb+') +open(response.session_filename, 'r') portalocker.lock(response.session_file, portalocker.LOCK_EX) +response.session_file = \ +open(response.session_filename, 'rb+') self.update(cPickle.load(response.session_file)) response.session_file.seek(0) oc = response.session_filename.split('/') [-1].split('-')[0]
[web2py] Re: Bug? Opening session files from multiple processes
You should only get a portalocker.LockException if the no blocking option is set on the lock attempt which it is not the case in the original web2py code. However, the exception handler inside portalocker.lock will re-raise any exception if it is not IOError for Errno 11 It would be very helpful if you posted the stack trace that shows the exception type and details.