Hi Chris,

> Thanks Hans.  Good tips.  I'm slowly learning.. :)

No prob -- me too!  :)  On the topic of concurrency, I strongly recommend
Brian Goetz' book: Java Concurrency in Practice.  While it's certainly
about Java, a lot of the principles apply pretty directly to Python.  Of
course, if you never plan to write a line of Java, you may be able to find
the key points in free online sources.  Anyway, I had to implement a Java
network server recently, and the book was an invaluable guide.
 
>> 1) that the mongo_conn object itself is thread-safe (does not have
>> internal state that could get overridden by concurrent calls) and
> 
> I'll have to look at the pymongo Connection class and see if it is
> thread safe in order to verify that mongo_conn is thread-safe.  I may
> post over on the mongodb user group to verify.

Yeah.  I would /imagine/ that it's threadsafe; that seems to be the trend
for database API libraries in python, but certainly worth checking with
them.

>> 2) that app_globals.mongo_conn was set in a thread-safe way.  I assume
>> that this is happening in some module-level initialization code, which
>> I *believe* is thread-safe in python (not positive on that point).
> 
> Yes, mongo_conn is initialized in the lib/app_globals.py Globals
> __init__, which should be called just on the application load/launch.
> Ben mentioned above that this area is thread-safe.

Ok -- I guess I missed that comment.  I suspected, but that's good to know
for sure.

>> does pylons guarantee a
>> new instance of your controller object for handling every request (and
>> hence, per thread)?
> 
> I think that is a key question ... so I searched around a bit more - I
> found this (see Mike Orr's post)
>
> http://groups.google.com/group/paste-users/browse_thread/thread/bcf2ed96f6581f52
> "... Pylons assumes its native controllers are not thread
> safe, and instantiates one for each request. ..."
> 
> So, from that I assume Pylons instantiates a new controller instance
> per request.

Good research; that's great to know for the future.  So, yes, you should be
all set with your Mongo class given that the controller gets instantiated
for every request.

> What I've gathered thus far: each controller will get its own instance
> of Mongo assigned to self.db.  However, all self.db Mongo instances
> share the same reference to the connection object mongo_conn. So, if
> something is not thread safe with the connection then I may see thread
> related problems (with saving/updating etc).  Does that sound right?

Yup - you got it.
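
To make that concrete, here is a minimal sketch of the arrangement you
describe.  FakeConnection, Mongo and Controller below are hypothetical
stand-ins I made up for illustration (they are not the pymongo or Pylons
classes); the only point is that each request gets a fresh wrapper while
the connection underneath is one shared object:

```python
class FakeConnection:
    """Hypothetical stand-in for the shared app_globals.mongo_conn."""


class Mongo:
    """Hypothetical wrapper; holds a reference to the connection it is given."""

    def __init__(self, conn):
        self.conn = conn


class Controller:
    """Hypothetical controller; one instance is created per request."""

    def __init__(self, conn):
        self.db = Mongo(conn)


# Created once, at application load.
mongo_conn = FakeConnection()

# Two requests, two controllers, two Mongo wrappers...
first = Controller(mongo_conn)
second = Controller(mongo_conn)

# ...but both wrappers point at the very same connection object.
wrappers_are_distinct = first.db is not second.db
connection_is_shared = first.db.conn is second.db.conn
```

So the per-request wrappers cannot step on each other, but anything that
mutates state inside the connection is still exposed to every thread.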

> I guess, my next step is to determine if pymongo's connection is
> thread safe or if I need to use thread local.
> 
> (I've been looking at sqlalchemy/pylons code to figure out how they do
> threading and whatnot with the model, but for the uninitiated it can
> be a little confusing if you don't know what you're looking for.)

Yeah, I concur.  Frankly, the Paste internals are really hard to follow,
and SQLAlchemy is not much simpler.  Paste adds additional complexity by
using the StackedObjectProxy, which is more than simply a thread local.

The basic idea with thread local (threading.local class in Python) is that
attributes set on the instance are stored separately for each thread.  So,
if in your module you put:

from threading import local as ThreadLocal

container = ThreadLocal()

# And then in functions/classes in your code you add stuff to your
# container, like:

def function():
  container.db = Mongo()
  # etc.

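Those will all be stored in a thread-local context; i.e. you won't have to
worry about concurrent access to anything you create and store in your
threading.local instance.  (An object that is itself shared, like a global
connection, is still shared even if you store a reference to it there.)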
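
If you want to see the isolation for yourself, here is a small runnable
sketch.  FakeMongo and run_demo are hypothetical names for illustration
only (FakeMongo stands in for whatever you would really store); the
barrier is just there to force every thread to assign container.db before
any of them reads it back:

```python
import threading

# One module-level container; each thread gets its own attribute namespace.
container = threading.local()


class FakeMongo:
    """Hypothetical stand-in for the Mongo wrapper; not a pymongo class."""

    def __init__(self, owner):
        self.owner = owner


def run_demo(thread_count=3):
    seen = {}
    # The barrier makes every thread assign container.db before any reads it.
    barrier = threading.Barrier(thread_count)

    def worker(name):
        container.db = FakeMongo(name)
        barrier.wait()
        # Still this thread's own object, even though all threads assigned.
        seen[name] = container.db.owner

    threads = [
        threading.Thread(target=worker, args=("t%d" % i,))
        for i in range(thread_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # The calling thread never assigned container.db, so it has none.
    return seen, hasattr(container, "db")
```

Each thread reads back the object it stored itself, and the thread that
called run_demo() has no container.db at all.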

This is certainly getting a bit off-Pylons topic, but another approach
would be to use a lock with global data (e.g. data in app_globals).  For
example an RLock (reentrant lock, allows same thread to enter the locked
area but blocks other threads) to manage access to module globals:

# network.py:
import threading

servers = []
servers_lock = threading.RLock()

# othermodule.py:
import network

# And then when you want to select or change network.servers (from some
# other module), you do this:
with network.servers_lock:
   network.servers = enumerate_my_servers()

# Or:
with network.servers_lock:
   if serverobj in network.servers:
      do_something()

One thing to note with concurrency is that you have to lock on both read
and write (otherwise you could be reading dirty values).  Sometimes you
really do need values that are global across threads; this is for those
cases.
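
Here is a self-contained sketch of the same idea in a single file, so you
can run it.  The helper names (add_server, add_many, snapshot, run_demo)
are hypothetical and only for illustration.  Note that both the read
(the membership check, the snapshot) and the write happen under the lock,
and that add_many re-acquires the lock through add_server, which is what
the reentrant part of RLock allows:

```python
import threading

servers = []
servers_lock = threading.RLock()


def add_server(name):
    # The check (read) and the append (write) happen under one lock, so no
    # other thread can slip in between them.
    with servers_lock:
        if name not in servers:
            servers.append(name)


def add_many(names):
    # Holds the lock, then calls add_server, which acquires it again.
    # A plain Lock would deadlock here; an RLock lets the same thread re-enter.
    with servers_lock:
        for name in names:
            add_server(name)


def snapshot():
    # Readers take the lock too, and return a copy rather than the live list.
    with servers_lock:
        return list(servers)


def run_demo():
    threads = [
        threading.Thread(target=add_many, args=(["a", "b", "c"],))
        for _ in range(4)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return snapshot()
```

Four threads all try to add the same three names, and each name ends up in
the list exactly once.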

Hope that was helpful, not confusing.  I do think concurrency needs more
discussion in python.  It seems like a lot of people ignore the issue.
Frameworks like Pylons help keep it out of sight, but it is still there
behind the scenes, and I think it deserves some recognition.

Good luck!

Hans

