These are my first thoughts on approaching threaded python 2.7
apps. Please critique them; I could be totally wrong here, and I don't
want to be! Thanks in advance.

----

Hello, dummy here.

I'm just beginning my first experiments with python 2.7 apps, using
"threadsafe: true". But I'm a clueless n00b as far as python goes.
Well, not a n00b, but still a beginner. And then this multi-threading
thing turns up, and I find myself groaning "oh man, really, does it
have to get this complex?" I think I hear a lot of similar groans out
there ;-)

I'm betting that the whole "multithreaded" thing in python appengine
apps is scaring plenty of people. I've done a lot of concurrent
programming, but the prospect of dealing with threading in python has
daunted me a bit because I'm a beginner with python and appengine as
it is - this just makes life harder. But hey, it's being added for a
reason; I'd best quit complaining and start figuring it out!

Thinking about threads and python, I realised that I didn't know how I
was actually supposed to use multi-threading to make my apps leaner and
meaner. I mean, why would I use threads? They're for doing inherently
concurrent things. Serving up pages isn't inherently concurrent stuff,
at the app development level. What exactly is expected here? Shouldn't
the framework be doing that kind of thing for me?

And of course that was the aha moment. The framework *is* doing the work for me.

The situation with python appengine development up until now has been
that instances process serially. They take a request, see it through
to its end. They take another request. And so on. That's cool, but
instances spend a lot of time sitting around waiting when they could
be doing more work.

But with the new python 2.7 support, you can tell appengine that it
would be ok to give instances more work when they are blocked waiting
for something. e.g. if they are doing a big url fetch, or a long query
from datastore, something like that, then it's cool to give them
another request to begin working on, and come back to the waiting
request later when it's ready. You do that by setting "threadsafe:
true" in your app.yaml.
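To make the "waiting around" point concrete, here is a rough simulation using plain python threads. It is only a sketch: it says nothing about how appengine actually schedules requests, and handle_request, run_serially and run_concurrently are made-up names. time.sleep stands in for a blocking url fetch or datastore query.

```python
import threading
import time


def handle_request(results, index, wait_seconds):
    # Stand-in for a request that spends its time blocked on I/O.
    time.sleep(wait_seconds)
    results[index] = "done"


def run_serially(count, wait_seconds):
    # One request at a time, like a non-threadsafe instance.
    results = [None] * count
    start = time.time()
    for i in range(count):
        handle_request(results, i, wait_seconds)
    return results, time.time() - start


def run_concurrently(count, wait_seconds):
    # One thread per request; the waits overlap instead of adding up.
    results = [None] * count
    threads = [
        threading.Thread(target=handle_request,
                         args=(results, i, wait_seconds))
        for i in range(count)
    ]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.time() - start
```

Running both with, say, four requests that each wait 0.05 seconds should show the serial version taking roughly the sum of the waits, and the threaded version taking roughly one wait.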

Being threadsafe sounds scary! But actually it shouldn't be a huge
deal. Pretty much it's about what you shouldn't do.

Multi-threading means having multiple points of execution on the one
codebase in the one address space. Anything you do to touch things
external to that (like datastore, memcache, url fetches) shouldn't
care about that (assuming the client libraries are threadsafe). And
normal code touching local variables will be fine.

Probably the only real thing you've got to worry about is using
instance memory (global variables, more or less). That's because
multiple requests, i.e. multiple threads, can come in and fiddle with
that global memory at the same time. You can fix that with some
concurrency primitives, but if that sounds scary you can just avoid
touching global memory in the first place.
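For anyone wondering what "some concurrency primitives" looks like in practice, here is a minimal sketch using threading.Lock from the standard library. The counter and the function names are invented for illustration; the point is only that every read and write of the shared global happens while holding the lock.

```python
import threading

# Module-level ("instance memory") state shared by every request thread.
_hit_count = 0
_hit_count_lock = threading.Lock()


def record_hit():
    """Increment the shared counter; safe to call from many threads."""
    global _hit_count
    with _hit_count_lock:
        # Only one thread at a time can run this read-modify-write.
        _hit_count += 1
        return _hit_count


def current_hits():
    """Read the shared counter under the same lock."""
    with _hit_count_lock:
        return _hit_count
```

Without the lock, two threads could both read the old value and both write back old + 1, losing an update.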

So if, say, you're using instance memory as one layer of a caching
strategy (caching like instance-memory -> memcache -> datastore), then
you either need to make the instance memory caching threadsafe, or
just stop using instance memory for that purpose.
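Here is one way the "make it threadsafe" option might look. It is a sketch under assumptions: InstanceCache is a made-up class, and load is a hypothetical callable standing in for whatever does the slower memcache -> datastore lookup.

```python
import threading


class InstanceCache(object):
    """A small in-memory cache guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._values = {}

    def get(self, key, load):
        # Fast path: return the cached value if we already have one.
        with self._lock:
            if key in self._values:
                return self._values[key]
        # Slow path: call load() outside the lock so other threads
        # are not blocked while this one waits on I/O.
        value = load(key)
        with self._lock:
            # If another thread stored a value in the meantime, keep
            # that one so every caller sees the same object.
            return self._values.setdefault(key, value)

    def invalidate(self, key):
        with self._lock:
            self._values.pop(key, None)
```

The trade-off in this version is that two threads missing on the same key at the same moment may both call load(); that costs a duplicate lookup but never holds the lock across a slow call.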

The other big gotcha, implied by this issue with global memory, is
libraries. Which libraries are threadsafe? Plenty probably aren't,
especially some of those shady 3rd party python libs you found lying
around on code.google.com. Why not? Because they use global memory.
But the built-in libs should be ok, unless we've been specifically
told they're not, and I don't recall any information like that.

Oh, and your app needs to use WSGI script handlers, presumably because
the CGI method that was recommended for py 2.5 apps is not
threadsafe.
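For anyone unsure what a WSGI handler is, here is a bare-bones sketch with no framework at all. The name app is just a convention I've assumed; a real app would more likely build this object with webapp2.WSGIApplication, as in the 2.7 getting started guide. The key difference from the CGI style is that nothing runs a main() per request: the runtime imports the module once and calls the same object for every request, possibly from several threads at once.

```python
def app(environ, start_response):
    """A plain WSGI application: one callable, invoked once per request."""
    body = b"Hello, world!"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    # Everything here is a local variable, so concurrent requests
    # cannot interfere with each other.
    start_response("200 OK", headers)
    return [body]
```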

So to sum up, if you aren't too sure about multi threading and want to
keep it simple, it seems like you can get your existing app processing
parallel requests by doing the following:
 - Remove uses of global instance memory (if you don't know what that
   means you're probably not doing it anyway)
 - Remove/replace non threadsafe libraries (tricky - do more
   experienced pythonistas know of any way to easily determine this? eg
   pre-existing lists?)
 - Modify your app starting point, the bit that wrangles your
   WSGIApplication, so that it works like this:
      http://code.google.com/appengine/docs/python/gettingstartedpython27/helloworld.html
   and not like this:
      http://code.google.com/appengine/docs/python/gettingstarted/usingwebapp.html
 - Set up your app.yaml properly, as per:
      http://code.google.com/appengine/docs/python/gettingstartedpython27/helloworld.html
 - Update your SDK to 1.5.5 (or later) otherwise it'll refuse to upload.
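
For reference, a threadsafe app.yaml along the lines of the 2.7 getting started guide might look roughly like the sketch below. The application id and the main.app handler are placeholders I've assumed; substitute your own. Note that the script entry names a WSGI object inside a module (main.app) rather than a script file (main.py).

```yaml
# Placeholder application id; use your own.
application: your-app-id
version: 1
runtime: python27
api_version: 1
# Allow an instance to be handed concurrent requests.
threadsafe: true

handlers:
- url: /.*
  # module.object form: the WSGI object named "app" in main.py.
  script: main.app
```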

I don't think the dev appserver will run your code concurrently yet,
but you can always set threadsafe: false for local development, then
change it before you upload.

On a related note, there is other stuff that you need to check to make
sure your app is ready for python 2.7, largely around newer versions
of libraries being used (eg: webob has changed). Check this page:
http://code.google.com/appengine/docs/python/python27/newin27.html

-- 
Emlyn

http://my.syyn.cc - Synchronise Google+, Facebook, WordPress and Google
Buzz posts, comments and all.
http://point7.wordpress.com - My blog
Find me on Facebook and Buzz
