Re: Multiple modules with database access + general app design?
Frank Millman wrote: I have subclassed threading.Thread, and I store a number of attributes within the subclass that are local to the thread. It seems to work fine, but according to what you say (and according to the Python docs, otherwise why would there be a 'Local' class) there must be some reason why it is not a good idea. Please can you explain the problem with this approach. Your design is just fine. If you follow the thread upwards, you'll notice that I encouraged the OP to pass everything by parameter. Using thread local storage in this case was meant to be a kludge so that not every def and every call has to be changed. There are other cases when you don't control how threads are created (say, a plugin for web framework) where thread local storage is useful. threading.local is new in Python 2.4, so it doesn't seem to be that essential to Python thread programming. Daniel -- http://mail.python.org/mailman/listinfo/python-list
Re: Multiple modules with database access + general app design?
Daniel Dittmar wrote: Frank Millman wrote: I have subclassed threading.Thread, and I store a number of attributes within the subclass that are local to the thread. It seems to work fine, but according to what you say (and according to the Python docs, otherwise why would there be a 'Local' class) there must be some reason why it is not a good idea. Please can you explain the problem with this approach. Your design is just fine. If you follow the thread upwards, you'll notice that I encouraged the OP to pass everything by parameter. Many thanks, Daniel Frank -- http://mail.python.org/mailman/listinfo/python-list
Multiple modules with database access + general app design?
Hey people

I'm an experienced PHP programmer who's been writing Python for a couple of weeks now. I'm writing quite a large application which I've decided to break down into lots of modules (a replacement for PHP's include() statement).

My problem is, in PHP if you open a database connection it's always in scope for the duration of the script. Even if you use an abstraction layer ($db = DB::connect(...)) you can `global $db` and bring it into scope, but in Python I'm having trouble keeping the database in scope. At the moment I'm having to push the database into the module, but I'd prefer the module to bring the database connection in (pull) from its parent. Eg:

    import modules
    modules.foo.c = db.cursor()
    modules.foo.Bar()

Can anyone recommend any cleaner solutions to all of this? As far as I can see, Python doesn't have much support for breaking down large programs into organisable files that reference each other.

Another problem is I keep having to import modules all over the place. A real example: I have a module webhosting, a module users, and a module common. These are all submodules of the module modules (bad naming, I know). The database connection is instantiated on the db variable of my main module, which is yellowfish (a global module), so I get the situation where:

    (yellowfish.py)
    import modules
    modules.webhosting.c = db.cursor()
    modules.webhosting.Something()

webhosting needs methods in common and users:

    from modules import common, users

However users also needs common:

    from modules import common

And they all need access to the database (users and common):

    from yellowfish import db
    c = db.cursor()

Can anyone give me advice on making this all a bit more transparent? I guess I really would like a method to bring all these files into the same scope to make everything seem to be all one application, even though everything is broken up into different files.
One added complication in this particular application: I used modules because I'm calling arbitrary methods defined in some XML format. Obviously I wanted to keep security in mind, so my application goes something like this:

    import modules
    module, method, args = getXmlAction()
    m = getattr(modules, module)
    m.c = db.cursor()
    f = getattr(m, method)
    f(args)

In PHP this method is excellent, because I can include all the files I need, each containing a class, and I can use variable variables:

    <?php
    $class = new $module; // can't remember if this works, there are
                          // alternatives though
    $class->$method($args);
    ?>

And $class->$method() just does global $db; $db->query(...);.

Any advice would be greatly appreciated!

Cheers
-Robin Haswell
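On the security point: a bare getattr on a module will hand back any attribute of that module, including names it merely imported. One possible way to narrow this is an explicit table of allowed actions. The following is only a minimal sketch under assumptions; the handler names, the ACTIONS table and the dispatch function are hypothetical and not part of the original application.

```python
# Hypothetical handlers; in the real application these would live in
# modules.webhosting, modules.users and so on.
def create_site(cursor, args):
    return ("create_site", args)


def delete_site(cursor, args):
    return ("delete_site", args)


# Explicit registry: only (module, method) pairs listed here can be
# reached from the untrusted XML input.
ACTIONS = {
    ("webhosting", "create_site"): create_site,
    ("webhosting", "delete_site"): delete_site,
}


def dispatch(module, method, cursor, args):
    """Look up (module, method) in the registry and call the handler."""
    try:
        handler = ACTIONS[(module, method)]
    except KeyError:
        raise ValueError("unknown action: %s.%s" % (module, method))
    # The cursor is passed in as a parameter instead of being set
    # on the module.
    return handler(cursor, args)
```

A request naming anything outside the table is rejected before any attribute lookup happens, and each handler receives its cursor as an argument.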
Re: Multiple modules with database access + general app design?
Robin Haswell wrote:
> Hey people
>
> I'm an experienced PHP programmer who's been writing Python for a couple of weeks now. I'm writing quite a large application which I've decided to break down into lots of modules (a replacement for PHP's include() statement).
>
> My problem is, in PHP if you open a database connection it's always in scope for the duration of the script. Even if you use an abstraction layer ($db = DB::connect(...)) you can `global $db` and bring it into scope, but in Python I'm having trouble keeping the database in scope. At the moment I'm having to push the database into the module, but I'd prefer the module to bring the database connection in (pull) from its parent. Eg:
>
>     import modules
>     modules.foo.c = db.cursor()
>     modules.foo.Bar()
>
> Can anyone recommend any cleaner solutions to all of this?

Um, I think your Python solution *is* moving in a cleaner direction than simple sharing of a global $db variable. Why make the Bar class have to know where to get a db cursor from? What do you do if your program extends to having multiple Bar() objects working with different cursors into the db?

The unnatural part of this (and hopefully, the part that you feel is unclean) is that you're trading one global for another. By just setting modules.foo.c to the db cursor, you force all Bar() instances to use that same cursor.

Instead, make the database cursor part of Bar's constructor. Now you can externally create multiple db cursors, a Bar for each, and they all merrily do their own separate, isolated processing, in blissful ignorance of each other's db cursors (vs. colliding on the shared $db variable).

-- Paul
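A minimal sketch of the constructor approach described above. It assumes sqlite3 from the standard library as a stand-in for MySQLdb so that it runs without a server; the items table, the make_db helper and the count_rows method are made up for illustration.

```python
import sqlite3


class Bar:
    """Does its work through whatever cursor it was given."""

    def __init__(self, cursor):
        self.cursor = cursor

    def count_rows(self):
        self.cursor.execute("SELECT COUNT(*) FROM items")
        return self.cursor.fetchone()[0]


def make_db(rows):
    """Hypothetical helper: build a throwaway in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (name TEXT)")
    conn.executemany("INSERT INTO items VALUES (?)", [(r,) for r in rows])
    return conn


# Two independent databases, two cursors, two Bar instances.
db_a = make_db(["x", "y"])
db_b = make_db(["z"])
bar_a = Bar(db_a.cursor())
bar_b = Bar(db_b.cursor())
```

Neither Bar instance knows where its cursor came from, so the same class works against either database.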
Re: Multiple modules with database access + general app design?
On Thu, 19 Jan 2006 12:23:12 +, Paul McGuire wrote: Robin Haswell [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Hey people I'm an experience PHP programmer who's been writing python for a couple of weeks now. I'm writing quite a large application which I've decided to break down in to lots of modules (replacement for PHP's include() statement). My problem is, in PHP if you open a database connection it's always in scope for the duration of the script. Even if you use an abstraction layer ($db = DB::connect(...)) you can `global $db` and bring it in to scope, but in Python I'm having trouble keeping the the database in scope. At the moment I'm having to push the database into the module, but I'd prefer the module to bring the database connection in (pull) from its parent. Eg: import modules modules.foo.c = db.cursor() modules.foo.Bar() Can anyone recommend any cleaner solutions to all of this? Um, I think your Python solution *is* moving in a cleaner direction than simple sharing of a global $db variable. Why make the Bar class have to know where to get a db cursor from? What do you do if your program extends to having multiple Bar() objects working with different cursors into the db? The unnatural part of this (and hopefully, the part that you feel is unclean) is that you're trading one global for another. By just setting modules.foo.c to the db cursor, you force all Bar() instances to use that same cursor. Instead, make the database cursor part of Bar's constructor. Now you can externally create multiple db cursors, a Bar for each, and they all merrily do their own separate, isolated processing, in blissful ignorance of each other's db cursors (vs. colliding on the shared $db variable). Hm if truth be told, I'm not totally interested in keeping a separate cursor for every class instance. 
This application runs in a very simple threaded socket server. Every time a new thread is created, we create a new db.cursor; the first part of the thread is:

    m = getattr(modules, module)
    m.c = db.cursor()

When the thread finishes all its actions (of which there are many, but all sequential), the thread exits. I don't see any situations where lots of methods will tread on another method's cursor.

My main focus really is minimising the number of connections. Using MySQLdb, I'm not sure if every MySQLdb.connect or db.cursor is a separate connection, but I get the feeling that a lot of cursors = a lot of connections. I'd much prefer each method call within a thread to reuse that thread's connection, as creating a connection incurs significant overhead on the MySQL server and DNS server.

-Rob
Re: Multiple modules with database access + general app design?
Robin Haswell wrote:
> cursor for every class instance. This application runs in a very simple threaded socket server - every time a new thread is created, we create a new db.cursor (m = getattr(modules, module); m.c = db.cursor() is the first part of the thread), and when the thread finishes all its actions (of which there are many, but all sequential), the thread exits. I don't

If you use a threading server, you can't put the connection object into the module. Modules and hence module variables are shared across threads. You could use thread local storage, but I think it's better to pass the connection explicitly as a parameter.

> separate connection, but I get the feeling that a lot of cursors = a lot of connections. I'd much prefer each method call with a thread to reuse that thread's connection, as creating a connection incurs significant overhead on the MySQL server and DNS server.

You can create several cursor objects from one connection. There should be no problems if you finish processing one cursor before you open the next one. In earlier (current?) versions of MySQL, only one result set could be open at a time, so using cursors in parallel presents some problems to the driver implementor.

Daniel
Re: Multiple modules with database access + general app design?
On Thu, 19 Jan 2006 14:37:34 +0100, Daniel Dittmar wrote: Robin Haswell wrote: cursor for every class instance. This application runs in a very simple threaded socket server - every time a new thread is created, we create a new db.cursor (m = getattr(modules, module)\n m.c = db.cursor() is the first part of the thread), and when the thread finishes all its actions (of which there are many, but all sequential), the thread exits. I don't If you use a threading server, you can't put the connection object into the module. Modules and hence module variables are shared across threads. You could use thread local storage, but I think it's better to pass the connection explicitely as a parameter. Would you say it would be better if in every thread I did: m = getattr(modules, module) b.db = db ... def Foo(): c = db.cursor() ? separate connection, but I get the feeling that a lot of cursors = a lot of connections. I'd much prefer each method call with a thread to reuse that thread's connection, as creating a connection incurs significant overhead on the MySQL server and DNS server. You can create several cursor objects from one connection. There should be no problems if you finish processing of one cursor before you open the next one. In earlier (current?) versions of MySQL, only one result set could be opened at a time, so using cursors in parallel present some problems to the driver implementor. Daniel -- http://mail.python.org/mailman/listinfo/python-list
Re: Multiple modules with database access + general app design?
Robin Haswell wrote:
> Hey people
>
> I'm an experienced PHP programmer who's been writing Python for a couple of weeks now. I'm writing quite a large application which I've decided to break down into lots of modules (a replacement for PHP's include() statement).
>
> My problem is, in PHP if you open a database connection it's always in scope for the duration of the script. Even if you use an abstraction layer ($db = DB::connect(...)) you can `global $db` and bring it into scope, but in Python I'm having trouble keeping the database in scope. At the moment I'm having to push the database into the module, but I'd prefer the module to bring the database connection in (pull) from its parent.

This is what I do.

Create a separate module to contain your global variables - mine is called 'common'.

In common, create a class, with attributes, but with no methods. Each attribute becomes a global variable. My class is called 'c'.

At the top of every other module, put 'from common import c'.

Within each module, you can now refer to any global variable as c.whatever.

You can create class attributes on the fly. You can therefore have something like -

    c.db = MySql.connect(...)

All modules will be able to access c.db

As Daniel has indicated, it may not be safe to share one connection across multiple threads, unless you can guarantee that one thread completes its processing before another one attempts to access the database. You can use threading locks to assist with this.

HTH

Frank Millman
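The shared-namespace idea with a lock might look roughly like this. It is a sketch under assumptions: the class c is defined inline instead of in a separate 'common' module so the example fits in one file, sqlite3 stands in for the MySQL driver, and the db_lock attribute, the log table and the record function are invented for illustration.

```python
import sqlite3
import threading


class c:
    """Attribute holder; in the post this lives in the module 'common'."""
    pass


# check_same_thread=False lets sqlite3 accept calls from several threads;
# the lock below is what actually serialises access to the connection.
c.db = sqlite3.connect(":memory:", check_same_thread=False)
c.db_lock = threading.Lock()
c.db.execute("CREATE TABLE log (worker TEXT)")


def record(worker_name):
    """Insert one row while holding the lock around the whole transaction."""
    with c.db_lock:
        cur = c.db.cursor()
        cur.execute("INSERT INTO log VALUES (?)", (worker_name,))
        cur.close()
        c.db.commit()


threads = [threading.Thread(target=record, args=("worker-%d" % i,))
           for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

row_count = c.db.execute("SELECT COUNT(*) FROM log").fetchone()[0]
```

Holding the lock for the whole execute-and-commit sequence means one thread finishes its processing before the next one touches the connection, which is the guarantee the post asks for.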
Re: Multiple modules with database access + general app design?
Robin Haswell wrote:
> On Thu, 19 Jan 2006 14:37:34 +0100, Daniel Dittmar wrote:
>> If you use a threading server, you can't put the connection object into the module. Modules and hence module variables are shared across threads. You could use thread local storage, but I think it's better to pass the connection explicitly as a parameter.
>
> Would you say it would be better if in every thread I did:
>
>     m = getattr(modules, module)
>     b.db = db
>     ...
>     def Foo():
>         c = db.cursor()

I was thinking (example from original post):

    import modules
    modules.foo.Bar(db.cursor ())

    # file modules.foo
    def Bar (cursor):
        cursor.execute (...)

The same is true for other objects like the HTTP request: always pass them as parameters because module variables are shared between threads.

If you have an HTTP request object, then you could attach the database connection to that object; that way you have to pass only one object.

Or you create a new class that encompasses everything useful for this request: the HTTP request, the database connection, possibly an object containing authorization info etc.

I assume that in PHP, global still means 'local to this request', as PHP probably runs in threads under Windows IIS (and Apache 2.0?). In Python, you have to be more explicit about the scope.

Daniel
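The "one class that encompasses everything useful for this request" suggestion could be sketched as follows. The RequestContext class, its fields and the rename_user handler are hypothetical names chosen for the example, and sqlite3 again stands in for the real database driver.

```python
import sqlite3


class RequestContext:
    """Bundles the per-request objects so handlers take one parameter."""

    def __init__(self, db, user, params):
        self.db = db          # database connection for this request
        self.user = user      # stand-in for authorization info
        self.params = params  # stand-in for the parsed request


def rename_user(ctx):
    """Example handler: everything it needs arrives through ctx."""
    cur = ctx.db.cursor()
    cur.execute("UPDATE users SET name = ? WHERE name = ?",
                (ctx.params["new"], ctx.user))
    changed = cur.rowcount
    cur.close()
    ctx.db.commit()
    return changed


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('rob')")
conn.commit()

ctx = RequestContext(conn, "rob", {"new": "robin"})
changed = rename_user(ctx)
```

Adding another per-request object later means adding one attribute to the context, not another parameter to every function signature.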
Re: Multiple modules with database access + general app design?
On Thu, 19 Jan 2006 15:43:58 +0100, Daniel Dittmar wrote: Robin Haswell wrote: On Thu, 19 Jan 2006 14:37:34 +0100, Daniel Dittmar wrote: If you use a threading server, you can't put the connection object into the module. Modules and hence module variables are shared across threads. You could use thread local storage, but I think it's better to pass the connection explicitely as a parameter. Would you say it would be better if in every thread I did: m = getattr(modules, module) b.db = db ... def Foo(): c = db.cursor() I was thinking (example from original post): import modules modules.foo.Bar(db.cursor ()) # file modules.foo def Bar (cursor): cursor.execute (...) Ah I see.. sounds interesting. Is it possible to make any module variable local to a thread, if set within the current thread? Your method, although good, would mean revising all my functions in order to make it work? Thanks -- http://mail.python.org/mailman/listinfo/python-list
Re: Multiple modules with database access + general app design?
On Thu, 19 Jan 2006 06:38:39 -0800, Frank Millman wrote: Robin Haswell wrote: Hey people I'm an experience PHP programmer who's been writing python for a couple of weeks now. I'm writing quite a large application which I've decided to break down in to lots of modules (replacement for PHP's include() statement). My problem is, in PHP if you open a database connection it's always in scope for the duration of the script. Even if you use an abstraction layer ($db = DB::connect(...)) you can `global $db` and bring it in to scope, but in Python I'm having trouble keeping the the database in scope. At the moment I'm having to push the database into the module, but I'd prefer the module to bring the database connection in (pull) from its parent. This is what I do. Create a separate module to contain your global variables - mine is called 'common'. In common, create a class, with attributes, but with no methods. Each attribute becomes a global variable. My class is called 'c'. At the top of every other module, put 'from common import c'. Within each module, you can now refer to any global variable as c.whatever. You can create class attributes on the fly. You can therefore have something like - c.db = MySql.connect(...) All modules will be able to access c.db As Daniel has indicated, it may not be safe to share one connection across multiple threads, unless you can guarantee that one thread completes its processing before another one attempts to access the database. You can use threading locks to assist with this. HTH Frank Millman Thanks, that sounds like an excellent idea. While I don't think it applies to the database (threading seems to be becoming a bit of an issue at the moment), I know I can use that in other areas :-) Cheers -Rob -- http://mail.python.org/mailman/listinfo/python-list
Re: Multiple modules with database access + general app design?
Robin Haswell wrote:
> Can anyone give me advice on making this all a bit more transparent? I guess I really would like a method to bring all these files in to the same scope to make everything seem to be all one application, even though everything is broken up in to different files.

This is very much a deliberate design decision in Python. I haven't used PHP, but in e.g. C, the #include directive means that you pollute your namespace with all sorts of strange names from all the third party libraries you are using, and this doesn't scale well. As your application grows, you'll get mysterious bugs due to strange name clashes; removing some module you no longer need means that your app won't build, since the include file you no longer include in turn included another file that you should have included but didn't, etc.

In Python, explicit is better than implicit (type import this at the Python prompt), and while this causes some extra typing it helps with code maintenance. You can always see where a name in your current namespace comes from (unless you use from xxx import *). No magic!

Concerning your database operations, it seems they are distributed over a lot of different modules, and that might also cause problems, whatever programming language we use.

In typical database applications, you need to keep track of transactions properly. For each opened connection, you can perform a number of transactions one after another. A transaction starts with the first database operation after a connect, commit or rollback. A cursor should only live within a transaction. In other words, you should close all cursors before you perform a commit or rollback.

I find it very difficult to manage transactions properly if the commits are spread out in the code. Usually I want one module to contain some kind of transaction management logic, where I determine the transaction boundaries. This logic will hand out cursor objects to various pieces of code, and determine when to close the cursors and commit the transaction.

I haven't really written multithreaded applications, so I don't have any experience of the problems that might cause. I know that it's a fairly common pattern to have all database transactions in one thread though, and to use Queue.Queue instances to pass data to and from the thread that handles the DB.

Anyway, you can only have one transaction going on at a time for a connection, so if you share connections between threads (or use a separate DB thread and queues) a rollback or commit in one thread will affect the other threads as well...

Each DB-API 2.0 compliant library should be able to declare how it can be used in a threaded application. See the DB-API 2.0 spec: http://python.org/peps/pep-0249.html Look for threadsafety.
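The "one DB thread plus queues" pattern mentioned above might be sketched like this. The assumptions are spelled out: it targets Python 3, where the Queue module is named queue; sqlite3 stands in for the real driver; and db_worker, run_query and the events table are hypothetical names. The worker owns the only connection and treats each job as one transaction, closing the cursor before the commit.

```python
import queue
import sqlite3
import threading

requests = queue.Queue()


def db_worker():
    """Owns the only connection; runs each job as one transaction."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (msg TEXT)")
    while True:
        job = requests.get()
        if job is None:  # sentinel: shut down
            break
        sql, params, reply = job
        try:
            cur = conn.cursor()
            cur.execute(sql, params)
            rows = cur.fetchall()
            cur.close()      # close the cursor before ending the transaction
            conn.commit()
            reply.put(rows)
        except sqlite3.Error as exc:
            conn.rollback()
            reply.put(exc)
    conn.close()


def run_query(sql, params=()):
    """Called from any thread: send a job and wait for its result."""
    reply = queue.Queue()
    requests.put((sql, params, reply))
    result = reply.get()
    if isinstance(result, Exception):
        raise result
    return result


worker = threading.Thread(target=db_worker)
worker.start()

run_query("INSERT INTO events VALUES (?)", ("started",))
rows = run_query("SELECT msg FROM events")

requests.put(None)
worker.join()
```

Because every commit and rollback happens in one place, the transaction boundaries are easy to find. The trade-off is that all database work is serialised through a single thread.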
Re: Multiple modules with database access + general app design?
Robin Haswell wrote:
> Ah I see.. sounds interesting. Is it possible to make any module variable local to a thread, if set within the current thread?

Not directly. The following class tries to simulate it (only in Python 2.4):

    import threading

    class ThreadLocalObject (threading.local):
        def setObject (self, object):
            setattr (self, 'object', object)

        def clearObject (self):
            setattr (self, 'object', None)

        def __getattr__ (self, name):
            object = threading.local.__getattribute__ (self, 'object')
            return getattr (object, name)

You use it as:

in some module x:

    db = ThreadLocalObject ()

in some module that creates the database connection:

    import x

    def createConnection ():
        localdb = ...connect (...)
        x.db.setObject (localdb)

in some module that uses the database connection:

    import x

    def bar ():
        cursor = x.db.cursor ()

The trick is:
- every attribute of a threading.local is thread local (see doc of module threading)
- when accessing an attribute of object x.db, the method __getattr__ will first retrieve the thread local database connection and then access the specific attribute of the database connection. Thus it looks as if x.db is itself a database connection object.

That way, only the setting of the db variable would have to be changed.

I'm not exactly recommending this, as it seems very error prone to me. It's easy to overwrite the variable holding the cursors with an actual cursor object.

Daniel
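For readers who want to try the proxy idea, here is a self-contained variant that runs on Python 3. It is a sketch and not Daniel's exact code: the method and attribute names are changed slightly, and FakeConnection is a hypothetical stand-in for a real DB-API connection so that no database is needed.

```python
import threading


class ThreadLocalObject(threading.local):
    """Forwards attribute access to an object stored per thread."""

    def set_object(self, obj):
        self.obj = obj  # threading.local keeps this separate per thread

    def __getattr__(self, name):
        # Only called when normal lookup fails, e.g. for 'cursor'.
        try:
            target = threading.local.__getattribute__(self, "obj")
        except AttributeError:
            raise AttributeError("no object set for this thread") from None
        return getattr(target, name)


class FakeConnection:
    """Hypothetical stand-in for a database connection."""

    def __init__(self, label):
        self.label = label

    def cursor(self):
        return "cursor of " + self.label


db = ThreadLocalObject()
seen = {}


def worker(label):
    db.set_object(FakeConnection(label))  # visible to this thread only
    seen[label] = db.cursor()             # forwarded to this thread's object


threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each worker sees only the object it set itself, and a thread that never called set_object gets an AttributeError when it tries to use db.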
Re: Multiple modules with database access + general app design?
Daniel Dittmar wrote:
> Robin Haswell wrote:
>> Ah I see.. sounds interesting. Is it possible to make any module variable local to a thread, if set within the current thread?
>
> Not directly. The following class tries to simulate it (only in Python 2.4):
>
>     import threading
>
>     class ThreadLocalObject (threading.local):

Daniel, perhaps you can help me here.

I have subclassed threading.Thread, and I store a number of attributes within the subclass that are local to the thread. It seems to work fine, but according to what you say (and according to the Python docs, otherwise why would there be a 'Local' class) there must be some reason why it is not a good idea. Please can you explain the problem with this approach.

Briefly, this is what I am doing.

    class Link(threading.Thread):  # each link runs in its own thread
        """Run a loop listening for messages from client."""

        def __init__(self,args):
            threading.Thread.__init__(self)
            print 'link connected',self.getName()
            self.ctrl, self.conn = args
            self._db = {}  # to store db connections for this client connection
            [create various other local attributes]

        def run(self):
            readable = [self.conn.fileno()]
            error = []
            self.sendData = []  # 'stack' of replies to be sent

            self.running = True
            while self.running:
                if self.sendData:
                    writable = [self.conn.fileno()]
                else:
                    writable = []
                r,w,e = select.select(readable,writable,error,0.1)  # 0.1 timeout
                [continue to handle connection]

    class Controller(object):
        """Run a main loop listening for client connections."""

        def __init__(self):
            self.s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            self.s.bind((HOST,PORT))
            self.s.listen(5)
            self.running = True

        def mainloop(self):
            while self.running:
                try:
                    conn,addr = self.s.accept()
                    Link(args=(self,conn)).start()  # create thread to handle connection
                except KeyboardInterrupt:
                    self.shutdown()

    Controller().mainloop()

TIA

Frank Millman
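Reduced to its essentials, the pattern in this post keeps per-connection state as ordinary attributes on a Thread subclass. The following is a runnable Python 3 sketch and not Frank's actual code: the socket handling is left out, and client_id, replies and handle are hypothetical names.

```python
import threading


class Link(threading.Thread):
    """Each link runs in its own thread and keeps its state on itself."""

    def __init__(self, client_id):
        threading.Thread.__init__(self)
        self.client_id = client_id
        self._db = {}      # per-link store, as in the post
        self.replies = []  # 'stack' of replies for this client

    def run(self):
        # Only this thread touches this instance's attributes.
        self._db["main"] = "connection for client %d" % self.client_id
        self.replies.append(self.handle("ping"))

    def handle(self, message):
        return "%d:%s" % (self.client_id, message)


links = [Link(i) for i in range(3)]
for link in links:
    link.start()
for link in links:
    link.join()
```

The state stays separate because each thread works only on its own instance and passes self along. threading.local is aimed at code that cannot be handed such an object explicitly.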