Hello, I'd like to discuss a bit the way we handle the main objects within Trac.
The different problems I'd like to address are: 1. there's a lot of verbosity involved for passing around the 'db' argument, or recreate it from the env when it's missing. 2. in the vc layer, there's a similar concern for the `repos` object, except that there's currently no way to get it efficiently from the env. 3. in some situation, we badly need a req, when we don't have it (e.g. some parts of the notification code, IIRC) 4. we're sometimes using the env, when we clearly should use the req (the env.href vs. req.href issue) So, a bit of clarification is needed, IMO. The *env* should be used to represent the *server* side, the environment/project itself, irrelevant of what the current action/transaction is. The *req* should represent a *client* operation on the environment, in a broad sense. Ideally, a `req` should not be reserved to the client requests coming from web browsers. There should be other ways to create `req` objects, e.g. for use by scripts. Regarding thread safety and object state, there are different assumptions concerning the above two objects: * a `req` object and therefore the objects associated to it during its lifetime are only to be manipulated by one thread (the client thread). * an `env` object and related (all the component objects actually) are to be manipulated by many client requests, so they need to maintain their state in a thread-safe way (and this is not done in a systematic way). The `db` and `repos` objects are costly objects that are created by the `env` for being used during the processing of a client request. Virtually all client requests will need to use the `db` object. Only those related to the vc layer would eventually make use of the `repos` object, but other than that, they're similar. Currently, the env.get_db_cnx() makes use of a connection pool that is "thread-oriented", that is, if a thread has already got a `db` object, subsequent calls to get_db_cnx from the same thread will return the same `db` connection _provided_ the connection has not been gc'ed and returned to the pool in between. This means that in theory, a request can begin some work, then fail to get the db later, and finally timeout... I think it would be more efficient to streamline this, and explicitely attach the `db` object to the `req` object: * a `db` is extracted from the pool only once for each request; once a request has it (cached after first call to req.db), it keeps it for all its subsequent operations. This is mostly similar to what is done now, but without putting it back and forth in the pool, and without having to get at times a db object "out of the blue" (env.get_db_cnx()). * the `db` shouldn't be carried everywhere as an additional argument anymore. When a db operation is needed, there should always be the notion of which client is doing this operation (and therefore a `req` object should be made available). Because a `req` is never reused across threads, this satisfies the constraint set by sqlite that a connection can't be used from a thread different than the one which created it. A completely similar approach could be taken for the `repos`, i.e. create it and attach it to the `req` upon the first call. This would for example enable to solve the recent problem #2891 (Timeline slowness due to the use of get_repository) in a better way. In summary, the problems mentioned at the beginning could be addressed like this: 1. use a `req.db` property to get access to the db layer for the client operations (maybe even use `req.cursor` as a shortcut) 2. use a `req.repos` property to get access to the vc layer for the client operations 3. and 4. in general use `req` more consistently to represent the current client operation -- Christian _______________________________________________ Trac-dev mailing list [email protected] http://lists.edgewall.com/mailman/listinfo/trac-dev
