[Trac-dev] About env, req, db, and repos...

Christian Boos Tue, 21 Mar 2006 03:18:28 -0800

Hello,

I'd like to discuss a bit the way we handle the main objects within Trac.


The different problems I'd like to address are:
1. there's a lot of verbosity involved for passing around the 'db' argument,
  or recreate it from the env when it's missing.
2. in the vc layer, there's a similar concern for the `repos` object,
  except that there's currently no way to get it efficiently from the env.
3. in some situation, we badly need a req, when we don't have it
  (e.g. some parts of the notification code, IIRC)
4. we're sometimes using the env, when we clearly should use the req
  (the env.href vs. req.href issue)


So, a bit of clarification is needed, IMO.

The *env* should be used to represent the *server* side, the
environment/project itself, irrelevant of what the current
action/transaction is.

The *req* should represent a *client* operation on the environment,
in a broad sense. Ideally, a `req` should not be reserved to the
client requests coming from web browsers. There should be other
ways to create `req` objects, e.g. for use by scripts.

Regarding thread safety and object state, there are different
assumptions concerning the above two objects:
* a `req` object and therefore the objects associated to it
  during its lifetime are only to be manipulated by one thread
  (the client thread).
* an `env` object and related (all the component objects actually)
  are to be manipulated by many client requests, so they need to
  maintain their state in a thread-safe way (and this is not done
  in a systematic way).

The `db` and `repos` objects are costly objects that are created
by the `env` for being used during the processing of a client request.
Virtually all client requests will need to use the `db` object.
Only those related to the vc layer would eventually make use of the
`repos` object, but other than that, they're similar.

Currently, the env.get_db_cnx() makes use of a connection pool
that is "thread-oriented", that is, if a thread has already
got a `db` object, subsequent calls to get_db_cnx from the same
thread will return the same `db` connection _provided_ the connection
has not been gc'ed and returned to the pool in between.
This means that in theory, a request can begin some work, then
fail to get the db later, and finally timeout...

I think it would be more efficient to streamline this, and explicitely
attach the `db` object to the `req` object:
* a `db` is extracted from the pool only once for each request;
  once a request has it (cached after first call to req.db),
  it keeps it for all its subsequent operations.
  This is mostly similar to what is done now, but without
  putting it back and forth in the pool, and without having
  to get at times a db object "out of the blue" (env.get_db_cnx()).
* the `db` shouldn't be carried everywhere as an additional
  argument anymore. When a db operation is needed, there should
  always be the notion of which client is doing this operation
 (and therefore a `req` object should be made available).
Because a `req` is never reused across threads, this satisfies
the constraint set by sqlite that a connection can't be used
from a thread different than the one which created it.

A completely similar approach could be taken for the `repos`,
i.e. create it and attach it to the `req` upon the first call.
This would for example enable to solve the recent problem #2891
(Timeline slowness due to the use of get_repository) in a better
way.


In summary, the problems mentioned at the beginning could be
addressed like this:
1. use a `req.db` property to get access to the db layer for the
  client operations (maybe even use `req.cursor` as a shortcut)
2. use a `req.repos` property to get access to the vc layer for
  the client operations
3. and 4. in general use `req` more consistently to represent
  the current client operation

-- Christian


_______________________________________________
Trac-dev mailing list
[email protected]
http://lists.edgewall.com/mailman/listinfo/trac-dev

[Trac-dev] About env, req, db, and repos...

Reply via email to