Cache DB backend

2009-08-01 Thread Fraser Nevett

I was looking at the database cache backend and noticed a number of of
things that might be worth improving...

  * The "createcachetable" management command manually constructs and
executes SQL to create the table. This is seems a bit undesirable as
we already have a well-established and well-tested ORM for the
management of creating tables. It also seems a bit untidy to have the
definition of the database table in a completely separate part of the
codebase from the rest of the db cache code.

  * The db backend code manually constructs and executes SQL to read
from and write to the cache table. Looking through that code, the
table name doesn't get escaped when put into the SQL and the expiry
timestamp doesn't get properly pre-processed before insertion (see
#11483).

I'm posting here first rather than opening a ticket to find out
whether there was a deliberate design decision made for any of this.
Is there any reason the cache table is not just a standard Model?

Using a Model and the ORM would resolve some of the issues I've listed
above and seems to be good DRY with respect to writing SQL. It would
require an app to be added to INSTALLED_APPS if the db cache was being
used, though I'm not sure that's a big issue.

There is a little complexity surrounding the name of the table, as at
present this gets set in CACHE_BACKEND (or as an argument to the
createcachetable management command), so the model would have to
somehow set its db_table meta data from this. We could actually
possibly get rid of createcachetable altogether and just leave the
table creation to syncdb.

By switching to using a standard Model, we would also get the benefit
of the multi-db support that's currently being worked on. This would
allow the db cache to run of an entirely separate database, if
required.

Could I please get some feedback on the above and as to whether a
refactor of this part of the code as described would be a worthwhile
exercise?

Thanks,

Fraser
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---



Re: Cache DB backend

2009-08-02 Thread Russell Keith-Magee

On Sat, Aug 1, 2009 at 6:06 PM, Fraser  Nevett wrote:
>
> I was looking at the database cache backend and noticed a number of of
> things that might be worth improving...
>
>  * The "createcachetable" management command manually constructs and
> executes SQL to create the table. This is seems a bit undesirable as
> we already have a well-established and well-tested ORM for the
> management of creating tables. It also seems a bit untidy to have the
> definition of the database table in a completely separate part of the
> codebase from the rest of the db cache code.
>
>  * The db backend code manually constructs and executes SQL to read
> from and write to the cache table. Looking through that code, the
> table name doesn't get escaped when put into the SQL and the expiry
> timestamp doesn't get properly pre-processed before insertion (see
> #11483).
>
> I'm posting here first rather than opening a ticket to find out
> whether there was a deliberate design decision made for any of this.
> Is there any reason the cache table is not just a standard Model?
>
> Using a Model and the ORM would resolve some of the issues I've listed
> above and seems to be good DRY with respect to writing SQL. It would
> require an app to be added to INSTALLED_APPS if the db cache was being
> used, though I'm not sure that's a big issue.
>
> There is a little complexity surrounding the name of the table, as at
> present this gets set in CACHE_BACKEND (or as an argument to the
> createcachetable management command), so the model would have to
> somehow set its db_table meta data from this. We could actually
> possibly get rid of createcachetable altogether and just leave the
> table creation to syncdb.
>
> By switching to using a standard Model, we would also get the benefit
> of the multi-db support that's currently being worked on. This would
> allow the db cache to run of an entirely separate database, if
> required.
>
> Could I please get some feedback on the above and as to whether a
> refactor of this part of the code as described would be a worthwhile
> exercise?

There are a couple of answers to your "why is it so" query.

One reason we don't use a standard Django model for the cache backend
is that the only cache backend that would use the table is the
database backend. Django doesn't have 'optional' tables - you includes
the app, you gets the tables. If you're using dummy, locmem,
memcached, file, or any other custom backend, you would end up with
empty tables serving no purpose in your database. Ok, there isn't
really any overhead to an empty table, and you don't have to add the
cache app to your INSTALLED_APPS if you're not using the database
backend. However, it's just as easy to write and document the use of a
one-shot management command.

A second reason is that when you're talking about caching, you're
implicitly talking about optimal performance, so you need to get
everything out of the way that could impede that performance. Django's
ORM doesn't have a huge overhead, but it does have _some_ overhead.
Given that a caching query is such a simple beast, writing it using
the ORM rather than raw queries doesn't really yield you much
advantage in terms of implementation simplicity, but it does slow down
an operation that, by definition, is supposed to be as fast as
possible.

Lastly, and perhaps most importantly, if you're using the database
backend for anything other that amusement purposes, UR DOING IT WRONG
:-). The database backend is an interesting novelty and a way to
demonstrate the flexibility of the caching backend, but it really
shouldn't be confused as a good idea. One of the major goals of
caching is to reduce database load, and you don't reduce load by
adding extra reads and writes to the database. If your site needs
caching, use memcached. No ifs, no buts. Just use memcached. When
you're doing local testing, locmem and file are decent substitutes,
and dummy is helpful when you are debugging, but memcached is where
the serious money is.

In summary - I'm sure there are ways that the implementation of the
database cache backend could be improved, but in all honesty, the best
refactoring would be an rm -rf :-)

Yours,
Russ Magee %-)

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---



Re: Cache DB backend

2009-08-02 Thread Fraser Nevett

On Aug 2, 2:48 pm, Russell Keith-Magee  wrote:
> On Sat, Aug 1, 2009 at 6:06 PM, Fraser  Nevett wrote:
>
> > I was looking at the database cache backend and noticed a number of of
> > things that might be worth improving...
>
> >  * The "createcachetable" management command manually constructs and
> > executes SQL to create the table. This is seems a bit undesirable as
> > we already have a well-established and well-tested ORM for the
> > management of creating tables. It also seems a bit untidy to have the
> > definition of the database table in a completely separate part of the
> > codebase from the rest of the db cache code.
>
> >  * The db backend code manually constructs and executes SQL to read
> > from and write to the cache table. Looking through that code, the
> > table name doesn't get escaped when put into the SQL and the expiry
> > timestamp doesn't get properly pre-processed before insertion (see
> > #11483).
>
> > I'm posting here first rather than opening a ticket to find out
> > whether there was a deliberate design decision made for any of this.
> > Is there any reason the cache table is not just a standard Model?
>
> > Using a Model and the ORM would resolve some of the issues I've listed
> > above and seems to be good DRY with respect to writing SQL. It would
> > require an app to be added to INSTALLED_APPS if the db cache was being
> > used, though I'm not sure that's a big issue.
>
> > There is a little complexity surrounding the name of the table, as at
> > present this gets set in CACHE_BACKEND (or as an argument to the
> > createcachetable management command), so the model would have to
> > somehow set its db_table meta data from this. We could actually
> > possibly get rid of createcachetable altogether and just leave the
> > table creation to syncdb.
>
> > By switching to using a standard Model, we would also get the benefit
> > of the multi-db support that's currently being worked on. This would
> > allow the db cache to run of an entirely separate database, if
> > required.
>
> > Could I please get some feedback on the above and as to whether a
> > refactor of this part of the code as described would be a worthwhile
> > exercise?
>
> There are a couple of answers to your "why is it so" query.
>
> One reason we don't use a standard Django model for the cache backend
> is that the only cache backend that would use the table is the
> database backend. Django doesn't have 'optional' tables - you includes
> the app, you gets the tables. If you're using dummy, locmem,
> memcached, file, or any other custom backend, you would end up with
> empty tables serving no purpose in your database. Ok, there isn't
> really any overhead to an empty table, and you don't have to add the
> cache app to your INSTALLED_APPS if you're not using the database
> backend. However, it's just as easy to write and document the use of a
> one-shot management command.
>
> A second reason is that when you're talking about caching, you're
> implicitly talking about optimal performance, so you need to get
> everything out of the way that could impede that performance. Django's
> ORM doesn't have a huge overhead, but it does have _some_ overhead.
> Given that a caching query is such a simple beast, writing it using
> the ORM rather than raw queries doesn't really yield you much
> advantage in terms of implementation simplicity, but it does slow down
> an operation that, by definition, is supposed to be as fast as
> possible.
>
> Lastly, and perhaps most importantly, if you're using the database
> backend for anything other that amusement purposes, UR DOING IT WRONG
> :-). The database backend is an interesting novelty and a way to
> demonstrate the flexibility of the caching backend, but it really
> shouldn't be confused as a good idea. One of the major goals of
> caching is to reduce database load, and you don't reduce load by
> adding extra reads and writes to the database. If your site needs
> caching, use memcached. No ifs, no buts. Just use memcached. When
> you're doing local testing, locmem and file are decent substitutes,
> and dummy is helpful when you are debugging, but memcached is where
> the serious money is.
>
> In summary - I'm sure there are ways that the implementation of the
> database cache backend could be improved, but in all honesty, the best
> refactoring would be an rm -rf :-)
>
> Yours,
> Russ Magee %-)

Hi Russ,

Thanks for your reply. I understand and completely agree with what
you're saying, especially with regards to using a database as a cache
backend.

I won't spend any time on refactoring, but I have opened a ticket
(#11623) and created a patch to ensure the table names get properly
quoted in the SQL (which I think is a good idea, even if actually
using the db cache isn't!).

Fraser
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-dev

Re: Cache DB backend

2009-08-02 Thread Leo Soto M.

On Sun, Aug 2, 2009 at 11:41 AM, Fraser Nevett wrote:
[...]
> I won't spend any time on refactoring, but I have opened a ticket
> (#11623) and created a patch to ensure the table names get properly
> quoted in the SQL (which I think is a good idea, even if actually
> using the db cache isn't!).

If someone is going to commit fixes to the DB cache backend, please
also take a look at 
(which is causing failures with Jython backends).
-- 
Leo Soto M.
http://blog.leosoto.com

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---