Cache and GET parameters

2005-12-05 Thread Adrian Holovaty

Right now, the Django cache system doesn't cache pages that have GET
parameters. This is because GET parameters don't necessarily influence
the output of the page. For example, if the page example.com/foo/ is
cached, anybody could simply add a "?bar=baz" to the URL and Django
wouldn't know whether that was a separate page, or just a bunch of
bogus query string cruft added by a nincompoop. So that's why Django
currently doesn't cache any pages with GET parameters, across the
board.

This is a bad long-term solution, though.

I have a couple of ideas for solutions. The first is to introduce a
NO_GET_PARAMS setting, which would default to False. If it's set to
True, Django would assume that *all* GET parameters (query strings),
sitewide, contain meaningless information, and therefore would not
account for them in creating cache. For example, a request to
example.com/foo/?bar=baz would use the same cache as example.com/foo/.
We might even be able to reuse this setting for other things; I'm not
sure what, yet.
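
The first idea can be sketched roughly as follows. This is only an illustration: `page_cache_key` is a hypothetical helper, and the proposed NO_GET_PARAMS value is passed in as an argument rather than read from Django settings. None of it is existing Django API.

```python
from urllib.parse import urlsplit


def page_cache_key(url, no_get_params=False):
    """Return a cache key for url, or None if the page should not be cached.

    Hypothetical helper: no_get_params stands in for the proposed
    NO_GET_PARAMS setting.
    """
    parts = urlsplit(url)
    if parts.query and not no_get_params:
        # Current behaviour: pages with GET parameters are never cached.
        return None
    # With the setting on, the query string is ignored entirely.
    return parts.netloc + parts.path
```

With the setting on, example.com/foo/?bar=baz and example.com/foo/ map to the same key; with it off, the first URL is simply not cached.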

Another solution could be to introduce a view decorator that specifies
the view doesn't care about GET parameters. Essentially it'd be the
opposite of the vary_on_headers decorator
(http://www.djangoproject.com/documentation/cache/#controlling-cache-using-vary-headers).
However, it'd be a hassle to have to add that decorator to each view,
particularly if you're like me and rarely use query strings.

Finally, along those lines, we could introduce a vary_on_get
decorator, which, used with the NO_GET_PARAMS setting, would be an
opt-in signifying a view *does* rely on the query string. This could be
for stuff like search engines, which do vary based on the query string
(e.g. /search/?q=foo). In this case, though, it'd be nice to be able
to specify the variables that are valid. For example, with the
decorator @vary_on_get('foo', 'bar'), the cache would store separate
pages for /search/?foo=1 and /search/?bar=1, but it would use the same
cache for /search/?foo=1 and /search/?foo=1&gonzo=2, because "gonzo"
isn't specified in "vary_on_get" and thus would be ignored.
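
The key-building rule described above might look something like this. It is a sketch under assumptions: `vary_on_get_key` is a hypothetical function name, and sorting the kept parameters is one possible way to make the key stable, not something the proposal specifies.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def vary_on_get_key(url, *param_names):
    """Build a cache key from the path plus only the named GET parameters."""
    parts = urlsplit(url)
    # Keep only the parameters the view declared; sort them for a stable key.
    kept = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query)
        if name in param_names
    )
    return parts.path + ("?" + urlencode(kept) if kept else "")
```

Called with ('foo', 'bar'), this gives /search/?foo=1 and /search/?foo=1&gonzo=2 the same key, while /search/?bar=1 gets its own.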

What do people think of these ideas?

Adrian

--
Adrian Holovaty
holovaty.com | djangoproject.com | chicagocrime.org


Re: Cache and GET parameters

2008-11-01 Thread PeterK

Picking up an old thread (because it is still relevant)

On 6 Dec 2005, 15:37, Adrian Holovaty <[EMAIL PROTECTED]> wrote:
>
> The remaining question is: What's the behavior if vary_on_get() isn't
> specified for a particular view? Do we cache everything (including
> separate cache entries for any combination of different GET
> parameters) or cache nothing (current behavior)?
>

URLs should be treated as opaque by default, so there
would be a separate cache entry for each of these:

example.com/list/
example.com/list/?a=1&b=2
example.com/list/?b=2&a=1
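
A minimal sketch of "opaque" keying, assuming the key is simply a hash of the full path including the query string. The function name `opaque_key` and the choice of MD5 are illustrative assumptions, not taken from the patch.

```python
import hashlib


def opaque_key(full_path):
    """Hash the path and query string as one string: no parsing, no reordering."""
    return hashlib.md5(full_path.encode("utf-8")).hexdigest()


# The three URLs listed above produce three distinct cache keys.
keys = {opaque_key(p) for p in ("/list/", "/list/?a=1&b=2", "/list/?b=2&a=1")}
```

Because nothing is normalised, ?a=1&b=2 and ?b=2&a=1 are cached separately even though a view would usually treat them the same.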

However, the developer may know better and can specify which
parameters affect the response to the GET request. This could be
done with a decorator on the view function, like this:

Vary by the entire URL (should be default behaviour):

@cache_page(60 * 15)
@vary_by_param("*")  # This should not be required to get per-full-URL caching.
def slashdot_this(request):
    ...

Only vary by values for parameter a and b (ignore everything else):

@cache_page(60 * 15)
@vary_by_param(["a", "b"])
def slashdot_this(request):
    ...

I have added this to ticket 4992 [1] as I believe it would be of great
benefit for everyone filtering lists of data by URL parameters (a
common use case).

Kind regards,

Peter Krantz

[1]: http://code.djangoproject.com/ticket/4992

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---



Re: Cache and GET parameters

2008-11-01 Thread Jeremy Dunck

On Tue, Dec 6, 2005 at 9:37 AM, Adrian Holovaty <[EMAIL PROTECTED]> wrote:
...
> Looks like vary_on_get is the most popular choice. So here's how that
> might work:
>
> @vary_on_get('id')
> def my_view(request):
>     id = request.GET.get('id', None)

To be clear, the generated cache key would still include anything
stated in the HTTP Vary headers, right?

Vary: Cookie combined with @vary_on_get() should still vary on Cookie.
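
One way to picture that combination is sketched below. Everything in it is assumed for illustration: the `combined_key` name, the plain-dict request headers, and the MD5 digest are not how Django builds its keys.

```python
import hashlib
from urllib.parse import parse_qsl, urlsplit


def combined_key(url, request_headers, vary_headers, get_params):
    """Key on the path, the headers named in Vary, and the opted-in GET params."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    pieces = [parts.path]
    # Headers named in Vary still contribute to the key...
    pieces += ["%s=%s" % (h, request_headers.get(h, "")) for h in sorted(vary_headers)]
    # ...alongside the GET parameters the view opted into.
    pieces += ["%s=%s" % (p, query.get(p, "")) for p in sorted(get_params)]
    return hashlib.md5("|".join(pieces).encode("utf-8")).hexdigest()
```

Two requests for the same URL with different Cookie values get different keys, while an extra undeclared parameter leaves the key unchanged.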

> The remaining question is: What's the behavior if vary_on_get() isn't
> specified for a particular view? Do we cache everything (including
> separate cache entries for any combination of different GET
> parameters) or cache nothing (current behavior)?

I say cache nothing; doing otherwise is backwards-incompatible.   I
realize that means a bunch of decorators on views if you want the
cache-everything behavior.

Assuming vary_on_get() with no parameters means no variance (other
than the HTTP Vary headers), then
perhaps we could write a helper to walk URLConf and apply a
vary_on_get() decorator to indicate cache-everything.  People could
opt-in this way without having to go update all code.

(This does fall down if you're mixing reusable apps that expect
cache-nothing.  Hmm.)
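
Such a helper might look roughly like this. It is a sketch only: the URLconf is modelled as a plain list of (pattern, view) pairs, and `cache_everything`, `opt_in_all`, and the `vary_on_get` attribute are hypothetical names rather than Django API.

```python
import functools


def cache_everything(view):
    """Hypothetical marker: vary the cache on every GET parameter."""
    @functools.wraps(view)
    def wrapped(request, *args, **kwargs):
        return view(request, *args, **kwargs)
    wrapped.vary_on_get = "*"
    return wrapped


def opt_in_all(urlpatterns):
    """Wrap every view that has not already declared its own vary_on_get."""
    return [
        (pattern, view if hasattr(view, "vary_on_get") else cache_everything(view))
        for pattern, view in urlpatterns
    ]
```

Views that already carry a `vary_on_get` declaration are left alone, which is one possible way to soften the reusable-apps problem, though it does not solve it for apps that rely on cache-nothing without saying so.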




Re: Cache and GET parameters

2008-11-01 Thread SmileyChris

On Nov 2, 2:52 am, "Jeremy Dunck" <[EMAIL PROTECTED]> wrote:
> Assuming vary_on_get() with no parameters means no variance (other
> than the HTTP Vary headers), then [...]

That seems confusing - the decorator name seems to imply that it would
vary on any get attribute (even though this is the default) - at least
that's how I'd look at it if I didn't know otherwise.



Re: Cache and GET parameters

2008-11-01 Thread Jeremy Dunck

On Sat, Nov 1, 2008 at 8:32 PM, SmileyChris <[EMAIL PROTECTED]> wrote:
>
> On Nov 2, 2:52 am, "Jeremy Dunck" <[EMAIL PROTECTED]> wrote:
>> Assuming vary_on_get() with no parameters means no variance (other
>> than the HTTP Vary headers), then [...]
>
> That seems confusing - the decorator name seems to imply that it would
> vary on any get attribute (even though this is the default) - at least
> that's how I'd look at it if I didn't know otherwise.

@vary_on_get(None) ?  :-)




Re: Cache and GET parameters

2008-11-02 Thread David Cramer

I really like the idea of passing the GET params explicitly, so I'm +1,
especially on solution #3. I had never actually realized it wasn't
caching pages with GET params; luckily, though, none of the pages where
I use this decorator fluctuate like that :)

On Nov 1, 7:51 pm, "Jeremy Dunck" <[EMAIL PROTECTED]> wrote:
> On Sat, Nov 1, 2008 at 8:32 PM, SmileyChris <[EMAIL PROTECTED]> wrote:
>
> > On Nov 2, 2:52 am, "Jeremy Dunck" <[EMAIL PROTECTED]> wrote:
> >> Assuming vary_on_get() with no parameters means no variance (other
> >> than the HTTP Vary headers), then [...]
>
> > That seems confusing - the decorator name seems to imply that it would
> > vary on any get attribute (even though this is the default) - at least
> > that's how I'd look at it if I didn't know otherwise.
>
> @vary_on_get(None) ?  :-)



Re: Cache and GET parameters

2008-11-02 Thread PeterK

On Nov 1, 2:52 pm, "Jeremy Dunck" <[EMAIL PROTECTED]> wrote:
>
> To be clear, the generated cache key would still include anything
> stated in the HTTP Vary headers, right?
>
> Vary: Cookie combined with @vary_on_get() should still vary on Cookie.
>

Yes.

>
> I say cache nothing; doing otherwise is backwards-incompatible.   I
> realize that means a bunch of decorators on views if you want the
> cache-everything behavior.
>

Maybe it's just me, but I find the current behaviour confusing after
reading the introduction to the cache documentation [1]. It says
"Given a URL..." so I expected the cache to use everything that
identifies an object in the URL (path and query as described in RFC
3986 [2]).

But, it is backwards-incompatible so maybe your suggestion is the
right way to go.

[1]: http://docs.djangoproject.com/en/dev/topics/cache/
[2]: http://labs.apache.org/webarch/uri/rfc/rfc3986.html#components

I attached a patch to ticket #4992 for the behaviour I (and apparently
other people) expected:

http://code.djangoproject.com/attachment/ticket/4992/cache_by_request_full_path.diff

Regards,

Peter



Re: Cache and GET parameters

2005-12-05 Thread Amit Upadhyay

+1

On 12/6/05, Adrian Holovaty <[EMAIL PROTECTED]> wrote:
> Right now, the Django cache system doesn't cache pages that have GET
> parameters. This is because GET parameters don't necessarily influence
> the output of the page. For example, if the page example.com/foo/ is
> cached, anybody could simply add a "?bar=baz" to the URL and Django
> wouldn't know whether that was a separate page, or just a bunch of
> bogus query string cruft added by a nincompoop. So that's why Django
> currently doesn't cache any pages with GET parameters, across the
> board.
>
> This is a bad long-term solution, though.
>
> I have a couple of ideas for solutions. The first is to introduce a
> NO_GET_PARAMS setting, which would default to False. If it's set to
> True, Django would assume that *all* GET parameters (query strings),
> sitewide, contain meaningless information, and therefore would not
> account for them in creating cache. For example, a request to
> example.com/foo/?bar=baz would use the same cache as example.com/foo/.
> We might even be able to reuse this setting for other things; I'm not
> sure what, yet.
>
> Another solution could be to introduce a view decorator that specifies
> the view doesn't care about GET parameters. Essentially it'd be the
> opposite of the vary_on_headers decorator
> (http://www.djangoproject.com/documentation/cache/#controlling-cache-using-vary-headers).
> However, it'd be a hassle to have to add that decorator to each view,
> particularly if you're like me and rarely use query strings.
>
> Finally, along those lines, we could introduce a vary_on_get
> decorator, which, used with the NO_GET_PARAMS setting, would be an
> opt-in signifying a view *does* rely on the query string. This could be
> for stuff like search engines, which do vary based on the query string
> (e.g. /search/?q=foo). In this case, though, it'd be nice to be able
> to specify the variables that are valid. For example, with the
> decorator @vary_on_get('foo', 'bar'), the cache would store separate
> pages for /search/?foo=1 and /search/?bar=1, but it would use the same
> cache for /search/?foo=1 and /search/?foo=1&gonzo=2, because "gonzo"
> isn't specified in "vary_on_get" and thus would be ignored.
>
> What do people think of these ideas?
>
> Adrian
>
> --
> Adrian Holovaty
> holovaty.com | djangoproject.com | chicagocrime.org

--
Amit Upadhyay
Blog: http://www.rootshell.be/~upadhyay
+91-9867-359-701


Re: Cache and GET parameters

2005-12-05 Thread Jacob Kaplan-Moss


On Dec 6, 2005, at 12:28 AM, Adrian Holovaty wrote:

> Finally, along those lines, we could introduce a vary_on_get
> decorator, which, used with the NO_GET_PARAMS setting, would be an
> opt-in signifying a view *does* rely on query string. This could be
> for stuff like search engines, which do vary based on the query string
> (e.g. /search/?q=foo). In this case, though, it'd be nice to be able
> to specify the variables that are valid. For example, with the
> decorator @vary_on_get('foo', 'bar'), the cache would store separate
> pages for /search/?foo=1 and /search/?bar=1, but it would use the same
> cache for /search/?foo=1 and /search/?foo=1&gonzo=2, because "gonzo"
> isn't specified in "vary_on_get" and thus would be ignored.


This sounds like the right idea to me: explicitly state which GET  
params invalidate the cache.


Jacob


Re: Cache and GET parameters

2005-12-05 Thread Eugene Lazutkin

"Adrian Holovaty" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]

>Finally, along those lines, we could introduce a vary_on_get
>decorator, which, used with the NO_GET_PARAMS setting, would be an
>opt-in signifying a view *does* rely on query string. This could be
>for stuff like search engines, which do vary based on the query string
>(e.g. /search/?q=foo). In this case, though, it'd be nice to be able
>to specify the variables that are valid. For example, with the
>decorator @vary_on_get('foo', 'bar'), the cache would store separate
>pages for /search/?foo=1 and /search/?bar=1, but it would use the same
>cache for /search/?foo=1 and /search/?foo=1&gonzo=2, because "gonzo"
>isn't specified in "vary_on_get" and thus would be ignored.

I like @vary_on_get(). IMHO, it covers a lot of real-life scenarios.

Additionally, it would be nice to be able to specify the lifetime of a
cached copy from the view. That way we can dynamically assign a longer
lifetime to the items least likely to change, based on their content.
Example: recent articles can be modified, while old articles are
unlikely to be changed now (see
http://code.djangoproject.com/ticket/590).
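
As a rough illustration of that wish, a view could derive the lifetime from the age of the content. The function name `cache_timeout_for` and the thresholds below are arbitrary assumptions chosen for the example, not values from the ticket.

```python
from datetime import datetime, timedelta


def cache_timeout_for(last_modified, now=None):
    """Pick a cache lifetime in seconds from how recently an item changed."""
    now = now or datetime.now()
    age = now - last_modified
    if age < timedelta(days=1):
        return 60  # fresh articles may still be edited
    if age < timedelta(days=30):
        return 60 * 60
    return 60 * 60 * 24  # old articles are unlikely to change
```

A view would then pass the returned number of seconds to whatever call stores its response in the cache.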

One more wish: add a "clear the cache" button to the Admin. I'm not
asking for sophisticated cache management (it would be nice to have,
but...). Even a simple thing would help a lot. ;-)

Thanks,

Eugene





Re: Cache and GET parameters

2005-12-05 Thread Maniac


Jacob Kaplan-Moss wrote:

> This sounds like the right idea to me: explicitly state which GET
> params invalidate the cache.


So when the view's code changes during development one should always
remember to update this invalidators list. Not very DRY :-(


Re: Cache and GET parameters

2005-12-06 Thread James Bennett

On 12/6/05, Maniac <[EMAIL PROTECTED]> wrote:
> So when the view's code changes during development one should always
> remember to update this invalidators list. Not very DRY :-(

Except it's a decorator, so it's right there with your view code.


--
"May the forces of evil become confused on the way to your house."
  -- George Carlin


Re: Cache and GET parameters

2005-12-06 Thread Maniac


James Bennett wrote:


> Except it's a decorator, so it's right there with your view code.

But still you have to blindly copy strings from code to decorator 
parameters.


Re: Cache and GET parameters

2005-12-06 Thread Cheng Zhang



On Dec 6, 2005, at 2:35 PM, Jacob Kaplan-Moss wrote:
> On Dec 6, 2005, at 12:28 AM, Adrian Holovaty wrote:
>> Finally, along those lines, we could introduce a vary_on_get
>> decorator, which, used with the NO_GET_PARAMS setting, would be an
>> opt-in signifying a view *does* rely on query string. This could be
>> for stuff like search engines, which do vary based on the query string
>> (e.g. /search/?q=foo). In this case, though, it'd be nice to be able
>> to specify the variables that are valid. For example, with the
>> decorator @vary_on_get('foo', 'bar'), the cache would store separate
>> pages for /search/?foo=1 and /search/?bar=1, but it would use the same
>> cache for /search/?foo=1 and /search/?foo=1&gonzo=2, because "gonzo"
>> isn't specified in "vary_on_get" and thus would be ignored.
>
> This sounds like the right idea to me: explicitly state which GET
> params invalidate the cache.



I agree. This is the best one among the three proposals.


Re: Cache and GET parameters

2005-12-06 Thread hugo

>Finally, along those lines, we could introduce a vary_on_get
>decorator, which, used with the NO_GET_PARAMS setting, would be an
>opt-in signifying a view *does* rely on query string. This could be
>for stuff like search engines, which do vary based on the query string
>(e.g. /search/?q=foo). In this case, though, it'd be nice to be able
>to specify the variables that are valid.

+1 for vary_on_get, that fits nicely into the current scheme and just
sounds right to me.

bye, Georg



Re: Cache and GET parameters

2005-12-06 Thread James Bennett

On 12/6/05, Maniac <[EMAIL PROTECTED]> wrote:
> But still you have to blindly copy strings from code to decorator
> parameters.

Any way of implementing this is going to require you to specify
*somewhere* which GET parameters are relevant to caching a particular
view, and it'd be hard to implement that directly in the view syntax
(since not everyone will be using caching). The proposed decorator
does the next best thing to having it directly "in" the view, and
keeps that information bundled with your view code instead of storing
it somewhere else. So it gets a +1 from me.

And while DRY is great, I'm still not convinced that this is a
violation of it, or at least one we need to worry too much about -- if
strictly following DRY means needlessly complicating things, then I
don't think it should be strictly followed.


--
"May the forces of evil become confused on the way to your house."
  -- George Carlin


Re: Cache and GET parameters

2005-12-06 Thread Maniac


James Bennett wrote:


> Any way of implementing this is going to require you to specify
> *somewhere* which GET parameters are relevant to caching a particular
> view, and it'd be hard to implement that directly in the view syntax
> (since not everyone will be using caching). The proposed decorator
> does the next best thing to having it directly "in" the view, and
> keeps that information bundled with your view code instead of storing
> it somewhere else. So it gets a +1 from me.
>
> And while DRY is great, I'm still not convinced that this is a
> violation of it, or at least one we need to worry too much about -- if
> strictly following DRY means needlessly complicating things, then I
> don't think it should be strictly followed.

I completely agree. I was just expressing a concern about it; maybe
someone will come up with a better solution. Thinking about the issue, I
too fail to invent something absolutely automatic...


Re: Cache and GET parameters

2005-12-06 Thread [EMAIL PROTECTED]

Would you consider using my cache algorithm, proposed here?
http://groups.google.fi/group/django-developers/browse_thread/thread/fdc59b0b46502ede

It is able to handle GET/POST parameters (by converting these
parameters to an array and then hashing the array to a string, which
is used as a unique identifier).
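
The general shape of "parameters to array, array to hash" can be sketched as below. This is not the algorithm from the linked thread, whose details are not given here; `params_identifier`, the sorted triples, and the SHA-1 digest are assumptions made for the example.

```python
import hashlib


def params_identifier(get_params, post_params):
    """Flatten GET and POST dicts into a sorted list and hash it to a string."""
    items = sorted(
        [("GET", name, value) for name, value in get_params.items()]
        + [("POST", name, value) for name, value in post_params.items()]
    )
    # repr() of the sorted list is a deterministic text form to hash.
    return hashlib.sha1(repr(items).encode("utf-8")).hexdigest()
```

Sorting makes the identifier independent of parameter order, and tagging each entry with its source keeps a GET value from colliding with the same name sent via POST.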



Re: Cache and GET parameters

2005-12-06 Thread Adrian Holovaty

On 12/6/05, hugo <[EMAIL PROTECTED]> wrote:
> +1 for vary_on_get, that fits nicely into the current scheme and just
> sounds right to me.

Looks like vary_on_get is the most popular choice. So here's how that
might work:

@vary_on_get('id')
def my_view(request):
    id = request.GET.get('id', None)

@vary_on_get('q', 'page')
def search(request):
    q = request.GET.get('q', None)
    page = request.GET.get('page', 1)

In the second example, a request to /search/?foo=bar would use the
cached version of /search/, because "foo" isn't in vary_on_get.

The remaining question is: What's the behavior if vary_on_get() isn't
specified for a particular view? Do we cache everything (including
separate cache entries for any combination of different GET
parameters) or cache nothing (current behavior)?

Adrian

--
Adrian Holovaty
holovaty.com | djangoproject.com | chicagocrime.org


Re: Cache and GET parameters

2005-12-06 Thread Amit Upadhyay

On 12/6/05, Adrian Holovaty <[EMAIL PROTECTED]> wrote:
> The remaining question is: What's the behavior if vary_on_get() isn't
> specified for a particular view? Do we cache everything (including
> separate cache entries for any combination of different GET
> parameters) or cache nothing (current behavior)?

Quoting your original post:

> I have a couple of ideas for solutions. The first is to introduce a
> NO_GET_PARAMS setting, which would default to False. If it's set to
> True, Django would assume that *all* GET parameters (query strings),
> sitewide, contain meaningless information, and therefore would not
> account for them in creating cache. For example, a request to
> example.com/foo/?bar=baz would use the same cache as example.com/foo/.
> We might even be able to reuse this setting for other things; I'm not
> sure what, yet.

Sounds fine to me.

--
Amit Upadhyay
Blog: http://www.rootshell.be/~upadhyay
+91-9867-359-701