Re: The state of per-site/per-view middleware caching in Django

2011-10-20 Thread Carl Meyer

Hi Jim,

This is a really useful summary of the current state of things, thanks
for putting it together.

Re the anonymous/authenticated issue, CSRF token, and Google Analytics
cookies, it all boils down to the same root issue. And Niran is right,
what we currently do re setting Vary: Cookie is what we have to do in
order to be correct with respect to HTTP and upstream caches. For
instance, we can't just remove Vary: Cookie from unauthenticated
responses, because then upstream caches could serve that unauthenticated
response to anyone, even if they are actually authenticated.
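
To make the upstream-cache argument concrete, here is a minimal sketch in
plain Python (no Django involved). It assumes a toy shared cache whose key is
the URL plus the values of whichever request headers the response named in
Vary. ToyCache, render and the session cookie value are hypothetical names
used only for illustration; this is not how Django or any real proxy is
implemented.

```python
class ToyCache:
    """Toy model of a shared HTTP cache keyed on URL + varied request headers."""

    def __init__(self):
        self.vary_by_url = {}  # url -> tuple of header names listed in Vary
        self.store = {}        # cache key -> cached body

    def _key(self, url, headers, vary):
        return (url,) + tuple(headers.get(name, "") for name in vary)

    def fetch(self, url, headers, origin):
        vary = self.vary_by_url.get(url)
        if vary is not None:
            key = self._key(url, headers, vary)
            if key in self.store:
                return self.store[key]
        # Cache miss: ask the origin, remember which headers it varies on.
        body, vary = origin(headers)
        self.vary_by_url[url] = vary
        self.store[self._key(url, headers, vary)] = body
        return body


def render(headers, vary):
    # Hypothetical view: greets the user identified by a session cookie.
    logged_in = "sessionid=abc" in headers.get("Cookie", "")
    body = "Welcome, alice" if logged_in else "Log in / Register"
    return body, vary


anon = {}
alice = {"Cookie": "sessionid=abc"}

# Without Vary: Cookie, the anonymous page is replayed to a logged-in user.
cache = ToyCache()
cache.fetch("/post/1", anon, lambda h: render(h, ()))
leaked = cache.fetch("/post/1", alice, lambda h: render(h, ()))

# With Vary: Cookie, each distinct Cookie header gets its own cache entry.
cache = ToyCache()
cache.fetch("/post/1", anon, lambda h: render(h, ("Cookie",)))
correct = cache.fetch("/post/1", alice, lambda h: render(h, ("Cookie",)))

print(leaked)   # Log in / Register
print(correct)  # Welcome, alice
```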

Currently the Django page caching middleware behaves pretty much just
like an upstream cache in terms of the Vary header. Apart from the
CACHE_MIDDLEWARE_ANONYMOUS_ONLY setting, it just looks at the response,
it doesn't make use of any additional "inside information" about what
your Django site did to generate that response in order to decide what
to cache and how to cache it.

This approach is pretty attractive, because it's conceptually simple,
consistent with upstream HTTP caching, and conservative (quite unlikely
to serve the wrong cached content).

It might be possible to make it "smarter" in certain cases, and allow it
to cache more aggressively than an upstream cache can. #9249 is one
proposal to do this for cookies that aren't used on the server, either
via explicit setting or (in a recently-added proposal) via tracking
which cookie values are accessed. If we did that, plus special-cased the
session cookie if the user is unauthenticated and the session isn't used
outside of contrib.auth, I think that could possibly solve the
unauthenticated-users and GA issues.
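
A rough sketch of the explicit-setting flavour of that idea, in plain Python.
It assumes a list of cookie-name prefixes that the server never reads; the
helper cache_relevant_cookies and the IGNORED_COOKIE_PREFIXES name are
hypothetical, not an existing Django API or the patch on the ticket. "__utm"
is the prefix used by the classic Google Analytics cookies.

```python
# Hypothetical setting: cookies with these prefixes are never read server-side.
IGNORED_COOKIE_PREFIXES = ("__utm",)


def cache_relevant_cookies(cookie_header):
    """Return a canonical string of only those cookies that should affect
    the cache key, dropping the ones the server is known to ignore."""
    pairs = []
    for part in cookie_header.split(";"):
        name, _, value = part.strip().partition("=")
        if name and not name.startswith(IGNORED_COOKIE_PREFIXES):
            pairs.append((name, value))
    return "; ".join("%s=%s" % pair for pair in sorted(pairs))


# Two anonymous visitors with different analytics cookies share one key...
first = cache_relevant_cookies("__utma=1.2.3; __utmb=4.5")
second = cache_relevant_cookies("__utma=9.8.7; __utmz=6.5")

# ...while a session cookie still makes the key visitor-specific.
logged_in = cache_relevant_cookies("__utma=1.2.3; sessionid=abc")
```

The catch is the one already named: the cache layer now needs to know which
cookies the rest of the site reads, which is exactly the extra coupling.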

However, this (especially the latter) would come with the cost of making
the cache middleware implementation more fragile and coupled to other
parts of the framework. And it still doesn't help with CSRF, which is a
much tougher nut to crack, because every response for pages using CSRF
comes with a Set-Cookie header and probably with a CSRF token embedded in
the response content; and those both mean that response really can't be
re-used for anyone else. (Getting rid of the token embedded in the HTML
means forms couldn't ever POST without JS help, which is not an option
as the documented default approach). You can mark some form-using views
that are available to anonymous users as csrf-exempt, which exposes you
potentially to CSRF-based spam, but isn't a security issue if you aren't
treating authenticated submissions any differently from
non-authenticated ones.

Generally, I come down on the side of skepticism that introducing these
special cases into the cache middleware really buys enough to be worth
the added complexity (though I could be convinced that #9249 is worth it).

I do think we should improve the cache middleware documentation so its
limitations are outlined more clearly upfront, and point people towards
existing solutions for caching mostly-but-not-entirely-anonymous pages:
edge-side-includes, two-phase-render, and JS/AJAX fetch.
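
Of the three, two-phase render is the easiest to sketch without extra
infrastructure. The following is a minimal illustration in plain Python,
assuming the expensive first phase leaves a placeholder that a cheap second
phase fills in on every request. The placeholder syntax and all function
names are hypothetical.

```python
from string import Template

page_cache = {}
render_count = {"phase1": 0}


def render_phase1(path):
    """Expensive, user-independent render; leaves $auth_box for phase two."""
    render_count["phase1"] += 1
    return "<h1>Post at %s</h1><div>$auth_box</div>" % path


def render_phase2(cached_page, username):
    """Cheap per-request render that fills in the user-specific fragment."""
    auth_box = "Welcome, %s" % username if username else "Log in / Register"
    return Template(cached_page).safe_substitute(auth_box=auth_box)


def serve(path, username=None):
    # Only the output of phase one is cached, so one entry serves everyone.
    if path not in page_cache:
        page_cache[path] = render_phase1(path)
    return render_phase2(page_cache[path], username)


anonymous_page = serve("/post/1")
alice_page = serve("/post/1", "alice")
```

Both requests reuse the single cached phase-one result. A real implementation
would also have to escape any literal "$" in the cached content.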

#15855, on the other hand, is a bug that really does need to be fixed. I
still don't see a better fix than the one I outlined in the ticket
description: requiring some middleware to be in MIDDLEWARE_CLASSES for
the cache_page decorator to work, and not doing the actual caching until
we hit that middleware. Or alternatively, adding an implicit "cache any
responses that had cache_page used on them" phase to response
processing, after all middleware. I think those are both ugly fixes,
though; maybe someone has a better idea. The last time I know of that
this was discussed in-depth was in
http://groups.google.com/group/django-developers/browse_frm/thread/f96e982254fbe5c3/2b02361fd6e706f4
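
A toy sketch of the deferred-caching shape described above, in plain Python
rather than real Django code. It assumes the decorator only marks the
response and a final phase, run after all other response middleware, does the
actual storing. Response, vary_middleware, final_cache_phase and handle are
hypothetical stand-ins, and the cache_page here is a simplified imitation,
not Django's decorator.

```python
cache = {}


class Response:
    def __init__(self, body):
        self.body = body
        self.headers = {}
        self.cache_timeout = None  # set by the decorator, read by the final phase


def cache_page(timeout):
    """Hypothetical variant: only marks the response instead of caching it."""
    def decorator(view):
        def wrapper(request):
            response = view(request)
            response.cache_timeout = timeout
            return response
        return wrapper
    return decorator


def vary_middleware(request, response):
    # Stand-in for any response middleware that adds a Vary header.
    response.headers["Vary"] = "Cookie"
    return response


def final_cache_phase(request, response):
    """Runs after all other response middleware, so it sees the final headers."""
    if response.cache_timeout is not None:
        vary = response.headers.get("Vary", "")
        varied = request.get(vary, "") if vary else ""
        cache[(request["path"], varied)] = response
    return response


@cache_page(60 * 15)
def my_view(request):
    return Response("hello")


def handle(request):
    response = my_view(request)
    for middleware in (vary_middleware, final_cache_phase):
        response = middleware(request, response)
    return response


handle({"path": "/", "Cookie": "sessionid=abc"})
```

Because storing happens last, the cache key reflects the Vary header that
middleware added after the view returned.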

Carl

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Re: Writing your first Django app, part 1 - link to deploying Django

2011-10-20 Thread Russell Keith-Magee
On Thu, Oct 20, 2011 at 3:05 PM, Mateusz Marzantowicz wrote:
> Although I'm not a Django developer I'd like to make a suggestion about
> the intro tutorial (part 1).
>
> Is it possible to place a reference to the "Deploying Django" page in or
> below the "development server" section of this intro tutorial?

You won't get any argument from me that there is room for a tutorial
about deployment. There's room for many more tutorials, on many
different topics.

You will, however, get a lot of argument that Tutorial *1* is the
right place for that tutorial. At the end of Tutorial 1, we've got the
demonstration web server running, and we've explored some models. We
don't even have a useful view to look at yet. That isn't the time to
start talking about the configuration of a deployment web server.

If there's going to be a tutorial on deployment, it's a separate
tutorial in itself, that is placed *after* you've developed the web
site - that is, *after* tutorial 4 (or potentially 5,6), when the site
is built, is working happily on the development server, and needs to
be shown to the outside world.

You also need to keep in mind that "deployment" isn't a simple topic.
What operating system are we deploying against? What web server? Do we
address basic sysadmin issues, or do we assume that you're a competent
sysadmin, and just point you at a mod_wsgi configuration and assume
you know how to install it?

I'm not saying that we shouldn't include tutorial materials on Django
deployment -- just that you need to be very careful how you pitch that
tutorial.

Most of all -- show us the draft. If you think there is something
needed in Django's documentation, write it! If you don't know, do some
research and document your learning process. We can always correct
mistakes and provide suggestions, but we need a first draft to comment
on.

Yours,
Russ Magee %-)




Request for eyes familiar with ORM internals and defer/only (attn: Malcolm)

2011-10-20 Thread Tai Lee
I've run into a bug that is exposed when using defer() or only() with
select_related(). A couple others have come across it, and there is an
existing ticket.

Exceptions appear to be silenced somewhere under normal circumstances
when evaluating `queryset` objects, which made it difficult to track
down, initially. I'm still not sure where the exceptions are silenced,
or if that should be changed. The exceptions appear to be raised when
running the test suite or when coercing a `query` object to `str`.

The symptom is an empty queryset being returned, when there are
actually results.

I've written a patch with tests that I think fixes the problem. I just
need somebody familiar with the ORM and defer/only to give feedback or
bump to RFC.

Malcolm, if you have a moment for a quick review that would be great,
since you implemented the defer/only functionality and no doubt know it
much better than I do. Otherwise, I'd really appreciate anyone familiar
with the ORM and defer/only taking a quick look :)

https://code.djangoproject.com/ticket/14694

Cheers.
Tai.




Re: The state of per-site/per-view middleware caching in Django

2011-10-20 Thread Jens Diemer

Hi...

For PyLucid I made a simple cache middleware [1] similar to Django's per-site 
cache middleware [2]. But it doesn't vary on cookies and doesn't cache cookies. I 
simply cache only the response content.


Of course, this doesn't solve the problem if "csrfmiddlewaretoken" is in the content.

Here some pseudo code from [1]:
-
def process_request(self, request):
    if not self.use_cache(request):
        return

    # Look up by the same key that process_response stores under.
    response = cache.get(request.path)
    if response is not None:
        return response

def process_response(self, request, response):
    if not self.use_cache(request):
        return response

    # Cache only the raw content, without cookies or other headers
    response2 = HttpResponse(
        content=response._container, status=200,
        content_type=response['Content-Type']
    )

    # timeout: cache lifetime in seconds (its definition is not shown here)
    patch_response_headers(response2, timeout)

    cache.set(request.path, response2, timeout)

    return response

-

[1] 
https://github.com/jedie/PyLucid/blob/master/pylucid_project/middlewares/cache.py

[2] https://docs.djangoproject.com/en/1.3/topics/cache/#the-per-site-cache


Kind regards,

Jens D.




Re: Sane defaults for Startapp and Startproject

2011-10-20 Thread Carl Meyer

On 10/19/2011 01:07 PM, Gabriel Hurley wrote:
> I think there is sufficient interest in the idea of a "how to organize
> your Django project" page in the documentation that it would be worth
> beginning work on a patch for one, much in the way that the "How to
> contribute to Django"/"Spirit of contributing" page got started.

Agreed.

> That means:
> 
>   1. Opening a ticket for it if one doesn't already exist.

The ticket already exists: https://code.djangoproject.com/ticket/17044

Carl



Re: The state of per-site/per-view middleware caching in Django

2011-10-20 Thread Jim Dalton
On Oct 20, 2011, at 10:26 AM, Niran Babalola wrote:

> This problem is inherent to page caching. Workarounds to avoid varying
> by cookie for anonymous users are conceptually incorrect. If a single
> URL can give different responses depending on who's viewing it, then
> it varies by cookie. Preventing CSRF is inherently session-variable as
> well. Loading the token via a separate AJAX call is possible, but
> there are simpler solutions.

You may in fact be correct, but I'm not convinced by what you're saying here 
(not that there is any onus on you to convince me of anything of course). 

I'm suggesting that all anonymous users *could* receive an identical page from 
the server, theoretically, since the same URL does *not* need to return a 
different response depending on which (anonymous) user is viewing it. CSRF is 
obviously a trickier problem, and it's not really worth solving the anonymous 
user problem if CSRF isn't solved as well. But if both problems were somehow 
solvable, then we're in a position where per-site cache would be viable for 
many common scenarios such as the one I described in my original post.

If these two problems are in fact unsolvable or not worth solving because 
simpler alternatives exist, that's fine and understandable. Perhaps 
per-site/per-view caching are indeed exceptionally limited tools that are 
beneficial in a very limited number of use cases, and perhaps the "solution" 
here is tidying up the outstanding bugs and perhaps clarifying the 
documentation as needed to make the limitations more explicit.


> 
> If you want to cache pages with small portions that vary by user, then
> you want edge site includes and something like Varnish to process
> them. If you want a much slower, pure-python solution that doesn't
> require a separate service running somewhere, then you want
> armstrong.esi[1].


Thanks. This post wasn't really about what *I* need btw; I can definitely sort 
out my caching strategies in other areas as I need to. The post only relates to 
"me" because I sat down yesterday and said, "Gee, I wonder if I could make use 
of Django's per-site caching feature for this project I'm working on." I turned 
it "on" to test it out and then spent the next 6 hours delving into the source 
code, IRC, ticket tracker, Google etc. to figure out why it wasn't working at 
all and why @cache_page was, and then after finally sorting it out and grokking 
all of the moving parts etc, realizing that there was extraordinarily limited 
value in a per-site/view caching strategy that caches per unique visitor, which 
is pretty much unavoidable for most common usage patterns.

So yeah, maybe it's me and I'm looking at things the wrong way, but needless to 
say it wasn't a particularly pleasant or worthwhile experience. Not looking for 
pity btw, but just wondering what I/we can or should do to make it better.

Jim




Re: The state of per-site/per-view middleware caching in Django

2011-10-20 Thread Niran Babalola
On Thu, Oct 20, 2011 at 7:45 AM, Jim Dalton wrote:
> There
> is still an exceptionally narrow set of circumstances that would allow me to
> serve a single cached page to all anonymous visitors to my site: namely, I
> can't touch request.user and I can't use CSRF.

This problem is inherent to page caching. Workarounds to avoid varying
by cookie for anonymous users are conceptually incorrect. If a single
URL can give different responses depending on who's viewing it, then
it varies by cookie. Preventing CSRF is inherently session-variable as
well. Loading the token via a separate AJAX call is possible, but
there are simpler solutions.

If you want to cache pages with small portions that vary by user, then
you want edge site includes and something like Varnish to process
them. If you want a much slower, pure-python solution that doesn't
require a separate service running somewhere, then you want
armstrong.esi[1].

- Niran

[1] . armstrong.esi
isn't part of Armstrong proper yet, but if you want to know more about
the project, head to  and
.




The state of per-site/per-view middleware caching in Django

2011-10-20 Thread Jim Dalton
I spent the better part of yesterday mucking around in the dregs of Django's 
cache middleware and related modules, and in doing so I've come to the 
conclusion that, due to an accumulation of hindrances and minor bugs, the 
per-site and per-view caching mechanisms are effectively broken for many 
fairly typical usage patterns.

Let me demonstrate by fictional example, with what I would consider to be a 
pretty typical configuration and use case for the per-site cache:

Let's pretend I'm developing a blog powered by Django. I'm using memcached, 
and I would like to cache pages on that blog for anonymous users, who are 
going to make up the vast majority of my site's visitors. Ideally, I will 
serve the exact same cached version of a blog post to every single anonymous 
visitor to my site, which will help keep server load under control, 
particularly when I get slashdotted/reddited/what-have-you.

Like any blog, a typical page view features the content primarily (e.g. a 
blog post). It also has some "auth" stuff at the top right, which will say 
"Log in / Register" for non logged in users but show a username and welcome 
message for logged in users. Each blog post also has an empty comment form 
at the bottom of it where users can leave comments on the post. Like 99% of 
the websites out there, I will be using Google Analytics to track my 
visitors etc.

Pretty straightforward, right?

Let me count the ways that Django's cache middleware will muck up my goals 
in the above scenario.

First, I'm going to try to use the per-site cache. Here's what's going to go 
wrong for me:

* It's going to be virtually impossible for me to avoid my cache varying by 
cookie and thus by visitor. Because in my templates I am checking to see if 
the current user is logged in, I'm touching the session, which is going to 
now set the Vary: Cookie header. That means if there is any difference in the 
cookies users are requesting my pages with, I'm going to be sending each 
user a separate cached page, keyed off of SESSION_COOKIE_NAME, which is 
unique for every visitor.

* Even if I avoid touching the request user somehow, the CSRF middleware 
presents the same issue. Because I have a comment form on every page, I have 
a unique CSRF token for each visitor. Thankfully Django doesn't let me 
completely shoot myself in the foot by caching the page with one user's 
token and serving it to everybody else. At least it helpfully sets a CSRF 
token cookie and varies on it to prevent this. However, that cookie is 
different for every unique user. That triggers the same problem as 
above. I again cannot avoid caching a unique page for each unique visitor.

* Unfortunately, my troubles are not over, even if I resign myself to having 
a cache that varies per visitor. You see, Google Analytics actually sets a 
handful of other cookies with each page request. And guess what? The values 
for those cookies are unique *for each request*. This means... I'm actually 
not caching at all. Cookies are unique for each and every page request 
thanks to Google Analytics. My per-site cache configuration is totally and 
completely inoperable, all because I'm using a tracking service that pretty 
much *everybody* uses.
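
The effect described in the list above can be illustrated with a small
simulation in plain Python. It assumes a page cache keyed on the path plus
the full Cookie header, which is roughly what varying on Cookie amounts to.
The simulate helper and the cookie names are hypothetical, and the
per-request "tracking" cookie merely stands in for the Google Analytics
behaviour described above.

```python
import itertools


def simulate(requests):
    """Count cache hits for a cache keyed on (path, Cookie header)."""
    cache, hits = set(), 0
    for path, cookie in requests:
        key = (path, cookie)
        if key in cache:
            hits += 1
        else:
            cache.add(key)
    return hits


counter = itertools.count()

# Ten anonymous requests with no cookies at all: one miss, then nine hits.
no_cookies = [("/post/1", "") for _ in range(10)]

# Two visitors with stable but distinct session cookies, five requests each.
per_visitor = [("/post/1", "sessionid=%s" % v) for v in "ab" for _ in range(5)]

# The same visitors, but one cookie value changes on every single request.
per_request = [
    ("/post/1", "sessionid=%s; tracking=%d" % (v, next(counter)))
    for v in "ab" for _ in range(5)
]

hits = [simulate(r) for r in (no_cookies, per_visitor, per_request)]
print(hits)  # [9, 8, 0]
```

With a cookie that differs on every request, no request ever finds an
existing entry, so the cache fills up without serving a single hit.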

Since that didn't work, I wonder if it'll work if I do per-view caching? It 
shouldn't work at all, should it, since it's not like any of the factors I 
outlined above are different if I'm using the @cache_page decorator to do my 
caching vs the per-site cache.

Well, the sad news is caching does "work" when I use cache_page, and that's 
not a good thing:

* @cache_page caches the direct output of the view/render function. It skips 
over the middleware that might have very good reason to introduce vary 
headers and doesn't introduce any vary headers of its own. So now, with 
this applied, I *am* serving a cached version of this page even though I 
absolutely should not be. Some poor user's token is now being sent to 
everybody. My only chance of redemption is if I happen to have read the docs 
and discovered that this incantation is required to prevent having 
cache_page improperly cache the page:

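   from django.views.decorators.cache import cache_page
   from django.views.decorators.csrf import csrf_protect

   @cache_page(60 * 15)
   @csrf_protect
   def my_view(request):
       # ...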
   
Of course, the above just puts me right back where I started at the per-site 
level. There was never any chance of making cache_page work any differently 
from the per-site cache, but it certainly proved to be a temptation if I'm a 
hurried developer, frustrated by why my per site cache wasn't working and 
"thankful" for the fact that I could get the cache to start "working" with 
the cache_page decorator.

Hopefully the above example really makes it clear to you guys how all of the 
seemingly minor bugs and imperfections really do add up to a broken 
situation for someone coming to this with a pretty standard set of 
expectations and requirements.

Anyhow, the good news is that a good portion of what I have written about 
already has open tickets which in some 

Re: Towards a more friendly NoReverseMatch

2011-10-20 Thread Tom Evans
On Wed, Oct 19, 2011 at 7:20 PM, Wilfred Hughes wrote:
> 1. Can we provide an example of a pattern containing "|" that doesn't
> work? I've successfully reversed the pattern
> r'^fruit/(bananas|apples)$' above.

Any regexp with alternation that is not part of a captured parameter:

url(r'^homepage/(?:apple|banana)$', 'homepage', name='bad_homepage')
url(r'^homepage/a|b$', 'homepage', name='bad_homepage2')
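
A small illustration, using only Python's re module, of why such patterns
leave a reverse lookup with no single URL to produce. This does not call
Django's reverse(); it only shows what the two regular expressions match.

```python
import re

# Alternation inside a non-capturing group: two fixed spellings, and no
# captured parameter that a caller could pass in to choose between them.
grouped = re.compile(r'^homepage/(?:apple|banana)$')

# Top-level alternation: parsed as (^homepage/a) OR (b$).
top_level = re.compile(r'^homepage/a|b$')

matches = {
    "grouped apple": bool(grouped.search("homepage/apple")),
    "grouped banana": bool(grouped.search("homepage/banana")),
    "top-level homepage/a": bool(top_level.search("homepage/a")),
    "top-level homepage/b": bool(top_level.search("homepage/b")),
    "top-level blog/b": bool(top_level.search("blog/b")),
    "top-level homepage/apple": bool(top_level.search("homepage/apple")),
}

for label, matched in sorted(matches.items()):
    print(label, matched)
```

Every entry is True. The second pattern accepts any path ending in "b" as
well as any path starting with "homepage/a", which is probably not what its
author meant.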


Cheers

Tom




Re: [NoSQL] Sub-object queries / refactoring JOIN syntax into model fields

2011-10-20 Thread Jonas H.

bump

On 09/28/2011 12:52 AM, Jonas H. wrote:

Hi there,

some non-relational databases (e.g. MongoDB) have support for
arbitrarily nested objects. To make queries that "reach" into these
sub-objects, the Django-nonrel developers find it appealing to use JOIN
syntax. For instance, if you had this person in your database

{'name': 'Bob', 'address': {'city': 'NY', 'street': 'Wall Street 42'}}

you could find Bob using these queries:

Person.objects.filter(name='Bob')
Person.objects.filter(address__city='NY')
Person.objects.filter(address__street__startswith='Wall')
...

Similarly, sub-objects may be stored in a list, like so:

{
    'votes': [
        {'voter': 'Bob', 'vote': 42},
        {'voter': 'Ann', 'vote': 3.14}
    ]
}

Vote.objects.filter(votes__vote__gt=2)
...
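
To show what resolving such lookups against nested data involves, here is a
minimal sketch in plain Python that evaluates "__"-separated lookups against
dicts and lists. It is not Django-nonrel's implementation: resolve, matches
and the OPERATORS table are hypothetical names, and only three lookup types
are covered.

```python
def resolve(document, path):
    """Yield every value reachable by following path, fanning out over lists."""
    if not path:
        yield document
    elif isinstance(document, list):
        for item in document:
            for value in resolve(item, path):
                yield value
    elif isinstance(document, dict) and path[0] in document:
        for value in resolve(document[path[0]], path[1:]):
            yield value


OPERATORS = {
    "exact": lambda value, arg: value == arg,
    "gt": lambda value, arg: value > arg,
    "startswith": lambda value, arg: value.startswith(arg),
}


def matches(document, lookup, arg):
    """Evaluate one lookup such as 'address__street__startswith'."""
    parts = lookup.split("__")
    # A trailing known operator is treated as the lookup type, else "exact".
    op = parts.pop() if parts[-1] in OPERATORS else "exact"
    return any(OPERATORS[op](value, arg) for value in resolve(document, parts))


bob = {'name': 'Bob', 'address': {'city': 'NY', 'street': 'Wall Street 42'}}
poll = {'votes': [{'voter': 'Bob', 'vote': 42}, {'voter': 'Ann', 'vote': 3.14}]}

print(matches(bob, 'address__city', 'NY'))                # True
print(matches(bob, 'address__street__startswith', 'Wall'))  # True
print(matches(poll, 'votes__vote__gt', 2))                # True
```

Note the ambiguity this sketch glosses over: a sub-object key that happens to
be named like an operator (for example 'gt') cannot be told apart from the
lookup type.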


These sub-object queries are essential for non-relational databases to
be really useful so this is an important feature.

What's the core team's opinion on this topic -- is there any chance to
get something like that into Django at all? (Maybe you think two
meanings for one syntax cause too much confusion)

Secondly, how could this be implemented? I thought about refactoring
JOIN syntax handling into the model fields (as little logic as required;
refactoring the actual hardcore JOIN generation code seems like an
impossible task for anyone but the original author)... any other ideas?

So far,
Jonas





Writing your first Django app, part 1 - link to deploying Django

2011-10-20 Thread Mateusz Marzantowicz
Although I'm not a Django developer I'd like to make a suggestion about
the intro tutorial (part 1).

Is it possible to place a reference to the "Deploying Django" page in or
below the "development server" section of this intro tutorial?
As a regular Django (and other web frameworks) user I'm really interested in
finding out there how to quickly deploy my project on a production server.
I know it's written somewhere in the docs, but that is a bit inconvenient
at first contact. I know there are already many boxes on this page,
but please think like a Django newcomer and try to answer the question: how
do I deploy my project?


Mateusz Marzantowicz
