Re: Syndication Contrib and HTTP Not Modified

Malcolm Tredinnick Thu, 08 Mar 2007 18:43:41 -0800

Hey Jeremy,

On Thu, 2007-03-08 at 12:37 -0500, Jeremy Bowers wrote:
> Malcolm Tredinnick wrote:
> 
> For concreteness, I've added a ticket and there is a code-only patch at 
> http://code.djangoproject.com/attachment/ticket/3680/cache.control.on.syndication.code.only.diff
>  
> . The patch is exceedingly lightly tested (I've only smoke tested it on 
> a local dev box, and I haven't tested cache_control_from_bits at all), 
> it's just for discussion.


Cool. I'll have a look at it in detail. From a first read, though, it
looks like a reasonable approach: building in cache management to the
view. Why can't this be done using the normal CacheMiddleware, though?
Requests for the same URL will check the cache first using that. You can
also invalidate cached entries manually, if you like, as part of adding
new objects or updating an object that might be in a feed.

I'm not completely in love with all portions of the patch, but I'll have
a read through it more carefully, test some things and get back to you.

The rest of my response is on a more theoretical level...

> > Could you make some timings to show the difference between retrieving
> > all the objects for a feed (which you need to do to establish whether
> > it's changed) and retrieving all the objects and constructing the feed?
> > The extra overhead of turning objects into a sequence of bytes isn't
> > that huge as a proportion of the overall time -- I measured it for my
> > own feed examples when working on some syndication problems last year
> > and the generation was very short when compared to the database
> > interaction and pushing out the data via the network, except when I had
> > quite tiny datasets or very complex transformations in templates. The
> > former is ignorable, since the total time is so small, whilst the latter
> > is atypical for feeds. It would be interesting to see if that (the
> > timing ratios) has changed much.
> >   
> I understand what you're getting at, but the value of benchmarking 
> something where there can be arbitrary client code is dubious.

Disagree. Benchmarking is required as part of examining performance
problems. Otherwise you cannot tell what has changed and whether it's an
improvement. It obviously cannot cover every case, but we are trying to
strike a balance between common usage, maintainable and correct code,
and edge cases. 

[...]
> However, you've raised a good point w.r.t. database access, so I did go 
> ahead and add the hook to determine ETag and Last-Modified before even 
> hitting the DB. Perhaps you pull this from a cache. In that case, a 
> no-change RSS hit is no database hits at all.

So you've worked out a way to test for "no changes" without retrieving
all the objects that make up the feed? OK, I see you're just using the
cache stuff. Makes sense.

> > There is some consistent logic supporting things like the
> > ConditionalGetMiddleware: content generation is typically small
> > (timewise) compared to content shipping and database retrieval.
> I find this logic questionable. It's the non-typical cases where it is 
> expensive where it is disproportionately important.

[...]
> > I think a bunch of the things you're talking about can be done by
> > modifying contrib.syndication.views.feed() so that more return options
> > are handled. Views can already jump out early with their own HTTP
> > response codes and/or tweak their HTTP headers.
> I don't know what you mean by this; return values from what? 
> contrib.syndication.views.feed (at least prior to my patch) never calls 
> any user code.

I used lazy language here. Apologies. Your Feed-derived class's methods
are called as part of a view function. That view function is, in the
standard setup, contrib.syndication.views.feed(). Don't look at the Feed
class and sub-classes as special, in this respect: they are just one
part of a view. So you can bail out of feed processing by raising
exceptions, for example, and catching them in the view to return the
right HTTP code. At the moment, the standard view doesn't catch any
special exceptions (except for Http404), but writing a slightly
different view that did would be easy.

Your code seems to be taking a similar, but slightly different approach,
by putting cache support into the view. That may work, too.

Finally, let me just make something clear...

The work you seem to be doing may well be of real benefit, so don't
think I'm dismissing it out of hand. Even just talking through the
design is going to help somebody someday.

It's impossible to really judge the impact on the usefulness vs.
maintainable/good design scale without seeing the code and you're
clearly writing something that scratches your own itch. I'm genuinely
not intending to pre-judge your code at all here, since I haven't looked
at it until just now. I am, however, quite willing to keep questioning
your assumptions and start from my default position of "prove it" when
it comes to adding extra code to support fringe cases.

You may be unlucky here in who chose to respond. There's evidence to
suggest that I'm more of a hard-ass on this front than the other core
developers, but only by a little bit.

Regards,
Malcolm


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~----------~----~----~----~------~----~------~--~---

Re: Syndication Contrib and HTTP Not Modified

Reply via email to