On Tue, Sep 15, 2009 at 10:55 PM, Curt Arnold <carn...@apache.org> wrote:
>
> On Sep 15, 2009, at 5:30 PM, Christopher Lenz wrote:
>>
>> This is a somewhat misleading description; it's not the lack of an Expires
>> header on CouchDB responses that results in incorrect caching, it's a
>> (really ugly) bug in the XMLHttpRequest implementation of IE6 that does
>> this. As far as I know, the cache control headers sent by CouchDB are
>> absolutely correct according to the HTTP specification. Unconditionally
>> adding an Expires header (with a date in the past) just to work around the
>> XHR bug in IE6 *completely disables* any caching by any user agent!
>
> One of the earlier patches did, but the current patch has no negative effect
> on other browsers (other than adding 20 or so bytes to the header) and
> brings XmlHttpRequest's behavior into line with the other browsers.  Adding
> Expires in conjunction with the must-revalidate simply explicitly declares
> that any reuse of previously cached documents must be revalidated with the
> server.  There is no time that the document is fresh and does not need
> revalidation.  Without the Expires header, other browsers guess what we'd
> like them to (that the document isn't fresh) and IE 6's XmlHttpRequest uses
> heuristics and typically guesses that the document is fresh.  Adding an
> Expires header in the past (or a value of 0) is explicitly mentioned in the
> HTTP RFC as the mechanism to mark a response as immediately stale.  I kept
> the date in the past (though I would have preferred sending a 0) since one
> of the UUID tests was over-specified and would fail if it didn't have a
> valid date.

Regardless of browser support, the first question should always be
whether we can avoid hacks specific to a user agent. Unless you can
show that there's a case where it's absolutely impossible for a
significant user agent to configure itself to work properly, I would
be at least a -0 on this.

Also, the spec fairly explicitly states:

  > The format is an absolute date and time as defined by HTTP-date in
  > section 3.3.1; it MUST be in RFC 1123 date format

And

  > To mark a response as "already expired," an origin server sends an
  > Expires date that is equal to the Date header value.

That pretty much says that neither a random historical date nor a
value of 0 is ever good. The place where it does mention 0:

  > HTTP/1.1 clients and caches MUST treat other invalid date formats,
  > especially including the value "0", as in the past (i.e., "already
  > expired").

Here it seems that the spec is specifically saying that this is A Bad
Thing™, so much so that it went out of its way to specify the error
condition. That's not the same as saying "it's ok to send 0, any other
invalid date, or a date in the past".
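To make the distinction concrete, here is a rough Python sketch of the
two spec passages quoted above: a server marking a response as already
expired by sending an Expires equal to Date, and a client treating an
unparseable Expires such as "0" as in the past. The function names are
made up for illustration and this is not CouchDB code, just a sketch
using the standard library's RFC 1123 date helpers.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def already_expired_headers(now=None):
    """Hypothetical helper: Date and Expires carry the same RFC 1123 value."""
    now = now or datetime.now(timezone.utc)
    stamp = format_datetime(now, usegmt=True)
    return {"Date": stamp, "Expires": stamp}


def is_expired(expires_value, date_value):
    """Hypothetical client-side check.

    An Expires value that cannot be parsed or compared (for example "0")
    is treated as already expired, which is the error handling the spec
    asks of clients.
    """
    date = parsedate_to_datetime(date_value)
    try:
        return parsedate_to_datetime(expires_value) <= date
    except (TypeError, ValueError):
        return True
```

Both is_expired(headers["Expires"], headers["Date"]) and
is_expired("0", headers["Date"]) come out true, but only the first is
something a server is meant to send; the second is the client cleaning
up after an invalid header.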

Also I looked around the spec for a bit trying to find a logical
progression for when Expires applies vs ETag but couldn't find
anything. Though also importantly I didn't see a "Clients are free to
use a heuristic in the absence of this header" clause. I'm well aware
that RFCs can be difficult to respect given their ambiguity in
places, but this appears to be another example of just making stuff
up, though I could be convinced otherwise if there's a thread on a W3C
ML or something about why this heuristic exists.

> I've monitored the traffic and the logs before and after the patch and see
> exactly what I would expect.  Second requests to unchanged documents get a
> 304 returned from the CouchDB and use the previously retrieved value on
> every browser I've tried.  Fire up Fiddler or your favorite network
> monitoring tool and see for yourself.

Second requests from Safari get a 304 returned without the patch. Feel
free to fire up Wireshark. :)

But in all seriousness, the real question is whether we're improving
the situation by fudging this aspect of the HTTP spec or not. The fact
is IE6 (as much as we all hate it) still has a noticeable market
share. Just kicking it to the curb would be expedient but isn't the
right answer either. The answer is that we need to make sure that it
can be made compatible, and if not then and only then should we
consider breaking HTTP as a special case.

As Christopher Lenz mentions, if the concern is a working Futon on IE6
then adding smarts for detecting the browser environment and
configuring itself is a patch away. If it's trying to force CouchDB to
make amends for a specific broken HTTP stack, that's another. Unless
it can be shown that it's impossible for IE6 to fix itself, there's no
reason to complicate every other client.
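For reference, the client-side workaround Christopher describes (an
extra timestamp-valued query string parameter on each XHR request)
amounts to something like the sketch below. It is written in Python
only to show the URL manipulation; in Futon or a CouchApp this would
live in the JavaScript. The function name, the "_" parameter name and
the example URL are assumptions for illustration.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_cache_buster(url, param="_", now_ms=None):
    """Hypothetical helper: append a timestamp parameter so each URL is unique."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    parts = urlsplit(url)
    # Keep the existing query parameters and add the timestamp at the end.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, str(now_ms)))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(add_cache_buster("http://localhost:5984/db/_all_docs?limit=10"))
```

Because every request URL differs, a cache that wrongly assumes
freshness never finds a matching entry. The cost is that it defeats
caching entirely for that client, which is why it should only be
switched on when the client detects it is running on IE6.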

>>
>> Note also that IE6 gets cache invalidation right when you don't go through
>> XMLHttpRequest (that is, request CouchDB documents/views directly).
>>
>> A CouchDB-based application that runs on the client (CouchApp-style), and
>> that needs to work on IE6, has the option to force all XHR requests to
>> invalidate the cache (for example by using jQuery's "cache=false" option to
>> AJAX requests, which simply adds an extra timestamp-valued query string
>> parameter). In addition, it can choose to do so if, and only if, it detects
>> that it is running on IE6. And of course, applications accessing CouchDB only
>> through server-side code do not need to care about this issue at all.
>>
>> Presumably we *could* add the browser detection and the conditional
>> addition of an Expires header to the CouchDB HTTP server, but browser
>> detection is a really slippery slope that I think we should avoid, and IE6
>> is likely (or so I hope) to be pretty much extinct by the time CouchDB
>> reaches the 1.0 mark. In any case, I'm -1 on any patch that does this
>> unconditionally, and -0.5 on any patch that does the Expires header dance
>> conditionally. I'd be okay with adding the cache busting trick to Futon (but
>> again, only conditionally), and documenting the issue and workaround for
>> CouchApps.
>>
>> Cheers,
>
> The unconditional Expires header is the simplest fix.  As far as I can tell,
> it has no undesirable effects and accomplishes the goal.  If that doesn't go
> in, then I'd prefer to see the header values being configurable instead of
> baking in other logic.

Configurable headers are a good idea. "X-Noah: Awesome" is something
that CouchDB should be able to do. Though I don't know what other
logic we'd be baking in unless you mean browser sniffing. Christopher
Lenz might've only been -0 on that, but I'd be -shitton.

> I don't want to get in a reopen war, but please reopen the bug.

As far as I'm concerned you haven't done anything to refute
Christopher's logic on why this isn't a bug. Feel free to open a
configurable headers ticket though because I think that'd be generally
useful.

Paul Davis
