I am reviewing this patch and have a few questions for Thomas or someone
in the know.

The first has to do with Thomas' observation that Cache-Control is to be
found in r->err_headers_out rather than in r->headers_out... I looked into
this and ran into the following piece of code in mod_expires.c (expires_filter):

    /*
     * Check to see which output header table we should use;
     * mod_cgi loads script fields into r->err_headers_out,
     * for instance.
     */
    expiry = apr_table_get(r->err_headers_out, "Expires");
    if (expiry != NULL) {
        t = r->err_headers_out;
    }
    else {
        expiry = apr_table_get(r->headers_out, "Expires");
        t = r->headers_out;
    }

This code later calls set_expiration_fields which adds Cache-Control
to whichever headers t points to.

Question 1:
Does this imply that the cache code needs to look for ETags, Cache-Control,
etc in both places too? Right now the code all looks in r->headers_out.
Thomas is changing some of them to look in r->err_headers_out instead.
Is there a rule of thumb for determining when to check headers_out vs.
err_headers_out?



The second observation is related to the freshness computation referenced
below. I agree with Thomas that the code as it stands isn't quite right,
but I disagree that negating the logic fixes it. If I understand the
RFC correctly (section 13.2.4), max_age and s-maxage take precedence over
an expires header. This would mean that, as Thomas points out, if an
expires header is more liberal than the max_age value then it currently
passes the freshness test even though it should fail.

Question 2:
Am I correct in my belief that the fix requires seperation of the checks
so that the maxage and s-maxage checks/computations happen first and you
only evaluate freshness using the expires value if there are no max-age or
s-maxage values present? (as in the following code)

    if (((smaxage != -1) && (age < (smaxage - minfresh))) ||
        ((maxage  != -1) && (age < (maxage + maxstale - minfresh))) ||
        ((smaxage == -1) && (maxage == -1) &&
         (info->expire != APR_DATE_BAD) &&
         (age < (apr_time_sec(info->expire - info->date) + maxstale - minfresh)))) {
        /* it's fresh darlings... */
        ...
    }

In other words, if a value is specified for smaxage or maxage, the freshness
is determined solely based on those values (without regard to any expires
header). If there are no values specified for either maxage or smaxage then
the freshness is determined based on the expires header (if it exists). Is
that correct?

By the way, mod_disk_cache has a lot of issues. If it isn't saving and
retrieving the headers correctly, its a bug in the disk_cache code that
needs to be fixed. I wouldn't look to mod_disk_cache as an example of
how the caching code should be working. :)


Thanks in advance for any insight that Thomas or anyone else has to offer on these questions. Given a couple of answers I can have something committed Tuesday. Thanks for the research and patch Thomas.


CASTELLE Thomas wrote:
Hello,

In order to accelerate the RFC compliance of mod_cache, I propose these two patches which fix two problems :
- It doesn't handle the Cache-Control directives (max-age, max-stale, min-fresh...) properly.
- It doesn't send a "If-Modified-Since" to avoid that the backend server sends every time a 200 response where a 304 would be enough.


Actually, we are waiting for these features to be implemented since http-2.0.43 so that we could put Apache in our production environment. I am not an Apache developper, so this is just a proposition, but I tested it and it seemed to work.


The cache_util.c patch deals with the first issue. First, the Cache-Control directive seems to be in the r->err_headers_out and not in the r->headers_out. Second, the following test seems useless :


if ((-1 < smaxage && age < (smaxage - minfresh)) ||
(-1 < maxage && age < (maxage + maxstale - minfresh)) ||
(info->expire != APR_DATE_BAD &&
age < (apr_time_sec(info->expire - info->date) + maxstale - minfresh))) {


because it is always true, no matter if max-age is set or not.
Let's take an example (I suppose here s-maxage, max-stale and min-fresh not set,
so smaxage = - 1, maxstale = 0 and minfresh = 0) :
- with age = 20, maxage = -1 (not set) and expire - date = 30, the second test
is FALSE. The third is TRUE. So the whole test is TRUE, the page is considered
to be fresh => no problem.
- with age = 20, maxage = 10 and expire - date = 30, the second test is FALSE,
but the third is still TRUE. So the whole test is TRUE, the page is considered
to be fresh => problem.



The mod_cache.c patch deals with the second issue. The info structure is never initialized, and even if it was, the info->lastmods and info->etag don't seem to be saved in the file when using mod_disk_cache. So I used the Etag and Last-Modified informations we can find in the r->headers_out and r->err_headers_out. I don't know if it's correct, but it seems to work now...
>
Thanks for looking to these patch and eventually integrate it in the next Apache release !

Thanks a lot for this really great product !


Thomas.

-- Paul J. Reder ----------------------------------------------------------- "The strength of the Constitution lies entirely in the determination of each citizen to defend it. Only if every single citizen feels duty bound to do his share in this defense are the constitutional rights secure." -- Albert Einstein




Reply via email to