Hello! On Thu, Jun 20, 2024 at 08:39:43PM +0900, Hiroaki Nakamura wrote:
> # HG changeset patch > # User Hiroaki Nakamura <[email protected]> > # Date 1718882801 -32400 > # Thu Jun 20 20:26:41 2024 +0900 > # Branch correct_age > # Node ID c81df54e3d0333c26d4296792dc0df767b386f91 > # Parent 73929a4f3447d558747623884b5ba281c13332d8 > Correctly calculate and set Age header. > Implement the calculation of the Age header as specified in > "RFC 9111: HTTP Caching" > https://www.rfc-editor.org/rfc/rfc9111.html Thanks for the patches. Note that it might be a good idea to update the commit log to follow style rules, such as: : Cache: added calculation of the Age header. : : Implement the calculation of the Age header as specified in : RFC 9111 "HTTP Caching", https://www.rfc-editor.org/rfc/rfc9111.html It also might be also a good idea to add additional implementation details which might worth explaining, especially given that RFC 911 does not really specify how the Age header is to be calculated, but rather gives some guidelines on possible approaches. See below for some more comments. > > diff -r 73929a4f3447 -r c81df54e3d03 src/http/ngx_http_cache.h > --- a/src/http/ngx_http_cache.h Thu Jun 20 20:26:24 2024 +0900 > +++ b/src/http/ngx_http_cache.h Thu Jun 20 20:26:41 2024 +0900 > @@ -59,6 +59,8 @@ > size_t body_start; > off_t fs_size; > ngx_msec_t lock_time; > + time_t response_time; > + time_t corrected_initial_age; > } ngx_http_file_cache_node_t; > > > @@ -75,6 +77,8 @@ > time_t error_sec; > time_t last_modified; > time_t date; > + time_t response_time; Note that the "response_time" field being added seems to be exactly equivalent to the exiting "date" field. > + time_t corrected_initial_age; > > ngx_str_t etag; > ngx_str_t vary; > diff -r 73929a4f3447 -r c81df54e3d03 src/http/ngx_http_file_cache.c > --- a/src/http/ngx_http_file_cache.c Thu Jun 20 20:26:24 2024 +0900 > +++ b/src/http/ngx_http_file_cache.c Thu Jun 20 20:26:41 2024 +0900 > @@ -971,6 +971,8 @@ > fcn->uniq = 0; > fcn->body_start = 0; > fcn->fs_size = 0; > + fcn->response_time = 0; > + fcn->corrected_initial_age = 0; > > done: > > @@ -980,6 +982,8 @@ Nitpicking: please consider adding [diff] showfunc=1 to your Mercurial configuration to ensure that function names are shown in diffs, this makes reviews easier. > > c->uniq = fcn->uniq; > c->error = fcn->error; > + c->response_time = fcn->response_time; > + c->corrected_initial_age = fcn->corrected_initial_age; > c->node = fcn; > > failed: > @@ -1624,6 +1628,7 @@ > ngx_int_t > ngx_http_cache_send(ngx_http_request_t *r) > { > + time_t resident_time, current_age; > ngx_int_t rc; > ngx_buf_t *b; > ngx_chain_t out; > @@ -1646,6 +1651,17 @@ > return NGX_HTTP_INTERNAL_SERVER_ERROR; > } > > + /* > + * Update age response header. > + * https://www.rfc-editor.org/rfc/rfc9111.html#name-calculating-age > + */ > + resident_time = ngx_time() - c->response_time; > + current_age = c->corrected_initial_age + resident_time; > + r->headers_out.age_n = current_age; Note that this uses data from the cache node, which is not available if the cache node was loaded from disk, for example, on server restart, and this will certainly lead to incorrect results. If I'm reading the code correctly, it will result in seconds since the Epoch in the Age header on all responses after a restart. This suggests that either things should be implemented quite differently, or this patch needs to be merged with the second one, which implements loading relevant information from the cache header. > + ngx_log_debug3(NGX_LOG_DEBUG_HTTP, r->connection->log, 0, > + "http file cache send, resp:%O, resident:%d, age:%d", > + c->response_time, resident_time, current_age); Note that there seems to be somewhat too many debug logging in the patch. Please also note that using "%d" for time_t is wrong, as time_t size might be different from the size of int. For time_t, "%T" should be used instead. See ngx_sprintf() comment in src/core/ngx_string.c for the full list of supported formats and corresponding types. > + > rc = ngx_http_send_header(r); > > if (rc == NGX_ERROR || rc > NGX_OK || r->header_only) { > diff -r 73929a4f3447 -r c81df54e3d03 src/http/ngx_http_header_filter_module.c > --- a/src/http/ngx_http_header_filter_module.c Thu Jun 20 20:26:24 2024 +0900 > +++ b/src/http/ngx_http_header_filter_module.c Thu Jun 20 20:26:41 2024 +0900 > @@ -322,6 +322,10 @@ > len += sizeof("Last-Modified: Mon, 28 Sep 1970 06:00:00 GMT" CRLF) - > 1; > } > > + if (r->headers_out.age_n != -1) { > + len += sizeof("Age: ") - 1 + NGX_OFF_T_LEN + 2; > + } > + > c = r->connection; > > if (r->headers_out.location > @@ -518,6 +522,10 @@ > *b->last++ = CR; *b->last++ = LF; > } > > + if (r->headers_out.age_n != -1) { > + b->last = ngx_sprintf(b->last, "Age: %O" CRLF, r->headers_out.age_n); > + } > + > if (host.data) { > > p = b->last + sizeof("Location: ") - 1; > diff -r 73929a4f3447 -r c81df54e3d03 src/http/ngx_http_request.c > --- a/src/http/ngx_http_request.c Thu Jun 20 20:26:24 2024 +0900 > +++ b/src/http/ngx_http_request.c Thu Jun 20 20:26:41 2024 +0900 > @@ -646,6 +646,7 @@ > r->headers_in.keep_alive_n = -1; > r->headers_out.content_length_n = -1; > r->headers_out.last_modified_time = -1; > + r->headers_out.age_n = -1; > > r->uri_changes = NGX_HTTP_MAX_URI_CHANGES + 1; > r->subrequests = NGX_HTTP_MAX_SUBREQUESTS + 1; > diff -r 73929a4f3447 -r c81df54e3d03 src/http/ngx_http_request.h > --- a/src/http/ngx_http_request.h Thu Jun 20 20:26:24 2024 +0900 > +++ b/src/http/ngx_http_request.h Thu Jun 20 20:26:41 2024 +0900 > @@ -290,6 +290,7 @@ > off_t content_offset; > time_t date_time; > time_t last_modified_time; > + off_t age_n; Using off_t here looks wrong, as off_t is a type for file offsets, and not for dates. (Not sure it needs to be here at all though, a better solution might be to rewrite/add the header within the upstream module.) > } ngx_http_headers_out_t; > > > diff -r 73929a4f3447 -r c81df54e3d03 src/http/ngx_http_special_response.c > --- a/src/http/ngx_http_special_response.c Thu Jun 20 20:26:24 2024 +0900 > +++ b/src/http/ngx_http_special_response.c Thu Jun 20 20:26:41 2024 +0900 > @@ -581,6 +581,7 @@ > > r->headers_out.content_length_n = -1; > r->headers_out.last_modified_time = -1; > + r->headers_out.age_n = -1; > } > > > diff -r 73929a4f3447 -r c81df54e3d03 src/http/ngx_http_upstream.c > --- a/src/http/ngx_http_upstream.c Thu Jun 20 20:26:24 2024 +0900 > +++ b/src/http/ngx_http_upstream.c Thu Jun 20 20:26:41 2024 +0900 > @@ -50,6 +50,8 @@ > ngx_http_upstream_t *u); > static ngx_int_t ngx_http_upstream_test_next(ngx_http_request_t *r, > ngx_http_upstream_t *u); > +static void ngx_http_upstream_update_age(ngx_http_request_t *r, > + ngx_http_upstream_t *u, time_t now); > static ngx_int_t ngx_http_upstream_intercept_errors(ngx_http_request_t *r, > ngx_http_upstream_t *u); > static ngx_int_t ngx_http_upstream_test_connect(ngx_connection_t *c); > @@ -132,6 +134,8 @@ > ngx_table_elt_t *h, ngx_uint_t offset); > static ngx_int_t ngx_http_upstream_process_vary(ngx_http_request_t *r, > ngx_table_elt_t *h, ngx_uint_t offset); > +static ngx_int_t ngx_http_upstream_process_age(ngx_http_request_t *r, > + ngx_table_elt_t *h, ngx_uint_t offset); > static ngx_int_t ngx_http_upstream_copy_header_line(ngx_http_request_t *r, > ngx_table_elt_t *h, ngx_uint_t offset); > static ngx_int_t > @@ -319,6 +323,10 @@ > ngx_http_upstream_copy_header_line, > offsetof(ngx_http_headers_out_t, content_encoding), 0 }, > > + { ngx_string("Age"), > + ngx_http_upstream_process_age, 0, > + ngx_http_upstream_ignore_header_line, 0, 0 }, > + > { ngx_null_string, NULL, 0, NULL, 0, 0 } > }; > > @@ -499,6 +507,7 @@ > > u->headers_in.content_length_n = -1; > u->headers_in.last_modified_time = -1; > + u->headers_in.age_n = -1; > > return NGX_OK; > } > @@ -1068,6 +1077,7 @@ > ngx_memzero(&u->headers_in, sizeof(ngx_http_upstream_headers_in_t)); > u->headers_in.content_length_n = -1; > u->headers_in.last_modified_time = -1; > + u->headers_in.age_n = -1; > > if (ngx_list_init(&u->headers_in.headers, r->pool, 8, > sizeof(ngx_table_elt_t)) > @@ -1549,6 +1559,7 @@ > ngx_memzero(u->state, sizeof(ngx_http_upstream_state_t)); > > u->start_time = ngx_current_msec; > + u->request_time = ngx_time(); Adding a duplicate start time shouldn't be needed. > > u->state->response_time = (ngx_msec_t) -1; > u->state->connect_time = (ngx_msec_t) -1; > @@ -2008,6 +2019,7 @@ > ngx_memzero(&u->headers_in, sizeof(ngx_http_upstream_headers_in_t)); > u->headers_in.content_length_n = -1; > u->headers_in.last_modified_time = -1; > + u->headers_in.age_n = -1; > > if (ngx_list_init(&u->headers_in.headers, r->pool, 8, > sizeof(ngx_table_elt_t)) > @@ -2529,6 +2541,8 @@ > return; > } > > + ngx_http_upstream_update_age(r, u, ngx_time()); Note that the "ngx_time()" as an argument looks unneeded, it should be better to obtain the time in the function itself. Also, from semantic point of view this probably should be in ngx_http_upstream_process_headers(), and not a dedicated function call. > + > ngx_http_upstream_send_response(r, u); > } > > @@ -2615,6 +2629,7 @@ > "http upstream not modified"); > > now = ngx_time(); > + ngx_http_upstream_update_age(r, u, now); > > valid = r->cache->valid_sec; > updating = r->cache->updating_sec; > @@ -2648,7 +2663,12 @@ > valid = ngx_http_file_cache_valid(u->conf->cache_valid, > u->headers_in.status_n); > if (valid) { > - valid = now + valid; > + ngx_log_debug3(NGX_LOG_DEBUG_HTTP, r->connection->log, 0, > + "adjust cache valid_sec:%O, " > + "valid:%O, init_age:%d for 304", > + now + valid - r->cache->corrected_initial_age, > + valid, r->cache->corrected_initial_age); > + valid = now + valid - r->cache->corrected_initial_age; Not sure if Age calculations should apply to caching time as calculated with proxy_cache_valid directives, but if it does, it should apply consistently, in all cases where proxy_cache_valid times are used. At least ngx_http_upstream_intercept_errors() case seems to be ignored in the patch. Also, if Age applies, there should be a way to ignore it, similarly to how Cache-Control can be ignored with the proxy_ignore_headers directive. > } > } > > @@ -2672,6 +2692,59 @@ > } > > > +static void > +ngx_http_upstream_update_age(ngx_http_request_t *r, ngx_http_upstream_t *u, > + time_t now) > +{ > + time_t response_time, date, apparent_age, response_delay, age_value, > + corrected_age_value, corrected_initial_age; > + > + /* > + * Update age response header. > + * https://www.rfc-editor.org/rfc/rfc9111.html#name-calculating-age > + */ > + response_time = now; > + if (u->headers_in.date != NULL) { > + date = ngx_parse_http_time(u->headers_in.date->value.data, > + u->headers_in.date->value.len); > + if (date == NGX_ERROR) { > + date = now; > + } > + } else { > + date = now; > + } > + apparent_age = ngx_max(0, response_time - date); > + > + response_delay = response_time - u->request_time; > + age_value = u->headers_in.age_n != -1 ? u->headers_in.age_n : 0; > + corrected_age_value = age_value + response_delay; > + > + corrected_initial_age = ngx_max(apparent_age, corrected_age_value); > + r->headers_out.age_n = corrected_initial_age; Note that this approach, as described in RFC 9111 as a possible way to calculate the Age header, implies that time on both proxy server and the origin server must be "reasonably well synchronized", which is not really the case in many practical situations. If the time on the origin server is mostly arbitrary, for example, in the past, apparent_age might be very large, leading to a very large corrected_initial_age as well, preventing caching. As such, I would rather consider a more robust algorithm instead. > + > + ngx_log_debug8(NGX_LOG_DEBUG_HTTP, u->peer.connection->log, 0, > + "http upstream set age:%O, req:%O, resp:%O, date:%O, " > + "a_age:%O, resp_delay:%O, u_age:%O, c_age:%O", > + corrected_initial_age, u->request_time, response_time, > date, > + apparent_age, response_delay, u->headers_in.age_n, > + corrected_age_value); > + > +#if (NGX_HTTP_CACHE) > + if (r->cache) { > + r->cache->response_time = response_time; > + r->cache->corrected_initial_age = corrected_initial_age; > + if (u->headers_in.adjusting_valid_sec) { > + r->cache->valid_sec -= corrected_initial_age; The "adjusting_valid_sec" logic looks unclear (probably at least needs a better name) and fragile. If I'm reading the code correctly, it will do the wrong thing at least if the upstream server returns something like: Cache-Control: max-age=60 X-Accel-Expires: @<time> since u->headers_in.adjusting_valid_sec won't be cleared. > + ngx_log_debug2(NGX_LOG_DEBUG_HTTP, u->peer.connection->log, 0, > + "http upstream adjusted cache " > + "valid_sec:%O, init_age:%O", > + r->cache->valid_sec, corrected_initial_age); > + } > + } > +#endif If I'm reading the code correctly, this will add the Age header to responses which are not cached. This somewhat contradicts to what RFC 9111 says in the description of the Age header (https://www.rfc-editor.org/rfc/rfc9111.html#section-5.1-6): : The presence of an Age header field implies that the response was : not generated or validated by the origin server for this request. Please also note that in many cases [free]nginx is expected to be seen as the origin server, even if it caches some upstream server responses according to the configuration (in many cases, ignoring Cache-Control or Expires headers). In this case using the Age header even on cached responses might not be desired. It would be great to dig further into possible use cases and make sure all are properly covered. > +} > + > + > static ngx_int_t > ngx_http_upstream_intercept_errors(ngx_http_request_t *r, > ngx_http_upstream_t *u) > @@ -2747,6 +2820,7 @@ > status); > if (valid) { > r->cache->valid_sec = ngx_time() + valid; > + u->headers_in.adjusting_valid_sec = 1; > } > } > > @@ -2952,6 +3026,7 @@ > r->headers_out.status_line = u->headers_in.status_line; > > r->headers_out.content_length_n = u->headers_in.content_length_n; > + r->headers_out.age_n = u->headers_in.age_n; > > r->disable_not_modified = !u->cacheable; > > @@ -4616,6 +4691,7 @@ > > if (valid) { > r->cache->valid_sec = ngx_time() + valid; > + u->headers_in.adjusting_valid_sec = 1; > r->cache->error = rc; > } > } > @@ -4891,6 +4967,7 @@ > } > > r->cache->valid_sec = ngx_time() + n; > + u->headers_in.adjusting_valid_sec = 1; > u->headers_in.expired = 0; > } > > @@ -5053,6 +5130,7 @@ > > default: > r->cache->valid_sec = ngx_time() + n; > + u->headers_in.adjusting_valid_sec = 1; > u->headers_in.no_cache = 0; > u->headers_in.expired = 0; > return NGX_OK; > @@ -5319,6 +5397,39 @@ > > > static ngx_int_t > +ngx_http_upstream_process_age(ngx_http_request_t *r, > + ngx_table_elt_t *h, ngx_uint_t offset) > +{ > + ngx_http_upstream_t *u; > + > + u = r->upstream; > + > + if (u->headers_in.age) { > + ngx_log_error(NGX_LOG_ERR, r->connection->log, 0, > + "upstream sent duplicate header line: "%V: %V", " > + "previous value: "%V: %V"", > + &h->key, &h->value, > + &u->headers_in.age->key, > + &u->headers_in.age->value); > + return NGX_HTTP_UPSTREAM_INVALID_HEADER; > + } > + > + h->next = NULL; > + u->headers_in.age = h; > + u->headers_in.age_n = ngx_atoof(h->value.data, h->value.len); > + > + if (u->headers_in.age_n == NGX_ERROR) { > + ngx_log_error(NGX_LOG_ERR, r->connection->log, 0, > + "upstream sent invalid "Age" header: " > + ""%V: %V"", &h->key, &h->value); > + return NGX_HTTP_UPSTREAM_INVALID_HEADER; > + } Note that returning errors here violates RFC 9111 SHOULD recommendations, notably (https://www.rfc-editor.org/rfc/rfc9111.html#section-5.1): : Although it is defined as a singleton header field, a cache : encountering a message with a list-based Age field value SHOULD : use the first member of the field value, discarding subsequent : ones. : If the field value (after discarding additional members, as per : above) is invalid (e.g., it contains something other than a : non-negative integer), a cache SHOULD ignore the field. Unless there are good reasons, a better solution might be to report warnings and follow the recommendations instead. > + > + return NGX_OK; > +} > + > + > +static ngx_int_t > ngx_http_upstream_copy_header_line(ngx_http_request_t *r, ngx_table_elt_t *h, > ngx_uint_t offset) > { > diff -r 73929a4f3447 -r c81df54e3d03 src/http/ngx_http_upstream.h > --- a/src/http/ngx_http_upstream.h Thu Jun 20 20:26:24 2024 +0900 > +++ b/src/http/ngx_http_upstream.h Thu Jun 20 20:26:41 2024 +0900 > @@ -287,14 +287,17 @@ > > ngx_table_elt_t *cache_control; > ngx_table_elt_t *set_cookie; > + ngx_table_elt_t *age; > > off_t content_length_n; > time_t last_modified_time; > + off_t age_n; See above, "off_t" looks wrong. [...] Hope this helps. -- Maxim Dounin http://mdounin.ru/
