cnkedao opened a new issue, #13118:
URL: https://github.com/apache/trafficserver/issues/13118

   ## Summary
   
   In ATS 9.2.x, when a transaction is routed through 
`HttpTransact::handle_no_cache_operation_on_forward_server_response()` (i.e. 
`cache_info.action == CACHE_DO_NO_ACTION`, which is the typical path for Range 
requests when `proxy.config.http.cache.range.write = 0`, for non-cacheable 
methods, when caching is disabled, etc.), the `via_string` slots 
`VIA_SERVER_RESULT` and `VIA_PROXY_RESULT` are **only updated for `200 OK` and 
`304 Not Modified`**. For `206 Partial Content` and for everything that falls 
into the `default` branch (`404`, `5xx`, …), the function does *not* call 
`SET_VIA_STRING(...)` at all.
   
   As a result, the response `Via` header for these very common cases looks 
like:
   
   ```
   Via: http/1.1 cache-foo (ATS/9.2.x [cMs f ])
   ```
   
   with two empty slots where `VIA_SERVER_RESULT` and `VIA_CACHE_FILL_ACTION` 
should have been recorded. This makes the `Via` header much less useful for 
operators trying to distinguish between "origin actually answered" vs. 
"internally synthesized error" (e.g. DNS failure → `500 Cannot find server.`) — 
both look identical at `proxy.config.http.insert_response_via_str = 2`.
   
   ## Affected version
   
   - `9.2.x` (verified against the current `9.2.x` branch on GitHub, file 
`proxy/http/HttpTransact.cc`)
   - Likely present in earlier versions as well; please confirm against 
`master` / `10.0.x`.
   
   ## Reproduction
   
   ATS configuration (defaults are sufficient — these are the relevant ones):
   
   ```
   CONFIG proxy.config.http.insert_response_via_str    INT 2
   CONFIG proxy.config.http.cache.range.lookup         INT 1
   CONFIG proxy.config.http.cache.range.write          INT 0
   ```
   
   A simple remap to any HTTP origin, then:
   
   ### Case A — Range request returning `206`
   
   ```bash
   curl -v -x 127.0.0.1:8686 http://www.example.com/videos/output.mp4 -r 0-1 -o 
/dev/null
   ```
   
   Response (excerpt):
   
   ```
   HTTP/1.1 206 Partial Content
   ...
   Via: http/1.1 CDN (Up [cMs f ])
   ```
   
   ### Case B — Range request returning `404`
   
   ```bash
   curl -v -x 127.0.0.1:8686 http://www.example.com/videos/output.mp4 -r 0-1 -o 
/dev/null
   ```
   
   ```
   HTTP/1.1 404 Not Found
   ...
   Via: http/1.1 CDN (Up [cMs f ])
   ```
   
   ### Case C — DNS failure, ATS-synthesized `500`
   
   ```bash
   curl -v -x 127.0.0.1:8686 http://www.example.com/
   ```
   
   ```
   HTTP/1.1 500 Cannot find server.
   ...
   Via: http/1.1 CDN (Up [cMs f ])
   ```
   
   In all three cases the `Via` debug field is identical (`[cMs f ]`), even 
though Case A and Case B did contact the origin successfully and Case C never 
even attempted a server connection.
   
   ## Expected behavior
   
   For Case A (`206`) and Case B (`404`, going through the `default` branch), 
the origin *did* serve the response, so the `Via` debug field should reflect 
that:
   
   - `VIA_SERVER_RESULT = 'S'` (`VIA_SERVER_SERVED`)
   - `VIA_PROXY_RESULT  = 'S'` (`VIA_PROXY_SERVED`)
   
   i.e. the field should look like `[cMsSf ]` — consistent with what already 
happens for `200`/`304`.
   
   For Case C (DNS failure), the current `[cMs f ]` is "expected" because no 
server connection was ever made, but the `VIA_ERROR_TYPE` slot does carry 
useful info (`'D'` for `VIA_ERROR_DNS_FAILURE`) — it is just not visible at 
`insert_response_via_str = 2` because that level only emits the `[VIA_CACHE, 
VIA_PROXY)` window. That part is by design and not the subject of this issue, 
but it is worth mentioning that operators currently cannot tell Case A/B from 
Case C without bumping verbosity to `3`.
   
   ## Root cause (source pointers, 9.2.x)
   
   The construction of the response `Via` debug window happens in 
`HttpTransactHeaders::insert_via_header_in_response()`:
   
   ```cpp
   // proxy/http/HttpTransactHeaders.cc, ~line 872
   if (s->txn_conf->insert_response_via_string > 1) {
     *via_string++ = '[';
   
     if (s->txn_conf->insert_response_via_string > 2) { // Highest verbosity
       via_string += nstrcpy(via_string, incoming_via);
     } else {
       memcpy(via_string, incoming_via + VIA_CACHE, VIA_PROXY - VIA_CACHE);
       via_string += VIA_PROXY - VIA_CACHE;
     }
     *via_string++ = ']';
     ...
   }
   ```
   
   So at verbosity `2` the visible window covers exactly:
   
   | index | name                    |
   |-------|-------------------------|
   | 2     | `VIA_CACHE`             |
   | 3     | `VIA_CACHE_RESULT`      |
   | 4     | `VIA_SERVER`            |
   | 5     | `VIA_SERVER_RESULT`     |
   | 6     | `VIA_CACHE_FILL`        |
   | 7     | `VIA_CACHE_FILL_ACTION` |
   
   `via_string[]` is initialized in `HttpTransact::State()` 
(`proxy/http/HttpTransact.h`, ~line 830): all slots are filled with `' '`, then 
the six "marker" slots are overwritten with their fixed letters 
(`u/c/s/f/p/e`). Result slots are only set later when the corresponding 
`SET_VIA_STRING(...)` is executed.
   
   For Range requests with `cache_range_write = 0`, `HandleCacheOpenReadMiss()` 
sets `VIA_CACHE_RESULT = 'M'` and forces `cache_info.action = 
CACHE_DO_NO_ACTION`:
   
   ```cpp
   // proxy/http/HttpTransact.cc, ~line 3319
   SET_VIA_STRING(VIA_CACHE_RESULT, VIA_CACHE_MISS);
   ...
   } else if ((s->hdr_info.client_request.presence(MIME_PRESENCE_RANGE) && 
!s->txn_conf->cache_range_write) ||
              does_method_effect_cache(s->method) == false || s->range_setup == 
RANGE_NOT_SATISFIABLE ||
              s->range_setup == RANGE_NOT_HANDLED) {
     s->cache_info.action = CACHE_DO_NO_ACTION;
   }
   ```
   
   Because of `CACHE_DO_NO_ACTION`, `handle_response_from_server()` then 
dispatches to `handle_no_cache_operation_on_forward_server_response()`:
   
   ```cpp
   // proxy/http/HttpTransact.cc, ~line 4170
   // Just tunnel?
   TxnDebug("http_trans", "[hfsco] cache action: %s",
            HttpDebugNames::get_cache_action_name(s->cache_info.action));
   handle_no_cache_operation_on_forward_server_response(s);
   ```
   
   And here is the actual problem — `case HTTP_STATUS_PARTIAL_CONTENT` and 
`default` simply don't update the `Via` slots:
   
   ```cpp
   // proxy/http/HttpTransact.cc, ~line 4791
   switch (s->hdr_info.server_response.status_get()) {
   case HTTP_STATUS_OK:
     ...
     SET_VIA_STRING(VIA_SERVER_RESULT, VIA_SERVER_SERVED);
     SET_VIA_STRING(VIA_PROXY_RESULT,  VIA_PROXY_SERVED);
     ...
     break;
   case HTTP_STATUS_NOT_MODIFIED:
     ...
     SET_VIA_STRING(VIA_SERVER_RESULT, VIA_SERVER_NOT_MODIFIED);
     SET_VIA_STRING(VIA_PROXY_RESULT,  VIA_PROXY_NOT_MODIFIED);
     ...
     break;
   case HTTP_STATUS_HTTPVER_NOT_SUPPORTED:
     ...
     return;
   case HTTP_STATUS_PARTIAL_CONTENT:
     // If we get this back we should be just passing it through.
     ink_assert(s->cache_info.action == CACHE_DO_NO_ACTION);
     s->next_action = SM_ACTION_SERVER_READ;
     break;                       // <-- no SET_VIA_STRING
   default:
     TxnDebug("http_trans", "[hncoofsr] server sent back something other than 
100,304,200");
     /* Default behavior is to pass-through response to the client */
     ink_assert(s->cache_info.action == CACHE_DO_NO_ACTION);
     s->next_action = SM_ACTION_SERVER_READ;
     break;                       // <-- no SET_VIA_STRING
   }
   ```
   
   For comparison, the cached path 
(`handle_cache_operation_on_forward_server_response()`, same file, ~line 4435) 
does set both fields for the same `default` branch:
   
   ```cpp
   default:
     TxnDebug("http_trans", "[hcoofsr] response code: %d", 
server_response_code);
     SET_VIA_STRING(VIA_SERVER_RESULT, VIA_SERVER_SERVED);
     SET_VIA_STRING(VIA_PROXY_RESULT,  VIA_PROXY_SERVED);
     ...
   ```
   
   So the no-cache path is simply inconsistent with the cache path.
   
   `VIA_CACHE_FILL_ACTION` staying empty in this scenario is correct: 
`cache_info.action == CACHE_DO_NO_ACTION` falls through the `default` of the 
cache-action `switch` later and no fill action ever happens. That is by design 
and not part of this report.
   
   ## Proposed fix
   
   Update the no-cache path so that the `Via` debug field accurately reflects 
that the origin actually answered. Minimal patch (untested, against `9.2.x`):
   
   ```diff
   --- a/proxy/http/HttpTransact.cc
   +++ b/proxy/http/HttpTransact.cc
   @@ -4791,16 +4851,22 @@ 
HttpTransact::handle_no_cache_operation_on_forward_server_response(State *s)
      case HTTP_STATUS_PARTIAL_CONTENT:
        // If we get this back we should be just passing it through.
        ink_assert(s->cache_info.action == CACHE_DO_NO_ACTION);
   +    SET_VIA_STRING(VIA_SERVER_RESULT, VIA_SERVER_SERVED);
   +    SET_VIA_STRING(VIA_PROXY_RESULT,  VIA_PROXY_SERVED);
        s->next_action = SM_ACTION_SERVER_READ;
        break;
      default:
        TxnDebug("http_trans", "[hncoofsr] server sent back something other 
than 100,304,200");
        /* Default behavior is to pass-through response to the client */
   
        ink_assert(s->cache_info.action == CACHE_DO_NO_ACTION);
   +    SET_VIA_STRING(VIA_SERVER_RESULT, VIA_SERVER_SERVED);
   +    SET_VIA_STRING(VIA_PROXY_RESULT,  VIA_PROXY_SERVED);
        s->next_action = SM_ACTION_SERVER_READ;
        break;
      }
   ```
   
   After the patch the visible `Via` window would become:
   
   - `206` (Range, no cache write):  `[cMsSf ]`
   - `404` from origin (Range path): `[cMsSf ]`
   - DNS failure synthesized `500`:  `[cMs f ]` (unchanged, since the server 
side never ran)
   
   …which lets operators distinguish "origin returned this status" from "ATS 
synthesized an error" at a glance, just from `insert_response_via_str = 2`.
   
   If the project prefers to leave `default` alone (because some non-2xx codes 
might have been generated locally by other code paths I'm not considering), 
then at least adding the two `SET_VIA_STRING` lines to the 
`HTTP_STATUS_PARTIAL_CONTENT` branch would already resolve the most common 
operator confusion.
   
   ## Environment
   
   - ATS version: `9.2.11` (also reproducible on other 9.2.x patch levels)
   - OS: Ubuntu 22.04
   - `proxy.config.http.insert_response_via_str = 2`
   - `proxy.config.http.cache.range.lookup = 1`
   - `proxy.config.http.cache.range.write = 0`
   - No relevant plugins (reproduces with a stock build + a single forward 
remap)
   
   ## Additional notes
   
   - Workaround for operators: bump `proxy.config.http.insert_response_via_str` 
to `3` to expose the full 12-character `via_string`, including 
`VIA_ERROR_TYPE`. That at least lets you distinguish a DNS-failure-synthesized 
500 (`VIA_ERROR_TYPE = 'D'`) from a real origin response (`VIA_ERROR_TYPE = 
'N'`).
   - Documentation impact: the admin guide section describing the `Via` header 
should clarify that, prior to a fix, slot 5 (`VIA_SERVER_RESULT`) is *not* 
populated for `206`/`default` when the transaction is forwarded without 
caching. Today this asymmetry is undocumented and surprising.
   
   Happy to put up a PR with the diff above if the maintainers agree on the 
direction.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to