Copilot commented on code in PR #12879:
URL: https://github.com/apache/trafficserver/pull/12879#discussion_r3018163551


##########
src/proxy/http/HttpSM.cc:
##########
@@ -1685,9 +1685,20 @@ HttpSM::handle_api_return()
 
   switch (t_state.next_action) {
   case HttpTransact::StateMachineAction_t::TRANSFORM_READ: {
-    HttpTunnelProducer *p = setup_transfer_from_transform();
-    perform_transform_cache_write_action();
-    tunnel.tunnel_run(p);
+    if (t_state.internal_msg_buffer) {
+      // A plugin replaced the response body via TSHttpTxnErrorBodySet().
+      // Use internal transfer instead of the transform tunnel.
+      SMDbg(dbg_ctl_http, "plugin set internal body, bypassing transform for 
internal transfer");
+      if (server_txn != nullptr) {
+        do_drain_server_response_body();
+        release_server_session();
+      }
+      setup_internal_transfer(&HttpSM::tunnel_handler);

Review Comment:
   The new internal body handling in TRANSFORM_READ treats any non-null 
internal_msg_buffer as a plugin-set response body, but internal_msg_buffer is 
also used by TSHttpTxnServerRequestBodySet(). If api_server_request_body_set is 
true, this branch would incorrectly bypass the transform/server tunnel and send 
an internal response instead. Please gate this on 
`!t_state.api_server_request_body_set` (same rationale as 
how_to_open_connection()).



##########
tests/gold_tests/pluginTest/header_rewrite/rules/rule_set_body_from_send_resp.conf:
##########
@@ -0,0 +1,21 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Replace the origin response body with content fetched from a secondary URL
+# at SEND_RESPONSE_HDR_HOOK.
+cond %{SEND_RESPONSE_HDR_HOOK}
+set-body-from http://www.example.com/custom_body

Review Comment:
   For readability/consistency with the other header_rewrite rule files in this 
directory, consider indenting the operator line under the `cond`. This file 
currently has `set-body-from` unindented, unlike most other rules.
   ```suggestion
     set-body-from http://www.example.com/custom_body
   ```



##########
src/proxy/http/HttpSM.cc:
##########
@@ -1720,6 +1731,15 @@ HttpSM::handle_api_return()
       }
 
       setup_blind_tunnel(true, initial_data);
+    } else if (t_state.internal_msg_buffer) {
+      // A plugin replaced the origin response body via 
TSHttpTxnErrorBodySet().
+      // Drain the origin body if possible, then use internal transfer.
+      SMDbg(dbg_ctl_http, "plugin set internal body, using internal transfer 
instead of server tunnel");
+      if (server_txn != nullptr) {
+        do_drain_server_response_body();
+        release_server_session();
+      }
+      setup_internal_transfer(&HttpSM::tunnel_handler);

Review Comment:
   Same issue as TRANSFORM_READ: internal_msg_buffer is also used for 
TSHttpTxnServerRequestBodySet(). If api_server_request_body_set is true, this 
SERVER_READ branch would incorrectly treat the request body buffer as a 
replacement response body and divert to setup_internal_transfer(). Please add 
`!t_state.api_server_request_body_set` to this condition.



##########
src/proxy/http/HttpSM.cc:
##########
@@ -1685,9 +1685,20 @@ HttpSM::handle_api_return()
 
   switch (t_state.next_action) {
   case HttpTransact::StateMachineAction_t::TRANSFORM_READ: {
-    HttpTunnelProducer *p = setup_transfer_from_transform();
-    perform_transform_cache_write_action();
-    tunnel.tunnel_run(p);
+    if (t_state.internal_msg_buffer) {
+      // A plugin replaced the response body via TSHttpTxnErrorBodySet().
+      // Use internal transfer instead of the transform tunnel.
+      SMDbg(dbg_ctl_http, "plugin set internal body, bypassing transform for 
internal transfer");
+      if (server_txn != nullptr) {
+        do_drain_server_response_body();
+        release_server_session();

Review Comment:
   These new code paths drain the origin response body and then call 
release_server_session() expecting the connection to be reusable, but 
release_server_session() currently only returns sessions to the pool for 304 / 
(auth) HEAD responses. For typical 200/403 responses, the drained connection 
will still be closed, making the drain work ineffective and contradicting the 
new docs/test intent. Either widen the pooling condition for the 
successful-drain case or perform an explicit `server_txn->release()` when drain 
succeeds.
   ```suggestion
           server_txn->release();
   ```



##########
doc/admin-guide/plugins/header_rewrite.en.rst:
##########
@@ -1142,35 +1142,102 @@ set-body
 
   set-body <text>
 
-Sets the body to ``<text>``. Can also be used to delete a body with ``""``. 
This is only useful when overriding the origin status, i.e.
-intercepting/pre-empting a request so that you can override the body from the 
body-factory with your own.
+Sets the response body to ``<text>``. Can also be used to delete a body with 
``""``.
+
+This operator can be used to replace the response body in three scenarios:
+
+1. **Synthetic responses (no origin connection)**: When used at 
``REMAP_PSEUDO_HOOK``
+   (before the origin connection is established), ``set-body`` will skip the 
origin
+   connection entirely and serve a synthetic response directly. Use 
``set-status`` to
+   control the response status code (defaults to 200 OK if not specified). 
This is
+   useful for blocking requests, serving synthetic pages, or returning canned 
responses
+   without touching the origin.
+
+2. **ATS-generated responses**: When overriding the origin status at remap 
time (e.g.
+   with ``set-status``), you can use ``set-body`` at 
``SEND_RESPONSE_HDR_HOOK`` to
+   override the body from the body-factory with your own.
+
+3. **Origin server responses**: When the origin returns a response with a body,
+   ``set-body`` can replace the origin body with the specified text. This is 
useful
+   for sanitizing error responses (e.g. replacing a 403 body that contains 
sensitive
+   information). Use ``READ_RESPONSE_HDR_HOOK`` to inspect the origin response
+   headers and replace the body, or ``SEND_RESPONSE_HDR_HOOK`` to replace the 
body
+   just before sending to the client.
+
+When replacing an origin response body (scenario 3), ATS will attempt to drain 
the
+origin body from the server connection buffer so the connection can be reused. 
If the
+origin body is too large to be fully buffered or uses chunked encoding, the 
server
+connection will be closed instead.
+
+.. note::
+
+   When replacing an origin response body, the origin's response headers are
+   preserved. If the origin headers contain sensitive information (e.g.
+   ``Server``, ``X-Amz-Request-Id``, or other internal identifiers), use
+   ``rm-header`` to strip them::
+
+      cond %{READ_RESPONSE_HDR_HOOK} [AND]
+      cond %{STATUS} =403
+          set-body "Access Denied"
+          rm-header Server
+          rm-header X-Amz-Request-Id
+
+   If a response transform plugin is also active on the same transaction,
+   ``set-body`` takes precedence and the transform is bypassed.
+
+Example: block a request at remap time without contacting the origin::
+
+   cond %{REMAP_PSEUDO_HOOK} [AND]
+   cond %{CLIENT-URL:PATH} /blocked
+       set-status 403
+       set-body "Access Denied"
+
+Example: sanitize a 403 response from the origin::
+
+   cond %{READ_RESPONSE_HDR_HOOK} [AND]
+   cond %{STATUS} =403
+       set-body "Access Denied"
 
 set-body-from
 ~~~~~~~~~~~~~
 ::
 
   set-body-from <URL>
 
-Will call ``<URL>`` (see URL in `URL Parts`_) to retrieve a custom error 
response
-and set the body with the result. Triggering this rule on an OK transaction 
will
-send a 500 status code to the client with the desired response. If this is 
triggered
-on any error status code, that original status code will be sent to the client.
+Will call ``<URL>`` (see URL in `URL Parts`_) to retrieve a response body from 
a
+secondary URL and use it to replace the origin's response body. The origin's 
response
+status code and headers are preserved, and the fetched content replaces only 
the body.
+
+This operator can be used at ``READ_RESPONSE_HDR_HOOK`` or 
``SEND_RESPONSE_HDR_HOOK``.
+When the origin response body is fully buffered and has a known 
``Content-Length``,
+ATS will drain the origin body to allow connection reuse. Otherwise, the origin
+connection is closed.
+
+If the fetch fails or times out, the original origin response (including its 
body)
+is sent unmodified. This means ``set-body-from`` does **not** guarantee the 
origin
+body will be hidden from the client -- only successful fetches result in body
+replacement. Use ``set-body`` instead if the replacement text is static and 
must

Review Comment:
   Documentation says that if the fetch fails or times out, the origin response 
(including its body) is sent unmodified. However the updated tests/rules 
indicate that some fetch connection failures can still produce an ATS-generated 
error page that FetchSM treats as a successful fetch, which will replace the 
origin body. Please clarify this behavior so users understand that certain 
fetch errors may still result in body replacement (with ATS error page content).
   ```suggestion
   In most cases, if the fetch fails or times out, the original origin response 
(including its body)
   is sent unmodified. However, certain fetch connection failures can cause ATS 
to generate an internal
   error page that ``set-body-from`` treats as a successful fetch, and that 
content will replace the
   origin body. This means ``set-body-from`` does **not** guarantee the origin 
body will be hidden from
   the client or always preserved. Use ``set-body`` instead if the replacement 
text is static and must
   ```



##########
doc/admin-guide/plugins/header_rewrite.en.rst:
##########
@@ -1142,35 +1142,102 @@ set-body
 
   set-body <text>
 
-Sets the body to ``<text>``. Can also be used to delete a body with ``""``. 
This is only useful when overriding the origin status, i.e.
-intercepting/pre-empting a request so that you can override the body from the 
body-factory with your own.
+Sets the response body to ``<text>``. Can also be used to delete a body with 
``""``.
+
+This operator can be used to replace the response body in three scenarios:
+
+1. **Synthetic responses (no origin connection)**: When used at 
``REMAP_PSEUDO_HOOK``
+   (before the origin connection is established), ``set-body`` will skip the 
origin
+   connection entirely and serve a synthetic response directly. Use 
``set-status`` to
+   control the response status code (defaults to 200 OK if not specified). 
This is
+   useful for blocking requests, serving synthetic pages, or returning canned 
responses
+   without touching the origin.
+
+2. **ATS-generated responses**: When overriding the origin status at remap 
time (e.g.
+   with ``set-status``), you can use ``set-body`` at 
``SEND_RESPONSE_HDR_HOOK`` to
+   override the body from the body-factory with your own.
+
+3. **Origin server responses**: When the origin returns a response with a body,
+   ``set-body`` can replace the origin body with the specified text. This is 
useful
+   for sanitizing error responses (e.g. replacing a 403 body that contains 
sensitive
+   information). Use ``READ_RESPONSE_HDR_HOOK`` to inspect the origin response
+   headers and replace the body, or ``SEND_RESPONSE_HDR_HOOK`` to replace the 
body
+   just before sending to the client.
+
+When replacing an origin response body (scenario 3), ATS will attempt to drain 
the
+origin body from the server connection buffer so the connection can be reused. 
If the
+origin body is too large to be fully buffered or uses chunked encoding, the 
server
+connection will be closed instead.
+
+.. note::
+
+   When replacing an origin response body, the origin's response headers are
+   preserved. If the origin headers contain sensitive information (e.g.
+   ``Server``, ``X-Amz-Request-Id``, or other internal identifiers), use
+   ``rm-header`` to strip them::

Review Comment:
   The docs state that the origin response headers are preserved when replacing 
the body, but the internal transfer path unconditionally sets `Content-Type` to 
`internal_msg_buffer_type` (or defaults it to `text/html` when the plugin 
passes nullptr). For set-body / set-body-from, this will overwrite the origin 
`Content-Type` header unless the operator explicitly supplies a mimetype. 
Please either clarify this in the docs or preserve the existing `Content-Type` 
when one is already present (and no mimetype is provided).
   ```suggestion
      When replacing an origin response body, most of the origin's response 
headers
      are preserved. Entity headers such as ``Content-Length`` and
      ``Transfer-Encoding`` are recomputed, and ``Content-Type`` is **not**
      preserved: it is set to the mimetype passed to ``set-body``, or defaults 
to
      ``text/html`` if no mimetype is provided. If the origin headers contain
      sensitive information (e.g. ``Server``, ``X-Amz-Request-Id``, or other
      internal identifiers), use ``rm-header`` to strip them::
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to