yknoya commented on code in PR #12729:
URL: https://github.com/apache/trafficserver/pull/12729#discussion_r2618628728
##########
src/proxy/http/HttpSM.cc:
##########
@@ -1823,6 +1823,9 @@ HttpSM::state_http_server_open(int event, void *data)
_netvc->do_io_write(this, 1, _netvc_reader);
_netvc->set_inactivity_timeout(this->get_server_connect_timeout());
+ // Pre-emptively set a server connect failure that will be cleared once
a WRITE_READY is received from origin or
+ // bytes are received back
+ t_state.set_connect_fail(EIO);
Review Comment:
Sorry for the delay.
I investigated the points you raised and implemented some changes.
Below is a summary of the investigation performed during the modification.
First, I found that the master branch already contains logic that invokes
the `set_connect_fail` method when a connection attempt fails:
https://github.com/apache/trafficserver/blob/90dbc21a541986db6b223649cb4f798e190a550f/src/proxy/http/ConnectingEntry.cc#L142
https://github.com/apache/trafficserver/blob/90dbc21a541986db6b223649cb4f798e190a550f/src/proxy/http/HttpSM.cc#L1874
https://github.com/apache/trafficserver/blob/90dbc21a541986db6b223649cb4f798e190a550f/src/proxy/http/HttpSM.cc#L1130-L1136
Since `set_connect_fail` is executed when a connection actually fails, it
seemed unnecessary to pre-set `EIO` before the connection is made, so I looked
into the reasoning.
My assumption is that the pre-set `EIO` exists to avoid updating
`connect_result` when `set_connect_fail` is invoked **after** the connection
has already succeeded.
There are several locations where `set_connect_fail` may run after the
connection is established, but due to the following logic, only
`cause_of_death_errno` is updated while `connect_result` is left unchanged.
This is because `connect_result` is set to 0 after the connection succeeds.
https://github.com/apache/trafficserver/blob/90dbc21a541986db6b223649cb4f798e190a550f/include/proxy/http/HttpTransact.h#L931-L936
Based on these findings, I identified the following issues:
1. The `set_connect_fail` method is used not only for connection-related
failures but also for other types of errors (its name no longer matches its
actual behavior).
2. The logic determines whether the error occurred during connection by
relying on a pre-set `EIO`, which is not ideal.
I was not able to resolve these issues perfectly cleanly, but I believe
improvements are possible, and I applied the following changes:
- Renamed `set_connect_fail` to `set_fail`.
- Changed the initial value of `connect_result` to `-UNKNOWN_INTERNAL_ERROR`
(instead of `EIO`), matching the initial value of `cause_of_death_errno` and
making the intent clearer.
- Updated `set_fail` so that it determines whether the failure occurred
during connection by checking `next_action` and the value of `connect_result`.
- Removed the logic that pre-sets `EIO` before a connection attempt.
Additionally, although slightly outside the main issue, I made the following
related improvements:
- Added a `set_success` method to reset both `connect_result` and
`cause_of_death_errno`, since callers of `clear_connect_fail` were not
resetting `cause_of_death_errno`.
- Renamed `clear_connect_fail` to `set_connect_success`.
- Ensured that `set_success` is always invoked upon successful connection by
calling it within `handle_http_server_open`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]