Le 06/02/2021 à 20:47, William Dauchy a écrit :
Hello,

This is a followup from last week cleaning regarding check and agent
check. This patch series brings some more coherency on the CLI side. I
also put some minor cleaning.

William Dauchy (6):
   CLEANUP: check: fix some typo in comments
   CLEANUP: tools: typo in `strl2irc` mention
   MEDIUM: cli: add check-addr command
   MEDIUM: cli: add agent-port command
   MEDIUM: server: support {check,agent}_addr, agent_port in server state
   CLEANUP: server: add missing space in server-state error output

  doc/management.txt                            |  15 +-
  include/haproxy/server-t.h                    |   9 +-
  .../checks/1be_40srv_odd_health_checks.vtc    |   2 +-
  .../checks/40be_2srv_odd_health_checks.vtc    |   2 +-
  reg-tests/checks/4be_1srv_health_checks.vtc   |   6 +-
  src/check.c                                   |  18 +-
  src/proxy.c                                   |  41 ++--
  src/server.c                                  | 192 ++++++++++++++----
  src/tools.c                                   |   2 +-
  9 files changed, 213 insertions(+), 74 deletions(-)


Thanks William ! I merged the first two patches. For the others, I've some questions/comments.

First, there is a test to be sure the agent-check is enabled before updating the agent address and/or port. Do you think it should also be done for the health-check? Because, for now, it is possible to set an address on a disabled (or even not configured) health-check. But it is not possible for an agent-check. Given that it is not possible to enable/disable an agent-check or an health-check from the CLI, I guess the warning for both makes sense.

Then, there are some problems when the server-state file is loaded. The trash chunk used by srv_update_state() is overwritten when update_server_agent_addr_port() is called. It is not obvious, but the get_trash_chunk() function returns cyclically one of two trash chunks. srv_update_state() uses a trash chunk. When called, update_server_check_addr_port() uses the other one. So, update_server_agent_addr_port() re-uses the first trash chunk, the same than srv_update_state(). A solution may be to allocate a dedicated chunk in srv_update_state(), using alloc_trash_chunk().

A more annoying problem happens when an old state file is loaded (prior these changes). Last parameters are not defined, params[18] (check-addr), params[19] (agent-addr) and params[20] (agent-port) are set to NULL. Thus, HAProxy crashes when the health-check address is compared to "-". In fact, this comes from the SRV_STATE_FILE_NB_FIELDS_VERSION_1 macro. It should be set to 24. You added 3 new fields but just incremented it by 1. Hum... No, there is also another problem. In srv_state_parse_line(). SRV_STATE_FILE_NB_FIELDS_VERSION_1 must be equal to SRV_STATE_FILE_MAX_FIELDS. Or better, it should really reflect the argument number of the version 1, so 21. And it must be compared to srv_arg instead of arg. It is a bug since the 1.8 (commit 316947196).

However, this means with these changes, a cold restart is required. In the 2.4, 5 new fields were added to the server-state file. Is it expected to perform a cold restart when upgrading to the 2.4 from a prior major version ? If so, I'm ok. But, it may be a good idea to change the state file version then to not silently ignore the state file. Otherwise, I guess it is possible to make these 5 new fields optional, isn't it ?

William, I'm sorry if I'm not really clear but the subject is a bit foggy for me :) Do you feel confident to handle all the changes ?

--
Christopher Faulet

Reply via email to