On 6/20/25 18:17, Jillian Morgan wrote:
On Fri, Jun 20, 2025 at 10:32 AM Daniel Kral wrote:
Add the location rule plugin to allow users to specify node affinity
constraints for independent services.
Location rules must specify one or more services, one or more nodes with
optional priorities (t
Add a rules base plugin to allow users to specify different kinds of HA
rules in a single configuration file, which put constraints on the HA
Manager's behavior.
Rule checkers can be registered for every plugin with the
register_check(...) method and are used for checking the feasibility of
the ru
Add test cases for strict negative colocation rules, i.e. where services
must be kept on separate nodes. These verify the behavior of the
services in strict negative colocation rules in case of a failover of
the node of one or more of these services in the following scenarios:
1. 2 neg. colocated
Signed-off-by: Daniel Kral
---
This patch is more of a showcase of how the static files changed.
changes since v1:
- NEW!
api-viewer/apidata.js | 14 ++
ha-manager.1-synopsis.adoc | 8
ha-resources-opts.adoc | 4
3 files changed, 26 insertions(+)
diff --g
Add CRUD API endpoints for HA rules, which assert whether the given
properties for the rules are valid and will not make the existing rule
set infeasible.
Disallowing changes to the rule set via the API, which would make this
and other rules infeasible, makes it safer for users of the HA Manager
t
Daniel,
Firstly I want to say thank you very, very, very much! This extensive work
obviously took a lot of time and effort. I feel like one of my Top-5 gripes
with Proxmox (after moving from oVirt) will finally be resolved by this new
feature.
Next, however, I would like to add my two cents to th
--- Begin Message ---
>>1) Having "location" and "colocation" rules is, I think, going to be
>>unnecessarily confusing for people. While it isn't too complicated to
>>glean the distinction once having read the descriptions of them (and I had
>>to go read the descriptions), they don't convey imm
On Fri, Jun 20, 2025 at 10:32 AM Daniel Kral wrote:
> Add the location rule plugin to allow users to specify node affinity
> constraints for independent services.
>
> Location rules must specify one or more services, one or more nodes with
> optional priorities (the default is 0), and a strictness
On 6/20/25 16:31, Daniel Kral wrote:
Changelog
---------
Just noticed that I missed one detail that might be beneficial to know,
so following the patch changes is easier:
- migrate ha groups internally in the HA Manager to ha location rules,
so that internally these can already be replaced
Add test cases for strict positive colocation rules, i.e. where services
must be kept on the same node together. These verify the behavior of the
services in strict positive colocation rules in case of a failover of
their assigned nodes in the following scenarios:
1. 2 pos. colocated services in a
On June 18, 2025 3:01 pm, Fiona Ebner wrote:
> In certain situations like restoring a backup, it can be useful to
> skip certain properties. This allows dropping outdated properties from
> the schema while still being able to parse property strings that
> contain them, but without allowing all addit
Superseded by:
https://lore.proxmox.com/pve-devel/20250620143148.218469-1-d.k...@proxmox.com/
On 3/25/25 16:12, Daniel Kral wrote:
This RFC patch series is a draft for the implementation to allow users
to specify colocation rules (or affinity/anti-affinity) for the HA
Manager, so that two or mo
Signed-off-by: Daniel Kral
---
This patch is more of a showcase of how the static files changed.
changes since v1:
- NEW!
api-viewer/apidata.js | 363 +
ha-manager.1-synopsis.adoc | 138 ++
2 files changed, 501 insertions(+)
diff --git a/a
Add a mechanism to the node selection subroutine, which enforces the
colocation rules defined in the rules config.
The algorithm makes in-place changes to the set of nodes in such a way
that the final set contains only the nodes on which the colocation rules
allow the service to run, depending on
Signed-off-by: Daniel Kral
---
This patch is more of a showcase of how the static files changed.
changes since v1:
- NEW!
api-viewer/apidata.js | 9 -
datacenter.cfg.5-opts.adoc | 6 +-
2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/api-viewer/apidata.js b/ap
The HA Manager already handles positive and negative colocations for
individual service migration, but the information about these is only
redirected to the HA environment's logger, i.e., for production usage
these messages are redirected to the HA Manager node's syslog.
Therefore, add checks when
Replace the HA group mechanism with the functionally equivalent
location rules' get_location_preference(...), which enforces
the location rules defined in the rules config.
This allows the $groups parameter to be replaced with the $rules
parameter in select_service_node(...) as all
select_service_node(...) in 'none' mode will usually only return no
node, if negative colocations specify more services than nodes
available. In these cases, these cannot be separated as there are no
more nodes left, so these are put in error state for now.
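To illustrate the case described above, here is a minimal sketch (not the
actual select_service_node(...) logic). The function name, the service IDs
and the node names are made up for the example. It only shows why more
negatively colocated services than nodes leaves some service without a node.

```python
def pick_separate_nodes(services, nodes):
    """Give each negatively colocated service its own node.

    Hypothetical helper for illustration only: a service maps to None
    when no unused node is left, which corresponds to the case where
    the services cannot be separated.
    """
    free = sorted(nodes)
    placement = {}
    for sid in services:
        # Take the next unused node, or None if all nodes are taken.
        placement[sid] = free.pop(0) if free else None
    return placement


placement = pick_separate_nodes(
    ["vm:101", "vm:102", "vm:103"], ["node1", "node2"]
)
```

With three services and two nodes, the last service ends up without a node.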
Signed-off-by: Daniel Kral
---
This is
Add test cases, where colocation rules are used with the static
utilization scheduler and the rebalance on start option enabled. These
verify the behavior in the following scenarios:
- 7 services with intertwined colocation rules in a 3 node cluster;
1 node failing
- 3 neg. colocated services in
Remove the HA Groups entry from the datacenter's config tabs if the
use-location-rules feature flag is enabled.
As changing the use-location-rules feature flag doesn't automatically
reload the web interface, show an empty message if the HA Groups page is
still open.
Remove the 'ha-groups' from th
Add 'use-location-rules' feature flag to the datacenter options input
panel to control the behavior of the HA Manager, API endpoints, and web
interface to either use and show HA Groups (disabled), or use and show
HA Location rules (enabled).
The util helper is used in the following patches to control
Add test cases to verify that the rule checkers correctly identify and
remove ill-defined location and colocation rules from the rules:
- Set defaults when reading location and colocation rules
- Dropping location rules, which specify the same service multiple times
- Dropping colocation rules, wh
Add components for basic CRUD operations on the HA rules, which are
currently only possible by manually editing the file, and for viewing
potential errors of contradictory HA rules.
The feature flag 'use-location-rules' controls whether location rules
can be created from the web interface.
Add a section about how to create and modify HA rules, describing their
use cases and documenting their common and plugin-specific properties.
As of now, HA Location rules are controlled by the feature flag
'use-location-rules' in the datacenter config to replace HA Groups.
Signed-off-by: Daniel Kral
Remove the group selector from the Resources grid view and edit window
and replace it with the 'failback' field if the use-location-rules
feature flag is enabled.
Signed-off-by: Daniel Kral
---
changes since v1:
- NEW!
www/manager6/ha/ResourceEdit.js | 27 ++-
www/ma
Assert whether certain properties are allowed to be passed for the HA
groups and HA services API endpoints depending on whether the
use-location-rules feature flag is enabled or disabled.
Signed-off-by: Daniel Kral
---
changes since v1:
- NEW!
src/PVE/API2/HA/Groups.pm | 20 +
Signed-off-by: Daniel Kral
---
changes since v1:
- NEW!
PVE/API2/HAConfig.pm | 8 +++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/PVE/API2/HAConfig.pm b/PVE/API2/HAConfig.pm
index 35f49cbb..d29211fb 100644
--- a/PVE/API2/HAConfig.pm
+++ b/PVE/API2/HAConfig.pm
@@ -12,6 +
Add the location rule plugin to allow users to specify node affinity
constraints for independent services.
Location rules must specify one or more services, one or more nodes with
optional priorities (the default is 0), and a strictness, which is
either
* 0 (loose): services MUST be located on o
Make positively colocated services migrate to the same target node as
the manually migrated service and prevent a service from being manually
migrated to a node that contains negatively colocated services.
The log information here is only redirected to the HA Manager node's
syslog, so user-facing end
Add the colocation rule plugin to allow users to specify inter-service
affinity constraints. Colocation rules must specify two or more services
and a colocation affinity. The inter-service affinity of colocation
rules must be either
* together (positive): keeping services together, or
* separa
Migrate the currently configured HA groups to HA Location rules
in-memory if the use-location-rules feature flag isn't set, so that they
can be applied as such in the next patches and therefore replace HA
groups internally.
Also ignore location rules written to the rules config if the
use-location
Add an option to the VirtFail's name to allow the start and migrate fail
counts to only apply on a certain node number with a specific naming
scheme.
This allows a slightly more elaborate test type, e.g. where a service
can start on one node (or any other in that case), but fails to start on
a spe
Add the failback property in the service config, which is functionally
equivalent to the negation of the HA group's nofailback property.
It is set to be enabled by default as the HA group's nofailback property
was disabled by default.
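As a small sketch of the relation described above: failback is the negation
of the group's nofailback property. The dictionary layout and the function
name are assumptions for illustration, not the actual config structures.

```python
def failback_from_group(group):
    """Derive the service's failback value from an HA group (hypothetical layout).

    nofailback is disabled (0) by default, so failback defaults to enabled.
    """
    return not group.get("nofailback", 0)


default_failback = failback_from_group({})
explicit_failback = failback_from_group({"nofailback": 1})
```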
Signed-off-by: Daniel Kral
---
changes since v1:
- NEW!
Read the rules configuration in each round and update the canonicalized
rules configuration if there were any changes since the last round to
reduce the number of times the rule set is verified.
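One way to picture this is a small cache that only re-runs the expensive
step when the raw config has changed. This is a sketch under assumptions:
change detection via a content digest and the class and method names are
mine, and the patch may detect changes differently.

```python
import hashlib


class RulesCache:
    """Re-canonicalize the rules only when the raw config changed (sketch)."""

    def __init__(self, canonicalize):
        self._canonicalize = canonicalize  # expensive check/canonicalize step
        self._digest = None
        self._rules = None
        self.verifications = 0  # counts how often the rule set was verified

    def get(self, raw_config):
        digest = hashlib.sha1(raw_config.encode()).hexdigest()
        if digest != self._digest:
            # Config changed since the last round: verify again.
            self._rules = self._canonicalize(raw_config)
            self._digest = digest
            self.verifications += 1
        return self._rules


cache = RulesCache(lambda raw: raw.splitlines())
cache.get("rule-a\nrule-b")  # first round: verified
cache.get("rule-a\nrule-b")  # unchanged: cached result is reused
cache.get("rule-a\nrule-c")  # changed: verified again
```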
Signed-off-by: Daniel Kral
---
changes since v1:
- only read and canonicalize rules here... intro
This is a follow-up to the previous RFC patch series for the HA
colocation rules feature, which allows users to specify colocation rules
(or affinity/anti-affinity) for the HA Manager, so that two or more
services are either kept together or apart with respect to each other.
Changelog
---------
I
Add a feature flag 'use-location-rules', which is used to control the
behavior of how the HA WebGUI interface and HA API endpoints handle HA
Groups and HA Location rules.
If the flag is not set, HA Location rules shouldn't be able to be
created or modified, but only allow their behavior to be repr
If there are other properties in the HA config hash, these cannot be set
without also giving a value for shutdown_policy, which is unnecessary as
it already has a default value. Therefore, make it optional.
Signed-off-by: Daniel Kral
---
changes since v1:
- NEW!
src/PVE/DataCenterConfig.pm
Signed-off-by: Daniel Kral
---
This patch is more of a showcase of how the static files changed.
changes since v1:
- NEW!
api-viewer/apidata.js | 28 ++--
1 file changed, 26 insertions(+), 2 deletions(-)
diff --git a/api-viewer/apidata.js b/api-viewer/apidata.js
index
Signed-off-by: Daniel Kral
---
changes since v1:
- only rebased on master
src/PVE/Cluster.pm | 1 +
src/pmxcfs/status.c | 1 +
2 files changed, 2 insertions(+)
diff --git a/src/PVE/Cluster.pm b/src/PVE/Cluster.pm
index 3b1de57..9ec4f66 100644
--- a/src/PVE/Cluster.pm
+++ b/src/PVE/Cluster.
Expose the HA rules API endpoints through the CLI in its own subcommand.
The names of the subsubcommands are chosen to be consistent with the
other commands provided by the ha-manager CLI for services and groups.
The properties specified for the 'rules config' command are chosen to
reflect the co
Remove services from rules, where these services are used, if they are
removed by delete_service_from_config(...), which is called by the
services' delete API endpoint and possibly external callers, e.g. if the
service is removed externally.
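A rough sketch of the removal step, assuming each rule keeps its services
as a hash set keyed by service ID. The data layout and the function name
are hypothetical, and what happens to rules that end up without services
is not covered here.

```python
def remove_service_from_rules(rules, sid):
    """Drop the service ID from every rule that references it (sketch)."""
    for rule in rules.values():
        # pop with a default so rules without this service are left as-is
        rule["services"].pop(sid, None)
    return rules


rules = {
    "rule1": {"services": {"vm:101": 1, "vm:102": 1}},
    "rule2": {"services": {"vm:103": 1}},
}
remove_service_from_rules(rules, "vm:101")
```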
If all of the rules' services have been removed, the rul
This will be used to retrieve the nodes, which a service is currently
putting load on and using their resources, when dealing with colocation
rules in select_service_node(...). For example, a migrating service in a
negative colocation will need to block other negatively colocated
services to migrat
Add checks, which determine infeasible colocation rules, because their
services are already restricted by their location rules in such a way,
that these cannot be satisfied or are reasonable to be proven to be
satisfiable.
Positive colocation rule services need to have at least one common node
to
Add methods to the HA environment to read and write the rules
configuration file for the different environment implementations.
Signed-off-by: Daniel Kral
---
changes since v1:
- reorder use statements
- use property isolation for the rules plugin
- introduce `read_and_check_rules_co
As the signature of select_service_node(...) has become rather long
already, make it more compact by retrieving service- and
affinity-related data directly from the service state in $sd and
introduce a $mode parameter to distinguish the behaviors of $try_next
and $best_scored, which have already be
Explicitly state all the parameters at all call sites for
select_service_node(...) to clarify in which states these are.
The call site in next_state_recovery(...) sets $best_scored to 1, as it
should find the next best node when recovering from the failed node
$current_node. All references to $bes
Add a new package PVE::HashTools to provide helpers for common
operations done on hashes.
These initial helper subroutines implement basic set operations done on
hash sets, i.e. hashes with elements set to a true value, e.g. 1.
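For readers unfamiliar with the idiom, here is a sketch of set operations
on hash sets, i.e. dictionaries whose present elements map to a true value.
The function names are hypothetical and are not taken from PVE::HashTools,
which is written in Perl.

```python
def set_intersect(a, b):
    """Elements that are set to a true value in both hash sets."""
    return {k: 1 for k in a if a[k] and b.get(k)}


def set_union(a, b):
    """Elements that are set to a true value in either hash set."""
    result = {k: 1 for k in a if a[k]}
    result.update({k: 1 for k in b if b[k]})
    return result


def set_difference(a, b):
    """Elements that are set in the first hash set but not in the second."""
    return {k: 1 for k in a if a[k] and not b.get(k)}


left = {"node1": 1, "node2": 1}
right = {"node2": 1, "node3": 1}
common = set_intersect(left, right)
```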
Signed-off-by: Daniel Kral
---
changes since v1:
- moved from pv
applied most of the patches, except for 6, 28/29 and 31/32.
6 is dropped altogether, the other 4 can be included in the switch to
blockdev to avoid affecting the block graph before that happens.
folded in a small fixup for 7/8, and committed the changed expected
output for tests.
thanks a lot
Hi all,
This patch addresses excessive "connection lost" and "connection reset" log
spam on iSCSI targets caused by Proxmox storage monitoring performing TCP
connection tests every 10 seconds, even when iSCSI sessions are active.
The issue appears as continuous log entries on iSCSI targets:
"ctld
On June 18, 2025 3:01 pm, Fiona Ebner wrote:
> With '-blockdev', it is necessary to activate the volumes to generate
> the command line, because it can be necessary to check whether the
> volume is a block device or a regular file.
>
> Do not deactivate after commandline generation for 'qm showcmd
On June 18, 2025 3:01 pm, Fiona Ebner wrote:
> It was not possible to start a QEMU instance with these options set
> since QEMU version 3.1, QEMU commit b24ec3c462 ("block: Remove
> deprecated -drive geometry options") and thus also not to take a
> backup. It is still possible to restore an old bac
Am 20.06.25 um 13:03 schrieb Fabian Grünbichler:
> On June 18, 2025 3:01 pm, Fiona Ebner wrote:
>> It was not possible to start a QEMU instance with these options set
>> since QEMU version 3.1, QEMU commit b24ec3c462 ("block: Remove
>> deprecated -drive geometry options") and thus also not to take