Copilot commented on code in PR #13076:
URL: https://github.com/apache/cloudstack/pull/13076#discussion_r3152672827
##########
plugins/storage/volume/linstor/src/main/java/org/apache/cloudstack/storage/datastore/util/LinstorUtil.java:
##########
@@ -401,6 +401,57 @@ public static List<ResourceDefinition>
getRDListStartingWith(DevelopersApi api,
.collect(Collectors.toList());
}
+ /**
+ * Default per-call timeout for {@link #waitForResourceDefinitionDeleted}.
Long enough for a
+ * healthy LINSTOR controller to finish a normal delete; short enough not
to block the calling
+ * agent thread for too long if the delete is genuinely stuck.
+ */
+ public static final long DEFAULT_RD_DELETE_VERIFY_TIMEOUT_MILLIS = 30_000L;
+
+ /**
+ * Returns {@code true} if the named resource definition is no longer
present on the LINSTOR
+ * controller. Used after a {@code resourceDefinitionDelete} to verify the
delete actually
+ * completed (LINSTOR can return success on the API call while the
resource lingers in
+ * DELETING state due to peer issues, lost quorum, or down satellites).
+ */
+ public static boolean isResourceDefinitionGone(DevelopersApi api, String
rscName) throws ApiException {
+ List<ResourceDefinition> all = api.resourceDefinitionList(null, false,
null, null, null);
+ if (all == null) {
+ return true;
+ }
+ return all.stream().noneMatch(rd ->
rscName.equalsIgnoreCase(rd.getName()));
+ }
+
+ /**
+ * Polls the controller until the named resource definition is gone or the
timeout elapses.
+ * Returns {@code true} if the resource was confirmed gone, {@code false}
if it was still
+ * present (or the controller kept erroring) at the deadline. Callers
should NOT throw on a
+ * {@code false} return — the upstream API call already reported success
and the operator
+ * may need to investigate manually. Log a WARN with the resource name
instead.
+ */
+ public static boolean waitForResourceDefinitionDeleted(DevelopersApi api,
String rscName, long timeoutMillis) {
+ final long deadline = System.currentTimeMillis() + timeoutMillis;
+ while (true) {
+ try {
+ if (isResourceDefinitionGone(api, rscName)) {
+ return true;
+ }
+ } catch (ApiException e) {
+ LOGGER.debug("LINSTOR delete-verify poll failed for {}: {}",
rscName, e.getMessage());
+ // Keep polling — controller may be transiently unavailable.
+ }
+ if (System.currentTimeMillis() >= deadline) {
+ return false;
+ }
+ try {
+ Thread.sleep(1_000L);
+ } catch (InterruptedException ie) {
+ Thread.currentThread().interrupt();
+ return false;
+ }
+ }
Review Comment:
New delete-verification behavior in `waitForResourceDefinitionDeleted` /
`isResourceDefinitionGone` isn’t covered by the existing `LinstorUtilTest`
suite. Please add unit tests for (a) confirmed-gone case, and (b)
timeout/exception paths (ideally without introducing real sleeps) so
regressions in the polling logic are caught in CI.
##########
plugins/storage/volume/linstor/src/main/java/org/apache/cloudstack/storage/datastore/util/LinstorUtil.java:
##########
@@ -401,6 +401,57 @@ public static List<ResourceDefinition>
getRDListStartingWith(DevelopersApi api,
.collect(Collectors.toList());
}
+ /**
+ * Default per-call timeout for {@link #waitForResourceDefinitionDeleted}.
Long enough for a
+ * healthy LINSTOR controller to finish a normal delete; short enough not
to block the calling
+ * agent thread for too long if the delete is genuinely stuck.
Review Comment:
`DEFAULT_RD_DELETE_VERIFY_TIMEOUT_MILLIS` Javadoc mentions "the calling
agent thread", but this constant is also used on the management-server side
(e.g., `LinstorPrimaryDataStoreDriverImpl`). Consider rewording to "calling
thread" (or mention both contexts) to keep the comment accurate.
```suggestion
* thread for too long if the delete is genuinely stuck.
```
##########
plugins/storage/volume/linstor/src/main/java/org/apache/cloudstack/storage/datastore/util/LinstorUtil.java:
##########
@@ -401,6 +401,57 @@ public static List<ResourceDefinition>
getRDListStartingWith(DevelopersApi api,
.collect(Collectors.toList());
}
+ /**
+ * Default per-call timeout for {@link #waitForResourceDefinitionDeleted}.
Long enough for a
+ * healthy LINSTOR controller to finish a normal delete; short enough not
to block the calling
+ * agent thread for too long if the delete is genuinely stuck.
+ */
+ public static final long DEFAULT_RD_DELETE_VERIFY_TIMEOUT_MILLIS = 30_000L;
+
+ /**
+ * Returns {@code true} if the named resource definition is no longer
present on the LINSTOR
+ * controller. Used after a {@code resourceDefinitionDelete} to verify the
delete actually
+ * completed (LINSTOR can return success on the API call while the
resource lingers in
+ * DELETING state due to peer issues, lost quorum, or down satellites).
+ */
+ public static boolean isResourceDefinitionGone(DevelopersApi api, String
rscName) throws ApiException {
+ List<ResourceDefinition> all = api.resourceDefinitionList(null, false,
null, null, null);
+ if (all == null) {
+ return true;
+ }
+ return all.stream().noneMatch(rd ->
rscName.equalsIgnoreCase(rd.getName()));
Review Comment:
`isResourceDefinitionGone` currently calls `resourceDefinitionList(null,
...)`, which fetches *all* resource definitions and then streams to find a
match. This can be expensive (and will be repeated every second in
`waitForResourceDefinitionDeleted`). Use the API's name filter instead (e.g.,
`resourceDefinitionList(Collections.singletonList(rscName), ...)`) and treat an
empty result as "gone" to reduce controller load and payload size.
```suggestion
List<ResourceDefinition> matches =
api.resourceDefinitionList(Collections.singletonList(rscName), false, null,
null, null);
if (matches == null || matches.isEmpty()) {
return true;
}
return matches.stream().noneMatch(rd ->
rscName.equalsIgnoreCase(rd.getName()));
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]