kiranchavala commented on code in PR #8394:
URL: https://github.com/apache/cloudstack/pull/8394#discussion_r1436812958
##########
framework/jobs/src/main/java/org/apache/cloudstack/framework/jobs/impl/AsyncJobManagerImpl.java:
##########
@@ -1128,6 +1139,65 @@ public void doInTransactionWithoutResult(TransactionStatus status) {
}
}
+ /*
+ Cleanup Resources in transition state and move them to appropriate state
+ This will allow other operation on the resource, instead of being stuck in transition state
+ */
+ protected boolean cleanupResources(AsyncJobVO job) {
+ try {
+ ApiCommandResourceType resourceType = ApiCommandResourceType.fromString(job.getInstanceType());
+ if (resourceType == null) {
+ s_logger.warn("Unknown ResourceType. Skip Cleanup: " + job.getInstanceType());
+ return true;
+ }
+ switch (resourceType) {
+ case Volume:
+ VolumeInfo vol = volFactory.getVolume(job.getInstanceId());
+ if (vol == null) {
+ s_logger.warn("Volume not found. Skip Cleanup. VolumeId: " + job.getInstanceId());
Review Comment:
@JoaoJandre @sureshanaparti
According to the design document
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=39620237
when a volume is in the "UploadInProgress" state and the upload is interrupted by
stopping the SSVM, the volume state changes to "NotUploaded" > "UploadError".
After investigating the issue, the admin user can manually delete the volumes
in the "UploadError" state.
**Cleanup**
A cleanup thread will be running at regular intervals (configurable, provide
details). It will pick up all volumes/templates with upload state
"UPLOAD_ERROR" or "ABANDONED" and send an agent command to clean up any partial
data from the secondary store. The cleanup will be a best-effort approach.
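
To make the quoted cleanup behaviour concrete, here is a minimal Java sketch of such a periodic, best-effort sweep. Every type and method name in it (`UploadCleanupSketch`, `UploadEntry`, `sweep`, `start`, the `agentCleanup` callback) is hypothetical and chosen only for illustration; it does not reflect the actual CloudStack classes or the code in this PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical sketch of the periodic cleanup described in the design document.
// None of these types exist in CloudStack under these names.
class UploadCleanupSketch {
    enum UploadState { UPLOAD_IN_PROGRESS, UPLOADED, UPLOAD_ERROR, ABANDONED }

    static final class UploadEntry {
        final long id;
        final UploadState state;
        UploadEntry(long id, UploadState state) { this.id = id; this.state = state; }
    }

    // Stand-in (assumption) for "send agent command to clean up partial data".
    private final Consumer<UploadEntry> agentCleanup;

    UploadCleanupSketch(Consumer<UploadEntry> agentCleanup) {
        this.agentCleanup = agentCleanup;
    }

    // One pass: select entries in UPLOAD_ERROR or ABANDONED and try to clean each.
    // Best effort: a failure on one entry is swallowed so the remaining entries
    // are still processed; the failed entry is simply retried on the next pass.
    List<Long> sweep(List<UploadEntry> entries) {
        List<Long> cleaned = new ArrayList<>();
        for (UploadEntry e : entries) {
            if (e.state != UploadState.UPLOAD_ERROR && e.state != UploadState.ABANDONED) {
                continue;
            }
            try {
                agentCleanup.accept(e);
                cleaned.add(e.id);
            } catch (RuntimeException ex) {
                // Ignored on purpose: cleanup is best effort.
            }
        }
        return cleaned;
    }

    // Runs sweep() repeatedly with a configurable delay between passes.
    ScheduledExecutorService start(Supplier<List<UploadEntry>> source, long intervalSeconds) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        ses.scheduleWithFixedDelay(() -> sweep(source.get()),
                intervalSeconds, intervalSeconds, TimeUnit.SECONDS);
        return ses;
    }
}
```

The point of the sketch is that entries left in an error state are only picked up by the sweep; nothing in it moves a resource out of a transition state, which is the distinction this comment is raising against the job-level cleanup in the PR.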
**Recovery mechanisms**
There isn't any recovery or retry mechanism as this is a POST request. Once an
error happens, the user is notified with a clear error message as part of the
response. The template/volume will remain in the error state and the admin will
be able to troubleshoot it based on the appropriate log messages in the
management server log, agent log, and Apache access/error log files. These
failed entries will eventually be cleaned up by the cleanup process. The user
has to reinitiate the upload by calling the getUploadParams API again.
Global settings are:
- Upload monitoring interval
- Upload operation timeout