tschacara opened a new issue, #7593:
URL: https://github.com/apache/cloudstack/issues/7593

   ISSUE TYPE
   
       Bug Report
   
   COMPONENT NAME
   
       Management, Storage, KVM
   
   CLOUDSTACK VERSION
   
       4.17.2.0
   
   CONFIGURATION
   
       1 X Management Server
       2 X Primary Storage“s 
       1 X Secondary Storage
       3 X KVM Server.
       Advanced Network - Default Security Group.
   
   OS / ENVIRONMENT
   
   Both Management, KVM and Storage“s run Rocklinux 8.7
   
   SUMMARY
   
       Recurring volume snapshots fails and times out after 3600 seconds.
   
   STEPS TO REPRODUCE
   
       Scheduler DATA Volume Snapshot VM (1TB) while VM is online.
   
   
   LOG HERE
   ~~~
   2023-06-05 15:56:57,212 DEBUG [c.c.s.s.SnapshotSchedulerImpl] 
(SnapshotPollTask:ctx-f75b0142) (logid:1000c112) Got 0 snapshots to be executed 
at 2023-06-05 18:56:57 GMT
   2023-06-05 15:56:59,143 DEBUG 
[o.a.c.s.d.d.CloudStackPrimaryDataStoreDriverImpl] 
(Work-Job-Executor-1:ctx-b15acf5a job-15633/job-15636 ctx-a16b6ca1) 
(logid:8443133e) Failed to take snapshot: 375
           at 
org.apache.cloudstack.storage.snapshot.SnapshotServiceImpl.takeSnapshot(SnapshotServiceImpl.java:208)
           at 
org.apache.cloudstack.storage.snapshot.DefaultSnapshotStrategy.takeSnapshot(DefaultSnapshotStrategy.java:439)
           at 
com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:1240)
   2023-06-05 15:56:59,147 DEBUG [o.a.c.s.s.SnapshotServiceImpl] 
(Work-Job-Executor-1:ctx-b15acf5a job-15633/job-15636 ctx-a16b6ca1) 
(logid:8443133e) create snapshot FileServer_DATA-414_20230605175657 failed: 
com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due to 
Agent:15, com.cloud.exception.OperationTimedoutException: Commands 
4851502698584866841 to Host 15 timed out after 3600
   2023-06-05 15:56:59,173 DEBUG [o.a.c.s.s.DefaultSnapshotStrategy] 
(Work-Job-Executor-1:ctx-b15acf5a job-15633/job-15636 ctx-a16b6ca1) 
(logid:8443133e) Failed to take snapshot: 
com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due to 
Agent:15, com.cloud.exception.OperationTimedoutException: Commands 
4851502698584866841 to Host 15 timed out after 3600
   2023-06-05 15:56:59,188 DEBUG [c.c.s.s.SnapshotManagerImpl] 
(Work-Job-Executor-1:ctx-b15acf5a job-15633/job-15636 ctx-a16b6ca1) 
(logid:8443133e) Failed to create 
snapshotcom.cloud.utils.exception.CloudRuntimeException: Failed to send 
command, due to Agent:15, com.cloud.exception.OperationTimedoutException: 
Commands 4851502698584866841 to Host 15 timed out after 3600
   2023-06-05 15:56:59,200 DEBUG [c.c.r.ResourceLimitManagerImpl] 
(Work-Job-Executor-1:ctx-b15acf5a job-15633/job-15636 ctx-a16b6ca1) 
(logid:8443133e) Updating resource Type = snapshot count for Account = 219 
Operation = decreasing Amount = 1
   2023-06-05 15:56:59,232 ERROR [o.a.c.s.v.VolumeServiceImpl] 
(Work-Job-Executor-1:ctx-b15acf5a job-15633/job-15636 ctx-a16b6ca1) 
(logid:8443133e) Take snapshot: 1191 failed
           at 
org.apache.cloudstack.storage.snapshot.DefaultSnapshotStrategy.takeSnapshot(DefaultSnapshotStrategy.java:442)
           at 
com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:1240)
           at 
org.apache.cloudstack.storage.snapshot.DefaultSnapshotStrategy.takeSnapshot(DefaultSnapshotStrategy.java:442)
           at 
com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:1240)
   2023-06-05 15:56:59,303 ERROR [o.a.c.a.c.u.s.CreateSnapshotCmd] 
(API-Job-Executor-1:ctx-225af267 job-15633 ctx-cb317a27) (logid:8443133e) 
Failed to create snapshot due to an internal error creating snapshot for volume 
d2f310f4-af8e-41fa-a94d-a1863d566503
           at 
org.apache.cloudstack.api.command.user.snapshot.CreateSnapshotCmd.execute(CreateSnapshotCmd.java:219)
           at 
org.apache.cloudstack.storage.snapshot.DefaultSnapshotStrategy.takeSnapshot(DefaultSnapshotStrategy.java:442)
           at 
com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:1240)
   2023-06-05 15:56:59,322 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-1:ctx-225af267 job-15633) (logid:8443133e) Complete async 
job-15633, jobStatus: FAILED, resultCode: 530, result: 
org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":"530","errortext":"Failed
 to create snapshot due to an internal error creating snapshot for volume 
d2f310f4-af8e-41fa-a94d-a1863d566503"}
   2023-06-05 15:56:59,328 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-1:ctx-225af267 job-15633) (logid:8443133e) Done executing 
org.apache.cloudstack.api.command.user.snapshot.CreateSnapshotCmd for job-15633
   ~~~
   EXPECTED RESULTS
   
       Volume snapshot is created and transported to secondary storage.
   
   ACTUAL RESULTS
   ~~~
   Volume DATA snapshot process can successfully create VM Snapshot. 
   Checking the logs, the operation is being timeout after 3600 seconds.
   I noticed that even getting the error in the ACS log and in the UI. 
   The process is interrupted by the ACS in 3600 seconds and does not copy the 
data to the secondary storage. 
   Even receiving the error, the snapshot of the volume runs in the background, 
but is not transferred, giving the impression that the hypervisor continued to 
execute what was requested in that step, when it received the timeout it does 
not execute the next steps to complete the request .
   
   ~~~
   
   SNAPSHOTS
   
   ~~~
   virsh snapshot-list i-219-414-VM
    Name                                   Creation Time               State
   
-----------------------------------------------------------------------------------
    19290884-7bd7-4b29-863a-e2400f4db35f   2023-06-03 14:23:01 -0300   
disk-snapshot
    3de38bcb-8e2c-4616-9ade-66524eae41b7   2023-05-27 00:08:01 -0300   
disk-snapshot
    40a49d8b-294b-4139-97d1-3fb3d5318b5e   2023-05-20 00:08:01 -0300   
disk-snapshot
    4561cfde-8e21-495c-878a-e3511b3a1b0b   2023-05-27 00:03:01 -0300   
disk-snapshot
    60f9e8ea-a15c-4c2d-a5f9-320670c44c48   2023-06-05 14:56:59 -0300   
disk-snapshot
    6cb5b473-9b47-42bb-90ce-99871ed42517   2023-06-03 17:44:44 -0300   
disk-snapshot
    6d9f0c1a-15ff-4e9c-b739-95406ae86649   2023-05-15 08:15:40 -0300   
disk-snapshot
    6e4efb0b-f17e-4488-b1a4-3b2cc8053c44   2023-05-06 00:00:38 -0300   
disk-snapshot
    94d85369-9914-4ed2-9263-04823f992c38   2023-06-05 12:03:54 -0300   
disk-snapshot
    9e3466e2-5479-4abd-8003-f37336234799   2023-06-03 12:18:01 -0300   
disk-snapshot
    a722a007-0a8f-447c-96df-14e38360dfc5   2023-05-08 10:20:38 -0300   
disk-snapshot
    af3236fc-a865-4916-bfcf-3433de227a70   2023-06-03 00:03:01 -0300   
disk-snapshot
    b54669a5-d7bc-46c0-9fda-f7cf822ad512   2023-05-15 08:20:40 -0300   
disk-snapshot
    cb237c76-0adb-4288-8a0d-699ff4e63d92   2023-06-03 00:12:11 -0300   
disk-snapshot
    d61caeaa-5a0c-4b5e-852b-6069a5136bf3   2023-05-20 00:03:01 -0300   
disk-snapshot
   
   /mnt/a6c33213-6d0f-365a-845c-021a5b5c2aed/snapshots
   drwxr-xr-x. 2 root root 4.0K Jun  5 14:56 .
   drwxr-xr-x. 4 root root 4.0K Jun  5 14:56 ..
   -rw-r--r--. 1 root root 217G Jun  3 16:26 
19290884-7bd7-4b29-863a-e2400f4db35f
   -rw-r--r--. 1 root root 243G Jun  5 17:08 
60f9e8ea-a15c-4c2d-a5f9-320670c44c48
   -rw-r--r--. 1 root root 217G Jun  3 19:46 
6cb5b473-9b47-42bb-90ce-99871ed42517
   -rw-r--r--. 1 root root 278G Jun  5 14:37 
94d85369-9914-4ed2-9263-04823f992c38
   -rw-r--r--. 1 root root 217G Jun  3 14:21 
9e3466e2-5479-4abd-8003-f37336234799
   -rw-r--r--. 1 root root 217G Jun  3 02:14 
cb237c76-0adb-4288-8a0d-699ff4e63d92
   ~~~
   
   GLOBAL SETTINGS
   ~~~
   job.cancel.threshold.minutes         Time (in minutes) for async-jobs to be 
forcely cancelled if it has been in process for long     Advanced        
default(60)     set(1440)
   copy.volume.wait                     In second, timeout for copy volume 
command                                                      Storage 
default(10800)  set(86400)
   backup.snapshot.wait                 In second, timeout for 
BackupSnapshotCommand                                                    
Storage default(21600)  set(86400)
   ~~~


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to