[jira] [Commented] (CLOUDSTACK-9864) cleanup stale worker VMs after job expiry time
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15968860#comment-15968860 ] ASF GitHub Bot commented on CLOUDSTACK-9864: Github user DaanHoogland closed the pull request at: https://github.com/apache/cloudstack/pull/2030

> cleanup stale worker VMs after job expiry time
> ----------------------------------------------
>
> Key: CLOUDSTACK-9864
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9864
> Project: CloudStack
> Issue Type: Improvement
> Security Level: Public (Anyone can view this level - this is the default.)
> Components: VMware
> Reporter: Daan Hoogland
> Assignee: Daan Hoogland
> Labels: vmware, vsphere, workers
>
> In the present code, cleaning worker VMs after a timeout is disabled, with the documented reason that there is no API to query for related tasks in vCenter.
> ACS has an expiry time for jobs and a cancel time for jobs.
> - Jobs that take longer than the expiry time will have their results neglected.
> - Jobs that are cancelled are forcibly removed after the cancellation expiry time.
> Any worker remaining after expiry+cancellation will surely be stale and can be removed.
> As some administrators may not want this behaviour, there will be a setting, false by default, that guards the cleaning of stale worker VMs.
> Stale worker VMs will be cleaned after 2 * (expiry-time + cancellation-time) as a safe margin.
> related settings:
> job.expire.minutes: 1440
> job.cancel.threshold.minutes: 60
> vmware.clean.old.worker.vms: false (new)

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
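The safe-margin computation described in the issue can be sketched as follows. This is a minimal illustration only; the class, method, and constant names are made up here and are not the actual CloudStack code:

```java
// Sketch of the stale-worker timeout described in the issue: workers older than
// 2 * (job.expire.minutes + job.cancel.threshold.minutes) are considered stale.
// All names here are illustrative, not CloudStack's.
public class StaleWorkerTimeout {

    private static final long MILLISECONDS_PER_MINUTE = 60L * 1000L;

    // Returns the cleanup threshold in milliseconds.
    public static long hungWorkerTimeoutMillis(long jobExpireMinutes, long jobCancelThresholdMinutes) {
        return 2L * (jobExpireMinutes + jobCancelThresholdMinutes) * MILLISECONDS_PER_MINUTE;
    }

    public static void main(String[] args) {
        // With the defaults from the issue (1440 and 60 minutes), the margin
        // is 2 * 1500 minutes = 50 hours = 180,000,000 ms.
        System.out.println(hungWorkerTimeoutMillis(1440, 60));
    }
}
```

With the default settings quoted above, a worker VM would only be garbage-collected once it is more than 50 hours old, which comfortably exceeds any job still considered live.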
[jira] [Commented] (CLOUDSTACK-9864) cleanup stale worker VMs after job expiry time
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15968861#comment-15968861 ] ASF GitHub Bot commented on CLOUDSTACK-9864: Github user DaanHoogland commented on the issue: https://github.com/apache/cloudstack/pull/2030

show me the money (tm)
[jira] [Commented] (CLOUDSTACK-9864) cleanup stale worker VMs after job expiry time
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15968859#comment-15968859 ] ASF GitHub Bot commented on CLOUDSTACK-9864: Github user DaanHoogland commented on the issue: https://github.com/apache/cloudstack/pull/2030

renaming and close-opening for retests. "it works on my laptop (tm)"
[jira] [Commented] (CLOUDSTACK-9864) cleanup stale worker VMs after job expiry time
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15968862#comment-15968862 ] ASF GitHub Bot commented on CLOUDSTACK-9864: GitHub user DaanHoogland reopened a pull request: https://github.com/apache/cloudstack/pull/2030

WIP: CLOUDSTACK-9864 cleanup stale worker VMs after job expiry time

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/shapeblue/cloudstack snapshot-housekeeping

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/cloudstack/pull/2030.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2030

commit 40869570fc510fac0d2357f272e96cd4a4518176
Author: Daan Hoogland
Date: 2017-03-30T14:35:37Z
    CE-113 trace logging and rethrow instead of nesting CloudRuntimeException

commit 66d7d846352d52cc539b1dafb5e4d0f1620829a5
Author: Daan Hoogland
Date: 2017-04-05T12:19:14Z
    CE-113 configure workervm gc based on job expiry

commit 996f5834e6a0a9e4dc57d436ceeb5b89e6dc9974
Author: Daan Hoogland
Date: 2017-04-05T15:35:41Z
    CE-113 extra trace log of worker VMs

commit 9a8ea7c0d1c9775ad7e4200db2b3eca93e121909
Author: Daan Hoogland
Date: 2017-04-06T09:33:53Z
    CE-113 removed TODOs

commit e2c0f09609b48f4539f13edcc742ca7e06f0cca2
Author: Daan Hoogland
Date: 2017-04-07T12:54:19Z
    CE-113 use of duration (instead of the old clock-tick-based code)
[jira] [Commented] (CLOUDSTACK-9864) cleanup stale worker VMs after job expiry time
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15962768#comment-15962768 ] ASF GitHub Bot commented on CLOUDSTACK-9864: Github user blueorangutan commented on the issue: https://github.com/apache/cloudstack/pull/2030

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-632
[jira] [Commented] (CLOUDSTACK-9864) cleanup stale worker VMs after job expiry time
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15962746#comment-15962746 ] ASF GitHub Bot commented on CLOUDSTACK-9864: Github user borisstoyanov commented on the issue: https://github.com/apache/cloudstack/pull/2030

@blueorangutan package
[jira] [Commented] (CLOUDSTACK-9864) cleanup stale worker VMs after job expiry time
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15962747#comment-15962747 ] ASF GitHub Bot commented on CLOUDSTACK-9864: Github user blueorangutan commented on the issue: https://github.com/apache/cloudstack/pull/2030

@borisstoyanov a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.
[jira] [Commented] (CLOUDSTACK-9864) cleanup stale worker VMs after job expiry time
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15961650#comment-15961650 ] ASF GitHub Bot commented on CLOUDSTACK-9864: Github user blueorangutan commented on the issue: https://github.com/apache/cloudstack/pull/2030

Trillian test result (tid-983)
Environment: vmware-55u3 (x2), Advanced Networking with Mgmt server 7
Total time taken: 51519 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr2030-t983-vmware-55u3.zip
Intermittent failure detected: /marvin/tests/smoke/test_deploy_vgpu_enabled_vm.py
Intermittent failure detected: /marvin/tests/smoke/test_internal_lb.py
Intermittent failure detected: /marvin/tests/smoke/test_privategw_acl.py
Intermittent failure detected: /marvin/tests/smoke/test_routers_network_ops.py
Intermittent failure detected: /marvin/tests/smoke/test_routers.py
Intermittent failure detected: /marvin/tests/smoke/test_vm_snapshots.py
Intermittent failure detected: /marvin/tests/smoke/test_vpc_redundant.py
Test completed.
48 look ok, 2 have error(s)

Test | Result | Time (s) | Test File
--- | --- | --- | ---
test_01_test_vm_volume_snapshot | `Failure` | 322.23 | test_vm_snapshots.py
test_04_rvpc_privategw_static_routes | `Failure` | 888.45 | test_privategw_acl.py
test_01_vpc_site2site_vpn | Success | 371.71 | test_vpc_vpn.py
test_01_vpc_remote_access_vpn | Success | 166.99 | test_vpc_vpn.py
test_01_redundant_vpc_site2site_vpn | Success | 593.09 | test_vpc_vpn.py
test_02_VPC_default_routes | Success | 354.29 | test_vpc_router_nics.py
test_01_VPC_nics_after_destroy | Success | 742.66 | test_vpc_router_nics.py
test_05_rvpc_multi_tiers | Success | 675.76 | test_vpc_redundant.py
test_04_rvpc_network_garbage_collector_nics | Success | 1534.04 | test_vpc_redundant.py
test_03_create_redundant_VPC_1tier_2VMs_2IPs_2PF_ACL_reboot_routers | Success | 751.75 | test_vpc_redundant.py
test_02_redundant_VPC_default_routes | Success | 704.82 | test_vpc_redundant.py
test_01_create_redundant_VPC_2tiers_4VMs_4IPs_4PF_ACL | Success | 1374.94 | test_vpc_redundant.py
test_09_delete_detached_volume | Success | 30.73 | test_volumes.py
test_06_download_detached_volume | Success | 60.53 | test_volumes.py
test_05_detach_volume | Success | 105.26 | test_volumes.py
test_04_delete_attached_volume | Success | 10.18 | test_volumes.py
test_03_download_attached_volume | Success | 20.32 | test_volumes.py
test_02_attach_volume | Success | 58.72 | test_volumes.py
test_01_create_volume | Success | 519.39 | test_volumes.py
test_change_service_offering_for_vm_with_snapshots | Success | 548.99 | test_vm_snapshots.py
test_03_delete_vm_snapshots | Success | 275.23 | test_vm_snapshots.py
test_02_revert_vm_snapshots | Success | 232.04 | test_vm_snapshots.py
test_01_create_vm_snapshots | Success | 161.65 | test_vm_snapshots.py
test_deploy_vm_multiple | Success | 242.48 | test_vm_life_cycle.py
test_deploy_vm | Success | 0.03 | test_vm_life_cycle.py
test_advZoneVirtualRouter | Success | 0.02 | test_vm_life_cycle.py
test_10_attachAndDetach_iso | Success | 26.83 | test_vm_life_cycle.py
test_09_expunge_vm | Success | 125.25 | test_vm_life_cycle.py
test_08_migrate_vm | Success | 60.94 | test_vm_life_cycle.py
test_07_restore_vm | Success | 0.10 | test_vm_life_cycle.py
test_06_destroy_vm | Success | 10.14 | test_vm_life_cycle.py
test_03_reboot_vm | Success | 5.13 | test_vm_life_cycle.py
test_02_start_vm | Success | 20.25 | test_vm_life_cycle.py
test_01_stop_vm | Success | 10.14 | test_vm_life_cycle.py
test_CreateTemplateWithDuplicateName | Success | 206.29 | test_templates.py
test_08_list_system_templates | Success | 0.03 | test_templates.py
test_07_list_public_templates | Success | 0.04 | test_templates.py
test_05_template_permissions | Success | 0.06 | test_templates.py
test_04_extract_template | Success | 10.20 | test_templates.py
test_03_delete_template | Success | 5.09 | test_templates.py
test_02_edit_template | Success | 90.13 | test_templates.py
test_01_create_template | Success | 121.06 | test_templates.py
test_10_destroy_cpvm | Success | 266.87 | test_ssvm.py
test_09_destroy_ssvm | Success | 268.71 | test_ssvm.py
test_08_reboot_cpvm | Success | 156.52 | test_ssvm.py
test_07_reboot_ssvm | Success | 188.45 | test_ssvm.py
test_06_stop_cpvm | Success | 176.88 | test_ssvm.py
test_05_stop_ssvm | Success | 203.78 | test_ssvm.py
test_04_cpvm_internals | Success | 1.20 | test_ssvm.py
test_03_ssvm_internals | Success | 3.39 | test_ssvm.py
test_02_list_cpvm_vm | Success | 0.12 | test_ssvm.py
test_01_list_sec_storage_vm | Success | 0.12 | test_ssvm.py
[jira] [Commented] (CLOUDSTACK-9864) cleanup stale worker VMs after job expiry time
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15960681#comment-15960681 ] ASF GitHub Bot commented on CLOUDSTACK-9864: Github user blueorangutan commented on the issue: https://github.com/apache/cloudstack/pull/2030

@borisstoyanov a Trillian-Jenkins test job (centos7 mgmt + vmware-55u3) has been kicked to run smoke tests
[jira] [Commented] (CLOUDSTACK-9864) cleanup stale worker VMs after job expiry time
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15960678#comment-15960678 ] ASF GitHub Bot commented on CLOUDSTACK-9864: Github user borisstoyanov commented on the issue: https://github.com/apache/cloudstack/pull/2030

@blueorangutan test centos7 vmware-55u3
[jira] [Commented] (CLOUDSTACK-9864) cleanup stale worker VMs after job expiry time
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15960655#comment-15960655 ] ASF GitHub Bot commented on CLOUDSTACK-9864: Github user blueorangutan commented on the issue: https://github.com/apache/cloudstack/pull/2030

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-624
[jira] [Commented] (CLOUDSTACK-9864) cleanup stale worker VMs after job expiry time
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15960628#comment-15960628 ] ASF GitHub Bot commented on CLOUDSTACK-9864: Github user borisstoyanov commented on the issue: https://github.com/apache/cloudstack/pull/2030

@blueorangutan package
[jira] [Commented] (CLOUDSTACK-9864) cleanup stale worker VMs after job expiry time
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15958811#comment-15958811 ] ASF GitHub Bot commented on CLOUDSTACK-9864: Github user abhinandanprateek commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/2030#discussion_r110145761

--- Diff: plugins/hypervisors/vmware/src/com/cloud/hypervisor/vmware/manager/VmwareManagerImpl.java ---
@@ -128,6 +129,7 @@ public class VmwareManagerImpl extends ManagerBase implements VmwareManager, VmwareStorageMount, Listener, VmwareDatacenterService, Configurable {
     private static final Logger s_logger = Logger.getLogger(VmwareManagerImpl.class);
+    private static final long MILISECONDS_PER_MINUTE = 6;
--- End diff --

MILI typo: MILLISECONDS.
[jira] [Commented] (CLOUDSTACK-9864) cleanup stale worker VMs after job expiry time
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15958812#comment-15958812 ] ASF GitHub Bot commented on CLOUDSTACK-9864: Github user abhinandanprateek commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/2030#discussion_r110145877

--- Diff: plugins/hypervisors/vmware/src/com/cloud/hypervisor/vmware/manager/VmwareManagerImpl.java ---
@@ -550,15 +552,21 @@ public boolean needRecycle(String workerTag) {
             return true;
         }
-        // disable time-out check until we have found out a VMware API that can check if
-        // there are pending tasks on the subject VM
-        /*
-        if(System.currentTimeMillis() - startTick > _hungWorkerTimeout) {
-            if(s_logger.isInfoEnabled())
-                s_logger.info("Worker VM expired, seconds elapsed: " + (System.currentTimeMillis() - startTick) / 1000);
-            return true;
-        }
-         */
+        // this time-out check was disabled
+        // "until we have found out a VMware API that can check if there are pending tasks on the subject VM"
+        // but as we expire jobs and those stale worker VMs stay around until an MS reboot we opt in to have them removed anyway
+        Long hungWorkerTimeout = 2 * (AsyncJobManagerImpl.JobExpireMinutes.value() + AsyncJobManagerImpl.JobCancelThresholdMinutes.value()) * MILISECONDS_PER_MINUTE;
+        Long letsSayNow = System.currentTimeMillis();
+        if(s_vmwareCleanOldWorderVMs.value() && letsSayNow - startTick > hungWorkerTimeout) {
+            if(s_logger.isInfoEnabled()) {
+                s_logger.info("Worker VM expired, seconds elapsed: " + (System.currentTimeMillis() - startTick) / 1000);
+            }
--- End diff --

For timeouts you may want to use java Duration, that is much cleaner.
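The reviewer's Duration suggestion could look roughly like the sketch below. This is a hedged illustration under assumed names (the class, methods, and parameters are invented here, not the PR's actual code):

```java
import java.time.Duration;
import java.time.Instant;

// Illustrative rework of the worker-expiry check using java.time.Duration,
// as suggested in the review. All names are hypothetical.
public class WorkerExpiryCheck {

    // 2 * (job expiry + cancel threshold), expressed as a Duration instead of
    // hand-rolled millisecond arithmetic.
    public static Duration hungWorkerTimeout(long jobExpireMinutes, long jobCancelThresholdMinutes) {
        return Duration.ofMinutes(2L * (jobExpireMinutes + jobCancelThresholdMinutes));
    }

    // True when the worker VM started at 'startTime' has outlived the timeout.
    public static boolean isExpired(Instant startTime, Instant now, Duration timeout) {
        return Duration.between(startTime, now).compareTo(timeout) > 0;
    }

    public static void main(String[] args) {
        Duration timeout = hungWorkerTimeout(1440, 60); // 50 hours with the defaults
        Instant start = Instant.now().minus(Duration.ofHours(51));
        // A worker started 51 hours ago is past the 50-hour margin.
        System.out.println(isExpired(start, Instant.now(), timeout));
    }
}
```

Duration sidesteps the unit-conversion bug flagged in the other review comment (a MILISECONDS_PER_MINUTE constant set to 6 instead of 60000): the unit is carried by the type rather than by a hand-maintained multiplier.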
[jira] [Commented] (CLOUDSTACK-9864) cleanup stale worker VMs after job expiry time
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15958542#comment-15958542 ] ASF GitHub Bot commented on CLOUDSTACK-9864: GitHub user DaanHoogland opened a pull request: https://github.com/apache/cloudstack/pull/2030

WIP: CLOUDSTACK-9864 cleanup stale worker VMs after job expiry time

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/shapeblue/cloudstack snapshot-housekeeping

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/cloudstack/pull/2030.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2030

commit 40869570fc510fac0d2357f272e96cd4a4518176
Author: Daan Hoogland
Date: 2017-03-30T14:35:37Z
    CE-113 trace logging and rethrow instead of nesting CloudRuntimeException

commit 66d7d846352d52cc539b1dafb5e4d0f1620829a5
Author: Daan Hoogland
Date: 2017-04-05T12:19:14Z
    CE-113 configure workervm gc based on job expiry

commit 996f5834e6a0a9e4dc57d436ceeb5b89e6dc9974
Author: Daan Hoogland
Date: 2017-04-05T15:35:41Z
    CE-113 extra trace log of worker VMs