[jira] [Commented] (CLOUDSTACK-9583) VR: In CsDhcp.py preseed both hostaname and localhost to resolve to 127.0.0.1
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15666210#comment-15666210 ]

ASF GitHub Bot commented on CLOUDSTACK-9583:

Github user blueorangutan commented on the issue:

https://github.com/apache/cloudstack/pull/1757

Trillian test result (tid-337)
Environment: xenserver-65sp1 (x2), Advanced Networking with Mgmt server 6
Total time taken: 30772 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr1757-t337-xenserver-65sp1.zip
Test completed. 39 look ok, 4 have error(s)

Test | Result | Time (s) | Test File
--- | --- | --- | ---
test_05_rvpc_multi_tiers | `Failure` | 420.58 | test_vpc_redundant.py
test_04_rvpc_network_garbage_collector_nics | `Failure` | 1372.61 | test_vpc_redundant.py
test_01_create_redundant_VPC_2tiers_4VMs_4IPs_4PF_ACL | `Failure` | 479.80 | test_vpc_redundant.py
test_04_rvpc_privategw_static_routes | `Failure` | 619.78 | test_privategw_acl.py
ContextSuite context=TestRVPCSite2SiteVpn>:setup | `Error` | 0.00 | test_vpc_vpn.py
test_06_download_detached_volume | `Error` | 30.43 | test_volumes.py
test_01_vpc_site2site_vpn | Success | 326.97 | test_vpc_vpn.py
test_01_vpc_remote_access_vpn | Success | 121.74 | test_vpc_vpn.py
test_02_VPC_default_routes | Success | 271.09 | test_vpc_router_nics.py
test_01_VPC_nics_after_destroy | Success | 638.24 | test_vpc_router_nics.py
test_03_create_redundant_VPC_1tier_2VMs_2IPs_2PF_ACL_reboot_routers | Success | 777.50 | test_vpc_redundant.py
test_02_redundant_VPC_default_routes | Success | 984.30 | test_vpc_redundant.py
test_09_delete_detached_volume | Success | 15.75 | test_volumes.py
test_08_resize_volume | Success | 85.99 | test_volumes.py
test_07_resize_fail | Success | 91.06 | test_volumes.py
test_05_detach_volume | Success | 100.28 | test_volumes.py
test_04_delete_attached_volume | Success | 10.27 | test_volumes.py
test_03_download_attached_volume | Success | 15.38 | test_volumes.py
test_02_attach_volume | Success | 10.71 | test_volumes.py
test_01_create_volume | Success | 427.98 | test_volumes.py
test_03_delete_vm_snapshots | Success | 280.78 | test_vm_snapshots.py
test_02_revert_vm_snapshots | Success | 186.62 | test_vm_snapshots.py
test_01_create_vm_snapshots | Success | 133.90 | test_vm_snapshots.py
test_deploy_vm_multiple | Success | 294.33 | test_vm_life_cycle.py
test_deploy_vm | Success | 0.03 | test_vm_life_cycle.py
test_advZoneVirtualRouter | Success | 0.02 | test_vm_life_cycle.py
test_10_attachAndDetach_iso | Success | 31.87 | test_vm_life_cycle.py
test_09_expunge_vm | Success | 125.17 | test_vm_life_cycle.py
test_08_migrate_vm | Success | 66.24 | test_vm_life_cycle.py
test_07_restore_vm | Success | 0.14 | test_vm_life_cycle.py
test_06_destroy_vm | Success | 10.18 | test_vm_life_cycle.py
test_03_reboot_vm | Success | 10.20 | test_vm_life_cycle.py
test_02_start_vm | Success | 15.22 | test_vm_life_cycle.py
test_01_stop_vm | Success | 30.30 | test_vm_life_cycle.py
test_CreateTemplateWithDuplicateName | Success | 126.15 | test_templates.py
test_08_list_system_templates | Success | 0.03 | test_templates.py
test_07_list_public_templates | Success | 0.04 | test_templates.py
test_05_template_permissions | Success | 0.06 | test_templates.py
test_04_extract_template | Success | 5.17 | test_templates.py
test_03_delete_template | Success | 5.12 | test_templates.py
test_02_edit_template | Success | 90.13 | test_templates.py
test_01_create_template | Success | 80.84 | test_templates.py
test_10_destroy_cpvm | Success | 221.74 | test_ssvm.py
test_09_destroy_ssvm | Success | 204.12 | test_ssvm.py
test_08_reboot_cpvm | Success | 141.60 | test_ssvm.py
test_07_reboot_ssvm | Success | 153.97 | test_ssvm.py
test_06_stop_cpvm | Success | 136.74 | test_ssvm.py
test_05_stop_ssvm | Success | 174.01 | test_ssvm.py
test_04_cpvm_internals | Success | 1.10 | test_ssvm.py
test_03_ssvm_internals | Success | 3.54 | test_ssvm.py
test_02_list_cpvm_vm | Success | 0.12 | test_ssvm.py
test_01_list_sec_storage_vm | Success | 0.13 | test_ssvm.py
test_01_snapshot_root_disk | Success | 16.69 | test_snapshots.py
test_04_change_offering_small | Success | 58.90 | test_service_offerings.py
test_03_delete_service_offering | Success | 0.05 | test_service_offerings.py
test_02_edit_service_offering | Success | 0.12 | test_service_offerings.py
test_01_create_service_offering | Success | 0.10 | test_service_offerings.py
test_02_sys_template_ready | Success | 0.13 | test_secondary_storage.py
test_01_sys_vm_start | Success | 0.20 | test_secondary_storage.py
test_01_scale_vm | Success | 5.24 | test_scale_vm.py
test_09
[jira] [Commented] (CLOUDSTACK-9583) VR: In CsDhcp.py preseed both hostaname and localhost to resolve to 127.0.0.1
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15666156#comment-15666156 ]

ASF GitHub Bot commented on CLOUDSTACK-9583:

Github user blueorangutan commented on the issue:

https://github.com/apache/cloudstack/pull/1757

Trillian test result (tid-339)
Environment: vmware-55u3 (x2), Advanced Networking with Mgmt server 7
Total time taken: 29505 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr1757-t339-vmware-55u3.zip
Test completed. 42 look ok, 1 have error(s)

Test | Result | Time (s) | Test File
--- | --- | --- | ---
test_01_vpc_site2site_vpn | `Error` | 426.66 | test_vpc_vpn.py
test_01_redundant_vpc_site2site_vpn | `Error` | 669.50 | test_vpc_vpn.py
test_01_vpc_remote_access_vpn | Success | 142.18 | test_vpc_vpn.py
test_02_VPC_default_routes | Success | 299.75 | test_vpc_router_nics.py
test_01_VPC_nics_after_destroy | Success | 661.29 | test_vpc_router_nics.py
test_05_rvpc_multi_tiers | Success | 494.72 | test_vpc_redundant.py
test_04_rvpc_network_garbage_collector_nics | Success | 1513.28 | test_vpc_redundant.py
test_03_create_redundant_VPC_1tier_2VMs_2IPs_2PF_ACL_reboot_routers | Success | 571.88 | test_vpc_redundant.py
test_02_redundant_VPC_default_routes | Success | 511.48 | test_vpc_redundant.py
test_01_create_redundant_VPC_2tiers_4VMs_4IPs_4PF_ACL | Success | 1135.81 | test_vpc_redundant.py
test_09_delete_detached_volume | Success | 31.73 | test_volumes.py
test_06_download_detached_volume | Success | 70.64 | test_volumes.py
test_05_detach_volume | Success | 110.39 | test_volumes.py
test_04_delete_attached_volume | Success | 15.24 | test_volumes.py
test_03_download_attached_volume | Success | 20.38 | test_volumes.py
test_02_attach_volume | Success | 58.72 | test_volumes.py
test_01_create_volume | Success | 455.56 | test_volumes.py
test_03_delete_vm_snapshots | Success | 275.26 | test_vm_snapshots.py
test_02_revert_vm_snapshots | Success | 194.06 | test_vm_snapshots.py
test_01_test_vm_volume_snapshot | Success | 156.69 | test_vm_snapshots.py
test_01_create_vm_snapshots | Success | 129.77 | test_vm_snapshots.py
test_deploy_vm_multiple | Success | 238.87 | test_vm_life_cycle.py
test_deploy_vm | Success | 0.03 | test_vm_life_cycle.py
test_advZoneVirtualRouter | Success | 0.02 | test_vm_life_cycle.py
test_10_attachAndDetach_iso | Success | 26.90 | test_vm_life_cycle.py
test_09_expunge_vm | Success | 125.28 | test_vm_life_cycle.py
test_08_migrate_vm | Success | 66.47 | test_vm_life_cycle.py
test_07_restore_vm | Success | 0.09 | test_vm_life_cycle.py
test_06_destroy_vm | Success | 10.20 | test_vm_life_cycle.py
test_03_reboot_vm | Success | 5.23 | test_vm_life_cycle.py
test_02_start_vm | Success | 20.38 | test_vm_life_cycle.py
test_01_stop_vm | Success | 10.30 | test_vm_life_cycle.py
test_CreateTemplateWithDuplicateName | Success | 257.55 | test_templates.py
test_08_list_system_templates | Success | 0.03 | test_templates.py
test_07_list_public_templates | Success | 0.04 | test_templates.py
test_05_template_permissions | Success | 0.07 | test_templates.py
test_04_extract_template | Success | 15.24 | test_templates.py
test_03_delete_template | Success | 5.10 | test_templates.py
test_02_edit_template | Success | 90.14 | test_templates.py
test_01_create_template | Success | 111.24 | test_templates.py
test_10_destroy_cpvm | Success | 266.90 | test_ssvm.py
test_09_destroy_ssvm | Success | 238.71 | test_ssvm.py
test_08_reboot_cpvm | Success | 126.62 | test_ssvm.py
test_07_reboot_ssvm | Success | 128.49 | test_ssvm.py
test_06_stop_cpvm | Success | 207.77 | test_ssvm.py
test_05_stop_ssvm | Success | 174.68 | test_ssvm.py
test_04_cpvm_internals | Success | 1.22 | test_ssvm.py
test_03_ssvm_internals | Success | 4.42 | test_ssvm.py
test_02_list_cpvm_vm | Success | 0.12 | test_ssvm.py
test_01_list_sec_storage_vm | Success | 0.18 | test_ssvm.py
test_01_snapshot_root_disk | Success | 26.71 | test_snapshots.py
test_04_change_offering_small | Success | 92.16 | test_service_offerings.py
test_03_delete_service_offering | Success | 0.03 | test_service_offerings.py
test_02_edit_service_offering | Success | 0.08 | test_service_offerings.py
test_01_create_service_offering | Success | 0.10 | test_service_offerings.py
test_02_sys_template_ready | Success | 0.11 | test_secondary_storage.py
test_01_sys_vm_start | Success | 0.16 | test_secondary_storage.py
test_09_reboot_router | Success | 105.96 | test_routers.py
test_08_start_router | Success | 100.77 | test_routers.py
test_07_stop_router | Success | 20.20 | test_routers.py
test_06_router_advanced | Success
[jira] [Commented] (CLOUDSTACK-9588) Add Load Balancer functionality in Network page is redundant.
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15666113#comment-15666113 ]

ASF GitHub Bot commented on CLOUDSTACK-9588:

Github user nitin-maharana commented on the issue:

https://github.com/apache/cloudstack/pull/1758

The Add Load Balancer tab was removed.
![image](https://cloud.githubusercontent.com/assets/12583725/20293745/f1a66b9a-ab1e-11e6-9707-40af38637447.png)

The same functionality is provided by the Load Balancing tab.
https://cloud.githubusercontent.com/assets/12583725/20293902/26cd3f1e-ab20-11e6-9b59-05ac6ec8194b.png

> Add Load Balancer functionality in Network page is redundant.
>
> Key: CLOUDSTACK-9588
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9588
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the default.)
> Reporter: Nitin Kumar Maharana
>
> Steps to Reproduce:
> Network -> Select any network -> Observe Add Load Balancer tab
> The "Add Load Balancer" functionality is redundant.
> The above is used to create an LB rule without any public IP.
> Resolution:
> Similar functionality exists in Network -> Any Network -> Details Tab -> View IP Addresses -> Any public IP -> Configuration Tab -> Observe Load Balancing.
> The above is used to create an LB rule with a public IP. This is a more convenient way of creating an LB rule as the IP is involved.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CLOUDSTACK-9583) VR: In CsDhcp.py preseed both hostaname and localhost to resolve to 127.0.0.1
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665988#comment-15665988 ]

ASF GitHub Bot commented on CLOUDSTACK-9583:

Github user blueorangutan commented on the issue:

https://github.com/apache/cloudstack/pull/1757

Trillian test result (tid-338)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 24043 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr1757-t338-kvm-centos7.zip
Test completed. 42 look ok, 1 have error(s)

Test | Result | Time (s) | Test File
--- | --- | --- | ---
test_01_create_redundant_VPC_2tiers_4VMs_4IPs_4PF_ACL | `Failure` | 396.62 | test_vpc_redundant.py
test_01_vpc_site2site_vpn | Success | 150.37 | test_vpc_vpn.py
test_01_vpc_remote_access_vpn | Success | 56.14 | test_vpc_vpn.py
test_01_redundant_vpc_site2site_vpn | Success | 257.22 | test_vpc_vpn.py
test_02_VPC_default_routes | Success | 291.44 | test_vpc_router_nics.py
test_01_VPC_nics_after_destroy | Success | 502.74 | test_vpc_router_nics.py
test_05_rvpc_multi_tiers | Success | 530.07 | test_vpc_redundant.py
test_04_rvpc_network_garbage_collector_nics | Success | 1443.95 | test_vpc_redundant.py
test_03_create_redundant_VPC_1tier_2VMs_2IPs_2PF_ACL_reboot_routers | Success | 576.62 | test_vpc_redundant.py
test_02_redundant_VPC_default_routes | Success | 758.56 | test_vpc_redundant.py
test_09_delete_detached_volume | Success | 15.46 | test_volumes.py
test_08_resize_volume | Success | 15.38 | test_volumes.py
test_07_resize_fail | Success | 20.55 | test_volumes.py
test_06_download_detached_volume | Success | 15.30 | test_volumes.py
test_05_detach_volume | Success | 100.36 | test_volumes.py
test_04_delete_attached_volume | Success | 10.21 | test_volumes.py
test_03_download_attached_volume | Success | 15.45 | test_volumes.py
test_02_attach_volume | Success | 43.71 | test_volumes.py
test_01_create_volume | Success | 712.26 | test_volumes.py
test_deploy_vm_multiple | Success | 254.22 | test_vm_life_cycle.py
test_deploy_vm | Success | 0.03 | test_vm_life_cycle.py
test_advZoneVirtualRouter | Success | 0.02 | test_vm_life_cycle.py
test_10_attachAndDetach_iso | Success | 26.66 | test_vm_life_cycle.py
test_09_expunge_vm | Success | 125.29 | test_vm_life_cycle.py
test_08_migrate_vm | Success | 41.43 | test_vm_life_cycle.py
test_07_restore_vm | Success | 0.13 | test_vm_life_cycle.py
test_06_destroy_vm | Success | 126.01 | test_vm_life_cycle.py
test_03_reboot_vm | Success | 126.01 | test_vm_life_cycle.py
test_02_start_vm | Success | 10.23 | test_vm_life_cycle.py
test_01_stop_vm | Success | 40.42 | test_vm_life_cycle.py
test_CreateTemplateWithDuplicateName | Success | 65.72 | test_templates.py
test_08_list_system_templates | Success | 0.03 | test_templates.py
test_07_list_public_templates | Success | 0.03 | test_templates.py
test_05_template_permissions | Success | 0.05 | test_templates.py
test_04_extract_template | Success | 5.28 | test_templates.py
test_03_delete_template | Success | 5.51 | test_templates.py
test_02_edit_template | Success | 90.16 | test_templates.py
test_01_create_template | Success | 65.67 | test_templates.py
test_10_destroy_cpvm | Success | 131.43 | test_ssvm.py
test_09_destroy_ssvm | Success | 163.33 | test_ssvm.py
test_08_reboot_cpvm | Success | 101.52 | test_ssvm.py
test_07_reboot_ssvm | Success | 103.22 | test_ssvm.py
test_06_stop_cpvm | Success | 101.46 | test_ssvm.py
test_05_stop_ssvm | Success | 133.18 | test_ssvm.py
test_04_cpvm_internals | Success | 1.02 | test_ssvm.py
test_03_ssvm_internals | Success | 3.99 | test_ssvm.py
test_02_list_cpvm_vm | Success | 0.11 | test_ssvm.py
test_01_list_sec_storage_vm | Success | 0.12 | test_ssvm.py
test_01_snapshot_root_disk | Success | 12.48 | test_snapshots.py
test_04_change_offering_small | Success | 209.76 | test_service_offerings.py
test_03_delete_service_offering | Success | 0.03 | test_service_offerings.py
test_02_edit_service_offering | Success | 0.05 | test_service_offerings.py
test_01_create_service_offering | Success | 0.10 | test_service_offerings.py
test_02_sys_template_ready | Success | 0.16 | test_secondary_storage.py
test_01_sys_vm_start | Success | 0.24 | test_secondary_storage.py
test_09_reboot_router | Success | 35.33 | test_routers.py
test_08_start_router | Success | 30.32 | test_routers.py
test_07_stop_router | Success | 10.28 | test_routers.py
test_06_router_advanced | Success | 0.06 | test_routers.py
test_05_router_basic | Success | 0.05 | test_routers.py
test_04_restart_network_wo_cleanup | Success | 5.65 | test_routers.py
test_03_restart_
[jira] [Commented] (CLOUDSTACK-9595) Transactions are not getting retried in case of database deadlock errors
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665577#comment-15665577 ]

ASF GitHub Bot commented on CLOUDSTACK-9595:

Github user jburwell commented on the issue:

https://github.com/apache/cloudstack/pull/1762

@serg38 with custom plugins, there is no way to reliably perform such tracing. I can think of batch cleanup operations in the storage layer that follow the pattern I described. Even if there were, we would have planted a landmine for future changes to the system. Deadlocks are significant technical debt that is clearly causing significant operational issues. Unfortunately, there is no way to address them generically.

> Transactions are not getting retried in case of database deadlock errors
>
> Key: CLOUDSTACK-9595
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9595
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the default.)
> Affects Versions: 4.8.0
> Reporter: subhash yedugundla
> Fix For: 4.8.1
>
> Customer is seeing occasional error 'Deadlock found when trying to get lock; try restarting transaction' messages in their management server logs. It happens regularly at least once a day. The following is the error seen:
>
> 2015-12-09 19:23:19,450 ERROR [cloud.api.ApiServer] (catalina-exec-3:ctx-f05c58fc ctx-39c17156 ctx-7becdf6e) unhandled exception executing api command: [Ljava.lang.String;@230a6e7f
> com.cloud.utils.exception.CloudRuntimeException: DB Exception on: com.mysql.jdbc.JDBC4PreparedStatement@74f134e3: DELETE FROM instance_group_vm_map WHERE instance_group_vm_map.instance_id = 941374
>     at com.cloud.utils.db.GenericDaoBase.expunge(GenericDaoBase.java:1209)
>     at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>     at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>     at com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
>     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
>     at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
>     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>     at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
>     at com.sun.proxy.$Proxy237.expunge(Unknown Source)
>     at com.cloud.vm.UserVmManagerImpl$2.doInTransactionWithoutResult(UserVmManagerImpl.java:2593)
>     at com.cloud.utils.db.TransactionCallbackNoReturn.doInTransaction(TransactionCallbackNoReturn.java:25)
>     at com.cloud.utils.db.Transaction$2.doInTransaction(Transaction.java:57)
>     at com.cloud.utils.db.Transaction.execute(Transaction.java:45)
>     at com.cloud.utils.db.Transaction.execute(Transaction.java:54)
>     at com.cloud.vm.UserVmManagerImpl.addInstanceToGroup(UserVmManagerImpl.java:2575)
>     at com.cloud.vm.UserVmManagerImpl.updateVirtualMachine(UserVmManagerImpl.java:2332)
[jira] [Commented] (CLOUDSTACK-9595) Transactions are not getting retried in case of database deadlock errors
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665556#comment-15665556 ]

ASF GitHub Bot commented on CLOUDSTACK-9595:

Github user serg38 commented on the issue:

https://github.com/apache/cloudstack/pull/1762

@jburwell I concur, but if @yvsubhash has verified that those methods don't participate in complex DML transactions, this might still be a good start. If so, this approach might be expanded later to multi-DML transactions so that each piece can be retried individually. I myself traced a few deadlocks in ACS using native MySQL deadlock logging, and there doesn't seem to be a viable alternative to retries, given the well-known complexity of ACS DB operations.
[jira] [Commented] (CLOUDSTACK-9595) Transactions are not getting retried in case of database deadlock errors
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665499#comment-15665499 ]

ASF GitHub Bot commented on CLOUDSTACK-9595:

Github user jburwell commented on the issue:

https://github.com/apache/cloudstack/pull/1762

@serg38 there remains a risk when those methods are executed in the context of an open transaction where DMLs have already been executed and subsequent DMLs will be executed. In this scenario, the first set of changes would be lost to the rollback triggered by the query deadlock, while the second set would proceed successfully.
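The risk described in this comment can be made concrete with a toy in-memory model. This is only a sketch: `ToyTransaction` and `LostWriteDemo` are hypothetical names that stand in for a real database connection, and nothing here is CloudStack code. It assumes, as MySQL InnoDB does, that the deadlock victim's whole transaction is rolled back rather than only the failing statement.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a database transaction; illustration only.
final class ToyTransaction {
    private final List<String> pending = new ArrayList<>();
    private final List<String> committed = new ArrayList<>();

    // Record a DML statement as part of the open transaction.
    void execute(String dml) {
        pending.add(dml);
    }

    // Models the database rolling back the entire transaction
    // when it is chosen as the deadlock victim.
    void rollbackAsDeadlockVictim() {
        pending.clear();
    }

    // Make every statement still pending permanent.
    void commit() {
        committed.addAll(pending);
        pending.clear();
    }

    List<String> committed() {
        return committed;
    }
}

final class LostWriteDemo {
    static List<String> run() {
        ToyTransaction txn = new ToyTransaction();

        // First set of changes succeeds inside the open transaction.
        txn.execute("UPDATE a");

        // The next statement deadlocks: the database discards the
        // whole transaction, including "UPDATE a".
        txn.rollbackAsDeadlockVictim();

        // A statement-level retry re-runs only the failed statement.
        txn.execute("DELETE FROM b");
        txn.commit();

        // Only the retried statement survives; "UPDATE a" is silently lost.
        return txn.committed();
    }
}
```

In this model the caller sees a successful commit even though the first change never reached the database, which is why a retry is only safe when it restarts the whole transaction or when the retried call runs outside any open transaction.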
[jira] [Commented] (CLOUDSTACK-9595) Transactions are not getting retried in case of database deadlock errors
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665463#comment-15665463 ]

ASF GitHub Bot commented on CLOUDSTACK-9595:

Github user serg38 commented on the issue:

https://github.com/apache/cloudstack/pull/1762

@jburwell @yvsubhash I might be wrong, but this PR retries on deadlock for only two DAO methods: searchIncludingRemoved and customSearchIncludingRemoved. No update methods use this retry mechanism. If that's the case, there is no risk of corrupting the DB.
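The approach discussed in this comment, retrying only read-only calls when the database reports a deadlock, might look roughly like the sketch below. It is not the code from the pull request: `DeadlockRetry`, `SqlCall`, and `retryOnDeadlock` are hypothetical names. It assumes the deadlock surfaces as a `java.sql.SQLException` carrying MySQL error code 1213 or SQLSTATE 40001, and that the wrapped call performs no writes.

```java
import java.sql.SQLException;

// Hypothetical helper for illustration; not part of CloudStack.
final class DeadlockRetry {
    // MySQL reports a deadlock as error 1213 (ER_LOCK_DEADLOCK), SQLSTATE 40001.
    static final int MYSQL_ER_LOCK_DEADLOCK = 1213;
    static final String SQLSTATE_SERIALIZATION_FAILURE = "40001";

    /** A read-only database call that may fail with an SQLException. */
    interface SqlCall<T> {
        T call() throws SQLException;
    }

    static boolean isDeadlock(SQLException e) {
        return e.getErrorCode() == MYSQL_ER_LOCK_DEADLOCK
                || SQLSTATE_SERIALIZATION_FAILURE.equals(e.getSQLState());
    }

    /** Runs the query, retrying up to maxAttempts times while it deadlocks. */
    static <T> T retryOnDeadlock(SqlCall<T> query, int maxAttempts) throws SQLException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return query.call();
            } catch (SQLException e) {
                if (!isDeadlock(e)) {
                    throw e; // unrelated failure: do not retry
                }
                last = e; // deadlock victim: try again
            }
        }
        throw last; // every attempt deadlocked
    }
}
```

A caller would wrap a single search, for example `DeadlockRetry.retryOnDeadlock(() -> runSearch(), 3)`, where `runSearch` is a placeholder for the read-only query. Whether this is safe still depends on the call not running inside an open transaction that has already executed DML.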
[jira] [Commented] (CLOUDSTACK-9560) Root volume of deleted VM left unremoved
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665362#comment-15665362 ] ASF GitHub Bot commented on CLOUDSTACK-9560: Github user jburwell commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1726#discussion_r87917132 --- Diff: server/src/com/cloud/storage/StorageManagerImpl.java --- @@ -2199,15 +2199,20 @@ public void cleanupDownloadUrls(){ if(downloadUrlCurrentAgeInSecs < _downloadUrlExpirationInterval){ // URL hasnt expired yet continue; } - -s_logger.debug("Removing download url " + volumeOnImageStore.getExtractUrl() + " for volume id " + volumeOnImageStore.getVolumeId()); +long volumeId = volumeOnImageStore.getVolumeId(); +s_logger.debug("Removing download url " + volumeOnImageStore.getExtractUrl() + " for volume id " + volumeId); // Remove it from image store ImageStoreEntity secStore = (ImageStoreEntity) _dataStoreMgr.getDataStore(volumeOnImageStore.getDataStoreId(), DataStoreRole.Image); secStore.deleteExtractUrl(volumeOnImageStore.getInstallPath(), volumeOnImageStore.getExtractUrl(), Upload.Type.VOLUME); // Now expunge it from DB since this entry was created only for download purpose _volumeStoreDao.expunge(volumeOnImageStore.getId()); +Volume volume = _volumeDao.findById(volumeId); +if (volume != null && volume.getState() == Volume.State.Expunged) +{ +_volumeDao.remove(volumeId); +} --- End diff -- @yvsubhash have you had a chance to review @ustcweizhou's feedback? > Root volume of deleted VM left unremoved > > > Key: CLOUDSTACK-9560 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9560 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) > Components: Volumes >Affects Versions: 4.8.0 > Environment: XenServer >Reporter: subhash yedugundla > Fix For: 4.8.1 > > > In the following scenario root volume gets unremoved > Steps to reproduce the issue > 1. 
Create a VM. > 2. Stop this VM. > 3. On the page of the volume of the VM, click the 'Download Volume' icon. > 4. Wait for the popup screen to display and cancel out with/without clicking > the download link. > 5. Destroy the VM > Even after the corresponding VM is deleted and expunged, the root volume is left > in 'Expunging' state unremoved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CLOUDSTACK-9570) Bug in listSnapshots for snapshots with deleted data stores
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665336#comment-15665336 ] ASF GitHub Bot commented on CLOUDSTACK-9570: Github user jburwell commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1735#discussion_r87915067 --- Diff: server/src/com/cloud/api/ApiResponseHelper.java --- @@ -526,16 +529,18 @@ public static DataStoreRole getDataStoreRole(Snapshot snapshot, SnapshotDataStor } long storagePoolId = snapshotStore.getDataStoreId(); -DataStore dataStore = dataStoreMgr.getDataStore(storagePoolId, DataStoreRole.Primary); +if (snapshotStore.getState() != null && ! snapshotStore.getState().equals(ObjectInDataStoreStateMachine.State.Destroyed)) { +DataStore dataStore = dataStoreMgr.getDataStore(storagePoolId, DataStoreRole.Primary); -Map mapCapabilities = dataStore.getDriver().getCapabilities(); +Map mapCapabilities = dataStore.getDriver().getCapabilities(); -if (mapCapabilities != null) { -String value = mapCapabilities.get(DataStoreCapabilities.STORAGE_SYSTEM_SNAPSHOT.toString()); -Boolean supportsStorageSystemSnapshots = new Boolean(value); +if (mapCapabilities != null) { +String value = mapCapabilities.get(DataStoreCapabilities.STORAGE_SYSTEM_SNAPSHOT.toString()); +Boolean supportsStorageSystemSnapshots = new Boolean(value); --- End diff -- `new Boolean` skips the constant pool -- putting unnecessary pressure on the heap and creating a potential memory leak. Please use `Boolean.valueOf` to parse the value to avoid this issue. > Bug in listSnapshots for snapshots with deleted data stores > --- > > Key: CLOUDSTACK-9570 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9570 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) > Components: API >Reporter: Nicolas Vazquez >Assignee: Nicolas Vazquez > > h3. 
Actual behaviour > If there is a snapshot on a data store that is removed, {{listSnapshots}} still > tries to enumerate it and gives an error (in this example data store 2 has been > removed): > {code:xml|title=/client/api?command=listSnapshots&isrecursive=true&listall=true|borderStyle=solid} > >530 >4250 >Unable to locate datastore with id 2 > > {code} > h3. Reproduce error > These steps can be followed to reproduce the issue: > * Take a snapshot of a volume (this creates references for primary storage > and secondary storage in the snapshot_store_ref table) > * Simulate retiring primary data storage where snapshot is cached (in this > example X is a fake data store and Y is snapshot id): > {{UPDATE `cloud`.`snapshot_store_ref` SET `store_id`='X', `state`="Destroyed" > WHERE `id`='Y';}} > * List snapshots > {{/client/api?command=listSnapshots&isrecursive=true&listall=true}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
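A minimal sketch of the review suggestion in the CLOUDSTACK-9570 comment above, assuming only the JDK: `Boolean.valueOf(String)` hands back one of the two shared `Boolean.TRUE` / `Boolean.FALSE` instances, while `new Boolean(String)` allocates a new object on every call. The class name `CapabilityFlag`, the helper `supports`, and the map keys are hypothetical and are not part of CloudStack.

```java
import java.util.HashMap;
import java.util.Map;

public class CapabilityFlag {
    // Hypothetical helper: reads a capability flag without allocating a Boolean.
    static boolean supports(Map<String, String> capabilities, String key) {
        if (capabilities == null) {
            return false;
        }
        // parseBoolean yields false for null and for anything other than "true" (case-insensitive).
        return Boolean.parseBoolean(capabilities.get(key));
    }

    public static void main(String[] args) {
        Map<String, String> caps = new HashMap<>();
        caps.put("STORAGE_SYSTEM_SNAPSHOT", "true");

        // valueOf returns the shared instance, so the reference comparison holds.
        System.out.println(Boolean.valueOf("true") == Boolean.TRUE);
        System.out.println(supports(caps, "STORAGE_SYSTEM_SNAPSHOT"));
        System.out.println(supports(caps, "MISSING_KEY"));
    }
}
```

When only a primitive is needed, `Boolean.parseBoolean` avoids the wrapper object altogether; `Boolean.valueOf` is the closer drop-in where a `Boolean` reference is required.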
[jira] [Commented] (CLOUDSTACK-9561) After domain/account deletion, snapshot taken by the domain/account remains undeleted
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665328#comment-15665328 ] ASF GitHub Bot commented on CLOUDSTACK-9561: Github user jburwell commented on the issue: https://github.com/apache/cloudstack/pull/1737 @SudharmaJain this fix seems like it would be good for LTS users as well. Could you please change the base branch to 4.9? > After domain/account deletion, snapshot taken by the domain/account remains > undeleted > - > > Key: CLOUDSTACK-9561 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9561 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) >Reporter: sudharma jain > > While deleting the user account, cleanup for the removed VMs/volumes does not > happen. For the removed VMs, snapshots don't get cleaned. Only for > volumes in ready state does the cleanup happen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CLOUDSTACK-9572) Snapshot on primary storage not cleaned up after Storage migration
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665316#comment-15665316 ] ASF GitHub Bot commented on CLOUDSTACK-9572: Github user jburwell commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1740#discussion_r87913074 --- Diff: server/src/com/cloud/storage/snapshot/SnapshotManagerImpl.java --- @@ -,6 +,20 @@ public boolean canOperateOnVolume(Volume volume) { } @Override +public void cleanupSnapshotsByVolume(Long volumeId) { +List volSnapShots = _snapshotDao.listByVolumeId(volumeId); +for(SnapshotVO snapshot: volSnapShots) { +SnapshotInfo info = snapshotFactory.getSnapshot(snapshot.getId(), DataStoreRole.Primary); +try { +snapshotSrv.deleteSnapshot(info); +} catch(CloudRuntimeException e) { +String msg = "Cleanup of Snapshot with uuid " + snapshot.getUuid() + " in primary storage is failed. Ignoring"; --- End diff -- This local variable is only used once. Please consider collapsing into line 1122. Also, please add the message from the exception to the message to provide greater detail for debugging efforts. > Snapshot on primary storage not cleaned up after Storage migration > -- > > Key: CLOUDSTACK-9572 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9572 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) > Components: Storage Controller >Affects Versions: 4.8.0 > Environment: Xen Server >Reporter: subhash yedugundla > Fix For: 4.8.1 > > > Issue Description > === > 1. Create an instance on the local storage on any host > 2. Create a scheduled snapshot of the volume: > 3. Wait until ACS created the snapshot. ACS is creating a snapshot on local > storage and is transferring this snapshot to secondary storage. But the > latest snapshot on local storage will stay there. This is as expected. > 4. 
Migrate the instance to another XenServer host with ACS UI and Storage > Live Migration > 5. The Snapshot on the old host on local storage will not be cleaned up and > is staying on local storage. So local storage will fill up with unneeded > snapshots. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
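The review comment on the `catch` block above can be sketched as follows. This is an illustration only, not the CloudStack code: the `Deleter` interface stands in for the snapshot service, the class name `SnapshotCleanupSketch` is hypothetical, and warnings are collected in a list instead of being sent to the real logger. It shows the message being built inline (no single-use local) with the exception's own message appended.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SnapshotCleanupSketch {
    // Hypothetical stand-in for the service call that may fail per snapshot.
    interface Deleter {
        void delete(String uuid);
    }

    // Tries every delete; a failure is recorded with its cause and then skipped.
    static List<String> cleanup(List<String> uuids, Deleter deleter) {
        List<String> warnings = new ArrayList<>();
        for (String uuid : uuids) {
            try {
                deleter.delete(uuid);
            } catch (RuntimeException e) {
                // Built inline, and carries e.getMessage() for debugging.
                warnings.add("Cleanup of snapshot with uuid " + uuid
                        + " in primary storage failed; ignoring: " + e.getMessage());
            }
        }
        return warnings;
    }

    public static void main(String[] args) {
        List<String> uuids = Arrays.asList("a-1", "b-2");
        List<String> warnings = cleanup(uuids, uuid -> {
            if (uuid.equals("b-2")) {
                throw new IllegalStateException("store unreachable");
            }
        });
        for (String warning : warnings) {
            System.out.println(warning);
        }
    }
}
```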
[jira] [Commented] (CLOUDSTACK-9572) Snapshot on primary storage not cleaned up after Storage migration
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665315#comment-15665315 ] ASF GitHub Bot commented on CLOUDSTACK-9572: Github user jburwell commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1740#discussion_r87913578 --- Diff: server/src/com/cloud/storage/snapshot/SnapshotManagerImpl.java --- @@ -,6 +,20 @@ public boolean canOperateOnVolume(Volume volume) { } @Override +public void cleanupSnapshotsByVolume(Long volumeId) { +List volSnapShots = _snapshotDao.listByVolumeId(volumeId); +for(SnapshotVO snapshot: volSnapShots) { +SnapshotInfo info = snapshotFactory.getSnapshot(snapshot.getId(), DataStoreRole.Primary); --- End diff -- This appears to be an application side join. Please consider creating a new query to retrieve all snapshot info instances associated with `volumeId` to reduce load on the database and simplify this method. > Snapshot on primary storage not cleaned up after Storage migration > -- > > Key: CLOUDSTACK-9572 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9572 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) > Components: Storage Controller >Affects Versions: 4.8.0 > Environment: Xen Server >Reporter: subhash yedugundla > Fix For: 4.8.1 > > > Issue Description > === > 1. Create an instance on the local storage on any host > 2. Create a scheduled snapshot of the volume: > 3. Wait until ACS created the snapshot. ACS is creating a snapshot on local > storage and is transferring this snapshot to secondary storage. But the > latest snapshot on local storage will stay there. This is as expected. > 4. Migrate the instance to another XenServer host with ACS UI and Storage > Live Migration > 5. The Snapshot on the old host on local storage will not be cleaned up and > is staying on local storage. So local storage will fill up with unneeded > snapshots. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CLOUDSTACK-9595) Transactions are not getting retried in case of database deadlock errors
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665209#comment-15665209 ] ASF GitHub Bot commented on CLOUDSTACK-9595: Github user jburwell commented on the issue: https://github.com/apache/cloudstack/pull/1762 @serg38 that is not a safe assumption. Transactions often span multiple statements and methods across DAOs. `TransactionLegacy` has a transaction stacking/nested model that further occludes when a transaction actually completes. Deadlocks are a severe problem that needs to be fixed. Unfortunately, this patch would do more harm than good as it would eventually corrupt the database. In and of themselves, retries are also a very expensive solution to the problem both in terms of the engineering effort required to do it properly and the extra stress placed on the database to perform additional work that will likely fail. Furthermore, a generic **and** correct retry mechanism is a very difficult thing to write. Given the way transaction boundaries are managed in ACS, I think such an effort would be nearly impossible. In a properly written application, deadlocks should very rarely, if ever, occur. Their presence is a symptom of improper transaction handling and/or poor lock management. Therefore, my suggestion is that we change this patch to log details about the context in which deadlocks occur. We can then use this information to identify the areas in ACS where these contention problems are located and fix the root cause. > Transactions are not getting retried in case of database deadlock errors > > > Key: CLOUDSTACK-9595 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9595 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) 
>Affects Versions: 4.8.0 >Reporter: subhash yedugundla > Fix For: 4.8.1 > > > Customer is seeing occasional error 'Deadlock found when trying to get lock; > try restarting transaction' messages in their management server logs. It > happens regularly at least once a day. The following is the error seen > 2015-12-09 19:23:19,450 ERROR [cloud.api.ApiServer] > (catalina-exec-3:ctx-f05c58fc ctx-39c17156 ctx-7becdf6e) unhandled exception > executing api command: [Ljava.lang.String;@230a6e7f > com.cloud.utils.exception.CloudRuntimeException: DB Exception on: > com.mysql.jdbc.JDBC4PreparedStatement@74f134e3: DELETE FROM > instance_group_vm_map WHERE instance_group_vm_map.instance_id = 941374 > at com.cloud.utils.db.GenericDaoBase.expunge(GenericDaoBase.java:1209) > at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317) > at > org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183) > at > org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150) > at > com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34) > at > org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161) > at > org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91) > at > org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172) > at > org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204) > at com.sun.proxy.$Proxy237.expunge(Unknown Source) > at > 
com.cloud.vm.UserVmManagerImpl$2.doInTransactionWithoutResult(UserVmManagerImpl.java:2593) > at > com.cloud.utils.db.TransactionCallbackNoReturn.doInTransaction(TransactionCallbackNoReturn.java:25) > at com.cloud.utils.db.Transaction$2.doInTransaction(Transaction.java:57) > at com.cloud.utils.db.Transaction.execute(Transaction.java:45) > at com.cloud.utils.db.Transaction.execute(Transaction.java:54) > at > com.cloud.vm.UserVmManagerImpl.addInstanceToGroup(UserVmManagerImpl.java:2575) > at > com.cloud.vm.UserVmManagerImpl.updateVirtualMachine(UserVmManagerImpl.java:2332) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
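The suggestion above to log the context in which deadlocks occur, rather than retry, might look roughly like the sketch below. It assumes a MySQL backend, where a deadlock is reported with vendor error code 1213 and SQLSTATE 40001 (a SQLSTATE that other databases may also use for serialization failures that are not deadlocks). The class name `DeadlockContextSketch` and both helper methods are hypothetical; this is not the ACS patch.

```java
import java.sql.SQLException;

public class DeadlockContextSketch {
    // MySQL's ER_LOCK_DEADLOCK vendor code and the matching SQLSTATE.
    static final int MYSQL_ER_LOCK_DEADLOCK = 1213;
    static final String SQLSTATE_SERIALIZATION_FAILURE = "40001";

    // True when the exception looks like a deadlock report from the database.
    static boolean isDeadlock(SQLException e) {
        return e.getErrorCode() == MYSQL_ER_LOCK_DEADLOCK
                || SQLSTATE_SERIALIZATION_FAILURE.equals(e.getSQLState());
    }

    // Builds a log line with enough context to locate the contention later.
    static String describe(SQLException e, String sql) {
        return "Deadlock detected: thread=" + Thread.currentThread().getName()
                + " sqlState=" + e.getSQLState()
                + " errorCode=" + e.getErrorCode()
                + " statement=[" + sql + "]";
    }

    public static void main(String[] args) {
        SQLException e = new SQLException(
                "Deadlock found when trying to get lock; try restarting transaction",
                "40001", 1213);
        String sql = "DELETE FROM instance_group_vm_map WHERE instance_id = ?";
        if (isDeadlock(e)) {
            // Log and rethrow or fail; no retry is attempted in this sketch.
            System.out.println(describe(e, sql));
        }
    }
}
```

In a real handler the caller would presumably also record the stack trace and then let the transaction fail and roll back, in line with the position taken in the comment.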
[jira] [Commented] (CLOUDSTACK-9592) Empty responses from site to site connection status are not handled propertly
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665165#comment-15665165 ] ASF GitHub Bot commented on CLOUDSTACK-9592: Github user jburwell commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1761#discussion_r87900141 --- Diff: server/src/com/cloud/network/router/VirtualNetworkApplianceManagerImpl.java --- @@ -962,18 +962,22 @@ protected void updateSite2SiteVpnConnectionState(final List rout } final Site2SiteVpnConnection.State oldState = conn.getState(); final Site2SiteCustomerGateway gw = _s2sCustomerGatewayDao.findById(conn.getCustomerGatewayId()); -if (answer.isConnected(gw.getGatewayIp())) { - conn.setState(Site2SiteVpnConnection.State.Connected); -} else { - conn.setState(Site2SiteVpnConnection.State.Disconnected); -} -_s2sVpnConnectionDao.persist(conn); -if (oldState != conn.getState()) { -final String title = "Site-to-site Vpn Connection to " + gw.getName() + " just switch from " + oldState + " to " + conn.getState(); -final String context = "Site-to-site Vpn Connection to " + gw.getName() + " on router " + router.getHostName() + "(id: " + router.getId() + ") " -+ " just switch from " + oldState + " to " + conn.getState(); -s_logger.info(context); - _alertMgr.sendAlert(AlertManager.AlertType.ALERT_TYPE_DOMAIN_ROUTER, router.getDataCenterId(), router.getPodIdToDeployIn(), title, context); + +if (answer.isIPPresent(gw.getGatewayIp())) { +if (answer.isConnected(gw.getGatewayIp())) { + conn.setState(Site2SiteVpnConnection.State.Connected); +} else { + conn.setState(Site2SiteVpnConnection.State.Disconnected); +} +_s2sVpnConnectionDao.persist(conn); +if (oldState != conn.getState()) { +final String title = "Site-to-site Vpn Connection to " + gw.getName() + " just switch from " + oldState + " to " + conn.getState(); --- End diff -- Minor nit: could you please fix the grammatical error in this error message? 
It should read "~~just~~ switch**ed** from". > Empty responses from site to site connection status are not handled propertly > - > > Key: CLOUDSTACK-9592 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9592 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) > Components: Network Controller >Affects Versions: 4.8.0 > Environment: Any Hypervisor >Reporter: subhash yedugundla > Fix For: 4.8.1 > > > vpn connection status gives responses like the below sometimes > Processing: { Ans: , MgmtId: 7203499016310, via: 1(10.147.28.37), Ver: v1, > Flags: 110, > [{"com.cloud.agent.api.CheckS2SVpnConnectionsAnswer":{"ipToConnected":{},"ipToDetail":{},"details":"","result":true,"wait":0}}] > } > 2016-09-27 08:52:19,211 DEBUG [c.c.a.t.Request] > (RouterStatusMonitor-1:ctx-c20f391d) (logid:c217239d) Seq > 1-2315413158421863581: Received: { Ans: , MgmtId: 7203499016310, via: > 1(10.147.28.37), Ver: v1, Flags: 110, > { CheckS2SVpnConnectionsAnswer } > In the above scenario, the bug in the processing of this response assumes the > connection is disconnected even though it is not disconnected and there would > be two consecutive alerts in logs as well as emails even though there is not > actual disconnection and reconnection > Site-to-site Vpn Connection XYZ-VPN on router r-197-VM(id: 197) just switch > from Disconnected to Connected > Site-to-site Vpn Connection to D1 site to site VPN on router r-372-VM(id: > 372) just switch from Connected to Disconnected -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CLOUDSTACK-9592) Empty responses from site to site connection status are not handled propertly
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665164#comment-15665164 ] ASF GitHub Bot commented on CLOUDSTACK-9592: Github user jburwell commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1761#discussion_r87897999 --- Diff: core/src/com/cloud/agent/api/CheckS2SVpnConnectionsAnswer.java --- @@ -76,4 +76,14 @@ public String getDetail(String ip) { } return null; } + +public boolean isIPPresent(String ip) { +if (this.getResult()) { +Boolean status = ipToConnected.get(ip); +if (status != null) { --- End diff -- Is the IP present if `status` is equal to `false`? > Empty responses from site to site connection status are not handled propertly > - > > Key: CLOUDSTACK-9592 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9592 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) > Components: Network Controller >Affects Versions: 4.8.0 > Environment: Any Hypervisor >Reporter: subhash yedugundla > Fix For: 4.8.1 > > > vpn connection status gives responses like the below sometimes > Processing: { Ans: , MgmtId: 7203499016310, via: 1(10.147.28.37), Ver: v1, > Flags: 110, > [{"com.cloud.agent.api.CheckS2SVpnConnectionsAnswer":{"ipToConnected":{},"ipToDetail":{},"details":"","result":true,"wait":0}}] > } > 2016-09-27 08:52:19,211 DEBUG [c.c.a.t.Request] > (RouterStatusMonitor-1:ctx-c20f391d) (logid:c217239d) Seq > 1-2315413158421863581: Received: { Ans: , MgmtId: 7203499016310, via: > 1(10.147.28.37), Ver: v1, Flags: 110, > { CheckS2SVpnConnectionsAnswer } > In the above scenario, the bug in the processing of this response assumes the > connection is disconnected even though it is not disconnected and there would > be two consecutive alerts in logs as well as emails even though there is not > actual disconnection and reconnection > Site-to-site Vpn Connection XYZ-VPN on router 
r-197-VM(id: 197) just switch > from Disconnected to Connected > Site-to-site Vpn Connection to D1 site to site VPN on router r-372-VM(id: > 372) just switch from Connected to Disconnected -- This message was sent by Atlassian JIRA (v6.3.4#6332)
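One possible reading of the question "Is the IP present if `status` is equal to `false`?" is sketched below: presence is decided by whether the answer contains an entry for the IP at all, independently of whether that entry says connected or disconnected. `VpnAnswerSketch` is a hypothetical stand-in, not the real `CheckS2SVpnConnectionsAnswer`, and the intended semantics are an assumption.

```java
import java.util.HashMap;
import java.util.Map;

public class VpnAnswerSketch {
    private final Map<String, Boolean> ipToConnected = new HashMap<>();
    private final boolean result;

    VpnAnswerSketch(boolean result) {
        this.result = result;
    }

    // Records the status the router reported for one gateway IP.
    void report(String ip, boolean connected) {
        ipToConnected.put(ip, connected);
    }

    // Present means the router reported something for this IP, even "false".
    boolean isIPPresent(String ip) {
        return result && ipToConnected.containsKey(ip);
    }

    // Connected only when an entry exists and it is true.
    boolean isConnected(String ip) {
        return result && Boolean.TRUE.equals(ipToConnected.get(ip));
    }

    public static void main(String[] args) {
        VpnAnswerSketch answer = new VpnAnswerSketch(true);
        answer.report("10.0.0.1", false);

        // Reported but disconnected: present, not connected.
        System.out.println(answer.isIPPresent("10.0.0.1") + " " + answer.isConnected("10.0.0.1"));
        // Never reported (the empty-response case): absent, so no state change should follow.
        System.out.println(answer.isIPPresent("10.0.0.2") + " " + answer.isConnected("10.0.0.2"));
    }
}
```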
[jira] [Commented] (CLOUDSTACK-9595) Transactions are not getting retried in case of database deadlock errors
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665125#comment-15665125 ] ASF GitHub Bot commented on CLOUDSTACK-9595: Github user serg38 commented on the issue: https://github.com/apache/cloudstack/pull/1762 @jburwell I thought that most, if not all, ACS interactions through DAOs are atomic transactions. Do we have cases of multiple DML statements as part of the same transaction? We have been seeing quite a few deadlocks in high-transaction-volume environments where multiple management servers are employed. This causes quite a pain for users due to the randomness and the lack of a good recourse or explanation. I would argue that a proper retry is the better choice, provided we cover all the cases, including those with complex transactions. We have been successful leveraging this approach in systems built on top of ACS. > Transactions are not getting retried in case of database deadlock errors > > > Key: CLOUDSTACK-9595 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9595 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) >Affects Versions: 4.8.0 >Reporter: subhash yedugundla > Fix For: 4.8.1 > > > Customer is seeing occasional error 'Deadlock found when trying to get lock; > try restarting transaction' messages in their management server logs. It > happens regularly at least once a day. 
[jira] [Commented] (CLOUDSTACK-9589) vmName entries from host_details table for the VM's whose state is Expunging should be deleted during upgrade from older versions
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665086#comment-15665086 ] ASF GitHub Bot commented on CLOUDSTACK-9589: Github user jburwell commented on the issue: https://github.com/apache/cloudstack/pull/1759 This change has been added to the `schema-480to481.sql` script. Since 4.8.1 has already shipped, this script will not be applied for those users. Therefore, this change needs to be placed in the `schema-481to4820.sql` script. Also, the base branch for this PR is master. However, the database change is targeted at 4.8. Therefore, the base branch should be changed to 4.8. > vmName entries from host_details table for the VM's whose state is Expunging > should be deleted during upgrade from older versions > - > > Key: CLOUDSTACK-9589 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9589 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) > Components: Baremetal >Affects Versions: 4.4.4 > Environment: Baremetal zone >Reporter: subhash yedugundla > Fix For: 4.8.1 > > > Having vmName entries for VMs in 'expunging' state would cause > deploying VMs with matching host tags to fail. So removing them during upgrade -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CLOUDSTACK-9589) vmName entries from host_details table for the VM's whose state is Expunging should be deleted during upgrade from older versions
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665082#comment-15665082 ] ASF GitHub Bot commented on CLOUDSTACK-9589: Github user jburwell commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1759#discussion_r87896137 --- Diff: setup/db/db/schema-480to481-cleanup.sql --- @@ -18,3 +18,6 @@ --; -- Schema cleanup from 4.8.0 to 4.8.1; --; + +DELETE FROM `cloud`.`host_details` where name = 'vmName' and value in (select name from `cloud`.`vm_instance` where state = 'Expunging' and hypervisor_type ='BareMetal'); --- End diff -- Why is this change scoped only to the baremetal hypervisor? It would seem that it should apply to all hypervisors. > vmName entries from host_details table for the VM's whose state is Expunging > should be deleted during upgrade from older versions > - > > Key: CLOUDSTACK-9589 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9589 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) > Components: Baremetal >Affects Versions: 4.4.4 > Environment: Baremetal zone >Reporter: subhash yedugundla > Fix For: 4.8.1 > > > Having vmName entries for VMs in 'expunging' state would cause > deploying VMs with matching host tags to fail. So removing them during upgrade -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CLOUDSTACK-9593) User data check is inconsistent with python
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665071#comment-15665071 ] ASF GitHub Bot commented on CLOUDSTACK-9593: Github user jburwell commented on the issue: https://github.com/apache/cloudstack/pull/1760 @marcaurele this change looks like a good check to add to LTS as well. Could you please change the base branch to 4.9? Once you do, I will kick regression tests across all hypervisors in order to merge the fix. > User data check is inconsistent with python > --- > > Key: CLOUDSTACK-9593 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9593 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) >Affects Versions: 4.4.2, 4.4.3, 4.3.2, 4.5.1, 4.4.4, 4.5.2, 4.6.0, 4.6.1, > 4.6.2, 4.7.0, 4.7.1, 4.8.0, 4.9.0 >Reporter: Marc-Aurèle Brothier >Assignee: Marc-Aurèle Brothier > > The user data is validated through the Apache commons codec library, but this > library does not check that the length is a multiple of 4 characters. The RFC > does not require it either. But the python script in the virtual router that > loads the user data does check for the possible padding presence, requiring > the string to be a multiple of 4 characters. > {code:python} > >>> import base64 > >>> base64.b64decode('foo') > Traceback (most recent call last): > File "", line 1, in > File > "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/base64.py", > line 78, in b64decode > raise TypeError(msg) > TypeError: Incorrect padding > >>> base64.b64decode('foo=') > '~\x8a' > {code} > Currently since the java check is less restrictive, the user data gets saved > into the database but the VR script crashes when it receives this VM user > data. On a single VM it is not really a problem. The critical issue is when a > VR is restarted. 
The invalid pythonic base64 string makes the vmdata.py > script crash, resulting in a VR not starting at all. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
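The stricter check described in CLOUDSTACK-9593 above, rejecting user data whose length is not a multiple of 4 characters before it is stored, could be sketched on the Java side as follows. The class `UserDataLengthCheck` and the method `hasPaddedLength` are hypothetical names and this is not the code from the pull request; it only tests the length rule that the Python decoder in the issue enforces, not whether the characters are valid base64.

```java
public class UserDataLengthCheck {
    // True when the string length is a multiple of 4, as the Python decoder requires.
    static boolean hasPaddedLength(String base64) {
        return base64 != null && base64.length() % 4 == 0;
    }

    public static void main(String[] args) {
        // 'foo' is the unpadded value from the issue; 'foo=' is its padded form.
        System.out.println(hasPaddedLength("foo"));
        System.out.println(hasPaddedLength("foo="));
    }
}
```

Running the length check alongside the existing validation would keep values such as `foo` out of the database, so the virtual router script never receives them.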
[jira] [Commented] (CLOUDSTACK-9588) Add Load Balancer functionality in Network page is redundant.
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665062#comment-15665062 ] ASF GitHub Bot commented on CLOUDSTACK-9588: Github user jburwell commented on the issue: https://github.com/apache/cloudstack/pull/1758 @nitin-maharana could you please provide a screenshot of this change? > Add Load Balancer functionality in Network page is redundant. > - > > Key: CLOUDSTACK-9588 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9588 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) >Reporter: Nitin Kumar Maharana > > Steps to Reproduce: > Network -> Select any network -> Observe Add Load Balancer tab > The "Add Load Balancer" functionality is redundant. > The above is used to create an LB rule without any public IP. > Resolution: > Similar functionality exists in Network -> Any Network -> Details Tab -> > View IP Addresses -> Any public IP -> Configuration Tab -> Observe Load > Balancing. > The above is used to create an LB rule with a public IP. This is a more > convenient way of creating an LB rule as the IP is involved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CLOUDSTACK-9583) VR: In CsDhcp.py preseed both hostaname and localhost to resolve to 127.0.0.1
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665061#comment-15665061 ] ASF GitHub Bot commented on CLOUDSTACK-9583: Github user blueorangutan commented on the issue: https://github.com/apache/cloudstack/pull/1757 @jburwell a Trillian-Jenkins matrix job (centos6 mgmt + xs65sp1, centos7 mgmt + vmware55u3, centos7 mgmt + kvmcentos7) has been kicked to run smoke tests > VR: In CsDhcp.py preseed both hostaname and localhost to resolve to 127.0.0.1 > - > > Key: CLOUDSTACK-9583 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9583 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) >Reporter: Murali Reddy > Fix For: 4.9.1.0 > > > It is observed that 'ip route flush' was timing out after 20 seconds with > the error that can't resolve the name of the vrouter. Since this is done for > each rule for a router with a lot of rules, adding the entry to hosts file > fixes it and the router provisioning is observed faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CLOUDSTACK-9583) VR: In CsDhcp.py preseed both hostaname and localhost to resolve to 127.0.0.1
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665059#comment-15665059 ] ASF GitHub Bot commented on CLOUDSTACK-9583: Github user jburwell commented on the issue: https://github.com/apache/cloudstack/pull/1757 @blueorangutan test matrix > VR: In CsDhcp.py preseed both hostaname and localhost to resolve to 127.0.0.1 > - > > Key: CLOUDSTACK-9583 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9583 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) >Reporter: Murali Reddy > Fix For: 4.9.1.0 > > > It is observed that 'ip route flush' was timing out after 20 seconds with > the error that can't resolve the name of the vrouter. Since this is done for > each rule for a router with a lot of rules, adding the entry to hosts file > fixes it and the router provisioning is observed faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CLOUDSTACK-9595) Transactions are not getting retried in case of database deadlock errors
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665052#comment-15665052 ] ASF GitHub Bot commented on CLOUDSTACK-9595: Github user jburwell commented on the issue: https://github.com/apache/cloudstack/pull/1762 @serg38 my reading of the code is that only the most recently attempted DML statement will be re-executed. Furthermore, retrying without refreshing the base data can also lead to data corruption. Given that risk, the best thing to do in the case of a deadlock is to fail and roll back. > Transactions are not getting retried in case of database deadlock errors > > > Key: CLOUDSTACK-9595 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9595 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) >Affects Versions: 4.8.0 >Reporter: subhash yedugundla > Fix For: 4.8.1 > > > Customer is seeing occasional error 'Deadlock found when trying to get lock; > try restarting transaction' messages in their management server logs. It > happens regularly at least once a day. 
The following is the error seen > 2015-12-09 19:23:19,450 ERROR [cloud.api.ApiServer] > (catalina-exec-3:ctx-f05c58fc ctx-39c17156 ctx-7becdf6e) unhandled exception > executing api command: [Ljava.lang.String;@230a6e7f > com.cloud.utils.exception.CloudRuntimeException: DB Exception on: > com.mysql.jdbc.JDBC4PreparedStatement@74f134e3: DELETE FROM > instance_group_vm_map WHERE instance_group_vm_map.instance_id = 941374 > at com.cloud.utils.db.GenericDaoBase.expunge(GenericDaoBase.java:1209) > at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317) > at > org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183) > at > org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150) > at > com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34) > at > org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161) > at > org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91) > at > org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172) > at > org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204) > at com.sun.proxy.$Proxy237.expunge(Unknown Source) > at > com.cloud.vm.UserVmManagerImpl$2.doInTransactionWithoutResult(UserVmManagerImpl.java:2593) > at > com.cloud.utils.db.TransactionCallbackNoReturn.doInTransaction(TransactionCallbackNoReturn.java:25) > at com.cloud.utils.db.Transaction$2.doInTransaction(Transaction.java:57) > at com.cloud.utils.db.Transaction.execute(Transaction.java:45) > at 
com.cloud.utils.db.Transaction.execute(Transaction.java:54) > at > com.cloud.vm.UserVmManagerImpl.addInstanceToGroup(UserVmManagerImpl.java:2575) > at > com.cloud.vm.UserVmManagerImpl.updateVirtualMachine(UserVmManagerImpl.java:2332) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CLOUDSTACK-9595) Transactions are not getting retried in case of database deadlock errors
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15664656#comment-15664656 ] ASF GitHub Bot commented on CLOUDSTACK-9595: Github user serg38 commented on the issue: https://github.com/apache/cloudstack/pull/1762 @jburwell @yvsubhash My understanding is that all rolled-back statements will receive MYSQL_DEADLOCK_ERROR_CODE and will be retried as part of this patch. > Transactions are not getting retried in case of database deadlock errors > > > Key: CLOUDSTACK-9595 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9595 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) >Affects Versions: 4.8.0 >Reporter: subhash yedugundla > Fix For: 4.8.1 > > > Customer is seeing occasional error 'Deadlock found when trying to get lock; > try restarting transaction' messages in their management server logs. It > happens regularly at least once a day. 
The following is the error seen > 2015-12-09 19:23:19,450 ERROR [cloud.api.ApiServer] > (catalina-exec-3:ctx-f05c58fc ctx-39c17156 ctx-7becdf6e) unhandled exception > executing api command: [Ljava.lang.String;@230a6e7f > com.cloud.utils.exception.CloudRuntimeException: DB Exception on: > com.mysql.jdbc.JDBC4PreparedStatement@74f134e3: DELETE FROM > instance_group_vm_map WHERE instance_group_vm_map.instance_id = 941374 > at com.cloud.utils.db.GenericDaoBase.expunge(GenericDaoBase.java:1209) > at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317) > at > org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183) > at > org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150) > at > com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34) > at > org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161) > at > org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91) > at > org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172) > at > org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204) > at com.sun.proxy.$Proxy237.expunge(Unknown Source) > at > com.cloud.vm.UserVmManagerImpl$2.doInTransactionWithoutResult(UserVmManagerImpl.java:2593) > at > com.cloud.utils.db.TransactionCallbackNoReturn.doInTransaction(TransactionCallbackNoReturn.java:25) > at com.cloud.utils.db.Transaction$2.doInTransaction(Transaction.java:57) > at com.cloud.utils.db.Transaction.execute(Transaction.java:45) > at 
com.cloud.utils.db.Transaction.execute(Transaction.java:54) > at > com.cloud.vm.UserVmManagerImpl.addInstanceToGroup(UserVmManagerImpl.java:2575) > at > com.cloud.vm.UserVmManagerImpl.updateVirtualMachine(UserVmManagerImpl.java:2332) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CLOUDSTACK-9590) KVM + CentOS 7.2 + Agent in Alert State for long time
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15664298#comment-15664298 ] Sven Vogel commented on CLOUDSTACK-9590: In the management server log I always see entries like these: {code} 2016-11-14 16:44:56,034 WARN [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-15:ctx-f99e936f) (logid:6cdf713e) Monitor NetworkOrchestrator says there is an error in the connect process for 70 due to Unable to get an answer to the Check NetworkCommand from agent: 70 2016-11-14 16:44:56,034 INFO [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-15:ctx-f99e936f) (logid:6cdf713e) Host 70 is disconnecting with event AgentDisconnected 2016-11-14 16:44:56,040 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-15:ctx-f99e936f) (logid:6cdf713e) The next status of agent 70is Alert, current status is Connecting 2016-11-14 16:44:56,040 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-15:ctx-f99e936f) (logid:6cdf713e) Deregistering link for 70 with state Alert 2016-11-14 16:44:56,040 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-15:ctx-f99e936f) (logid:6cdf713e) Remove Agent : 70 2016-11-14 16:44:56,040 DEBUG [c.c.a.m.ConnectedAgentAttache] (AgentConnectTaskPool-15:ctx-f99e936f) (logid:6cdf713e) Processing Disconnect. 
2016-11-14 16:44:56,040 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-15:ctx-f99e936f) (logid:6cdf713e) Sending Disconnect to listener: com.cloud.hypervisor.xenserver.discoverer.XcpServerDiscoverer 2016-11-14 16:44:56,040 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-15:ctx-f99e936f) (logid:6cdf713e) Sending Disconnect to listener: com.cloud.hypervisor.hyperv.discoverer.HypervServerDiscoverer 2016-11-14 16:44:56,040 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-15:ctx-f99e936f) (logid:6cdf713e) Sending Disconnect to listener: com.cloud.storage.secondary.SecondaryStorageListener 2016-11-14 16:44:56,040 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-15:ctx-f99e936f) (logid:6cdf713e) Sending Disconnect to listener: com.cloud.storage.listener.StoragePoolMonitor 2016-11-14 16:44:56,040 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-15:ctx-f99e936f) (logid:6cdf713e) Sending Disconnect to listener: com.cloud.vm.ClusteredVirtualMachineManagerImpl 2016-11-14 16:44:56,040 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-15:ctx-f99e936f) (logid:6cdf713e) Sending Disconnect to listener: com.cloud.network.security.SecurityGroupListener 2016-11-14 16:44:56,040 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-15:ctx-f99e936f) (logid:6cdf713e) Sending Disconnect to listener: com.cloud.deploy.DeploymentPlanningManagerImpl 2016-11-14 16:44:56,040 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-15:ctx-f99e936f) (logid:6cdf713e) Sending Disconnect to listener: org.apache.cloudstack.engine.orchestration.NetworkOrchestrator 2016-11-14 16:44:56,040 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-15:ctx-f99e936f) (logid:6cdf713e) Sending Disconnect to listener: com.cloud.network.SshKeysDistriMonitor 2016-11-14 16:44:56,040 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-15:ctx-f99e936f) (logid:6cdf713e) Sending Disconnect to listener: com.cloud.network.router.VpcVirtualNetworkApplianceManagerImpl 2016-11-14 
16:44:56,041 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-15:ctx-f99e936f) (logid:6cdf713e) Sending Disconnect to listener: com.cloud.storage.LocalStoragePoolListener 2016-11-14 16:44:56,041 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-15:ctx-f99e936f) (logid:6cdf713e) Sending Disconnect to listener: com.cloud.capacity.StorageCapacityListener 2016-11-14 16:44:56,041 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-15:ctx-f99e936f) (logid:6cdf713e) Sending Disconnect to listener: com.cloud.capacity.ComputeCapacityListener 2016-11-14 16:44:56,041 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-15:ctx-f99e936f) (logid:6cdf713e) Sending Disconnect to listener: com.cloud.network.SshKeysDistriMonitor 2016-11-14 16:44:56,041 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-15:ctx-f99e936f) (logid:6cdf713e) Sending Disconnect to listener: com.cloud.network.router.VirtualNetworkApplianceManagerImpl 2016-11-14 16:44:56,041 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-15:ctx-f99e936f) (logid:6cdf713e) Sending Disconnect to listener: com.cloud.storage.upload.UploadListener 2016-11-14 16:44:56,041 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-15:ctx-f99e936f) (logid:6cdf713e) Sending Disconnect to listener: com.cloud.network.NetworkUsageManagerImpl$DirectNetworkStatsListener 2016-11-14 16:44:56,041 DEBUG [c.c.n.NetworkUsageManagerImpl] (AgentConnectTaskPool-15:ctx-f99e936f) (logid:6cdf713e) Disconnected called on 70 with status Alert 2016-11-14 16:44:56,041 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-15:ctx-f99e936f)
[jira] [Commented] (CLOUDSTACK-9595) Transactions are not getting retried in case of database deadlock errors
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15664293#comment-15664293 ] ASF GitHub Bot commented on CLOUDSTACK-9595: Github user jburwell commented on the issue: https://github.com/apache/cloudstack/pull/1762 @yvsubhash according to the [MySQL deadlock documentation](http://dev.mysql.com/doc/refman/5.7/en/innodb-deadlocks.html), a `MYSQL_DEADLOCK_ERROR_CODE` error indicates the enclosing transaction has been rolled back. The proper handling for this error is to re-execute all statements executed in the aborted transaction. From a best practices perspective, all base data should be re-retrieved and changed to ensure logical consistency with changes made by the transaction that won deadlock resolution. As I understand this patch, only the most recently executed DML statement is retried. Therefore, any previously executed changes will be discarded and the DML statement will be re-executed either in a new transaction or in auto-commit mode (I didn't look up how the client handles the transaction context in this scenario). If my understanding is correct, this patch could lead to issues ranging from unexpected foreign key integrity errors to data corruption. Rather than attempting to implement a generic retry, I think the best approach to addressing deadlocks is to treat them as bugs. This patch could be modified to log detailed information about the conditions under which a deadlock occurs, providing what is needed to refactor the system to avoid lock contention. > Transactions are not getting retried in case of database deadlock errors > > > Key: CLOUDSTACK-9595 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9595 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) 
>Affects Versions: 4.8.0 >Reporter: subhash yedugundla > Fix For: 4.8.1 > > > Customer is seeing occasional error 'Deadlock found when trying to get lock; > try restarting transaction' messages in their management server logs. It > happens regularly at least once a day. The following is the error seen > 2015-12-09 19:23:19,450 ERROR [cloud.api.ApiServer] > (catalina-exec-3:ctx-f05c58fc ctx-39c17156 ctx-7becdf6e) unhandled exception > executing api command: [Ljava.lang.String;@230a6e7f > com.cloud.utils.exception.CloudRuntimeException: DB Exception on: > com.mysql.jdbc.JDBC4PreparedStatement@74f134e3: DELETE FROM > instance_group_vm_map WHERE instance_group_vm_map.instance_id = 941374 > at com.cloud.utils.db.GenericDaoBase.expunge(GenericDaoBase.java:1209) > at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317) > at > org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183) > at > org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150) > at > com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34) > at > org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161) > at > org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91) > at > org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172) > at > org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204) > at com.sun.proxy.$Proxy237.expunge(Unknown Source) > at > 
com.cloud.vm.UserVmManagerImpl$2.doInTransactionWithoutResult(UserVmManagerImpl.java:2593) > at > com.cloud.utils.db.TransactionCallbackNoReturn.doInTransaction(TransactionCallbackNoReturn.java:25) > at com.cloud.utils.db.Transaction$2.doInTransaction(Transaction.java:57) > at com.cloud.utils.db.Transaction.execute(Transaction.java:45) > at com.cloud.utils.db.Transaction.execute(Transaction.java:54) > at > com.cloud.vm.UserVmManagerImpl.addInstanceToGroup(UserVmManagerImpl.java:2575) > at > com.cloud.vm.UserVmManagerImpl.updateVirtualMachine(UserVmManagerImpl.java:2332) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
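The point made in the comment above, that a deadlock rolls back the whole transaction and so every statement in it has to be re-executed against freshly read data, can be sketched as follows. This is a minimal illustration in Python, not CloudStack's Java `Transaction` code and not the patch under review; `DeadlockError`, `run_with_deadlock_retry`, and the in-memory `FakeDb` are hypothetical names invented for the example.

```python
class DeadlockError(Exception):
    """Stand-in for a driver error carrying MySQL error 1213 (deadlock)."""


def run_with_deadlock_retry(transaction_body, max_attempts=3):
    """Re-run the entire transaction body when a deadlock aborts it.

    transaction_body must re-read its base data on every call, because the
    server has already rolled back everything done in the failed attempt.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction_body()
        except DeadlockError:
            if attempt == max_attempts:
                raise  # give up and surface the deadlock to the caller


class FakeDb:
    """In-memory fake that deadlocks a configurable number of times."""

    def __init__(self, fail_times):
        self.fail_times = fail_times
        self.committed = []
        self.calls = 0

    def transaction(self, statements):
        # All statements commit together or not at all.
        self.calls += 1
        if self.fail_times > 0:
            self.fail_times -= 1
            raise DeadlockError("Deadlock found when trying to get lock")
        self.committed.extend(statements)


def demo():
    db = FakeDb(fail_times=1)

    def move_vm_to_group():
        # Both statements form one unit of work and are retried together.
        db.transaction(["delete old group mapping", "insert new group mapping"])
        return len(db.committed)

    print(run_with_deadlock_retry(move_vm_to_group))
```

The retry wraps the whole unit of work rather than the last statement, which is the distinction the comment draws. Whether a generic retry is appropriate at all remains the open question in this thread; the comment itself recommends logging the deadlock conditions and removing the lock contention instead.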
[jira] [Comment Edited] (CLOUDSTACK-9590) KVM + CentOS 7.2 + Agent in Alert State for long time
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15664147#comment-15664147 ] Sven Vogel edited comment on CLOUDSTACK-9590 at 11/14/16 3:11 PM: -- 1. First of all, I add the host from CS management 2. Management Host {code} cloudstack-setup-agent -m 192.168.85.25 -z 3 -p 3 -c 9 -g 6e6cff15-3183-3cca-9389-ed1a78f6236a -a --pubNic=cloudbr2 --prvNic=cloudbr0 --guestNic=cloudbr1 --hypervisor=kvm {code} 3. The agent is dead after adding the host 4. Restart the agent 5. The agent reconnects to the server and stays in the Alert state was (Author: sven.vogel): 1. first of all i add the host from cs management 2. Management Host -- {code} cloudstack-setup-agent -m 192.168.85.25 -z 3 -p 3 -c 9 -g 6e6cff15-3183-3cca-9389-ed1a78f6236a -a --pubNic=cloudbr2 --prvNic=cloudbr0 --guestNic=cloudbr1 --hypervisor=kvm {code} 3. agent will be dead after add the host 4. restart agent 6. agent reconnect to server and wait with alert > KVM + CentOS 7.2 + Agent in Alert State for long time > - > > Key: CLOUDSTACK-9590 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9590 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) > Components: cloudstack-agent >Affects Versions: 4.9.0 > Environment: CentOS Linux release 7.2.1511 (Core) > cloudstack-agent-4.9.0-1.el7.centos.x86_64 >Reporter: Sven Vogel > Attachments: agent.log, cloudstack-startup.log, management-server.zip > > > Hi, > When i add a new host to cloudstack management server it take some time to > get host out from alert state. > 1. i add the host and host add not possible > 2. values are correct set to agent.properties, restart cloudstack agent > 3. agent says connected to server > 4. 
management server says "alert" > management-server.log > 2016-11-10 13:23:06,783 DEBUG [c.c.h.Status] > (AgentConnectTaskPool-49:ctx-c3b72839) (logid:5a86e1fb) Transition:[Resource > state = Enabled, Agent event = AgentDisconnected, Host > id = 51, name = kvm02.oscloud.local] > 2016-11-10 13:23:06,798 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] > (AgentConnectTaskPool-49:ctx-c3b72839) (logid:5a86e1fb) Notifying other nodes > of to disconnect > 2016-11-10 13:23:06,806 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-49:ctx-c3b72839) (logid:5a86e1fb) Failed to handle host > connection: com.cloud.exception.Connection > Exception: Unable to get an answer to the CheckNetworkCommand from agent: 51 > is there any way to speed up the alert state? is it normal that it take so > long? > thanks > Sven -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CLOUDSTACK-9590) KVM + CentOS 7.2 + Agent in Alert State for long time
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15664147#comment-15664147 ] Sven Vogel commented on CLOUDSTACK-9590: 1. First of all, I add the host from CS management 2. Management Host -- {code} cloudstack-setup-agent -m 192.168.85.25 -z 3 -p 3 -c 9 -g 6e6cff15-3183-3cca-9389-ed1a78f6236a -a --pubNic=cloudbr2 --prvNic=cloudbr0 --guestNic=cloudbr1 --hypervisor=kvm {code} 3. The agent is dead after adding the host 4. Restart the agent 5. The agent reconnects to the server and stays in the Alert state > KVM + CentOS 7.2 + Agent in Alert State for long time > - > > Key: CLOUDSTACK-9590 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9590 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) > Components: cloudstack-agent >Affects Versions: 4.9.0 > Environment: CentOS Linux release 7.2.1511 (Core) > cloudstack-agent-4.9.0-1.el7.centos.x86_64 >Reporter: Sven Vogel > Attachments: agent.log, cloudstack-startup.log, management-server.zip > > > Hi, > When i add a new host to cloudstack management server it take some time to > get host out from alert state. > 1. i add the host and host add not possible > 2. values are correct set to agent.properties, restart cloudstack agent > 3. agent says connected to server > 4. 
management server says "alert" > management-server.log > 2016-11-10 13:23:06,783 DEBUG [c.c.h.Status] > (AgentConnectTaskPool-49:ctx-c3b72839) (logid:5a86e1fb) Transition:[Resource > state = Enabled, Agent event = AgentDisconnected, Host > id = 51, name = kvm02.oscloud.local] > 2016-11-10 13:23:06,798 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] > (AgentConnectTaskPool-49:ctx-c3b72839) (logid:5a86e1fb) Notifying other nodes > of to disconnect > 2016-11-10 13:23:06,806 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-49:ctx-c3b72839) (logid:5a86e1fb) Failed to handle host > connection: com.cloud.exception.Connection > Exception: Unable to get an answer to the CheckNetworkCommand from agent: 51 > is there any way to speed up the alert state? is it normal that it take so > long? > thanks > Sven -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CLOUDSTACK-9590) KVM + CentOS 7.2 + Agent in Alert State for long time
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15664111#comment-15664111 ] Sven Vogel commented on CLOUDSTACK-9590: No maintenance mode. The hosts are freshly installed and added to CS. After that they stay in alert mode for a long time. > KVM + CentOS 7.2 + Agent in Alert State for long time > - > > Key: CLOUDSTACK-9590 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9590 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) > Components: cloudstack-agent >Affects Versions: 4.9.0 > Environment: CentOS Linux release 7.2.1511 (Core) > cloudstack-agent-4.9.0-1.el7.centos.x86_64 >Reporter: Sven Vogel > Attachments: agent.log, cloudstack-startup.log, management-server.zip > > > Hi, > When i add a new host to cloudstack management server it take some time to > get host out from alert state. > 1. i add the host and host add not possible > 2. values are correct set to agent.properties, restart cloudstack agent > 3. agent says connected to server > 4. management server says "alert" > management-server.log > 2016-11-10 13:23:06,783 DEBUG [c.c.h.Status] > (AgentConnectTaskPool-49:ctx-c3b72839) (logid:5a86e1fb) Transition:[Resource > state = Enabled, Agent event = AgentDisconnected, Host > id = 51, name = kvm02.oscloud.local] > 2016-11-10 13:23:06,798 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] > (AgentConnectTaskPool-49:ctx-c3b72839) (logid:5a86e1fb) Notifying other nodes > of to disconnect > 2016-11-10 13:23:06,806 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-49:ctx-c3b72839) (logid:5a86e1fb) Failed to handle host > connection: com.cloud.exception.Connection > Exception: Unable to get an answer to the CheckNetworkCommand from agent: 51 > is there any way to speed up the alert state? is it normal that it take so > long? > thanks > Sven -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CLOUDSTACK-9557) Deploy from VMsnapshot fails with exception if source template is removed or made private
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15663798#comment-15663798 ] ASF GitHub Bot commented on CLOUDSTACK-9557: Github user yvsubhash commented on the issue: https://github.com/apache/cloudstack/pull/1721 @rhtyd I will merge this into #1664 once the conflicts are resolved in the other one > Deploy from VMsnapshot fails with exception if source template is removed or > made private > - > > Key: CLOUDSTACK-9557 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9557 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) > Components: Template >Affects Versions: 4.8.0 > Environment: Any Hypervisor >Reporter: subhash yedugundla > Fix For: 4.8.1 > > > Steps to reproduce the issue > i) Upload a template as admin user and make sure "public" is selected when > uploading it. > ii) Now login as a user to CloudStack and deploy a VM with the template > created in step i). > iii) Create a VM snapshot as the user for the VM in step ii). Once created > deploy a VM from the snapshot ( this will work as expected) > iv) Now login as admin again , edit the template created in step i) and > Uncheck "public". This will make the template private ( or else delete the > template from UI) > v) Login as same user as in step ii) and try to create a VM from the same > snapshot ( created in step iii)). This will fail now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CLOUDSTACK-9572) Snapshot on primary storage not cleaned up after Storage migration
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15663510#comment-15663510 ] ASF GitHub Bot commented on CLOUDSTACK-9572: GitHub user yvsubhash reopened a pull request: https://github.com/apache/cloudstack/pull/1740 CLOUDSTACK-9572 Snapshot on primary storage not cleaned up after Stor… Snapshot on primary storage not cleaned up after Storage migration. This happens in the following scenario ## Steps To Reproduce 1. Create an instance on the local storage on any host 2. Create a scheduled snapshot of the volume: 3. Wait until ACS created the snapshot. ACS is creating a snapshot on local storage and is transferring this snapshot to secondary storage. But the latest snapshot on local storage will stay there. This is as expected. 4. Migrate the instance to another XenServer host with ACS UI and Storage Live Migration 5. The Snapshot on the old host on local storage will not be cleaned up and is staying on local storage. So local storage will fill up with unneeded snapshots. You can merge this pull request into a Git repository by running: $ git pull https://github.com/yvsubhash/cloudstack CLOUDSTACK-9572 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/cloudstack/pull/1740.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1740 commit 13820fdae5a22573db1c964f02e37d232228b3d8 Author: subhash yedugundla Date: 2016-09-12T13:29:53Z CLOUDSTACK-9572 Snapshot on primary storage not cleaned up after Storage migration > Snapshot on primary storage not cleaned up after Storage migration > -- > > Key: CLOUDSTACK-9572 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9572 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) 
> Components: Storage Controller >Affects Versions: 4.8.0 > Environment: Xen Server >Reporter: subhash yedugundla > Fix For: 4.8.1 > > > Issue Description > === > 1. Create an instance on the local storage on any host > 2. Create a scheduled snapshot of the volume: > 3. Wait until ACS created the snapshot. ACS is creating a snapshot on local > storage and is transferring this snapshot to secondary storage. But the > latest snapshot on local storage will stay there. This is as expected. > 4. Migrate the instance to another XenServer host with ACS UI and Storage > Live Migration > 5. The Snapshot on the old host on local storage will not be cleaned up and > is staying on local storage. So local storage will fill up with unneeded > snapshots. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CLOUDSTACK-9572) Snapshot on primary storage not cleaned up after Storage migration
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15663505#comment-15663505 ] ASF GitHub Bot commented on CLOUDSTACK-9572: Github user yvsubhash closed the pull request at: https://github.com/apache/cloudstack/pull/1740 > Snapshot on primary storage not cleaned up after Storage migration > -- > > Key: CLOUDSTACK-9572 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9572 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) > Components: Storage Controller >Affects Versions: 4.8.0 > Environment: Xen Server >Reporter: subhash yedugundla > Fix For: 4.8.1 > > > Issue Description > === > 1. Create an instance on the local storage on any host > 2. Create a scheduled snapshot of the volume: > 3. Wait until ACS created the snapshot. ACS is creating a snapshot on local > storage and is transferring this snapshot to secondary storage. But the > latest snapshot on local storage will stay there. This is as expected. > 4. Migrate the instance to another XenServer host with ACS UI and Storage > Live Migration > 5. The Snapshot on the old host on local storage will not be cleaned up and > is staying on local storage. So local storage will fill up with unneeded > snapshots. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CLOUDSTACK-9590) KVM + CentOS 7.2 + Agent in Alert State for long time
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15663308#comment-15663308 ] Wei Zhou commented on CLOUDSTACK-9590: -- Is the host in Maintenance? > KVM + CentOS 7.2 + Agent in Alert State for long time > - > > Key: CLOUDSTACK-9590 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9590 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) > Components: cloudstack-agent >Affects Versions: 4.9.0 > Environment: CentOS Linux release 7.2.1511 (Core) > cloudstack-agent-4.9.0-1.el7.centos.x86_64 >Reporter: Sven Vogel > Attachments: agent.log, cloudstack-startup.log, management-server.zip > > > Hi, > When i add a new host to cloudstack management server it take some time to > get host out from alert state. > 1. i add the host and host add not possible > 2. values are correct set to agent.properties, restart cloudstack agent > 3. agent says connected to server > 4. management server says "alert" > management-server.log > 2016-11-10 13:23:06,783 DEBUG [c.c.h.Status] > (AgentConnectTaskPool-49:ctx-c3b72839) (logid:5a86e1fb) Transition:[Resource > state = Enabled, Agent event = AgentDisconnected, Host > id = 51, name = kvm02.oscloud.local] > 2016-11-10 13:23:06,798 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] > (AgentConnectTaskPool-49:ctx-c3b72839) (logid:5a86e1fb) Notifying other nodes > of to disconnect > 2016-11-10 13:23:06,806 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-49:ctx-c3b72839) (logid:5a86e1fb) Failed to handle host > connection: com.cloud.exception.Connection > Exception: Unable to get an answer to the CheckNetworkCommand from agent: 51 > is there any way to speed up the alert state? is it normal that it take so > long? > thanks > Sven -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CLOUDSTACK-9370) Failed to create VPC: Unable to start VPC VR (VM DomainRouter) due to error in finalizeStart, not retrying
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15663303#comment-15663303 ] yang commented on CLOUDSTACK-9370: -- Can you tell us how to fix this bug, or do we need to wait for the new release? > Failed to create VPC: Unable to start VPC VR (VM DomainRouter) due to error > in finalizeStart, not retrying > --- > > Key: CLOUDSTACK-9370 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9370 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) > Components: Virtual Router >Affects Versions: 4.9.0 > Environment: Centos El7 > KVM > OpenvSwitch (VLAN 0) > NFS (primary/secondary) >Reporter: Mani Prashanth Varma Manthena >Priority: Critical > Fix For: 4.9.1.0 > > > I am unable to create VPCs on latest cloudstack master due to the following > error: > {noformat:title=Root Cause Error in Agent log} > 2016-04-27 02:31:03,134 DEBUG [kvm.resource.LibvirtComputingResource] > (agentRequest-Handler-2:null) (logid:6b2d4faa) [INFO] update_config.py :: > Processing incoming file => ip_associations.json[INFO] Processing JSON file > ip_associations.jsonTraceback (most recent call last): File > "/opt/cloud/bin/update_config.py", line 140, in process_file() > File "/opt/cloud/bin/update_config.py", line 52, in process_file > qf.load(None) File "/opt/cloud/bin/merge.py", line 258, in loadproc = > updateDataBag(self) File "/opt/cloud/bin/merge.py", line 91, in __init__ > self.process() File "/opt/cloud/bin/merge.py", line 103, in processdbag > = self.processIP(self.db.getDataBag()) File "/opt/cloud/bin/merge.py", line > 190, in processIPdbag = cs_ip.merge(dbag, ip) File > "/opt/cloud/bin/cs_ip.py", line 32, in mergeip['device'] = 'eth' + > str(ip['nic_dev_id'])KeyError: 'nic_dev_id' > 2016-04-27 02:31:03,135 DEBUG > [resource.virtualnetwork.VirtualRoutingResource] > (agentRequest-Handler-2:null) (logid:6b2d4faa) Processing 
ScriptConfigItem,
> executing update_config.py ip_associations.json took 911ms
> {noformat}
> {noformat:title=Root Cause Error in Management Server log}
> 2016-04-27 02:30:19,975 DEBUG [c.c.a.m.ClusteredAgentAttache]
> (Work-Job-Executor-10:ctx-1279b068 job-1159/job-1160 ctx-c31efe73)
> (logid:6b2d4faa) Seq 9-332421947495286159: Forwarding Seq
> 9-332421947495286159: { Cmd , MgmtId: 275619427298304, via:
> 9(ovs-2.mvdcvtb16.us.alcatel-lucent.com), Ver: v1, Flags: 100111,
> [{"com.cloud.agent.api.StartCommand":{"vm":{"id":252,"name":"r-252-VM","type":"DomainRouter","cpus":1,"minSpeed":500,"maxSpeed":500,"minRam":268435456,"maxRam":268435456,"arch":"x86_64","os":"Debian
> GNU/Linux 5.0 (64-bit)","platformEmulator":"Debian GNU/Linux 5","bootArgs":"
> vpccidr=10.1.1.1/16 domain=cs2cloud.internal dns1=128.251.10.29 template=domP
> name=r-252-VM eth0ip=169.254.1.123 eth0mask=255.255.0.0 type=vpcrouter
> disable_rp_filter=true
> baremetalnotificationsecuritykey=0oLpL4swbL6Yu_xsuRdyjwmmyPHAU1V-iMpmMNKO00vNIP5bxronvhQZ_qehiEZ99Eo9avCHg9uLh1cbiz7pQA
>
> baremetalnotificationapikey=wEax_CyEaKZHn8ZkPBQLQaibjSWZ0OYJuEQA3l2RUA41GXZxaie9P6oQPeNlzjIGl-fDpKWp9MkAEQOJYvE4vA
> host=10.31.59.151
>
port=8080","enableHA":true,"limitCpuUse":false,"enableDynamicallyScaleVm":false,"vncPassword":"0Q5cib8VX0wIh1nvNsJktw","params":{},"uuid":"2e54aa49-ae38-405d-b1fe-f14b30053ab6","disks":[{"data":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"767e7794-1c87-4957-834a-a92eb711a15d","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"bf0515dd-fead-3151-b1a0-9d88c468583a","id":1,"poolType":"NetworkFilesystem","host":"mvdcvtb16.us.alcatel-lucent.com","path":"/mvdcvtb16/storage","port":2049,"url":"NetworkFilesystem://mvdcvtb16.us.alcatel-lucent.com/mvdcvtb16/storage/?ROLE=Primary&STOREUUID=bf0515dd-fead-3151-b1a0-9d88c468583a"}},"name":"ROOT-252","size":322954240,"path":"767e7794-1c87-4957-834a-a92eb711a15d","volumeId":252,"vmName":"r-252-VM","accountId":2,"format":"QCOW2","provisioningType":"THIN","id":252,"deviceId":0,"hypervisorType":"KVM"}},"diskSeq":0,"path":"767e7794-1c87-4957-834a-a92eb711a15d","type":"ROOT","_details":{"managed":"false","storagePort":"2049","storageHost":"mvdcvtb16.us.alcatel-lucent.com","volumeSize":"322954240"}}],"nics":[{"deviceId":0,"networkRateMbps":-1,"defaultNic":false,"pxeDisable":true,"nicUuid":"bf300ca0-afdb-4277-ac71-b7d0d041e29a","uuid":"1ec6ef5d-bed2-475a-9abb-122750fa8ea5","ip":"169.254.1.123","netmask":"255.255.0.0","gateway":"169.254.0.1","mac":"0e:00:a9:fe:01:7b","broadcastType":"LinkLocal","type":"Control","isSecurityGroupEnabled":false}]},"hostIp":"10.100.100.12"