[ https://issues.apache.org/jira/browse/CLOUDSTACK-3896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057197#comment-14057197 ]
Mandar Barve commented on CLOUDSTACK-3896:
------------------------------------------

Looking into the logs, here are a few things that I see:

The management server log shows that pool id 12, named primaryZone2, threw an exception for the deletePool command.

2013-07-29 14:43:45,503 ERROR [cloud.api.ApiServer] (catalina-exec-20:null) unhandled exception executing api command: deleteStoragePool
com.cloud.utils.exception.CloudRuntimeException: Cannot delete pool primaryZone2 as there are associated volumes for this pool
        at com.cloud.storage.StorageManagerImpl.deletePool(StorageManagerImpl.java:829)
        at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
        at org.apache.cloudstack.api.command.admin.storage.DeletePoolCmd.execute(DeletePoolCmd.java:78)
        at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:158)
        at com.cloud.api.ApiServer.queueCommand(ApiServer.java:514)
        at com.cloud.api.ApiServer.handleRequest(ApiServer.java:372)
        at com.cloud.api.ApiServlet.processRequest(ApiServlet.java:305)
        at com.cloud.api.ApiServlet.doGet(ApiServlet.java:66)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:555)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
        at org.apache.coyote.http11.Http11NioProcessor.process(Http11NioProcessor.java:889)
        at org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:721)
        at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:2268)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:679)

The pool has a template with id = 1 in Ready state. The volumes table has a couple of volumes that are in Expunged state and refer to this template.

mysql> select * from volumes where template_id=1 AND pool_id=12\G;
*************************** 1. row ***************************
  id: 27
  account_id: 1
  domain_id: 1
  pool_id: 12
  last_pool_id: NULL
  instance_id: 24
  device_id: 0
  name: ROOT-24
  uuid: c43cf1b3-238c-4d1f-b55b-d2fb5150bc4e
  size: 2097152000
  folder: NULL
  path: 6501771c-65d5-4971-adff-e1f03626bac5
  pod_id: NULL
  data_center_id: 2
  iscsi_name: NULL
  host_ip: NULL
  volume_type: ROOT
  pool_type: NULL
  disk_offering_id: 9
  template_id: 1
  first_snapshot_backup_uuid: NULL
  recreatable: 1
  created: 2013-07-29 07:22:12
  attached: NULL
  updated: 2013-07-29 09:14:04
  removed: 2013-07-29 09:14:04
  state: Expunged
  chain_info: NULL
  update_count: 6
  disk_type: NULL
  display_volume: 0
  format: VHD
  min_iops: NULL
  max_iops: NULL
*************************** 2. row ***************************
  id: 28
  account_id: 1
  domain_id: 1
  pool_id: 12
  last_pool_id: NULL
  instance_id: 25
  device_id: 0
  name: ROOT-25
  uuid: 92ff2b77-d2aa-4e0f-9508-b5414df2730f
  size: 2097152000
  folder: NULL
  path: 2c36952b-2584-4b60-aff9-9b42cbc92258
  pod_id: NULL
  data_center_id: 2
  iscsi_name: NULL
  host_ip: NULL
  volume_type: ROOT
  pool_type: NULL
  disk_offering_id: 11
  template_id: 1
  first_snapshot_backup_uuid: NULL
  recreatable: 1
  created: 2013-07-29 07:22:13
  attached: NULL
  updated: 2013-07-29 09:14:07
  removed: 2013-07-29 09:14:07
  state: Expunged
  chain_info: NULL
  update_count: 6
  disk_type: NULL
  display_volume: 0
  format: VHD
  min_iops: NULL
  max_iops: NULL
2 rows in set (0.00 sec)

The management server logs also show a message that says "Storage pool garbage collector found 0 templates to cleanup in storage pool primaryZone2", which is a little confusing. This code looks to clean up those templates that are "unused". It checks that the template is not a router template, is already DOWNLOADED, and has no references in the volumes table. This template should really be "UNUSED", since both the volumes referring to it are in 'Expunged' state and the following query returns a count of 0:

mysql> SELECT COUNT(*) FROM volumes WHERE volumes.pool_id=12 AND volumes.template_id=1 AND volumes.removed IS NULL;
+----------+
| COUNT(*) |
+----------+
|        0 |
+----------+

Why does the garbage collector find 0 "unused" templates on this storage pool? This code checks all template ids on the storage pool, and for each one it checks whether the template in the vm_template table for that ID is marked as type SYSTEM. This template appears to be marked as SYSTEM and, as a result, will be considered to be in use.

mysql> select * from vm_template where id=1\G
*************************** 1. row ***************************
  id: 1
  unique_name: routing-1
  name: SystemVM Template (XenServer)
  uuid: 4cdfb5c8-f4ef-11e2-a91c-069f2c0000aa
  public: 0
  featured: 0
  type: SYSTEM
  hvm: 0
  bits: 64
  url: http://10.147.28.7/templates/acton/acton-systemvm-02062012.vhd.bz2
  format: VHD
  created: 2013-07-25 11:28:59
  removed: NULL
  account_id: 1
  checksum: f613f38c96bf039f2e5cbf92fa8ad4f8
  display_text: SystemVM Template (XenServer)
  enable_password: 0
  enable_sshkey: 0
  guest_os_id: 133
  bootable: 1
  prepopulate: 0
  cross_zones: 1
  extractable: 0
  hypervisor_type: XenServer
  source_template_id: NULL
  template_tag: NULL
  sort_key: 0
  size: NULL
  state: Allocated
  update_count: 0
  updated: NULL
  dynamically_scalable: 0
1 row in set (0.00 sec)

> [PrimaryStorage] deleteStoragePool is not kicking GC for the downloaded system vm templates on Primary Storage
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-3896
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-3896
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.)
>          Components: Storage Controller
>    Affects Versions: 4.2.0
>         Environment: commit # ca474d0e09f772cb22abf2802a308a2da5351592
>            Reporter: venkata swamybabu budumuru
>            Priority: Minor
>             Fix For: 4.4.0
>
>         Attachments: logs.tgz
>
>
> Steps to reproduce:
> 1. Have the latest cloudstack setup with at least 1 advanced zone using XenServer
> 2. make sure that system vm is up and running (which means the system vm is downloaded to the primary storage)
> 3. Disable zone and destroy the system vms
> 4. place the primary & secondary storages in maintenance mode.
> 5. delete both primary and secondary
> # select * from storage_pool where id=12
> *************************** 9. row ***************************
>   id: 12
>   name: primaryZone2
>   uuid: NULL
>   pool_type: NetworkFilesystem
>   port: 2049
>   data_center_id: 2
>   pod_id: 2
>   cluster_id: 2
>   used_bytes: 1993387966464
>   capacity_bytes: 5902284816384
>   host_address: 10.147.28.7
>   user_info: NULL
>   path: /export/home/swamy/primary.campo.xen.1.cluster
>   created: 2013-07-29 07:19:06
>   removed: 2013-07-29 09:14:19
>   update_time: NULL
>   status: Maintenance
>   storage_provider_name: DefaultPrimary
>   scope: CLUSTER
>   hypervisor: NULL
>   managed: 0
>   capacity_iops: NULL
> 6. check cloud.template_spool_ref for the above system vm template.
> Observations:
> (i) template_spool_ref still shows that system vm template as "Ready"
> (ii) storage GC didn't happen for the above template.
> mysql> select * from template_spool_ref where pool_id=12\G
> *************************** 1. row ***************************
>   id: 10
>   pool_id: 12
>   template_id: 1
>   created: 2013-07-29 07:22:12
>   last_updated: NULL
>   job_id: NULL
>   download_pct: 100
>   download_state: DOWNLOADED
>   error_str: NULL
>   local_path: 332cedca-b187-4af8-9d0a-ac3379741211
>   install_path: 332cedca-b187-4af8-9d0a-ac3379741211
>   template_size: 0
>   marked_for_gc: 0
>   state: Ready
>   update_count: 2
>   updated: 2013-07-29 07:36:24
> 1 row in set (0.00 sec)
> (iii) Storage.cleanup.interval is enabled and set to 10 in my setup.
> Attaching all the required logs along with db dump to the bug.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
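
For readers following the analysis in the comment at the top of this message, the "unused template" check it describes can be sketched roughly as below. This is only an illustrative model in Python, not CloudStack's actual Java code: the class names, the function find_unused_templates and its parameters are all hypothetical, and the three conditions (skip SYSTEM templates, require DOWNLOADED, require no volume with removed IS NULL) are taken from the comment's description of the behaviour rather than from the source tree.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Template:
    """Hypothetical stand-in for a vm_template row."""
    id: int
    type: str  # e.g. "SYSTEM" or "USER"


@dataclass
class PoolTemplateRef:
    """Hypothetical stand-in for a template_spool_ref row."""
    pool_id: int
    template_id: int
    download_state: str


@dataclass
class Volume:
    """Hypothetical stand-in for a volumes row."""
    pool_id: int
    template_id: int
    removed: Optional[str]  # None models NULL, i.e. the volume still exists


def find_unused_templates(
    pool_id: int,
    refs: List[PoolTemplateRef],
    templates: Dict[int, Template],
    volumes: List[Volume],
) -> List[int]:
    """Return template ids on the pool that the check would treat as unused."""
    unused = []
    for ref in refs:
        if ref.pool_id != pool_id:
            continue
        # Assumption from the comment: SYSTEM templates always count as in use.
        if templates[ref.template_id].type == "SYSTEM":
            continue
        if ref.download_state != "DOWNLOADED":
            continue
        # Mirrors the COUNT(*) query: only volumes with removed IS NULL count.
        live_refs = sum(
            1
            for v in volumes
            if v.pool_id == pool_id
            and v.template_id == ref.template_id
            and v.removed is None
        )
        if live_refs == 0:
            unused.append(ref.template_id)
    return unused


# Data taken from the rows quoted in this message (pool 12, template 1).
refs = [PoolTemplateRef(pool_id=12, template_id=1, download_state="DOWNLOADED")]
volumes = [
    Volume(pool_id=12, template_id=1, removed="2013-07-29 09:14:04"),
    Volume(pool_id=12, template_id=1, removed="2013-07-29 09:14:07"),
]
system_templates = {1: Template(id=1, type="SYSTEM")}
user_templates = {1: Template(id=1, type="USER")}  # hypothetical variant

as_system = find_unused_templates(12, refs, system_templates, volumes)
as_user = find_unused_templates(12, refs, user_templates, volumes)
print(as_system, as_user)
```

Under these assumptions, the template is skipped only because of its SYSTEM type. With the same rows and a non-SYSTEM type it would be reported as unused, which is consistent with the "found 0 templates to cleanup" log message.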