[
https://issues.apache.org/jira/browse/CLOUDSTACK-7697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213477#comment-14213477
]
Koushik Das commented on CLOUDSTACK-7697:
-----------------------------------------
When a host is detected as 'Down', all VMs running on it are scheduled for
investigation (using HA worker threads). If the investigation shows that a VM
is alive, the HA process for that VM is marked as completed. If it cannot be
conclusively determined whether the VM is alive or dead, the VM is fenced.
After this, if the VM is not HA enabled, the process is marked as completed.
In this case no alert is generated for individual VMs, as this is expected
behaviour.
For SSVM/CPVM, there is separate monitoring logic that detects when there is
no running SSVM/CPVM and spawns a new one. Alerts are generated as part of
this.
In case of non-HA user VMs there is no need for alerts, as a VM that is not
HA enabled is not considered critical. For HA enabled VMs, an attempt is made
to start them on another host, and alerts are generated if there are any
failures.
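The per-VM flow described above can be sketched roughly as follows. This is
only an illustration in Python, not CloudStack's actual (Java) implementation:
the names Vm and process_vm_on_down_host and the step labels are hypothetical,
and skipping the fence step for a VM that is conclusively dead is an
assumption, since the comment only describes the inconclusive case. The
separate SSVM/CPVM monitoring is not modelled.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Vm:
    """Hypothetical stand-in for a VM that was running on the 'Down' host."""
    name: str
    ha_enabled: bool


def process_vm_on_down_host(
    vm: Vm,
    investigation: Optional[bool],
    restart_succeeds: bool,
) -> Tuple[List[str], bool]:
    """Return (steps taken, whether an alert is generated) for one VM.

    investigation: True = alive, False = dead, None = inconclusive.
    restart_succeeds: simulated result of starting the VM on another host.
    """
    steps = ["investigate"]

    # VM found alive: the HA work is simply marked as completed.
    if investigation is True:
        steps.append("complete")
        return steps, False

    # Inconclusive result: fence the VM before going any further.
    # (Assumption: a conclusively dead VM does not need this step.)
    if investigation is None:
        steps.append("fence")

    # Non-HA VM: nothing more to do, and no alert is expected.
    if not vm.ha_enabled:
        steps.append("complete")
        return steps, False

    # HA enabled VM: try another host, alert only if that fails.
    steps.append("restart_on_another_host")
    if restart_succeeds:
        steps.append("complete")
        return steps, False
    steps.append("alert")
    return steps, True


if __name__ == "__main__":
    for vm in (Vm("user-vm-1", ha_enabled=False), Vm("user-vm-2", ha_enabled=True)):
        print(vm.name, process_vm_on_down_host(vm, None, restart_succeeds=False))
```

In this sketch the only path that produces an alert is a failed restart of an
HA enabled VM, which matches the behaviour described in the comment.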
> HA - No alerts being generated when SSVM/CPVM is being HA-ed to a different
> hosts.
> --
>
> Key: CLOUDSTACK-7697
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-7697
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the
> default.)
> Components: Management Server
> Affects Versions: 4.5.0
> Environment: Build from 4.5
> Reporter: Sangeetha Hariharan
> Assignee: Koushik Das
> Fix For: 4.5.0
>
>
> HA - No alerts being generated when SSVM/CPVM is being HA-ed to a different
> hosts.
> Steps to reproduce the problem:
> Zone with 1 cluster having 2 hosts.
> Bring down the master host where the SSVM and CPVM are running.
> All user VMs, the SSVM and the CPVM running on this host are HA-ed to
> another host.
> No alert is generated for the SSVM and CPVM being detected as stopped.
> Also, no events/alerts are generated for the user VMs that were detected
> as stopped and started on a different host.
> Should we expect events/alerts to be generated for these as well?
> mysql> select * from alert;
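> +----+--------------------------------------+------+------------+--------+----------------+---------------------------------------------------------------------------------------------+------------+---------------------+---------------------+----------+----------+--------------------+
> | id | uuid                                 | type | cluster_id | pod_id | data_center_id | subject                                                                                     | sent_count | created             | last_sent           | resolved | archived | name               |
> +----+--------------------------------------+------+------------+--------+----------------+---------------------------------------------------------------------------------------------+------------+---------------------+---------------------+----------+----------+--------------------+
> |  1 | aeef592e-3bb4-431e-911d-16280bf8a8ad |   14 |       NULL |      0 |              0 | Management network CIDR is not configured originally. Set it default to 10.223.130.0/24     |          1 | 2014-10-09 22:19:14 | 2014-10-09 22:19:14 | NULL     |        0 | ALERT.MANAGEMENT   |
> |  2 | 1a0bb67d-9346-4078-a80d-e6669116e7fd |   14 |       NULL |      0 |              0 | Management server node 10.223.130.101 is up                                                 |          1 | 2014-10-09 22:19:16 | 2014-10-09 22:19:16 | NULL     |        0 | ALERT.MANAGEMENT   |
> |  3 | 5c37924e-50cd-413f-a37a-ac275dbc46f9 |   13 |       NULL |      0 |              0 | No usage server process running                                                             |          1 | 2014-10-09 23:19:14 | 2014-10-09 23:19:14 | NULL     |        0 | ALERT.USAGE        |
> |  4 | 4d1b8b64-f59a-4405-a244-14e054297f04 |    2 |          1 |      1 |              1 | System Alert: Low Available Storage in cluster cluster1 pod pod1 of availability zone zone1 |          1 | 2014-10-09 23:39:44 | 2014-10-09 23:39:44 | NULL     |        0 | ALERT.STORAGE      |
> |  5 | aaf9bb96-799c-40d0-a652-96566c7ff47a |    7 |       NULL |      1 |              1 | Host is down, name: Rack3Host20.lab.vmops.com (id:1), availability zone: zone1, pod: pod1   |          1 | 2014-10-10 15:05:41 | 2014-10-10 15:05:41 | NULL     |        0 | ALERT.COMPUTE.HOST |
> +----+--------------------------------------+------+------------+--------+----------------+---------------------------------------------------------------------------------------------+------------+---------------------+---------------------+----------+----------+--------------------+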