Dag
Do you mean check the pools with "Infrastructure -> Primary Storage" and "Infrastructure -> Secondary Storage" within the UI? If so, Primary Storage has a state of Up; Secondary Storage does not show a state as such, so I am not sure where else to check it.

Rerun of the command -

mysql> select * from cloud.storage_pool where cluster_id = 1;
Empty set (0.00 sec)

mysql>

I think it is something to do with my zone creation rather than the NIC/bridge setup, although I can post those if needed. I may try to set up just the 2-NIC solution you mentioned, although as I say I had the same issue with that, i.e. the host goes to "Alert" state with the same error messages. The only time I can get it to go to "Down" state is when it is all on the single NIC.

Quick question just to make sure - assuming management/storage is on the same NIC, when I set up basic networking the physical network has the management and guest icons already there and I just edit the KVM labels. If I am running storage over management, do I need to drag the storage icon to the physical network and use the same KVM label (cloudbr0) as the management, or does CS automatically just use the management NIC? i.e. would I only need to drag the storage icon across in basic setup if I wanted it on a different NIC/IP subnet? (Hope that makes sense!)

On the plus side, I have been at this for so long now and done so many rebuilds that I could do it in my sleep now :)

________________________________
From: Dag Sonstebo <dag.sonst...@shapeblue.com>
Sent: 06 June 2018 12:28
To: users@cloudstack.apache.org
Subject: Re: advanced networking with public IPs direct to VMs

Looks OK to me Jon. The one thing that throws me is your storage pools - can you rerun your query:

select * from cloud.storage_pool where cluster_id = 1;

Do the pools show up as online in the CloudStack GUI?
Regards,
Dag Sonstebo
Cloud Architect
ShapeBlue

On 06/06/2018, 12:08, "Jon Marshall" <jms....@hotmail.co.uk> wrote:

Don't know whether this helps or not but I logged into the SSVM and ran an ifconfig -

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 169.254.3.35  netmask 255.255.0.0  broadcast 169.254.255.255
        ether 0e:00:a9:fe:03:23  txqueuelen 1000  (Ethernet)
        RX packets 141  bytes 20249 (19.7 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 108  bytes 16287 (15.9 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.30.3.34  netmask 255.255.255.192  broadcast 172.30.3.63
        ether 1e:00:3b:00:00:05  txqueuelen 1000  (Ethernet)
        RX packets 56722  bytes 4953133 (4.7 MiB)
        RX errors 0  dropped 44573  overruns 0  frame 0
        TX packets 11224  bytes 1234932 (1.1 MiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.30.4.86  netmask 255.255.255.128  broadcast 172.30.4.127
        ether 1e:00:d9:00:00:53  txqueuelen 1000  (Ethernet)
        RX packets 366191  bytes 435300557 (415.1 MiB)
        RX errors 0  dropped 39456  overruns 0  frame 0
        TX packets 145065  bytes 7978602 (7.6 MiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.30.5.14  netmask 255.255.255.240  broadcast 172.30.5.15
        ether 1e:00:cb:00:00:1a  txqueuelen 1000  (Ethernet)
        RX packets 132440  bytes 426362982 (406.6 MiB)
        RX errors 0  dropped 39446  overruns 0  frame 0
        TX packets 67443  bytes 423670834 (404.0 MiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        loop  txqueuelen 1  (Local Loopback)
        RX packets 18  bytes 1440 (1.4 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 18  bytes 1440 (1.4 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

so it has interfaces in both the management and the storage subnets (as well as guest).
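The subnet membership Jon reads off this ifconfig output can be checked by masking each address. A quick sketch, illustrative only and not CloudStack code (the class and method names here are invented for the example):

```java
// Illustrative only - confirms which of the thread's subnets each SSVM
// interface sits in: eth1 should land in management (172.30.3.0/26),
// eth2 in guest (172.30.4.0/25) and eth3 in storage (172.30.5.0/28).
public class SubnetCheck {

    // Returns the network address of ip under mask, e.g. 172.30.3.34
    // masked with 255.255.255.192 yields 172.30.3.0.
    static String networkOf(String ip, String mask) {
        String[] i = ip.split("\\.");
        String[] m = mask.split("\\.");
        StringBuilder net = new StringBuilder();
        for (int k = 0; k < 4; k++) {
            if (k > 0) net.append('.');
            net.append(Integer.parseInt(i[k]) & Integer.parseInt(m[k]));
        }
        return net.toString();
    }

    public static void main(String[] args) {
        System.out.println("eth1 -> " + networkOf("172.30.3.34", "255.255.255.192")); // management
        System.out.println("eth2 -> " + networkOf("172.30.4.86", "255.255.255.128")); // guest
        System.out.println("eth3 -> " + networkOf("172.30.5.14", "255.255.255.240")); // storage
    }
}
```

Running it prints 172.30.3.0, 172.30.4.0 and 172.30.5.0 respectively, which matches Jon's reading that the SSVM has a leg in each of the three subnets.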
________________________________
From: Jon Marshall <jms....@hotmail.co.uk>
Sent: 06 June 2018 11:08
To: users@cloudstack.apache.org
Subject: Re: advanced networking with public IPs direct to VMs

Hi Rafael

Thanks for the help, really appreciate it.

So, rerunning that command with all servers up -

mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is null;
Empty set (0.00 sec)

mysql>

As for the storage IP, no, I'm not setting it to be the management IP when I set up the zone, but the output of the SQL command suggests that is what has happened. As I said to Dag, I am using a different subnet for storage, i.e.

172.30.3.0/26 - management subnet
172.30.4.0/25 - guest VM subnet
172.30.5.0/28 - storage

The NFS server IP is 172.30.5.2. Each compute node has 3 NICs with an IP from each subnet (I am assuming the management node only needs an IP in the management network?).

When I add the zone in the UI I have one physical network with management (cloudbr0), guest (cloudbr1) and storage (cloudbr2). When I fill in the storage traffic page I use the range 172.30.5.10 - 14 as free IPs, as I exclude the ones already allocated to the compute nodes and the NFS server.

I think maybe I am doing something wrong in the UI setup but it is not obvious to me what it is. What I might try today, unless you want me to keep the setup I have for more outputs, is to go back to 2 NICs: one for storage/management and one for guest VMs. I think with the 2-NIC setup the mistake I made last time when adding the zone was to assume storage would just run over management, so I did not drag and drop the storage icon and assign it to cloudbr0 as with the management, which I think is what I should do?

________________________________
From: Rafael Weingärtner <rafaelweingart...@gmail.com>
Sent: 06 June 2018 10:54
To: users
Subject: Re: advanced networking with public IPs direct to VMs

Jon, do not panic, we are here to help you :)

So, I might have mistyped the SQL query.
If you use "select * from cloud.storage_pool where cluster_id = 1 and removed is not null", you are listing the storage pools that have been removed. Therefore, the right query would be "select * from cloud.storage_pool where cluster_id = 1 and removed is null".

There is also something else I do not understand. Are you setting the storage IP in the management subnet? I am not sure if you should be doing it like this. Normally, I set all my storages (primary [when working with NFS] and secondary) to IPs in the storage subnet.

On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <dag.sonst...@shapeblue.com> wrote:

> Hi Jon,
>
> I'm late to this thread and have possibly missed some things - but a
> couple of observations:
>
> "When I add the zone and get to the storage web page I exclude the IPs
> already used for the compute node NICs and the NFS server itself. ..."
> "So the range is 172.30.5.1 -> 15 and the range I fill in is 172.30.5.10
> -> 172.30.5.14."
>
> I think you may have some confusion around the use of the storage network.
> The important part here is to understand this is for *secondary storage*
> use only - it has nothing to do with primary storage. This means the
> storage network needs to be accessible to the SSVM and to the hypervisors,
> and secondary storage NFS pools need to be accessible on this network.
>
> The important part - this also means you *can not use the same IP ranges
> for management and storage networks* - doing so means you will have issues
> where effectively both hypervisors and SSVM can see the same subnet on two
> NICs - and you end up in a routing black hole.
>
> So - you need to either:
>
> 1) Use different IP subnets on management and storage, or
> 2) preferably just simplify your setup - stop using a secondary storage
> network altogether and just allow secondary storage to use the management
> network (which is the default).
Unless you have a very high I/O environment in production you are just
> adding complexity by running separate management and storage.
>
> Regards,
> Dag Sonstebo
> Cloud Architect
> ShapeBlue
>
> On 06/06/2018, 10:18, "Jon Marshall" <jms....@hotmail.co.uk> wrote:
>
> I will disconnect the host this morning and test, but before I do that
> I ran this command when all hosts are up -
>
> mysql> select * from cloud.host \G
> *************************** 1. row ***************************
> id: 1
> name: dcp-cscn1.local
> uuid: d97b930c-ab5f-4b7d-9243-eabd60012284
> status: Up
> type: Routing
> private_ip_address: 172.30.3.3
> private_netmask: 255.255.255.192
> private_mac_address: 00:22:19:92:4e:34
> storage_ip_address: 172.30.3.3
> storage_netmask: 255.255.255.192
> storage_mac_address: 00:22:19:92:4e:34
> storage_ip_address_2: NULL
> storage_mac_address_2: NULL
> storage_netmask_2: NULL
> cluster_id: 1
> public_ip_address: 172.30.4.3
> public_netmask: 255.255.255.128
> public_mac_address: 00:22:19:92:4e:35
> proxy_port: NULL
> data_center_id: 1
> pod_id: 1
> cpu_sockets: 1
> cpus: 2
> speed: 2999
> url: iqn.1994-05.com.redhat:fa437fb0c023
> fs_type: NULL
> hypervisor_type: KVM
> hypervisor_version: NULL
> ram: 7510159360
> resource: NULL
> version: 4.11.0.0
> parent: NULL
> total_size: NULL
> capabilities: hvm,snapshot
> guid: 9f2b15cb-1b75-321b-bf59-f83e7a5e8efb-LibvirtComputingResource
> available: 1
> setup: 0
> dom0_memory: 0
> last_ping: 1492390408
> mgmt_server_id: 146457912294
> disconnected: 2018-06-05 14:09:22
> created: 2018-06-05 13:44:33
> removed: NULL
> update_count: 4
> resource_state: Enabled
> owner: NULL
> lastUpdated: NULL
> engine_state: Disabled
> *************************** 2. row ***************************
> id: 2
> name: v-2-VM
> uuid: ce1f4594-2b4f-4b2b-a239-3f5e2c2215b0
> status: Up
> type: ConsoleProxy
> private_ip_address: 172.30.3.49
> private_netmask: 255.255.255.192
> private_mac_address: 1e:00:80:00:00:14
> storage_ip_address: 172.30.3.49
> storage_netmask: 255.255.255.192
> storage_mac_address: 1e:00:80:00:00:14
> storage_ip_address_2: NULL
> storage_mac_address_2: NULL
> storage_netmask_2: NULL
> cluster_id: NULL
> public_ip_address: 172.30.4.98
> public_netmask: 255.255.255.128
> public_mac_address: 1e:00:c9:00:00:5f
> proxy_port: NULL
> data_center_id: 1
> pod_id: 1
> cpu_sockets: NULL
> cpus: NULL
> speed: NULL
> url: NoIqn
> fs_type: NULL
> hypervisor_type: NULL
> hypervisor_version: NULL
> ram: 0
> resource: NULL
> version: 4.11.0.0
> parent: NULL
> total_size: NULL
> capabilities: NULL
> guid: Proxy.2-ConsoleProxyResource
> available: 1
> setup: 0
> dom0_memory: 0
> last_ping: 1492390409
> mgmt_server_id: 146457912294
> disconnected: 2018-06-05 14:09:22
> created: 2018-06-05 13:46:22
> removed: NULL
> update_count: 7
> resource_state: Enabled
> owner: NULL
> lastUpdated: NULL
> engine_state: Disabled
> *************************** 3. row ***************************
> id: 3
> name: s-1-VM
> uuid: 107d0a8e-e2d1-42b5-8b9d-ff3845bb556c
> status: Up
> type: SecondaryStorageVM
> private_ip_address: 172.30.3.34
> private_netmask: 255.255.255.192
> private_mac_address: 1e:00:3b:00:00:05
> storage_ip_address: 172.30.3.34
> storage_netmask: 255.255.255.192
> storage_mac_address: 1e:00:3b:00:00:05
> storage_ip_address_2: NULL
> storage_mac_address_2: NULL
> storage_netmask_2: NULL
> cluster_id: NULL
> public_ip_address: 172.30.4.86
> public_netmask: 255.255.255.128
> public_mac_address: 1e:00:d9:00:00:53
> proxy_port: NULL
> data_center_id: 1
> pod_id: 1
> cpu_sockets: NULL
> cpus: NULL
> speed: NULL
> url: NoIqn
> fs_type: NULL
> hypervisor_type: NULL
> hypervisor_version: NULL
> ram: 0
> resource: NULL
> version: 4.11.0.0
> parent: NULL
> total_size: NULL
> capabilities: NULL
> guid: s-1-VM-NfsSecondaryStorageResource
> available: 1
> setup: 0
> dom0_memory: 0
> last_ping: 1492390407
> mgmt_server_id: 146457912294
> disconnected: 2018-06-05 14:09:22
> created: 2018-06-05 13:46:27
> removed: NULL
> update_count: 7
> resource_state: Enabled
> owner: NULL
> lastUpdated: NULL
> engine_state: Disabled
> *************************** 4. row ***************************
> id: 4
> name: dcp-cscn2.local
> uuid: f0c076cb-112f-4f4b-a5a4-1a96ffac9794
> status: Up
> type: Routing
> private_ip_address: 172.30.3.4
> private_netmask: 255.255.255.192
> private_mac_address: 00:26:b9:4a:97:7d
> storage_ip_address: 172.30.3.4
> storage_netmask: 255.255.255.192
> storage_mac_address: 00:26:b9:4a:97:7d
> storage_ip_address_2: NULL
> storage_mac_address_2: NULL
> storage_netmask_2: NULL
> cluster_id: 1
> public_ip_address: 172.30.4.4
> public_netmask: 255.255.255.128
> public_mac_address: 00:26:b9:4a:97:7e
> proxy_port: NULL
> data_center_id: 1
> pod_id: 1
> cpu_sockets: 1
> cpus: 2
> speed: 2999
> url: iqn.1994-05.com.redhat:e9b4aa7e7881
> fs_type: NULL
> hypervisor_type: KVM
> hypervisor_version: NULL
> ram: 7510159360
> resource: NULL
> version: 4.11.0.0
> parent: NULL
> total_size: NULL
> capabilities: hvm,snapshot
> guid: 40e58399-fc7a-3a59-8f48-16d0f99b11c9-LibvirtComputingResource
> available: 1
> setup: 0
> dom0_memory: 0
> last_ping: 1492450882
> mgmt_server_id: 146457912294
> disconnected: 2018-06-05 14:09:22
> created: 2018-06-05 13:46:33
> removed: NULL
> update_count: 8
> resource_state: Enabled
> owner: NULL
> lastUpdated: NULL
> engine_state: Disabled
> *************************** 5. row ***************************
> id: 5
> name: dcp-cscn3.local
> uuid: 0368ae16-550f-43a9-bb40-ee29d2b5c274
> status: Up
> type: Routing
> private_ip_address: 172.30.3.5
> private_netmask: 255.255.255.192
> private_mac_address: 00:24:e8:73:6a:b2
> storage_ip_address: 172.30.3.5
> storage_netmask: 255.255.255.192
> storage_mac_address: 00:24:e8:73:6a:b2
> storage_ip_address_2: NULL
> storage_mac_address_2: NULL
> storage_netmask_2: NULL
> cluster_id: 1
> public_ip_address: 172.30.4.5
> public_netmask: 255.255.255.128
> public_mac_address: 00:24:e8:73:6a:b3
> proxy_port: NULL
> data_center_id: 1
> pod_id: 1
> cpu_sockets: 1
> cpus: 2
> speed: 3000
> url: iqn.1994-05.com.redhat:ccdce43aff1c
> fs_type: NULL
> hypervisor_type: KVM
> hypervisor_version: NULL
> ram: 7510159360
> resource: NULL
> version: 4.11.0.0
> parent: NULL
> total_size: NULL
> capabilities: hvm,snapshot
> guid: 10bb1c01-0e92-3108-8209-37f3eebad8fb-LibvirtComputingResource
> available: 1
> setup: 0
> dom0_memory: 0
> last_ping: 1492390408
> mgmt_server_id: 146457912294
> disconnected: 2018-06-05 14:09:22
> created: 2018-06-05 13:47:04
> removed: NULL
> update_count: 6
> resource_state: Enabled
> owner: NULL
> lastUpdated: NULL
> engine_state: Disabled
> 5 rows in set (0.00 sec)
>
> and you can see that it says the storage IP address is the same as the
> private IP address (the management network).
>
> I also ran the command you provided using the Cluster ID number from
> the table above -
>
> mysql> select * from cloud.storage_pool where cluster_id = 1 and
> removed is not null;
> Empty set (0.00 sec)
>
> mysql>
>
> So, assuming I am reading this correctly, that seems to be the issue.
> I am at a loss as to why, though.
>
> I have a separate NIC for storage as described. When I add the zone
> and get to the storage web page I exclude the IPs already used for the
> compute node NICs and the NFS server itself. I do this because initially I
> didn't, and the SSVM started using the IP address of the NFS server.
>
> So the range is 172.30.5.1 -> 15 and the range I fill in is
> 172.30.5.10 -> 172.30.5.14.
>
> And I used the label "cloudbr2" for storage.
>
> I must be doing this wrong somehow. Any pointers would be much appreciated.
>
> ________________________________
> From: Rafael Weingärtner <rafaelweingart...@gmail.com>
> Sent: 05 June 2018 16:13
> To: users
> Subject: Re: advanced networking with public IPs direct to VMs
>
> That is interesting.
> Let's see the source of all truth... This is the code that is generating
> that odd message:
>
> > List<StoragePoolVO> clusterPools = _storagePoolDao.listPoolsByCluster(agent.getClusterId());
> > boolean hasNfs = false;
> > for (StoragePoolVO pool : clusterPools) {
> >     if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
> >         hasNfs = true;
> >         break;
> >     }
> > }
> > if (!hasNfs) {
> >     s_logger.warn("Agent investigation was requested on host " + agent
> >             + ", but host does not support investigation because it has no NFS storage. Skipping investigation.");
> >     return Status.Disconnected;
> > }
>
> There are two possibilities here. You do not have any NFS storage - is
> that the case? Or maybe, for some reason, the call
> "_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is not returning
> any NFS storage pools. Looking at "listPoolsByCluster" we will see that
> the following SQL is used:
>
> > Select * from storage_pool where cluster_id = <host'sClusterId> and removed
> > is not null
>
> Can you run that SQL to see its return when your hosts are marked as
> disconnected?
>
> dag.sonst...@shapeblue.com
> www.shapeblue.com
> ShapeBlue - The CloudStack Company
> ShapeBlue are the largest independent integrator of CloudStack technologies globally and are specialists in the design and implementation of IaaS cloud infrastructures for both private and public cloud implementations.
> 53 Chandos Place, Covent Garden, London WC2N 4HS, UK
> @shapeblue
>
> On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <jms....@hotmail.co.uk>
> wrote:
>
> > I reran the tests with the 3-NIC setup. When I configured the zone
> > through the UI I used the labels cloudbr0 for management, cloudbr1 for
> > guest traffic and cloudbr2 for NFS, as per my original response to you.
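As Rafael concedes further up the thread, the SQL he quotes here has the predicate inverted: "removed is not null" selects deleted pools, so on a healthy cluster it returns an empty set even when live pools exist. A small sketch of the two predicates, illustrative only, using an in-memory stand-in for the storage_pool table (the Row class and method names are invented for this example):

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only - shows why "removed IS NULL" (pools still in
// service) and "removed IS NOT NULL" (pools that were deleted) answer
// different questions about cloud.storage_pool.
public class RemovedPredicateSketch {

    record Row(int clusterId, String poolType, String removed) {}

    // Mirrors "select * from storage_pool where cluster_id = ? and removed is null".
    static List<Row> activePools(List<Row> table, int clusterId) {
        return table.stream()
                .filter(r -> r.clusterId() == clusterId && r.removed() == null)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // One live NFS pool in cluster 1, plus one that was deleted.
        List<Row> table = List.of(
                new Row(1, "NetworkFilesystem", null),
                new Row(1, "NetworkFilesystem", "2017-05-01 10:00:00"));

        System.out.println(activePools(table, 1).size()); // prints 1 - only the live pool

        // Jon's situation: nothing is registered for the cluster at all,
        // so even the corrected query returns an empty set.
        System.out.println(activePools(List.of(), 1).size()); // prints 0
    }
}
```

The second case is the telling one for this thread: both of Jon's queries coming back "Empty set" means no pool row exists for cluster 1 at all, not merely that the predicate was mistyped.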
> >
> > When I pull the power to the node (dcp-cscn2.local), after about 5 mins
> > the host status goes to "Alert" but never to "Down".
> >
> > I get this in the logs -
> >
> > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was requested on host Host[-4-Routing], but host does not support investigation because it has no NFS storage. Skipping investigation.
> > 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was able to determine host 4 is in Disconnected
> > 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4 state determined is Disconnected
> > 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected but the host is still up: 4-dcp-cscn2.local
> >
> > I don't understand why it thinks there is no NFS storage, as each compute
> > node has a dedicated storage NIC.
> >
> > I also don't understand why it thinks the host is still up, i.e. what
> > test is it doing to determine that?
> >
> > Am I just trying to get something working that is not supported?
> >
> > ________________________________
> > From: Rafael Weingärtner <rafaelweingart...@gmail.com>
> > Sent: 04 June 2018 15:31
> > To: users
> > Subject: Re: advanced networking with public IPs direct to VMs
> >
> > What type of failover are you talking about?
> > What ACS version are you using?
> > What hypervisor are you using?
> > How are you configuring your NICs in the hypervisor?
> > How are you configuring the traffic labels in ACS?
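The behaviour Jon describes can be boiled down to a few lines. The sketch below is a simplified paraphrase of the KVMInvestigator logic quoted above in the thread, not the real CloudStack class (the names are invented): without an NFS pool registered for the host's cluster, the investigator has no heartbeat to check, so it can only report Disconnected and never proves the host Down.

```java
import java.util.List;

// Illustrative paraphrase of the quoted KVMInvestigator decision: the
// investigator pings a KVM host via its NFS heartbeat file, so when the
// cluster's storage_pool query returns no NFS pool it gives up and
// reports Disconnected - which is why the host sticks at "Alert".
public class InvestigatorSketch {

    enum Status { Up, Disconnected, Down }

    // poolTypes: the pool_type values returned for the host's cluster_id;
    // heartbeatSeen: whether the NFS heartbeat shows the host alive.
    static Status investigate(List<String> poolTypes, boolean heartbeatSeen) {
        boolean hasNfs = poolTypes.stream().anyMatch("NetworkFilesystem"::equals);
        if (!hasNfs) {
            // "...host does not support investigation because it has no NFS storage."
            return Status.Disconnected;
        }
        return heartbeatSeen ? Status.Up : Status.Down;
    }

    public static void main(String[] args) {
        // Jon's case: the storage_pool query returns an empty set.
        System.out.println(investigate(List.of(), false));                    // prints Disconnected
        // With an NFS pool present, a powered-off host would be proven Down.
        System.out.println(investigate(List.of("NetworkFilesystem"), false)); // prints Down
    }
}
```

This also explains the "Agent is disconnected but the host is still up" log line: Disconnected is a weaker verdict than Down, so HA never fences the host.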
> > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jms....@hotmail.co.uk>
> > wrote:
> >
> > > Hi all
> > >
> > > I am close to giving up on basic networking as I just cannot get
> > > failover working with multiple NICs (I am not even sure it is
> > > supported).
> > >
> > > What I would like is to use 3 NICs for management, storage and guest
> > > traffic. I would like to assign public IPs direct to the VMs, which is
> > > why I originally chose basic.
> > >
> > > If I switch to advanced networking, do I just configure a guest VM
> > > with public IPs on one NIC and not bother with the public traffic -
> > > would this work?
> >
> > --
> > Rafael Weingärtner
>
> --
> Rafael Weingärtner

--
Rafael Weingärtner