While someone looks into the zone-wide storage question, you could try adding
cluster-wide NFS storage and see if the rest works in that setup.

Erik

On Thu, Jun 7, 2018 at 11:49 AM Jon Marshall <jms....@hotmail.co.uk> wrote:

> Yes, all basic. I read a Shapeblue doc that recommended splitting traffic
> across multiple NICs even in basic networking mode so that is what I am
> trying to do.
>
>
> With single NIC you do not get the NFS storage message.
>
>
> I have the entire management server logs for both scenarios after I pulled
> the power to one of the compute nodes but from the single NIC setup these
> seem to be the relevant lines -
>
>
> 2018-06-04 10:17:10,972 DEBUG [c.c.n.NetworkUsageManagerImpl]
> (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) Disconnected called on 4
> with status Down
> 2018-06-04 10:17:10,972 DEBUG [c.c.h.Status]
> (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) Transition:[Resource state
> = Enabled, Agent event = HostDown, Host id = 4, name = dcp-cscn2.local]
> 2018-06-04 10:17:10,981 WARN  [o.a.c.alerts]
> (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) AlertType:: 7 |
> dataCenterId:: 1 | podId:: 1 | clusterId:: null | message:: Host is down,
> name: dcp-cscn2.local (id:4), availability zone: dcpz1, pod: dcp1
> 2018-06-04 10:17:11,000 DEBUG [c.c.h.CheckOnAgentInvestigator]
> (HA-Worker-1:ctx-f763f12f work-17) (logid:77c56778) Unable to reach the
> agent for VM[User|i-2-6-VM]: Resource [Host:4] is unreachable: Host 4: Host
> with specified id is not in the right state: Down
> 2018-06-04 10:17:11,006 DEBUG [c.c.h.KVMInvestigator]
> (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) Neighbouring host:5
> returned status:Down for the investigated host:4
> 2018-06-04 10:17:11,006 DEBUG [c.c.h.KVMInvestigator]
> (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) HA: HOST is ineligible
> legacy state Down for host 4
> 2018-06-04 10:17:11,006 DEBUG [c.c.h.HighAvailabilityManagerImpl]
> (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) KVMInvestigator was able to
> determine host 4 is in Down
> 2018-06-04 10:17:11,006 INFO  [c.c.a.m.AgentManagerImpl]
> (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) The agent from host 4 state
> determined is Down
> 2018-06-04 10:17:11,006 ERROR [c.c.a.m.AgentManagerImpl]
> (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) Host is down:
> 4-dcp-cscn2.local. Starting HA on the VMs
>
> At the moment I only need to assign public IPs direct to VMs rather than
> using NAT with the virtual router but would be happy to go with advanced
> networking if it would make things easier :)
>
> ________________________________
> From: Rafael Weingärtner <rafaelweingart...@gmail.com>
> Sent: 07 June 2018 10:35
> To: users
> Subject: Re: advanced networking with public IPs direct to VMs
>
> Ah, so it is not an advanced setup, even when you use multiple NICs.
> Can you confirm that the message "Agent investigation was requested on
> host, but host does not support investigation because it has no NFS
> storage. Skipping investigation." does not appear when you use a single
> NIC? Can you check for other log entries that might appear when the host
> is marked as "down"?
>
> On Thu, Jun 7, 2018 at 6:30 AM, Jon Marshall <jms....@hotmail.co.uk>
> wrote:
>
> > It is all basic networking at the moment for all the setups.
> >
> >
> > If you want me to I can setup a single NIC solution again and run any
> > commands you need me to do.
> >
> >
> > FYI, when I set up a single NIC I use the guided installation option
> > in the UI rather than the manual setup which I use for the multiple
> > NIC scenario.
> >
> >
> > Happy to set it up if it helps.
> >
> >
> >
> >
> > ________________________________
> > From: Rafael Weingärtner <rafaelweingart...@gmail.com>
> > Sent: 07 June 2018 10:23
> > To: users
> > Subject: Re: advanced networking with public IPs direct to VMs
> >
> > Ok, so that explains the log message. This is looking like a bug to
> > me. It seems that with zone-wide storage the host state (when
> > disconnected) is not being properly identified due to this NFS check,
> > and as a consequence it has a side effect on VM HA.
> >
> > We would need some input from people who have advanced networking
> > deployments and zone-wide storage.
> >
> > I do not see how the all-in-one-NIC deployment scenario is working,
> > though. The method "com.cloud.ha.KVMInvestigator.isAgentAlive(Host)"
> > is dead simple: if there are no NFS storage pools found for a host's
> > cluster, KVM hosts will be detected as "Disconnected" (and not "Down"),
> > with that warning message you noticed.
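[Editor's sketch of the behaviour Rafael describes above, in illustrative Python rather than CloudStack's actual Java; the function name and signature are invented for the example:]

```python
# Illustrative sketch of the investigator behaviour described above --
# not CloudStack's Java code; names and signatures are invented here.
def investigate_kvm_host(cluster_has_nfs_pool, neighbour_reports):
    """Return "Up", "Down", or "Disconnected" for an unreachable KVM host.

    cluster_has_nfs_pool: whether any NFS storage pool is associated with
    the host's cluster (a zone-wide pool has cluster_id NULL, so it would
    not count here -- the suspected bug in this thread).
    neighbour_reports: statuses reported by neighbouring hosts asked to
    check on the investigated host.
    """
    if not cluster_has_nfs_pool:
        # "host does not support investigation because it has no NFS
        # storage. Skipping investigation." -> host only goes to Alert.
        return "Disconnected"
    for status in neighbour_reports:
        if status in ("Up", "Down"):
            return status  # first definitive neighbour answer wins
    return "Disconnected"

# With a cluster-scoped NFS pool, a neighbour's "Down" verdict lets HA start:
print(investigate_kvm_host(True, ["Down"]))   # Down
print(investigate_kvm_host(False, ["Down"]))  # Disconnected
```

Without a cluster-scoped pool the first branch always wins, so HA never gets a definitive "Down" and the VMs are not restarted.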
> >
> > When you say "all in one NIC", is it an advanced network deployment where
> > you put all traffic in a single network, or is it a basic networking that
> > you are doing?
> >
> > On Thu, Jun 7, 2018 at 6:06 AM, Jon Marshall <jms....@hotmail.co.uk>
> > wrote:
> >
> > > zone wide.
> > >
> > >
> > > ________________________________
> > > From: Rafael Weingärtner <rafaelweingart...@gmail.com>
> > > Sent: 07 June 2018 10:04
> > > To: users
> > > Subject: Re: advanced networking with public IPs direct to VMs
> > >
> > > What type of storage are you using? Zone wide? Or cluster "wide"
> storage?
> > >
> > > On Thu, Jun 7, 2018 at 4:25 AM, Jon Marshall <jms....@hotmail.co.uk>
> > > wrote:
> > >
> > > > Rafael
> > > >
> > > >
> > > > Here is the output as requested -
> > > >
> > > >
> > > >
> > > > mysql> select * from cloud.storage_pool where removed is null\G
> > > > *************************** 1. row ***************************
> > > >                    id: 1
> > > >                  name: ds1
> > > >                  uuid: a234224f-05fb-3f4c-9b0f-c51ebdf9a601
> > > >             pool_type: NetworkFilesystem
> > > >                  port: 2049
> > > >        data_center_id: 1
> > > >                pod_id: NULL
> > > >            cluster_id: NULL
> > > >            used_bytes: 6059720704
> > > >        capacity_bytes: 79133933568
> > > >          host_address: 172.30.5.2
> > > >             user_info: NULL
> > > >                  path: /export/primary
> > > >               created: 2018-06-05 13:45:01
> > > >               removed: NULL
> > > >           update_time: NULL
> > > >                status: Up
> > > > storage_provider_name: DefaultPrimary
> > > >                 scope: ZONE
> > > >            hypervisor: KVM
> > > >               managed: 0
> > > >         capacity_iops: NULL
> > > > 1 row in set (0.00 sec)
> > > >
> > > > mysql>
> > > >
> > > > Do you think this problem is related to my NIC/bridge configuration
> > > > or the way I am configuring the zone?
> > > >
> > > > Jon
> > > > ________________________________
> > > > From: Rafael Weingärtner <rafaelweingart...@gmail.com>
> > > > Sent: 07 June 2018 06:45
> > > > To: users
> > > > Subject: Re: advanced networking with public IPs direct to VMs
> > > >
> > > > Can you also post the result of:
> > > > select * from cloud.storage_pool where removed is null
> > > >
> > > > On Wed, Jun 6, 2018 at 3:06 PM, Dag Sonstebo <
> > dag.sonst...@shapeblue.com
> > > >
> > > > wrote:
> > > >
> > > > > Hi Jon,
> > > > >
> > > > > Still confused where your primary storage pools are – are you sure
> > your
> > > > > hosts are in cluster 1?
> > > > >
> > > > > Quick question just to make sure - assuming management/storage is
> on
> > > the
> > > > > same NIC when I setup basic networking the physical network has the
> > > > > management and guest icons already there and I just edit the KVM
> > > labels.
> > > > If
> > > > > I am running storage over management do I need to drag the storage
> > icon
> > > > to
> > > > > the physical network and use the same KVM label (cloudbr0) as the
> > > > > management or does CS automatically just use the management NIC
> ie. I
> > > > would
> > > > > only need to drag the storage icon across in basic setup if I
> wanted
> > it
> > > > on
> > > > > a different NIC/IP subnet ?  (hope that makes sense !)
> > > > >
> > > > > >> I would do both – set up your 2/3 physical networks, name isn’t
> > that
> > > > > important – but then drag the traffic types to the correct one and
> > make
> > > > > sure the labels are correct.
> > > > > Regards,
> > > > > Dag Sonstebo
> > > > > Cloud Architect
> > > > > ShapeBlue
> > > > >
> > > > > On 06/06/2018, 12:39, "Jon Marshall" <jms....@hotmail.co.uk>
> wrote:
> > > > >
> > > > >     Dag
> > > > >
> > > > >
> > > > >     Do you mean  check the pools with "Infrastructure -> Primary
> > > Storage"
> > > > > and "Infrastructure -> Secondary Storage" within the UI ?
> > > > >
> > > > >
> > > > >     If so Primary Storage has a state of UP, secondary storage does
> > not
> > > > > show a state as such so not sure where else to check it ?
> > > > >
> > > > >
> > > > >     Rerun of the command -
> > > > >
> > > > >     mysql> select * from cloud.storage_pool where cluster_id = 1;
> > > > >     Empty set (0.00 sec)
> > > > >
> > > > >     mysql>
> > > > >
> > > > >     I think it is something to do with my zone creation rather than
> > the
> > > > > NIC, bridge setup although I can post those if needed.
> > > > >
> > > > >     I may try to set up just the 2 NIC solution you mentioned,
> > > > > although as I say I had the same issue with that, i.e. the host
> > > > > goes to "Alert" state with the same error messages. The only time
> > > > > I can get it to go to "Down" state is when it is all on the single
> > > > > NIC.
> > > > >
> > > > >     Quick question just to make sure - assuming management/storage
> is
> > > on
> > > > > the same NIC when I setup basic networking the physical network has
> > the
> > > > > management and guest icons already there and I just edit the KVM
> > > labels.
> > > > If
> > > > > I am running storage over management do I need to drag the storage
> > icon
> > > > to
> > > > > the physical network and use the same KVM label (cloudbr0) as the
> > > > > management or does CS automatically just use the management NIC
> ie. I
> > > > would
> > > > > only need to drag the storage icon across in basic setup if I
> wanted
> > it
> > > > on
> > > > > a different NIC/IP subnet ?  (hope that makes sense !)
> > > > >
> > > > >     On the plus side I have been at this for so long now and done
> so
> > > many
> > > > > rebuilds I could do it in my sleep now 😊
> > > > >
> > > > >
> > > > >     ________________________________
> > > > >     From: Dag Sonstebo <dag.sonst...@shapeblue.com>
> > > > >     Sent: 06 June 2018 12:28
> > > > >     To: users@cloudstack.apache.org
> > > > >     Subject: Re: advanced networking with public IPs direct to VMs
> > > > >
> > > > >     Looks OK to me Jon.
> > > > >
> > > > >     The one thing that throws me is your storage pools – can you
> > rerun
> > > > > your query: select * from cloud.storage_pool where cluster_id = 1;
> > > > >
> > > > >     Do the pools show up as online in the CloudStack GUI?
> > > > >
> > > > >     Regards,
> > > > >     Dag Sonstebo
> > > > >     Cloud Architect
> > > > >     ShapeBlue
> > > > >
> > > > >     On 06/06/2018, 12:08, "Jon Marshall" <jms....@hotmail.co.uk>
> > > wrote:
> > > > >
> > > > >         Don't know whether this helps or not but I logged into the
> > SSVM
> > > > > and ran an ifconfig -
> > > > >
> > > > >
> > > > >         eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > >                 inet 169.254.3.35  netmask 255.255.0.0  broadcast 169.254.255.255
> > > > >                 ether 0e:00:a9:fe:03:23  txqueuelen 1000  (Ethernet)
> > > > >                 RX packets 141  bytes 20249 (19.7 KiB)
> > > > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > > > >                 TX packets 108  bytes 16287 (15.9 KiB)
> > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > > >
> > > > >         eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > >                 inet 172.30.3.34  netmask 255.255.255.192  broadcast 172.30.3.63
> > > > >                 ether 1e:00:3b:00:00:05  txqueuelen 1000  (Ethernet)
> > > > >                 RX packets 56722  bytes 4953133 (4.7 MiB)
> > > > >                 RX errors 0  dropped 44573  overruns 0  frame 0
> > > > >                 TX packets 11224  bytes 1234932 (1.1 MiB)
> > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > > >
> > > > >         eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > >                 inet 172.30.4.86  netmask 255.255.255.128  broadcast 172.30.4.127
> > > > >                 ether 1e:00:d9:00:00:53  txqueuelen 1000  (Ethernet)
> > > > >                 RX packets 366191  bytes 435300557 (415.1 MiB)
> > > > >                 RX errors 0  dropped 39456  overruns 0  frame 0
> > > > >                 TX packets 145065  bytes 7978602 (7.6 MiB)
> > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > > >
> > > > >         eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > >                 inet 172.30.5.14  netmask 255.255.255.240  broadcast 172.30.5.15
> > > > >                 ether 1e:00:cb:00:00:1a  txqueuelen 1000  (Ethernet)
> > > > >                 RX packets 132440  bytes 426362982 (406.6 MiB)
> > > > >                 RX errors 0  dropped 39446  overruns 0  frame 0
> > > > >                 TX packets 67443  bytes 423670834 (404.0 MiB)
> > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > > >
> > > > >         lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
> > > > >                 inet 127.0.0.1  netmask 255.0.0.0
> > > > >                 loop  txqueuelen 1  (Local Loopback)
> > > > >                 RX packets 18  bytes 1440 (1.4 KiB)
> > > > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > > > >                 TX packets 18  bytes 1440 (1.4 KiB)
> > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > > >
> > > > >
> > > > >         so it has interfaces in both the management and the storage
> > > > > subnets (as well as guest).
> > > > >
> > > > >
> > > > >
> > > > >         ________________________________
> > > > >         From: Jon Marshall <jms....@hotmail.co.uk>
> > > > >         Sent: 06 June 2018 11:08
> > > > >         To: users@cloudstack.apache.org
> > > > >         Subject: Re: advanced networking with public IPs direct to
> > VMs
> > > > >
> > > > >         Hi Rafael
> > > > >
> > > > >
> > > > >         Thanks for the help, really appreciate it.
> > > > >
> > > > >
> > > > >         So rerunning that command with all servers up -
> > > > >
> > > > >
> > > > >
> > > > >         mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is null;
> > > > >         Empty set (0.00 sec)
> > > > >
> > > > >         mysql>
> > > > >
> > > > >
> > > > >         As for the storage IP, no, I'm not setting it to be the
> > > > > management IP when I set up the zone, but the output of the SQL
> > > > > command suggests that is what has happened.
> > > > >
> > > > >         As I said to Dag, I am using a different subnet for
> > > > > storage, i.e.:
> > > > >
> > > > >         172.30.3.0/26  - management subnet
> > > > >         172.30.4.0/25 -  guest VM subnet
> > > > >         172.30.5.0/28 - storage
> > > > >
> > > > >         the NFS server IP is 172.30.5.2
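[Editor's note: those allocations can be sanity-checked with Python's stdlib `ipaddress` module. An illustrative sketch only, not CloudStack code; the 172.30.5.10-14 free range is the one quoted later in this thread:]

```python
import ipaddress

# Illustrative sanity check of the addressing plan above.
mgmt_net = ipaddress.ip_network("172.30.3.0/26")
guest_net = ipaddress.ip_network("172.30.4.0/25")
storage_net = ipaddress.ip_network("172.30.5.0/28")
nfs_server = ipaddress.ip_address("172.30.5.2")

# Free range handed to CloudStack for system VMs on the storage network:
free_range = [ipaddress.ip_address("172.30.5.%d" % i) for i in range(10, 15)]

# The NFS server must live inside the storage subnet, and the free range
# must avoid it (the collision that previously bit the SSVM in this thread).
assert nfs_server in storage_net
assert all(ip in storage_net and ip != nfs_server for ip in free_range)

# The three subnets must be disjoint from one another.
nets = [mgmt_net, guest_net, storage_net]
assert not any(a.overlaps(b) for a in nets for b in nets if a is not b)
print("addressing plan looks consistent")
```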
> > > > >
> > > > >         Each compute node has 3 NICs with an IP from each subnet
> > > > > (I am assuming the management node only needs an IP in the
> > > > > management network?)
> > > > >
> > > > >         When I add the zone in the UI I have one physical network
> > > > > with management (cloudbr0), guest (cloudbr1) and storage
> > > > > (cloudbr2). When I fill in the storage traffic page I use the
> > > > > range 172.30.5.10 - 14 as free IPs, as I exclude the ones already
> > > > > allocated to the compute nodes and the NFS server.
> > > > >
> > > > >         I think maybe I am doing something wrong in the UI setup,
> > > > > but it is not obvious to me what it is.
> > > > >
> > > > >         What I might try today, unless you want me to keep the
> > > > > setup I have for more outputs, is to go back to 2 NICs: one for
> > > > > storage/management and one for guest VMs.
> > > > >
> > > > >         I think with the 2 NIC setup the mistake I made last time
> > > > > when adding the zone was to assume storage would just run over
> > > > > management, so I did not drag and drop the storage icon and assign
> > > > > it to cloudbr0 as with the management traffic, which I think is
> > > > > what I should have done?
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >         ________________________________
> > > > >         From: Rafael Weingärtner <rafaelweingart...@gmail.com>
> > > > >         Sent: 06 June 2018 10:54
> > > > >         To: users
> > > > >         Subject: Re: advanced networking with public IPs direct to
> > VMs
> > > > >
> > > > >         Jon, do not panic, we are here to help you :)
> > > > >         So, I might have mistyped the SQL query. When you use
> > > > >         "select * from cloud.storage_pool where cluster_id = 1 and
> > > > >         removed is not null", you are listing the removed storage
> > > > >         pools. Therefore, the right query would be "select * from
> > > > >         cloud.storage_pool where cluster_id = 1 and removed is
> > > > >         null".
> > > > >
> > > > >         There is also something else I do not understand. You are
> > > > >         setting the storage IP in the management subnet? I am not
> > > > >         sure you should be doing that. Normally, I set all my
> > > > >         storage (primary [when working with NFS] and secondary) to
> > > > >         IPs in the storage subnet.
> > > > >
> > > > >         On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <
> > > > > dag.sonst...@shapeblue.com>
> > > > >         wrote:
> > > > >
> > > > >         > Hi Jon,
> > > > >         >
> > > > >         > I’m late to this thread and have possibly missed some
> > things
> > > –
> > > > > but a
> > > > >         > couple of observations:
> > > > >         >
> > > > >         > “When I add the zone and get to the storage web page I
> > > exclude
> > > > > the IPs
> > > > >         > already used for the compute node NICs and the NFS server
> > > > > itself. …..”
> > > > >         > “So the range is 172.30.5.1 -> 15 and the range I fill in
> > is
> > > > > 172.30.5.10
> > > > >         > -> 172.30.5.14.”
> > > > >         >
> > > > >         > I think you may have some confusion around the use of the
> > > > >         > storage network. The important part here is to understand
> > > > >         > this is for *secondary storage* use only – it has nothing
> > > > >         > to do with primary storage. This means the storage network
> > > > >         > needs to be accessible to the SSVM and to the hypervisors,
> > > > >         > and secondary storage NFS pools need to be accessible on
> > > > >         > this network.
> > > > >         >
> > > > >         > The important part – this also means you *cannot use the
> > > > >         > same IP ranges for management and storage networks* –
> > > > >         > doing so means you will have issues where effectively
> > > > >         > both hypervisors and the SSVM can see the same subnet on
> > > > >         > two NICs – and you end up in a routing black hole.
> > > > >         >
> > > > >         > So – you need to either:
> > > > >         >
> > > > >         > 1) Use different IP subnets on management and storage, or
> > > > >         > 2) preferably just simplify your setup – stop using a
> > > > >         > secondary storage network altogether and just allow
> > > > >         > secondary storage to use the management network (which is
> > > > >         > the default). Unless you have a very high I/O environment
> > > > >         > in production you are just adding complexity by running
> > > > >         > separate management and storage networks.
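[Editor's note: Dag's no-overlap rule in option 1 can be checked mechanically. An illustrative Python sketch using the stdlib `ipaddress` module; the helper name is invented, and the subnets are the ones quoted earlier in the thread:]

```python
import ipaddress

# Illustrative check of the rule above; not part of CloudStack itself.
def valid_separate_storage(mgmt_cidr, storage_cidr):
    """Option 1: a separate storage network is only safe if its subnet
    does not overlap the management subnet (else: routing black hole)."""
    mgmt = ipaddress.ip_network(mgmt_cidr)
    storage = ipaddress.ip_network(storage_cidr)
    return not mgmt.overlaps(storage)

print(valid_separate_storage("172.30.3.0/26", "172.30.5.0/28"))  # True
# Reusing the management range on the storage NIC trips the rule:
print(valid_separate_storage("172.30.3.0/26", "172.30.3.0/26"))  # False
```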
> > > > >         >
> > > > >         > Regards,
> > > > >         > Dag Sonstebo
> > > > >         > Cloud Architect
> > > > >         > ShapeBlue
> > > > >         >
> > > > >         > On 06/06/2018, 10:18, "Jon Marshall" <
> > jms....@hotmail.co.uk>
> > > > > wrote:
> > > > >         >
> > > > >         >     I will disconnect the host this morning and test but
> > > before
> > > > > I do that
> > > > >         > I ran this command when all hosts are up -
> > > > >         >
> > > > >         >
> > > > >         >
> > > > >         >
> > > > >         >
> > > > >         >     mysql> select * from cloud.host\G
> > > > >         >     *************************** 1. row ***************************
> > > > >         >                        id: 1
> > > > >         >                      name: dcp-cscn1.local
> > > > >         >                      uuid: d97b930c-ab5f-4b7d-9243-eabd60012284
> > > > >         >                    status: Up
> > > > >         >                      type: Routing
> > > > >         >        private_ip_address: 172.30.3.3
> > > > >         >           private_netmask: 255.255.255.192
> > > > >         >       private_mac_address: 00:22:19:92:4e:34
> > > > >         >        storage_ip_address: 172.30.3.3
> > > > >         >           storage_netmask: 255.255.255.192
> > > > >         >       storage_mac_address: 00:22:19:92:4e:34
> > > > >         >      storage_ip_address_2: NULL
> > > > >         >     storage_mac_address_2: NULL
> > > > >         >         storage_netmask_2: NULL
> > > > >         >                cluster_id: 1
> > > > >         >         public_ip_address: 172.30.4.3
> > > > >         >            public_netmask: 255.255.255.128
> > > > >         >        public_mac_address: 00:22:19:92:4e:35
> > > > >         >                proxy_port: NULL
> > > > >         >            data_center_id: 1
> > > > >         >                    pod_id: 1
> > > > >         >               cpu_sockets: 1
> > > > >         >                      cpus: 2
> > > > >         >                     speed: 2999
> > > > >         >                       url: iqn.1994-05.com.redhat:fa437fb0c023
> > > > >         >                   fs_type: NULL
> > > > >         >           hypervisor_type: KVM
> > > > >         >        hypervisor_version: NULL
> > > > >         >                       ram: 7510159360
> > > > >         >                  resource: NULL
> > > > >         >                   version: 4.11.0.0
> > > > >         >                    parent: NULL
> > > > >         >                total_size: NULL
> > > > >         >              capabilities: hvm,snapshot
> > > > >         >                      guid: 9f2b15cb-1b75-321b-bf59-f83e7a5e8efb-LibvirtComputingResource
> > > > >         >                 available: 1
> > > > >         >                     setup: 0
> > > > >         >               dom0_memory: 0
> > > > >         >                 last_ping: 1492390408
> > > > >         >            mgmt_server_id: 146457912294
> > > > >         >              disconnected: 2018-06-05 14:09:22
> > > > >         >                   created: 2018-06-05 13:44:33
> > > > >         >                   removed: NULL
> > > > >         >              update_count: 4
> > > > >         >            resource_state: Enabled
> > > > >         >                     owner: NULL
> > > > >         >               lastUpdated: NULL
> > > > >         >              engine_state: Disabled
> > > > >         >     *************************** 2. row ***************************
> > > > >         >                        id: 2
> > > > >         >                      name: v-2-VM
> > > > >         >                      uuid: ce1f4594-2b4f-4b2b-a239-3f5e2c2215b0
> > > > >         >                    status: Up
> > > > >         >                      type: ConsoleProxy
> > > > >         >        private_ip_address: 172.30.3.49
> > > > >         >           private_netmask: 255.255.255.192
> > > > >         >       private_mac_address: 1e:00:80:00:00:14
> > > > >         >        storage_ip_address: 172.30.3.49
> > > > >         >           storage_netmask: 255.255.255.192
> > > > >         >       storage_mac_address: 1e:00:80:00:00:14
> > > > >         >      storage_ip_address_2: NULL
> > > > >         >     storage_mac_address_2: NULL
> > > > >         >         storage_netmask_2: NULL
> > > > >         >                cluster_id: NULL
> > > > >         >         public_ip_address: 172.30.4.98
> > > > >         >            public_netmask: 255.255.255.128
> > > > >         >        public_mac_address: 1e:00:c9:00:00:5f
> > > > >         >                proxy_port: NULL
> > > > >         >            data_center_id: 1
> > > > >         >                    pod_id: 1
> > > > >         >               cpu_sockets: NULL
> > > > >         >                      cpus: NULL
> > > > >         >                     speed: NULL
> > > > >         >                       url: NoIqn
> > > > >         >                   fs_type: NULL
> > > > >         >           hypervisor_type: NULL
> > > > >         >        hypervisor_version: NULL
> > > > >         >                       ram: 0
> > > > >         >                  resource: NULL
> > > > >         >                   version: 4.11.0.0
> > > > >         >                    parent: NULL
> > > > >         >                total_size: NULL
> > > > >         >              capabilities: NULL
> > > > >         >                      guid: Proxy.2-ConsoleProxyResource
> > > > >         >                 available: 1
> > > > >         >                     setup: 0
> > > > >         >               dom0_memory: 0
> > > > >         >                 last_ping: 1492390409
> > > > >         >            mgmt_server_id: 146457912294
> > > > >         >              disconnected: 2018-06-05 14:09:22
> > > > >         >                   created: 2018-06-05 13:46:22
> > > > >         >                   removed: NULL
> > > > >         >              update_count: 7
> > > > >         >            resource_state: Enabled
> > > > >         >                     owner: NULL
> > > > >         >               lastUpdated: NULL
> > > > >         >              engine_state: Disabled
> > > > >         >     *************************** 3. row ***************************
> > > > >         >                        id: 3
> > > > >         >                      name: s-1-VM
> > > > >         >                      uuid: 107d0a8e-e2d1-42b5-8b9d-ff3845bb556c
> > > > >         >                    status: Up
> > > > >         >                      type: SecondaryStorageVM
> > > > >         >        private_ip_address: 172.30.3.34
> > > > >         >           private_netmask: 255.255.255.192
> > > > >         >       private_mac_address: 1e:00:3b:00:00:05
> > > > >         >        storage_ip_address: 172.30.3.34
> > > > >         >           storage_netmask: 255.255.255.192
> > > > >         >       storage_mac_address: 1e:00:3b:00:00:05
> > > > >         >      storage_ip_address_2: NULL
> > > > >         >     storage_mac_address_2: NULL
> > > > >         >         storage_netmask_2: NULL
> > > > >         >                cluster_id: NULL
> > > > >         >         public_ip_address: 172.30.4.86
> > > > >         >            public_netmask: 255.255.255.128
> > > > >         >        public_mac_address: 1e:00:d9:00:00:53
> > > > >         >                proxy_port: NULL
> > > > >         >            data_center_id: 1
> > > > >         >                    pod_id: 1
> > > > >         >               cpu_sockets: NULL
> > > > >         >                      cpus: NULL
> > > > >         >                     speed: NULL
> > > > >         >                       url: NoIqn
> > > > >         >                   fs_type: NULL
> > > > >         >           hypervisor_type: NULL
> > > > >         >        hypervisor_version: NULL
> > > > >         >                       ram: 0
> > > > >         >                  resource: NULL
> > > > >         >                   version: 4.11.0.0
> > > > >         >                    parent: NULL
> > > > >         >                total_size: NULL
> > > > >         >              capabilities: NULL
> > > > >         >                      guid: s-1-VM-NfsSecondaryStorageResource
> > > > >         >                 available: 1
> > > > >         >                     setup: 0
> > > > >         >               dom0_memory: 0
> > > > >         >                 last_ping: 1492390407
> > > > >         >            mgmt_server_id: 146457912294
> > > > >         >              disconnected: 2018-06-05 14:09:22
> > > > >         >                   created: 2018-06-05 13:46:27
> > > > >         >                   removed: NULL
> > > > >         >              update_count: 7
> > > > >         >            resource_state: Enabled
> > > > >         >                     owner: NULL
> > > > >         >               lastUpdated: NULL
> > > > >         >              engine_state: Disabled
> > > > >         >     *************************** 4. row ***************************
> > > > >         >                        id: 4
> > > > >         >                      name: dcp-cscn2.local
> > > > >         >                      uuid: f0c076cb-112f-4f4b-a5a4-1a96ffac9794
> > > > >         >                    status: Up
> > > > >         >                      type: Routing
> > > > >         >        private_ip_address: 172.30.3.4
> > > > >         >           private_netmask: 255.255.255.192
> > > > >         >       private_mac_address: 00:26:b9:4a:97:7d
> > > > >         >        storage_ip_address: 172.30.3.4
> > > > >         >           storage_netmask: 255.255.255.192
> > > > >         >       storage_mac_address: 00:26:b9:4a:97:7d
> > > > >         >      storage_ip_address_2: NULL
> > > > >         >     storage_mac_address_2: NULL
> > > > >         >         storage_netmask_2: NULL
> > > > >         >                cluster_id: 1
> > > > >         >         public_ip_address: 172.30.4.4
> > > > >         >            public_netmask: 255.255.255.128
> > > > >         >        public_mac_address: 00:26:b9:4a:97:7e
> > > > >         >                proxy_port: NULL
> > > > >         >            data_center_id: 1
> > > > >         >                    pod_id: 1
> > > > >         >               cpu_sockets: 1
> > > > >         >                      cpus: 2
> > > > >         >                     speed: 2999
> > > > >         >                       url: iqn.1994-05.com.redhat:e9b4aa7e7881
> > > > >         >                   fs_type: NULL
> > > > >         >           hypervisor_type: KVM
> > > > >         >        hypervisor_version: NULL
> > > > >         >                       ram: 7510159360
> > > > >         >                  resource: NULL
> > > > >         >                   version: 4.11.0.0
> > > > >         >                    parent: NULL
> > > > >         >                total_size: NULL
> > > > >         >              capabilities: hvm,snapshot
> > > > >         >                      guid: 40e58399-fc7a-3a59-8f48-16d0f99b11c9-LibvirtComputingResource
> > > > >         >                 available: 1
> > > > >         >                     setup: 0
> > > > >         >               dom0_memory: 0
> > > > >         >                 last_ping: 1492450882
> > > > >         >            mgmt_server_id: 146457912294
> > > > >         >              disconnected: 2018-06-05 14:09:22
> > > > >         >                   created: 2018-06-05 13:46:33
> > > > >         >                   removed: NULL
> > > > >         >              update_count: 8
> > > > >         >            resource_state: Enabled
> > > > >         >                     owner: NULL
> > > > >         >               lastUpdated: NULL
> > > > >         >              engine_state: Disabled
> > > > >         >     *************************** 5. row ***************************
> > > > >         >                        id: 5
> > > > >         >                      name: dcp-cscn3.local
> > > > >         >                      uuid: 0368ae16-550f-43a9-bb40-ee29d2b5c274
> > > > >         >                    status: Up
> > > > >         >                      type: Routing
> > > > >         >        private_ip_address: 172.30.3.5
> > > > >         >           private_netmask: 255.255.255.192
> > > > >         >       private_mac_address: 00:24:e8:73:6a:b2
> > > > >         >        storage_ip_address: 172.30.3.5
> > > > >         >           storage_netmask: 255.255.255.192
> > > > >         >       storage_mac_address: 00:24:e8:73:6a:b2
> > > > >         >      storage_ip_address_2: NULL
> > > > >         >     storage_mac_address_2: NULL
> > > > >         >         storage_netmask_2: NULL
> > > > >         >                cluster_id: 1
> > > > >         >         public_ip_address: 172.30.4.5
> > > > >         >            public_netmask: 255.255.255.128
> > > > >         >        public_mac_address: 00:24:e8:73:6a:b3
> > > > >         >                proxy_port: NULL
> > > > >         >            data_center_id: 1
> > > > >         >                    pod_id: 1
> > > > >         >               cpu_sockets: 1
> > > > >         >                      cpus: 2
> > > > >         >                     speed: 3000
> > > > >         >                       url: iqn.1994-05.com.redhat:ccdce43aff1c
> > > > >         >                   fs_type: NULL
> > > > >         >           hypervisor_type: KVM
> > > > >         >        hypervisor_version: NULL
> > > > >         >                       ram: 7510159360
> > > > >         >                  resource: NULL
> > > > >         >                   version: 4.11.0.0
> > > > >         >                    parent: NULL
> > > > >         >                total_size: NULL
> > > > >         >              capabilities: hvm,snapshot
> > > > >         >                      guid: 10bb1c01-0e92-3108-8209-37f3eebad8fb-LibvirtComputingResource
> > > > >         >                 available: 1
> > > > >         >                     setup: 0
> > > > >         >               dom0_memory: 0
> > > > >         >                 last_ping: 1492390408
> > > > >         >            mgmt_server_id: 146457912294
> > > > >         >              disconnected: 2018-06-05 14:09:22
> > > > >         >                   created: 2018-06-05 13:47:04
> > > > >         >                   removed: NULL
> > > > >         >              update_count: 6
> > > > >         >            resource_state: Enabled
> > > > >         >                     owner: NULL
> > > > >         >               lastUpdated: NULL
> > > > >         >              engine_state: Disabled
> > > > >         >     5 rows in set (0.00 sec)
> > > > >         >
> > > > >         >
> > > > >         >
> and you can see that it says the storage IP address is the same as the
> private IP address (the management network).
> > > > >         >
> I also ran the command you provided using the Cluster ID number from the
> table above -
> > > > >         >
> > > > >         >
> > > > >         >
> mysql> select * from cloud.storage_pool where cluster_id = 1 and removed
> is not null;
> Empty set (0.00 sec)
>
> mysql>
> > > > >         >
> So assuming I am reading this correctly, that seems to be the issue.
> > > > >         >
> > > > >         >
> I am at a loss as to why though.
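[Since the query above filters on "removed is not null" and came back
empty, a variant with no filter on "removed" would show whether pool rows
exist at all for the cluster, and what type they are. This is an
illustrative sketch; the column names assume the stock cloud.storage_pool
schema.]

```sql
-- List every pool registered for cluster 1, with no filter on removed,
-- so both the pool type and the removed timestamp are visible.
SELECT id, name, pool_type, removed
FROM cloud.storage_pool
WHERE cluster_id = 1;
```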
> > > > >         >
> > > > >         >
> I have a separate NIC for storage as described. When I add the zone and
> get to the storage web page I exclude the IPs already used for the
> compute node NICs and the NFS server itself. I do this because initially
> I didn't, and the SSVM started using the IP address of the NFS server.
> > > > >         >
> > > > >         >
> So the range is 172.30.5.1 -> 15 and the range I fill in is
> 172.30.5.10 -> 172.30.5.14.
> > > > >         >
> > > > >         >
> And I used the label "cloudbr2" for storage.
> > > > >         >
> > > > >         >
> I must be doing this wrong somehow.
> > > > >         >
> > > > >         >
> Any pointers would be much appreciated.
> > > > >         >
> > > > >         >
> > > > >         >
> > > > >         >
> > > > >         >     ________________________________
> From: Rafael Weingärtner <rafaelweingart...@gmail.com>
> Sent: 05 June 2018 16:13
> To: users
> Subject: Re: advanced networking with public IPs direct to VMs
> > > > >         >
> That is interesting. Let's see the source of all truth...
> This is the code that is generating that odd message.
> > > > >         >
> >     List<StoragePoolVO> clusterPools = _storagePoolDao.listPoolsByCluster(agent.getClusterId());
> >     boolean hasNfs = false;
> >     for (StoragePoolVO pool : clusterPools) {
> >         if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
> >             hasNfs = true;
> >             break;
> >         }
> >     }
> >     if (!hasNfs) {
> >         s_logger.warn("Agent investigation was requested on host " + agent
> >                 + ", but host does not support investigation because it has no NFS storage. Skipping investigation.");
> >         return Status.Disconnected;
> >     }
> > > > >         >
> There are two possibilities here. You do not have any NFS storage? Is
> that the case? Or maybe, for some reason, the call
> "_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is not
> returning any NFS storage pools. Looking at "listPoolsByCluster" we will
> see that the following SQL is used:
>
> > Select * from storage_pool where cluster_id = <host'sClusterId> and
> > removed is not null
>
> Can you run that SQL to see its return when your hosts are marked as
> disconnected?
> > > > >         >
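[The quoted check is easy to exercise in isolation. Below is a minimal
runnable sketch of its logic, with plain strings standing in for the real
StoragePoolVO and StoragePoolType classes, which are not reproduced here.]

```java
import java.util.List;

// Standalone sketch of the check quoted above (stubbed with plain
// strings, not the real CloudStack classes): the investigator only
// proceeds when at least one pool in the host's cluster is NFS.
public class NfsCheckSketch {

    static boolean hasNfsStorage(List<String> clusterPoolTypes) {
        for (String poolType : clusterPoolTypes) {
            if ("NetworkFilesystem".equals(poolType)) {
                return true; // one NFS pool is enough to allow investigation
            }
        }
        return false; // no NFS pool: investigation is skipped
    }

    public static void main(String[] args) {
        // A cluster whose pools are all non-NFS triggers the exact
        // "no NFS storage. Skipping investigation." path from the logs.
        System.out.println(hasNfsStorage(List.of("LVM")));                       // false
        System.out.println(hasNfsStorage(List.of("LVM", "NetworkFilesystem"))); // true
    }
}
```

[This matches the log excerpts in the thread: a host whose cluster-pool
lookup returns no NetworkFilesystem rows is reported as Disconnected
rather than investigated further.]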
> > > > >         >
> dag.sonst...@shapeblue.com
> www.shapeblue.com
> ShapeBlue - The CloudStack Company
> ShapeBlue are the largest independent integrator of CloudStack
> technologies globally and are specialists in the design and
> implementation of IaaS cloud infrastructures for both private and public
> cloud implementations.
> 53 Chandos Place, Covent Garden, London WC2N 4HS, UK
> @shapeblue
> > > > >         >
> On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <jms....@hotmail.co.uk>
> wrote:
> > > > >         >
> > I reran the tests with the 3 NIC setup. When I configured the zone
> > through the UI I used the labels cloudbr0 for management, cloudbr1 for
> > guest traffic and cloudbr2 for NFS as per my original response to you.
> > > > >         >     >
> > > > >         >     >
> > When I pull the power to the node (dcp-cscn2.local), after about 5
> > mins the host status goes to "Alert" but never to "Down".
> > > > >         >     >
> > > > >         >     >
> > I get this in the logs -
> > > > >         >     >
> > > > >         >     >
> > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was requested on host Host[-4-Routing], but host does not support investigation because it has no NFS storage. Skipping investigation.
> > 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was able to determine host 4 is in Disconnected
> > 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4 state determined is Disconnected
> > 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected but the host is still up: 4-dcp-cscn2.local
> > > > >         >     >
> > I don't understand why it thinks there is no NFS storage, as each
> > compute node has a dedicated storage NIC.
> > > > >         >     >
> > > > >         >     >
> > I also don't understand why it thinks the host is still up, i.e. what
> > test is it doing to determine that?
> > > > >         >     >
> > > > >         >     >
> > Am I just trying to get something working that is not supported?
> > > > >         >     >
> > > > >         >     >
> > > > >         >     > ________________________________
> > From: Rafael Weingärtner <rafaelweingart...@gmail.com>
> > Sent: 04 June 2018 15:31
> > To: users
> > Subject: Re: advanced networking with public IPs direct to VMs
> > > > >         >     >
> > What type of failover are you talking about?
> > What ACS version are you using?
> > What hypervisor are you using?
> > How are you configuring your NICs in the hypervisor?
> > How are you configuring the traffic labels in ACS?
> > > > >         >     >
> > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jms....@hotmail.co.uk>
> > wrote:
> > > > >         >     >
> > > Hi all
> > > > >         >     > >
> > > I am close to giving up on basic networking as I just cannot get
> > > failover working with multiple NICs (I am not even sure it is
> > > supported).
> > > > >         >     > >
> > > > >         >     > >
> > > What I would like is to use 3 NICs for management, storage and guest
> > > traffic. I would like to assign public IPs direct to the VMs, which
> > > is why I originally chose basic.
> > > > >         >     > >
> > > > >         >     > >
> > > If I switch to advanced networking, do I just configure a guest VM
> > > with public IPs on one NIC, rather than carrying the public traffic
> > > on both?
> > >
> > > Would this work?
> > > > >         >     > >
> > > > >         >     >
> > > > >         >     >
> > > > >         >     >
> > > > >         >     > --
> > > > >         >     > Rafael Weingärtner
> > > > >         >     >
> > > > >         >
> > > > >         >
> > > > >         >
> > > > >         >     --
> > > > >         >     Rafael Weingärtner
> > > > >         >
> > > > >         >
> > > > >         >
> > > > >
> > > > >
> > > > >         --
> > > > >         Rafael Weingärtner
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Rafael Weingärtner
> > > >
> > >
> > >
> > >
> > > --
> > > Rafael Weingärtner
> > >
> >
> >
> >
> > --
> > Rafael Weingärtner
> >
>
>
>
> --
> Rafael Weingärtner
>
