On Friday 11 November 2016 02:18 PM, Andy Zhou wrote:


On Mon, Nov 7, 2016 at 11:55 PM, Babu Shanmugam <bscha...@redhat.com> wrote:



    On Monday 07 November 2016 06:49 PM, Andy Zhou wrote:
    This version is better, I am able to apply them. Thanks.

    I got the system running, but managed to get it into a state
    where both machines (centos and centos2)
    are running the ovsdb in backup mode. The output of "pcs
    status" shows an error message, but the message is not
    very helpful.  Any suggestions on how to debug this?

    root@centos:/# pcs status
    Cluster name: mycluster
    Last updated: Mon Nov  7 05:12:06 2016
    Last change: Mon Nov  7 05:08:24 2016 by root via cibadmin on centos
    Stack: corosync
    Current DC: centos2 (version 1.1.13-10.el7_2.4-44eb2dd) -
    partition with quorum
    2 nodes and 3 resources configured

    Node centos: standby
    Online: [ centos2 ]

    Full list of resources:

     virtip (ocf::heartbeat:IPaddr): Started centos2
     Master/Slave Set: ovndb_servers_master [ovndb_servers]
         Stopped: [ centos centos2 ]

    Failed Actions:
    * ovndb_servers_start_0 on centos2 'unknown error' (1): call=18,
    status=Timed Out, exitreason='none',
        last-rc-change='Mon Nov  7 02:28:07 2016', queued=0ms,
    exec=30002ms


    PCSD Status:
      centos: Online
      centos2: Online

    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled

    --------------------------------------------
    root@centos:/# pcs config
    Cluster Name: mycluster
    Corosync Nodes:
     centos centos2
    Pacemaker Nodes:
     centos centos2

    Resources:
     Resource: virtip (class=ocf provider=heartbeat type=IPaddr)
      Attributes: ip=192.168.122.200 cidr_netmask=24
      Operations: start interval=0s timeout=20s
    (virtip-start-interval-0s)
                  stop interval=0s timeout=20s (virtip-stop-interval-0s)
                  monitor interval=30s (virtip-monitor-interval-30s)
     Master: ovndb_servers_master
      Meta Attrs: notify=true
      Resource: ovndb_servers (class=ocf provider=ovn type=ovndb-servers)
       Attributes: master_ip=192.168.122.200

    Andy, you don't seem to have defined an attribute for ovn_ctl. That
    means the ovn-ctl script is assumed to be present at
    /usr/share/openvswitch/scripts/ovn-ctl. Can you check whether you have
    ovn-ctl at that location?

Yes, the script was installed there.

    If not, please define an attribute named ovn_ctl, similar to
    master_ip, and point it to the correct location of ovn-ctl.

The document says "ovn-ctl" is optional. I have now fully specified it, but it makes no difference. There is some log information towards the end of this email in case it helps. Overall, it could just be something odd about my system, and I am not sure it is worthwhile to track it down. On the other hand, I will be happy to provide more information about my setup in case it is useful.

Andy, can you please try to start the db servers manually using ovn-ctl and see whether they really start? I would use the following commands to check.

/usr/share/openvswitch/scripts/ovn-ctl --db-nb-sync-from-addr=192.0.2.254 --db-sb-sync-from-addr=192.0.2.254 start_ovsdb
/usr/share/openvswitch/scripts/ovn-ctl status_ovnsb
/usr/share/openvswitch/scripts/ovn-ctl status_ovnnb

The last two commands should print 'running/backup'.
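
If it helps, the two status checks can be wrapped in a small shell function. This is only a rough sketch: check_backup_mode is a name I made up for illustration, and it assumes the status commands print a line containing 'running/backup' as described above. OVN_CTL defaults to the path mentioned in this thread and can be overridden.

```shell
#!/bin/sh
# Sketch only: check that both OVN DB servers report backup mode.
# OVN_CTL defaults to the path used in this thread; override it if needed.
OVN_CTL="${OVN_CTL:-/usr/share/openvswitch/scripts/ovn-ctl}"

# check_backup_mode is a hypothetical helper, not part of ovn-ctl.
# It returns 0 only if both status commands print 'running/backup'.
check_backup_mode() {
    rc=0
    for cmd in status_ovnsb status_ovnnb; do
        out=$("$OVN_CTL" "$cmd" 2>&1)
        case "$out" in
            *running/backup*) echo "$cmd: OK ($out)" ;;
            *) echo "$cmd: unexpected status: $out"; rc=1 ;;
        esac
    done
    return "$rc"
}

# Usage, after running start_ovsdb by hand:
#   check_backup_mode || echo "at least one DB server is not in backup mode"
```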

There are some log messages added in the OCF script for the case where the start functionality fails by timing out. You should be able to see a message starting with "ovndb_servers: After starting ovsdb, status is". These log messages should be present in the pacemaker logs.
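
To pull those messages out, something along these lines might do. It is a sketch, not a tested procedure: find_start_status is a hypothetical helper name, and the log file location is an assumption, since where pacemaker logs end up depends on the distribution and its configuration.

```shell
#!/bin/sh
# Sketch only: print the most recent start-status messages that the OCF
# script writes when the start action runs.
# find_start_status is a hypothetical helper name.
#   $1 = pacemaker log file (location varies between systems)
#   $2 = how many of the latest matches to show (default 5)
find_start_status() {
    grep -F "ovndb_servers: After starting ovsdb, status is" "$1" \
        | tail -n "${2:-5}"
}

# Usage (the path below is an assumption, adjust it for your system):
#   find_start_status /var/log/cluster/corosync.log
```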

    Is the user expected to populate those files by hand?  If yes,
    what IP address should be used? The floating IP?

    This file has to be populated by the user only when the
    user wants ovn-northd to connect to a different set of DB URLs,
    other than the unix sockets on that same machine.
    The IP address depends on the setup. The pacemaker script uses
    the master_ip address that you supply to the OCF resource as an
    attribute.

Thanks. Should this be added to IntegrationGuide.rst?


I have documented the need for an IPaddr2 resource in IntegrationGuide.rst. The new options are in the ovn-ctl man page. Is there something specific that you feel is missing from the documentation?

More logs..

root@centos:~# ls -l /usr/share/openvswitch/scripts/ovn-ctl
-rwxr-xr-x. 1 root root 15539 Nov 7 02:12 /usr/share/openvswitch/scripts/ovn-ctl

Resources:
 Resource: virtip (class=ocf provider=heartbeat type=IPaddr)
  Attributes: ip=192.168.122.200 cidr_netmask=24
  Operations: start interval=0s timeout=20s (virtip-start-interval-0s)
              stop interval=0s timeout=20s (virtip-stop-interval-0s)
              monitor interval=30s (virtip-monitor-interval-30s)
 Master: ovndb_servers_master
  Meta Attrs: notify=true
  Resource: ovndb_servers (class=ocf provider=ovn type=ovndb-servers)
   Attributes: master_ip=192.168.122.200 ovn_ctl=/usr/share/openvswitch/scripts/ovn-ctl
   Operations: start interval=0s timeout=30s (ovndb_servers-start-interval-0s)
               stop interval=0s timeout=20s (ovndb_servers-stop-interval-0s)
               promote interval=0s timeout=50s (ovndb_servers-promote-interval-0s)
               demote interval=0s timeout=50s (ovndb_servers-demote-interval-0s)
               monitor interval=10s (ovndb_servers-monitor-interval-10s)


Resource configuration looks fine. I think the above experiment would help us catch the problem.


pcs status still shows the ovsdb servers as stopped on both hosts:
==========================================
Cluster name: mycluster
Last updated: Fri Nov 11 00:33:10 2016 Last change: Fri Nov 11 00:09:13 2016 by root via crm_attribute on centos2
Stack: corosync
Current DC: centos (version 1.1.13-10.el7_2.4-44eb2dd) - partition with quorum
2 nodes and 3 resources configured

Online: [ centos centos2 ]

Full list of resources:

 virtip (ocf::heartbeat:IPaddr):        Started centos
 Master/Slave Set: ovndb_servers_master [ovndb_servers]
     Stopped: [ centos centos2 ]

Failed Actions:
* ovndb_servers_start_0 on centos 'unknown error' (1): call=18, status=Timed Out, exitreason='none',
    last-rc-change='Fri Nov 11 00:09:13 2016', queued=0ms, exec=30280ms
* ovndb_servers_start_0 on centos2 'unknown error' (1): call=13, status=Timed Out, exitreason='none',
    last-rc-change='Fri Nov 11 00:07:42 2016', queued=0ms, exec=30234ms


PCSD Status:
  centos: Online
  centos2: Online




