Hi, When I create new projects, with one network per project, the controller HA cluster fails once the overcloud (OC) reaches about 70 or 80 projects and networks, with the error below.
[stack@director LogTool_Python2]$ ssh [email protected] "sudo pcs status"
Cluster name: tripleo_cluster
Stack: corosync
Current DC: overcloud-controller-1 (version 1.1.21-4.el7-f14e36fd43) - partition with quorum
Last updated: Thu Jul 23 17:00:22 2020
Last change: Wed Jul 22 14:35:34 2020 by hacluster via crmd on overcloud-controller-2

12 nodes configured
37 resources configured

Online: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
GuestOnline: [ galera-bundle-0@overcloud-controller-0 galera-bundle-1@overcloud-controller-1 galera-bundle-2@overcloud-controller-2 rabbitmq-bundle-0@overcloud-controller-0 rabbitmq-bundle-1@overcloud-controller-1 rabbitmq-bundle-2@overcloud-controller-2 redis-bundle-0@overcloud-controller-0 redis-bundle-1@overcloud-controller-1 redis-bundle-2@overcloud-controller-2 ]

Full list of resources:

 Docker container set: rabbitmq-bundle [192.168.100.1:8787/tripleorocky/centos-binary-rabbitmq:pcmklatest]
   rabbitmq-bundle-0 (ocf::heartbeat:rabbitmq-cluster): Started overcloud-controller-0
   rabbitmq-bundle-1 (ocf::heartbeat:rabbitmq-cluster): Started overcloud-controller-1
   rabbitmq-bundle-2 (ocf::heartbeat:rabbitmq-cluster): Started overcloud-controller-2
 Docker container set: galera-bundle [192.168.100.1:8787/tripleorocky/centos-binary-mariadb:pcmklatest]
   galera-bundle-0 (ocf::heartbeat:galera): Master overcloud-controller-0
   galera-bundle-1 (ocf::heartbeat:galera): Master overcloud-controller-1
   galera-bundle-2 (ocf::heartbeat:galera): FAILED Master overcloud-controller-2 (blocked)
 Docker container set: redis-bundle [192.168.100.1:8787/tripleorocky/centos-binary-redis:pcmklatest]
   redis-bundle-0 (ocf::heartbeat:redis): Master overcloud-controller-0
   redis-bundle-1 (ocf::heartbeat:redis): Slave overcloud-controller-1
   redis-bundle-2 (ocf::heartbeat:redis): Slave overcloud-controller-2
 ip-192.168.100.98 (ocf::heartbeat:IPaddr2): Started overcloud-controller-0
 ip-10.10.0.11 (ocf::heartbeat:IPaddr2): Started overcloud-controller-1
 ip-192.168.102.185 (ocf::heartbeat:IPaddr2): Started overcloud-controller-2
 ip-192.168.102.116 (ocf::heartbeat:IPaddr2): Started overcloud-controller-0
 ip-192.168.103.187 (ocf::heartbeat:IPaddr2): Started overcloud-controller-1
 ip-192.168.104.127 (ocf::heartbeat:IPaddr2): Started overcloud-controller-2
 Docker container set: haproxy-bundle [192.168.100.1:8787/tripleorocky/centos-binary-haproxy:pcmklatest]
   haproxy-bundle-docker-0 (ocf::heartbeat:docker): Started overcloud-controller-0
   haproxy-bundle-docker-1 (ocf::heartbeat:docker): Started overcloud-controller-1
   haproxy-bundle-docker-2 (ocf::heartbeat:docker): Started overcloud-controller-2
 Docker container: openstack-cinder-volume [192.168.100.1:8787/tripleorocky/centos-binary-cinder-volume:pcmklatest]
   openstack-cinder-volume-docker-0 (ocf::heartbeat:docker): Started overcloud-controller-0

Failed Resource Actions:
* redis-bundle-docker-1_monitor_60000 on overcloud-controller-1 'unknown error' (1): call=132, status=Timed Out, exitreason='', last-rc-change='Thu Jul 23 16:42:15 2020', queued=0ms, exec=0ms
* galera-bundle-docker-2_monitor_60000 on overcloud-controller-2 'unknown error' (1): call=41, status=Timed Out, exitreason='', last-rc-change='Thu Jul 23 16:48:39 2020', queued=0ms, exec=0ms
* redis-bundle-docker-2_monitor_60000 on overcloud-controller-2 'unknown error' (1): call=62, status=Timed Out, exitreason='', last-rc-change='Thu Jul 23 16:48:39 2020', queued=0ms, exec=0ms
* haproxy-bundle-docker-2_monitor_60000 on overcloud-controller-2 'unknown error' (1): call=106, status=Timed Out, exitreason='', last-rc-change='Thu Jul 23 16:48:39 2020', queued=0ms, exec=0ms
* rabbitmq-bundle-docker-2_monitor_60000 on overcloud-controller-2 'unknown error' (1): call=121, status=Timed Out, exitreason='', last-rc-change='Thu Jul 23 16:48:39 2020', queued=0ms, exec=0ms
* galera_promote_0 on galera-bundle-2 'unknown error' (1): call=43, status=complete, exitreason='MySQL server failed to start (pid=646) (rc=0), please check your installation', last-rc-change='Thu Jul 23 16:49:14 2020', queued=0ms, exec=12193ms

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

This happens every time the total number of networks in the OC goes above 70. I am also attaching the overcloud error logs.

Regards
Rahul Pathak
i2k2 Networks (P) Ltd. | Spring Meadows Business Park
A61-B4 & 4A First Floor, Sector 63, Noida - 201 301
ISO/IEC 27001:2005 & ISO 9001:2008 Certified

----- Original Message -----
From: "Alfredo Moralejo Alonso" <[email protected]>
To: "Rahul Pathak" <[email protected]>
Cc: "RDO Developmen List" <[email protected]>
Sent: Thursday, July 23, 2020 3:34:45 PM
Subject: Re: [rdo-dev] tripleo cluster failure

On Wed, Jul 22, 2020 at 3:23 PM Rahul Pathak < [email protected] > wrote:

Hi, I have installed containerized TripleO OpenStack (Rocky release), with the undercloud on a virtual platform and 3 controllers and 2 compute nodes on bare metal. My whole setup runs on CentOS 7. The overcloud cluster starts failing once the number of networks in the overcloud exceeds 70. Many resource failures are reported. I don't know why the HA cluster fails after 70 networks in the OC.

What kind of errors are you seeing? What "resource failures"?

<blockquote>
Is there some kind of threshold in the TripleO configuration that prevents creating more than 70 or 80 networks? How can I fix this? I did not see this issue when using the Red Hat platform and its repos. The issue occurs with the open-source repos on CentOS 7. Please help me fix this so that I can scale my OpenStack up to 2000 VMs; in the current situation that is not possible.

Regards
Rahul Pathak
i2k2 Networks (P) Ltd. | Spring Meadows Business Park
A61-B4 & 4A First Floor, Sector 63, Noida - 201 301
ISO/IEC 27001:2005 & ISO 9001:2008 Certified

_______________________________________________
dev mailing list
[email protected]
http://lists.rdoproject.org/mailman/listinfo/dev
To unsubscribe: [email protected]
</blockquote>
Overcloud_ERROR.rar
Description: application/rar
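For anyone comparing several runs, here is a minimal sketch of how the "Failed Resource Actions" section of the pcs status output above could be summarized per node. It assumes the Pacemaker 1.1 line format shown in this message; the function names (parse_failed_actions, timeouts_per_node) and the SAMPLE variable are made up for illustration and are not part of pcs or TripleO.

```python
import re
from collections import Counter

# Matches one "Failed Resource Actions" entry as printed in the output above.
# Assumption: the Pacemaker 1.1 text format; other versions may differ.
FAILED_ACTION = re.compile(
    r"\*\s+(?P<action>\S+)\s+on\s+(?P<node>\S+)\s+"
    r"'(?P<error>[^']*)'\s+\((?P<rc>\d+)\):"
    r".*?status=(?P<status>[^,]+),",
    re.S,  # tolerate entries that were wrapped across lines
)

# Three entries copied from the pcs status output in this thread.
SAMPLE = """Failed Resource Actions:
* redis-bundle-docker-1_monitor_60000 on overcloud-controller-1 'unknown error' (1): call=132, status=Timed Out, exitreason='', last-rc-change='Thu Jul 23 16:42:15 2020', queued=0ms, exec=0ms
* galera-bundle-docker-2_monitor_60000 on overcloud-controller-2 'unknown error' (1): call=41, status=Timed Out, exitreason='', last-rc-change='Thu Jul 23 16:48:39 2020', queued=0ms, exec=0ms
* galera_promote_0 on galera-bundle-2 'unknown error' (1): call=43, status=complete, exitreason='MySQL server failed to start (pid=646) (rc=0), please check your installation', last-rc-change='Thu Jul 23 16:49:14 2020', queued=0ms, exec=12193ms
"""


def parse_failed_actions(text):
    """Return one dict per failed action: action, node, error, rc, status."""
    return [m.groupdict() for m in FAILED_ACTION.finditer(text)]


def timeouts_per_node(text):
    """Count failed actions whose status is 'Timed Out', grouped by node."""
    return Counter(
        a["node"] for a in parse_failed_actions(text) if a["status"] == "Timed Out"
    )


if __name__ == "__main__":
    for node, count in sorted(timeouts_per_node(SAMPLE).items()):
        print(node, count)
```

Feeding it the full pcs status text from each controller would show whether the monitor timeouts cluster on a single node, as they appear to on overcloud-controller-2 in the output above.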
