Hello,

Our project consists of:

- a Mongo cluster consisting of three pods
- a single Postgres pod
- one or more application pods

When an application pod starts up, it opens about 20 connections each
to the Mongo cluster and the Postgres database.
After a few seconds, new connections from the application pods to both
databases start failing.

This can be reproduced from a command line running in the application
pod's namespace: when repeatedly opening telnet connections to the
respective database ports, some connections succeed while others
(about 25%) fail, seemingly at random, with 'network unreachable'. This
only happens when the application pod and the database pod are on
different OpenShift nodes. If both are on the same node, connection
attempts always succeed.
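
The telnet test described above could also be scripted, roughly along
these lines. This is only a sketch in Python: the probe() function, its
parameters and the command-line arguments are hypothetical names chosen
for illustration, and it assumes Python 3 is available in the pod. It
opens a number of TCP connections and tallies the outcomes by errno
name, so 'network unreachable' would be counted as ENETUNREACH.

```python
import errno
import socket
import sys
from collections import Counter


def probe(host, port, attempts=100, timeout=3.0):
    """Open `attempts` TCP connections to host:port and tally the outcomes.

    Returns a Counter keyed by "ok", "timeout", or the errno name of the
    failure (for example "ENETUNREACH" or "ECONNREFUSED").
    """
    outcomes = Counter()
    for _ in range(attempts):
        try:
            # The connection is closed again as soon as it is established.
            with socket.create_connection((host, port), timeout=timeout):
                outcomes["ok"] += 1
        except socket.timeout:
            outcomes["timeout"] += 1
        except OSError as exc:
            # Errors without a known errno (e.g. name resolution) end up
            # under "unknown".
            outcomes[errno.errorcode.get(exc.errno, "unknown")] += 1
    return outcomes


if __name__ == "__main__":
    # Hypothetical usage: python probe.py <host> <port>
    result = probe(sys.argv[1], int(sys.argv[2]))
    for outcome, count in sorted(result.items()):
        print(outcome, count)
```

Running it once against a database pod on the same node and once
against one on a different node should make the failure rate easy to
compare.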

This problem manifests itself on both of our OpenShift 3.9 clusters.
One is installed on CentOS 7 nodes in a Google Cloud environment, the
other on CentOS 7 KVM VMs running on Ubuntu 18.04 hosts on bare metal.
We are using the default ovs-subnet SDN.

We did not encounter this problem in OpenShift versions 1.2.1 and 1.4.1
(installed on the same types of VMs and hardware).

Has anybody else noticed similar problems? What's the best way to debug
this?


Thanks,

Andre
-- 
Andre Esser, IT Manager
Voidbridge Software Ltd

_______________________________________________
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
