Hi Wilfried,
did you check that you could connect to these servers via SSH before you ran
the ansible installer?
Have you applied any custom iptables rules to these servers (via
cloud-init or similar, maybe)?
Do these servers have only one NIC, i.e. one IP address, over which you
access them?
Maybe try opening port 22 explicitly via iptables on one node, to test
whether it's the firewall that blocks the requests.
See what happens if you stop the origin-node service on a compute node
(systemctl stop origin-node). If that doesn't help, try flushing all applied
iptables rules and then re-adding only port 22, for example (better make a
backup first; I think kube-proxy generates them, but I'm not 100% sure).
And please provide the output of "netstat -tupln" and "ip address show" from
one compute node and from the infra node (and check whether the sshd service
is bound to your external IP).
Even if the cluster is causing this behaviour, I think the issue might be
caused by a specific server configuration (e.g. firewall, network). I'm
trying to isolate the possible cause with these questions.

Best Regards,
Nikolas


On Tue, 16 Apr 2019 at 16:42, ANUZET Wilfried <
wilfried.anu...@uclouvain.be> wrote:

> Hello Nikolas,
>
>
>
> I just tested something, and it clearly seems to be a network problem on
> the OpenShift cluster itself.
>
> I just rebooted the master to test, and the server is accessible during a
> short window after the TCP/IP stack comes up but before the firewall / OKD
> start.
>
>
>
> I don't know where I missed something.
>
>
>
>
>
>
>
>
> *From:* ANUZET Wilfried
> *Sent:* Tuesday, 16 April 2019 15:17
> *To:* 'Nikolas Philips' <nikolas.phil...@gmail.com>
> *Subject:* RE: OKD installation on CentOS 7.6
>
>
>
> Hello Nikolas,
>
>
>
> Here are the points you mentioned that I had to check:
>
> ·         On all servers, NM_CONTROLLED=yes is set in the network
> interface definitions.
>
>                 The service itself is running:
>
> [root@okdmst01t ~]# systemctl status NetworkManager
>
> ● NetworkManager.service - Network Manager
>
>    Loaded: loaded (/usr/lib/systemd/system/NetworkManager.service;
> enabled; vendor preset: enabled)
>
>    Active: active (running) since Mon 2019-04-15 10:05:21 CEST; 1 day 5h
> ago
>
>      Docs: man:NetworkManager(8)
>
> Main PID: 10304 (NetworkManager)
>
>    CGroup: /system.slice/NetworkManager.service
>
>            └─10304 /usr/sbin/NetworkManager --no-daemon
>
>
>
> Apr 15 10:17:40 okdmst01t.stluc.ucl.ac.be NetworkManager[10304]: <info>
> [1555316260.6944] device (veth726db232): enslaved to non-master-type device
> ovs-system; ignoring
>
> Apr 15 10:18:35 okdmst01t.stluc.ucl.ac.be NetworkManager[10304]: <info>
> [1555316315.3185] device (veth138e5060): carrier: link connected
>
> Apr 15 10:18:35 okdmst01t.stluc.ucl.ac.be NetworkManager[10304]: <info>
> [1555316315.3188] manager: (veth138e5060): new Veth device
> (/org/freedesktop/NetworkManager/Devices/12)
>
> Apr 15 10:18:35 okdmst01t.stluc.ucl.ac.be NetworkManager[10304]: <info>
> [1555316315.3346] device (veth138e5060): enslaved to non-master-type device
> ovs-system; ignoring
>
> Apr 15 10:18:44 okdmst01t.stluc.ucl.ac.be NetworkManager[10304]: <info>
> [1555316324.3338] manager: (veth95ee3ae7): new Veth device
> (/org/freedesktop/NetworkManager/Devices/13)
>
> Apr 15 10:18:44 okdmst01t.stluc.ucl.ac.be NetworkManager[10304]: <info>
> [1555316324.3347] device (veth95ee3ae7): carrier: link connected
>
> Apr 15 10:18:44 okdmst01t.stluc.ucl.ac.be NetworkManager[10304]: <info>
> [1555316324.3555] device (veth95ee3ae7): enslaved to non-master-type device
> ovs-system; ignoring
>
> Apr 15 10:20:39 okdmst01t.stluc.ucl.ac.be NetworkManager[10304]: <info>
> [1555316439.2149] device (vethb5a95288): carrier: link connected
>
> Apr 15 10:20:39 okdmst01t.stluc.ucl.ac.be NetworkManager[10304]: <info>
> [1555316439.2155] manager: (vethb5a95288): new Veth device
> (/org/freedesktop/NetworkManager/Devices/14)
>
> Apr 15 10:20:39 okdmst01t.stluc.ucl.ac.be NetworkManager[10304]: <info>
> [1555316439.2515] device (vethb5a95288): enslaved to non-master-type device
> ovs-system; ignoring
>
>
>
> ·         I can reach another server in a different internal subnet,
> outside the /24 subnet used by the OKD servers (I can't reach a server
> outside our internal network, as outbound SSH and ICMP are disabled at our
> firewall level…)
>
>
>
> ·         The routes are the same on the master and the lb node:
>
> LB:
>
> [root@okdlb01t ~]$ ip route show
>
> default via 10.244.246.2 dev ens192 proto static metric 100
>
> 10.244.246.0/24 dev ens192 proto kernel scope link src 10.244.246.84
> metric 100
>
> 172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
>
>
>
> MASTER:
>
> [root@okdmst01t ~]# ip route show
>
> default via 10.244.246.2 dev ens192 proto static metric 100
>
> 10.128.0.0/14 dev tun0 scope link
>
> 10.244.246.0/24 dev ens192 proto kernel scope link src 10.244.246.66
> metric 100
>
> 172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
>
> 172.30.0.0/16 dev tun0
>
>
>
> ·         Here's the output of the oc commands for the OpenShift SDN
> pods:
>
> [root@okdmst01t ~]# oc get pods -n openshift-sdn
>
> NAME        READY     STATUS    RESTARTS   AGE
>
> ovs-h6vqq   1/1       Running   0          1d
>
> ovs-prm2z   1/1       Running   0          1d
>
> ovs-r5wll   1/1       Running   0          1d
>
> ovs-stnc5   1/1       Running   0          1d
>
> sdn-4g5fk   1/1       Running   0          1d
>
> sdn-4vlpr   1/1       Running   0          1d
>
> sdn-5775r   1/1       Running   0          1d
>
> sdn-j87dp   1/1       Running   0          1d
>
>
>
>
>
> Thanks for your help.
>
>
>
>
>
>
>
>
> *From:* Nikolas Philips <nikolas.phil...@gmail.com>
> *Sent:* Tuesday, 16 April 2019 14:47
> *To:* ANUZET Wilfried <wilfried.anu...@uclouvain.be>
> *Subject:* Re: OKD installation on CentOS 7.6
>
>
>
> Hey Wilfried,
>
> it looks like you have some networking issues. I think the [lb] node isn't
> affected because there's only an HAProxy deployment on it, and that node is
> probably not integrated into the SDN of your cluster. So I guess the
> Ansible installer, or rather the installation of the SDN, messed up your
> network settings.
>
> Can you reach hosts outside of your subnet from the master node? E.g.
> 1.1.1.1, or an internal host in a different subnet?
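>
> For example (a quick sketch; the second target is just a placeholder for
> whatever host fits):
>
>   ping -c 3 1.1.1.1
>   ping -c 3 <a host in another internal subnet>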
>
> Is NetworkManager enabled and running on all nodes (required!)?
>
> Are the default routes correct on all nodes? Check with "ip route show"
> and look for the line starting with "default". Is the gateway correct? Is
> it the same one the LB node has?
>
> When you are connected to the master node, can you execute "oc get
> nodes"? If yes, can you check whether the SDN pods are running ("oc get
> pods -n openshift-sdn")? And are the nodes Ready?
>
>
>
> Best Regards,
>
> Nikolas
>
>
>
>
>
> On Tue, 16 Apr 2019 at 14:21, ANUZET Wilfried <
> wilfried.anu...@uclouvain.be> wrote:
>
> Hello,
>
>
>
> I tried to install OKD onto brand-new CentOS 7.6 VMs.
>
> As I had already set up a simple cluster on my cloud server to learn
> OpenShift (1 master, 1 node / CentOS 7.6 running on Proxmox), I assumed it
> would be just as easy using the openshift-ansible project.
>
>
>
> Here are the servers I want to deploy:
>
> okdlb01t => OKD Load balancer / 1CPU / 2G RAM / 1NIC
>
> okdmst01t => OKD master / 8CPU / 16G RAM / 1NIC
>
> okdnod01t / okdnod02t => 2 OKD nodes / 4CPU / 8G RAM / 1NIC
>
> okdinf01t => OKD infrastructure node / 4CPU / 8G RAM / 1NIC
>
>
>
>
> All servers are configured to:
>
> - use one of our internal /24 networks
>
> - use the corporate proxy at user-space and Docker level
>
> - use Red Hat Satellite as the repository source
>
> - use Active Directory as the user authentication method
>
> - be accessible through SSH.
>
>
>
> Here's my inventory file:
>
> ---------------------
>
> [masters]
>
> okdmst01t.stluc.ucl.ac.be
>
>
>
> [etcd]
>
> okdmst01t.stluc.ucl.ac.be openshift_master_cluster_hostname="okdmst01t.stluc.ucl.ac.be" openshift_schedulable=true
>
>
>
> [nodes]
>
> okdmst01t.stluc.ucl.ac.be openshift_node_group_name="node-config-master"
>
> okdinf01t.stluc.ucl.ac.be openshift_node_group_name="node-config-infra"
>
> okdnod0[1:2]t.stluc.ucl.ac.be
> openshift_node_group_name="node-config-compute"
>
>
>
> [lb]
>
> okdlb01t.stluc.ucl.ac.be
>
>
>
> [OSEv3:children]
>
> masters
>
> nodes
>
> etcd
>
> lb
>
>
>
> [OSEv3:vars]
>
> openshift_deployment_type=origin
>
> openshift_master_default_subdomain=okdt.stluc.ucl.ac.be
>
> debug_level=2
>
> ansible_become=true
>
> openshift_docker_insecure_registries=172.30.0.0/16
>
> openshift_release=3.11
>
> openshift_install_examples=true
>
> os_firewall_use_firewalld=true
>
> openshift_disable_check=docker_image_availability
>
> ---------------------
>
>
>
> I use the Ansible Tower upstream (AWX) to deploy OKD and made the
> following workflow:
> prerequisites.yml == on-success ==> deploy-cluster.yml == on-failure ==>
> uninstall.yml
>
>
>
> Everything seems to run well, and my workflow executes correctly.
>
>
>
> I don't know why, but once OKD is deployed none of the master / node /
> infra servers is accessible through SSH, and none responds to ping.
>
> I can still use the VMware console and see that all the containers are up
> and running.
>
>
>
> I can still log in to the lb, and all nodes are visible from it.
>
>
>
> So I can't connect to the web console or log in with oc. Here's what I
> tried:
>
> - in the browser (tested with the latest Firefox and Chromium):
> https://okdmst01t.stluc.ucl.ac.be:8443/
>
>   Connection timed out
>
>
>
> - CLI:
>
>   oc login https://okdmst01t.stluc.ucl.ac.be:8443
>
>   error: dial tcp 10.244.246.66:8443: i/o timeout - verify you have
> provided the correct host and port and that the server is currently running.
>
>
>
> Do you have a clue what I should check?
>
> Is there something I missed?
>
> I already read the latest OKD docs and the server-world tutorial (
> https://www.server-world.info/en/note?os=CentOS_7&p=openshift311&f=1), but
> I couldn't find anything to help me solve this.
>
> I don't really know what to search for…
>
> If you have a clue or something to help, please share it.
>
>
>
> Best regards.
>
>
>
>
>
>
>
>