Re: Pods stuck on 'ContainerCreating' when redhat/openshift-ovs-multitenant enabled

2019-10-16 Thread Yu Wei
Hi Dan,
I checked the logs of all pods in the openshift-sdn namespace and didn't find any
errors in them.
I reinstalled with 'redhat/openshift-ovs-multitenant' on a clean machine, and
everything works well.

So I suspect the uninstall playbook didn't clean up the Calico plugin properly.
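If the uninstall playbook is the suspect, a quick check on each node for leftover Calico CNI artifacts would confirm it. The directories named in the comments are the stock CNI locations on an OCP 3.11 node; the helper itself is only an illustrative sketch, not part of openshift-ansible:

```shell
# check_leftover_cni DIR...: report any Calico files left in the given CNI
# directories; returns non-zero when leftovers are found.
# On an OpenShift 3.11 node the interesting directories would be
# /etc/cni/net.d and /opt/cni/bin (assumed stock paths).
check_leftover_cni() {
    status=0
    for d in "$@"; do
        for f in "$d"/*calico*; do
            if [ -e "$f" ]; then
                echo "leftover: $f"
                status=1
            fi
        done
    done
    return $status
}
```

Running something like `check_leftover_cni /etc/cni/net.d /opt/cni/bin` after the uninstall playbook finishes would show anything that has to be removed by hand before reinstalling with openshift-ovs-multitenant.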

Thanks,
Jared


On Oct 16, 2019, at 1:09 AM, Dan Williams <d...@redhat.com> wrote:

On Tue, 2019-10-15 at 06:18 +, Yu Wei wrote:
I found the root cause of this issue.
On my machine, I first deployed OCP with Calico, and it worked well.
Then I ran the uninstall playbook and reinstalled with the openshift-ovs-
multitenant SDN plugin.
After that it no longer worked.
I found the following:

[root@buzz1 openshift-ansible]# systemctl status atomic-openshift-node.service
● atomic-openshift-node.service - OpenShift Node
   Loaded: loaded (/etc/systemd/system/atomic-openshift-node.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2019-10-14 00:43:08 PDT; 22h ago
     Docs: https://github.com/openshift/origin
 Main PID: 87388 (hyperkube)
   CGroup: /system.slice/atomic-openshift-node.service
           ├─87388 /usr/bin/hyperkube kubelet --v=6 --address=0.0.0.0 --allow-privileged=true --anonymous-auth=true --authentication-toke...
           └─88872 /opt/cni/bin/calico
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.289674   87388 common.go:71] Using namespace "kube-syaml
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.289809   87388 file.go:199] Reading config file "/et...yaml"
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.292556   87388 common.go:62] Generated UID "598eab3cyaml
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.293602   87388 common.go:66] Generated Name "master-yaml
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.294512   87388 common.go:71] Using namespace "kube-syaml
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.295667   87388 file.go:199] Reading config file "/et...yaml"
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.296350   87388 common.go:62] Generated UID "d71dc810yaml
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.296367   87388 common.go:66] Generated Name "master-yaml
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.296379   87388 common.go:71] Using namespace "kube-syaml
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.300194   87388 config.go:303] Setting pods for source file
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.361625   87388 kubelet.go:1884] SyncLoop (SYNC): 3 p...d33c)
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.361693   87388 config.go:100] Looking for [api file]...e:{}]
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.361716   87388 kubelet.go:1907] SyncLoop (housekeeping)
Hint: Some lines were ellipsized, use -l to show in full.
[root@buzz1 openshift-ansible]# ps -ef | grep calico
root  88872  87388  0 23:15 ?        00:00:00 /opt/cni/bin/calico
root  88975  74601  0 23:15 pts/0    00:00:00 grep --color=auto calico
[root@buzz1 openshift-ansible]#

The Calico process appears to be a leftover here. Using the same inventory
file, OCP 3.11 could be deployed successfully on a clean VM.
My guess is that the uninstall playbook did not clean up Calico thoroughly.
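One way to confirm which plugin the node is actually configured with is to read `networkPluginName` from the node configuration (with a stock OCP 3.11 layout that would be `/etc/origin/node/node-config.yaml`; path and key name are recalled here, not taken from the logs above). The small filter below is a sketch that reads the YAML from stdin rather than assuming the path:

```shell
# network_plugin: print the networkPluginName value from node-config.yaml
# style input on stdin. On a node you would run, e.g.:
#   network_plugin < /etc/origin/node/node-config.yaml
network_plugin() {
    awk -F': *' '/networkPluginName/ { print $2 }'
}
```

If this still prints a Calico plugin name after the reinstall, the node config was not rewritten for openshift-ovs-multitenant.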


On Oct 12, 2019, at 11:52 PM, Yu Wei <yu20...@hotmail.com> wrote:

Hi,
I tried to install OCP 3.11 with the following variables set:
openshift_use_openshift_sdn=true
os_sdn_network_plugin_name='redhat/openshift-ovs-multitenant'

Some pods are stuck on 'ContainerCreating'.
[root@buzz1 openshift-ansible]# oc get pods --all-namespaces
NAMESPACE        NAME                                   READY   STATUS              RESTARTS   AGE
default          docker-registry-1-deploy               0/1     ContainerCreating   0          5h
default          registry-console-1-deploy              0/1     ContainerCreating   0          5h
kube-system      master-api-buzz1.center1.com           1/1     Running             0          5h
kube-system      master-controllers-buzz1.center1.com   1/1     Running             0          5h
kube-system      master-etcd-buzz1.center1.com          1/1     Running             0          5h
openshift-node   sync-x8j7d                             1/1     Running             0          5h
openshift-sdn    ovs-ff7r7                              1/1     Running             0          5h
openshift-sdn    sdn-7frfw                              1/1     Running             10         5h
openshift
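For pods stuck in ContainerCreating, the pod's event list is usually where the network-setup error surfaces. A small filter like the sketch below (just `sed`, nothing OpenShift-specific) pulls the Events: section out of `oc describe pod` output:

```shell
# pod_events: print only the "Events:" section of `oc describe pod` output,
# which is where CNI / sandbox-creation failures for ContainerCreating pods
# are reported. On a live cluster, e.g.:
#   oc -n default describe pod docker-registry-1-deploy | pod_events
pod_events() {
    sed -n '/^Events:/,$p'
}
```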