Dear community,
I think the problem lies here:

$ openssl x509 -in /etc/etcd/peer.crt -text -noout
        Subject: CN=xxx.xxx
            X509v3 Subject Alternative Name:
                IP Address:z.z.z.z

The CN belongs to master1, but the IP address belongs to master3.

On top of that, the same /etc/etcd/peer.crt, with identical values, is present
on all three masters.
It should be:
(on master1) CN:master1 IP:master1
(on master2) CN:master2 IP:master2
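
The expectation above (each master's peer certificate carries that master's own CN and IP) could be checked with a small script. This is only a sketch: the function name check_cert_text is hypothetical, it parses the text that `openssl x509 -text -noout` prints, and it assumes a grep that supports -o (GNU or BSD).

```shell
#!/bin/sh
# Hypothetical helper: read `openssl x509 -text -noout` output on stdin and
# report whether the subject CN and the SAN IP match the expected values.
check_cert_text() {
    expected_cn=$1
    expected_ip=$2
    text=$(cat)
    # Pull the CN out of the "Subject:" line (handles "CN=x" and "CN = x").
    cn=$(printf '%s\n' "$text" |
        sed -n 's|^ *Subject:.*CN *= *\([^,/]*\).*$|\1|p')
    # The SAN may list several "IP Address:" entries; one must match exactly.
    if [ "$cn" = "$expected_cn" ] &&
        printf '%s\n' "$text" | grep -o 'IP Address:[^, ]*' |
            grep -qxF "IP Address:$expected_ip"; then
        echo "OK: CN=$cn, SAN contains $expected_ip"
    else
        echo "MISMATCH: CN=$cn, expected CN=$expected_cn with IP $expected_ip"
    fi
}

# Example, to be run on each master with that master's own hostname and IP:
#   openssl x509 -in /etc/etcd/peer.crt -text -noout | check_cert_text xxx.xxx x.x.x.x
```

With the certificate shown above (CN of master1, IP of master3), this would report a mismatch on every master.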

It seems that one of the recent commits in this area broke things. It was
working fine before :(
But I can’t find the commit. :(

Really need help with this.
Thanks a lot!
   Sebastian Wieseler



On 8 Apr 2016, at 12:05 PM, Sebastian Wieseler 
<sebast...@myrepublic.com.sg> wrote:

Dear community,
I am running the latest version of the openshift-ansible playbooks and followed
the advanced installation guide.
(Updating 6bae443..1b82b1b)


When I execute ansible-playbook ~/openshift-ansible/playbooks/byo/config.yml it 
fails with:
TASK: [openshift_master | Start and enable master api] ************************
failed: [x.x.x.x] => {"failed": true}
msg: Job for origin-master-api.service failed because the control process 
exited with error code. See "systemctl status origin-master-api.service" and 
"journalctl -xe" for details.



Apr 08 03:47:43   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:43   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:43   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:43   etcd[12180]: dropped MsgHeartbeatResp to 9dc58f8e2290c613 
since pipeline's sending buffer is full
Apr 08 03:47:43   etcd[12180]: dropped MsgProp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:45   etcd[12180]: dropped MsgHeartbeatResp to 9dc58f8e2290c613 
since pipeline's sending buffer is full
Apr 08 03:47:45   etcd[12180]: publish error: etcdserver: request timed out, 
possibly due to connection lost
Apr 08 03:47:45   origin-master-controllers[116866]: E0408 03:47:45.976514  
116866 leaderlease.go:69] unable to check lease 
openshift.io/leases/controllers: 501:
All the given peers are not reachable (failed to propose on members 
[https://xxx.xxx:2379 x509: certificate signed by unknown 
authority]) [0]

Apr 08 03:47:47   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:47   etcd[12180]: the connection to peer af936f5f6ff57c05 is 
unhealthy
Apr 08 03:47:47   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:47   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:47   etcd[12180]: dropped MsgHeartbeatResp to 9dc58f8e2290c613 
since pipeline's sending buffer is full
Apr 08 03:47:47   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:47   etcd[12180]: dropped MsgProp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:47   origin-node[26652]: E0408 03:47:47.708378   26652 
kubelet.go:2761] Error updating node status, will retry: error getting node 
"xxx.xxx": error #0: net/http: TLS handshake timeout
Apr 08 03:47:47   origin-node[26652]: error #1: net/http: TLS handshake timeout
Apr 08 03:47:47   origin-node[26652]: error #2: x509: certificate signed by 
unknown authority
Apr 08 03:47:48   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:48   etcd[12180]: dropped MsgHeartbeatResp to 9dc58f8e2290c613 
since pipeline's sending buffer is full
Apr 08 03:47:48   etcd[12180]: dropped MsgHeartbeatResp to 9dc58f8e2290c613 
since pipeline's sending buffer is full
Apr 08 03:47:49   origin-node[26652]: E0408 03:47:49.187066   26652 
kubelet.go:2761] Error updating node status, will retry: error getting node 
"xxx.xxx": error #0: x509: certificate signed by unknown authority
Apr 08 03:47:49   origin-node[26652]: error #1: x509: certificate signed by 
unknown authority
Apr 08 03:47:49   origin-node[26652]: error #2: x509: certificate signed by 
unknown authority
Apr 08 03:47:49   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:49   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:49   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:49   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:49   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:49   etcd[12180]: dropped MsgHeartbeatResp to 9dc58f8e2290c613 
since pipeline's sending buffer is full
Apr 08 03:47:49   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:49   etcd[12180]: failed to dial af936f5f6ff57c05 on stream MsgApp 
v2 (EOF)
Apr 08 03:47:49   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:49   etcd[12180]: failed to dial af936f5f6ff57c05 on stream 
Message (EOF)
Apr 08 03:47:49   etcd[12180]: dropped MsgHeartbeatResp to 9dc58f8e2290c613 
since pipeline's sending buffer is full
Apr 08 03:47:49   etcd[12180]: dropped MsgHeartbeatResp to 9dc58f8e2290c613 
since pipeline's sending buffer is full
Apr 08 03:47:49   etcd[12180]: the connection with 9dc58f8e2290c613 became 
inactive
Apr 08 03:47:49   etcd[12180]: failed to write 9dc58f8e2290c613 on pipeline 
(EOF)
Apr 08 03:47:51   etcd[12180]: failed to dial af936f5f6ff57c05 on stream 
Message (x509: certificate is valid for y.y.y.y, not z.z.z.z)
 --> (my note: z.z.z.z is my master03 and y.y.y.y is my master02)
Apr 08 03:47:51   etcd[12180]: failed to dial af936f5f6ff57c05 on stream MsgApp 
v2 (x509: certificate is valid for y.y.y.y, not z.z.z.z)
Apr 08 03:47:52   etcd[12180]: failed to write 9dc58f8e2290c613 on pipeline 
(net/http: TLS handshake timeout)
Apr 08 03:47:52   etcd[12180]: the connection with 9dc58f8e2290c613 became 
active
Apr 08 03:47:53   etcd[12180]: the connection with 9dc58f8e2290c613 became 
inactive
Apr 08 03:47:53   etcd[12180]: failed to write 9dc58f8e2290c613 on pipeline 
(net/http: TLS handshake timeout)
Apr 08 03:47:54   etcd[12180]: etcdserver: request timed out, possibly due to 
connection lost
Apr 08 03:47:56   etcd[12180]: publish error: etcdserver: request timed out, 
possibly due to connection lost
Apr 08 03:47:56   etcd[12180]: the connection with 9dc58f8e2290c613 became 
active
Apr 08 03:48:01   origin-node[26652]: E0408 03:48:01.380964   26652 
kubelet.go:2761] Error updating node status, will retry: error getting node 
"xxx.xxxt": error
Apr 08 03:48:01   origin-node[26652]: error #1: net/http: TLS handshake timeout
Apr 08 03:48:01   origin-node[26652]: error #2: x509: certificate signed by 
unknown authority
Apr 08 03:48:03   etcd[12180]: the connection with 9dc58f8e2290c613 became 
inactive
Apr 08 03:48:03   etcd[12180]: failed to write 9dc58f8e2290c613 on pipeline 
(EOF)
Apr 08 03:48:04   origin-master-controllers[116866]: E0408 03:48:04.691728  
116866 leaderlease.go:69] unable to check lease 
openshift.io/leases/controllers: 501: 
All the given peers are not reachable
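
Most of the log above is etcd backpressure noise; the certificate errors are the relevant part. A minimal sketch of a filter that pulls out just the distinct x509 errors and counts them (the function name x509_summary is hypothetical, and grep -o is assumed to be available):

```shell
#!/bin/sh
# Hypothetical helper: read log text on stdin and print each distinct
# x509 certificate error with its number of occurrences, most frequent first.
x509_summary() {
    # Capture from "x509: certificate" up to a closing ")" or "]" or end of line.
    grep -o 'x509: certificate [^])]*' |
        sed 's/ *$//' |
        sort | uniq -c | sort -rn
}

# Example (journalctl prints each entry on one line, unlike the wrapped
# excerpt in this mail):
#   journalctl --since "1 hour ago" | x509_summary
```

On the excerpt above this would leave two kinds of error: "certificate signed by unknown authority" and "certificate is valid for y.y.y.y, not z.z.z.z".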



My setup includes three masters:
[masters]
x.x.x.x openshift_hostname=xxx.xxx openshift_public_hostname=xxx.xxx
y.y.y.y openshift_hostname=yyy.yyy openshift_public_hostname=yyy.yyy
z.z.z.z openshift_hostname=zzz.zzz openshift_public_hostname=zzz.zzz

[etcd]
x.x.x.x openshift_hostname=xxx.xxx openshift_public_hostname=xxx.xxx
y.y.y.y openshift_hostname=yyy.yyy openshift_public_hostname=yyy.yyy
z.z.z.z openshift_hostname=zzz.zzz openshift_public_hostname=zzz.zzz



I also tried destroying the config:
# yum -y remove openshift openshift-* etcd
# rm -rf /etc/origin /var/lib/openshift /etc/etcd \
    /var/lib/etcd /etc/sysconfig/atomic-openshift* \
    /root/.kube/config /etc/ansible/facts.d /usr/share/openshift

But ansible fails at the same step and the cert errors persist.

Can somebody help me?

Thanks a lot in advance!
Best Regards,
  Sebastian Wieseler


_______________________________________________
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
