On 07/11/2019 at 07:18, Roy Golan wrote:


On Thu, 7 Nov 2019 at 00:10, Nathanaël Blanchet <blanc...@abes.fr> wrote:


    On 05/11/2019 at 21:50, Roy Golan wrote:


    On Tue, 5 Nov 2019 at 22:46, Roy Golan <rgo...@redhat.com> wrote:



        On Tue, 5 Nov 2019 at 20:28, Nathanaël Blanchet
        <blanc...@abes.fr> wrote:


            On 05/11/2019 at 18:22, Roy Golan wrote:


            On Tue, 5 Nov 2019 at 19:12, Nathanaël Blanchet
            <blanc...@abes.fr> wrote:


                On 05/11/2019 at 13:54, Roy Golan wrote:


                On Tue, 5 Nov 2019 at 14:52, Nathanaël Blanchet
                <blanc...@abes.fr> wrote:

                    I tried openshift-install after compiling it, but
                    no oVirt provider is available... So what do you
                    mean when you say "give it a try"? Maybe only
                    provisioning oVirt with the Terraform module?

                    [root@vm5 installer]# bin/openshift-install
                    create cluster
                    ? Platform  [Use arrows to move, space to
                    select, type to filter, ? for more help]
                    > aws
                      azure
                      gcp
                      openstack


                It's not merged yet. Please pull this image and work
                with it as a container:
                quay.io/rgolangh/openshift-installer

                A little feedback as you asked:

                [root@openshift-installer ~]# docker run -it
                56e5b667100f create cluster
                ? Platform ovirt
                ? Enter oVirt's api endpoint URL
                https://air-dev.v100.abes.fr/ovirt-engine/api
                ? Enter ovirt-engine username admin@internal
                ? Enter password **********
                ? Pick the oVirt cluster Default
                ? Pick a VM template centos7.x
                ? Enter the internal API Virtual IP 10.34.212.200
                ? Enter the internal DNS Virtual IP 10.34.212.100
                ? Enter the ingress IP 10.34.212.50
                ? Base Domain oc4.localdomain
                ? Cluster Name test
                ? Pull Secret [? for help]
                *************************************
                INFO Creating infrastructure resources...
                INFO Waiting up to 30m0s for the Kubernetes API at
                https://api.test.oc4.localdomain:6443...
                ERROR Attempted to gather ClusterOperator status
                after installation failure: listing ClusterOperator
                objects: Get
                
https://api.test.oc4.localdomain:6443/apis/config.openshift.io/v1/clusteroperators:
                dial tcp: lookup api.test.oc4.localdomain on
                10.34.212.100:53: no such host
                INFO Pulling debug logs from the bootstrap machine
                ERROR Attempted to gather debug logs after
                installation failure: failed to create SSH client,
                ensure the proper ssh key is in your keyring or
                specify with --key: failed to initialize the SSH
                agent: failed to read directory "/output/.ssh": open
                /output/.ssh: no such file or directory
                FATAL Bootstrap failed to complete: waiting for
                Kubernetes API: context deadline exceeded

                  * 6 VMs are successfully created as thin clones
                    dependent on the template
                  * each VM is provisioned by cloud-init
                  * the step "INFO Waiting up to 30m0s for the
                    Kubernetes API at
                    https://api.test.oc4.localdomain:6443..." fails.
                    It seems that the DNS pod is not up at this point.
                  * At this moment there is no visibility into what is
                    being done or what goes wrong... what's happening
                    there? Presumably some kind of playbook downloading
                    some kind of images...
                  * The "pull secret" step is not clear: we must have a
                    Red Hat account at
                    https://cloud.redhat.com/openshift/install/ to get
                    a key like:

                    {"auths":{
                      "cloud.openshift.com":{"auth":"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2K2V4cGxvaXRhYmVzZnIxdGN0ZnR0dmFnMHpuazMxd2IwMnIwenV1MDg6TE9XVzFQODM1NzNJWlI4MlZDSUEyTFdEVlJJS0U5VTVWM0NTSUdOWjJH********************==","email":"expl...@abes.fr"},
                      "quay.io":{"auth":"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2K2V4cGxvaXRhYmVzZnIxdGN0ZnR0dmFnMHpuazMxd2IwMnIwenV1MDg6TE9XVzFQODM1NzNJWlI4MlZDSUEyTFdEVlJJS0U5VTVWM0NTSUdOWjJH********************==","email":"expl...@abes.fr"},
                      "registry.connect.redhat.com":{"auth":"NTI0MjkwMnx1aGMtMVRDVEZUVFZBRzBaTkszMXdCMDJSMFp1VTA4OmV5SmhiR2NpT2lKU1V6VXhNaUo5LmV5SnpkV0lpT2lJMk4ySTJNREV3WXpObE1HSTBNbVE0T1RGbVpUZGxa**********************","email":"expl...@abes.fr"},
                      "registry.redhat.io":{"auth":"NTI0MjkwMnx1aGMtMVRDVEZUVFZBRzBaTkszMXdCMDJSMFp1VTA4OmV5SmhiR2NpT2lKU1V6VXhNaUo5LmV5SnpkV0lpT2lJMk4ySTJNREV3WXpObE1HSTBNbVE0T1RGbVpUZGxa**********************","email":"expl...@abes.fr"}}}

                Can you tell me what I'm doing wrong?


            What template are you using? I don't think it's an
            RHCOS (Red Hat CoreOS) template; it looks like CentOS?

            Use this gist to import the template
            https://gist.github.com/rgolangh/adccf6d6b5eaecaebe0b0aeba9d3331b
            Unfortunately, the result is the same with the RHCOS
            template...


        Make sure that:
        - the IPs supplied are taken, and belong to the VM network of
          those master VMs
        - a localdomain or .local domain suffix shouldn't be used
        - your ovirt-engine is version 4.3.7 or master

    I didn't mention that you can provide any domain name, even a
    non-existing one.
    When the bootstrap phase is done, the installation will tear down
    the bootstrap machine.
    At this stage, if you are using a non-existing domain, you need to
    add the DNS Virtual IP you provided to your resolv.conf so the
    installation can resolve api.$CLUSTER_NAME.$CLUSTER_DOMAIN.

    Also, you have a log under your $INSTALL_DIR/.openshift_install.log

    I tried several things following your advice, but I'm still stuck
    at the https://api.test.oc4.localdomain:6443/version?timeout=32s test

    with logs:

    time="2019-11-06T20:21:15Z" level=debug msg="Still waiting for the
    Kubernetes API: the server could not find the requested resource"

    So it means DNS resolution and the network are now good and
    ignition provisioning is OK, but something goes wrong with the
    bootstrap VM.

    Now if I log into the bootstrap VM, I can see an SELinux message,
    but it may not be relevant...

    SELinux: mount invalid. Same Superblock, different security
    settings for (dev nqueue, type nqueue).

    Some other clues with journalctl:

    journalctl -b -f -u bootkube

    Nov 06 21:55:40 localhost bootkube.sh[2101]:
    
{"level":"warn","ts":"2019-11-06T21:55:40.661Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying
    of unary invoker
    
failed","target":"endpoint://client-7beef51d-daad-4b46-9497-8e135e528f7c/etcd-1.test.oc4.localdomain:2379","attempt":0,"error":"rpc
    error: code = DeadlineExceeded desc = latest connection error:
    connection error: desc = \"transport: Error while dialing dial
    tcp: lookup etcd-1.test.oc4.localdomain on 10.34.212.101:53: no such host\""}
    Nov 06 21:55:40 localhost bootkube.sh[2101]:
    
{"level":"warn","ts":"2019-11-06T21:55:40.662Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying
    of unary invoker
    
failed","target":"endpoint://client-03992fc6-5a87-4160-9b87-44ec6e82f7cd/etcd-2.test.oc4.localdomain:2379","attempt":0,"error":"rpc
    error: code = DeadlineExceeded desc = latest connection error:
    connection error: desc = \"transport: Error while dialing dial
    tcp: lookup etcd-2.test.oc4.localdomain on 10.34.212.101:53: no such host\""}
    Nov 06 21:55:40 localhost bootkube.sh[2101]:
    
{"level":"warn","ts":"2019-11-06T21:55:40.662Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying
    of unary invoker
    
failed","target":"endpoint://client-00db28a7-5188-4666-896b-e37c88ad3ae9/etcd-0.test.oc4.localdomain:2379","attempt":0,"error":"rpc
    error: code = DeadlineExceeded desc = latest connection error:
    connection error: desc = \"transport: Error while dialing dial
    tcp: lookup etcd-0.test.oc4.localdomain on 10.34.212.101:53: no such host\""}
    Nov 06 21:55:40 localhost bootkube.sh[2101]:
    https://etcd-1.test.oc4.localdomain:2379 is unhealthy: failed to
    commit proposal: context deadline exceeded
    Nov 06 21:55:40 localhost bootkube.sh[2101]:
    https://etcd-2.test.oc4.localdomain:2379 is unhealthy: failed to
    commit proposal: context deadline exceeded
    Nov 06 21:55:40 localhost bootkube.sh[2101]:
    https://etcd-0.test.oc4.localdomain:2379 is unhealthy: failed to
    commit proposal: context deadline exceeded
    Nov 06 21:55:40 localhost bootkube.sh[2101]: Error: unhealthy cluster
    Nov 06 21:55:40 localhost podman[61210]: 2019-11-06
    21:55:40.720514151 +0000 UTC m=+5.813853296 container died
    7db3014e3f19c61775bac2a7a155eeb8521a6b78fea0d512384dd965cb0b8b01
    
    (image=registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae,
    name=etcdctl)
    Nov 06 21:55:40 localhost podman[61210]: 2019-11-06
    21:55:40.817475095 +0000 UTC m=+5.910814273 container remove
    7db3014e3f19c61775bac2a7a155eeb8521a6b78fea0d512384dd965cb0b8b01
    
    (image=registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae,
    name=etcdctl)
    Nov 06 21:55:40 localhost bootkube.sh[2101]: etcdctl failed.
    Retrying in 5 seconds...

    It again seems to be a DNS resolution issue.

    [user1@localhost ~]$ dig api.test.oc4.localdomain +short
    10.34.212.201

    [user1@localhost ~]$ dig etcd-2.test.oc4.localdomain +short
    (no answer)


    So what do you think about that?


The key here is the masters: they need to boot, get their ignition config from the bootstrap machine, and start publishing their IPs and hostnames.

Connect to a master, check its hostname, and check its running or failing containers with `crictl ps -a` as the root user.

You were right:

# crictl ps -a
CONTAINER ID    IMAGE                                                              CREATED          STATE     NAME        ATTEMPT   POD ID
744cb8e654705   e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8   4 minutes ago    Running   discovery   75        9462e9a8ca478
912ba9db736c3   e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8   14 minutes ago   Exited    discovery   74        9462e9a8ca478

# crictl logs 744cb8e654705
E1107 08:10:04.262330       1 run.go:67] error looking up self for candidate IP 10.34.212.227: lookup _etcd-server-ssl._tcp.test.oc4.localdomain on 10.34.212.51:53: no such host

# hostname
localhost

Conclusion: discovery didn't publish the IP and hostname to CoreDNS because the master didn't get its name master-0.test.oc4.localdomain during the provisioning phase.

I changed the master-0 hostname and re-triggered ignition to verify:

# hostnamectl set-hostname master-0.test.oc4.localdomain

# touch /boot/ignition.firstboot && rm -rf /etc/machine-id && reboot

After the reboot completes, there is no longer an exited discovery container:

CONTAINER ID    IMAGE                                                              CREATED          STATE     NAME                 ATTEMPT   POD ID
e701efa8bc583   77ec5e26cc676ef2bf5c42dd40e55394a11fb45a3e2d7e95cbaf233a1eef472f   20 seconds ago   Running   coredns              1         cbabc53322ac8
2c7bc6abb5b65   d73eca122bd567a3a1f70fa5021683bc17dd87003d05d88b1cdd0215c55049f6   20 seconds ago   Running   mdns-publisher       1         6f8914ff9db35
b3f619d5afa2c   7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370   21 seconds ago   Running   haproxy-monitor      1         0e5c209496787
07769ce79b032   7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370   21 seconds ago   Running   keepalived-monitor   1         02cf141d01a29
fb20d66b81254   e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8   21 seconds ago   Running   discovery            77        562f32067e0a7
476b07599260e   86a34bc5edd3e70073313f97bfd51ed8937658b341dc52334fb98ea6896ebdc2   22 seconds ago   Running   haproxy              1         0e5c209496787
26b53050a412b   9f94e500f85a735ec212ffb7305e0b63f7151a5346e41c2d5d293c8456f6fa42   22 seconds ago   Running   keepalived           1         02cf141d01a29
30ce48453854b   7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370   22 seconds ago   Exited    render-config        1         cbabc53322ac8
ad3ab0ae52077   7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370   22 seconds ago   Exited    render-config        1         6f8914ff9db35
650d62765e9e1   registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:9a7e8297f7b70ee3a11fcbe4a78c59a5861e1afda5657a7437de6934bdc2458e   13 hours ago   Exited   coredns   0   2ae0512b3b6ac
481969ce49bb9   registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:768194132b4dbde077e32de8801c952265643da00ae161f1ee560fabf6ed1f8e   13 hours ago   Exited   mdns-publisher   0   d49754042b792
3594d9d261ca7   registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:8c3b0223699801a9cb93246276da4746fa4d6fa66649929b2d9b702c17dac75d   13 hours ago   Exited   haproxy-monitor   0   3476219058ba8
88b13ec02a5c1   7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370   13 hours ago   Exited   keepalived-monitor   0   a3e13cf07c04f
1ab721b5599ed   registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:629d73fdd28beafda788d2248608f90c7ed048357e250f3e855b9462b92cfe60   13 hours ago   [row truncated in original]

because DNS registration is OK:

[user1@master-0 ~]$ dig etcd-0.test.oc4.localdomain +short
10.34.212.227

CONCLUSION:

 * None of the RHCOS VMs is correctly provisioned with its targeted
   hostname, so they all stay on "localhost".
 * The cloud-init syntax for the hostname is OK, but it is not
   provisioned by ignition.

Why not provision these hostnames with a JSON snippet like this?

{"ignition":{"version":"2.2.0"},"storage":{"files":[{"filesystem":"root","path":"/etc/hostname","mode":420,"contents":{"source":"data:,master-0.test.oc4.localdomain"}}]}}
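
A sketch of generating one such fragment per master (the output file
names and the three-node count are my assumptions, not installer
behavior):

```shell
# Hypothetical sketch: emit one ignition fragment per master, each
# writing /etc/hostname so the node boots with its real name instead
# of "localhost". File names and node list are assumptions.
CLUSTER_DOMAIN=test.oc4.localdomain
for n in 0 1 2; do
  host="master-$n.$CLUSTER_DOMAIN"
  printf '{"ignition":{"version":"2.2.0"},"storage":{"files":[{"filesystem":"root","path":"/etc/hostname","mode":420,"contents":{"source":"data:,%s"}}]}}\n' \
    "$host" > "master-$n-ignition.json"
done

cat master-0-ignition.json
```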









                    On 05/11/2019 at 12:24, Roy Golan wrote:


                    On Tue, 5 Nov 2019 at 13:22, Nathanaël Blanchet
                    <blanc...@abes.fr> wrote:

                        Hello,

                        I'm interested in installing OKD on oVirt with
                        the official OpenShift installer
                        (https://github.com/openshift/installer), but
                        oVirt is not yet supported.


                    If you want to give it a try and supply feedback,
                    I'll be glad.

                        Regarding
                        https://bugzilla.redhat.com/show_bug.cgi?id=1578255
                        and
                        https://lists.ovirt.org/archives/list/users@ovirt.org/thread/EF7OQUVTY53GV3A7NVQVUT7UCUYKK5CH/,
                        how will oVirt 4.3.7 integrate the OpenShift
                        installer with Terraform?


                    Terraform is part of it, yes. It is what we use to
                    spin up the first 3 masters, plus a bootstrapping
                    machine.

-- Nathanaël Blanchet

                        Supervision réseau
                        Pôle Infrastrutures Informatiques
                        227 avenue Professeur-Jean-Louis-Viala
                        34193 MONTPELLIER CEDEX 5
                        Tél. 33 (0)4 67 54 84 55
                        Fax  33 (0)4 67 54 84 14
                        blanc...@abes.fr


_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/OQQO7C26JELAWIO3S63JUMKJOEZ3YBKR/
