Thanks for all of your help everyone,

I've been busy with other things but was able to pick up where I left off with Magnum. After fixing some issues I have been able to provision a working Kubernetes cluster.

I'm still having issues getting Docker Swarm working. I've tried both Docker and Flannel as the networking layer, but neither works. After investigating, the issue seems to be that etcd.service is not installed (the unit file doesn't exist), so the master doesn't come up; the minion swarm node is provisioned but cannot join the cluster because there is no etcd.
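For reference, this is roughly how I've been verifying that on the master (generic systemd/rpm checks, nothing Magnum-specific):

    systemctl status etcd.service      # confirms the unit is missing
    ls /etc/systemd/system/ /usr/lib/systemd/system/ | grep -i etcd
    rpm -q etcd                        # is the package even present in the image?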

Has anybody seen this issue before? I've been digging through all the cloud-init logs and cannot see anything that would cause this.

I also have a separate issue: when provisioning through magnum-ui in Horizon and selecting Ubuntu with Mesos, I get the error "The Parameter (nodes_affinity_policy) was not provided". nodes_affinity_policy does have a default value in magnum.conf, so I'm starting to think this might be an issue with the magnum-ui dashboard?
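For reference, the default I'm referring to looks roughly like this in magnum.conf (I believe the option lives in the [cluster] section):

    [cluster]
    nodes_affinity_policy = soft-anti-affinity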

Best regards
Tobias

On 08/04/2018 06:24 PM, Joe Topjian wrote:
We recently deployed Magnum and I've been making my way through getting both Swarm and Kubernetes running. I also ran into some initial issues. These notes may or may not help, but I thought I'd share them just in case:

* We're using Barbican for SSL. I have not tried with the internal x509keypair.
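For reference, that only amounts to the following in magnum.conf (barbican is the default cert_manager_type):

    [certificates]
    cert_manager_type = barbican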

* I was only able to get things running with Fedora Atomic 27, specifically the version used in the Magnum docs: https://docs.openstack.org/magnum/latest/install/launch-instance.html

Anything beyond that wouldn't even boot in my cloud. I haven't dug into this.
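For reference, the upload from that doc boils down to something like the following; the os_distro property is what Magnum keys off of, and the filename is just whichever qcow2 build you grab:

    openstack image create Fedora-Atomic-27 \
      --disk-format qcow2 --container-format bare \
      --file Fedora-Atomic-27.qcow2 \
      --property os_distro=fedora-atomic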

* Kubernetes requires a Cluster Template to have a label of cert_manager_api=true set in order for the cluster to fully come up (at least, it didn't work for me until I set this).
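A template with that label looks roughly like this; the names, flavors and networks below are just placeholders for whatever fits your cloud:

    openstack coe cluster template create k8s-atomic \
      --image Fedora-Atomic-27 \
      --coe kubernetes \
      --external-network public \
      --master-flavor m1.small --flavor m1.small \
      --network-driver flannel \
      --labels cert_manager_api=true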

As far as troubleshooting methods go, check the cloud-init logs on the individual instances to see if any of the "parts" have failed to run. Manually re-run the parts on the command-line to get a better idea of why they failed. Review the actual script, figure out the variable interpolation and how it relates to the Cluster Template being used.
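Concretely, that amounts to something like this on an affected instance (the part-XXX filename is just a placeholder; pick whichever one failed):

    grep -iE 'fail|error|traceback' /var/log/cloud-init-output.log
    ls /var/lib/cloud/instance/scripts/                     # the individual "parts"
    sudo bash -x /var/lib/cloud/instance/scripts/part-011   # re-run the failing one with tracing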

Eventually I was able to get clusters running with the stock driver/templates, but wanted to tune them in order to better fit in our cloud, so I've "forked" them. This is in no way a slight against the existing drivers/templates nor do I recommend doing this until you reach a point where the stock drivers won't meet your needs. But I mention it because it's possible to do and it's not terribly hard. This is still a work-in-progress and a bit hacky:

https://github.com/cybera/magnum-templates

Hope that helps,
Joe

On Fri, Aug 3, 2018 at 6:46 AM, Tobias Urdin <tobias.ur...@binero.se> wrote:

    Hello,

    I'm testing out Magnum and have so far only had issues.
    I've tried deploying Docker Swarm (on Fedora Atomic 27 and Fedora
    Atomic 28) and Kubernetes (on Fedora Atomic 27) and haven't been
    able to get either working.

    Running Queens, is there any information about supported images?
    Is Magnum still maintained to support Fedora Atomic?
    What is in charge of populating the certificates inside the
    instances? This seems to be the root of all my issues. I'm not
    using Barbican but the x509keypair driver; is that the reason?

    Perhaps I missed some documentation stating that x509keypair does
    not support what I'm trying to do?
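
    For what it's worth, a rough way to check whether the CA/certs are
    generated at all, as opposed to never being pushed to the
    instances (the cluster name below is a placeholder):

        openstack coe ca show swarm-test      # does Magnum have a CA for the cluster?
        # and on a master/node:
        ls -l /etc/docker/ /etc/etcd/certs/ /etc/kubernetes/certs/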

    I've seen the following issues:

    Docker:
    * Master does not start and listen on TCP because of certificate
    issues:
    dockerd-current[1909]: Could not load X509 key pair (cert:
    "/etc/docker/server.crt", key: "/etc/docker/server.key")

    * Node does not start with:
    Dependency failed for Docker Application Container Engine.
    docker.service: Job docker.service/start failed with result
    'dependency'.

    Kubernetes:
    * Master etcd does not start because /run/etcd does not exist.
    ** When that is created, it still fails to start because of a
    certificate error:
    2018-08-03 12:41:16.554257 C | etcdmain: open
    /etc/etcd/certs/server.crt: no such file or directory

    * Master kube-apiserver does not start because of a certificate
    error:
    unable to load server certificate: open
    /etc/kubernetes/certs/server.crt: no such file or directory

    * The master's Heat script just sleeps forever waiting for port
    8080 (kube-apiserver) to become available, so it can never
    kubectl apply the final steps.

    * Node does not even start and times out when Heat deploys it,
    probably because the master never finishes.

    Any help is appreciated; perhaps I've missed something crucial.
    I've not tested Kubernetes on CoreOS yet.

    Best regards
    Tobias




