Issues running 'cotd' demo on Ubuntu Linux 16.04
Hi there,

I am trying to follow the book "DevOps with OpenShift" and having some trouble. When I get to the point of launching the 'cotd' container demo, specifically this line:

    oc new-app --name='cotd' --labels name='cotd' php~https://github.com/devops-with-openshift/cotd.git -e SELECTOR=cats

it appears to spool up and build okay, but then fails in a crash loop:

    AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 172.17.0.2. Set the 'ServerName' directive globally to suppress this message
    (13)Permission denied: AH00058: Error retrieving pid file /opt/rh/httpd24/root/var/run/httpd/httpd.pid
    AH00059: Remove it before continuing if it is corrupted.

I'm running Ubuntu 16.04 64-bit, fully up to date including the kernel, using the latest stable oc binary (Origin v3.6.0), and running Docker from its official repos. OpenShift itself seems to work great and has no major issues.

FWIW, I've used this exact same book and the exact same commands and processes to successfully get the demo up and running on two different Macs, so it looks like this is a Ubuntu-specific issue. I would switch to Fedora, but my workstation requires Ubuntu for various annoying reasons. I want to get this up and running, as my workstation has 32GB of RAM, which is 2-4x more than all my other machines!

Version information follows:

    foo@bar:~$ oc version
    oc v3.6.0+c4dd4cf
    kubernetes v1.6.1+5115d708d7
    features: Basic-Auth GSSAPI Kerberos SPNEGO

    Server https://127.0.0.1:8443
    openshift v3.6.0+c4dd4cf
    kubernetes v1.6.1+5115d708d7

    foo@bar:~$ docker -v
    Docker version 17.06.1-ce, build 874a737

    foo@bar:~$ lsb_release -a
    No LSB modules are available.
    Distributor ID: Ubuntu
    Description:    Ubuntu 16.04.3 LTS
    Release:        16.04
    Codename:       xenial

    foo@bar:~$ uname -a
    Linux bar 4.4.0-93-generic #116-Ubuntu SMP Fri Aug 11 21:17:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
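A few commands that may help narrow down a crash loop like this (the pod name is a placeholder; substitute whatever 'oc get pods' reports):

    # list pods and pull logs from the previous, crashed container attempt
    oc get pods
    oc logs <cotd-pod-name> --previous

    # start a debug copy of the pod with a shell, then inspect the pid-file directory
    oc debug dc/cotd
    ls -ld /opt/rh/httpd24/root/var/run/httpd

    # check which Docker storage driver the Ubuntu host is using
    docker info | grep 'Storage Driver'

Since the same image runs fine on the Macs, one speculative lead is the storage driver: permission oddities inside containers on Ubuntu have sometimes been traced to aufs rather than overlay2. That is an assumption to verify, not a confirmed diagnosis.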
Re: Let's Encrypt certificates
Hey. How ready is it for use in production? Are there any plans to change the interfaces/mechanisms in the near future? Thanks, great job! ;)

On 5 Sep 2017 at 2:23 PM, "Tim Dudgeon" wrote:
> Tomas
>
> Thanks, that helped. [...]
Re: Let's Encrypt certificates
Tomas

Thanks, that helped.

The problem was that it wasn't clear that you needed to install into a new project, and then update the

    oc adm policy add-cluster-role-to-user acme-controller system:serviceaccount:acme:default

command, replacing acme with the name of the project. Once that is done it installs fine and issues certificates as described.

Thanks
Tim

On 05/09/2017 17:38, Tomas Nozicka wrote:
> [...]
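For later readers, with the project named acme-controller the corrected grant would presumably be the following (substitute whatever project name you actually used):

    # grant the acme-controller cluster role to the default SA in the project the controller runs in
    oc adm policy add-cluster-role-to-user acme-controller system:serviceaccount:acme-controller:default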
Re: Let's Encrypt certificates
Hi Tim,

(see inline...)

On Tue, 2017-09-05 at 17:12 +0100, Tim Dudgeon wrote:
> Thanks.
>
> I'm having problems getting this running.
> When I deploy the deploymentconfig the pod fails to start and the logs
> contain these errors:
>
> > 2017-09-05T16:03:11.764025351Z ERROR cmd.go:138 Unable to bootstrap
> > certificate database: 'User
> > and
> > 2017-09-05T16:03:11.766213869Z ERROR cmd.go:173 Couln't initialize
> > RouteController: 'RouteController could not find its own service:
> > 'User "system:serviceaccount:acme-controller:default" cannot get
> > services in project "acme-controller"''

The misconfigured SA is system:serviceaccount:acme-controller:default
- notably, the namespace is **acme-controller**.

> I already deployed the clusterrole and executed
>
> > oc adm policy add-cluster-role-to-user acme-controller
> > system:serviceaccount:acme:default
>
> Even tried as suggested:
>
> > oc adm policy add-cluster-role-to-user cluster-admin
> > system:serviceaccount:acme:default

You are modifying the SA in namespace **acme**, not **acme-controller**.

> I tried this in the default project and in a new acme-controller
> project.
>
> Could you help describe steps to get this running in a new openshift
> environment?

Try looking at the exact steps our CI uses to create it from scratch, but it should work as described in our docs:

https://github.com/tnozicka/openshift-acme/blob/master/.travis.yml#L67-L73

> Thanks
> Tim
>
> On 04/09/2017 09:44, Tomas Nozicka wrote:
> > [...]
Re: Let's Encrypt certificates
Thanks.

I'm having problems getting this running. When I deploy the deploymentconfig the pod fails to start and the logs contain these errors:

    2017-09-05T16:03:11.764025351Z ERROR cmd.go:138 Unable to bootstrap certificate database: 'User
    and
    2017-09-05T16:03:11.766213869Z ERROR cmd.go:173 Couln't initialize RouteController: 'RouteController could not find its own service: 'User "system:serviceaccount:acme-controller:default" cannot get services in project "acme-controller"''

I already deployed the clusterrole and executed

    oc adm policy add-cluster-role-to-user acme-controller system:serviceaccount:acme:default

Even tried as suggested:

    oc adm policy add-cluster-role-to-user cluster-admin system:serviceaccount:acme:default

I tried this in the default project and in a new acme-controller project.

Could you help describe steps to get this running in a new openshift environment?

Thanks
Tim

On 04/09/2017 09:44, Tomas Nozicka wrote:
> Hi Tim,
>
> On Mon, 2017-09-04 at 09:16 +0100, Tim Dudgeon wrote:
> > Tomas
> >
> > Thanks for that. Looks very interesting.
> >
> > I've looked it over and am not totally sure how to use this.
> >
> > Am I right that if this controller is deployed and running correctly,
> > then all you need to do for any routes is add the
> > 'kubernetes.io/tls-acme: "true"' annotation to your route, and the
> > controller will handle creating the initial certificate and renewing
> > it as needed?
>
> Correct.
>
> > And in doing so it will generate/renew the certificate for the
> > hostname, add/update it as a secret, and update the route definition
> > to use that certificate?
>
> For Routes it will generate a secret with that certificate and also
> inline it into the Route, as Routes don't support referencing it.
> (Ingresses do, but the project doesn't support those yet.) The secret
> can be useful for checking, or for mounting into pods directly if you
> want to terminate your TLS in the pods rather than in the router.
>
> > And this will only apply to external routes. Some mechanism, such as
> > the Ansible playbook, will still be required to maintain the
> > certificates that are used internally by the OpenShift
> > infrastructure?
>
> I have some thoughts on this but no code :/
>
> As I said, at this point you need to bootstrap the infra using your own
> CA/self-signed cert, and then you can expose the OpenShift API + web
> console using a Route. This should work fine even for the 'oc' client,
> unless the Router is down and you need to fix it. For that rare case,
> when only the admin will need to log in to fix the router, he can use
> the internal cert or ssh into the cluster directly.
>
> So this hack should cover all the use cases for users, except this
> special case for an admin.
>
> > Thanks
> > Tim
> >
> > On 25/08/2017 17:09, Tomas Nozicka wrote:
> > > Hi Tim,
> > >
> > > there is a controller to take care of generating and renewing Let's
> > > Encrypt certificates for you.
> > >
> > > https://github.com/tnozicka/openshift-acme
> > >
> > > That said, it won't generate them for masters, but you can expose
> > > the master API using a Route, and the certificate for that Route
> > > would be fully managed by openshift-acme.
> > >
> > > Further integrations might be possible in future, but this is how
> > > you can get it done now.
> > >
> > > Regards,
> > > Tomas
> > >
> > > On Fri, 2017-08-25 at 16:27 +0100, Tim Dudgeon wrote:
> > > > Does anyone have any experience on how best to use Let's Encrypt
> > > > certificates for an OpenShift Origin cluster?
> > > >
> > > > In one sense this is simple. The Ansible installer can be told to
> > > > use this custom certificate and key to sign all the certificates
> > > > it generates, and doing so ensures you don't get the dreaded
> > > > "This site is insecure" messages from your browser. And there is
> > > > a playbook for updating certificates (which is essential, as
> > > > Let's Encrypt certificates are short-lived), so this must be
> > > > automated.
> > > >
> > > > But how best to set this up and automate the certificate
> > > > generation and renewal? Let's assume Ansible is being run from a
> > > > separate machine that is not part of the cluster and needs to
> > > > deploy those custom certificates to the master(s). The
> > > > certificate needs to be present on the ansible machine but needs
> > > > to apply to the master(s) (or load balancer?). So you can't just
> > > > generate the certificate on the ansible machine (e.g. using the
> > > > --standalone option for certbot), as it would not be for the
> > > > right machine. Similarly it doesn't seem right to request and
> > > > update the certificates on the master (which master, in the case
> > > > of multiple masters?), and those certificates need to be present
> > > > on the ansible machine.
> > > >
> > > > Seems like the answer might be to run a process on the ansible
> > > > machine that requests the certificates using the webroot plugin,
> > > > and in doing so places the magical key that is used to verify
> > > > ownership of the domain under the
> > > > https://your.site.com/.well-known/acme-challenge location? But
> > > > how to go about doing this? Ports 80 and 443 seem to be in use on
> > > > the cluster, but not serving up any particular content. How to
> > > > place the content there?
> > > >
> > > > I'm hoping others have already needed to handle this problem and
> > > > can point to some best practice.
> > > >
> > > > Thanks
> > > > Tim
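To make the annotation step discussed above concrete, opting an existing Route into the controller should be a one-liner (the route name here is hypothetical):

    # mark a route so openshift-acme provisions and renews its certificate
    oc annotate route myroute kubernetes.io/tls-acme=true

And for the webroot idea at the end, the certbot side might look roughly like this, assuming you can arrange for the webroot's .well-known/acme-challenge path to be served on port 80 of the requested hostname (paths and domain are illustrative, not a tested recipe):

    # request a certificate, publishing the challenge file under the given webroot
    certbot certonly --webroot -w /var/www/acme -d your.site.com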
Re: Metrics not accessible
Still no joy with this. I retried with the latest code and am still hitting the same problem. Metrics does not seem to be working with a new Ansible install. I'm using a minimal setup with an inventory like this:

    [OSEv3:children]
    masters
    nodes
    etcd
    nfs

    [OSEv3:vars]
    ansible_ssh_user=centos
    ansible_become=yes
    openshift_deployment_type=origin
    openshift_release=v3.6
    openshift_disable_check=disk_availability,docker_storage,memory_availability

    openshift_hosted_metrics_deploy=true
    openshift_hosted_metrics_storage_kind=nfs
    openshift_hosted_metrics_storage_access_modes=['ReadWriteOnce']
    openshift_hosted_metrics_storage_nfs_directory=/exports
    openshift_hosted_metrics_storage_nfs_options='*(rw,root_squash)'
    openshift_hosted_metrics_storage_volume_name=metrics
    openshift_hosted_metrics_storage_volume_size=10Gi
    openshift_hosted_metrics_storage_labels={'storage': 'metrics'}

    [masters]
    ip-10-0-113-31.eu-west-1.compute.internal

    [etcd]
    ip-10-0-113-31.eu-west-1.compute.internal

    [nfs]
    ip-10-0-113-31.eu-west-1.compute.internal

    [nodes]
    ip-10-0-113-31.eu-west-1.compute.internal openshift_node_labels="{'region': 'infra','zone': 'default'}" openshift_schedulable=true

When the install completes, the openshift-infra project pods end up like this:

    NAME                         READY     STATUS             RESTARTS   AGE
    hawkular-cassandra-1-4m7lq   1/1       Running            0          16m
    hawkular-metrics-0nl1q       0/1       CrashLoopBackOff   7          16m
    heapster-cgw0b               0/1       Running            1          16m

The hawkular-metrics pod is failing, and it looks like it's because it can't connect to the cassandra pod. The full log of the hawkular-metrics pod is here:

https://gist.github.com/tdudgeon/f3099911eed441817369ee03635aad7d

Any help resolving this would be appreciated.

Tim
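If it really is metrics-to-cassandra connectivity, a few checks that might help isolate it (pod names taken from the listing above; they will differ per install):

    # confirm the Cassandra service exists and actually has endpoints
    oc get svc,endpoints -n openshift-infra

    # logs from the crash-looping pod, including the previous failed attempt
    oc logs hawkular-metrics-0nl1q -n openshift-infra --previous

    # check Cassandra's own logs for startup or storage permission problems
    oc logs hawkular-cassandra-1-4m7lq -n openshift-infra

Given the NFS options above include root_squash, Cassandra's write permissions on the NFS-backed volume could be one thing to rule out, though that is only a guess from the inventory.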
Re: oc -w timeout
Thanks a lot! It would take me forever to realize the masters are behind an ELB ;)

Best

--
Mateus Caruccio / Master of Puppets
GetupCloud.com
We make the infrastructure invisible
Gartner Cool Vendor 2017

2017-09-05 9:44 GMT-03:00 Philippe Lafoucrière <philippe.lafoucri...@tech-angels.com>:
> Hi,
>
> You might want to take a look at this thread:
> https://lists.openshift.redhat.com/openshift-archives/users/2017-June/msg00135.html
>
> Cheers
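For reference, if the watch connection is being cut by the ELB's idle timeout, raising it on a classic ELB would look roughly like this (load balancer name and timeout value are illustrative):

    # raise the idle timeout so long-lived watch connections are not dropped
    aws elb modify-load-balancer-attributes \
      --load-balancer-name my-openshift-master-elb \
      --load-balancer-attributes "{\"ConnectionSettings\":{\"IdleTimeout\":3600}}"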
Re: oc -w timeout
Hi,

You might want to take a look at this thread:
https://lists.openshift.redhat.com/openshift-archives/users/2017-June/msg00135.html

Cheers
oc -w timeout
Hi there.

Where is the config located for changing the timeout of watch operations? I'm getting disconnected after 5 minutes and would like to increase this value.

--
Mateus Caruccio / Master of Puppets
GetupCloud.com
We make the infrastructure invisible
Gartner Cool Vendor 2017