Re: Installing packages on OKD 4 nodes

2020-10-28 Thread Joel Pearson
Hi Benjamin,

Alas, Fedora CoreOS must differ here.

I did find this workaround for installing packages via Ignition; if adding
linuxptp actually made it work for you, then this might help:
https://github.com/coreos/fedora-coreos-tracker/issues/307

I don't know how this would work with OS updates coming via the machine
config operator. Could be worth a try I guess?
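
For what it's worth, the workaround in that issue boils down to a MachineConfig
that ships a one-shot systemd unit which layers the package with rpm-ostree and
reboots. A rough, untested sketch of the idea (the name, role label and
Ignition version below are my assumptions):

cat << 'EOF' | oc apply -f -
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-worker-layer-linuxptp
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 3.1.0
    systemd:
      units:
        - name: layer-linuxptp.service
          enabled: true
          contents: |
            [Unit]
            Description=Layer linuxptp with rpm-ostree
            Wants=network-online.target
            After=network-online.target
            # Skip once the package is already layered and live
            ConditionPathExists=!/usr/sbin/ptp4l

            [Service]
            Type=oneshot
            RemainAfterExit=yes
            ExecStart=/usr/bin/rpm-ostree install linuxptp
            ExecStart=/usr/bin/systemctl --no-block reboot

            [Install]
            WantedBy=multi-user.target
EOF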

Thanks,

Joel

On Thu, 29 Oct 2020 at 00:17, Benjamin Guillon 
wrote:

> Hi Joel,
>
> Well, no: /dev/ptp0 does not magically exist. Wouldn't that be too easy?
> :)
>
> As for FCOS-specific documentation regarding NTP/PTP, I'm afraid I didn't
> find any.
>
> Best,
> --
> Benjamin
> --
> *From: *"Joel Pearson" 
> *To: *"Benjamin Guillon" 
> *Cc: *"users" 
> *Sent: *Wednesday 28 October 2020 13:56:05
> *Subject: *Re: Installing packages on OKD 4 nodes
>
> Hi Benjamin,
> Those docs you’ve mentioned are for regular Fedora, not Fedora CoreOS, I
> believe, and I’m pretty sure the two are very different.
>
> So I presume you have checked that /dev/ptp0 doesn’t already magically
> exist?
>
> Thanks,
>
> Joel
>
> Sent from my iPhone
>
> On 28 Oct 2020, at 11:45 pm, Benjamin Guillon <
> benjamin.guil...@cc.in2p3.fr> wrote:
>
> 
> Hi Joel,
>
> Thanks for the reply :)
>
> I did give the OpenShift PTP operator a try and it works well, aside from
> the fact that I can't use it here since I'm not running on bare metal.
>
> Our cluster indeed runs on our in-house Openstack platform.
>
> Usually we use the KVM PTP module with something like:
>
> refclock PHC /dev/ptp0 poll 2
>
> In the chrony.conf file.
> But that's for our usual CentOS-based VMs, not FCOS :/
>
> So now I'm just trying to reproduce that setup on FCOS.
> I found this Fedora documentation earlier about PTP
> https://docs.fedoraproject.org/en-US/fedora/rawhide/system-administrators-guide/servers/Configuring_PTP_Using_ptp4l/
> Where they mention this linuxptp package, hence my questions.
>
> If I can't manage this, I'll resort to using standard NTP instead of PTP.
>
> Best,
> Benjamin
> --
> *From: *"Joel Pearson" 
> *To: *"Benjamin Guillon" 
> *Cc: *"users" 
> *Sent: *Wednesday 28 October 2020 12:56:19
> *Subject: *Re: Installing packages on OKD 4 nodes
>
> Ahh I found the support article that talks about OpenShift 4 and PTP
>
> https://access.redhat.com/solutions/5106141
>
> If you don't have access to that solution the crux of it is that the PTP
> operator is for baremetal nodes (so probably not you, as you mentioned
> OpenStack).
>
> The chrony config they mention is:
>
> $ cat << EOF | base64 -w0
> refclock PHC /dev/ptp0 poll 3 dpoll -2 offset 0
> driftfile /var/lib/chrony/drift
> makestep 1.0 3
> rtcsync
> logdir /var/log/chrony
> EOF
>
>
> On Wed, 28 Oct 2020 at 22:32, Joel Pearson 
> wrote:
>
>> Hi Benjamin,
>> Have you checked if you actually need it? At least enterprise OpenShift
>> 4.x already has PTP support in the kernel (without a module); I bumped
>> into it earlier in the year for PTP Azure syncing, opened a support
>> ticket, and it turned out I just needed this in chrony.conf
>>
>> refclock PHC /dev/ptp0 poll 3 dpoll -2 offset 0
>>
>> So I think it'd be worth checking if you already have /dev/ptp0 available
>> before installing linuxptp.  I realise OKD uses Fedora CoreOS instead of
>> Red Hat CoreOS, so the default kernel modules might be different.
>>
>> Here are some docs for configuring chrony
>> <https://docs.okd.io/latest/installing/install_config/installing-customizing.html#installation-special-config-crony_installing-customizing>,
>> I think you just need to switch the iburst line for the refclock one.
>>
>> Otherwise, if the PTP support you need is more complicated than I needed
>> on Azure, you could potentially look at the specific PTP operator
>> <https://docs.okd.io/latest/networking/multiple_networks/configuring-ptp.html>
>> in the OKD docs.
>>
>> Hope this helps.
>>
>> Thanks,
>>
>> Joel
>>
>>
>> On Sat, 24 Oct 2020 at 03:00, Benjamin Guillon <
>> benjamin.guil...@cc.in2p3.fr> wrote:
>>
>>> Hello,
>>>
>>> I'm deploying an OKD4 cluster on Openstack.
>>> I wish to configure NTP on my nodes and for that I need to install a PTP
>>> dependency: linuxptp.
>>> And enable the kvm_ptp module in the kernel.
>>>
>>&

Re: Installing packages on OKD 4 nodes

2020-10-28 Thread Joel Pearson
Hi Benjamin,

Those docs you’ve mentioned are for regular Fedora, not Fedora CoreOS, I believe,
and I’m pretty sure the two are very different.

So I presume you have checked that /dev/ptp0 doesn’t already magically exist?

Thanks,

Joel

Sent from my iPhone

> On 28 Oct 2020, at 11:45 pm, Benjamin Guillon  
> wrote:
> 
> 
> Hi Joel,
> 
> Thanks for the reply :)
> 
> I did give the OpenShift PTP operator a try and it works well, aside from the
> fact that I can't use it here since I'm not running on bare metal.
> 
> Our cluster indeed runs on our in-house Openstack platform.
> 
> Usually we use the KVM PTP module with something like:
> refclock PHC /dev/ptp0 poll 2
> In the chrony.conf file.
> But that's for our usual CentOS-based VMs, not FCOS :/
> 
> So now I'm just trying to reproduce that setup on FCOS.
> I found this Fedora documentation earlier about PTP 
> https://docs.fedoraproject.org/en-US/fedora/rawhide/system-administrators-guide/servers/Configuring_PTP_Using_ptp4l/
> Where they mention this linuxptp package, hence my questions.
> 
> If I can't manage this, I'll resort to using standard NTP instead of PTP.
> 
> Best,
> Benjamin
> From: "Joel Pearson" 
> To: "Benjamin Guillon" 
> Cc: "users" 
> Sent: Wednesday 28 October 2020 12:56:19
> Subject: Re: Installing packages on OKD 4 nodes
> 
> Ahh I found the support article that talks about OpenShift 4 and PTP
> 
> https://access.redhat.com/solutions/5106141
> 
> If you don't have access to that solution the crux of it is that the PTP 
> operator is for baremetal nodes (so probably not you, as you mentioned 
> OpenStack).
> 
> The chrony config they mention is:
> 
> $ cat << EOF | base64 -w0
> refclock PHC /dev/ptp0 poll 3 dpoll -2 offset 0
> driftfile /var/lib/chrony/drift
> makestep 1.0 3
> rtcsync
> logdir /var/log/chrony
> EOF
> 
>> On Wed, 28 Oct 2020 at 22:32, Joel Pearson  
>> wrote:
>> Hi Benjamin,
>> Have you checked if you actually need it? At least enterprise openshift 4.x 
>> already had ptp support in the kernel (without a module), as I bumped into 
>> it earlier in the year for PTP Azure syncing, I opened a support ticket and 
>> it turned out I just needed this in chrony.conf 
>> 
>> refclock PHC /dev/ptp0 poll 3 dpoll -2 offset 0
>> So I think it'd be worth checking if you already have /dev/ptp0 available 
>> before installing linuxptp.  I realise OKD uses Fedora Core OS instead of 
>> RedHat Core OS, so the default kernel modules might be different. 
>> 
>> Here are some docs for configuring chrony, I think you just need to switch 
>> the iburst line for the refclock one.
>> 
>> Otherwise, if the PTP support you need is more complicated than I needed on 
>> Azure, you could potentially look at the specific PTP operator in the OKD 
>> docs.
>> 
>> Hope this helps.
>> 
>> Thanks,
>> 
>> Joel
>> 
>> 
>>> On Sat, 24 Oct 2020 at 03:00, Benjamin Guillon 
>>>  wrote:
>>> Hello,
>>> 
>>> I'm deploying an OKD4 cluster on Openstack.
>>> I wish to configure NTP on my nodes and for that I need to install a PTP 
>>> dependency: linuxptp.
>>> And enable the kvm_ptp module in the kernel.
>>> 
>>> However, I couldn't manage to install the package through ignition.
>>> I had to do it manually with rpm-ostree: rpm-ostree install linuxptp.
>>> 
>>> Am I missing something here?
>>> How am I supposed to provide packages or drivers cluster wide through 
>>> Ignition?
>>> Can such a task be done through the MachineConfig Operator?
>>> 
>>> Thanks for the help!
>>> -- 
>>> Benjamin Guillon
>>> CNRS/IN2P3 Computing Center
>>> 21 Avenue Pierre de Coubertin, CS70202
>>> 69627 Villeurbanne Cedex, France
>>> ___
>>> users mailing list
>>> users@lists.openshift.redhat.com
>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>> 
> 
> 
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Installing packages on OKD 4 nodes

2020-10-28 Thread Joel Pearson
Ahh I found the support article that talks about OpenShift 4 and PTP

https://access.redhat.com/solutions/5106141

If you don't have access to that solution the crux of it is that the PTP
operator is for baremetal nodes (so probably not you, as you mentioned
OpenStack).

The chrony config they mention is:

$ cat << EOF | base64 -w0
refclock PHC /dev/ptp0 poll 3 dpoll -2 offset 0
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
logdir /var/log/chrony
EOF
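
For anyone finding this thread later: the base64 output above is meant to be
dropped into a MachineConfig so the machine config operator writes
/etc/chrony.conf for you. A minimal sketch of what that looks like (the name
and Ignition version are assumptions; paste the real base64 string where
indicated):

cat << 'EOF' | oc apply -f -
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-worker-chrony-ptp
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 2.2.0
    storage:
      files:
        - path: /etc/chrony.conf
          filesystem: root
          mode: 0644
          contents:
            source: data:text/plain;charset=utf-8;base64,<paste the base64 output here>
EOF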


On Wed, 28 Oct 2020 at 22:32, Joel Pearson 
wrote:

> Hi Benjamin,
>
> Have you checked if you actually need it? At least enterprise openshift
> 4.x already had ptp support in the kernel (without a module), as I bumped
> into it earlier in the year for PTP Azure syncing, I opened a support
> ticket and it turned out I just needed this in chrony.conf
>
> refclock PHC /dev/ptp0 poll 3 dpoll -2 offset 0
>
> So I think it'd be worth checking if you already have /dev/ptp0 available
> before installing linuxptp.  I realise OKD uses Fedora Core OS instead of
> RedHat Core OS, so the default kernel modules might be different.
>
> Here are some docs for configuring chrony
> <https://docs.okd.io/latest/installing/install_config/installing-customizing.html#installation-special-config-crony_installing-customizing>,
> I think you just need to switch the iburst line for the refclock one.
>
> Otherwise, if the PTP support you need is more complicated than I needed
> on Azure, you could potentially look at the specific PTP operator
> <https://docs.okd.io/latest/networking/multiple_networks/configuring-ptp.html>
> in the OKD docs.
>
> Hope this helps.
>
> Thanks,
>
> Joel
>
>
> On Sat, 24 Oct 2020 at 03:00, Benjamin Guillon <
> benjamin.guil...@cc.in2p3.fr> wrote:
>
>> Hello,
>>
>> I'm deploying an OKD4 cluster on Openstack.
>> I wish to configure NTP on my nodes and for that I need to install a PTP
>> dependency: linuxptp.
>> And enable the kvm_ptp module in the kernel.
>>
>> However, I couldn't manage to install the package through ignition.
>> I had to do it manually with rpm-ostree: rpm-ostree install linuxptp.
>>
>> Am I missing something here?
>> How am I supposed to provide packages or drivers cluster wide through
>> Ignition?
>> Can such a task be done through the MachineConfig Operator?
>>
>> Thanks for the help!
>> --
>> Benjamin Guillon
>> CNRS/IN2P3 Computing Center
>> 21 Avenue Pierre de Coubertin, CS70202
>> 69627 Villeurbanne Cedex, France
>> ___
>> users mailing list
>> users@lists.openshift.redhat.com
>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>
>
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Installing packages on OKD 4 nodes

2020-10-28 Thread Joel Pearson
Hi Benjamin,

Have you checked if you actually need it? At least enterprise OpenShift 4.x
already has PTP support in the kernel (without a module); I bumped into
it earlier in the year for PTP Azure syncing, opened a support ticket, and
it turned out I just needed this in chrony.conf

refclock PHC /dev/ptp0 poll 3 dpoll -2 offset 0

So I think it'd be worth checking if you already have /dev/ptp0 available
before installing linuxptp.  I realise OKD uses Fedora CoreOS instead of
Red Hat CoreOS, so the default kernel modules might be different.
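
Something like the following, run from a debug shell on one of the nodes,
should answer that quickly (the clock name and module name here are from
memory, so treat them as a guide only):

oc debug node/<node-name>
chroot /host
ls -l /dev/ptp*                      # a PHC device shows up as /dev/ptp0
cat /sys/class/ptp/ptp0/clock_name   # should mention KVM if the KVM PTP clock is exposed
lsmod | grep -i ptp                  # check whether the kvm ptp module is loaded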

Here are some docs for configuring chrony
<https://docs.okd.io/latest/installing/install_config/installing-customizing.html#installation-special-config-crony_installing-customizing>,
I think you just need to switch the iburst line for the refclock one.

Otherwise, if the PTP support you need is more complicated than I needed on
Azure, you could potentially look at the specific PTP operator
<https://docs.okd.io/latest/networking/multiple_networks/configuring-ptp.html>
in the OKD docs.

Hope this helps.

Thanks,

Joel


On Sat, 24 Oct 2020 at 03:00, Benjamin Guillon 
wrote:

> Hello,
>
> I'm deploying an OKD4 cluster on Openstack.
> I wish to configure NTP on my nodes and for that I need to install a PTP
> dependency: linuxptp.
> And enable the kvm_ptp module in the kernel.
>
> However, I couldn't manage to install the package through ignition.
> I had to do it manually with rpm-ostree: rpm-ostree install linuxptp.
>
> Am I missing something here?
> How am I supposed to provide packages or drivers cluster wide through
> Ignition?
> Can such a task be done through the MachineConfig Operator?
>
> Thanks for the help!
> --
> Benjamin Guillon
> CNRS/IN2P3 Computing Center
> 21 Avenue Pierre de Coubertin, CS70202
> 69627 Villeurbanne Cedex, France
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: scaleTargetRef for autoscaling

2020-06-24 Thread Joel Pearson
Hi Marvin,

I presume you are using a deployment config?

If so, doesn't a deployment config create a new replication controller
every time you do a deploy?

Which means you'd lose your scaling every deploy, so I think if you are
using deployment configs, then you'd want to reference those, rather than
the replication controllers that it automatically creates for you.
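
To make that concrete, a sketch of an HPA that targets the deployment config
itself rather than one of its generated replication controllers (the name and
thresholds below are made up):

cat << 'EOF' | oc apply -f -
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps.openshift.io/v1
    kind: DeploymentConfig
    name: my-app
  minReplicas: 2
  maxReplicas: 6
  targetCPUUtilizationPercentage: 75
EOF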

Cheers,

Joel

On Wed, 17 Jun 2020 at 20:42, Just Marvin <
marvin.the.cynical.ro...@gmail.com> wrote:

> Hi,
>
> The docs say that the scaleTargetRef can point to either the
> deployment config or the replication controller. Is there a difference in
> autoscaling behavior if I pick one over the other?
>
> Regards,
> Marvin
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: cockpit/kubernetes images in OKD 3.11 not being pulled

2020-03-23 Thread Joel Pearson
When I try a docker pull from a CentOS 7 box I have, it fails for me saying
not found too.

It does look deleted to me, because looking at the tags
https://hub.docker.com/r/cockpit/kubernetes/tags it doesn't have a digest
or compressed size. Docker Hub shows it changed 18 days ago, so I guess
that's when it got deleted.

Since you have the image locally, why don't you push from your machine into
the openshift registry and run it from there?
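
A rough sketch of what I mean (the registry route hostname and target project
below are placeholders for whatever your 3.11 cluster exposes):

docker login -u "$(oc whoami)" -p "$(oc whoami -t)" docker-registry-default.apps.example.com
docker tag docker.io/cockpit/kubernetes:latest \
  docker-registry-default.apps.example.com/default/registry-console:latest
docker push docker-registry-default.apps.example.com/default/registry-console:latest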

On Sun, 22 Mar 2020 at 19:47, Tim Dudgeon  wrote:

> We're running OKD 3.11 clusters and they have started having problems
> with the registry console.
>
> This uses the docker.io/cockpit/kubernetes container image which can no
> longer be pulled from the node on which the registry is running:
>
> $ docker pull cockpit/kubernetes:latest
> Trying to pull repository docker.io/cockpit/kubernetes ...
> manifest for docker.io/cockpit/kubernetes:latest not found
>
> However I can pull that image from my laptop without problems.
>
> I notice that on DockerHub this image is described as 'obsoleted in
> 2018'. Is there anything in the OKD Docker configuration that blocks
> this image being pulled?
>
> Thanks
> Tim
>
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: okd web console custom SSL certificate

2020-03-23 Thread Joel Pearson
Hi,

If you can, I'd recommend OpenShift 4.x; however, if you want to stay on
3.11, then I'd recommend an Ansible-based install. It is much more
configurable than oc cluster up.

There is an "all-in-one" inventory where it's just a single node.
https://github.com/openshift/openshift-ansible/blob/release-3.11/inventory/hosts.localhost

That way you can let ansible install the certificates and configure the
master-config for you, and it will be a lot more repeatable.
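
As a sketch of the sort of inventory variables involved (paths and hostnames
below are placeholders, and you'd normally merge these into the existing
[OSEv3:vars] section rather than appending a new one; check the
openshift-ansible docs for the exact names for your release):

cat >> inventory/hosts.localhost << 'EOF'
[OSEv3:vars]
openshift_master_cluster_public_hostname=console.example.com
openshift_master_named_certificates=[{"certfile": "/path/to/console.crt", "keyfile": "/path/to/console.key", "names": ["console.example.com"]}]
openshift_master_overwrite_named_certificates=true
EOF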

Cheers,

Joel

On Tue, 24 Mar 2020 at 02:33, mcom  wrote:

> Hello,
>
> Maybe you can give me some hint, as I've just got stuck with the okd web
> console custom SSL certificate. I have an all-in-one openshift cluster
> (ubuntu 18, downloaded
>
> https://github.com/openshift/origin/releases/download/v3.11.0/openshift-origin-server-v3.11.0-0cbc58b-linux-64bit.tar.gz
> and started by oc cluster up --public-hostname="myip" ); I was trying to
> follow
>
> https://docs.openshift.com/container-platform/3.11/install_config/certificate_customization.html
> by making changes (in my case) in
> openshift.local.clusterup/kube-apiserver/master-config.yaml but so for
> with no luck (despite that it's not logical if I change certificate for
> API then it load my certificate but whole cluster cannot start which is
> logical as certificate doesn't include 127.0.0.1; when I change
> certificate for web console (which should be correct) nothing happen -
> cluster starts but with it's own self-generated certificate instead of
> my own); I don't have inventory file so I could run ansible playbooks
> but as far I'm concern working directly on master-config should be also
> possible (or maybe I'm wrong) - could you give me some hint (my OS is
> ubuntu - not centos so many documentation cannot be directly applied
> along with ansible playbooks as even paths are not the same)
>
> --
> MCOM Wojciech Matys
> Doradztwo IT & Rozwiazania Sieciowe
> tel. +48 604915987
> e-mail: m...@mcompany.pl
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: ocp4 no way to configure ROUTER_USE_PROXY_PROTOCOL

2020-03-22 Thread Joel Pearson
It looks like the proxy protocol is only supported on AWS. Maybe you should
create a Bugzilla ticket requesting support for the proxy protocol in a
general way? I will most likely need this myself in the future too.

https://github.com/openshift/cluster-ingress-operator/blob/d4593e0dfed867465648737b29abee872c2429b2/pkg/operator/controller/ingress/deployment.go#L253-L259
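
For reference, the 3.x knob being described below was just an environment
variable on the router deployment config; as far as I can tell the ingress
operator offers no equivalent in 4.x and would revert manual edits like this:

# OpenShift 3.x only
oc -n default set env dc/router ROUTER_USE_PROXY_PROTOCOL=true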


On Fri, 20 Mar 2020 at 17:22, Andreas Nussbaun <
andreas.nussb...@tuwien.ac.at> wrote:

> Hi Everyone,
>
> Hope you are doing well in these hard times.
>
> We have a Problem in the use of Openshift4 -
>
> In Openshift3 we could configure the default router with
> ROUTER_USE_PROXY_PROTOCOL and the router in front of openshift with
> send-proxy to preserve the real Client IP for the applications running on
> Openshift3.
>
> In Openshift4 this does not work, because the ENV variable is gone.
>
> So we have no way to present the real Client IP to the applications.
>
> Can anyone help or have any suggestions how we can achieve this or get it
> working?
>
> Thanks everyone
>
> Br
>
> Andreas
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: configuring frontend 2 the openshift

2020-03-04 Thread Joel Pearson
Hi Kate,

Regarding part of question 2, it looks like you added an extra slash before
/console, ie your error message shows "//console".  I tried it on my
OpenShift 3.x cluster and having a double forward-slash at the front
created the same problem.  So try removing that extra slash, so that
you end up with: yourhostname.com/console not yourhostname.com//console

On Sat, 22 Feb 2020 at 00:14, Kate Brush  wrote:

> I'm running latest okd 3.11, start with oc cluster up
> --public-hostname=myip. It's for dev purpose
>
> 1. is it supposed to expose openshift directly to the internet ?
> 2. I need to use RequestHeader identity provider or set some header like
> CORS headers so I could login automatically from my webapp. Anyway it looks
> I need to setup apache working as a proxy. Do you know any examples or tips
> how to setup it ?
> Currently instead of webconsole I'm getting json with message forbidden:
> User \"system:anonymous\" cannot get path \"//console\": no RBAC policy
> matched
> Also there're some warnings in the apache logs that the downstream server
> needs a client certificate
>
> I've read webconsole uses websocket then I suppose I should proxy it as
> well
>
> Now I'm running openshift like oc cluster up
> --public-hostnam=my-external-apache-ip:443
> I've modified openshift.local.clusterup/kube-apiserver/master-config.yaml
> and ouathConfig.masterPublicURL, oauthConfig.assetPublicURL,
> masterPublicURL change values so apache external ip is used
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: How to debug the machine config operator in 4.2.10?

2020-02-18 Thread Joel Pearson
After a bit more digging and looking at other pod logs, I managed to find
some useful logs in the machine-config-daemon on one of the nodes.

The error is:

content mismatch for file
/etc/pki/ca-trust/source/anchors/openshift-config-user-ca-bundle.crt:
-BEGIN CERTIFICATE...

...certificate data...

Marking Degraded due to: unexpected on-disk state validating against
rendered-worker-987dsa987f98

When I ssh onto the node, I can see that
/etc/pki/ca-trust/source/anchors/openshift-config-user-ca-bundle.crt
already had the certificates that I specified via setting up additional
trusted CA's for builds
<https://docs.openshift.com/container-platform/4.2/builds/setting-up-trusted-ca.html>
instructions.
But when trying to pull an image via "sudo crictl pull
myprivate.registry:5001/image:tag", it would complain about x509
certificates not being trusted. But if I reboot the node, then pulling via
crictl starts working. However, the machine config operator remains broken
complaining about the above error.  So it seems that the certificates are
finding their way onto the node via different mechanism than the MCO.

This cluster is a disconnected cluster with some extra trusted CAs that
were configured during installation, so I'm wondering if the content
mismatch in the MCO is related to merging the CA certs for images and the
certs inside the "user-ca-bundle" configmap in the "openshift-config"
namespace

Any ideas?


On Tue, 18 Feb 2020 at 17:33, Joel Pearson 
wrote:

> Hi,
>
> I've been having trouble to get openshift to reliably accept CA's for
> custom secure registries:
> We've been following this guide:
> https://docs.openshift.com/container-platform/4.2/builds/setting-up-trusted-ca.html
>
> And it has worked sometimes and not others. The most frustrating bit is
> not being able to figure out when the CA certificates have been applied,
> sometimes just waiting 5 minutes is enough, other times, it never happens.
> I'm not sure what logs I need to watch so I know that it has seen it, and
> done something.
>
> This article
> <https://docs.openshift.com/container-platform/4.2/openshift_images/image-configuration.html#images-configuration-insecure_image-configuration>
> says that the machine config operator (MCO) restarts nodes to apply the
> updates, but when I watch "oc get nodes", I don't see anything restarting,
> but sometimes it seems the certificates get applied anyway, somehow.
>
> Additionally, the MCO is degraded in the cluster, and it's not clear why.
> All I have managed to find so far is timeout error messages in the MCO pod,
> and then in the MCO cluster operator status, it just says it timed out
> waiting for them to sync, and that they're all unavailable.
>
> Where do I need to look to debug any errors related to the MCO?
>
> Any help or pointers would be appreciated.
>
> Thanks,
>
> Joel
>
> <https://docs.openshift.com/container-platform/4.2/builds/setting-up-trusted-ca.html>
>
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


How to debug the machine config operator in 4.2.10?

2020-02-17 Thread Joel Pearson
Hi,

I've been having trouble getting OpenShift to reliably accept CAs for
custom secure registries:
We've been following this guide:
https://docs.openshift.com/container-platform/4.2/builds/setting-up-trusted-ca.html

And it has worked sometimes and not others. The most frustrating bit is not
being able to figure out when the CA certificates have been applied,
sometimes just waiting 5 minutes is enough, other times, it never happens.
I'm not sure what logs I need to watch so I know that it has seen it, and
done something.

This article
<https://docs.openshift.com/container-platform/4.2/openshift_images/image-configuration.html#images-configuration-insecure_image-configuration>
says that the machine config operator (MCO) restarts nodes to apply the
updates, but when I watch "oc get nodes", I don't see anything restarting,
but sometimes it seems the certificates get applied anyway, somehow.

Additionally, the MCO is degraded in the cluster, and it's not clear why.
All I have managed to find so far is timeout error messages in the MCO pod,
and then in the MCO cluster operator status, it just says it timed out
waiting for them to sync, and that they're all unavailable.
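
For context, that's from poking at roughly the following, none of which points
at a root cause so far:

oc get clusteroperator machine-config -o yaml     # the Degraded condition and its message
oc -n openshift-machine-config-operator logs deployment/machine-config-operator
oc get machineconfigpool                          # UPDATED/UPDATING/DEGRADED per pool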

Where do I need to look to debug any errors related to the MCO?

Any help or pointers would be appreciated.

Thanks,

Joel

___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Can't use the privileged scc in OpenShift 4.2.16

2020-02-12 Thread Joel Pearson
Hi Samuel,

Thanks for pointing that out, I didn't realise that privileged mode was a
Kubernetes-specific thing as opposed to an OpenShift thing.  That'd explain
why it barely gets a passing reference in the docs. I found some
information on the kubernetes website:
https://kubernetes.io/docs/concepts/policy/pod-security-policy/#privileged
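
In case it helps anyone else who lands on this thread, the piece I was missing
looks roughly like this in the pod spec (the image is a placeholder; the jira
service account is the one from my earlier mail):

cat << 'EOF' | oc apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: jira-privileged-test
spec:
  serviceAccountName: jira
  containers:
    - name: jira
      image: registry.example.com/jira:latest   # placeholder image
      securityContext:
        privileged: true   # this is what lets the privileged SCC match
EOF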

The cluster I was trying this out on is a lab cluster that only I use, but
thanks for the tip about being careful copying scc's.

Thanks,

Joel

On Wed, 12 Feb 2020 at 20:37, Samuel Martín Moro  wrote:

> Hi,
>
>
> In addition to granting your ServiceAccount permission to use the
> privileged SCC, you should add securityContext.privileged: true to
> your Pod definition. Otherwise, the restricted SCC matches your Pod
> securityContext first, and privileged would not be considered.
>
> I couldn't find this in 4.x docs, though you'd have it in 3.11:
>
> https://docs.openshift.com/container-platform/3.11/admin_guide/manage_scc.html#grant-a-service-account-access-to-the-privileged-scc
>
>
> Changing priorities could indeed be a way to work around this.
> Though probably not something to recommend.
>
> If you made a copy of the existing privileged SCC, then there's a good
> chance you kept its lists of allowed users / groups.
> This means that when Pods relying on those ServiceAccounts next restart,
> while not including securityContext.privileged in their definition, they
> would mistakenly start as root. Rolling this back could require chowning
> files back on persistent volumes.
>
> While it is unlikely OpenShift core components would include
> ServiceAccounts running both privileged and unprivileged Pods (not
> certain/to check), it could still be a surprise for users in your cluster.
> This is not a big deal, on a lab, if you're just testing something on your
> own, ... though I would avoid this on real-life clusters, or warn other
> admins at least, ideally make sure only your Jira SA may use that SCC.
>
>
> Regards.
>
>
> On Wed, Feb 12, 2020 at 4:36 AM Joel Pearson <
> japear...@agiledigital.com.au> wrote:
>
>> Hi,
>>
>> I have been trying to use the privileged scc in OpenShift 4.2.16
>>
>> I follow the normal way adding an scc to a service account.
>>
>> oc create sa jira
>> oc adm policy add-scc-to-user privileged -z jira
>>
>> But it always ends up using the restricted scc. However, anyuid gets
>> applied successfully.
>>
>> I read about SCC prioritisation
>> <https://docs.openshift.com/container-platform/4.2/authentication/managing-security-context-constraints.html#scc-prioritization_configuring-internal-oauth>
>>  and made
>> a copy of privileged scc and set "priority: 10", and then I was able to use
>> it.
>>
>> What is the proper way to use the privileged scc? Or is this by design?
>>
>> PS. I realise using privileged is not recommended, and in my case to make
>> jira work I managed to use a customised version of anyuid that contained
>> the AUDIT_WRITE capability so that "su" would work.  However, I figured it
>> would be good to know why privileged kept getting overridden by "restricted"
>>
>> Thanks,
>>
>> Joel
>> ___
>> users mailing list
>> users@lists.openshift.redhat.com
>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>
>
>
> --
> Samuel Martín Moro
> {EPITECH.} 2011
>
> "Nobody wants to say how this works.
>  Maybe nobody knows ..."
>   Xorg.conf(5)
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Can't use the privileged scc in OpenShift 4.2.16

2020-02-11 Thread Joel Pearson
Hi,

I have been trying to use the privileged scc in OpenShift 4.2.16

I follow the normal way adding an scc to a service account.

oc create sa jira
oc adm policy add-scc-to-user privileged -z jira

But it always ends up using the restricted scc. However, anyuid gets
applied successfully.

I read about SCC prioritisation
<https://docs.openshift.com/container-platform/4.2/authentication/managing-security-context-constraints.html#scc-prioritization_configuring-internal-oauth>
and made
a copy of privileged scc and set "priority: 10", and then I was able to use
it.

What is the proper way to use the privileged scc? Or is this by design?

PS. I realise using privileged is not recommended, and in my case to make
jira work I managed to use a customised version of anyuid that contained
the AUDIT_WRITE capability so that "su" would work.  However, I figured it
would be good to know why privileged kept getting overridden by "restricted"

Thanks,

Joel
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: OCP 4.2 setup -

2020-01-12 Thread Joel Pearson
On Fri, 10 Jan 2020 at 20:06, sofia qirjazi  wrote:

> Cool, thanks!
>
> Before deploying the OCP cluster that uses UPI, It is needed configuration
> of DHCP , LB and DNS.
> I am interested to know which is :
>
> a) Which is DNS best practice for offine installation?
>For DNS server , it is better to use Openshift DNS server or External
> DNS (for example, company DNS ) ?
>

OpenShift 4.x needs an external DNS provider for the records mentioned in
the instructions:
https://docs.openshift.com/container-platform/4.2/installing/installing_bare_metal/installing-restricted-networks-bare-metal.html#installation-dns-user-infra_installing-restricted-networks-bare-metal

It does run DNS inside the cluster too, but that is separate.


>
> b) Which is DHCP best practice ? Which is better approach ?
>

Not sure what you're asking here; any DHCP server is fine as long as it can
hand out static IP addresses (probably all DHCP servers can).


>
> c) Which is LB/SSL-Offloading best practice ?
>Using SSL offLoading, how is better the approach of LB ?
>- to have LB in front of OCP Cluster or LB inside Openshift ?
>

You most likely do not want to do SSL-offloading, just do TCP load
balancing (layer 4) as mentioned in the instructions.

https://docs.openshift.com/container-platform/4.2/installing/installing_bare_metal/installing-restricted-networks-bare-metal.html#installation-network-user-infra_installing-restricted-networks-bare-metal


OpenShift runs its own SSL (in most circumstances), so doing offloading
doesn't make all that much sense, especially when you're just learning
OpenShift.
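
To make "layer 4" concrete, here is a bare-bones haproxy sketch of the TCP
passthrough the docs describe; every address below is a placeholder, and a real
setup also needs port 80 and the machine config server port 22623:

cat >> /etc/haproxy/haproxy.cfg << 'EOF'
frontend openshift-api
    bind *:6443
    mode tcp
    default_backend masters-api
backend masters-api
    mode tcp
    server master0 10.0.0.10:6443 check
    server master1 10.0.0.11:6443 check
    server master2 10.0.0.12:6443 check

frontend ingress-https
    bind *:443
    mode tcp
    default_backend routers-https
backend routers-https
    mode tcp
    server worker0 10.0.0.20:443 check
    server worker1 10.0.0.21:443 check
EOF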


>
> Thanks & Regards,
> Sofia
>
> On Thu, Jan 9, 2020 at 8:01 PM W. Trevor King  wrote:
>
>> On Thu, Jan 9, 2020 at 7:51 AM sofia qirjazi wrote:
>> > I want to setup OCP 4.2 as PoC in bare metal using below steps:
>> >
>> > https://blog.openshift.com/openshift-4-2-disconnected-install/
>>
>> Now that we have official docs [1], it's better to use them instead of
>> the early blog post.  Not sure if they talk about subnet division
>> though.
>>
>> Cheers,
>> Trevor
>>
>> [1]:
>> https://docs.openshift.com/container-platform/4.2/installing/installing_bare_metal/installing-restricted-networks-bare-metal.html
>>
>> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: OpenShift on Fedora – a Quick Installation

2020-01-09 Thread Joel Pearson
OpenShift 3.9 is pretty old. If you're trying to learn OpenShift I'd
recommend v4.x

If you really want the OpenShift 3.x series then you should be trying
OpenShift 3.11.

On Thu, 9 Jan 2020 at 22:22, sofia qirjazi  wrote:

> Hello all,
>
> I am trying to perform a quick installation through Fedora VM
> and I have followed the guide:
> https://vocon-it.com/2018/09/25/how-to-install-openshift-on-fedora-a-quick-installation-guide/
> As you can see, the guide is simple to follow and has straightforward
> actions, but the playbook ended up with the below error:
>
> Failure summary:
> --
>   1. localhost
>  Determine openshift_version to configure on first master
>  openshift_version : fail
>  The conditional check 'not rpm_results.results.package_found' failed.
> The error was: error while evaluating conditional (not
> rpm_results.results.package_found): 'dict object' has no attribute 'results'
>
>The error appears to be in
> '/root/bug/install-openshift-on-fedora/openshift-ansible/roles/openshift_version/tasks/check_available_rpms.yml':
> line 8, column 3, but may
>be elsewhere in the file depending on the exact syntax
> problem.
>
>The offending line appears to be:
>
>
>- fail:
>  ^ here
>
> -
> I would appreciate it if someone has any recommendation or suggestion as to
> where the problem might be.
> I have followed the same procedure for ansible 2.4 and ansible 2.7, but in
> all cases I faced the same error
>
> BR,
> Sofia
>
> _______
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>


-- 
Kind Regards,

Joel Pearson
Agile Digital | Senior Software Consultant

Love Your Software™ | ABN 98 106 361 273
p: 1300 858 277 | m: 0405 417 843 <0405417843> | w: agiledigital.com.au
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Fwd: ocp 4.3 nightly install on openstack queens

2019-12-15 Thread Joel Pearson
On Mon, 16 Dec 2019 at 14:41, Dale Bewley  wrote:

>
>
> On Sat, Dec 14, 2019 at 3:31 AM Joel Pearson <
> japear...@agiledigital.com.au> wrote:
>
>> I think there is one last thing that is worth trying...
>>
>> On Sat, 14 Dec 2019 at 18:56, Dale Bewley  wrote:
>>
>>> Thanks for the tips, Joel, but no luck so far with
>>> 4.3.0-0.nightly-2019-12-13-180405.
>>>
>>>
>> It's possible you might be able to fix it by modifying the
>> machine-api-controllers deployment to mount in the ssl certificates from
>> the host.
>>
>
> If I touched (mounted within) `/etc/pki` it resulted in a permissions
> denial when the cert bundle was referenced, so I tried `/tmp/pki`.
>

When you say touched, do you mean
"touch /etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt"?

You shouldn't have write access inside the container, but the ca
bundle should already have the correct CA certificates. I can go to any
worker or master and have a look inside
"/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt" and I see my
extra CA's up the top of that file.  Some operator makes sure that the ca
bundle is correct on the masters and worker nodes, so it should be safe to
just mount /etc/pki (and /etc/ssl/certs) straight from the host.
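
If you want to double-check that on one of your own nodes, something like this
is a quick way (the node name is a placeholder):

oc debug node/<worker-node> -- chroot /host \
  grep -c 'BEGIN CERTIFICATE' /etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt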


>
> $ oc create secret generic my-ca-bundle --from-file=ca-bundle.crt -n
> openshift-machine-api
> $ oc set volume deployment machine-api-controllers -c machine-controller
> -n openshift-machine-api --add --mount-path=/tmp/pki -t secret
> --name=my-ca-bundle --secret-name=my-ca-bundle --overwrite
>
> Curl within the container was satisfied when I point SSL_CERT_DIR to
> /tmp/pki.
>
> sh-4.2$ SSL_CERT_DIR=/tmp/pki curl -I https://openstack.domain.com:13000
> HTTP/1.1 300 Multiple Choices
> Date: Mon, 16 Dec 2019 03:00:02 GMT
> Server: Apache
> Vary: X-Auth-Token
> Content-Length: 617
> Content-Type: application/json
>
> For some reason though, I could not get the deployment to define the env
> variable in the machine-controller container, so this isn't yet a workaround.
>
> $ oc set env deployment machine-api-controllers -c machine-controller -n
> openshift-machine-api SSL_CERT_DIR=/tmp/pki
> deployment.extensions/machine-api-controllers updated
> $ oc rsh -n openshift-machine-api -c machine-controller $(oc get pod -n
> openshift-machine-api -l k8s-app=controller -o name) env | grep SSL
>
>
>
>> I had to do something like this for the cluster version operator, because
>> it was failing due to my MITM proxy. Which I had to solve by ensuring the
>> CA certificate of the proxy was available in the container, which I believe
>> is a fairly similar situation to what you have.
>> https://bugzilla.redhat.com/show_bug.cgi?id=1773419
>>
>> Failing that, are you able to configure your openstack cluster to use
>> real SSL certs from letsencrypt or something like that? I ended up doing
>> that for my openstack cluster, as I found it was hard to make sure that
>> anything talking to openstack had my CA certificate. It was just simpler to
>> have a real SSL cert.
>>
>>
> I hear what you are saying, but our enterprise CA is pretty real, and OCP
> is an enterprise product. :)
>
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: ocp 4.3 nightly install on openstack queens

2019-12-14 Thread Joel Pearson
I think there is one last thing that is worth trying...

On Sat, 14 Dec 2019 at 18:56, Dale Bewley  wrote:

> Thanks for the tips, Joel, but no luck so far with
> 4.3.0-0.nightly-2019-12-13-180405.
>
> After the following:
>
> - destroy cluster
> - copy backup install-config.yaml with my CA cert at additionalTrustBundle
> to empty osp-nightly/ dir
> - generate manifests `openshift-install create manifests --dir osp-nightly`
> - update osp-nightly/manifests/cluster-proxy-01-config.yaml setting
> spec/trustedCA/name=user-ca-bundle
> - run install `openshift-install create cluster --dir=osp-nightly
> --log-level=debug`
>
> I still see cert errors from machine-api controller
>
> ```
> $ export KUBECONFIG=osp-nightly/auth/kubeconfig
> $ oc logs -c machine-controller -f -n openshift-machine-api $(oc get pods
> -n openshift-machine-api  -l k8s-app=controller -o name)
> ...
> I1214 07:34:19.124112   1 controller.go:164] Reconciling Machine
> "osp-nightly-rrzv5-worker-tk495"
> I1214 07:34:19.124188   1 controller.go:376] Machine
> "osp-nightly-rrzv5-worker-tk495" in namespace "openshift-machine-api"
> doesn't specify "cluster.k8s.io/cluster-name" label, assuming nil cluster
> E1214 07:34:19.132925   1 controller.go:279] Failed to check if
> machine "osp-nightly-rrzv5-worker-tk495" exists: Error checking if instance
> exists (machine/actuator.go 346):
> Error getting a new instance service from the machine (machine/actuator.go
> 467): Create providerClient err: Post
> https://openstack.domain.com:13000/v3/auth/tokens: x509: certificate
> signed by unknown authority
> 
>

It's possible you might be able to fix it by modifying the
machine-api-controllers deployment to mount in the ssl certificates from
the host.

Essentially add this under template.spec:

  volumes:
- name: etc-ssl-certs
  hostPath:
path: /etc/ssl/certs
type: ''
- name: etc-pki
  hostPath:
path: /etc/pki
type: ''

And volume mounts to the container bit (template.spec.containers[0]):

  containers:
- name: controller-manager
  volumeMounts:
- name: etc-ssl-certs
  readOnly: true
  mountPath: /etc/ssl/certs
- name: etc-pki
  readOnly: true
  mountPath: /etc/pki

I had to do something like this for the cluster version operator, because
it was failing due to my MITM proxy, which I had to solve by ensuring the
CA certificate of the proxy was available in the container; I believe that's
a fairly similar situation to what you have.
https://bugzilla.redhat.com/show_bug.cgi?id=1773419

Failing that, are you able to configure your openstack cluster to use real
SSL certs from letsencrypt or something like that? I ended up doing that
for my openstack cluster, as I found it was hard to make sure that anything
talking to openstack had my CA certificate. It was just simpler to have a
real SSL cert.


> I can confirm my cert is here:
>
> $ oc get cm user-ca-bundle -n openshift-config -o json | jq -r
> '.data."ca-bundle.crt"'
>
> And that the proxy received the configmap name from the custom manifest
> rather than default "":
>
> $ oc get proxy cluster -o json | jq .spec.trustedCA
> {"name": "user-ca-bundle"}
>
> I'm stuck with 3 masters and no workers while installer says:
>
> DEBUG Still waiting for the cluster to initialize: Some cluster operators
> are still updating: authentication, console, image-registry, ingress,
> monitoring
>
> I guess I'll keep watching
> https://bugzilla.redhat.com/show_bug.cgi?id=1769879 and
> https://github.com/openshift/enhancements/pull/115 and running 3.11 :)
>
> On Wed, Dec 4, 2019 at 9:29 PM Joel Pearson 
> wrote:
>
>>
>>
>> On Wed, 4 Dec 2019 at 08:02, Dale Bewley  wrote:
>>
>>>
>>> On Tue, Nov 26, 2019 at 7:29 PM Joel Pearson <
>>> japear...@agiledigital.com.au> wrote:
>>>
>>> Thanks for taking the time to reply, Joel.
>>>
>>>
>>>> On Sat, 23 Nov 2019 at 13:21, Dale Bewley  wrote:
>>>>
>>>>> Hello,
>>>>> I'm testing OCP 4.3 2019-11-19 nightly on OSP 13.
>>>>>
>>>>> I added my CA cert [1] to install-config.yaml [3]  and the installer
>>>>> now progresses. I can even `oc get nodes` and see the masters. [2].
>>>>>
>>>>> I still have the following errors and no worker nodes though.
>>>>>
>>>>> ERROR Cluster operator authentication Degraded is True with
>>>

Re: ocp 4.3 nightly install on openstack queens

2019-12-04 Thread Joel Pearson
On Wed, 4 Dec 2019 at 08:02, Dale Bewley  wrote:

>
> On Tue, Nov 26, 2019 at 7:29 PM Joel Pearson <
> japear...@agiledigital.com.au> wrote:
>
> Thanks for taking the time to reply, Joel.
>
>
>> On Sat, 23 Nov 2019 at 13:21, Dale Bewley  wrote:
>>
>>> Hello,
>>> I'm testing OCP 4.3 2019-11-19 nightly on OSP 13.
>>>
>>> I added my CA cert [1] to install-config.yaml [3]  and the installer now
>>> progresses. I can even `oc get nodes` and see the masters. [2].
>>>
>>> I still have the following errors and no worker nodes though.
>>>
>>> ERROR Cluster operator authentication Degraded is True with
>>> RouteStatusDegradedFailedHost: RouteStatusDegraded: route is not available
>>> at canonical host
>>> oauth-openshift.apps.osp-nightly.osp-nightly.domain.com: []
>>>
>>
>> This sounds like ingress isn't deploying because the worker nodes are not
>> deployed or your load balancer isn't making ingress available. Are your
>> master nodes schedulable? Ie are your masters also workers? If not, then
>> ingress won't deploy.
>>
>>
> $ oc describe node osp-nightly-tfz6p-master-0 | grep -i schedul
> Taints: node-role.kubernetes.io/master:NoSchedule
> Unschedulable:  false
>
> They are schedulable, but there are no matching tolerations in
> openshift-ingress/router-default deployment, so those pods are indeed stuck
> in _pending_ without any worker nodes.
>
> How is your load balancer configured for 80/443 traffic? If the masters
>> aren't targets of that, then even if ingress deploys you still won't be
>> able to use any routes
>>
>>
>
> No load balancer exists. I'm just trying to smoke test
> https://docs.openshift.com/container-platform/4.2/installing/installing_openstack/installing-openstack-installer-custom.html
>
>
>>
>>>
>>> This is likely a symptom of not yet having associated a floating IP to
>>> the app neutron port, and not having created an /etc/hosts entry on the
>>> installer host. I assume that's a nonfatal error.
>>>
>>> I assume this one is fatal, however:
>>>
>>> INFO Cluster operator image-registry Progressing is True with Error:
>>> Unable to apply resources: unable to sync storage configuration: Post
>>> https://openstack.domain.com:13000/v3/auth/tokens: x509: certificate
>>> signed by unknown authority
>>>
>>
>> Have you added the CA that covers openstack.domain.com
>> to install-config.yaml at .additionalTrustBundle like you mentioned in your
>> previous post?
>>
>
> Yep.
>
>
>>
>> Otherwise you might need to edit Proxy config and set spec.trustedCA.name
>> to  user-ca-bundle
>>
>> apiVersion: config.openshift.io/v1
>> kind: Proxy
>> metadata:
>>   name: cluster
>> spec:
>>   trustedCA:
>> name: user-ca-bundle
>>
>> I had to do this even though I don't have an explicit proxy. I do have a
>> transparent proxy though, which was doing MITM, essentially breaking
>> anything trying to talk to the internet.
>>
>
> Where did you make this change?
>

I did this before installation, for convenience mostly: after running
"openshift-install create manifests --dir=ignition-files", I edited the
ignition-files/manifests/cluster-proxy-01-config.yaml file.

Otherwise, it looks like you can do it after the fact using "oc edit
proxies cluster", then you'll need to wait for the masters to reboot I
think, which for me sometimes takes about 10 minutes until it has done all
of them.

FYI, I managed to find out what name to use to edit that proxy config by
running "oc api-resources --api-group=config.openshift.io" and then finding
the name for apigroup "config.openshift.io" and kind "Proxy".
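
If you'd rather not open an editor, a one-liner sketch that should do the same
thing:

oc patch proxy/cluster --type merge \
  -p '{"spec":{"trustedCA":{"name":"user-ca-bundle"}}}'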


>
> I was going to try the 12/02 4.3 nightly build, but based on the following
> 2 blockers it doesn't look like it will work:
>
> * https://bugzilla.redhat.com/show_bug.cgi?id=1769879 Machine-api cannot
> create workers on osp envs installed with self-signed certs
>

There is a fair chance the above proxy config will fix this one


> * https://github.com/openshift/enhancements/pull/115 enhancements/x509-trust:
> Propose a new enhancement
>

I triggered this whole discussion from here:
https://bugzilla.redhat.com/show_bug.cgi?id=1771564 originally, so the
above proxy config should help.


>
> It's disappointing that the 4.2 release notes claim that OpenStack is
> supported when it does not seem to be supported in what I presume to be the
> majorit

Re: where does CRC store its data?

2019-11-28 Thread Joel Pearson
Hi Marvin,

Did you ever use minishift? It behaves in the same way, all the data is
inside the CRC VM.  If you manage to get into the CRC VM, and you get to
/mnt/pv-data then you'd see lots of directories pv0001, pv0002 etc.  If you
create yourself a PVC then it will automatically bind to one of the existing
PVs, so yes you can use it for applications if you want.
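
A quick sketch of a claim that should bind to one of those pre-created volumes
(the name and size are arbitrary):

cat << 'EOF' | oc apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-app-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
EOF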

minishift used to have an option, "minishift ssh", which would ssh into the
VM that minishift was running inside.  But I can't see that as a
command-line option for crc.

On my windows installation of crc (it has since expired), I found the ssh
private key for the instance at

C:\Users\\.crc\machines\crc

You can get the ip of the VM with "crc ip"

I'm not totally sure what the ssh username is, but it's probably one of
crc, core, openshift.

ssh -i C:\Users\\.crc\machines\crc\id_rsa  @

Otherwise, a slightly shady way of getting to the host is to start a
special pod and get a root shell, using one of the techniques here:
https://gist.github.com/jjo/a8243c677f7e79f2f1d610f02365fdd7
I used that technique once when ssh had died and I wanted to restart ssh
without restarting the whole machine.
You might need to use kubeadmin or some privileged user.

Anyway, good luck.

On Fri, 22 Nov 2019 at 11:44, Just Marvin <
marvin.the.cynical.ro...@gmail.com> wrote:

> Hi,
>
> On my host system, I see:
>
> [zaphod@oc6010654212 code]$ oc get pv
> NAME CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS  CLAIM
> STORAGECLASS   REASON   AGE
> pv0001   100Gi  RWO,ROX,RWXRecycle  Bound
> openshift-image-registry/crc-image-registry-storage
>   22d
> pv0002   100Gi  RWO,ROX,RWXRecycle  Available
> 22d
> pv0003   100Gi  RWO,ROX,RWXRecycle  Available
> 22d
> pv0004   100Gi  RWO,ROX,RWXRecycle  Available
> 22d
> .
> .
> .
> pv0030   100Gi  RWO,ROX,RWXRecycle  Available
> 22d
> [zaphod@oc6010654212 code]$ oc describe pv pv0001
> Name:pv0001
> Labels:  volume=pv0001
> Annotations: pv.kubernetes.io/bound-by-controller: yes
> Finalizers:  [kubernetes.io/pv-protection]
> StorageClass:
> Status:  Bound
> Claim:   openshift-image-registry/crc-image-registry-storage
> Reclaim Policy:  Recycle
> Access Modes:RWO,ROX,RWX
> VolumeMode:  Filesystem
> Capacity:100Gi
> Node Affinity:   
> Message:
> Source:
> Type:  HostPath (bare host directory volume)
> Path:  /mnt/pv-data/pv0001
> HostPathType:
> Events:
> [zaphod@oc6010654212 code]$ ls -l /mnt/pv-data/pv0001
> ls: cannot access /mnt/pv-data/pv0001: No such file or directory
> [zaphod@oc6010654212 code]$ ls -l /mnt
> total 0
>
> What gives? Where is CRC actually storing the data in its registry,
> etc? More importantly, if I want to use one of those unbound pv's for
> applications, can I?
>
> Regards,
> Marvin
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: ocp 4.3 nightly install on openstack queens

2019-11-26 Thread Joel Pearson
On Sat, 23 Nov 2019 at 13:21, Dale Bewley  wrote:

> Hello,
> I'm testing OCP 4.3 2019-11-19 nightly on OSP 13.
>
> I added my CA cert [1] to install-config.yaml [3]  and the installer now
> progresses. I can even `oc get nodes` and see the masters. [2].
>
> I still have the following errors and no worker nodes though.
>
> ERROR Cluster operator authentication Degraded is True with
> RouteStatusDegradedFailedHost: RouteStatusDegraded: route is not available
> at canonical host oauth-openshift.apps.osp-nightly.osp-nightly.domain.com:
> []
>

This sounds like ingress isn't deploying because the worker nodes are not
deployed or your load balancer isn't making ingress available. Are your
master nodes schedulable? Ie are your masters also workers? If not, then
ingress won't deploy.

How is your load balancer configured for 80/443 traffic? If the masters
aren't targets of that, then even if ingress deploys you still won't be
able to use any routes


>
>
> This is likely a symptom of not yet having associated a floating IP to the
> app neutron port, and not having created an /etc/hosts entry on the
> installer host. I assume that's a nonfatal error.
>
> I assume this one is fatal, however:
>
> INFO Cluster operator image-registry Progressing is True with Error:
> Unable to apply resources: unable to sync storage configuration: Post
> https://openstack.domain.com:13000/v3/auth/tokens: x509: certificate
> signed by unknown authority
>

Have you added the CA that covers openstack.domain.com
to install-config.yaml at .additionalTrustBundle like you mentioned in your
previous post?

Otherwise you might need to edit Proxy config and set spec.trustedCA.name
to  user-ca-bundle

apiVersion: config.openshift.io/v1
kind: Proxy
metadata:
  name: cluster
spec:
  trustedCA:
name: user-ca-bundle

I had to do this even though I don't have an explicit proxy. I do have a
transparent proxy though, which was doing MITM, essentially breaking
anything trying to talk to the internet.
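
The name there refers to a configmap in the openshift-config namespace; the
installer normally creates it from additionalTrustBundle, but if it's missing
you can create it by hand, roughly like this (the file path is a placeholder):

oc create configmap user-ca-bundle -n openshift-config \
  --from-file=ca-bundle.crt=/path/to/my-ca-bundle.pem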


>
> Is it safe to assume this BZ comment is related to that error?
> https://bugzilla.redhat.com/show_bug.cgi?id=1735192#c17
>
> Bootstrap host has already been removed by the installer, so
> `openshift-install gather` does not seem usable, but the installer debug
> output can be found at
> https://paste.fedoraproject.org/paste/SzIqAMU4DWHN3Bw3WDKfTQ
>
> Any advice?
>
> Thanks!
>
>
> [1]
> https://lists.openshift.redhat.com/openshift-archives/users/2019-November/msg00073.html
>
> [2]
> $ export KUBECONFIG=osp-nightly/auth/kubeconfig
> $ oc get nodes
> NAME STATUSROLES AGE   VERSION
> osp-nightly-tfz6p-master-0   Ready master102m  v1.16.2
> osp-nightly-tfz6p-master-1   Ready master103m  v1.16.2
> osp-nightly-tfz6p-master-2   Ready master103m  v1.16.2
>
> [3] install-config.yaml
> apiVersion: v1
> baseDomain: ocp.domain.com
> additionalTrustBundle: |
>   -BEGIN CERTIFICATE-
>   MI...
> compute:
> - hyperthreading: Enabled
>   name: worker
>   platform:
> openstack:
>   rootVolume:
> size: 10
>   replicas: 3
> controlPlane:
>   hyperthreading: Enabled
>   name: master
>   platform: {}
>   replicas: 3
> metadata:
>   creationTimestamp: null
>   name: osp-nightly
> networking:
>   clusterNetwork:
>   - cidr: 10.128.0.0/14
> hostPrefix: 23
>   machineCIDR: 10.0.0.0/16
>   networkType: OpenShiftSDN
>   serviceNetwork:
>   - 172.30.0.0/16
> platform:
>   openstack:
> cloud: shiftstack
> computeFlavor: ocp4.worker.4x16
> externalDNS: null
> externalNetwork: floating
> lbFloatingIP: 192.0.2.29
> octaviaSupport: "0"
> region: ""
> trunkSupport: "1"
> publish: External
> pullSecret: '{"...
> sshKey: |
>   ssh-rsa A...
>
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: How to recover from failed update in OpenShift 4.2.x?

2019-11-26 Thread Joel Pearson
On Thu, 21 Nov 2019 at 10:58, Clayton Coleman  wrote:

>
>
> On Nov 17, 2019, at 9:34 PM, Joel Pearson 
> wrote:
>
> So, I'm running OpenShift 4.2 on Azure UPI following this blog article:
> https://blog.openshift.com/openshift-4-1-upi-environment-deployment-on-microsoft-azure-cloud/
>  with
> a few customisations on the terraform side.
>
> One of the main differences it seems, is how the router/ingress is
> handled. Normal Azure uses load balancers, but UPI Azure uses a regular
> router (that I'm used to seeing the 3.x version) which is configured by
> setting the "HostNetwork" for the endpoint publishing strategy
> <https://github.com/JuozasA/ocp4-azure-upi/blob/master/ingresscontroller-default.yaml#L9-L10>
>
>
> This sounds like a bug in Azure UPI.  IPI is the reference architecture,
> it shouldn’t have a default divergent from the ref arch.
>

In the blog, he mentions that he has changed the architecture because it
creates a public-facing load balancer.  In my case I'm not allowed to
create a public load balancer at all, and additionally I can't use Azure's
Public or Private DNS either, so I had to customise the terraform templates
even more.

Maybe supported UPI Azure will allow internally facing load balancers?


>
>
> It was all working fine in OpenShift 4.2.0 and 4.2.2, but when I upgraded
> to OpenShift 4.2.4, the router stopped listening on ports 80 and 443, I
> could see the pod running with "crictl ps", but a "netstat -tpln" didn't
> show anything listening.
>
> I tried updating the version back from 4.2.4 to 4.2.2, but I
> accidentally used 4.1.22 image digest value, so I quickly reverted back to
> 4.2.4 once I saw the apiservers coming up as 4.1.22.  I then noticed that
> there was a 4.2.7 release on the candidate-4.2 channel, so I switched to
> that, and ingress started working properly again.
>
> So my question is, what is the strategy for recovering from a failed
> update? Do I need to have etcd backups and then restore the cluster by
> restoring etcd? Ie.
> https://docs.openshift.com/container-platform/4.2/backup_and_restore/disaster_recovery/scenario-2-restoring-cluster-state.html
>
> The upgrade page
> <https://docs.openshift.com/container-platform/4.2/updating/updating-cluster-between-minor.html>
> specifically says "Reverting your cluster to a previous version, or a
> rollback, is not supported. Only upgrading to a newer version is
> supported." so is it an expectation for a production cluster that you would
> restore from backup if the cluster isn't usable?
>
>
> Backup, yes.  If you could open a bug for the documentation that would be
> great.
>

Thanks, raised it here: https://bugzilla.redhat.com/show_bug.cgi?id=1777155


>
>
> Maybe the upgrade page should mention taking backups? Especially if there
> is no rollback option.
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Idle OpenShift 4.2 Image Registry running on Azure listing storage keys about 40 times per minute

2019-11-25 Thread Joel Pearson
Thanks Adam, I created  https://bugzilla.redhat.com/show_bug.cgi?id=1776665

On Tue, 26 Nov 2019 at 03:17, Adam Kaplan  wrote:

> Please raise a BZ for this. We do make these requests to verify that the
> registry's storage has been provisioned properly on Azure.
>
> I'd expect this to API request every 10 minutes when the operator's relist
> interval is hit. ~40 per minute suggests that we are reacting to a lot of
> events that we probably shouldn't react to.
>
> On Mon, Nov 25, 2019 at 1:19 AM Joel Pearson <
> japear...@agiledigital.com.au> wrote:
>
>> Hi,
>>
>> I've noticed a strange thing with the Image Registry running on Azure in
>> OpenShift 4.2.7 (possibly all other versions too).
>>
>> When the registry is idle, I'm seeing about 40 requests per minute for
>> "List Storage Account Keys" per minute in Azure console, under the resource
>> group "Activity log".  Each request is has a "Started" and "Succeeded"
>> event, so it's 80 events in total per minute.
>>
>> This to me seems hugely excessive and unnecessary, is it expected
>> behaviour?
>>
>> Surely listing storage account keys only needs to happen a few times per
>> day instead of almost per second?
>>
>> I looked in the logs of the "cluster-image-registry-operator" and these
>> log entries seem to correlate within about a second of the azure activity
>> log entries.
>>
>> I1125 05:20:33.804388  14 controller.go:260] event from workqueue
>> successfully processed
>> I1125 05:20:36.816148  14 controller.go:260] event from workqueue
>> successfully processed
>> I1125 05:20:39.987651  14 controller.go:260] event from workqueue
>> successfully processed
>>
>> Should I raise a bugzilla for this?
>>
>> Thanks,
>>
>> Joel
>> ___
>> users mailing list
>> users@lists.openshift.redhat.com
>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>
>
>
> --
>
> Adam Kaplan
>
> He/Him
>
> Senior Software Engineer - OpenShift
>
> Red Hat <https://www.redhat.com/>
>
> 100 E. Davie St. Raleigh, NC 27601 USA
>
> adam.kap...@redhat.com  T: +1-919-754-4843 IM: adambkaplan
> <https://www.redhat.com/>
>


-- 
Kind Regards,

Joel Pearson
Agile Digital | Senior Software Consultant

Love Your Software™ | ABN 98 106 361 273
p: 1300 858 277 | m: 0405 417 843 <0405417843> | w: agiledigital.com.au
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Idle OpenShift 4.2 Image Registry running on Azure listing storage keys about 40 times per minute

2019-11-24 Thread Joel Pearson
Hi,

I've noticed a strange thing with the Image Registry running on Azure in
OpenShift 4.2.7 (possibly all other versions too).

When the registry is idle, I'm seeing about 40 requests per minute for
"List Storage Account Keys" per minute in Azure console, under the resource
group "Activity log".  Each request is has a "Started" and "Succeeded"
event, so it's 80 events in total per minute.

This seems hugely excessive and unnecessary to me; is it expected behaviour?

Surely listing storage account keys only needs to happen a few times per
day instead of almost per second?

I looked in the logs of the "cluster-image-registry-operator" and these log
entries seem to correlate within about a second of the azure activity log
entries.

I1125 05:20:33.804388  14 controller.go:260] event from workqueue
successfully processed
I1125 05:20:36.816148  14 controller.go:260] event from workqueue
successfully processed
I1125 05:20:39.987651  14 controller.go:260] event from workqueue
successfully processed
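
If anyone wants to pull the same numbers from the CLI rather than the portal, something
like this should work (a rough sketch; the resource group and the --query filter are
illustrative, not what I actually used):

az monitor activity-log list \
  --resource-group <registry-resource-group> \
  --offset 1h \
  --query "[?contains(operationName.value, 'listKeys')].{operation:operationName.value, status:status.value, time:eventTimestamp}" \
  --output table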

Should I raise a bugzilla for this?

Thanks,

Joel
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Failure to detach Azure Disk in OpenShift 4.2.7 after 15 minutes

2019-11-24 Thread Joel Pearson
Unfortunately, I didn't run it before I made the manual change.

I ran it just then, and I can see error messages in the output, is that
worth giving it to you still?

The errors seemed to be coming from "azure_controller_standard.go" which
seemed to be the code responsible for attaching/detaching Azure disks.
Although I'm guessing the code that decides when to detach a disk, is
hiding somewhere else?

On Mon, 25 Nov 2019 at 15:26, Clayton Coleman  wrote:

> Did you run must-gather while it couldn’t detach?
>
> Without deeper debug info from the interval it’s hard to say.  If you can
> recreate it and run must gather we might be able to find it.
>
> On Nov 24, 2019, at 10:25 PM, Joel Pearson 
> wrote:
>
> Hi,
>
> I updated some machine config to configure chrony for masters and workers,
> and I found that one of my containers got stuck after the masters had
> restarted.
>
> One of the containers still couldn't start for 15 minutes, as the disk was
> still attached to master-2 whereas the pod had been scheduled on master-1.
>
> In the end I manually detached the disk in the azure console.
>
> Is this a known issue? Or should I have waited for more than 15 minutes?
>
> Maybe this happened because the masters restarted and maybe whatever is
> responsible for detaching the disk got restarted, and there wasn't a
> cleanup process to detach from the original node? I'm not sure if this is
> further complicated by the fact that my masters are also workers?
>
> Here is the event information from the pod:
>
>   Warning  FailedMount 57s (x8 over 16m)   kubelet,
> resource-group-prefix-master-1  Unable to mount volumes for pod
> "odoo-3-m9kxs_odoo(c0a31c68-0f2c-11ea-b695-000d3a970043)": timeout expired
> waiting for volumes to attach or mount for pod "odoo"/"odoo-3-m9kxs". list
> of unmounted volumes=[odoo-data]. list of unattached volumes=[odoo-1
> odoo-data default-token-5d6x7]
>
>   Warning  FailedAttachVolume  55s (x15 over 15m)  attachdetach-controller
>   AttachVolume.Attach failed for volume
> "pvc-61f1ad81-0f24-11ea-8f8f-000d3a970df2" : Attach volume
> "resource-group-prefix-dynamic-pvc-61f1ad81-0f24-11ea-8f8f-000d3a970df2" to
> instance "resource-group-prefix-master-1" failed with
> compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request:
> StatusCode=0 -- Original Error: autorest/azure: Service returned an error.
> Status= Code="ConflictingUserInput" Message="A disk with name
> resource-group-prefix-dynamic-pvc-61f1ad81-0f24-11ea-8f8f-000d3a970df2
> already exists in Resource Group RESOURCE-GROUP-PREFIX-RG and is attached
> to VM
> /subscriptions/-xxx---x/resourceGroups/resource-group-prefix-rg/providers/Microsoft.Compute/virtualMachines/resource-group-prefix-master-2.
> 'Name' is an optional property for a disk and a unique name will be
> generated if not provided."
> Target="/subscriptions/-xxx---x/resourceGroups/resource-group-prefix-rg/providers/Microsoft.Compute/disks/resource-group-prefix-dynamic-pvc-61f1ad81-0f24-11ea-8f8f-000d3a970df2"
>
> Thanks,
>
> Joel
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Failure to detach Azure Disk in OpenShift 4.2.7 after 15 minutes

2019-11-24 Thread Joel Pearson
Hi,

I updated some machine config to configure chrony for masters and workers,
and I found that one of my containers got stuck after the masters had
restarted.
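
For context, the chrony change was applied as a MachineConfig along these lines (a rough
sketch from memory; the name and the base64 payload are placeholders, and there was an
equivalent one with role: worker):

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: master
  name: 50-masters-chrony-configuration
spec:
  config:
    ignition:
      version: 2.2.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,<base64-encoded chrony.conf>
        filesystem: root
        mode: 420
        path: /etc/chrony.conf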

One of the containers still couldn't start for 15 minutes, as the disk was
still attached to master-2 whereas the pod had been scheduled on master-1.

In the end I manually detached the disk in the azure console.
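
For the record, the same manual detach can be done from the Azure CLI, something like
this (all the names are placeholders):

az vm disk detach \
  --resource-group <resource-group> \
  --vm-name <vm-still-holding-the-disk> \
  --name <pvc-disk-name>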

Is this a known issue? Or should I have waited for more than 15 minutes?

Maybe this happened because the masters restarted and maybe whatever is
responsible for detaching the disk got restarted, and there wasn't a
cleanup process to detach from the original node? I'm not sure if this is
further complicated by the fact that my masters are also workers?

Here is the event information from the pod:

  Warning  FailedMount 57s (x8 over 16m)   kubelet,
resource-group-prefix-master-1  Unable to mount volumes for pod
"odoo-3-m9kxs_odoo(c0a31c68-0f2c-11ea-b695-000d3a970043)": timeout expired
waiting for volumes to attach or mount for pod "odoo"/"odoo-3-m9kxs". list
of unmounted volumes=[odoo-data]. list of unattached volumes=[odoo-1
odoo-data default-token-5d6x7]

  Warning  FailedAttachVolume  55s (x15 over 15m)  attachdetach-controller
  AttachVolume.Attach failed for volume
"pvc-61f1ad81-0f24-11ea-8f8f-000d3a970df2" : Attach volume
"resource-group-prefix-dynamic-pvc-61f1ad81-0f24-11ea-8f8f-000d3a970df2" to
instance "resource-group-prefix-master-1" failed with
compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request:
StatusCode=0 -- Original Error: autorest/azure: Service returned an error.
Status= Code="ConflictingUserInput" Message="A disk with name
resource-group-prefix-dynamic-pvc-61f1ad81-0f24-11ea-8f8f-000d3a970df2
already exists in Resource Group RESOURCE-GROUP-PREFIX-RG and is attached
to VM
/subscriptions/-xxx---x/resourceGroups/resource-group-prefix-rg/providers/Microsoft.Compute/virtualMachines/resource-group-prefix-master-2.
'Name' is an optional property for a disk and a unique name will be
generated if not provided."
Target="/subscriptions/-xxx---x/resourceGroups/resource-group-prefix-rg/providers/Microsoft.Compute/disks/resource-group-prefix-dynamic-pvc-61f1ad81-0f24-11ea-8f8f-000d3a970df2"

Thanks,

Joel
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: How to use extra trusted CA certs when pulling images for a builder

2019-11-17 Thread Joel Pearson
On Mon, 18 Nov 2019 at 13:05, Clayton Coleman  wrote:

> Raise a bug to the installler component, yes
>

Ok thanks, I raised a bug here:
https://bugzilla.redhat.com/show_bug.cgi?id=1773419


> On Nov 17, 2019, at 6:03 PM, Joel Pearson 
> wrote:
>
> On Mon, 18 Nov 2019 at 12:37, Ben Parees  wrote:
>
>>
>>
>> On Sun, Nov 17, 2019 at 7:24 PM Joel Pearson <
>> japear...@agiledigital.com.au> wrote:
>>
>>>
>>>
>>> On Wed, 13 Nov 2019 at 02:43, Ben Parees  wrote:
>>>
>>>>
>>>>
>>>> On Mon, Nov 11, 2019 at 11:27 PM Ben Parees  wrote:
>>>>
>>>>>
>>>>>
>>>>> On Mon, Nov 11, 2019 at 10:47 PM Joel Pearson <
>>>>> japear...@agiledigital.com.au> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, 12 Nov 2019 at 06:56, Ben Parees  wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Can I use the “trustedCA” part of the proxy configuration without
>>>>>>>> actually specifying an explicit proxy?
>>>>>>>>
>>>>>>>
>>>>>>> you should be able to.  Daneyon can you confirm?  (if you can't i'd
>>>>>>> consider it a bug).
>>>>>>>
>>>>>>> It does work! Thanks for that. user-ca-bundle already existed and
>>>>>> had my certificate in there, I just needed to reference user-ca-bundle in
>>>>>> the proxy config.
>>>>>>
>>>>>
>>>>> cool, given that you supplied the CAs during install, and the
>>>>> user-ca-bundle CM was created, i'm a little surprised the install didn't
>>>>> automatically setup the reference in the proxyconfig resource for you.  
>>>>> I'm
>>>>> guessing it did not because there was no actual proxy hostname configured.
>>>>> I think that's a gap we should close..would you mind filing a bug?  (
>>>>> bugzilla.redhat.com).  You can submit it against the install
>>>>> component.
>>>>>
>>>>
>>>> fyi I've filed a bug for this aspect of the issues you ran into:
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1771564
>>>>
>>>>
>>> Thanks for raising this, reading through the related github tickets it
>>> looks like I've opened a can of worms to some degree.
>>>
>>
>> Yes there's some difference of opinion on what the out of box desired
>> behavior is, but at a minimum you've exposed a gap in our documentation
>> that we will get fixed.
>>
>>
>> I also just discovered that the openshift cluster version operator (CVO),
> isn't quite configured correctly out of the box to use the correct trusted
> CA certs (which means it can't download cluster updates).
>
> It correctly mounts /etc/ssl/certs from the host (the masters), but it
> fails to also mount /etc/pki, because the certs are a symlink
> /etc/ssl/certs/ca-bundle.crt ->
> /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
>
> I couldn't find where the installer sets up the CVO but an example of what
> is missing is here.
>
> https://github.com/openshift/cluster-version-operator/blob/01a7825179246fa708ac64de96e6675c0bf9a930/bootstrap/bootstrap-pod.yaml#L44-L46
>
>
> Is there an existing bug for this? Or should I raise a bugzilla for this?
> Would it be part of the installer?
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


How to recover from failed update in OpenShift 4.2.x?

2019-11-17 Thread Joel Pearson
So, I'm running OpenShift 4.2 on Azure UPI following this blog article:
https://blog.openshift.com/openshift-4-1-upi-environment-deployment-on-microsoft-azure-cloud/
with
a few customisations on the terraform side.

One of the main differences, it seems, is how the router/ingress is handled.
A normal (IPI) Azure install uses load balancers, but UPI Azure uses a regular
router (the kind I'm used to seeing in the 3.x version), which is configured by
setting "HostNetwork" for the endpoint publishing strategy.


It was all working fine in OpenShift 4.2.0 and 4.2.2, but when I upgraded
to OpenShift 4.2.4 the router stopped listening on ports 80 and 443; I
could see the pod running with "crictl ps", but a "netstat -tpln" didn't
show anything listening.
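
For reference, the checks boil down to something like this on the node and against the
cluster (the grep patterns are just illustrative):

oc -n openshift-ingress get pods -o wide
crictl ps | grep -i router
netstat -tpln | grep -E ':(80|443) '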

I tried updating the version back from 4.2.4 to 4.2.2, but I
accidentally used the 4.1.22 image digest value, so I quickly reverted to
4.2.4 once I saw the apiservers coming up as 4.1.22.  I then noticed that
there was a 4.2.7 release on the candidate-4.2 channel, so I switched to
that, and ingress started working properly again.
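
If it helps anyone, switching channel and moving to 4.2.7 was roughly the following (I
may not have used these exact invocations):

oc patch clusterversion version --type merge -p '{"spec":{"channel":"candidate-4.2"}}'
oc adm upgrade --to 4.2.7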

So my question is, what is the strategy for recovering from a failed
update? Do I need to have etcd backups and then restore the cluster by
restoring etcd? Ie.
https://docs.openshift.com/container-platform/4.2/backup_and_restore/disaster_recovery/scenario-2-restoring-cluster-state.html
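
For what it's worth, my reading of those disaster recovery docs is that the backup side
boils down to taking an etcd snapshot on one of the masters, something like the
following (the script name and path are from memory of the docs, so double-check
against the link above):

sudo /usr/local/bin/etcd-snapshot-backup.sh ./assets/backup/snapshot.db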

The upgrade page

specifically says "Reverting your cluster to a previous version, or a
rollback, is not supported. Only upgrading to a newer version is
supported." so is it an expectation for a production cluster that you would
restore from backup if the cluster isn't usable?

Maybe the upgrade page should mention taking backups? Especially if there
is no rollback option.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: How to use extra trusted CA certs when pulling images for a builder

2019-11-17 Thread Joel Pearson
On Mon, 18 Nov 2019 at 12:37, Ben Parees  wrote:

>
>
> On Sun, Nov 17, 2019 at 7:24 PM Joel Pearson <
> japear...@agiledigital.com.au> wrote:
>
>>
>>
>> On Wed, 13 Nov 2019 at 02:43, Ben Parees  wrote:
>>
>>>
>>>
>>> On Mon, Nov 11, 2019 at 11:27 PM Ben Parees  wrote:
>>>
>>>>
>>>>
>>>> On Mon, Nov 11, 2019 at 10:47 PM Joel Pearson <
>>>> japear...@agiledigital.com.au> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Tue, 12 Nov 2019 at 06:56, Ben Parees  wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Can I use the “trustedCA” part of the proxy configuration without
>>>>>>> actually specifying an explicit proxy?
>>>>>>>
>>>>>>
>>>>>> you should be able to.  Daneyon can you confirm?  (if you can't i'd
>>>>>> consider it a bug).
>>>>>>
>>>>>> It does work! Thanks for that. user-ca-bundle already existed and had
>>>>> my certificate in there, I just needed to reference user-ca-bundle in the
>>>>> proxy config.
>>>>>
>>>>
>>>> cool, given that you supplied the CAs during install, and the
>>>> user-ca-bundle CM was created, i'm a little surprised the install didn't
>>>> automatically setup the reference in the proxyconfig resource for you.  I'm
>>>> guessing it did not because there was no actual proxy hostname configured.
>>>> I think that's a gap we should close..would you mind filing a bug?  (
>>>> bugzilla.redhat.com).  You can submit it against the install component.
>>>>
>>>
>>> fyi I've filed a bug for this aspect of the issues you ran into:
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1771564
>>>
>>>
>> Thanks for raising this, reading through the related github tickets it
>> looks like I've opened a can of worms to some degree.
>>
>
> Yes there's some difference of opinion on what the out of box desired
> behavior is, but at a minimum you've exposed a gap in our documentation
> that we will get fixed.
>
>
> I also just discovered that the openshift cluster version operator (CVO),
isn't quite configured correctly out of the box to use the correct trusted
CA certs (which means it can't download cluster updates).

It correctly mounts /etc/ssl/certs from the host (the masters), but it
fails to also mount /etc/pki, because the certs are a symlink
/etc/ssl/certs/ca-bundle.crt ->
/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem

I couldn't find where the installer sets up the CVO but an example of what
is missing is here.
https://github.com/openshift/cluster-version-operator/blob/01a7825179246fa708ac64de96e6675c0bf9a930/bootstrap/bootstrap-pod.yaml#L44-L46
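
For clarity, what I'd expect the CVO pod to carry is roughly the following (the
etc-ssl-certs part mirrors the linked bootstrap-pod.yaml; the etc-pki part is the bit I
think is missing):

  volumeMounts:
  - mountPath: /etc/ssl/certs
    name: etc-ssl-certs
    readOnly: true
  - mountPath: /etc/pki
    name: etc-pki
    readOnly: true
  volumes:
  - name: etc-ssl-certs
    hostPath:
      path: /etc/ssl/certs
  - name: etc-pki
    hostPath:
      path: /etc/pki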


Is there an existing bug for this? Or should I raise a bugzilla for this?
Would it be part of the installer?
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: How to use extra trusted CA certs when pulling images for a builder

2019-11-17 Thread Joel Pearson
On Wed, 13 Nov 2019 at 02:43, Ben Parees  wrote:

>
>
> On Mon, Nov 11, 2019 at 11:27 PM Ben Parees  wrote:
>
>>
>>
>> On Mon, Nov 11, 2019 at 10:47 PM Joel Pearson <
>> japear...@agiledigital.com.au> wrote:
>>
>>>
>>>
>>> On Tue, 12 Nov 2019 at 06:56, Ben Parees  wrote:
>>>
>>>>
>>>>
>>>>>
>>>>> Can I use the “trustedCA” part of the proxy configuration without
>>>>> actually specifying an explicit proxy?
>>>>>
>>>>
>>>> you should be able to.  Daneyon can you confirm?  (if you can't i'd
>>>> consider it a bug).
>>>>
>>>> It does work! Thanks for that. user-ca-bundle already existed and had
>>> my certificate in there, I just needed to reference user-ca-bundle in the
>>> proxy config.
>>>
>>
>> cool, given that you supplied the CAs during install, and the
>> user-ca-bundle CM was created, i'm a little surprised the install didn't
>> automatically setup the reference in the proxyconfig resource for you.  I'm
>> guessing it did not because there was no actual proxy hostname configured.
>> I think that's a gap we should close..would you mind filing a bug?  (
>> bugzilla.redhat.com).  You can submit it against the install component.
>>
>
> fyi I've filed a bug for this aspect of the issues you ran into:
> https://bugzilla.redhat.com/show_bug.cgi?id=1771564
>
>
Thanks for raising this, reading through the related github tickets it
looks like I've opened a can of worms to some degree.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: How to use extra trusted CA certs when pulling images for a builder

2019-11-17 Thread Joel Pearson
On Wed, 13 Nov 2019 at 01:34, Ben Parees  wrote:

>
>
> On Tue, Nov 12, 2019 at 3:45 AM Joel Pearson <
> japear...@agiledigital.com.au> wrote:
>
>>
>>
>> On Tue, 12 Nov 2019 at 15:37, Ben Parees  wrote:
>>
>>>
>>>
>>> On Mon, Nov 11, 2019 at 11:26 PM Joel Pearson <
>>> japear...@agiledigital.com.au> wrote:
>>>
>>>> I've now discovered that the cluster-samples-operator doesn't seem to
>>>> honour the proxy settings, and I see lots of errors in the
>>>> cluster-samples-operator- pod logs
>>>>
>>>> time="2019-11-12T04:15:49Z" level=warning msg="Image import for
>>>> imagestream dotnet tag 2.1 generation 2 failed with detailed message
>>>> Internal error occurred: Get https://registry.redhat.io/v2/: x509: certificate signed by unknown
>>>> authority"
>>>>
>>>> Is there a way to get that operator to use the same user-ca-bundle?
>>>>
>>>
>>> image import should be using those CAs (it's really about the
>>> openshift-apiserver, not the samples operator) automatically (sounds like
>>> another potential bug, but i'll let Oleg weigh in on this one).
>>>
>>> However barring that, you can use the mechanism described here to
>>> setup additional CAs for importing from registries:
>>>
>>> https://docs.openshift.com/container-platform/4.2/openshift_images/image-configuration.html#images-configuration-file_image-configuration
>>>
>>> you can follow the more detailed instructions here:
>>>
>>> https://docs.openshift.com/container-platform/4.2/builds/setting-up-trusted-ca.html#configmap-adding-ca_setting-up-trusted-ca
>>>
>>
>> I tried this approach but it didn't work for me.
>>
>> I ran this command:
>>
>> oc create configmap registry-cas -n openshift-config \
>> --from-file=registry.redhat.io..5000=/path/to/ca.crt \
>> --from-file=registry.redhat.io..443=/path/to/ca.crt \
>> --from-file=registry.redhat.io=/path/to/ca.crt
>>
>> and:
>>
>> oc patch image.config.openshift.io/cluster --patch
>> '{"spec":{"additionalTrustedCA":{"name":"registry-cas"}}}' --type=merge
>>
>> And that still didn't work. First I deleted the
>> cluster-samples-operator- pod, then I tried forcing the masters to
>> restart by touching some machine config (I don't know a better way).
>> But it still didn't work.  Maybe the samples operator doesn't let you
>> easily override the trusted CA certs?
>>
>
> Because no good bug report should not be rewarded with some educational
> background:
>
> The samples operator is only responsible for creating the imagestream, it
> isn't actually doing the import (ie reaching out to the registry and
> pulling down the metadata and putting it in the imagestream).  That task is
> performed by the openshift-apiserver.  What should be happening when you
> update the image config resource with the name of the CA configmap is that
> the openshift-apiserver operator should observe the configuration change
> and provide the new CAs to the openshift-apiserver pods (which necessitates
> a restart of the openshift-apiserver pods).
>
> Once the openshift-apiserver pods are restarted with the new CAs, you
> should be able to run "oc import-image" to retry the import.  (The samples
> operator is supposed to retry the failed imports periodically, but there is
> a different bug that is being fixed related to that, so until then you'll
> have to retry the import manually once you've corrected whatever caused the
> failure).
>
> So again, there may be a bug here in terms of the openshift-apiserver
> picking up the CAs and we need to investigate it (as well as a separate bug
> if it is not picking up the proxy CAs), but I wanted you to understand the
> relevant components so your own debugging process can be more productive.
>
>
Thanks for the explanation Ben, it helped me figure out where the issues
were.

For other reasons (I had tried to customise the name of the Azure
resource group and something would occasionally (once a day?) change it
back to the default name), I ended up burning the cluster to the ground,
and I configured the "spec.trustedCA.name" reference during installation by
customising the manifest files generated by "openshift-install create
manifests --dir=ignition-files", and then the samples operator worked out
of the box!

So it was just that the samples operator doesn't retry as you mentioned.
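
For anyone hitting the same thing before that retry fix lands, a manual re-import along
the lines Ben described should be something like this (the imagestream name is just an
example):

oc -n openshift import-image dotnet --all --confirm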

Also, I didn't need to setup the additional CAs for importing from
registries in the end.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: How to use extra trusted CA certs when pulling images for a builder

2019-11-12 Thread Joel Pearson
On Tue, 12 Nov 2019 at 15:37, Ben Parees  wrote:

>
>
> On Mon, Nov 11, 2019 at 11:26 PM Joel Pearson <
> japear...@agiledigital.com.au> wrote:
>
>> I've now discovered that the cluster-samples-operator doesn't seem to honour
>> the proxy settings, and I see lots of errors in the
>> cluster-samples-operator- pod logs
>>
>> time="2019-11-12T04:15:49Z" level=warning msg="Image import for
>> imagestream dotnet tag 2.1 generation 2 failed with detailed message
>> Internal error occurred: Get https://registry.redhat.io/v2/: x509: certificate signed by unknown
>> authority"
>>
>> Is there a way to get that operator to use the same user-ca-bundle?
>>
>
> image import should be using those CAs (it's really about the
> openshift-apiserver, not the samples operator) automatically (sounds like
> another potential bug, but i'll let Oleg weigh in on this one).
>
> However barring that, you can use the mechanism described here to
> setup additional CAs for importing from registries:
>
> https://docs.openshift.com/container-platform/4.2/openshift_images/image-configuration.html#images-configuration-file_image-configuration
>
> you can follow the more detailed instructions here:
>
> https://docs.openshift.com/container-platform/4.2/builds/setting-up-trusted-ca.html#configmap-adding-ca_setting-up-trusted-ca
>

I tried this approach but it didn't work for me.

I ran this command:

oc create configmap registry-cas -n openshift-config \
--from-file=registry.redhat.io..5000=/path/to/ca.crt \
--from-file=registry.redhat.io..443=/path/to/ca.crt \
--from-file=registry.redhat.io=/path/to/ca.crt

and:

oc patch image.config.openshift.io/cluster --patch
'{"spec":{"additionalTrustedCA":{"name":"registry-cas"}}}' --type=merge

And that still didn't work. First I deleted the
cluster-samples-operator- pod, then I tried forcing the masters to
restart by touching some machine config (I don't know a better way).
But it still didn't work.  Maybe the samples operator doesn't let you
easily override the trusted CA certs?


>
>
> (Brandi/Adam, we should really include the example from that second link,
> in the general "image resource configuration" page from the first link).
>
> Unfortunately it does not allow you to reuse the user-ca-bundle CM since
> the format of the CM is a bit different (needs an entry per registry
> hostname).
>
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: How to use extra trusted CA certs when pulling images for a builder

2019-11-11 Thread Joel Pearson
I've now discovered that the cluster-samples-operator doesn't seem to honour
the proxy settings, and I see lots of errors in the
cluster-samples-operator- pod logs

time="2019-11-12T04:15:49Z" level=warning msg="Image import for imagestream
dotnet tag 2.1 generation 2 failed with detailed message Internal error
occurred: Get https://registry.redhat.io/v2/: x509: certificate signed by
unknown authority"

Is there a way to get that operator to use the same user-ca-bundle?

On Tue, 12 Nov 2019 at 14:46, Joel Pearson 
wrote:

>
>
> On Tue, 12 Nov 2019 at 06:56, Ben Parees  wrote:
>
>>
>>
>>>
>>> Can I use the “trustedCA” part of the proxy configuration without
>>> actually specifying an explicit proxy?
>>>
>>
>> you should be able to.  Daneyon can you confirm?  (if you can't i'd
>> consider it a bug).
>>
>> It does work! Thanks for that. user-ca-bundle already existed and had my
> certificate in there, I just needed to reference user-ca-bundle in the
> proxy config.
>
> apiVersion: config.openshift.io/v1
> kind: Proxy
> metadata:
>   name: cluster
> spec:
>   trustedCA:
>     name: user-ca-bundle
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: How to use extra trusted CA certs when pulling images for a builder

2019-11-11 Thread Joel Pearson
On Tue, 12 Nov 2019 at 06:56, Ben Parees  wrote:

>
>
>>
>> Can I use the “trustedCA” part of the proxy configuration without
>> actually specifying an explicit proxy?
>>
>
> you should be able to.  Daneyon can you confirm?  (if you can't i'd
> consider it a bug).
>
> It does work! Thanks for that. user-ca-bundle already existed and had my
certificate in there, I just needed to reference user-ca-bundle in the
proxy config.

apiVersion: config.openshift.io/v1
kind: Proxy
metadata:
  name: cluster
spec:
  trustedCA:
    name: user-ca-bundle
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: How to use extra trusted CA certs when pulling images for a builder

2019-11-11 Thread Joel Pearson
On Tue, 12 Nov 2019 at 12:26 am, Ben Parees  wrote:

>
>
> On Mon, Nov 11, 2019 at 1:17 AM Joel Pearson <
> japear...@agiledigital.com.au> wrote:
>
>> Hi,
>>
>> I’m trying to build an image in Openshift 4.2 where my internet has an
>> MITM proxy.
>>
>> So trying to pull docker images fails during the build with x509 errors.
>>
>> Is there a way to provide extra trusted CA certificates to the builder?
>>
>
> Did you supply additional CAs via the proxy configuration?  Those should
> be picked up by the builder automatically when it is pulling images and I
> think it'd be a bug if you configured that and it's not working:
>
> https://docs.openshift.com/container-platform/4.2/networking/enable-cluster-wide-proxy.html#nw-proxy-configure-object_config-cluster-wide-proxy
>

<https://docs.openshift.com/container-platform/4.2/networking/enable-cluster-wide-proxy.html#nw-proxy-configure-object_config-cluster-wide-proxy>
>
I forgot to mention that it's a transparent proxy; in install-config.yaml I
added the proxy CA to "additionalTrustBundle", which helped it install the
cluster. But it just didn't seem to apply to the builder.

Can I use the “trustedCA” part of the proxy configuration without actually
specifying an explicit proxy?
-- 
Kind Regards,

Joel Pearson
Agile Digital | Senior Software Consultant

Love Your Software™ | ABN 98 106 361 273
p: 1300 858 277 | m: 0405 417 843 <0405417843> | w: agiledigital.com.au
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


How to use extra trusted CA certs when pulling images for a builder

2019-11-10 Thread Joel Pearson
Hi,

I’m trying to build an image in Openshift 4.2 where my internet has an MITM
proxy.

So trying to pull docker images fails during the build with x509 errors.

Is there a way to provide extra trusted CA certificates to the builder?

Pulling image registry.redhat.io/ubi7-minimal:7.7 ...

Warning: Pull failed, retrying in 5s ...

Warning: Pull failed, retrying in 5s ...

Warning: Pull failed, retrying in 5s ...

error: build error: failed to pull image: After retrying 2 times, Pull
image still failed due to error: while pulling "docker://
registry.redhat.io/ubi7-minimal:7.7" as "registry.redhat.io/ubi7-minimal:7.7":
Error initializing source docker://registry.redhat.io/ubi7-minimal:7.7:
pinging docker registry returned: Get https://registry.redhat.io/v2/: x509:
certificate signed by unknown authority

Thanks,

Joel

-- 
Kind Regards,

Joel Pearson
Agile Digital | Senior Software Consultant

Love Your Software™ | ABN 98 106 361 273
p: 1300 858 277 | m: 0405 417 843 <0405417843> | w: agiledigital.com.au
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Failing to bootstrap disconnected 4.2 cluster on metal

2019-10-28 Thread Joel Pearson
>
> Almost always means a node is broken / blocked / unable to schedule pods,
> which prevents DNS from deploying.


That's the weird thing though. DNS is deployed, and all the nodes are happy
according to "oc get nodes".

It seems that the operator is misreporting the error.  In the console
dashboard it has a number of alerts that seem out of date, that I'm not
able to clear too.

The dns-default DaemonSet says that 7 of 7 pods are ok.

Is there a way to reboot/re-initialise a "stuck" operator?
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Failing to bootstrap disconnected 4.2 cluster on metal

2019-10-28 Thread Joel Pearson
>
> > Maybe must-gather could be included in the release manifest so that it's
> available in disconnected environments by default?
> It is:
>   $ oc adm release info --image-for=must-gather
> quay.io/openshift-release-dev/ocp-release:4.2.0
>
> quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:34ff29512304f77b0ab70ea6850e7f8295a4d19e497ab690ea5102a7044ea993
> If your 'oc adm must-gather' is reaching out to Quay, instead of
> hitting your mirror, it may be because your samples operator has yet
> to get the mirrored must-gather ImageStream set up.


It looks like image streams don't honor the imageContentSources mirror, and
try to reach out to the internet.

I had a look at the openshift/must-gather image stream and there was an
error saying:

Internal error occurred: Get https://quay.io/v2: dial tcp: lookup quay.io
on 172.30.0.10:53 server misbehaving

Running "oc adm must-gather --image
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:34ff29512304f77b0ab70ea6850e7f8295a4d19e497ab690ea5102a7044ea993"
actually worked.

That (un)available typo should be fixed in master by [1], but looks
> like that hasn't been backported to 4.2.z.  But look for the
> machine-config daemon that is unready (possibly by listing Pods), and
> see why it's not going ready.


Turns out that all of the machine-config daemons are ready (I can see 7 of
them all marked as ready). But the machine-config operator just doesn't
appear to be trying anymore.

It's listed as Available=False Progressing=False and Degraded=True.

I tried deleting the operator pod in the hope that it'd kickstart
something, but it didn't seem to help.

I noticed a message right up the top saying:
event.go:247] Could not construct reference to: '&v1.ConfigMap...' Will not
report event 'Normal' 'LeaderElection' 'machine-config-operator-5f47...
become leader'

The pod that I deleted had that same message too, is this a red herring?

I have must-gather logs now, except that it will probably be complicated to
get them off this air-gapped system.  Are there any pointers about where I
should look to find out why it's no longer progressing? Can I make the
operator try again somehow?
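
In case it helps, this is the sort of thing I've been poking at (the label selector here
is a guess on my part):

oc get clusteroperator machine-config -o yaml
oc -n openshift-machine-config-operator get pods
oc -n openshift-machine-config-operator delete pod -l k8s-app=machine-config-operator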

I also noticed that the dns operator is marked available, but there is a
degraded status saying that "Not all desired DNS DaemonSets available"
however, they are all available.

On Tue, 29 Oct 2019 at 05:24, W. Trevor King  wrote:

> On Mon, Oct 28, 2019 at 4:05 AM Joel Pearson wrote:
> > Maybe must-gather could be included in the release manifest so that it's
> available in disconnected environments by default?
>
> It is:
>
>   $ oc adm release info --image-for=must-gather
> quay.io/openshift-release-dev/ocp-release:4.2.0
>
> quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:34ff29512304f77b0ab70ea6850e7f8295a4d19e497ab690ea5102a7044ea993
>
> If your 'oc adm must-gather' is reaching out to Quay, instead of
> hitting your mirror, it may be because your samples operator has yet
> to get the mirrored must-gather ImageStream set up.
>
> >> Failed to resync 4.2.0 because: timed out waiting for the condition
> during waitForDaemonsetRollout: Daemonset machine-config-daemon is not
> ready. status (desired:7, updated 7, ready: 6, unavailable: 6)
>
> That (un)available typo should be fixed in master by [1], but looks
> like that hasn't been backported to 4.2.z.  But look for the
> machine-config daemon that is unready (possibly by listing Pods), and
> see why it's not going ready.
>
> Cheers,
> Trevor
>
> [1]:
> https://github.com/openshift/machine-config-operator/commit/efb6a96a5bcb13cb3c0c0a0ac0c2e7b022b72665
>


-- 
Kind Regards,

Joel Pearson
Agile Digital | Senior Software Consultant

Love Your Software™ | ABN 98 106 361 273
p: 1300 858 277 | m: 0405 417 843 <0405417843> | w: agiledigital.com.au
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Failing to bootstrap disconnected 4.2 cluster on metal

2019-10-28 Thread Joel Pearson
So I got past bootstrap this time, and it made it almost all the way; it
got stuck on the machine-config operator.  All the other cluster operators have
passed.

I'm not really sure how to diagnose what is wrong with the machine-config
operator.  I tried 'oc adm must-gather', but it didn't work because the
must-gather container isn't part of the release manifest, so it tried to
reach out to quay.io to download that container, which obviously fails in a
disconnected environment.  Maybe must-gather could be included in the
release manifest so that it's available in disconnected environments by
default?

I instead ran "oc describe clusteroperator machine-config", this error
message was there, how do I diagnose this?

Failed to resync 4.2.0 because: timed out waiting for the condition during
> waitForDaemonsetRollout: Daemonset machine-config-daemon is not ready.
> status (desired:7, updated 7, ready: 6, unavailable: 6)


On Mon, 28 Oct 2019 at 05:59, Clayton Coleman  wrote:

> We probably need to remove the example from the docs and highlight
> that you must copy the value reported by image mirror
>
> > On Oct 27, 2019, at 11:33 AM, W. Trevor King  wrote:
> >
> >> On Sun, Oct 27, 2019 at 2:17 AM Joel Pearson wrote:
> >> Ooh, does this mean 4.2.2 is out or the release is imminent? Should I
> be trying to install 4.2.2 instead of 4.2.0?
> >
> > 4.2.2 exists and is in candidate-4.2.  That means it's currently
> > unsupported.  The point of candidate-* testing is to test the releases
> > to turn up anything that should block them going stable, so we're
> > certainly launching a bunch of throw-away 4.2.2 clusters at this
> > point, and other folks are welcome to do that too.  But if you want
> > stability and support, you should wait until it is promoted into
> > fast-4.2 or stable-4.2 (which may never happen if testing turns up a
> > serious-enough issue).  So "maybe" to both your question ;).
> >
> >> I mirrored quay.io/openshift-release-dev/ocp-release:4.2.0
> >
> > Yeah, should be no CI-registry images under that.
> >
> > Cheers,
> > Trevor
> >
> > ___
> > users mailing list
> > users@lists.openshift.redhat.com
> > http://lists.openshift.redhat.com/openshiftmm/listinfo/users
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Failing to bootstrap disconnected 4.2 cluster on metal

2019-10-27 Thread Joel Pearson
>
> quay.io/openshift-release-dev/ocp-release:4.2.0
>
> $ oc adm release info --pullspecs quay.io/openshift-release-dev/ocp-release:4.2.2 | grep -A3 Images:


Ooh, does this mean 4.2.2 is out or the release is imminent? Should I be
trying to install 4.2.2 instead of 4.2.0?

 ... And it's not in [1], although you should just be
> recycling whatever 'oc adm release mirror' suggests instead of blindly
> copy/pasting from docs.  Which release did you mirror?


Thanks for this information. Looks like I must have skipped reading the
output of the mirror command. Thanks for that heads up!

I mirrored quay.io/openshift-release-dev/ocp-release:4.2.0

  I dunno what happened with your API-server lock-up, but
> 'openshift-install gather bootstrap ...' will SSH into your bootstrap
> machine and from there onto the control-plane machines and gather the
> things we expected would be useful for debugging this sort of thing,
> so probably start with that.


I'll try out "openshift-install gather bootstrap" tomorrow. It sounds very
useful, thanks for that information.
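
As I understand it, the invocation is roughly like this (I haven't run it yet, and the
IPs and paths are placeholders):

openshift-install gather bootstrap --dir=<install-dir> \
  --bootstrap <bootstrap-ip> \
  --master <master-1-ip> --master <master-2-ip> --master <master-3-ip>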

Thanks,

Joel
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Failing to bootstrap disconnected 4.2 cluster on metal

2019-10-25 Thread Joel Pearson
Hi,

I'm trying to bootstrap a disconnected (air-gapped) 4.2 cluster using the bare
metal method.
It is technically vmware, but I'm following the bare metal version as our
vmware cluster wasn't quite compatible with the vmware instructions.

After a few false starts I managed to get the bootstrapping to start to
take place.  One strange thing that happened was that it was trying to
download images from "quay.io/openshift-release-dev/ocp-v4.0-art-dev"
instead of the documented "quay.io/openshift-release-dev/ocp-release". I
found this rather odd, and I couldn't find many references to
"ocp-v4.0-art-dev" on the internet, so I'm not sure exactly where it came
from.  I did a "strings openshift-install | grep ocp-v4.0-art-dev" but that
didn't show anything, so it's a bit of a strange one.

So my image content sources ended up being:

imageContentSources:
- mirrors:
  - :5000//release
  source: quay.io/openshift-release-dev/ocp-release
- mirrors:
  - :5000//release
  source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
- mirrors:
  - :5000//release
  source: registry.svc.ci.openshift.org/ocp/release
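
For completeness, the mirroring itself was done with something along these lines (the
registry and repo names are placeholders, matching the elided values above; I'm quoting
from memory):

oc adm release mirror \
  --from=quay.io/openshift-release-dev/ocp-release:4.2.0 \
  --to=<registry>:5000/<repo>/release \
  --to-release-image=<registry>:5000/<repo>/release:4.2.0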

I was watching the journalctl on the bootstrap server, and I saw each etcd
server join one by one; once they had all joined, the apiserver
on the bootstrap server seemed to lock up, and when I tried to connect to
https://localhost:6443 the connections would hang.  Initially, I thought
this meant that bootstrap had completed, but then I noticed that none of
the master nodes were listening on 6443; they were all trying to look
themselves up in etcd at "api-int.." but nothing
was listening.

I then scoured the journal on the bootstrap node, but I struggled to find
logs related to why the apiserver had disappeared.  The journal was mostly
full of the bootstrap node trying to connect to https://localhost:6443,
which suggested to me that bootstrap was not yet complete.

I tried rebooting the bootstrap node, but I think that made it worse, it
seemed to be in a crash loop whinging about files in /etc/kubernetes
already existing or something like that.  I had a look through /var/logs
and found this error message in some pod logs:

exiting because of error: log: unable to create log: open
/var/log/bootstrap-control-plane/kube-apiserver.log: permission denied

I'm not sure if that error is because I restarted before bootstrap was
successful, or if that is actually some sort of problem.

I tried reinstalling from scratch a few times, and it always got stuck in
the same place, so it doesn't seem to be transient.

Where can I look for errors? Is "ocp-v4.0-art-dev" an indication of a
problem? Since it's an air-gapped solution it's difficult to get logs out
of the system, so I don't know if I'll be able to use must-gather.
However, if I'm understanding it correctly, must-gather can only be used
after bootstrap has succeeded.

Thoughts?
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: DNS resolution performance woeful while CRC is running in Windows

2019-10-07 Thread Joel Pearson
Thanks for pointing that out, when crc starts, I do see:

INFO Will run as admin: add dns server address to interface vEthernet
(Default Switch)

I also noticed in my ipconfig /all, that a bunch of the vEthernet
interfaces have these IPv6 link-local DNS addresses:

   DNS Servers . . . . . . . . . . . : fec0:0:0:ffff::1%1
                                       fec0:0:0:ffff::2%1
                                       fec0:0:0:ffff::3%1


Which don't seem to be a problem until crc starts.  So maybe whatever crc
does to make the name resolution work, inadvertently enables resolution via
IPv6 nameservers somehow, causing it to try and resolve names via the above
DNS servers.

From what I can tell those DNS servers seem fairly common; I don't exactly
know where they come from, or how they're supposed to work.  I suspect if I
go through all my interfaces disabling ipv6 then the DNS issue will go away.
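
If I do end up going down that path, I think it would be something like this per
interface in an elevated PowerShell (untested on my side so far):

Get-NetAdapterBinding -ComponentID ms_tcpip6
Disable-NetAdapterBinding -Name "vEthernet (Default Switch)" -ComponentID ms_tcpip6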

Thanks,

Joel


On Mon, 7 Oct 2019 at 22:40, Walther, Jens-Uwe <
jens-uwe.walt...@proservia.de> wrote:

> Hi, I had an CRC issue and one of the developer stated that the default
> Hyper-V network switch “Default switch” is the one who is “hosting” the
>
> crc.testing domain via DHCP in some kind.
>
>
>
> Beste Grüße / Best regards
>
>
> *Jens-Uwe Walther *
>
>
>
> M: +49 (160) 97250976
>
>
>
> *Von:* users-boun...@lists.openshift.redhat.com <
> users-boun...@lists.openshift.redhat.com> *Im Auftrag von *Joel Pearson
> *Gesendet:* Montag, 7. Oktober 2019 12:29
> *An:* users 
> *Betreff:* DNS resolution performance woeful while CRC is running in
> Windows
>
>
>
> *CAUTION: This email originated from outside of the organization.*
>
> *Do not click links or open attachments unless you recognize the sender
> and know the content is safe. *
>
> Hi,
>
>
>
> I'm wondering if someone can let me know how the crc.testing domain works
> in crc for windows?
>
>
>
> I can't see any entries in c:\windows\system32\drivers\etc\hosts, and my
> DNS entries appear to be the same, but a dig command doesn't
> find api.crc.testing, so it's doing something special to get that name
> resolution to work.
>
>
>
> When crc is running, DNS resolution performance in chrome and edge, is
> woeful, around the 10-second mark.
>
>
>
> Once I ran crc stop then DNS performance went back to normal.
>
>
>
> I'm running Windows 10, Windows Insiders Build 18990 (fast ring).
>
>
>
> I went hunting through "ipconfig /all", but haven't found anything.
>
>
>
> Thanks,
>
>
>
> Joel
>
>
>
>
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


DNS resolution performance woeful while CRC is running in Windows

2019-10-07 Thread Joel Pearson
Hi,

I'm wondering if someone can let me know how the crc.testing domain works
in crc for windows?

I can't see any entries in c:\windows\system32\drivers\etc\hosts, and my
DNS entries appear to be the same, but a dig command doesn't
find api.crc.testing, so it's doing something special to get that name
resolution to work.

When crc is running, DNS resolution performance in chrome and edge, is
woeful, around the 10-second mark.

Once I ran crc stop then DNS performance went back to normal.

I'm running Windows 10, Windows Insiders Build 18990 (fast ring).

I went hunting through "ipconfig /all", but haven't found anything.

Thanks,

Joel
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: [OKD/OCP v4]: deployment on a single node using CodeReady Container

2019-09-19 Thread Joel Pearson
Marvin, you could try enabling nested virtualisation in GCP?

https://cloud.google.com/compute/docs/instances/enable-nested-virtualization-vm-instances
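
Roughly, I believe it boils down to creating a copy of the boot disk image with the VMX
licence attached and booting your VM from that (the names are placeholders; the doc
above has the full steps):

gcloud compute images create nested-virt-image \
  --source-disk <boot-disk-name> --source-disk-zone <zone> \
  --licenses "https://www.googleapis.com/compute/v1/projects/vm-options/global/licenses/enable-vmx"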


On Fri, 20 Sep 2019 at 09:50, Just Marvin <
marvin.the.cynical.ro...@gmail.com> wrote:

> Fernando,
>
> Is CRC only expected to run on bare-metal? I tried running it on a VM
> in GCP and it didn't work, complaining about virtualization problems (sorry
> - I forget the exact error). It runs fine on my laptop, but I'd really like
> to not muddy up my laptop with all kinds of experimental things.
>
> Regards,
> Marvin
>
> On Wed, Sep 18, 2019 at 12:35 PM Fernando Lozano 
> wrote:
>
>> Hi Joel,
>>
>> Yes, CRC requires virtualization. It creates and manages a VM, using the
>> hypervisor provided by your laptop OS, and runs OpenShift inside that VM.
>> AFAIK there is no more all-in-one containerized support for OpenShift so
>> more 'oc cluster up' for OpenShift 4.x.
>>
>> []s, Fernando Lozano
>>
>>
>> On Wed, Sep 18, 2019 at 9:44 AM Joel Pearson <
>> japear...@agiledigital.com.au> wrote:
>>
>>> With CodeReady Container, it's not possible to use it without
>>> virtualisation right?  Because it needs CoreOS, and can't startup on an
>>> existing docker installation like you can with "oc cluster up"?
>>>
>>> I'm only asking because I almost got OKD 3.11 running on Windows 10 WSL
>>> (windows subsystem for linux) v2.  But if it's a full VM, then running
>>> inside WSL 2 doesn't really make sense (and probably doesn't work anyway).
>>>
>>> On Sat, 14 Sep 2019 at 02:35, Daniel Comnea 
>>> wrote:
>>>
>>>> Recently folks were asking what is the minishift's alternative for v4
>>>> and in case you've missed the news see [1]
>>>>
>>>> Hopefully that will also work for OKD v4 once  the MVP is out.
>>>>
>>>>
>>>> Dani
>>>>
>>>> [1]
>>>> https://developers.redhat.com/blog/2019/09/05/red-hat-openshift-4-on-your-laptop-introducing-red-hat-codeready-containers/
>>>> ___
>>>> users mailing list
>>>> users@lists.openshift.redhat.com
>>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>>>
>>>
>>> ___
>>> users mailing list
>>> users@lists.openshift.redhat.com
>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>>
>> ___
>> users mailing list
>> users@lists.openshift.redhat.com
>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>
>

-- 
Kind Regards,

Joel Pearson
Agile Digital | Senior Software Consultant

Love Your Software™ | ABN 98 106 361 273
p: 1300 858 277 | m: 0405 417 843 <0405417843> | w: agiledigital.com.au
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: [OKD/OCP v4]: deployment on a single node using CodeReady Container

2019-09-18 Thread Joel Pearson
With CodeReady Container, it's not possible to use it without
virtualisation, right?  Because it needs CoreOS, and can't start up on an
existing docker installation like you can with "oc cluster up"?

I'm only asking because I almost got OKD 3.11 running on Windows 10 WSL
(windows subsystem for linux) v2.  But if it's a full VM, then running
inside WSL 2 doesn't really make sense (and probably doesn't work anyway).

On Sat, 14 Sep 2019 at 02:35, Daniel Comnea  wrote:

> Recently folks were asking what is the minishift's alternative for v4 and
> in case you've missed the news see [1]
>
> Hopefully that will also work for OKD v4 once  the MVP is out.
>
>
> Dani
>
> [1]
> https://developers.redhat.com/blog/2019/09/05/red-hat-openshift-4-on-your-laptop-introducing-red-hat-codeready-containers/
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: [ssl] oc cluster up

2019-02-27 Thread Joel Pearson
Why not use an ansible installation for a single node instead? Then you can let 
ansible configure everything properly for you. 

Sent from my iPhone

> On 28 Feb 2019, at 9:02 am, Pavel Maslov  wrote:
> 
> With my original question, I meant how can I secure the Web Console (I was 
> able to install a custom SSL certificate for the Router, so now it's the Web 
> Console's turn). I am following the instructions from the documentation [1], 
> but to no avail - the Web Console is still picking up the default self-signed 
> certificate generated by Openshift.
> 
> Since I am starting my Openshift cluster using oc cluster up, a new directory 
> gets created, namely openshift.local.clusterup/.
> So what I did was edit the file 
> openshift.local.clusterup/kub-apiserver/master-config.yaml as described in 
> [1]:
> 
> servingInfo:
>   masterPublicURL: https://dev3.maslick.com:8443
>   publicURL: https://dev3.maslick.com:8443/console/
>   bindAddress: 0.0.0.0:8443
>   bindNetwork: tcp4
>   certFile: master.server.crt
>   clientCA: ca.crt
>   keyFile: master.server.key
>   maxRequestsInFlight: 1200
>   namedCertificates:
>   - certFile: dev3-maslick-com.crt
> clientCA: ca-maslick-com.pem
> keyFile: key-dev3-maslick-com.pem
> names:
>   - "dev3.maslick.com"
>   requestTimeoutSeconds: 3600
> volumeConfig:
>   dynamicProvisioningEnabled: true
> 
> It doesn't work though. It doesn't even pick up my certificate. I put the 
> crt, ca and key files into the same folder as master-config.yaml: 
> $HOME/openshift.local.clusterup/kub-apiserver/.
> Any thoughts? Thanks!
> 
> [1] 
> https://docs.okd.io/latest/install_config/certificate_customization.html#configuring-custom-certificates
> 
> Regards,
> Pavel Maslov, MS
> 
> 
>> On Mon, Feb 25, 2019 at 4:31 PM Pavel Maslov  wrote:
>> Hi, all
>> 
>> I'm new to the list. Perhaps, smb already asked this question:
>> 
>> When I start a cluster using oc cluster up command, Openshift generates a 
>> self-signed certificate. Is it possible to give it a real certificate? 
>> 
>> Thanks in advance.
>> 
>> Regards,
>> Pavel Maslov, MS
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: RPMs for 3.11 still missing from the official OpenShift Origin CentOS repo

2019-01-07 Thread Joel Pearson
It just detects it: it checks the operating system type. You don't even need
to change the inventory at all, as RPMs are only supported on CentOS and
containerised installs are only supported on Atomic Host.

On Mon, 7 Jan 2019 at 7:47 pm, mabi  wrote:

> ‐‐‐ Original Message ‐‐‐
> On Sunday, January 6, 2019 11:13 PM, Joel Pearson <
> japear...@agiledigital.com.au> wrote:
>
> It looks like the RPMs will eventually get the security fix according to
> the other reply from Daniel Comnea. But with containers you could have a
> fix within a day as opposed to waiting for new tag which still hasn’t
> happened yet and it’s been more than 1 month.
>
>
> That's good to know that it will eventually get fixed but with security
> vulnerabilities 1 month is already too long.
>
> The upgrade procedure is the same as RPMs, however you wouldn’t need to
> change the rpm repo.
>
>
> That's great! So this means that the OpenShift Ansible upgrade.yml
> playbook detects if the node is using CentOS+RPMs or Atomic Host+Docker and
> then upgrades using the correct way? or is there any special parameter I
> need for example in my Ansible inventory file to let the playbook know that
> I would be using Atomic Host?
>
>
> On Sun, 6 Jan 2019 at 07:03, mabi  wrote:
>>
>>> ‐‐‐ Original Message ‐‐‐
>>> On Saturday, January 5, 2019 3:57 PM, Daniel Comnea <
>>> comnea.d...@gmail.com> wrote:
>>>
>>> [DC]: i think you are a bit confused: there are 2 ways to get the rpms
>>> from CentOS yum repo: using the generic repo [1] which will always have the
>>> latest origin release OR [2] where i've mentioned that you can install
>>> *centos-release-openshift-origin3** rpm which will give you [3] yum repo
>>>
>>>
>>> Thank you for your precisions and yes I am confused because first of all
>>> the upgrading documentation on the okd.io website does not mention
>>> anything about having to manually change the yum repo.repos.d file to match
>>> a new directory for a new version of openshift.
>>>
>>> Then second, this mail (
>>> https://lists.openshift.redhat.com/openshift-archives/users/2018-November/msg7.html)
>>> has the following sentence, I quote:
>>>
>>> "Please note that due to ongoing work on releasing CentOS 7.6, the
>>> mirror.centos.org repo is in freeze mode - see [4] and as such we have
>>> not published the rpms to [5]. Once the freeze mode will end, we'll publish
>>> the rpms."
>>>
>>> So when is the freeze mode over for this repo? I read this should have
>>> happened after the CentOS 7.6 release but that was already one month ago
>>> and still no version 3.11 RPMs in the
>>> http://mirror.centos.org/centos/7/paas/x86_64/openshift-origin/ repo...
>>>
>>> Finally, all I want to do is to upgrade my current okd version 3.10 to
>>> version 3.11 but I can't find any complete instructions documented
>>> correctly. The best I can find is
>>> https://docs.okd.io/3.11/upgrading/automated_upgrades.html which simply
>>> mentions running the following upgrade playbook:
>>>
>>> ansible-playbook \
>>> -i  \
>>> playbooks/byo/openshift-cluster/upgrades//upgrade.yml
>>>
>>> Again here there is no mention of having to modify a yum.repos.d file
>>> beforehand or having to install the centos-release-openshift-origin
>>> package...
>>>
>>> I would be glad if someone can clarify the full upgrade process and/or
>>> have the official documentation enhanced.
>>> ___
>>> users mailing list
>>> users@lists.openshift.redhat.com
>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>>
>>
>>
>> --
> Kind Regards,
>
> Joel Pearson
> Agile Digital | Senior Software Consultant
>
> Love Your Software™ | ABN 98 106 361 273
> p: 1300 858 277 | m: 0405 417 843 <0405417843> | w: agiledigital.com.au
>
>
> --
Kind Regards,

Joel Pearson
Agile Digital | Senior Software Consultant

Love Your Software™ | ABN 98 106 361 273
p: 1300 858 277 | m: 0405 417 843 <0405417843> | w: agiledigital.com.au
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: RPMs for 3.11 still missing from the official OpenShift Origin CentOS repo

2019-01-06 Thread Joel Pearson
On Mon, 7 Jan 2019 at 8:01 am, mabi  wrote:

> ‐‐‐ Original Message ‐‐‐
> On Sunday, January 6, 2019 12:28 PM, Joel Pearson <
> japear...@agiledigital.com.au> wrote:
>
> I think it's worth mentioning here that the RPMs at
> http://mirror.centos.org/centos/7/paas/x86_64/openshift-origin311/ have a
> critical security vulnerability; I think it's unsafe to use the RPMs if
> you're planning on having your cluster available on the internet.
>
> https://access.redhat.com/security/cve/cve-2018-1002105
>
>
> Thank you Joel for pointing this important security issue out. I was not
> aware that the OpenShift RPMs on this official CentOS repository are not
> being updated for security vulnerabilities. This is a total no-go for me as
> my cluster is facing the internet.
>

It looks like the RPMs will eventually get the security fix according to
the other reply from Daniel Comnea. But with containers you could have a
fix within a day, as opposed to waiting for a new tag, which still hasn’t
happened yet even though it’s been more than a month.


> Unless you're going to be using the RedHat supported version of OpenShift,
> ie OCP, then I think the only safe option is to install OKD with Centos
> Atomic Host and the containerised version of OpenShift, ie not use the RPMs
> at all.
>
>
> I will stick with OKD and try out CentOS Atomic Host instead of plain
> CentOS.
>
> However, the bad news for you is that an upgrade from RPMs to
> containerised would not be simple, and you couldn't reuse your nodes
> because you'd need to switch from Centos regular to Centos Atomic Host.  It
> would probably be technically possible but not simple.  I guess you'd
> upgrade your 3.10 cluster to the vulnerable version of 3.11 via RPMs, and
> then migrate your cluster to another cluster running on Atomic Host, I'm
> guessing there is probably some way to replicate the etcd data from one
> cluster to another. But it sounds like it'd be a lot of work, and you'd
> need some pretty deep skills in etcd and openshift.
>
>
> As I am still trying out OKD I will simply trash my existing CentOS nodes
> and re-install them all with CentOS Atomic Host. That shouldn't be a
> problem. I just hope that installing OKD on Atomic Host is better
> documented than the installation on plain CentOS, especially with regard to
> the upgrading procedure. But if I understand correctly, the upgrade
> procedure here should be simplified as everything runs inside Docker
> containers.
>

The upgrade procedure is the same as with RPMs; however, you wouldn’t need
to change the RPM repo.

https://docs.okd.io/3.11/upgrading/automated_upgrades.html
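
For what it's worth, on Atomic Host the version is driven by inventory
variables and image tags rather than yum repos. A rough sketch of the
relevant inventory bits (untested here; values are examples, so check the
openshift-ansible docs for your exact release):

[OSEv3:vars]
openshift_deployment_type=origin
# containerized=true makes openshift-ansible run the components as containers
# instead of installing RPMs, which is what Atomic Host needs
containerized=true
openshift_release=v3.11
# optionally pin a specific image tag instead of the rolling v3.11 tag
#openshift_image_tag=v3.11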

A word of warning about the next major version upgrade, v4.0: Atomic Host
support is deprecated in favour of CoreOS (which Red Hat recently acquired);
however, CoreOS is not supported for 3.11, so it looks like you’ll need to
do a cluster rebuild for v4.0.  But at least you’ll be able to get 3.11
patches in the meantime.

>
>
> Now I first have to figure out how to install my CentOS Atomic
> Host virtual machines automatically with PXE and kickstart. It looks like I
> just need to adapt my kickstart file for Atomic Host (rpm ostree) and I get
> Atomic Host instead of plain CentOS...
>
>
> On Sun, 6 Jan 2019 at 07:03, mabi  wrote:
>
>> ‐‐‐ Original Message ‐‐‐
>> On Saturday, January 5, 2019 3:57 PM, Daniel Comnea <
>> comnea.d...@gmail.com> wrote:
>>
>> [DC]: i think you are a bit confused: there are 2 ways to get the rpms
>> from CentOS yum repo: using the generic repo [1] which will always have the
>> latest origin release OR [2] where i've mentioned that you can install
>> *centos-release-openshift-origin3** rpm which will give you [3] yum repo
>>
>>
>> Thank you for the clarifications, and yes, I am confused because, first of
>> all, the upgrading documentation on the okd.io website does not mention
>> anything about having to manually change the yum repo file in yum.repos.d
>> to match a new directory for a new version of OpenShift.
>>
>> Then second, this mail (
>> https://lists.openshift.redhat.com/openshift-archives/users/2018-November/msg7.html)
>> has the following sentence, I quote:
>>
>> "Please note that due to ongoing work on releasing CentOS 7.6, the
>> mirror.centos.org repo is in freeze mode - see [4] and as such we have
>> not published the rpms to [5]. Once the freeze mode will end, we'll publish
>> the rpms."
>>
>> So when is the freeze mode over for this repo? I read this should have
>> happened after the CentOS 7.6 release but that was already one month ago
>> and still no version 3.11 RPMs in the
>> http://mirror.centos.org/centos/7/paas/x86_64/openshift-origin/ repo...

Re: RPMs for 3.11 still missing from the official OpenShift Origin CentOS repo

2019-01-06 Thread Joel Pearson
I think it's worth mentioning here that the RPMs at
http://mirror.centos.org/centos/7/paas/x86_64/openshift-origin311/ have a
critical security vulnerability; I think it's unsafe to use the RPMs if
you're planning on having your cluster available on the internet.

https://access.redhat.com/security/cve/cve-2018-1002105

Unless you're going to be using the RedHat supported version of OpenShift,
ie OCP, then I think the only safe option is to install OKD with Centos
Atomic Host and the containerised version of OpenShift, ie not use the RPMs
at all.

The problem with the RPMs is that you get no patches, only the version of
OpenShift 3.11.0 as it was when it was released. The containerized
version of OKD (only supported on Atomic Host), however, has a rolling tag (see
https://lists.openshift.redhat.com/openshift-archives/users/2018-October/msg00049.html)
and you'll notice that the containers were just rebuilt a few minutes ago:
https://hub.docker.com/r/openshift/origin-node/tags

It looks like the OKD images are rebuilt from the release-3.11 branch:
https://github.com/openshift/origin/commits/release-3.11

You can see that the critical CVE was fixed in commits on December 4;
however, the RPMs were built on the 5th of November, so they certainly do
not contain the critical vulnerability fixes.

I am running OKD 3.11 on Centos Atomic Host on an OpenStack cluster and it
works fine, and I can confirm from the OKD About page that I'm running a
version of OpenShift that is patched: OpenShift Master: v3.11.0+d0a16e1-79
(which lines up with commits on December 31)
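
If you'd rather check from the command line than the About page, something
like this should show it (output format varies a bit between releases, and
the rolling tag name is from memory):

# version reported by the client and the API server
oc version

# pull the current rolling 3.11 image and compare digests with what the
# nodes are running
docker pull docker.io/openshift/origin-node:v3.11
docker images --digests | grep origin-node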

However, the bad news for you is that an upgrade from RPMs to containerised
would not be simple, and you couldn't reuse your nodes because you'd need
to switch from Centos regular to Centos Atomic Host.  It would probably be
technically possible but not simple.  I guess you'd upgrade your 3.10
cluster to the vulnerable version of 3.11 via RPMs, and then migrate your
cluster to another cluster running on Atomic Host, I'm guessing there is
probably some way to replicate the etcd data from one cluster to another.
But it sounds like it'd be a lot of work, and you'd need some pretty deep
skills in etcd and openshift.

On Sun, 6 Jan 2019 at 07:03, mabi  wrote:

> ‐‐‐ Original Message ‐‐‐
> On Saturday, January 5, 2019 3:57 PM, Daniel Comnea 
> wrote:
>
> [DC]: i think you are a bit confused: there are 2 ways to get the rpms
> from CentOS yum repo: using the generic repo [1] which will always have the
> latest origin release OR [2] where i've mentioned that you can install
> *centos-release-openshift-origin3** rpm which will give you [3] yum repo
>
>
> Thank you for the clarifications, and yes, I am confused because, first of
> all, the upgrading documentation on the okd.io website does not mention
> anything about having to manually change the yum repo file in yum.repos.d
> to match a new directory for a new version of OpenShift.
>
> Then second, this mail (
> https://lists.openshift.redhat.com/openshift-archives/users/2018-November/msg7.html)
> has the following sentence, I quote:
>
> "Please note that due to ongoing work on releasing CentOS 7.6, the
> mirror.centos.org repo is in freeze mode - see [4] and as such we have
> not published the rpms to [5]. Once the freeze mode will end, we'll publish
> the rpms."
>
> So when is the freeze mode over for this repo? I read this should have
> happened after the CentOS 7.6 release but that was already one month ago
> and still no version 3.11 RPMs in the
> http://mirror.centos.org/centos/7/paas/x86_64/openshift-origin/ repo...
>
> Finally, all I want to do is to upgrade my current okd version 3.10 to
> version 3.11 but I can't find any complete instructions documented
> correctly. The best I can find is
> https://docs.okd.io/3.11/upgrading/automated_upgrades.html which simply
> mentions running the following upgrade playbook:
>
> ansible-playbook \
> -i <inventory_file> \
> playbooks/byo/openshift-cluster/upgrades/<version>/upgrade.yml
>
> Again here there is no mention of having to modify a yum.repos.d file
> beforehand or having to install the centos-release-openshift-origin
> package...
>
> I would be glad if someone can clarify the full upgrade process and/or
> have the official documentation enhanced.
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: How do I edit Grafana dashboards in OpenShift 3.11

2019-01-03 Thread Joel Pearson
Oh, it looks like it's read-only in 3.11:
https://bugzilla.redhat.com/show_bug.cgi?id=1652536
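
If you need to experiment with dashboards in the meantime, one workaround
(untested by me, and it assumes the default openshift-monitoring names) is
to run your own Grafana against the cluster Prometheus rather than fighting
the read-only one:

# forward the in-cluster Prometheus to your workstation
oc -n openshift-monitoring port-forward prometheus-k8s-0 9090

# run a throwaway Grafana on the same host network so it can reach the
# port-forward above, then add http://localhost:9090 as a Prometheus
# datasource in its UI
docker run -d --network host grafana/grafana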

On Thu, 3 Jan 2019 at 22:49, Joel Pearson 
wrote:

> Hi,
>
> I found the grafana instance in OpenShift 3.11 in the openshift-monitoring
> project.
>
> I'm wondering how I can modify the dashboards; they seem to be in read-only
> mode.
>
> I'm a cluster-admin so I thought that it would give me write access.
>
> I'm guessing there is another role that gives that access?
>
> Thanks,
>
> Joel
>


-- 
Kind Regards,

Joel Pearson
Agile Digital | Senior Software Consultant

Love Your Software™ | ABN 98 106 361 273
p: 1300 858 277 | m: 0405 417 843 | w: agiledigital.com.au
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


How do I edit Grafana dashboards in OpenShift 3.11

2019-01-03 Thread Joel Pearson
Hi,

I found the grafana instance in OpenShift 3.11 in the openshift-monitoring
project.

I'm wondering how I can modify the dashboards; they seem to be in read-only
mode.

I'm a cluster-admin so I thought that it would give me write access.

I'm guessing there is another role that gives that access?

Thanks,

Joel
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: OpenShift Origin on AWS

2018-10-09 Thread Joel Pearson
Answers inline


On Wed, 10 Oct 2018 at 12:08 am, David Conde  wrote:

> We have upgraded from the 3.6 reference architecture to the 3.9 aws
> playbooks in openshift-ansible. There was quite a bit of work in getting
> nodes ported into the scaling groups. We have upgraded our masters to 3.9
> with the BYO playbooks but have not ported them to use scaling groups yet.
>

Oh wow, I figured we’d have to blow away the cluster to get the scaling
groups. Was most of the work on the masters? Because I presume you just
deleted the 3.6 nodes and recreated them in the scaling group?



> We'll be sticking with the aws openshift-ansible playbooks in the future
> over the reference architecture so that we can upgrade easily.
>

We came to that conclusion too; it seems like it’ll be more likely to be
supported for longer.


> On Tue, Oct 9, 2018 at 1:29 PM Joel Pearson 
> wrote:
>
>> There are CloudFormation templates as part of the 3.6 reference
>> architecture, but that is now deprecated. I’m using that template at a
>> client site and it worked fine (I’ve adapted it to work with 3.9 by using a
>> static inventory, as we didn’t want to revisit our architecture from
>> scratch). We did customise it a fair bit though.
>>
>>
>> https://github.com/openshift/openshift-ansible-contrib/blob/master/reference-architecture/aws-ansible/README.md
>>
>> Here is an example of a jinja template that outputs a cloud formation
>> template.
>>
>> However, you can’t use the playbook as is for 3.9/3.10 because
>> openshift-ansible has breaking changes to the playbooks.
>>
>> For some reason the new playbooks for 3.9/3.10 don’t use CloudFormation,
>> but rather use the Amazon Ansible modules to interact directly
>> with AWS resources:
>>
>>
>> https://github.com/openshift/openshift-ansible/blob/master/playbooks/aws/README.md
>>
>> That new approach is pretty interesting though as it uses prebuilt AMIs
>> and auto-scaling groups, which make it very quick to add nodes.
>>
>> Hopefully some of that is useful to you.
>>
>> On Tue, 9 Oct 2018 at 9:42 pm, Peter Heitman  wrote:
>>
>>> Thank you for the reminder and the pointer. I know of that document but
>>> was too focused on searching for a CloudFormation template. I'll go back to
>>> the reference architecture which I'm sure will answer at least some of my
>>> questions.
>>>
>>> On Sun, Oct 7, 2018 at 4:24 PM Joel Pearson <
>>> japear...@agiledigital.com.au> wrote:
>>>
>>>> Have you seen the AWS reference architecture?
>>>> https://access.redhat.com/documentation/en-us/reference_architectures/2018/html/deploying_and_managing_openshift_3.9_on_amazon_web_services/index#
>>>> On Tue, 2 Oct 2018 at 3:11 am, Peter Heitman  wrote:
>>>>
>>>>> I've created a CloudFormation Stack for simple lab-test deployments of
>>>>> OpenShift Origin on AWS. Now I'd like to understand what would be best for
>>>>> production deployments of OpenShift Origin on AWS. In particular I'd like
>>>>> to create the corresponding CloudFormation Stack.
>>>>>
>>>>> I've seen the Install Guide page on Configuring for AWS and I've
>>>>> looked through the RedHat QuickStart Guide for OpenShift Enterprise but am
>>>>> still missing information. For example, the RedHat QuickStart Guide 
>>>>> creates
>>>>> 3 masters, 3 etcd servers and some number of compute nodes. Where are the
>>>>> routers (infra nodes) located? On the masters or on the etcd servers? How
>>>>> are the ELBs configured to work with those deployed routers? What if some
>>>>> of the traffic you are routing is not http/https? What is required to
>>>>> support that?
>>>>>
>>>>> I've seen the simple CloudFormation stack (
>>>>> https://sysdig.com/blog/deploy-openshift-aws/) but haven't found
>>>>> anything comparable for something that is closer to production ready (and
>>>>> likely takes advantage of using the AWS VPC QuickStart (
>>>>> https://aws.amazon.com/quickstart/architecture/vpc/).
>>>>>
>>>>> Does anyone have any prior work that they could share or point me to?
>>>>>
>>>>> Thanks in advance,
>>>>>
>>>>> Peter Heitman
>>>>>
>>>>> ___
>>>>> users mailing list
>>>>> users@lists.openshift.redhat.com
>>>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>>>>
>>>> ___
>> users mailing list
>> users@lists.openshift.redhat.com
>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: OpenShift Origin on AWS

2018-10-09 Thread Joel Pearson
There are CloudFormation templates as part of the 3.6 reference
architecture, but that is now deprecated. I’m using that template at a
client site and it worked fine (I’ve adapted it to work with 3.9 by using a
static inventory, as we didn’t want to revisit our architecture from
scratch). We did customise it a fair bit though.

https://github.com/openshift/openshift-ansible-contrib/blob/master/reference-architecture/aws-ansible/README.md

Here is an example of a jinja template that outputs a cloud formation
template.

However, you can’t use the playbook as is for 3.9/3.10 because
openshift-ansible has breaking changes to the playbooks.

For some reason the new playbooks for 3.9/3.10 don’t use CloudFormation,
but rather use the Amazon Ansible modules to interact directly
with AWS resources:

https://github.com/openshift/openshift-ansible/blob/master/playbooks/aws/README.md

That new approach is pretty interesting though as it uses prebuilt AMIs and
auto-scaling groups, which make it very quick to add nodes.
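
From memory the AWS playbooks are driven by a provisioning_vars.yml file,
roughly along these lines (variable names recalled from memory and values
are just examples, so double-check them against the README above):

# provisioning_vars.yml - minimal sketch
openshift_deployment_type: origin
openshift_aws_clusterid: example-cluster    # example name
openshift_aws_region: ap-southeast-2        # example region
openshift_aws_ssh_key_name: example-keypair # must already exist in EC2

You then run the build_ami and provision playbooks under playbooks/aws
against that file, as described in the README.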

Hopefully some of that is useful to you.

On Tue, 9 Oct 2018 at 9:42 pm, Peter Heitman  wrote:

> Thank you for the reminder and the pointer. I know of that document but
> was too focused on searching for a CloudFormation template. I'll go back to
> the reference architecture which I'm sure will answer at least some of my
> questions.
>
> On Sun, Oct 7, 2018 at 4:24 PM Joel Pearson 
> wrote:
>
>> Have you seen the AWS reference architecture?
>> https://access.redhat.com/documentation/en-us/reference_architectures/2018/html/deploying_and_managing_openshift_3.9_on_amazon_web_services/index#
>> On Tue, 2 Oct 2018 at 3:11 am, Peter Heitman  wrote:
>>
>>> I've created a CloudFormation Stack for simple lab-test deployments of
>>> OpenShift Origin on AWS. Now I'd like to understand what would be best for
>>> production deployments of OpenShift Origin on AWS. In particular I'd like
>>> to create the corresponding CloudFormation Stack.
>>>
>>> I've seen the Install Guide page on Configuring for AWS and I've looked
>>> through the RedHat QuickStart Guide for OpenShift Enterprise but am still
>>> missing information. For example, the RedHat QuickStart Guide creates 3
>>> masters, 3 etcd servers and some number of compute nodes. Where are the
>>> routers (infra nodes) located? On the masters or on the etcd servers? How
>>> are the ELBs configured to work with those deployed routers? What if some
>>> of the traffic you are routing is not http/https? What is required to
>>> support that?
>>>
>>> I've seen the simple CloudFormation stack (
>>> https://sysdig.com/blog/deploy-openshift-aws/) but haven't found
>>> anything comparable for something that is closer to production ready (and
>>> likely takes advantage of using the AWS VPC QuickStart (
>>> https://aws.amazon.com/quickstart/architecture/vpc/).
>>>
>>> Does anyone have any prior work that they could share or point me to?
>>>
>>> Thanks in advance,
>>>
>>> Peter Heitman
>>>
>>> ___
>>> users mailing list
>>> users@lists.openshift.redhat.com
>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>>
>>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: OC client slowness Windows

2018-10-08 Thread Joel Pearson
My guess is that you’ve probably got some antivirus software interfering.
I’d recommend disabling all antivirus software and seeing if the
performance improves. It’s very slow for me at one of my client sites, but
I’ve discovered that Cygwin in general is slow there too, so I think it’s
related to the Symantec Endpoint Protection that is installed.
On Mon, 8 Oct 2018 at 8:14 pm, Marcello Lorenzi  wrote:

> Hi All,
> we installed the newer version of oc-client on a Windows 7 machine and we
> tested the oc client commands via git bash shell. We noticed some seconds
> of waiting during the oc commands execution and with the --loglevel=8, the
> commands reported their output after some seconds of hang. Do you notice
> this behavior in your experience?
>
> We're trying to identify the cause of this issue.
>
> Thanks,
> Marcello
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: https route stopped working

2018-10-08 Thread Joel Pearson
Oh right, now that you mention it, I think I have encountered that before
too. I don’t remember the circumstances though.
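
For anyone hitting this later: the field in question is
spec.tls.destinationCACertificate on the route. Roughly (names and host are
placeholders, and the certificate comes from whatever CA signed the cert the
service presents on 8443):

apiVersion: v1
kind: Route
metadata:
  name: secure-sso
spec:
  host: sso.example.com
  to:
    kind: Service
    name: secure-sso
  port:
    targetPort: 8443
  tls:
    termination: reencrypt
    destinationCACertificate: |-
      -----BEGIN CERTIFICATE-----
      ...
      -----END CERTIFICATE-----
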
On Mon, 8 Oct 2018 at 7:44 pm, Tim Dudgeon  wrote:

> Yes, I had tried re-creating the route and that didn't work.
>
> Eventually I did manage to solve it. The 'Destination CA Cert' property
> for the route was (automatically) filled with some place holder 'backwards
> compatibility' text. When I replaced this with the CA cert used by the
> service (found in the secrets) things started working again.
>
> I have no idea why this stopped working and why this fix became necessary.
>
> On 07/10/18 21:14, Joel Pearson wrote:
>
> Have you tried looking at the generated haproxy file inside the router? It
> might give some hints as to what went wrong. I presume you’ve already tried
> recreating the route?
> On Wed, 3 Oct 2018 at 2:30 am, Tim Dudgeon  wrote:
>
>> We've hit a problem with an HTTPS route that used to work fine but has now
>> stopped working.
>> Instead of the application we are seeing the 'Application is not
>> available' page from the router.
>>
>> The route is using 'reencrypt' termination type to hit the service on
>> port 8443.
>> The service itself and its pod is running OK as indicated by being able
>> to curl it from inside the router pod using:
>>
>> curl -kL https://secure-sso.openrisknet-infra.svc:8443/auth
>>
>> (the -k is needed).
>>
>> An equivalent HTTP route that hits the HTTP service on port 8080 is
>> working fine.
>>
>> The only thing I can think of that might have caused this is redeploying
>> the master certificates using the 'redeploy-certificates.yml' playbook,
>> but I can't see how that would cause this.
>> This is all with Origin 3.7.
>>
>> Any thoughts on what might be wrong here?
>>
>> ___
>> users mailing list
>> users@lists.openshift.redhat.com
>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>
>
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: OpenShift Origin on AWS

2018-10-07 Thread Joel Pearson
Have you seen the AWS reference architecture?
https://access.redhat.com/documentation/en-us/reference_architectures/2018/html/deploying_and_managing_openshift_3.9_on_amazon_web_services/index#
On Tue, 2 Oct 2018 at 3:11 am, Peter Heitman  wrote:

> I've created a CloudFormation Stack for simple lab-test deployments of
> OpenShift Origin on AWS. Now I'd like to understand what would be best for
> production deployments of OpenShift Origin on AWS. In particular I'd like
> to create the corresponding CloudFormation Stack.
>
> I've seen the Install Guide page on Configuring for AWS and I've looked
> through the RedHat QuickStart Guide for OpenShift Enterprise but am still
> missing information. For example, the RedHat QuickStart Guide creates 3
> masters, 3 etcd servers and some number of compute nodes. Where are the
> routers (infra nodes) located? On the masters or on the etcd servers? How
> are the ELBs configured to work with those deployed routers? What if some
> of the traffic you are routing is not http/https? What is required to
> support that?
>
> I've seen the simple CloudFormation stack (
> https://sysdig.com/blog/deploy-openshift-aws/) but haven't found anything
> comparable for something that is closer to production ready (and likely
> takes advantage of using the AWS VPC QuickStart (
> https://aws.amazon.com/quickstart/architecture/vpc/).
>
> Does anyone have any prior work that they could share or point me to?
>
> Thanks in advance,
>
> Peter Heitman
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: https route stopped working

2018-10-07 Thread Joel Pearson
Have you tried looking at the generated haproxy file inside the router? It
might give some hints as to what went wrong. I presume you’ve already tried
recreating the route?
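
Something like this should dump the generated config (the file lives in the
router container's working directory, at least on the releases I've looked
at; the pod name is a placeholder):

# find a router pod, then print its generated haproxy config
oc -n default get pods -l deploymentconfig=router
oc -n default rsh <router-pod-name> cat haproxy.config

and then grep for the backend belonging to the route in question.
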
On Wed, 3 Oct 2018 at 2:30 am, Tim Dudgeon  wrote:

> We've hit a problem with an HTTPS route that used to work fine but has now
> stopped working.
> Instead of the application we are seeing the 'Application is not
> available' page from the router.
>
> The route is using 'reencrypt' termination type to hit the service on
> port 8443.
> The service itself and its pod is running OK as indicated by being able
> to curl it from inside the router pod using:
>
> curl -kL https://secure-sso.openrisknet-infra.svc:8443/auth
>
> (the -k is needed).
>
> An equivalent HTTP route that hits the HTTP service on port 8080 is
> working fine.
>
> The only thing I can think of that might have caused this is redeploying
> the master certificates using the 'redeploy-certificates.yml' playbook,
> but I can't see how that would cause this.
> This is all with Origin 3.7.
>
> Any thoughts on what might be wrong here?
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Atomic Host support on OpenShift 3.11 and up

2018-09-25 Thread Joel Pearson
Clayton, does this mean that in OpenShift 4.0 you'd be able to take a
vanilla kubernetes installation and then install a bunch of OpenShift
operators and basically have an OpenShift cluster? Or is that not really
the goal of migration to operators? Is it just to make future OpenShift
releases easier to package?

On Fri, Sep 7, 2018 at 9:18 AM Clayton Coleman  wrote:

> Master right now will be labeled 4.0 when 3.11 branches (happening right
> now).  It’s possible we might later cut a 3.12 but no plans at the current
> time.
>
> Changes to master will include significant changes as the core is rewired
> with operators - you’ll also see much more focus on preparing
> openshift/installer and refactors in openshift-ansible that reduce its
> scope as the hand-off to operators happens.  Expect churn for the next
> months.
>
> On Sep 6, 2018, at 6:23 PM, Daniel Comnea  wrote:
>
> Clayton,
>
> 4.0 is that going to be 3.12 rebranded (if we follow the current release
> cycle) or 3.13 ?
>
>
>
> On Thu, Sep 6, 2018 at 2:34 PM Clayton Coleman 
> wrote:
>
>> The successor to atomic host will be RH CoreOS and the community
>> variants.  That is slated for 4.0.
>>
>> > On Sep 6, 2018, at 9:25 AM, Marc Ledent  wrote:
>> >
>> > Hi all,
>> >
>> > I have read in the 3.10 release notes that Atomic Host is deprecated
>> and will not be supported starting with release 3.11.
>> >
>> > What does this mean? Is it advisable to migrate all Atomic Host VMs to
>> "standard" RHEL servers?
>> >
>> > Kind regards,
>> > Marc
>> >
>> >
>> > ___
>> > users mailing list
>> > users@lists.openshift.redhat.com
>> > http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>
>> ___
>> users mailing list
>> users@lists.openshift.redhat.com
>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: IPv6

2018-09-25 Thread Joel Pearson
It looks like not; I found some references saying that Kubernetes has alpha
IPv6 support in 1.9 and some improvements in 1.10:

https://github.com/kubernetes/kubernetes/issues/1443
https://github.com/kubernetes/kubernetes/issues/62822

I did find this article suggesting that you might be able to use Project
Calico for IPv6 support. I don't know whether that applies to 3.7 or not,
and Calico is quite a different network deployment though.

https://www.projectcalico.org/enable-ipv6-on-kubernetes-with-project-calico/

On Tue, Sep 25, 2018 at 11:46 AM Diego Armando Ramirez Avelino <
dramir...@ipn.mx> wrote:

> Is IPv6 support available for OpenShift 3.7?
>
> Greetings
> --
>
> --
>
> The information in this email, as well as that contained in the attached
> documents, may be subject to freedom-of-information requests.
> Visit us: http://www.ipn.mx
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: How to make 172.30.0.1 (kubernetes service) health checked?

2018-09-10 Thread Joel Pearson
Hi Clayton,

Sorry for the extensive delay, but I’ve been thinking about this more and
I’m wondering if it’s safe to remove a master from the endpoint just before
restarting it (say in Ansible), so that failures aren’t seen inside the
cluster?

Or would something in Kubernetes just go and add the master back to the
endpoint?

Alternatively, would it be possible to tell Kubernetes not to add the
individual masters to that endpoint and use a load balancer instead? Say a
private ELB for example?

Or are there future features in kubernetes that will make master failover
more reliable internally?

Thanks,

Joel
On Thu, 28 Jun 2018 at 12:48 pm, Clayton Coleman 
wrote:

> In OpenShift 3.9, when a master goes down the endpoints object should be
> updated within 15s (the TTL on the record for the master).  You can check
> the value of "oc get endpoints -n default kubernetes" - if you still see
> the master IP in that list after 15s then something else is wrong.
>
> On Wed, Jun 27, 2018 at 9:33 AM, Joel Pearson <
> japear...@agiledigital.com.au> wrote:
>
>> Hi,
>>
>> I'm running OpenShift 3.9 on AWS with masters in HA mode using Classic
>> ELB's doing TCP load balancing.  If I restart masters, from outside the
>> cluster the ELB does the right thing and takes a master out of service.
>> However, if something tries to talk to the kubernetes API inside the
>> cluster, it seems that kubernetes is unaware the master is missing, and I
>> get failures when I'm serially restarting masters.
>>
>> Is there some way that I can point the kubernetes service to use the load
>> balancer?  Maybe I should update the kubernetes endpoint object to use the
>> ELB IP address instead of the actual master addresses?  Is this a valid
>> approach?  Is there some way with openshift-ansible I can tell the
>> kubernetes service to use the load balancer when it creates the kubernetes
>> service?
>>
>>  Thanks,
>>
>> Joel
>>
>>
>> apiVersion: v1
>> kind: Service
>> metadata:
>>   creationTimestamp: '2018-06-27T06:30:50Z'
>>   labels:
>> component: apiserver
>> provider: kubernetes
>>   name: kubernetes
>>   namespace: default
>>   resourceVersion: '45'
>>   selfLink: /api/v1/namespaces/default/services/kubernetes
>>   uid: a224fd75-79d3-11e8-bd57-0a929ba50438
>> spec:
>>   clusterIP: 172.30.0.1
>>   ports:
>> - name: https
>>   port: 443
>>   protocol: TCP
>>   targetPort: 443
>> - name: dns
>>   port: 53
>>   protocol: UDP
>>   targetPort: 8053
>> - name: dns-tcp
>>   port: 53
>>   protocol: TCP
>>   targetPort: 8053
>>   sessionAffinity: ClientIP
>>   sessionAffinityConfig:
>> clientIP:
>>   timeoutSeconds: 10800
>>   type: ClusterIP
>> status:
>>   loadBalancer: {}
>>
>>
>> apiVersion: v1
>> kind: Endpoints
>> metadata:
>>   creationTimestamp: '2018-06-27T06:30:50Z'
>>   name: kubernetes
>>   namespace: default
>>   resourceVersion: '83743'
>>   selfLink: /api/v1/namespaces/default/endpoints/kubernetes
>>   uid: a22a0283-79d3-11e8-bd57-0a929ba50438
>> subsets:
>>   - addresses:
>>   - ip: 10.2.12.53
>>   - ip: 10.2.12.72
>>   - ip: 10.2.12.91
>> ports:
>>   - name: dns
>> port: 8053
>> protocol: UDP
>>   - name: dns-tcp
>> port: 8053
>> protocol: TCP
>>   - name: https
>> port: 443
>> protocol: TCP
>>
>>
>> ___
>> users mailing list
>> users@lists.openshift.redhat.com
>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>
>>
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


What is the most current OpenShift on OpenStack playbooks?

2018-08-29 Thread Joel Pearson
Hi,

I just wanted to find out if
https://github.com/openshift/openshift-ansible-contrib/tree/master/playbooks/provisioning/openstack
is
still the most current approach for deploying OpenShift on OpenStack?

I had a read of
https://access.redhat.com/documentation/en-us/reference_architectures/2018/html-single/deploying_and_managing_openshift_3.9_on_red_hat_openstack_platform_10/
but
it doesn't appear to use Ansible for the OpenStack infrastructure
configuration, but rather it is done by hand.

Is there an equivalent of the AMI approach for OpenShift nodes, ie:
https://github.com/openshift/openshift-ansible/tree/master/playbooks/aws ?

Or is that something I'd need to do myself if I wanted such a thing?

Thanks,

Joel
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: scheduler policy to spread pods

2018-07-04 Thread Joel Pearson
Here’s an OpenShift reference for the same thing.

https://docs.openshift.com/container-platform/3.6/admin_guide/scheduling/pod_affinity.html
On Wed, 4 Jul 2018 at 9:14 pm, Joel Pearson 
wrote:

> You’re probably after pod anti-affinity?
> https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
>
> That lets you tell the scheduler that the pods aren’t allowed to be on the
> same node for example.
> On Wed, 4 Jul 2018 at 8:51 pm, Tim Dudgeon  wrote:
>
>> I've got a process that fires up a number of pods (bare pods, not backed
>> by replication controller) to execute a computationally demanding job in
>> parallel.
>> What I find is that the pods do not spread effectively across the
>> available nodes. In my case I have a node selector that restricts
>> execution to 3 nodes, and the pods run mostly on the first node, a few
>> run on the second node, and none run on the third node.
>>
>> I know that I could specify cpu resource requests and limits to help
>> with this, but for other reasons I'm currently unable to do this.
>>
>> It looks like this is controllable through the scheduler, but the
>> options for controlling this look pretty complex.
>> Could someone advise on how best to allow pods to spread evenly across
>> nodes rather than execute preferentially on one node?
>>
>> ___
>> users mailing list
>> users@lists.openshift.redhat.com
>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: scheduler policy to spread pods

2018-07-04 Thread Joel Pearson
You’re probably after pod anti-affinity?
https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity

That lets you tell the scheduler that the pods aren’t allowed to be on the
same node for example.
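
Roughly, in the pod spec it looks like this (the label is just an example you
would put on each of the bare pods; "preferred" rather than "required" so
they still schedule if you run out of nodes):

spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              job: my-parallel-job   # example label on the bare pods
          topologyKey: kubernetes.io/hostname
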
On Wed, 4 Jul 2018 at 8:51 pm, Tim Dudgeon  wrote:

> I've got a process that fires up a number of pods (bare pods, not backed
> by replication controller) to execute a computationally demanding job in
> parallel.
> What I find is that the pods do not spread effectively across the
> available nodes. In my case I have a node selector that restricts
> execution to 3 nodes, and the pods run mostly on the first node, a few
> run on the second node, and none run on the third node.
>
> I know that I could specify cpu resource requests and limits to help
> with this, but for other reasons I'm currently unable to do this.
>
> It looks like this is controllable through the scheduler, but the
> options for controlling this look pretty complex.
> Could someone advise on how best to allow pods to spread evenly across
> nodes rather than execute preferentially on one node?
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


How to make 172.30.0.1 (kubernetes service) health checked?

2018-06-27 Thread Joel Pearson
Hi,

I'm running OpenShift 3.9 on AWS with masters in HA mode using Classic
ELB's doing TCP load balancing.  If I restart masters, from outside the
cluster the ELB does the right thing and takes a master out of service.
However, if something tries to talk to the kubernetes API inside the
cluster, it seems that kubernetes is unaware the master is missing, and I
get failures when I'm serially restarting masters.

Is there some way that I can point the kubernetes service to use the load
balancer?  Maybe I should update the kubernetes endpoint object to use the
ELB IP address instead of the actual master addresses?  Is this a valid
approach?  Is there some way with openshift-ansible I can tell the
kubernetes service to use the load balancer when it creates the kubernetes
service?

 Thanks,

Joel


apiVersion: v1
kind: Service
metadata:
  creationTimestamp: '2018-06-27T06:30:50Z'
  labels:
component: apiserver
provider: kubernetes
  name: kubernetes
  namespace: default
  resourceVersion: '45'
  selfLink: /api/v1/namespaces/default/services/kubernetes
  uid: a224fd75-79d3-11e8-bd57-0a929ba50438
spec:
  clusterIP: 172.30.0.1
  ports:
- name: https
  port: 443
  protocol: TCP
  targetPort: 443
- name: dns
  port: 53
  protocol: UDP
  targetPort: 8053
- name: dns-tcp
  port: 53
  protocol: TCP
  targetPort: 8053
  sessionAffinity: ClientIP
  sessionAffinityConfig:
clientIP:
  timeoutSeconds: 10800
  type: ClusterIP
status:
  loadBalancer: {}


apiVersion: v1
kind: Endpoints
metadata:
  creationTimestamp: '2018-06-27T06:30:50Z'
  name: kubernetes
  namespace: default
  resourceVersion: '83743'
  selfLink: /api/v1/namespaces/default/endpoints/kubernetes
  uid: a22a0283-79d3-11e8-bd57-0a929ba50438
subsets:
  - addresses:
  - ip: 10.2.12.53
  - ip: 10.2.12.72
  - ip: 10.2.12.91
ports:
  - name: dns
port: 8053
protocol: UDP
  - name: dns-tcp
port: 8053
protocol: TCP
  - name: https
port: 443
protocol: TCP
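
For reference, the endpoints object above can be watched while restarting a
master, to see whether the address list actually changes (assumes
cluster-admin):

oc get endpoints kubernetes -n default -o wide -w
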
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: What is the most reliable deployment method for 3.9 origin

2018-06-15 Thread Joel Pearson
Hi Wolf,

Given the silence, we've decided to go with the RPM method, as it's the
default for Centos/non-Atomic.

Thanks,

Joel

On Thu, Jun 14, 2018 at 3:21 PM Wolf Noble  wrote:

> I’ve been in the process of trying to assess this myself.
>
> Interested to hear what you settle on regardless
>
>
>
> > On Jun 13, 2018, at 23:26, Joel Pearson 
> wrote:
> >
> > Hi,
> >
> > I’m wondering what the most reliable method for installing Origin on
> Centos 7 is?
> >
> > * RPMs
> > * Containerized
> > * System containers
> >
> > Just recently we discovered that upgrading from 3.6 to 3.7 doesn’t seem
> to be tested using the containerized method, as the etcd upgrade fails as
> it tries to find specific versions of etcd on the fedora registry but the
> fedora registry only has a latest tag for etcd and then a few other random
> tags. So we had to switch to etcd from the redhat registry. This to me
> suggested that RPMs are probably the best method, as etcd at least has a
> version number, so the upgrade should succeed.
> >
> > How do system containers work? Are they still pulling containers from
> docker hub or are they something else entirely? Are they preferred over
> RPMs? Are they tested in origin? Or are RPMs they only real tested path for
> Origin?
> >
> > Thanks,
> >
> > Joel
> > ___
> > users mailing list
> > users@lists.openshift.redhat.com
> > http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


What is the most reliable deployment method for 3.9 origin

2018-06-13 Thread Joel Pearson
Hi,

I’m wondering what the most reliable method for installing Origin on Centos
7 is?

* RPMs
* Containerized
* System containers

Just recently we discovered that upgrading from 3.6 to 3.7 doesn’t seem to
be tested using the containerized method: the etcd upgrade fails because it
tries to find specific versions of etcd on the Fedora registry, but the
Fedora registry only has a latest tag for etcd and then a few other random
tags. So we had to switch to etcd from the Red Hat registry. This to me
suggested that RPMs are probably the best method, as etcd at least has a
version number, so the upgrade should succeed.

How do system containers work? Are they still pulling containers from
docker hub or are they something else entirely? Are they preferred over
RPMs? Are they tested in Origin? Or are RPMs the only real tested path for
Origin?

Thanks,

Joel
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: load balancing for infra node in HA setup

2018-06-08 Thread Joel Pearson
Hi Tim,

Answers inline.

On 8 June 2018 at 23:00, Tim Dudgeon  wrote:

> The docs for installing a high availability openshift cluster e.g. [1] are
> fairly clear when it comes to the master nodes. If you set up 3 masters
> then you need a load balancer that sits in front of these. OpenShift can
> provide this or you can provide your own external one.
>
> What's not so clear is how to handle the nodes where the infrastructure
> components (registry and router) get deployed. In a typical example you
> would have 2 of these nodes, but what would happen in this case?
>
> I presume you are still on OpenStack? Here is the OpenStack reference
architecture for OpenShift:
https://access.redhat.com/documentation/en-us/reference_architectures/2018/html/deploying_and_managing_openshift_3.9_on_red_hat_openstack_platform_10/reference_architecture_summary

Normally you have 3 infra nodes with 3 router replicas with 1 load balancer
in front.


> Does a single registry and router get deployed to one of those nodes (in
> which case it would be difficult to set up DNS for the router to point to
> the right one).
>
> You simply point the DNS at the load balancer in front of the infra
nodes.  In the AWS reference architecture I run 3 registries, but they're
backed by S3, so it depends on the backing store for the registry I guess.
But it doesn't matter if you run 1 registry or 3, as long as the traffic
comes in via the load balancer, the OpenShift Routers will figure out where
the registries are running.

Or does the router get deployed to both so a load balancer is needed in
> front of these?
>
Yes, routers should be deployed on all infra nodes with a load balancer in
front.

>
> And similarly for the registry. Is there one or two of these deployed? How
> does this work?
>
As mentioned above, it doesn't matter how many registries; for HA you
could have as many as the number of infra nodes, provided the backend for
your registry allows multiple replicas.
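
In openshift-ansible terms that's roughly the following (variable names from
memory and values are examples; the object storage bits only apply if you
want more than one registry replica, and on OpenStack you'd point them at
Swift or another object store instead of S3):

[OSEv3:vars]
openshift_hosted_router_replicas=3
openshift_hosted_registry_replicas=3
# example S3 backing for the registry
openshift_hosted_registry_storage_kind=object
openshift_hosted_registry_storage_provider=s3
openshift_hosted_registry_storage_s3_bucket=example-registry-bucket
openshift_hosted_registry_storage_s3_region=ap-southeast-2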

>
> I hope someone can clarify this.
> Tim
>
> [1] https://docs.openshift.org/latest/install_config/install/adv
> anced_install.html#multiple-masters
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>



-- 
Kind Regards,

Joel Pearson
Agile Digital | Senior Software Consultant

Love Your Software™ | ABN 98 106 361 273
p: 1300 858 277 |  w: agiledigital.com.au
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: OC debug command does not show command prompt

2018-06-06 Thread Joel Pearson
What operating system is your local machine? On Windows I’ve noticed the oc
binary doesn’t do terminal emulation properly, so it looks like
it’s hanging but it’s actually working. Try typing “ls” and see if the
command has actually worked and you’re just not seeing the command
prompt.
On Thu, 7 Jun 2018 at 6:52 am, Brian Keyes  wrote:

> No, I don't think so, but I am running the CLI on my local machine. I will
> ssh into one of the nodes and try.
>
> thanks
>
>
> On Wed, Jun 6, 2018 at 4:49 PM, Aleksandar Lazic 
> wrote:
>
>> On 06/06/2018 13:04, Brian Keyes wrote:
>>
>>> If I do a "debug in terminal" in the console I always get a command
>>> prompt
>>>
>>> if i goto the command line and do a "oc debug   i get this
>>> message
>>>
>>> Debugging with pod/lster-1-2rqg9-debug, original command:
>>> container-entrypoint /tmp/scripts/run
>>> Waiting for pod to start ...
>>> Pod IP: 10.252.4.18
>>> If you don't see a command prompt, try pressing enter.
>>>
>>> I hit enter many, many times and never get a command prompt
>>>
>>
>> Are you behind a proxy?
>>
>> --
>>> thanks
>>>
>>
>> ___
>>> users mailing list
>>> users@lists.openshift.redhat.com
>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>>
>>
>>
>
>
> --
> Brian Keyes
> Systems Engineer, Vizuri
> 703-855-9074(Mobile)
> 703-464-7030 x8239 (Office)
>
> FOR OFFICIAL USE ONLY: This email and any attachments may contain
> information that is privacy and business sensitive.  Inappropriate or
> unauthorized disclosure of business and privacy sensitive information may
> result in civil and/or criminal penalties as detailed in as amended Privacy
> Act of 1974 and DoD 5400.11-R.
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: errors accessing egressnetworkpolicies.network.openshift.io when attempting to export project

2018-06-01 Thread Joel Pearson
I guess that means your admin user doesn’t have the cluster-admin role
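
You can check and, if that's really what you want, grant it from a session
that already has cluster-admin (e.g. system:admin on a master), with
something like:

# see which cluster role bindings currently include cluster-admin
oc get clusterrolebindings | grep cluster-admin

# grant cluster-admin to the "admin" user
oc adm policy add-cluster-role-to-user cluster-admin admin
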
On Sat, 2 Jun 2018 at 4:02 am, Brian Keyes  wrote:

> I am attempting to follow these instructions
>
>
> https://docs.openshift.com/container-platform/3.7/day_two_guide/project_level_tasks.html
>
> I want to back up the sample Python app, and I created a script like this
> (from the documentation)
>
>
>
>
> $ for object in rolebindings serviceaccounts secrets imagestreamtags 
> podpreset cms egressnetworkpolicies rolebindingrestrictions limitranges 
> resourcequotas pvcs templates cronjobs statefulsets hpas deployments 
> replicasets poddisruptionbudget endpoints
> do
>   oc export $object -o yaml > $object.yaml
> done
>
>
> --
> but when I run this I get some access denied errors like this , is this
> saying that the objects I am attempting to back up do not exist?
>
>
> $ ./exportotherprojects.sh
> error: no resources found - nothing to export
> the server doesn't have a resource type "cms"
> Error from server (Forbidden): User "admin" cannot list
> egressnetworkpolicies.network.openshift.io in the namespace "sample-py":
> User "admin" cannot list egressnetworkpolicies.network.openshift.io in
> project "sample-py" (get egressnetworkpolicies.network.openshift.io)
> error: no resources found - nothing to export
> error: no resources found - nothing to export
> error: no resources found - nothing to export
> the server doesn't have a resource type "pvcs"
> error: no resources found - nothing to export
> error: no resources found - nothing to export
> error: no resources found - nothing to export
> the server doesn't have a resource type "hpas"
> error: no resources found - nothing to export
> error: no resources found - nothing to export
> Error from server (Forbidden): User "admin" cannot list
> poddisruptionbudgets.policy in the namespace "sample-py": User "admin"
> cannot list poddisruptionbudgets.policy in project "sample-py" (get
> poddisruptionbudgets.policy)
>
>
> thanks
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: RPMs for 3.9 on Centos

2018-05-21 Thread Joel Pearson
You shouldn’t need the testing repo. It looks like they’ve been in the repo
for about a month.

Not sure about the Ansible side; I haven’t actually tried to install 3.9
yet. And when I do, I plan on using system containers.

But you could grep through the Ansible scripts looking for what installs the
repo, so you can figure out why it isn’t using it.
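
If you just want to get the repo onto the hosts manually in the meantime,
something like this should do it (package name guessed from the 3.6/3.7
naming pattern above, so verify it exists first):

yum install -y centos-release-openshift-origin39
yum repolist | grep -i openshift
yum search origin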
On Mon, 21 May 2018 at 8:38 pm, Tim Dudgeon  wrote:

> Seems like Ansible isn't doing so for me.
> Are there any special params needed for this?
>
> I did try setting these two, but to no effect:
>
> openshift_enable_origin_repo=true
> openshift_repos_enable_testing=true
>
> On 21/05/18 11:32, Joel Pearson wrote:
>
> They’re in the paas repo. You don’t have that repo installed for some
> reason.
>
> Ansible is supposed to lay that down
>
> http://mirror.centos.org/centos/7/paas/x86_64/openshift-origin/
>
> Why don’t you use the system container version instead? Or do you prefer RPMs?
> On Mon, 21 May 2018 at 8:30 pm, Tim Dudgeon  wrote:
>
>> I looks like RPMs for Origin 3.9 are still not available from the Centos
>> repos:
>>
>> > $ yum search origin
>> > Loaded plugins: fastestmirror
>> > Loading mirror speeds from cached hostfile
>> >  * base: ftp.lysator.liu.se
>> >  * extras: ftp.lysator.liu.se
>> >  * updates: ftp.lysator.liu.se
>> >
>> 
>>
>> > N/S matched: origin
>> >
>> =
>> > centos-release-openshift-origin13.noarch : Yum configuration for
>> > OpenShift Origin 1.3 packages
>> > centos-release-openshift-origin14.noarch : Yum configuration for
>> > OpenShift Origin 1.4 packages
>> > centos-release-openshift-origin15.noarch : Yum configuration for
>> > OpenShift Origin 1.5 packages
>> > centos-release-openshift-origin36.noarch : Yum configuration for
>> > OpenShift Origin 3.6 packages
>> > centos-release-openshift-origin37.noarch : Yum configuration for
>> > OpenShift Origin 3.7 packages
>> > google-noto-sans-canadian-aboriginal-fonts.noarch : Sans Canadian
>> > Aboriginal font
>> > centos-release-openshift-origin.noarch : Common release file to
>> > establish shared metadata for CentOS PaaS SIG
>> > ksh.x86_64 : The Original ATT Korn Shell
>> > texlive-tetex.noarch : scripts and files originally written for or
>> > included in teTeX
>> >
>> >   Name and summary matches only, use "search all" for everything.
>> Any idea when these will be available, or instructions for finding them
>> somewhere else?
>>
>>
>>
>>
>>
>> ___
>> users mailing list
>> users@lists.openshift.redhat.com
>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>
>
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: RPMs for 3.9 on Centos

2018-05-21 Thread Joel Pearson
They’re in the paas repo. You don’t have that repo installed for some
reason.

Ansible is supposed to lay that down

http://mirror.centos.org/centos/7/paas/x86_64/openshift-origin/

Why don’t you use the system container version instead? Or do you prefer RPMs?
On Mon, 21 May 2018 at 8:30 pm, Tim Dudgeon  wrote:

> I looks like RPMs for Origin 3.9 are still not available from the Centos
> repos:
>
> > $ yum search origin
> > Loaded plugins: fastestmirror
> > Loading mirror speeds from cached hostfile
> >  * base: ftp.lysator.liu.se
> >  * extras: ftp.lysator.liu.se
> >  * updates: ftp.lysator.liu.se
> >
> 
>
> > N/S matched: origin
> >
> =
> > centos-release-openshift-origin13.noarch : Yum configuration for
> > OpenShift Origin 1.3 packages
> > centos-release-openshift-origin14.noarch : Yum configuration for
> > OpenShift Origin 1.4 packages
> > centos-release-openshift-origin15.noarch : Yum configuration for
> > OpenShift Origin 1.5 packages
> > centos-release-openshift-origin36.noarch : Yum configuration for
> > OpenShift Origin 3.6 packages
> > centos-release-openshift-origin37.noarch : Yum configuration for
> > OpenShift Origin 3.7 packages
> > google-noto-sans-canadian-aboriginal-fonts.noarch : Sans Canadian
> > Aboriginal font
> > centos-release-openshift-origin.noarch : Common release file to
> > establish shared metadata for CentOS PaaS SIG
> > ksh.x86_64 : The Original ATT Korn Shell
> > texlive-tetex.noarch : scripts and files originally written for or
> > included in teTeX
> >
> >   Name and summary matches only, use "search all" for everything.
> Any idea when these will be available, or instructions for finding them
> somewhere else?
>
>
>
>
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: question about external load balancer

2018-05-18 Thread Joel Pearson
OpenShift already has some support for F5 load balancers as a router. So,
given the choice between F5s or NetScalers, F5s might make more
sense.

But either will work fine; it’s probably more a question of which device
you have more skills with.

On Wed, 16 May 2018 at 3:17 am, Yu Wei  wrote:

> Hi guys,
> I tried to set up an OpenShift Origin cluster with multiple masters for HA.
> I read the doc in
> https://github.com/redhat-cop/openshift-playbooks/blob/master/playbooks/installation/load_balancing.adoc
> .
>
> Any other advice for external load balancer?
> Which solution should I select for external load balancer?  F5 or
> netscaler? Which is better?
> My cluster has more than 200 physical machines.
>
> Thanks,
>
> Jared, (韦煜)
> Software developer
> Interested in open source software, big data, Linux
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Using RMI Protocol to connect to OpenShift from external application

2018-05-04 Thread Joel Pearson
Hi Tien,

You just need to create a passthrough route like this:

https://docs.openshift.com/container-platform/3.9/architecture/networking/routes.html#passthrough-termination

For it to work, your Swing client needs to use SNI (Server Name
Indication), so that the OpenShift router knows which host you're trying to
connect to.  A bit of quick googling suggests that Java 7 supports that,
but it depends on whether the RMI SSL client uses SNI or not.

Then you'd need to tell your Swing clients to connect to RMI on port 443,
i.e. :443, because that is the port the
router listens on.

Then the router will do the normal things like redirecting the traffic to
your service on whatever port your RMI server is actually running on.

In this passthrough mode, you're essentially using the router as a TCP load
balancer.
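
A minimal passthrough route would look something like this (names, host and
ports are placeholders; the service is whatever exposes your RMI-over-SSL
port):

apiVersion: v1
kind: Route
metadata:
  name: rmi-server
spec:
  host: rmi.apps.example.com   # what the swing client connects to on port 443
  to:
    kind: Service
    name: rmi-server
  port:
    targetPort: 1099           # example service port carrying RMI over SSL
  tls:
    termination: passthrough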

I think it has a fair chance of working.

Good luck.

Thanks,

Joel

On Thu, May 3, 2018 at 1:40 AM Tien Hung Nguyen 
wrote:

> Currently, our application is already running on Docker through RMI over
> SSL. Therefore, we are able to connect our client to the server via SSL and
> RMI using Docker.
>
> What do we have to do in order to make it work with OpenShift, now?
>
> 2018-05-02 16:34 GMT+02:00 Joel Pearson :
>
>> Selectors refer to labels, so it’d be
>> deploymentconfig.metadata.labels.name
>>
>> SSL/TLS means the client has to support it too. So if there is some
>> option to run RMI over SSL/TLS then it could work pretty easily. But if
>> it’s not possible to run server and client that way then yes, nodeports
>> will be easier. Otherwise I think there might be other Ingress options. But
>> I’ve never used them.
>> On Thu, 3 May 2018 at 12:14 am, Tien Hung Nguyen <
>> tienhng.ngu...@gmail.com> wrote:
>>
>>> Thank you for the response.
>>>
>>> How can I set up SSL/TLS as a connection method on OpenShift that my
>>> Client connects through SSL/TLS to the server? Is that done on the
>>> OpenShift router or where can I do the settings?
>>>
>>> Otherwise, I think NodePorts are the easier solution to establish a
>>> connection between Client-Server using RMI. In this case, do I just have to
>>> specify the service with the proper NodePort as the property like this
>>> example, where the selector.name is the name of the
>>> deploymentConfig.metadata.name? :
>>>
>>> apiVersion: v1
>>> kind: Service
>>> metadata:
>>>   name: mysql
>>>   labels:
>>> name: mysql
>>> spec:
>>>   type: NodePort
>>>   ports:
>>> - port: 3036
>>>   nodePort: 30036
>>>   name: http
>>>   selector:
>>> name: mysql
>>>
>>>
>>>
>>>
>>>
>>> 2018-05-02 15:53 GMT+02:00 Joel Pearson :
>>>
>>>> If you're using SSL/TLS you could traverse the Router by use
>>>> Passthrough.  Otherwise, you have to use NodePorts on a Service or
>>>> something like that.  The Router is generally only really for HTTP, but
>>>> with passthrough SSL/TLS just about anything could be running in the pod.
>>>>
>>>> On Wed, May 2, 2018 at 10:52 PM Tien Hung Nguyen <
>>>> tienhng.ngu...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> we have a application, which is actually running on
>>>>> Wildfly 12.0.0.Final via Docker.
>>>>> Now, we would like to put our application on OpenShift with the
>>>>> existing Dockerfile.
>>>>>
>>>>> However, our client is using RMI to connect connect to the server. Is
>>>>> it still possible to run our application on OpenShift while using RMI for
>>>>> the client-server connection? If yes, how should we configure the client
>>>>> and the router of OpenShift to connect to the server?
>>>>>
>>>>> At the moment our java client is using the hostname:port in order to
>>>>> connect to the server running on Docker.
>>>>>
>>>>> Regards,
>>>>> Tien
>>>>>
>>>>> Note: Our application is not a web application, but it is java swing
>>>>> application (desktop application) which uses RMI to connect to the server.
>>>>>
>>>>>
>>>>> ___
>>>>> users mailing list
>>>>> users@lists.openshift.redhat.com
>>>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>>>>
>>>>
>>>
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Prometheus node exporter on v3.7

2018-05-03 Thread Joel Pearson
Upgrade your cluster to 3.9 just to be safe? You know you want to ... ;)
On Fri, 4 May 2018 at 6:00 am, Tim Dudgeon  wrote:

> Any Prometheus experts out there that can comment on this?
>
>
> On 30/04/18 15:19, Tim Dudgeon wrote:
> > I'm running Prometheus an Origin cluster using v3.7.2 installed from
> > the playbooks on the release-3.7 branch of openshift/openshift-ansible.
> >
> > It looks like the node exported was not included in this version [1]
> > but was added for the 3.9 version [2].
> > As it's metrics on the nodes that I'm wanting most I wonder what the
> > best approach is here.
> >
> > It is safe to run the `playbooks/openshift-prometheus/config.yml`
> > playbook from the release-3.9 branch on a cluster running v3.7.2, or
> > is there a better approach?
> >
> > [1] (v3.7)
> >
> https://github.com/openshift/openshift-ansible/tree/release-3.7/roles/openshift_prometheus/tasks
> > [2] (v3.9)
> >
> https://github.com/openshift/openshift-ansible/tree/release-3.9/roles/openshift_prometheus/tasks
> >
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Using RMI Protocol to connect to OpenShift from external application

2018-05-02 Thread Joel Pearson
Selectors refer to labels, so it’d be deploymentconfig.metadata.labels.name

SSL/TLS means the client has to support it too. So if there is some option
to run RMI over SSL/TLS then it could work pretty easily. But if it’s not
possible to run server and client that way then yes, nodeports will be
easier. Otherwise I think there might be other Ingress options. But I’ve
never used them.
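
To make that concrete, something like this is what I mean (a sketch only;
names and ports are made up, 1099 is just the usual RMI registry port):

apiVersion: v1
kind: DeploymentConfig
metadata:
  name: rmi-server
spec:
  replicas: 1
  selector:
    name: rmi-server
  template:
    metadata:
      labels:
        name: rmi-server        # <- the service selector matches these pod labels
    spec:
      containers:
      - name: rmi-server
        image: example/rmi-server:latest
---
apiVersion: v1
kind: Service
metadata:
  name: rmi-server
spec:
  type: NodePort
  ports:
  - port: 1099
    nodePort: 31099
    name: rmi
  selector:
    name: rmi-server            # <- same label as the pod template above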
On Thu, 3 May 2018 at 12:14 am, Tien Hung Nguyen 
wrote:

> Thank you for the response.
>
> How can I set up SSL/TLS as a connection method on OpenShift that my
> Client connects through SSL/TLS to the server? Is that done on the
> OpenShift router or where can I do the settings?
>
> Otherwise, I think NodePorts are the easier solution to establish a
> connection between Client-Server using RMI. In this case, do I just have to
> specify the service with the proper NodePort as the property like this
> example, where the selector.name is the name of the
> deploymentConfig.metadata.name? :
>
> apiVersion: v1
> kind: Service
> metadata:
>   name: mysql
>   labels:
>     name: mysql
> spec:
>   type: NodePort
>   ports:
>   - port: 3036
>     nodePort: 30036
>     name: http
>   selector:
>     name: mysql
>
>
>
>
>
> 2018-05-02 15:53 GMT+02:00 Joel Pearson :
>
>> If you're using SSL/TLS you could traverse the Router by use
>> Passthrough.  Otherwise, you have to use NodePorts on a Service or
>> something like that.  The Router is generally only really for HTTP, but
>> with passthrough SSL/TLS just about anything could be running in the pod.
>>
>> On Wed, May 2, 2018 at 10:52 PM Tien Hung Nguyen <
>> tienhng.ngu...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> we have a application, which is actually running on Wildfly 12.0.0.Final
>>> via Docker.
>>> Now, we would like to put our application on OpenShift with the existing
>>> Dockerfile.
>>>
>>> However, our client is using RMI to connect connect to the server. Is it
>>> still possible to run our application on OpenShift while using RMI for the
>>> client-server connection? If yes, how should we configure the client and
>>> the router of OpenShift to connect to the server?
>>>
>>> At the moment our java client is using the hostname:port in order to
>>> connect to the server running on Docker.
>>>
>>> Regards,
>>> Tien
>>>
>>> Note: Our application is not a web application, but it is java swing
>>> application (desktop application) which uses RMI to connect to the server.
>>>
>>>
>>> ___
>>> users mailing list
>>> users@lists.openshift.redhat.com
>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>>
>>
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Using RMI Protocol to connect to OpenShift from external application

2018-05-02 Thread Joel Pearson
If you're using SSL/TLS you could traverse the Router by using Passthrough.
Otherwise, you have to use NodePorts on a Service or something like that.
The Router is generally only really for HTTP, but with passthrough SSL/TLS
just about anything could be running in the pod.

On Wed, May 2, 2018 at 10:52 PM Tien Hung Nguyen 
wrote:

> Hi,
>
> we have a application, which is actually running on Wildfly 12.0.0.Final
> via Docker.
> Now, we would like to put our application on OpenShift with the existing
> Dockerfile.
>
> However, our client is using RMI to connect connect to the server. Is it
> still possible to run our application on OpenShift while using RMI for the
> client-server connection? If yes, how should we configure the client and
> the router of OpenShift to connect to the server?
>
> At the moment our java client is using the hostname:port in order to
> connect to the server running on Docker.
>
> Regards,
> Tien
>
> Note: Our application is not a web application, but it is java swing
> application (desktop application) which uses RMI to connect to the server.
>
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: FW: installing newest OCP 3.9 on RHEL 7.4 failed (MODULE ERROR)

2018-04-02 Thread Joel Pearson
Do you have a Red Hat subscription? If not, you shouldn't be trying to
install OCP but rather Origin. If you don't have a subscription configured,
then that'd probably explain why it can't find the RPMs.
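
For what it's worth, the deployment type comes from the inventory, roughly
like this (a sketch):

[OSEv3:vars]
# origin = community bits, no subscription needed
# openshift-enterprise = OCP, needs valid Red Hat subscriptions/repos on every host
openshift_deployment_type=origin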
On Mon, 2 Apr 2018 at 8:35 pm, Lukas Budiman  wrote:

> I am really stuck, trying 4 times installing OCP 3.9, all returned same
> error like this. I'm completely newbie in Openshift, Ansible, but
> understand some basic linux command.
> I run 2 hosts (master called OSMaster & 1 nodes called OSNodeA.
>
>
>
> Any help is greatly appreciated!
>
>
>
> 2018-04-02 10:41:13,571 p=12188 u=root |  PLAY [Set openshift_version for
> etcd, node, and master hosts] ***
> 2018-04-02 10:41:13,585 p=12188 u=root |  TASK [Gathering Facts]
> **
> 2018-04-02 10:41:14,437 p=12188 u=root |  ok: [osnodea.172.16.0.15.nip.io]
> 2018-04-02 10:41:14,496 p=12188 u=root |  TASK [set_fact]
> *
> 2018-04-02 10:41:14,660 p=12188 u=root |  ok: [osnodea.172.16.0.15.nip.io]
> 2018-04-02 10:41:14,674 p=12188 u=root |  PLAY [Ensure the requested
> version packages are available.] *
> 2018-04-02 10:41:14,685 p=12188 u=root |  TASK [Gathering Facts]
> **
> 2018-04-02 10:41:15,500 p=12188 u=root |  ok: [osnodea.172.16.0.15.nip.io]
> 2018-04-02 10:41:15,555 p=12188 u=root |  TASK [include_role]
> *
> 2018-04-02 10:41:15,641 p=12188 u=root |  TASK [openshift_version : Check
> openshift_version for rpm installation] *
> 2018-04-02 10:41:15,682 p=12188 u=root |  included:
> /usr/share/ansible/openshift-ansible/roles/openshift_version/tasks/check_available_rpms.yml
> for osnodea.172.16.0.15.nip.io
> 2018-04-02 10:41:15,699 p=12188 u=root |  TASK [openshift_version : Get
> available atomic-openshift version] ***
> 2018-04-02 10:41:16,134 p=12188 u=root |  fatal: [
> osnodea.172.16.0.15.nip.io]: FAILED! => {"changed": false,
> "module_stderr": "Shared connection to osnodea.172.16.0.15.nip.io
> closed.\r\n", "module_stdout": "Traceback (most recent call last):\r\n
> File \"/tmp/ansible_d2lUs_/ansible_module_repoquery.py\", line 642, in
> \r\nmain()\r\n  File
> \"/tmp/ansible_d2lUs_/ansible_module_repoquery.py\", line 632, in
> main\r\nrval = Repoquery.run_ansible(module.params,
> module.check_mode)\r\n  File
> \"/tmp/ansible_d2lUs_/ansible_module_repoquery.py\", line 588, in
> run_ansible\r\nresults = repoquery.repoquery()\r\n  File
> \"/tmp/ansible_d2lUs_/ansible_module_repoquery.py\", line 547, in
> repoquery\r\nrval = self._repoquery_cmd(repoquery_cmd, True,
> 'raw')\r\n  File \"/tmp/ansible_d2lUs_/ansible_module_repoquery.py\", line
> 385, in _repoquery_cmd\r\nreturncode, stdout, stderr = _run(cmds)\r\n
> File \"/tmp/ansible_d2lUs_/ansible_module_repoquery.py\", line 356, in
> _run\r\nstderr=subprocess.PIPE)\r\n  File
> \"/usr/lib64/python2.7/subprocess.py\", line 711, in __init__\r\n
> errread, errwrite)\r\n  File \"/usr/lib64/python2.7/subprocess.py\", line
> 1327, in _execute_child\r\nraise child_exception\r\nOSError: [Errno 2]
> No such file or directory\r\n", "msg": "MODULE FAILURE", "rc": 0}
> 2018-04-02 10:41:16,136 p=12188 u=root |  PLAY RECAP
> **
> 2018-04-02 10:41:16,137 p=12188 u=root |  localhost  :
> ok=12   changed=0unreachable=0failed=0
> 2018-04-02 10:41:16,137 p=12188 u=root |  osmaster.172.16.0.14.nip.io :
> ok=35   changed=2unreachable=0failed=0
> 2018-04-02 10:41:16,137 p=12188 u=root |  osnodea.172.16.0.15.nip.io :
> ok=20   changed=2unreachable=0failed=1
> 2018-04-02 10:41:16,137 p=12188 u=root |  INSTALLER STATUS
> 
> 2018-04-02 10:41:16,142 p=12188 u=root |  Initialization : In
> Progress (0:00:26)
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Accessing Remote Files via SSHFS

2018-03-28 Thread Joel Pearson
A quick Google search found this:

https://karlstoney.com/2017/03/01/fuse-mount-in-kubernetes/

It looks like the approach would work for you too. But it's worth
mentioning that he's doing the mount from within the container, so he needs
the pod to start as a privileged pod. You can do that in OpenShift, but
running privileged pods does have security implications, so it depends on
whether you trust your legacy app enough to run it this way.
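
As a very rough sketch of what the OpenShift side looks like (names are made
up, and you'd still need a cluster admin to grant the SCC):

# oc adm policy add-scc-to-user privileged -z sshfs-sa -n myproject
apiVersion: v1
kind: DeploymentConfig
metadata:
  name: legacy-app
spec:
  replicas: 1
  selector:
    name: legacy-app
  template:
    metadata:
      labels:
        name: legacy-app
    spec:
      serviceAccountName: sshfs-sa
      containers:
      - name: legacy-app
        image: example/legacy-app:latest
        securityContext:
          privileged: true   # needed so the container can do the FUSE/sshfs mount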
On Thu, 29 Mar 2018 at 1:59 am, Jamie Jackson  wrote:

> Hi Folks,
>
> I'm in the process of containerizing my stack. One of the pieces of the
> legacy stack accesses a remote file system over SSHFS (autofs manages the
> access). What would be the best way to handle this kind of requirement on
> OpenShift?
>
> FYI, I'm currently using straight docker for the stack (docker-compose,
> but no orchestration), but the end goal is probably to run on OpenShift, so
> I'm trying to approach things in a way that will be most transferable to
> OpenShift.
>
> (Note, this conversation started on Google Groups:
> https://groups.google.com/d/msg/openshift/9hjDE2INe5o/vqPoQq-6AwAJ )
>
> Thanks,
> Jamie
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: glusterfs setup

2018-03-28 Thread Joel Pearson
You'd have to run your Gluster cluster separately from OpenShift if you want
a different volume type, I'm guessing.
On Thu, 29 Mar 2018 at 12:15 am, Tim Dudgeon  wrote:

> Ah!, that's a shame.
>
> Tim
>
> On 28/03/18 14:11, Joel Pearson wrote:
>
> “Distributed-Three-way replication is the only supported volume type.”
>
>
> https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.2/html/container-native_storage_for_openshift_container_platform/ch03s02
>
>
> On Thu, 29 Mar 2018 at 12:00 am, Tim Dudgeon 
> wrote:
>
>> When using native glusterfs its not clear to me how to configure the
>> types of storage.
>>
>> As described in the glusterfs docs [1] there are multiple types of
>> volume that can be created (Distributed, Replicated, Distributed
>> Replicated, Striped, Distributed Striped).
>>
>> In the example ansible inventory file [2] you are suggested to set up
>> the glusterfs_devices variable like this:
>>
>> [glusterfs]
>> node0  glusterfs_devices='[ "/dev/vdb", "/dev/vdc", "/dev/vdd" ]'
>> node1  glusterfs_devices='[ "/dev/vdb", "/dev/vdc", "/dev/vdd" ]'
>> node2  glusterfs_devices='[ "/dev/vdb", "/dev/vdc", "/dev/vdd" ]'
>>
>> But how is the way those block devices are utilised to create a
>> particular type of volume?
>>
>> How would you specify that you wanted multiple types of volume
>> (presumably each with its own storage class)?
>>
>> Thanks
>> Tim
>>
>> [1]
>>
>> https://docs.gluster.org/en/latest/Quick-Start-Guide/Architecture/#types-of-volumes
>> [2]
>>
>> https://github.com/openshift/openshift-ansible/blob/master/inventory/hosts.glusterfs.native.example
>>
>> ___
>> users mailing list
>> users@lists.openshift.redhat.com
>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>
>
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: glusterfs setup

2018-03-28 Thread Joel Pearson
“Distributed-Three-way replication is the only supported volume type.”

https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.2/html/container-native_storage_for_openshift_container_platform/ch03s02


On Thu, 29 Mar 2018 at 12:00 am, Tim Dudgeon  wrote:

> When using native glusterfs its not clear to me how to configure the
> types of storage.
>
> As described in the glusterfs docs [1] there are multiple types of
> volume that can be created (Distributed, Replicated, Distributed
> Replicated, Striped, Distributed Striped).
>
> In the example ansible inventory file [2] you are suggested to set up
> the glusterfs_devices variable like this:
>
> [glusterfs]
> node0  glusterfs_devices='[ "/dev/vdb", "/dev/vdc", "/dev/vdd" ]'
> node1  glusterfs_devices='[ "/dev/vdb", "/dev/vdc", "/dev/vdd" ]'
> node2  glusterfs_devices='[ "/dev/vdb", "/dev/vdc", "/dev/vdd" ]'
>
> But how is the way those block devices are utilised to create a
> particular type of volume?
>
> How would you specify that you wanted multiple types of volume
> (presumably each with its own storage class)?
>
> Thanks
> Tim
>
> [1]
>
> https://docs.gluster.org/en/latest/Quick-Start-Guide/Architecture/#types-of-volumes
> [2]
>
> https://github.com/openshift/openshift-ansible/blob/master/inventory/hosts.glusterfs.native.example
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Reverse Proxy using Nginx

2018-03-20 Thread Joel Pearson
So your problem is solved then?
On Wed, 21 Mar 2018 at 4:47 am, Gaurav Ojha  wrote:

> Hi,
>
> Thanks for the reply. I have router, but have a bunch of APIs behind
> gunicorn which I wanted to route through nginx.
>
> I deployed a nginx image and am using it.
>
> On Tue, Mar 20, 2018, 9:43 AM Joel Pearson 
> wrote:
>
>> What do you want Nginx for? OpenShift has a component called the Router
>> which routes traffic. It is based on Haproxy. You could run an nginx
>> container that the router will send traffic to, but if you’re just trying
>> to expose other apps. Then just use the built in Router.
>>
>> Unless you’re talking about the kubernetes reference nginx ingress
>> controller?
>> On Sat, 17 Mar 2018 at 5:05 am, Gaurav Ojha  wrote:
>>
>>> Hello,
>>>
>>> I have a single host OpenShift cluster. Is it possible to install Nginx
>>> (run it as a docker image) and route traffic using Nginx?
>>>
>>> If so, can someone point out the configurations for NO_PROXY and
>>> HTTP_PROXY in this case?
>>>
>>> I dont want any OpenShift instance IP managed by OpenShift. What I am
>>> confused about is this part of the document
>>>
>>> HTTP_PROXY=http://:@:/
>>> HTTPS_PROXY=https://:@:/
>>> NO_PROXY=master.hostname.example.com,10.1.0.0/16,172.30.0.0/16
>>>
>>>
>>> It mentions that NO_PROXY has the hostname of the master included in
>>> NO_PROXY. But since my cluster only has 1 host, so all my routes are
>>> managed through that hostname. In this case, do I just assign some random
>>> routes, and route through Nginx?
>>>
>>> Regards
>>>
>>> ___
>>> users mailing list
>>> users@lists.openshift.redhat.com
>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>>
>>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Pods stuck on Terminating status

2018-03-20 Thread Joel Pearson
I had this at one point, but it was before I cared about the data in that
cluster, so I just rebuilt it. So you could just rebuild your cluster ;)

But in all seriousness, it sounds like you need to do some etcd surgery, but
I have no idea how that works.
On Tue, 20 Mar 2018 at 4:00 am, bahhooo  wrote:

> Hi Rodrigo,
>
>
> Restarting master services did not help either..
> I tried afterwards again with grace period and force delete. Stil no luck.
>
>
>
>
> Bahho
>
> On 16 March 2018 at 21:29, Rodrigo Bersa  wrote:
>
>> Bahhoo,
>>
>> I believe that the namespace will get stuck also. 'Cause it will only be
>> deleted after all of it's objects got deleted.
>>
>> I would try to restart the Masters services before.
>>
>>
>> Regards,
>>
>>
>> Rodrigo Bersa
>>
>> Cloud Consultant, RHCVA, RHCE
>>
>> Red Hat Brasil 
>>
>> rbe...@redhat.comM: +55-11-99557-5841
>> 
>> TRIED. TESTED. TRUSTED. 
>> Red Hat é reconhecida entre as melhores empresas para trabalhar no Brasil
>> pelo *Great Place to Work*.
>>
>> On Fri, Mar 16, 2018 at 5:25 PM, Bahhoo  wrote:
>>
>>> Hi  Rodrigo,
>>>
>>> No PVs are used. One of the pods is a build pod, the other one's a
>>> normal pod without storage.
>>> I'll try deleting the namespace. I didn't want to do that,since I had
>>> running pods in the namespace.
>>>
>>> Best,
>>> Bahho
>>> --
>>> Kimden: Rodrigo Bersa 
>>> Gönderme tarihi: ‎16.‎3.‎2018 16:12
>>> Kime: Bahhoo 
>>> Bilgi: rahul334...@gmail.com; users 
>>>
>>> Konu: Re: Pods stuck on Terminating status
>>>
>>> Hi Bahhoo,
>>>
>>> Are you using PVs on the "Terminating" POD? I heard about some issues
>>> with PODs bounded to PV/PVCs provided by dynamic storage, where you have to
>>> first remove the volume form POD, then the PVPVC. Just after that remove
>>> the POD or the DeplymentConfig.
>>>
>>> If it's not the case, maybe restarting the atomic-openshift-master-*
>>> services can work removing the inconsistent POD.
>>>
>>>
>>> Regards,
>>>
>>>
>>> Rodrigo Bersa
>>>
>>> Cloud Consultant, RHCVA, RHCE
>>>
>>> Red Hat Brasil 
>>>
>>> rbe...@redhat.comM: +55-11-99557-5841
>>> 
>>> TRIED. TESTED. TRUSTED. 
>>> Red Hat é reconhecida entre as melhores empresas para trabalhar no
>>> Brasil pelo *Great Place to Work*.
>>>
>>> On Thu, Mar 15, 2018 at 7:28 PM, Bahhoo  wrote:
>>>
 Hi Rahul,

 That won't do it either.

 Thanks
 Bahho
 --
 Kimden: Rahul Agarwal 
 Gönderme tarihi: ‎15.‎3.‎2018 22:26
 Kime: bahhooo 
 Bilgi: users 
 Konu: Re: Pods stuck on Terminating status

 Hi Bahho

 Try: oc delete all -l app=

 Thanks,
 Rahul

 On Thu, Mar 15, 2018 at 5:19 PM, bahhooo  wrote:

> Hi all,
>
> I have some zombie pods stuck on Terminating status on a OCP 3.7
> HA-cluster.
>
> oc delete with --grace-period=0 --force etc. won't work.
> Docker restart. server reboot won't help either.
>
> I tried to find the pod key in etcd either in order to delete it
> manually. I couldn't find it.
>
> Is there a way to delete these pods?
>
>
>
>
> Bahho
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
>

 ___
 users mailing list
 users@lists.openshift.redhat.com
 http://lists.openshift.redhat.com/openshiftmm/listinfo/users


>>>
>>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: route resolution happens intermittently

2018-03-20 Thread Joel Pearson
Sounds like your DNS configuration is a bit weird. Do you control the DNS
server where you put that myapps domain? How did you configure the nodes to
use DNS?


On Fri, 16 Mar 2018 at 3:47 pm, abdul nizam  wrote:

> Hi All,
>
> I have 2 nodes and one master.
> I have installed OSE 3.6 in my setup.
> I have created 2 projects say Project-A and Project-B
> under project-A i have i have deployed redmine application and Under
> Project-B i have deployed jenkins application.
> Both the pods are in running state and both the pods are runing in
> separate nodes.(ex: redmine pod in node1 and jenkins pod in node 2.
>
> And i have DNS wildcard entry which resolves to the nodes.
>
> Now when i tried to curl the route of redmine from jenkins pod it shows
> error as below
> "Could not resolve host: redmine-project-a.apps67.myapps.com; Name or
> service not known"
>
> I am getting this error most of the time but there is a catch that is some
> times it does work means when i curl the redmine route it works fine.
>
> and when i am getting error and if i make the node IP and route enrty in
> /etc/hosts of the jenkins pod it works fine.
>
> I wanted to know why this kind of behaviour?
>
> Regards
> Abdul
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Reverse Proxy using Nginx

2018-03-20 Thread Joel Pearson
What do you want Nginx for? OpenShift has a component called the Router,
which routes traffic. It is based on HAProxy. You could run an nginx
container that the router will send traffic to, but if you're just trying
to expose other apps, then just use the built-in Router.

Unless you're talking about the Kubernetes reference nginx ingress
controller?
On Sat, 17 Mar 2018 at 5:05 am, Gaurav Ojha  wrote:

> Hello,
>
> I have a single host OpenShift cluster. Is it possible to install Nginx
> (run it as a docker image) and route traffic using Nginx?
>
> If so, can someone point out the configurations for NO_PROXY and
> HTTP_PROXY in this case?
>
> I dont want any OpenShift instance IP managed by OpenShift. What I am
> confused about is this part of the document
>
> HTTP_PROXY=http://:@:/
> HTTPS_PROXY=https://:@:/
> NO_PROXY=master.hostname.example.com,10.1.0.0/16,172.30.0.0/16
>
>
> It mentions that NO_PROXY has the hostname of the master included in
> NO_PROXY. But since my cluster only has 1 host, so all my routes are
> managed through that hostname. In this case, do I just assign some random
> routes, and route through Nginx?
>
> Regards
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: TSB fails to start

2018-03-20 Thread Joel Pearson
I just noticed the other day that 3.7.2 came out. Maybe that's worth a try.
We're running 3.7.1 in the office on our home-rolled OpenStack and it's
going OK.
On Tue, 20 Mar 2018 at 7:34 pm, Tim Dudgeon  wrote:

> I'm just using the default SDN.
>
> This seems to be some issue with Origin 3.7.  Switching back to 3.6.1
> works fine.
> I'm struggling to work out what is going on as it is not very reproducible
> (and 3.7 is broken at present).
>
> Tim
>
> On 20/03/18 08:10, Joel Pearson wrote:
>
> Are you using calico or something like that? If so why not consider a
> regular overlay network just to get it working?
> On Thu, 15 Mar 2018 at 5:26 am, Tim Dudgeon  wrote:
>
>> A little more on this.
>> One the nodes that are not working the file
>> /etc/cni/net.d/80-openshift-network.conf is not present.
>> This seems to cause errors like this in the origin-node service:
>>
>> Mar 14 18:21:45 zzz-infra.openstacklocal origin-node[17833]: W0314
>> 18:21:45.711715   17833 cni.go:189] Unable to update cni config: No
>> networks found in /etc/cni/net.d
>>
>> Where in the installation process does the 80-openshift-network.conf file
>> get created?
>> I don't see anything in the ansible installer logs suggesting anything
>> has gone wrong.
>>
>>
>>
>> On 13/03/18 17:02, Tim Dudgeon wrote:
>>
>> This is still troubling me. I would welcome any input on this.
>>
>> When I run an ansible install (using Origin 3.7.1 on Centos7 nodes) the
>> DNS setup on some nodes seems to randomly get messed up. For instance I've
>> just run a setup with 1 master, 1 infra and 2 identical worker nodes.
>>
>> During the installation one of the worker nodes starts responding very
>> slowly. The other is fine.
>> Looking deeper, on the slow responding one I see a DNS setup like this:
>>
>> [centos@xxx-node-001 ~]$ sudo netstat -tunlp | grep tcp | grep :53 |
>> grep -v tcp6
>> tcp    0    0 10.0.0.20:53     0.0.0.0:*      LISTEN   14727/dnsmasq
>> tcp    0    0 172.17.0.1:53    0.0.0.0:*      LISTEN   14727/dnsmasq
>> [centos@xxx-node-001 ~]$ host orndev-bastion-002
>> ;; connection timed out; trying next origin
>> orndev-bastion-002.openstacklocal has address 10.0.0.9
>>
>> Whilst on the good one it looks like this:
>>
>> [centos@xxx-node-002 ~]$ sudo netstat -tunlp | grep tcp | grep :53 |
>> grep -v tcp6
>> tcp    0    0 127.0.0.1:53     0.0.0.0:*      LISTEN   17231/openshift
>> tcp    0    0 10.129.0.1:53    0.0.0.0:*      LISTEN   14563/dnsmasq
>> tcp    0    0 10.0.0.22:53     0.0.0.0:*      LISTEN   14563/dnsmasq
>> tcp    0    0 172.17.0.1:53    0.0.0.0:*      LISTEN   14563/dnsmasq
>> [centos@xxx-node-002 ~]$ host orndev-bastion-002
>> orndev-bastion-002.openstacklocal has address 10.0.0.9
>>
>> Notice how 2 DNS listeners are not present, and how this causes the DNS
>> lookup to timeout locally before falling back to an upstream server.
>>
>> Getting into this state seems to be a random event.
>>
>> Any thoughts?
>>
>>
>>
>> On 01/03/18 14:30, Tim Dudgeon wrote:
>>
>> Yes, I think it is related to DNS.
>>
>> On a similar, but working, OpenStack environment ` netstat -tunlp | grep
>> ...` shows this:
>>
>> tcp    0    0 127.0.0.1:53     0.0.0.0:*      LISTEN   16957/openshift
>> tcp    0    0 10.128.0.1:53    0.0.0.0:*      LISTEN   16248/dnsmasq
>> tcp    0    0 10.0.0.5:53      0.0.0.0:*      LISTEN   16248/dnsmasq
>> tcp    0    0 172.17.0.1:53    0.0.0.0:*      LISTEN   16248/dnsmasq
>> tcp    0    0 0.0.0.0:8053     0.0.0.0:*      LISTEN   12270/openshift
>>
>> On the environment where the TSB is failing to start I'm seeing:
>>
>> tcp    0    0 127.0.0.1:53     0.0.0.0:*      LISTEN   19067/openshift
>> tcp    0    0 10.129.0.1:53    0.0.0.0:*      LISTEN   16062/dnsmasq
>> tcp    0    0 172.17.0.1:53    0.0.0.0:*      LISTEN   16062/dnsmasq
>> tcp    0    0 0.0.0.0:8053     0.0.0.0:*      LISTEN   11628/openshift
>>
>> Notice that inf the first case dnsmasq is listening on the machine's IP
>> address (line 3) but in the second case  this is missing.
>>
>> Both environments have been created with the openshift-ansible playbooks
>> using an approach that is as equival

Re: TSB fails to start

2018-03-20 Thread Joel Pearson
Are you using Calico or something like that? If so, why not consider a
regular overlay network just to get it working?
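
For reference, the SDN plugin is picked in the inventory, something like
this (a sketch):

[OSEv3:vars]
# the plain overlay; redhat/openshift-ovs-multitenant is the other common choice
os_sdn_network_plugin_name=redhat/openshift-ovs-subnet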
On Thu, 15 Mar 2018 at 5:26 am, Tim Dudgeon  wrote:

> A little more on this.
> One the nodes that are not working the file
> /etc/cni/net.d/80-openshift-network.conf is not present.
> This seems to cause errors like this in the origin-node service:
>
> Mar 14 18:21:45 zzz-infra.openstacklocal origin-node[17833]: W0314
> 18:21:45.711715   17833 cni.go:189] Unable to update cni config: No
> networks found in /etc/cni/net.d
>
> Where in the installation process does the 80-openshift-network.conf file
> get created?
> I don't see anything in the ansible installer logs suggesting anything has
> gone wrong.
>
>
>
> On 13/03/18 17:02, Tim Dudgeon wrote:
>
> This is still troubling me. I would welcome any input on this.
>
> When I run an ansible install (using Origin 3.7.1 on Centos7 nodes) the
> DNS setup on some nodes seems to randomly get messed up. For instance I've
> just run a setup with 1 master, 1 infra and 2 identical worker nodes.
>
> During the installation one of the worker nodes starts responding very
> slowly. The other is fine.
> Looking deeper, on the slow responding one I see a DNS setup like this:
>
> [centos@xxx-node-001 ~]$ sudo netstat -tunlp | grep tcp | grep :53 | grep
> -v tcp6
> tcp    0    0 10.0.0.20:53     0.0.0.0:*      LISTEN   14727/dnsmasq
> tcp    0    0 172.17.0.1:53    0.0.0.0:*      LISTEN   14727/dnsmasq
> [centos@xxx-node-001 ~]$ host orndev-bastion-002
> ;; connection timed out; trying next origin
> orndev-bastion-002.openstacklocal has address 10.0.0.9
>
> Whilst on the good one it looks like this:
>
> [centos@xxx-node-002 ~]$ sudo netstat -tunlp | grep tcp | grep :53 | grep
> -v tcp6
> tcp    0    0 127.0.0.1:53     0.0.0.0:*      LISTEN   17231/openshift
> tcp    0    0 10.129.0.1:53    0.0.0.0:*      LISTEN   14563/dnsmasq
> tcp    0    0 10.0.0.22:53     0.0.0.0:*      LISTEN   14563/dnsmasq
> tcp    0    0 172.17.0.1:53    0.0.0.0:*      LISTEN   14563/dnsmasq
> [centos@xxx-node-002 ~]$ host orndev-bastion-002
> orndev-bastion-002.openstacklocal has address 10.0.0.9
>
> Notice how 2 DNS listeners are not present, and how this causes the DNS
> lookup to timeout locally before falling back to an upstream server.
>
> Getting into this state seems to be a random event.
>
> Any thoughts?
>
>
>
> On 01/03/18 14:30, Tim Dudgeon wrote:
>
> Yes, I think it is related to DNS.
>
> On a similar, but working, OpenStack environment ` netstat -tunlp | grep
> ...` shows this:
>
> tcp    0    0 127.0.0.1:53     0.0.0.0:*      LISTEN   16957/openshift
> tcp    0    0 10.128.0.1:53    0.0.0.0:*      LISTEN   16248/dnsmasq
> tcp    0    0 10.0.0.5:53      0.0.0.0:*      LISTEN   16248/dnsmasq
> tcp    0    0 172.17.0.1:53    0.0.0.0:*      LISTEN   16248/dnsmasq
> tcp    0    0 0.0.0.0:8053     0.0.0.0:*      LISTEN   12270/openshift
>
> On the environment where the TSB is failing to start I'm seeing:
>
> tcp    0    0 127.0.0.1:53     0.0.0.0:*      LISTEN   19067/openshift
> tcp    0    0 10.129.0.1:53    0.0.0.0:*      LISTEN   16062/dnsmasq
> tcp    0    0 172.17.0.1:53    0.0.0.0:*      LISTEN   16062/dnsmasq
> tcp    0    0 0.0.0.0:8053     0.0.0.0:*      LISTEN   11628/openshift
>
> Notice that inf the first case dnsmasq is listening on the machine's IP
> address (line 3) but in the second case  this is missing.
>
> Both environments have been created with the openshift-ansible playbooks
> using an approach that is as equivalent as is possible.
> The contents of /etc/dnsmasq.d/ on the two systems also seem to be
> equivalent.
>
> Any thoughts?
>
>
>
> On 28/02/18 18:50, Nobuhiro Sue wrote:
>
> Tim,
>
> It seems to be DNS issue. I guess your environment is on OpenStack, so
> please check resolver (lookup / reverse lookup).
> You can see how DNS works on OpenShift 3.6 or above:
>
> https://blog.openshift.com/dns-changes-red-hat-openshift-container-platform-3-6/
>
> 2018-03-01 0:06 GMT+09:00 Tim Dudgeon :
>
>> Hi
>>
>> I'm having problems getting an Origin cluster running, using the ansible
>> playbooks.
>> It fails at this point:
>>
>> TASK [template_service_broker : Verify that TSB is running]
>> **
>> FAILED - RETRYING: Verify that TSB is running (120 retries left).
>> FAILED - RETRYING: Verify that TSB is running (119 retries left).
>> 
>> FAILED - RETRYING: Verify that TSB is running (1 retries left).
>> fatal: [master-01.novalocal]: FAILED! => {"attempts": 120, "changed":
>> false, "cmd": ["curl", "-k", "
>> https://apiserver.openshift-template-service-broker.svc/healthz";],
>> "delta": "0:00:01.529402", "end": "2018-02-28 14:49:30.190842", "msg"

OpenShift Origin 3.9.0 release imminent?

2018-03-20 Thread Joel Pearson
Is the OpenShift Origin 3.9.0 release imminent? I noticed the tag appeared
4 days ago, but without any detail yet:

https://github.com/openshift/origin/releases
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Can the Origin Ansible Playbook stop on "Restart node" **fatal** errors?

2018-03-14 Thread Joel Pearson
You could edit the
openshift-ansible/playbooks/common/openshift-node/restart.yml file and add:

max_fail_percentage: 0

under

serial: "{{ openshift_restart_nodes_serial | default(1) }}"

That, in theory, should make it fail straight away.
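
i.e. the play header would end up looking roughly like this (a sketch, only
the max_fail_percentage line is new):

- name: ...                     # the existing play in restart.yml
  hosts: ...                    # leave the existing hosts line as it is
  serial: "{{ openshift_restart_nodes_serial | default(1) }}"
  max_fail_percentage: 0        # abort the play as soon as any host fails to restart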

On Wed, Mar 14, 2018 at 9:46 PM Alan Christie <
achris...@informaticsmatters.com> wrote:

> Hi,
>
> I’ve been running the Ansible release-3.7 branch playbook and occasionally
> I get errors restarting nodes. I’m not looking for help on why my nodes are
> not restarting but I am curious as to why the playbook continues when there
> are fatal errors that eventually lead to a failure some 30 minutes or so
> later? Especially annoying if you happen a) not to be looking at the screen
> at the time of the original failure or b) running the installation inside
> another IaC framework.
>
> Is there an option to “stop on fatal” I’m missing by chance?
>
> Here’s a typical failure at (in my case) 21 minutes in…
>
>
> *RUNNING HANDLER [openshift_node : restart
> node] 
> ***Wednesday
> 14 March 2018  10:12:44 + (0:00:00.081)   0:21:47.968 ***
> skipping: [os-master-1]
> skipping: [os-node-001]
> FAILED - RETRYING: restart node (3 retries left).
> FAILED - RETRYING: restart node (3 retries left).
> FAILED - RETRYING: restart node (2 retries left).
> FAILED - RETRYING: restart node (2 retries left).
> FAILED - RETRYING: restart node (1 retries left).
> FAILED - RETRYING: restart node (1 retries left).
>
>
> *fatal: [os-infra-1]: FAILED! => {"attempts": 3, "changed": false, "msg":
> "Unable to restart service origin-node: Job for origin-node.service failed
> because the control process exited with error code. See \"systemctl status
> origin-node.service\" and \"journalctl -xe\" for details.\n"}fatal:
> [os-node-002]: FAILED! => {"attempts": 3, "changed": false, "msg": "Unable
> to restart service origin-node: Job for origin-node.service failed because
> the control process exited with error code. See \"systemctl status
> origin-node.service\" and \"journalctl -xe\" for details.\n"}*
> And the roll-out finally "gives up the ghost" (in my case) after a further
> 30 minutes...
>
> TASK [debug]
> *
> Wednesday 14 March 2018  10:42:20 + (0:00:00.117)   0:51:23.829
> ***
> skipping: [os-master-1]
> to retry, use: --limit
> @/home/centos/abc/orchestrator/openshift/openshift-ansible/playbooks/byo/config.retry
>
> PLAY RECAP
> ***
> localhost  : ok=13   changed=0unreachable=0
>   failed=0
> *os-infra-1 : ok=182  changed=70   unreachable=0
>   failed=1   *
> os-master-1: ok=539  changed=210  unreachable=0
>   failed=0
> os-node-001: ok=188  changed=65   unreachable=0
>   failed=0
> *os-node-002: ok=165  changed=61   unreachable=0
>   failed=1*
>
> Alan Christie
>
>
>
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: How to use DNS hostname of OpenShift on AWS

2018-02-21 Thread Joel Pearson
Michael, are you running OpenShift on AWS?

https://github.com/openshift/openshift-ansible-contrib/tree/master/reference-architecture/aws-ansible
is the AWS reference architecture and it does use openshift-ansible once
the infrastructure is built, but it uses a dynamic inventory.

Not using the AWS reference architecture to install OpenShift isn't really
an option for us, as it would be rather painful since we're relying heavily
on CloudFormation and the dynamic inventory.

While Ansible is running, the hostnames are correct, so I suspect that
maybe OpenShift itself is detecting the cloud provider and overriding the
hostname, or maybe the Ansible playbook is doing something similar. Inside
the openshift_facts Python library I saw some custom hostname handling for
Google Cloud, but not for AWS, which made me suspicious that it might be
hiding somewhere else.
On Wed, 21 Feb 2018 at 11:38 pm, Feld, Michael (IMS) 
wrote:

> Deploying with https://github.com/openshift/openshift-ansible you can
> define the hostnames in your inventory file. There is a sample inventory
> file at
> https://docs.openshift.org/latest/install_config/install/advanced_install.html
> that shows how to define the master/etcd/nodes, and those names should be
> used as the hostnames in the cluster.
>
>
>
> *From:* users-boun...@lists.openshift.redhat.com [mailto:
> users-boun...@lists.openshift.redhat.com] *On Behalf Of *Joel Pearson
> *Sent:* Wednesday, February 21, 2018 7:14 AM
> *To:* users 
> *Subject:* How to use DNS hostname of OpenShift on AWS
>
>
>
> Hi,
>
>
>
> I'm trying to figure out how to use the DNS hostname when deploying
> OpenShift on AWS using
> https://github.com/openshift/openshift-ansible-contrib/tree/master/reference-architecture/aws-ansible
>  Currently
> it uses private dns name, eg, ip-10-2-7-121.ap-southeast-2.compute.internal
> but that isn't too useful a name for me.  I've managed to set the hostname
> on the ec2 instance properly but disabling the relevant cloud-init setting,
> but it still grabs the private dns name somehow.
>
>
>
> I tried adding "openshift_hostname" to be the same as "name" on this line:
> https://github.com/openshift/openshift-ansible-contrib/blob/master/reference-architecture/aws-ansible/playbooks/roles/instance-groups/tasks/main.yaml#L11
>
>
>
> Which did set the hostname in the node-config.yaml, but then when running
> "oc get nodes" it still returned the private dns name somehow, and
> installation failed waiting for the nodes to start properly, I guess a
> mismatch between node names somewhere.
>
>
>
> I found an old github issue, but it's all referring to files in ansible
> that exist no longer:
>
> https://github.com/openshift/openshift-ansible/issues/1170
>
>
>
> Even on OpenShift Online Starter, they're using the default ec2 names,
> eg: ip-172-31-28-11.ca-central-1.compute.internal, which isn't a good sign
> I guess.
>
>
>
> Has anyone successfully used a DNS name for OpenShift on AWS?
>
>
>
> Thanks,
>
>
>
> Joel
>
> --
>
> Information in this e-mail may be confidential. It is intended only for
> the addressee(s) identified above. If you are not the addressee(s), or an
> employee or agent of the addressee(s), please note that any dissemination,
> distribution, or copying of this communication is strictly prohibited. If
> you have received this e-mail in error, please notify the sender of the
> error.
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


How to use DNS hostname of OpenShift on AWS

2018-02-21 Thread Joel Pearson
Hi,

I'm trying to figure out how to use the DNS hostname when deploying
OpenShift on AWS using
https://github.com/openshift/openshift-ansible-contrib/tree/master/reference-architecture/aws-ansible
Currently
it uses the private DNS name, e.g. ip-10-2-7-121.ap-southeast-2.compute.internal,
but that isn't too useful a name for me.  I've managed to set the hostname
on the EC2 instance properly by disabling the relevant cloud-init setting,
but it still grabs the private DNS name somehow.

I tried adding "openshift_hostname" to be the same as "name" on this line:
https://github.com/openshift/openshift-ansible-contrib/blob/master/reference-architecture/aws-ansible/playbooks/roles/instance-groups/tasks/main.yaml#L11

Which did set the hostname in the node-config.yaml, but then when running
"oc get nodes" it still returned the private DNS name somehow, and the
installation failed waiting for the nodes to start properly; I guess there's
a mismatch between node names somewhere.
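
For comparison, with a static inventory you'd normally control this per host
with openshift_hostname / openshift_public_hostname, something like this
(hostnames made up):

[nodes]
master1.example.com openshift_hostname=master1.example.com openshift_public_hostname=master1.example.com
node1.example.com openshift_hostname=node1.example.com

So I was hoping the dynamic inventory could feed the same variable through.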

I found an old github issue, but it's all referring to files in ansible
that exist no longer:
https://github.com/openshift/openshift-ansible/issues/1170

Even on OpenShift Online Starter, they're using the default ec2 names,
eg: ip-172-31-28-11.ca-central-1.compute.internal, which isn't a good sign
I guess.

Has anyone successfully used a DNS name for OpenShift on AWS?

Thanks,

Joel
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Deployment getting deleted when running configure.yml again

2018-02-13 Thread Joel Pearson
The information about where the bug is fixed is:

https://lists.openshift.redhat.com/openshift-archives/users/2018-January/msg00042.html
On Mon, 5 Feb 2018 at 8:19 pm, Alon Zusman  wrote:

> Yes I do. This fix worked for few times but then it started to make the
> router and other things to be deleted. Anyway this is not something that I
> can do for every user that wants to use the services I provide.
> I could not find the bug opened for this or anything on it actually on
> google. (Could not even find the post you linked).
> When I true fix will be available?
> Thanks.
>
>
>
> On Jan 31, 2018 at 12:14 AM, >
> wrote:
>
> I presume you’re running OpenShift 3.7?
>
> If you’re running the new template broker (openshift-ansible installs it)
> it has a nasty bug that does what you describe. But you can work around it
> by removing an owner reference see:
>
>
> https://lists.openshift.redhat.com/openshift-archives/users/2018-January/msg00045.html
> On Tue, 30 Jan 2018 at 9:53 pm, Alon Zusman  wrote:
>
>> Hello,
>> I have an OpenShift cluster with 3 masters, 3 infra, 3 nodes.
>>
>> I change the cluster configuration from a time to time and whenever I run
>> config.yml (after the first time) all the deployments that were created
>> using a provisioned service being deleted.
>>
>> That is a huge problem for me.
>> Am I missing something? Should I be running a different playbook?
>> Thank you.
>> ___
>> users mailing list
>> users@lists.openshift.redhat.com
>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Openshift Web UI

2018-02-13 Thread Joel Pearson
I don't believe the web UI and the registry are connected in any way.

If you don't use the internal registry then you can't trigger things to
deploy when an image changes; not sure if that matters to you...
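
For example, the sort of thing you lose is the ImageChange trigger on a
DeploymentConfig, which points at an ImageStreamTag backed by the internal
registry. A sketch (names made up):

apiVersion: v1
kind: DeploymentConfig
metadata:
  name: myapp
spec:
  replicas: 1
  selector:
    name: myapp
  template:
    metadata:
      labels:
        name: myapp
    spec:
      containers:
      - name: myapp
        image: myapp:latest
  triggers:
  - type: ImageChange
    imageChangeParams:
      automatic: true           # roll out a new deployment when the tag updates
      containerNames:
      - myapp
      from:
        kind: ImageStreamTag
        name: myapp:latest      # lives in the internal registry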
On Tue, 13 Feb 2018 at 11:26 pm, Polushkin Aleksandr <
aleksandr.polush...@t-systems.ru> wrote:

> Hello everyone !
>
>
>
> I’m playing with Openshift  and at my last installation I disabled
> internal registry and this disabled Web console too.
>
> Am I getting right that it isn’t possible to disable internal registry and
> save the Web UI ?
>
>
>
>
>
> Regards,
>
> Aleksandr
>
>
>
>
> --
>
> T-Systems RUS GmbH
>
> Point of Production
>
> Aleksandr Polushkin
>
> Sr. Configuration Manager
>
> V.O. 13th line, 14B, 199034, St.Petersburg, Russia
>
> Email: aleksandr.polush...@t-systems.ru
>
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Deployment getting deleted when running configure.yml again

2018-01-30 Thread Joel Pearson
I presume you’re running OpenShift 3.7?

If you’re running the new template broker (openshift-ansible installs it)
it has a nasty bug that does what you describe. But you can work around it
by removing an owner reference; see:

https://lists.openshift.redhat.com/openshift-archives/users/2018-January/msg00045.html
On Tue, 30 Jan 2018 at 9:53 pm, Alon Zusman  wrote:

> Hello,
> I have an OpenShift cluster with 3 masters, 3 infra, 3 nodes.
>
> I change the cluster configuration from a time to time and whenever I run
> config.yml (after the first time) all the deployments that were created
> using a provisioned service being deleted.
>
> That is a huge problem for me.
> Am I missing something? Should I be running a different playbook?
> Thank you.
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Passthrough TLS route not working

2018-01-19 Thread Joel Pearson
In the reference implementation they use Classic ELB load balancers in TCP
mode:

See this CloudFormation template:
https://github.com/openshift/openshift-ansible-contrib/blob/master/reference-architecture/aws-ansible/playbooks/roles/cloudformation-infra/files/greenfield.json.j2#L763
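
i.e. the listener is plain TCP rather than HTTPS, so the TLS (and the SNI)
goes straight through to the router. Roughly this shape (a sketch, not
copied from the template, and other required ELB properties are omitted):

RouterElb:
  Type: AWS::ElasticLoadBalancing::LoadBalancer
  Properties:
    Listeners:
    - LoadBalancerPort: 443
      InstancePort: 443
      Protocol: TCP             # no HTTPS termination on the ELB
      InstanceProtocol: TCP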

On Sat, Jan 20, 2018 at 8:55 AM Joel Pearson 
wrote:

> What mode are you running the AWS load balancers in? You probably want to
> run them as TCP load balancers and not HTTP. That way as you say the SNI
> will not get messed with.
> On Sat, 20 Jan 2018 at 4:45 am, Marc Boorshtein 
> wrote:
>
>> So if I bypass the AWS load balancer, everything works great.  Why
>> doesn't HAProxy like the incoming requests?  I'm trying to debug the issue
>> by enabling logging with
>>
>> oc set env dc/router ROUTER_SYSLOG_ADDRESS=127.0.0.1 ROUTER_LOG_LEVEL=debug
>>
>> But the logging doesn't seem to get there (I also tried a remote server as 
>> well).  I'm guessing this is probably an SNI configuration issue?
>>
>>
>>
>> On Fri, Jan 19, 2018 at 11:59 AM Marc Boorshtein 
>> wrote:
>>
>>> I'm running origin 3.7 on AWS.  I have an AWS load balancer in front of
>>> my infrastructure node.  I have a pod listening on TLS on port 9090.  The
>>> service links to the pod and then I have a route that is setup with
>>> passthrough tls to the pod, but every time i try to access it I get the
>>> "Application is not availble" screen even though looking in the console the
>>> service references both the router and the pod.  I have deployments that do
>>> the same thing but will only work with re-encrypt.  Am I missing
>>> something?  Is there an issue using the AWS load balancer with passthrough?
>>>
>>> Thanks
>>>
>> ___
>> users mailing list
>> users@lists.openshift.redhat.com
>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Passthrough TLS route not working

2018-01-19 Thread Joel Pearson
What mode are you running the AWS load balancers in? You probably want to
run them as TCP load balancers and not HTTP. That way, as you say, the SNI
will not get messed with.
On Sat, 20 Jan 2018 at 4:45 am, Marc Boorshtein 
wrote:

> So if I bypass the AWS load balancer, everything works great.  Why doesn't
> HAProxy like the incoming requests?  I'm trying to debug the issue by
> enabling logging with
>
> oc set env dc/router ROUTER_SYSLOG_ADDRESS=127.0.0.1 ROUTER_LOG_LEVEL=debug
>
> But the logging doesn't seem to get there (I also tried a remote server as 
> well).  I'm guessing this is probably an SNI configuration issue?
>
>
>
> On Fri, Jan 19, 2018 at 11:59 AM Marc Boorshtein 
> wrote:
>
>> I'm running origin 3.7 on AWS.  I have an AWS load balancer in front of
>> my infrastructure node.  I have a pod listening on TLS on port 9090.  The
>> service links to the pod and then I have a route that is setup with
>> passthrough tls to the pod, but every time i try to access it I get the
>> "Application is not availble" screen even though looking in the console the
>> service references both the router and the pod.  I have deployments that do
>> the same thing but will only work with re-encrypt.  Am I missing
>> something?  Is there an issue using the AWS load balancer with passthrough?
>>
>> Thanks
>>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: OpenStack cloud provider problems

2018-01-17 Thread Joel Pearson
Have you tried an OpenStack users list? It sounds like you need someone
with in-depth OpenStack knowledge.
On Wed, 17 Jan 2018 at 9:55 pm, Tim Dudgeon  wrote:

> So what does "complete an install" entail?
> Presumably  OpenShift/Kubernetes is trying to do something in OpenStack
> but this is failing.
>
> But what is it trying to do?
>
> On 17/01/18 10:49, Joel Pearson wrote:
>
> Complete stab in the dark, but maybe your OpenStack account doesn’t have
> enough privileges to be able to complete an install?
> On Wed, 17 Jan 2018 at 9:46 pm, Tim Dudgeon  wrote:
>
>> I'm still having problems getting the OpenStack cloud provider running.
>>
>> I have a minimal OpenShift Origin 3.7 Ansible install that runs OK. But
>> when I add the definition for the OpenStack cloud provider (just the
>> cloud provider definition, nothing yet that uses it) the installation
>> fails like this:
>>
>> TASK [nickhammond.logrotate : nickhammond.logrotate | Setup logrotate.d
>> scripts]
>>
>> ***
>>
>> RUNNING HANDLER [openshift_node : restart node]
>>
>> 
>> FAILED - RETRYING: restart node (3 retries left).
>> FAILED - RETRYING: restart node (3 retries left).
>> FAILED - RETRYING: restart node (3 retries left).
>> FAILED - RETRYING: restart node (3 retries left).
>> FAILED - RETRYING: restart node (3 retries left).
>> FAILED - RETRYING: restart node (2 retries left).
>> FAILED - RETRYING: restart node (2 retries left).
>> FAILED - RETRYING: restart node (2 retries left).
>> FAILED - RETRYING: restart node (2 retries left).
>> FAILED - RETRYING: restart node (2 retries left).
>> FAILED - RETRYING: restart node (1 retries left).
>> FAILED - RETRYING: restart node (1 retries left).
>> FAILED - RETRYING: restart node (1 retries left).
>> FAILED - RETRYING: restart node (1 retries left).
>> FAILED - RETRYING: restart node (1 retries left).
>> fatal: [orndev-node-000]: FAILED! => {"attempts": 3, "changed": false,
>> "msg": "Unable to restart service origin-node: Job for
>> origin-node.service failed because the control process exited with error
>> code. See \"systemctl status origin-node.service\" and \"journalctl
>> -xe\" for details.\n"}
>> fatal: [orndev-node-001]: FAILED! => {"attempts": 3, "changed": false,
>> "msg": "Unable to restart service origin-node: Job for
>> origin-node.service failed because the control process exited with error
>> code. See \"systemctl status origin-node.service\" and \"journalctl
>> -xe\" for details.\n"}
>> fatal: [orndev-master-000]: FAILED! => {"attempts": 3, "changed": false,
>> "msg": "Unable to restart service origin-node: Job for
>> origin-node.service failed because the control process exited with error
>> code. See \"systemctl status origin-node.service\" and \"journalctl
>> -xe\" for details.\n"}
>> fatal: [orndev-node-002]: FAILED! => {"attempts": 3, "changed": false,
>> "msg": "Unable to restart service origin-node: Job for
>> origin-node.service failed because the control process exited with error
>> code. See \"systemctl status origin-node.service\" and \"journalctl
>> -xe\" for details.\n"}
>> fatal: [orndev-infra-000]: FAILED! => {"attempts": 3, "changed": false,
>> "msg": "Unable to restart service origin-node: Job for
>> origin-node.service failed because the control process exited with error
>> code. See \"systemctl status origin-node.service\" and \"journalctl
>> -xe\" for details.\n"}
>>
>> RUNNING HANDLER [openshift_node : reload systemd units]
>>
>> 
>>  to retry, use: --limit
>> @/home/centos/openshift-ansible/playbooks/byo/config.retry
>>
>>
>> Looking on one of the nodes I see this error in the origin-node.service
>> logs:
>>
>> Jan 17 09:40:49 orndev-master-000 origin-node[2419]: E0117
>> 09:40:49.7468062419 kubelet_node_status.go:106] Unable to register
>> node "orndev-master-000" with API serve

Re: OpenStack cloud provider problems

2018-01-17 Thread Joel Pearson
Complete stab in the dark, but maybe your OpenStack account doesn’t have
enough privileges to be able to complete an install?
On Wed, 17 Jan 2018 at 9:46 pm, Tim Dudgeon  wrote:

> I'm still having problems getting the OpenStack cloud provider running.
>
> I have a minimal OpenShift Origin 3.7 Ansible install that runs OK. But
> when I add the definition for the OpenStack cloud provider (just the
> cloud provider definition, nothing yet that uses it) the installation
> fails like this:
>
> TASK [nickhammond.logrotate : nickhammond.logrotate | Setup logrotate.d
> scripts]
>
> ***
>
> RUNNING HANDLER [openshift_node : restart node]
>
> 
> FAILED - RETRYING: restart node (3 retries left).
> FAILED - RETRYING: restart node (3 retries left).
> FAILED - RETRYING: restart node (3 retries left).
> FAILED - RETRYING: restart node (3 retries left).
> FAILED - RETRYING: restart node (3 retries left).
> FAILED - RETRYING: restart node (2 retries left).
> FAILED - RETRYING: restart node (2 retries left).
> FAILED - RETRYING: restart node (2 retries left).
> FAILED - RETRYING: restart node (2 retries left).
> FAILED - RETRYING: restart node (2 retries left).
> FAILED - RETRYING: restart node (1 retries left).
> FAILED - RETRYING: restart node (1 retries left).
> FAILED - RETRYING: restart node (1 retries left).
> FAILED - RETRYING: restart node (1 retries left).
> FAILED - RETRYING: restart node (1 retries left).
> fatal: [orndev-node-000]: FAILED! => {"attempts": 3, "changed": false,
> "msg": "Unable to restart service origin-node: Job for
> origin-node.service failed because the control process exited with error
> code. See \"systemctl status origin-node.service\" and \"journalctl
> -xe\" for details.\n"}
> fatal: [orndev-node-001]: FAILED! => {"attempts": 3, "changed": false,
> "msg": "Unable to restart service origin-node: Job for
> origin-node.service failed because the control process exited with error
> code. See \"systemctl status origin-node.service\" and \"journalctl
> -xe\" for details.\n"}
> fatal: [orndev-master-000]: FAILED! => {"attempts": 3, "changed": false,
> "msg": "Unable to restart service origin-node: Job for
> origin-node.service failed because the control process exited with error
> code. See \"systemctl status origin-node.service\" and \"journalctl
> -xe\" for details.\n"}
> fatal: [orndev-node-002]: FAILED! => {"attempts": 3, "changed": false,
> "msg": "Unable to restart service origin-node: Job for
> origin-node.service failed because the control process exited with error
> code. See \"systemctl status origin-node.service\" and \"journalctl
> -xe\" for details.\n"}
> fatal: [orndev-infra-000]: FAILED! => {"attempts": 3, "changed": false,
> "msg": "Unable to restart service origin-node: Job for
> origin-node.service failed because the control process exited with error
> code. See \"systemctl status origin-node.service\" and \"journalctl
> -xe\" for details.\n"}
>
> RUNNING HANDLER [openshift_node : reload systemd units]
>
> 
>  to retry, use: --limit
> @/home/centos/openshift-ansible/playbooks/byo/config.retry
>
>
> Looking on one of the nodes I see this error in the origin-node.service
> logs:
>
> Jan 17 09:40:49 orndev-master-000 origin-node[2419]: E0117
> 09:40:49.7468062419 kubelet_node_status.go:106] Unable to register
> node "orndev-master-000" with API server: nodes "orndev-master-000" is
> forbidden: node 10.0.0.6 cannot modify node orndev-master-000
>
> The /etc/origin/cloudprovider/openstack.conf file has been created OK,
> and looks to be what is expected.
> But I can't be sure its specified correctly and will work. In fact if I
> deliberately change the configuration to use an invalid openstack
> username the install fails at the same place, but the error message on
> the node is different:
>
> Jan 17 10:08:58 orndev-master-000 origin-node[24066]: F0117
> 10:08:58.474152   24066 start_node.go:159] could not init cloud provider
> "openstack": Authentication failed
>
> When set back to the right username the node service again fails because
> of:
> Unable to register node "orndev-master-000" with API server: nodes
> "orndev-master-000" is forbidden: node 10.0.0.6 cannot modify node
> orndev-master-000
>
> How can this be tested on a node to ensure that the cloud provider is
> configured correctly?
> Any idea what the "node 10.0.0.6 cannot modify node orndev-master-000"
> error is about?
>
>
>
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
__

Re: Pod persistence without replication controller

2018-01-09 Thread Joel Pearson
You could use a StatefulSet if you want a consistent hostname; it would
also ensure that there is always one running.
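
A rough sketch of what that looks like (names, image and sizes are made up,
it assumes a headless service called mydb already exists, and older 3.x
clusters expose this as apps/v1beta1):

apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mydb
spec:
  serviceName: mydb             # gives the pod a stable hostname: mydb-0.mydb
  replicas: 1                   # the controller keeps exactly one pod running
  selector:
    matchLabels:
      app: mydb
  template:
    metadata:
      labels:
        app: mydb
    spec:
      containers:
      - name: mydb
        image: example/mydb:latest
        volumeMounts:
        - name: data
          mountPath: /var/lib/db
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 5Gi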
On Wed, 10 Jan 2018 at 3:49 am, Feld, Michael (IMS) 
wrote:

> Does anyone know why a standalone pod (without a replication controller)
> sometimes persists through a host/node reboot, but not all times (not
> evacuating first)? We have a database pod that we cannot risk scaling, and
> want to ensure that it’s always running.
>


Re: OpenShift Origin 3.7 Template Broker seems super flaky

2018-01-07 Thread Joel Pearson
> The TemplateInstance object should have an ownerReference to a
BrokerTemplateInstance and that reference not being handled properly is the
bug.  If you remove that ownerRef from the TemplateInstance, you should be
safe from undesired deletion of the TemplateInstance (and the cascading delete of
everything else) (at least w/ respect to the bug we are aware of).

Nice, that did the trick.

I did an oc patch, and that fixed it:

$ oc get templateinstance
NAME   TEMPLATE
b180d814-2917-4c7e-875f-b91e5d4743e8   jenkins-ephemeral

$ oc patch templateinstance b180d814-2917-4c7e-875f-b91e5d4743e8 --type
json -p='[{"op": "remove", "path": "/metadata/ownerReferences"}]'
templateinstance "b180d814-2917-4c7e-875f-b91e5d4743e8" patched


Also, I've got another stale serviceinstance after a few rounds of testing.
I cannot for the life of me make it die, which means I can't delete the
project it is part of. I've tried a force delete, but it doesn't work.

$ oc delete serviceinstance jenkins-ephemeral-8dmk9 --force --grace-period=0
warning: Immediate deletion does not wait for confirmation that the running
resource has been terminated. The resource may continue to run on the
cluster indefinitely.
serviceinstance "jenkins-ephemeral-8dmk9" deleted

$ oc get serviceinstance
NAME  AGE
jenkins-ephemeral-8dmk9   7m

What's the magic sauce to make it so that I can delete the serviceinstance?
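For what it's worth, when a serviceinstance refuses to go away like that, it is often a
finalizer the service catalog controller never cleared. A hedged sketch of the usual
workaround, using the instance name from the commands above:

$ oc get serviceinstance jenkins-ephemeral-8dmk9 -o jsonpath='{.metadata.finalizers}'
$ oc patch serviceinstance jenkins-ephemeral-8dmk9 --type json \
    -p='[{"op": "remove", "path": "/metadata/finalizers"}]'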

On 8 January 2018 at 15:29, Ben Parees  wrote:

>
>
> On Sun, Jan 7, 2018 at 9:35 PM, Joel Pearson <
> japear...@agiledigital.com.au> wrote:
>
>> Ahh, I looked into all the objects that were getting deleted and they all
>> have an ownerReference, eg:
>>
>> "ownerReferences": [
>> {
>> "apiVersion": "template.openshift.io/v1",
>> "kind": "TemplateInstance",
>> "name": "75c0ccd3-642e-4035-a5cf-3c27e54cae40",
>> "uid": "a7301596-f41a-11e7-88e5-fa163eb8ca3a",
>> "blockOwnerDeletion": true
>> }
>> ]
>>
>> That looks like what that patch is about. I also found that if I tried to edit
>> an object and remove the ownerReference then it also triggered a garbage
>> collect on the spot and all the resources evaporated.
>>
>>
> Sounds worse than the behavior we were aware of, but fundamentally what's
> causing the cascade deletion is this:
>
> Jan 08 00:26:49 master-0.openshift.staging.local dockerd-current[23329]:
> I0108 00:26:49.904249   1 garbagecollector.go:394] delete object [
> template.openshift.io/v1/TemplateInstance, namespace: jenkins-test, name:
> e3639aec-bbbc-4170-b0e4-3b63735af348, uid: 
> 915d585d-f408-11e7-88e5-fa163eb8ca3a]
> with propagation policy Background
>
> The TemplateInstance object should have an ownerReference to a
> BrokerTemplateInstance and that reference not being handled properly is the
> bug.  If you remove that ownerRef from the TemplateInstance, you should be
> safe from undesired deletion of the TemplateInstance (and the cascading delete of
> everything else) (at least w/ respect to the bug we are aware of).
>
> That should be the only ownerRef you need to delete unless there are other
> (to date unknown) bugs in the GC behavior, or in how the TSB is creating the
> ownerRef chain.
>
>
>
>> So I guess my workaround can be: run the template, wait for everything to
>> deploy, export all templated resources to json, strip out ownerReferences,
>> and create all the resources again.
>>
>> On Mon, Jan 8, 2018 at 12:30 PM Joel Pearson <
>> japear...@agiledigital.com.au> wrote:
>>
>>> Hmm, in my case I don't need to restart to cause the problem to
>>> happen. Is there some way to run nightlies of openshift:release-3.7 using
>>> openshift-ansible, so that I can verify it's fixed for me?
>>>
>>> On Mon, Jan 8, 2018 at 12:23 PM Jordan Liggitt 
>>> wrote:
>>>
>>>> Garbage collection in particular could be related to
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1525699 (fixed in
>>>> https://github.com/openshift/origin/pull/17818 but not included in a
>>>> point release yet)
>>>>
>>>>
>>>> On Jan 7, 2018, at 8:17 PM, Joel Pearson 
>>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Has anyone else noticed that the new OpenShift Origin 3.7 Template
>>>> Broker seems super flaky?

Re: OpenShift Origin 3.7 Template Broker seems super flaky

2018-01-07 Thread Joel Pearson
Ahh, I looked into all the objects that were getting deleted and they all
have an ownerReference, eg:

"ownerReferences": [
{
"apiVersion": "template.openshift.io/v1",
"kind": "TemplateInstance",
"name": "75c0ccd3-642e-4035-a5cf-3c27e54cae40",
"uid": "a7301596-f41a-11e7-88e5-fa163eb8ca3a",
"blockOwnerDeletion": true
}
]

That looks like what that patch is about. I also found that if I tried to edit
an object and remove the ownerReference then it also triggered a garbage
collect on the spot and all the resources evaporated.

So I guess my workaround can be: run the template, wait for everything to
deploy, export all templated resources to json, strip out ownerReferences,
and create all the resources again.
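A rough sketch of that workaround in shell, assuming jq is available and that the
template labelled everything with app=jenkins-ephemeral (the selector and the list of
resource types are guesses to adjust):

# 1. Save clean copies of everything the template created, with ownerReferences stripped
oc get all,sa,secret,pvc,rolebinding -l app=jenkins-ephemeral -o json \
  | jq 'del(.items[].metadata.ownerReferences)' > jenkins-objects.json

# 2. If the broker ever garbage-collects the originals, recreate them from the saved copies
oc create -f jenkins-objects.json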

On Mon, Jan 8, 2018 at 12:30 PM Joel Pearson 
wrote:

> Hmm, in my case I don't need to restart to cause the problem to
> happen. Is there some way to run nightlies of openshift:release-3.7 using
> openshift-ansible, so that I can verify it's fixed for me?
>
> On Mon, Jan 8, 2018 at 12:23 PM Jordan Liggitt 
> wrote:
>
>> Garbage collection in particular could be related to
>> https://bugzilla.redhat.com/show_bug.cgi?id=1525699 (fixed in
>> https://github.com/openshift/origin/pull/17818 but not included in a
>> point release yet)
>>
>>
>> On Jan 7, 2018, at 8:17 PM, Joel Pearson 
>> wrote:
>>
>> Hi,
>>
>> Has anyone else noticed that the new OpenShift Origin 3.7 Template Broker
>> seems super flaky?
>>
>> For example, if I deploy a Jenkins (Persistent or Ephemeral) and then modify
>> the route by adding an annotation such as:
>>
>> kubernetes.io/tls-acme: 'true'
>>
>> I have https://github.com/tnozicka/openshift-acme installed in the
>> cluster, which then grabs an SSL cert for me and adds it to the route; then,
>> moments later, all resources from the template are garbage collected for no
>> apparent reason.
>>
>> I also got the same behaviour when I modified the service account the
>> Jenkins template uses: I added an additional route, so I added a new "
>> serviceaccounts.openshift.io/oauth-redirectreference.jenkins:" entry. It
>> took a bit longer (like 12 hours), but it all disappeared again.  I have a
>> suspicion that if you modify any object that a template created, then
>> eventually the template broker will remove all objects it created.
>>
>> Is there any way to disable the new template broker and use the old
>> template system?
>>
>> In Origin 3.6 it was flawless and worked with openshift-acme without any
>> problems at all.
>>
>> I should mention that if I create things manually then it works fine: I
>> can use openshift-acme, and all my resources don't vanish on a whim.
>>
>> Here is a snippet of the logs; you can see the acme endpoints are removed
>> after successfully getting a cert, and then moments later the deleting
>> starts:
>>
>> Jan 08 00:26:47 master-0.openshift.staging.local dockerd-current[23329]:
>> I0108 00:26:47.648255   1 leaderelection.go:199] successfully renewed
>> lease kube-service-catalog/service-catalog-controller-manager
>> Jan 08 00:26:47 master-0.openshift.staging.local origin-node[26684]:
>> I0108 00:26:47.744777   26749 roundrobin.go:338] LoadBalancerRR: Removing
>> endpoints for jenkins-test/acme-9cv97q5dn8:
>> Jan 08 00:26:47 master-0.openshift.staging.local dockerd-current[23329]:
>> I0108 00:26:47.744777   26749 roundrobin.go:338] LoadBalancerRR: Removing
>> endpoints for jenkins-test/acme-9cv97q5dn8:
>> Jan 08 00:26:47 master-0.openshift.staging.local origin-node[26684]:
>> I0108 00:26:47.762005   26749 ovs.go:143] Error executing ovs-ofctl:
>> ovs-ofctl: None: invalid IP address
>> Jan 08 00:26:47 master-0.openshift.staging.local dockerd-current[23329]:
>> I0108 00:26:47.762005   26749 ovs.go:143] Error executing ovs-ofctl:
>> ovs-ofctl: None: invalid IP address
>> Jan 08 00:26:47 master-0.openshift.staging.local dockerd-current[23329]:
>> E0108 00:26:47.765091   26749 sdn_controller.go:284] Error deleting OVS
>> flows for service &{{ } {acme-9cv97q5dn8  jenkins-test
>> /api/v1/namespaces/jenkins-test/services/acme-9cv97q5dn8
>> 94c6b3b3-f40a-11e7-88e5-fa163eb8ca3a 622382 0 2018-01-08 00:26:34 + UTC
>>   map[] map[] [] nil [] } {ClusterIP [{http TCP 80 {0 80 } 0}]
>> map[] None  []  None []  0} {{[]}}}: exit status 1
>>

Re: OpenShift Origin 3.7 Template Broker seems super flaky

2018-01-07 Thread Joel Pearson
Hmm, in my case I don't need to restart to cause the problem to
happen. Is there some way to run nightlies of openshift:release-3.7 using
openshift-ansible, so that I can verify it's fixed for me?

On Mon, Jan 8, 2018 at 12:23 PM Jordan Liggitt  wrote:

> Garbage collection in particular could be related to
> https://bugzilla.redhat.com/show_bug.cgi?id=1525699 (fixed in
> https://github.com/openshift/origin/pull/17818 but not included in a
> point release yet)
>
>
> On Jan 7, 2018, at 8:17 PM, Joel Pearson 
> wrote:
>
> Hi,
>
> Has anyone else noticed that the new OpenShift Origin 3.7 Template Broker
> seems super flaky?
>
> For example, if I deploy a Jenkins (Persistent or Ephemeral) and then modify
> the route by adding an annotation such as:
>
> kubernetes.io/tls-acme: 'true'
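(The same change from the CLI would be roughly the following, assuming the template's
route is named "jenkins":)

$ oc annotate route jenkins kubernetes.io/tls-acme=true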
>
> I have https://github.com/tnozicka/openshift-acme installed in the
> cluster, which then grabs an SSL cert for me and adds it to the route; then,
> moments later, all resources from the template are garbage collected for no
> apparent reason.
>
> I also got the same behaviour when I modified the service account the
> Jenkins template uses: I added an additional route, so I added a new "
> serviceaccounts.openshift.io/oauth-redirectreference.jenkins:" entry. It
> took a bit longer (like 12 hours), but it all disappeared again.  I have a
> suspicion that if you modify any object that a template created, then
> eventually the template broker will remove all objects it created.
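For reference, that kind of entry is the redirect-reference annotation OpenShift uses
when a service account acts as an OAuth client; roughly like the following, where the
service account name "jenkins", the key suffix "second" and the route name
"jenkins-second" are just illustrative:

$ oc annotate sa jenkins \
    serviceaccounts.openshift.io/oauth-redirectreference.second='{"kind":"OAuthRedirectReference","apiVersion":"v1","reference":{"kind":"Route","name":"jenkins-second"}}'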
>
> Is there any way to disable the new template broker and use the old
> template system?
>
> In Origin 3.6 it was flawless and worked with openshift-acme without any
> problems at all.
>
> I should mention that if I create things manually then it works fine: I
> can use openshift-acme, and all my resources don't vanish on a whim.
>
> Here is a snippet of the logs; you can see the acme endpoints are removed
> after successfully getting a cert, and then moments later the deleting
> starts:
>
> Jan 08 00:26:47 master-0.openshift.staging.local dockerd-current[23329]:
> I0108 00:26:47.648255   1 leaderelection.go:199] successfully renewed
> lease kube-service-catalog/service-catalog-controller-manager
> Jan 08 00:26:47 master-0.openshift.staging.local origin-node[26684]: I0108
> 00:26:47.744777   26749 roundrobin.go:338] LoadBalancerRR: Removing
> endpoints for jenkins-test/acme-9cv97q5dn8:
> Jan 08 00:26:47 master-0.openshift.staging.local dockerd-current[23329]:
> I0108 00:26:47.744777   26749 roundrobin.go:338] LoadBalancerRR: Removing
> endpoints for jenkins-test/acme-9cv97q5dn8:
> Jan 08 00:26:47 master-0.openshift.staging.local origin-node[26684]: I0108
> 00:26:47.762005   26749 ovs.go:143] Error executing ovs-ofctl: ovs-ofctl:
> None: invalid IP address
> Jan 08 00:26:47 master-0.openshift.staging.local dockerd-current[23329]:
> I0108 00:26:47.762005   26749 ovs.go:143] Error executing ovs-ofctl:
> ovs-ofctl: None: invalid IP address
> Jan 08 00:26:47 master-0.openshift.staging.local dockerd-current[23329]:
> E0108 00:26:47.765091   26749 sdn_controller.go:284] Error deleting OVS
> flows for service &{{ } {acme-9cv97q5dn8  jenkins-test
> /api/v1/namespaces/jenkins-test/services/acme-9cv97q5dn8
> 94c6b3b3-f40a-11e7-88e5-fa163eb8ca3a 622382 0 2018-01-08 00:26:34 + UTC
>   map[] map[] [] nil [] } {ClusterIP [{http TCP 80 {0 80 } 0}]
> map[] None  []  None []  0} {{[]}}}: exit status 1
> Jan 08 00:26:47 master-0.openshift.staging.local origin-node[26684]: E0108
> 00:26:47.765091   26749 sdn_controller.go:284] Error deleting OVS flows for
> service &{{ } {acme-9cv97q5dn8  jenkins-test
> /api/v1/namespaces/jenkins-test/services/acme-9cv97q5dn8
> 94c6b3b3-f40a-11e7-88e5-fa163eb8ca3a 622382 0 2018-01-08 00:26:34 + UTC
>   map[] map[] [] nil [] } {ClusterIP [{http TCP 80 {0 80 } 0}]
> map[] None  []  None []  0} {{[]}}}: exit status 1
> Jan 08 00:26:48 master-0.openshift.staging.local dockerd-current[23329]:
> I0108 00:26:48.139090   1 rest.go:362] Starting watch for
> /api/v1/namespaces, rv=622418 labels= fields= timeout=8m38s
> Jan 08 00:26:48 master-0.openshift.staging.local origin-master-api[23448]:
> I0108 00:26:48.139090   1 rest.go:362] Starting watch for
> /api/v1/namespaces, rv=622418 labels= fields= timeout=8m38s
> Jan 08 00:26:49 master-0.openshift.staging.local dockerd-current[23329]:
> I0108 00:26:49.668205   1 leaderelection.go:199] successfully renewed
> lease kube-service-catalog/service-catalog-controller-manager
> Jan 08 00:26:49 master-0.openshift.staging.local dockerd-current[23329]:
> I0108 00:26:49.885207   1 garbagecollector.go:291] processing item [
> template.openshift.io/v1/TemplateInstance, namesp
