Re: [Logging] What component forwards log entries to the fluentd input service?

2017-07-11 Thread Alex Wauck
Last I checked (OpenShift Origin 1.2), fluentd was just slurping up the log
files produced by Docker.  It can do that because the pods it runs in have
access to the host filesystem.
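
If you want to confirm that on your own cluster, one quick check (a sketch; I'm
assuming the default "logging" project and the daemonset name used by
origin-aggregated-logging) is to look at the hostPath mounts on the fluentd
daemonset:

$ oc -n logging get daemonset logging-fluentd -o yaml | grep -B2 -A3 hostPath
# expect host mounts like /var/log and /var/lib/docker/containers; the
# per-container JSON log files Docker writes there are what fluentd tails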

On Tue, Jul 11, 2017 at 6:12 AM, Stéphane Klein <cont...@stephane-klein.info
> wrote:

> Hi,
>
> I see here https://github.com/openshift/origin-aggregated-logging/blob/master/fluentd/configs.d/input-post-forward-mux.conf#L2
> that the fluentd logging system uses the secure_forward input plugin.
>
> My question: what component forwards log entries to the fluentd input service?
>
> Best regards,
> Stéphane
> --
> Stéphane Klein <cont...@stephane-klein.info>
> blog: http://stephane-klein.info
> cv : http://cv.stephane-klein.info
> Twitter: http://twitter.com/klein_stephane
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
>


-- 

Alex Wauck // Senior DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: In OpenShift Ansible, what is the difference between roles/openshift_hosted_metrics and roles/openshift_metrics?

2017-04-28 Thread Alex Wauck
I think Stéphane meant to link to this:
https://github.com/openshift/openshift-ansible/tree/master/roles/openshift_hosted_metrics

What's the difference between that one and openshift_metrics?

On Fri, Apr 28, 2017 at 11:46 AM, Tim Bielawa <tbiel...@redhat.com> wrote:

> I believe that openshift-hosted-logging installs kibana (logging
> exploration) whereas openshift-metrics will install hawkular (a metric
> storage engine).
>
> On Fri, Apr 28, 2017 at 9:25 AM, Stéphane Klein <
> cont...@stephane-klein.info> wrote:
>
>> Hi,
>>
>> what is the difference between:
>>
>> * roles/openshift_hosted_metrics (https://github.com/openshift/openshift-ansible/tree/master/roles/openshift_hosted_logging)
>> * and roles/openshift_metrics (https://github.com/openshift/openshift-ansible/tree/master/roles/openshift_metrics)
>>
>> ?
>>
>> Best regards,
>> Stéphane
>> --
>> Stéphane Klein <cont...@stephane-klein.info>
>> blog: http://stephane-klein.info
>> cv : http://cv.stephane-klein.info
>> Twitter: http://twitter.com/klein_stephane
>>
>> ___
>> users mailing list
>> users@lists.openshift.redhat.com
>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>
>>
>
>
> --
> Tim Bielawa, Software Engineer [ED-C137]
> Cell: 919.332.6411 <(919)%20332-6411>  | IRC: tbielawa (#openshift)
> 1BA0 4FAB 4C13 FBA0 A036  4958 AD05 E75E 0333 AE37
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
>


-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Pods randomly running as root

2017-02-07 Thread Alex Wauck
Yeah, I fixed it with oadm policy reconcile-sccs.  Nobody who has admin
access has admitted to doing it, so I guess I'll just revoke that access
from anybody who isn't part of the on-call rotation.  I see that newer
versions of OpenShift have audit logging; I look forward to getting that in
place.
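
For anyone who hits the same thing, the commands were roughly these (the
username is a placeholder):

$ oadm policy reconcile-sccs             # preview which default SCCs have drifted
$ oadm policy reconcile-sccs --confirm   # reset them to the shipped defaults
$ oadm policy remove-cluster-role-from-user cluster-admin some-user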

On Tue, Feb 7, 2017 at 3:21 PM, Jordan Liggitt <jligg...@redhat.com> wrote:

> It is not right, and no, Ansible does not relax the restricted SCC.
>
> `oadm policy reconcile-sccs` will show you default sccs that need
> reconciling, and `oadm policy reconcile-sccs --confirm` will revert them to
> their default settings.
>
> On Mon, Feb 6, 2017 at 2:29 PM, Alex Wauck <alexwa...@exosite.com> wrote:
>
>> Well, well:
>>
>> $ oc export scc/restricted
>> allowHostDirVolumePlugin: false
>> allowHostIPC: false
>> allowHostNetwork: false
>> allowHostPID: false
>> allowHostPorts: false
>> allowPrivilegedContainer: false
>> allowedCapabilities: null
>> apiVersion: v1
>> defaultAddCapabilities: null
>> fsGroup:
>>   type: MustRunAs
>> groups:
>> - system:authenticated
>> kind: SecurityContextConstraints
>> metadata:
>>   annotations:
>> kubernetes.io/description: restricted denies access to all host
>> features and requires
>>   pods to be run with a UID, and SELinux context that are allocated
>> to the namespace.  This
>>   is the most restrictive SCC.
>>   creationTimestamp: null
>>   name: restricted
>> priority: null
>> readOnlyRootFilesystem: false
>> requiredDropCapabilities:
>> - KILL
>> - MKNOD
>> - SYS_CHROOT
>> - SETUID
>> - SETGID
>> runAsUser:
>>   type: RunAsAny
>> seLinuxContext:
>>   type: MustRunAs
>> supplementalGroups:
>>   type: RunAsAny
>> volumes:
>> - configMap
>> - downwardAPI
>> - emptyDir
>> - persistentVolumeClaim
>> - secret
>>
>> That runAsUser isn't right, is it?  Any idea how that could have been
>> done by openshift-ansible or something?  Otherwise, this might be my excuse
>> to clamp down hard on admin-level access to the cluster.
>>
>> On Mon, Feb 6, 2017 at 1:24 PM, Jordan Liggitt <jligg...@redhat.com>
>> wrote:
>>
>>> Can you include your `restricted` scc definition:
>>>
>>> oc get scc -o yaml
>>>
>>> It seems likely that the restricted scc definition was modified in your
>>> installation to not be as restrictive. By default, it sets runAsUser to
>>> MustRunAsRange
>>>
>>>
>>>
>>>
>>> On Mon, Feb 6, 2017 at 2:17 PM, Alex Wauck <alexwa...@exosite.com>
>>> wrote:
>>>
>>>> openshift.io/scc is "restricted" for app1-45-3blnd (not running as
>>>> root).  It also has that value for app5-36-2rfsq (running as root).
>>>>
>>>> On Mon, Feb 6, 2017 at 1:11 PM, Clayton Coleman <ccole...@redhat.com>
>>>> wrote:
>>>>
>>>>> Were those apps created in order?  Or at individual times?   If you
>>>>> did the following order of actions:
>>>>>
>>>>> 1. create app2, app4
>>>>> 2. grant the default service account access to a higher level SCC
>>>>> 3. create app1, app3, app5, and app6
>>>>>
>>>>> Then this would be what I would expect.  Can you look at the
>>>>> annotations of pod app1-45-3blnd and see what the value of "
>>>>> openshift.io/scc" is?
>>>>>
>>>>>
>>>>> On Mon, Feb 6, 2017 at 1:57 PM, Alex Wauck <alexwa...@exosite.com>
>>>>> wrote:
>>>>>
>>>>>> OK, this just got a lot more interesting:
>>>>>>
>>>>>> $ oc -n some-project exec app1-45-3blnd -- id
>>>>>> uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon
>>>>>> ),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(dialout),26(
>>>>>> tape),27(video),100037
>>>>>> $ oc -n some-project exec app2-18-q2fwm -- id
>>>>>> uid=100037 gid=0(root) groups=100037
>>>>>> $ oc -n some-project exec app3-10-lhato -- id
>>>>>> uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon
>>>>>> ),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(dialout),26(
>>>>>> tape),27(video),100037
>>>>>> $ oc -n some-project exec app4-16-dl2r7 -- id
>>>>>> uid=100037 gid=0(root) groups=100037

Re: Pods randomly running as root

2017-02-06 Thread Alex Wauck
Well, well:

$ oc export scc/restricted
allowHostDirVolumePlugin: false
allowHostIPC: false
allowHostNetwork: false
allowHostPID: false
allowHostPorts: false
allowPrivilegedContainer: false
allowedCapabilities: null
apiVersion: v1
defaultAddCapabilities: null
fsGroup:
  type: MustRunAs
groups:
- system:authenticated
kind: SecurityContextConstraints
metadata:
  annotations:
kubernetes.io/description: restricted denies access to all host
features and requires
  pods to be run with a UID, and SELinux context that are allocated to
the namespace.  This
  is the most restrictive SCC.
  creationTimestamp: null
  name: restricted
priority: null
readOnlyRootFilesystem: false
requiredDropCapabilities:
- KILL
- MKNOD
- SYS_CHROOT
- SETUID
- SETGID
runAsUser:
  type: RunAsAny
seLinuxContext:
  type: MustRunAs
supplementalGroups:
  type: RunAsAny
volumes:
- configMap
- downwardAPI
- emptyDir
- persistentVolumeClaim
- secret

That runAsUser isn't right, is it?  Any idea how that could have been done
by openshift-ansible or something?  Otherwise, this might be my excuse to
clamp down hard on admin-level access to the cluster.
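
(For the record, the quickest check I know of, using nothing beyond what's
already shown above:)

$ oc export scc/restricted | grep -A1 runAsUser
# the stock restricted SCC should say MustRunAsRange here, not RunAsAny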

On Mon, Feb 6, 2017 at 1:24 PM, Jordan Liggitt <jligg...@redhat.com> wrote:

> Can you include your `restricted` scc definition:
>
> oc get scc -o yaml
>
> It seems likely that the restricted scc definition was modified in your
> installation to not be as restrictive. By default, it sets runAsUser to
> MustRunAsRange
>
>
>
>
> On Mon, Feb 6, 2017 at 2:17 PM, Alex Wauck <alexwa...@exosite.com> wrote:
>
>> openshift.io/scc is "restricted" for app1-45-3blnd (not running as
>> root).  It also has that value for app5-36-2rfsq (running as root).
>>
>> On Mon, Feb 6, 2017 at 1:11 PM, Clayton Coleman <ccole...@redhat.com>
>> wrote:
>>
>>> Were those apps created in order?  Or at individual times?   If you did
>>> the following order of actions:
>>>
>>> 1. create app2, app4
>>> 2. grant the default service account access to a higher level SCC
>>> 3. create app1, app3, app5, and app6
>>>
>>> Then this would be what I would expect.  Can you look at the annotations
>>> of pod app1-45-3blnd and see what the value of "openshift.io/scc" is?
>>>
>>>
>>> On Mon, Feb 6, 2017 at 1:57 PM, Alex Wauck <alexwa...@exosite.com>
>>> wrote:
>>>
>>>> OK, this just got a lot more interesting:
>>>>
>>>> $ oc -n some-project exec app1-45-3blnd -- id
>>>> uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon
>>>> ),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(dialout),26(
>>>> tape),27(video),100037
>>>> $ oc -n some-project exec app2-18-q2fwm -- id
>>>> uid=100037 gid=0(root) groups=100037
>>>> $ oc -n some-project exec app3-10-lhato -- id
>>>> uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon
>>>> ),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(dialout),26(
>>>> tape),27(video),100037
>>>> $ oc -n some-project exec app4-16-dl2r7 -- id
>>>> uid=100037 gid=0(root) groups=100037
>>>> $ oc -n some-project exec app5-36-2rfsq -- id
>>>> uid=0(root) gid=0(root) groups=0(root),100037
>>>> $ oc -n some-project exec app6-15-078fd -- id
>>>> uid=0(root) gid=0(root) groups=0(root),100037
>>>>
>>>> All of these pods are running on the same node, and as you can see,
>>>> they are in the same project.  Yet, some are running as root and some are
>>>> not.  How weird is that?
>>>>
>>>> On Mon, Feb 6, 2017 at 12:49 PM, Alex Wauck <alexwa...@exosite.com>
>>>> wrote:
>>>>
>>>>> $ oc export -n some-project pod/good-pod | grep serviceAccount
>>>>>   serviceAccount: default
>>>>>   serviceAccountName: default
>>>>> $ oc export -n some-project pod/bad-pod | grep serviceAccount
>>>>>   serviceAccount: default
>>>>>   serviceAccountName: default
>>>>>
>>>>> Same serviceAccountName.  This problem seems to happen with any pod
>>>>> from any project that happens to run on these newer nodes.  I examined the
>>>>> output of `oc describe scc`, and I did not find any unexpected access to
>>>>> elevated privileges for a default serviceaccount.  The project where I'm
>>>>> currently seeing the problem is not mentioned at all.  Also, I've seen the
>>>>> problem happen with pods that are managed by the same replication
>>>>> controller.

Re: Pods randomly running as root

2017-02-06 Thread Alex Wauck
Whoops, those are both running as root.  However, app2-18-q2fwm is also
"restricted" and is not running as root.

On Mon, Feb 6, 2017 at 1:17 PM, Alex Wauck <alexwa...@exosite.com> wrote:

> openshift.io/scc is "restricted" for app1-45-3blnd (not running as
> root).  It also has that value for app5-36-2rfsq (running as root).
>
> On Mon, Feb 6, 2017 at 1:11 PM, Clayton Coleman <ccole...@redhat.com>
> wrote:
>
>> Were those apps created in order?  Or at individual times?   If you did
>> the following order of actions:
>>
>> 1. create app2, app4
>> 2. grant the default service account access to a higher level SCC
>> 3. create app1, app3, app5, and app6
>>
>> Then this would be what I would expect.  Can you look at the annotations
>> of pod app1-45-3blnd and see what the value of "openshift.io/scc" is?
>>
>>
>> On Mon, Feb 6, 2017 at 1:57 PM, Alex Wauck <alexwa...@exosite.com> wrote:
>>
>>> OK, this just got a lot more interesting:
>>>
>>> $ oc -n some-project exec app1-45-3blnd -- id
>>> uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon
>>> ),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(dialout),26(
>>> tape),27(video),100037
>>> $ oc -n some-project exec app2-18-q2fwm -- id
>>> uid=100037 gid=0(root) groups=100037
>>> $ oc -n some-project exec app3-10-lhato -- id
>>> uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon
>>> ),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(dialout),26(
>>> tape),27(video),100037
>>> $ oc -n some-project exec app4-16-dl2r7 -- id
>>> uid=100037 gid=0(root) groups=100037
>>> $ oc -n some-project exec app5-36-2rfsq -- id
>>> uid=0(root) gid=0(root) groups=0(root),100037
>>> $ oc -n some-project exec app6-15-078fd -- id
>>> uid=0(root) gid=0(root) groups=0(root),100037
>>>
>>> All of these pods are running on the same node, and as you can see, they
>>> are in the same project.  Yet, some are running as root and some are not.
>>> How weird is that?
>>>
>>> On Mon, Feb 6, 2017 at 12:49 PM, Alex Wauck <alexwa...@exosite.com>
>>> wrote:
>>>
>>>> $ oc export -n some-project pod/good-pod | grep serviceAccount
>>>>   serviceAccount: default
>>>>   serviceAccountName: default
>>>> $ oc export -n some-project pod/bad-pod | grep serviceAccount
>>>>   serviceAccount: default
>>>>   serviceAccountName: default
>>>>
>>>> Same serviceAccountName.  This problem seems to happen with any pod
>>>> from any project that happens to run on these newer nodes.  I examined the
>>>> output of `oc describe scc`, and I did not find any unexpected access to
>>>> elevated privileges for a default serviceaccount.  The project where I'm
>>>> currently seeing the problem is not mentioned at all.  Also, I've seen the
>>>> problem happen with pods that are managed by the same replication
>>>> controller.
>>>>
>>>> On Mon, Feb 6, 2017 at 12:46 PM, Clayton Coleman <ccole...@redhat.com>
>>>> wrote:
>>>>
>>>>> Adding the list back
>>>>>
>>>>> -- Forwarded message --
>>>>> From: Clayton Coleman <ccole...@redhat.com>
>>>>> Date: Mon, Feb 6, 2017 at 1:42 PM
>>>>> Subject: Re: Pods randomly running as root
>>>>> To: Alex Wauck <alexwa...@exosite.com>
>>>>> Cc: users <us...@redhat.com>
>>>>>
>>>>>
>>>>> Do the pods running as root and the ones not running as root have the
>>>>> same serviceAccountName field, or different ones?  If different, you may
>>>>> have granted the service account access to a higher role - defaulting is
>>>>> determined by the SCCs that a service account can access, so an
>>>>> admin-level service account will run as root by default unless you
>>>>> specify you don't want that.
>>>>>
>>>>> On Mon, Feb 6, 2017 at 1:37 PM, Alex Wauck <alexwa...@exosite.com>
>>>>> wrote:
>>>>>
>>>>>> I'm looking at two nodes where one has the problem and the other
>>>>>> doesn't, and I have confirmed that their node-config.yaml is the same for
>>>>>> both (modulo IP addresses).  The generated kubeconfigs for these nodes on
>>>>>> the master are also the same (modulo IP addresses and keys/certs).
>>>>>>
>>

Re: Pods randomly running as root

2017-02-06 Thread Alex Wauck
openshift.io/scc is "restricted" for app1-45-3blnd (not running as root).
It also has that value for app5-36-2rfsq (running as root).
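
(For completeness, the annotation can be pulled with something like this:)

$ oc -n some-project get pod app1-45-3blnd -o yaml | grep 'openshift.io/scc'
$ oc -n some-project get pod app5-36-2rfsq -o yaml | grep 'openshift.io/scc'
# both print: openshift.io/scc: restricted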

On Mon, Feb 6, 2017 at 1:11 PM, Clayton Coleman <ccole...@redhat.com> wrote:

> Were those apps created in order?  Or at individual times?   If you did
> the following order of actions:
>
> 1. create app2, app4
> 2. grant the default service account access to a higher level SCC
> 3. create app1, app3, app5, and app6
>
> Then this would be what I would expect.  Can you look at the annotations
> of pod app1-45-3blnd and see what the value of "openshift.io/scc" is?
>
>
> On Mon, Feb 6, 2017 at 1:57 PM, Alex Wauck <alexwa...@exosite.com> wrote:
>
>> OK, this just got a lot more interesting:
>>
>> $ oc -n some-project exec app1-45-3blnd -- id
>> uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon
>> ),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(dialout),
>> 26(tape),27(video),100037
>> $ oc -n some-project exec app2-18-q2fwm -- id
>> uid=100037 gid=0(root) groups=100037
>> $ oc -n some-project exec app3-10-lhato -- id
>> uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon
>> ),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(dialout),
>> 26(tape),27(video),100037
>> $ oc -n some-project exec app4-16-dl2r7 -- id
>> uid=100037 gid=0(root) groups=100037
>> $ oc -n some-project exec app5-36-2rfsq -- id
>> uid=0(root) gid=0(root) groups=0(root),100037
>> $ oc -n some-project exec app6-15-078fd -- id
>> uid=0(root) gid=0(root) groups=0(root),100037
>>
>> All of these pods are running on the same node, and as you can see, they
>> are in the same project.  Yet, some are running as root and some are not.
>> How weird is that?
>>
>> On Mon, Feb 6, 2017 at 12:49 PM, Alex Wauck <alexwa...@exosite.com>
>> wrote:
>>
>>> $ oc export -n some-project pod/good-pod | grep serviceAccount
>>>   serviceAccount: default
>>>   serviceAccountName: default
>>> $ oc export -n some-project pod/bad-pod | grep serviceAccount
>>>   serviceAccount: default
>>>   serviceAccountName: default
>>>
>>> Same serviceAccountName.  This problem seems to happen with any pod from
>>> any project that happens to run on these newer nodes.  I examined the
>>> output of `oc describe scc`, and I did not find any unexpected access to
>>> elevated privileges for a default serviceaccount.  The project where I'm
>>> currently seeing the problem is not mentioned at all.  Also, I've seen the
>>> problem happen with pods that are managed by the same replication
>>> controller.
>>>
>>> On Mon, Feb 6, 2017 at 12:46 PM, Clayton Coleman <ccole...@redhat.com>
>>> wrote:
>>>
>>>> Adding the list back
>>>>
>>>> -- Forwarded message --
>>>> From: Clayton Coleman <ccole...@redhat.com>
>>>> Date: Mon, Feb 6, 2017 at 1:42 PM
>>>> Subject: Re: Pods randomly running as root
>>>> To: Alex Wauck <alexwa...@exosite.com>
>>>> Cc: users <us...@redhat.com>
>>>>
>>>>
>>>> Do the pods running as root and the ones not running as root have the
>>>> same serviceAccountName field, or different ones?  If different, you may
>>>> have granted the service account access to a higher role - defaulting is
>>>> determined by the SCCs that a service account can access, so an
>>>> admin-level service account will run as root by default unless you
>>>> specify you don't want that.
>>>>
>>>> On Mon, Feb 6, 2017 at 1:37 PM, Alex Wauck <alexwa...@exosite.com>
>>>> wrote:
>>>>
>>>>> I'm looking at two nodes where one has the problem and the other
>>>>> doesn't, and I have confirmed that their node-config.yaml is the same for
>>>>> both (modulo IP addresses).  The generated kubeconfigs for these nodes on
>>>>> the master are also the same (modulo IP addresses and keys/certs).
>>>>>
>>>>> On Mon, Feb 6, 2017 at 10:46 AM, Alex Wauck <alexwa...@exosite.com>
>>>>> wrote:
>>>>>
>>>>>> Oh, wait.  I was looking at the wrong section.  The non-root pod has a
>>>>>> runAsUser attribute, but the root pod doesn't!
>>>>>>
>>>>>> On Mon, Feb 6, 2017 at 10:44 AM, Alex Wauck <alexwa...@exosite.com>
>>>>>> wrote:
>>>>>>
>>>>>>> A pod that IS running as root has this:

Re: Pods randomly running as root

2017-02-06 Thread Alex Wauck
Redacted versions of the full pod specs are attached.  app1 is running as
root; app2 is not.

On Mon, Feb 6, 2017 at 1:07 PM, Jordan Liggitt <jligg...@redhat.com> wrote:

> Can you provide the full pod specs for a pod running as non-root and a pod
> running as root?
>
> Can you also provide the definition of the SecurityContextConstraint
> referenced in the pod specs "openshift.io/scc" annotation?
>
>
>
>
> On Mon, Feb 6, 2017 at 2:01 PM, Alex Wauck <alexwa...@exosite.com> wrote:
>
>> Judging by pod start times, it looks like everything that started before
>> February 2 is not running as root, while everything else is.
>>
>> On Mon, Feb 6, 2017 at 12:57 PM, Alex Wauck <alexwa...@exosite.com>
>> wrote:
>>
>>> OK, this just got a lot more interesting:
>>>
>>> $ oc -n some-project exec app1-45-3blnd -- id
>>> uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon
>>> ),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(dialout),26(
>>> tape),27(video),100037
>>> $ oc -n some-project exec app2-18-q2fwm -- id
>>> uid=100037 gid=0(root) groups=100037
>>> $ oc -n some-project exec app3-10-lhato -- id
>>> uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon
>>> ),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(dialout),26(
>>> tape),27(video),100037
>>> $ oc -n some-project exec app4-16-dl2r7 -- id
>>> uid=100037 gid=0(root) groups=100037
>>> $ oc -n some-project exec app5-36-2rfsq -- id
>>> uid=0(root) gid=0(root) groups=0(root),100037
>>> $ oc -n some-project exec app6-15-078fd -- id
>>> uid=0(root) gid=0(root) groups=0(root),100037
>>>
>>> All of these pods are running on the same node, and as you can see, they
>>> are in the same project.  Yet, some are running as root and some are not.
>>> How weird is that?
>>>
>>> On Mon, Feb 6, 2017 at 12:49 PM, Alex Wauck <alexwa...@exosite.com>
>>> wrote:
>>>
>>>> $ oc export -n some-project pod/good-pod | grep serviceAccount
>>>>   serviceAccount: default
>>>>   serviceAccountName: default
>>>> $ oc export -n some-project pod/bad-pod | grep serviceAccount
>>>>   serviceAccount: default
>>>>   serviceAccountName: default
>>>>
>>>> Same serviceAccountName.  This problem seems to happen with any pod
>>>> from any project that happens to run on these newer nodes.  I examined the
>>>> output of `oc describe scc`, and I did not find any unexpected access to
>>>> elevated privileges for a default serviceaccount.  The project where I'm
>>>> currently seeing the problem is not mentioned at all.  Also, I've seen the
>>>> problem happen with pods that are managed by the same replication
>>>> controller.
>>>>
>>>> On Mon, Feb 6, 2017 at 12:46 PM, Clayton Coleman <ccole...@redhat.com>
>>>> wrote:
>>>>
>>>>> Adding the list back
>>>>>
>>>>> -- Forwarded message --
>>>>> From: Clayton Coleman <ccole...@redhat.com>
>>>>> Date: Mon, Feb 6, 2017 at 1:42 PM
>>>>> Subject: Re: Pods randomly running as root
>>>>> To: Alex Wauck <alexwa...@exosite.com>
>>>>> Cc: users <us...@redhat.com>
>>>>>
>>>>>
>>>>> Do the pods running as root and the ones not running as root have the
>>>>> same serviceAccountName field, or different ones?  If different, you may
>>>>> have granted the service account access to a higher role - defaulting is
>>>>> determined by the SCCs that a service account can access, so an
>>>>> admin-level service account will run as root by default unless you
>>>>> specify you don't want that.
>>>>>
>>>>> On Mon, Feb 6, 2017 at 1:37 PM, Alex Wauck <alexwa...@exosite.com>
>>>>> wrote:
>>>>>
>>>>>> I'm looking at two nodes where one has the problem and the other
>>>>>> doesn't, and I have confirmed that their node-config.yaml is the same for
>>>>>> both (modulo IP addresses).  The generated kubeconfigs for these nodes on
>>>>>> the master are also the same (modulo IP addresses and keys/certs).
>>>>>>
>>>>>> On Mon, Feb 6, 2017 at 10:46 AM, Alex Wauck <alexwa...@exosite.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Oh, wait.  I was looking at the wrong section

Re: Pods randomly running as root

2017-02-06 Thread Alex Wauck
Judging by pod start times, it looks like everything that started before
February 2 is not running as root, while everything else is.
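
(One way to line those up, for anyone following along; nothing beyond stock
oc output:)

$ oc -n some-project get pods -o wide                           # AGE and NODE columns in one view
$ oc -n some-project get pod app1-45-3blnd -o yaml | grep startTime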

On Mon, Feb 6, 2017 at 12:57 PM, Alex Wauck <alexwa...@exosite.com> wrote:

> OK, this just got a lot more interesting:
>
> $ oc -n some-project exec app1-45-3blnd -- id
> uid=0(root) gid=0(root) groups=0(root),1(bin),2(
> daemon),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(
> dialout),26(tape),27(video),100037
> $ oc -n some-project exec app2-18-q2fwm -- id
> uid=100037 gid=0(root) groups=100037
> $ oc -n some-project exec app3-10-lhato -- id
> uid=0(root) gid=0(root) groups=0(root),1(bin),2(
> daemon),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(
> dialout),26(tape),27(video),100037
> $ oc -n some-project exec app4-16-dl2r7 -- id
> uid=100037 gid=0(root) groups=100037
> $ oc -n some-project exec app5-36-2rfsq -- id
> uid=0(root) gid=0(root) groups=0(root),100037
> $ oc -n some-project exec app6-15-078fd -- id
> uid=0(root) gid=0(root) groups=0(root),100037
>
> All of these pods are running on the same node, and as you can see, they
> are in the same project.  Yet, some are running as root and some are not.
> How weird is that?
>
> On Mon, Feb 6, 2017 at 12:49 PM, Alex Wauck <alexwa...@exosite.com> wrote:
>
>> $ oc export -n some-project pod/good-pod | grep serviceAccount
>>   serviceAccount: default
>>   serviceAccountName: default
>> $ oc export -n some-project pod/bad-pod | grep serviceAccount
>>   serviceAccount: default
>>   serviceAccountName: default
>>
>> Same serviceAccountName.  This problem seems to happen with any pod from
>> any project that happens to run on these newer nodes.  I examined the
>> output of `oc describe scc`, and I did not find any unexpected access to
>> elevated privileges for a default serviceaccount.  The project where I'm
>> currently seeing the problem is not mentioned at all.  Also, I've seen the
>> problem happen with pods that are managed by the same replication
>> controller.
>>
>> On Mon, Feb 6, 2017 at 12:46 PM, Clayton Coleman <ccole...@redhat.com>
>> wrote:
>>
>>> Adding the list back
>>>
>>> -- Forwarded message --
>>> From: Clayton Coleman <ccole...@redhat.com>
>>> Date: Mon, Feb 6, 2017 at 1:42 PM
>>> Subject: Re: Pods randomly running as root
>>> To: Alex Wauck <alexwa...@exosite.com>
>>> Cc: users <us...@redhat.com>
>>>
>>>
>>> Do the pods running as root and the ones not running as root have the
>>> same serviceAccountName field, or different ones?  If different, you may
>>> have granted the service account access to a higher role - defaulting is
>>> determined by the SCCs that a service account can access, so an
>>> admin-level service account will run as root by default unless you
>>> specify you don't want that.
>>>
>>> On Mon, Feb 6, 2017 at 1:37 PM, Alex Wauck <alexwa...@exosite.com>
>>> wrote:
>>>
>>>> I'm looking at two nodes where one has the problem and the other
>>>> doesn't, and I have confirmed that their node-config.yaml is the same for
>>>> both (modulo IP addresses).  The generated kubeconfigs for these nodes on
>>>> the master are also the same (modulo IP addresses and keys/certs).
>>>>
>>>> On Mon, Feb 6, 2017 at 10:46 AM, Alex Wauck <alexwa...@exosite.com>
>>>> wrote:
>>>>
>>>>> Oh, wait.  I was looking at the wrong section.  The non-root pod has a
>>>>> runAsUser attribute, but the root pod doesn't!
>>>>>
>>>>> On Mon, Feb 6, 2017 at 10:44 AM, Alex Wauck <alexwa...@exosite.com>
>>>>> wrote:
>>>>>
>>>>>> A pod that IS running as root has this:
>>>>>>
>>>>>>   securityContext:
>>>>>> fsGroup: 100037
>>>>>> seLinuxOptions:
>>>>>>   level: s0:c19,c14
>>>>>>
>>>>>> Another pod in the same project that is NOT running as root has the
>>>>>> exact same securityContext section.
>>>>>>
>>>>>> On Mon, Feb 6, 2017 at 10:25 AM, Clayton Coleman <ccole...@redhat.com
>>>>>> > wrote:
>>>>>>
>>>>>>> Do the pods themselves have a user UID set on them?  Each pod should
>>>>>>> have the container "securityContext" field set and have an explicit 
>>>>>>> user ID
>>>>>>> value

Re: Pods randomly running as root

2017-02-06 Thread Alex Wauck
OK, this just got a lot more interesting:

$ oc -n some-project exec app1-45-3blnd -- id
uid=0(root) gid=0(root)
groups=0(root),1(bin),2(daemon),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(dialout),26(tape),27(video),100037
$ oc -n some-project exec app2-18-q2fwm -- id
uid=100037 gid=0(root) groups=100037
$ oc -n some-project exec app3-10-lhato -- id
uid=0(root) gid=0(root)
groups=0(root),1(bin),2(daemon),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(dialout),26(tape),27(video),100037
$ oc -n some-project exec app4-16-dl2r7 -- id
uid=100037 gid=0(root) groups=100037
$ oc -n some-project exec app5-36-2rfsq -- id
uid=0(root) gid=0(root) groups=0(root),100037
$ oc -n some-project exec app6-15-078fd -- id
uid=0(root) gid=0(root) groups=0(root),100037

All of these pods are running on the same node, and as you can see, they
are in the same project.  Yet, some are running as root and some are not.
How weird is that?

On Mon, Feb 6, 2017 at 12:49 PM, Alex Wauck <alexwa...@exosite.com> wrote:

> $ oc export -n some-project pod/good-pod | grep serviceAccount
>   serviceAccount: default
>   serviceAccountName: default
> $ oc export -n some-project pod/bad-pod | grep serviceAccount
>   serviceAccount: default
>   serviceAccountName: default
>
> Same serviceAccountName.  This problem seems to happen with any pod from
> any project that happens to run on these newer nodes.  I examined the
> output of `oc describe scc`, and I did not find any unexpected access to
> elevated privileges for a default serviceaccount.  The project where I'm
> currently seeing the problem is not mentioned at all.  Also, I've seen the
> problem happen with pods that are managed by the same replication
> controller.
>
> On Mon, Feb 6, 2017 at 12:46 PM, Clayton Coleman <ccole...@redhat.com>
> wrote:
>
>> Adding the list back
>>
>> -- Forwarded message --
>> From: Clayton Coleman <ccole...@redhat.com>
>> Date: Mon, Feb 6, 2017 at 1:42 PM
>> Subject: Re: Pods randomly running as root
>> To: Alex Wauck <alexwa...@exosite.com>
>> Cc: users <us...@redhat.com>
>>
>>
>> Do the pods running as root and the ones not running as root have the
>> same serviceAccountName field, or different ones?  If different, you may
>> have granted the service account access to a higher role - defaulting is
>> determined by the SCCs that a service account can access, so an
>> admin-level service account will run as root by default unless you
>> specify you don't want that.
>>
>> On Mon, Feb 6, 2017 at 1:37 PM, Alex Wauck <alexwa...@exosite.com> wrote:
>>
>>> I'm looking at two nodes where one has the problem and the other
>>> doesn't, and I have confirmed that their node-config.yaml is the same for
>>> both (modulo IP addresses).  The generated kubeconfigs for these nodes on
>>> the master are also the same (modulo IP addresses and keys/certs).
>>>
>>> On Mon, Feb 6, 2017 at 10:46 AM, Alex Wauck <alexwa...@exosite.com>
>>> wrote:
>>>
>>>> Oh, wait.  I was looking at the wrong section.  The non-root pod has a
>>>> runAsUser attribute, but the root pod doesn't!
>>>>
>>>> On Mon, Feb 6, 2017 at 10:44 AM, Alex Wauck <alexwa...@exosite.com>
>>>> wrote:
>>>>
>>>>> A pod that IS running as root has this:
>>>>>
>>>>>   securityContext:
>>>>> fsGroup: 100037
>>>>> seLinuxOptions:
>>>>>   level: s0:c19,c14
>>>>>
>>>>> Another pod in the same project that is NOT running as root has the
>>>>> exact same securityContext section.
>>>>>
>>>>> On Mon, Feb 6, 2017 at 10:25 AM, Clayton Coleman <ccole...@redhat.com>
>>>>> wrote:
>>>>>
>>>>>> Do the pods themselves have a user UID set on them?  Each pod should
>>>>>> have the container "securityContext" field set and have an explicit user 
>>>>>> ID
>>>>>> value set.
>>>>>>
>>>>>> On Mon, Feb 6, 2017 at 11:23 AM, Alex Wauck <alexwa...@exosite.com>
>>>>>> wrote:
>>>>>>
>>>>>>> These are completely normal app containers.  They are managed by
>>>>>>> deploy configs.  Whether they run as root or not seems to depend on 
>>>>>>> which
>>>>>>> node they run on: the older nodes seem to run pods as random UIDs, while
>>>>>>> the newer ones run as root.  Our older nodes have docker-selinux-1.10.3

Re: Pods randomly running as root

2017-02-06 Thread Alex Wauck
$ oc export -n some-project pod/good-pod | grep serviceAccount
  serviceAccount: default
  serviceAccountName: default
$ oc export -n some-project pod/bad-pod | grep serviceAccount
  serviceAccount: default
  serviceAccountName: default

Same serviceAccountName.  This problem seems to happen with any pod from
any project that happens to run on these newer nodes.  I examined the
output of `oc describe scc`, and I did not find any unexpected access to
elevated privileges for a default serviceaccount.  The project where I'm
currently seeing the problem is not mentioned at all.  Also, I've seen the
problem happen with pods that are managed by the same replication
controller.

On Mon, Feb 6, 2017 at 12:46 PM, Clayton Coleman <ccole...@redhat.com>
wrote:

> Adding the list back
>
> -- Forwarded message --
> From: Clayton Coleman <ccole...@redhat.com>
> Date: Mon, Feb 6, 2017 at 1:42 PM
> Subject: Re: Pods randomly running as root
> To: Alex Wauck <alexwa...@exosite.com>
> Cc: users <us...@redhat.com>
>
>
> Do the pods running as root and the ones not running as root have the same
> serviceAccountName field, or different ones?  If different, you may have
> granted the service account access to a higher role - defaulting is
> determined by the SCCs that a service account can access, so an admin-level
> service account will run as root by default unless you specify you don't
> want that.
>
> On Mon, Feb 6, 2017 at 1:37 PM, Alex Wauck <alexwa...@exosite.com> wrote:
>
>> I'm looking at two nodes where one has the problem and the other doesn't,
>> and I have confirmed that their node-config.yaml is the same for both
>> (modulo IP addresses).  The generated kubeconfigs for these nodes on the
>> master are also the same (modulo IP addresses and keys/certs).
>>
>> On Mon, Feb 6, 2017 at 10:46 AM, Alex Wauck <alexwa...@exosite.com>
>> wrote:
>>
>>> Oh, wait.  I was looking at the wrong section.  The non-root pod has a
>>> runAsUser attribute, but the root pod doesn't!
>>>
>>> On Mon, Feb 6, 2017 at 10:44 AM, Alex Wauck <alexwa...@exosite.com>
>>> wrote:
>>>
>>>> A pod that IS running as root has this:
>>>>
>>>>   securityContext:
>>>> fsGroup: 100037
>>>> seLinuxOptions:
>>>>   level: s0:c19,c14
>>>>
>>>> Another pod in the same project that is NOT running as root has the
>>>> exact same securityContext section.
>>>>
>>>> On Mon, Feb 6, 2017 at 10:25 AM, Clayton Coleman <ccole...@redhat.com>
>>>> wrote:
>>>>
>>>>> Do the pods themselves have a user UID set on them?  Each pod should
>>>>> have the container "securityContext" field set and have an explicit user 
>>>>> ID
>>>>> value set.
>>>>>
>>>>> On Mon, Feb 6, 2017 at 11:23 AM, Alex Wauck <alexwa...@exosite.com>
>>>>> wrote:
>>>>>
>>>>>> These are completely normal app containers.  They are managed by
>>>>>> deploy configs.  Whether they run as root or not seems to depend on which
>>>>>> node they run on: the older nodes seem to run pods as random UIDs, while
>>>>>> the newer ones run as root.  Our older nodes have docker-selinux-1.10.3
>>>>>> installed, while the newer ones do not.  They only have
>>>>>> docker-selinux-1.9.1 available, since the 1.10.3 package seems to have 
>>>>>> been
>>>>>> removed from the CentOS extras repo.
>>>>>>
>>>>>> We are running OpenShift 1.2.1, since I haven't had time to upgrade
>>>>>> it.
>>>>>>
>>>>>> On Mon, Feb 6, 2017 at 8:31 AM, Clayton Coleman <ccole...@redhat.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Are you running them directly (launching a pod)?  Or running them
>>>>>>> under another controller resource.
>>>>>>>
>>>>>>> On Feb 6, 2017, at 2:00 AM, Alex Wauck <alexwa...@exosite.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Recently, I began to notice that some of my pods on OpenShift run as
>>>>>>> root instead of a random UID.  There does not seem to be any obvious 
>>>>>>> cause
>>>>>>> (e.g. SCC).  Any idea how this could happen or where to look for clues?
>>>>>>>
>>>>>>>
>


-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Pods randomly running as root

2017-02-06 Thread Alex Wauck
Oh, wait.  I was looking at the wrong section.  The non-root pod has a
runAsUser attribute, but the root pod doesn't!
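
(Concretely, the comparison was along these lines; pod names are placeholders:)

$ oc export -n some-project pod/<non-root-pod> | grep -A3 securityContext
$ oc export -n some-project pod/<root-pod> | grep -A3 securityContext
# the non-root pod shows runAsUser under the container-level securityContext;
# the root pod only has the pod-level fsGroup/seLinuxOptions block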

On Mon, Feb 6, 2017 at 10:44 AM, Alex Wauck <alexwa...@exosite.com> wrote:

> A pod that IS running as root has this:
>
>   securityContext:
> fsGroup: 100037
> seLinuxOptions:
>   level: s0:c19,c14
>
> Another pod in the same project that is NOT running as root has the exact
> same securityContext section.
>
> On Mon, Feb 6, 2017 at 10:25 AM, Clayton Coleman <ccole...@redhat.com>
> wrote:
>
>> Do the pods themselves have a user UID set on them?  Each pod should have
>> the container "securityContext" field set and have an explicit user ID
>> value set.
>>
>> On Mon, Feb 6, 2017 at 11:23 AM, Alex Wauck <alexwa...@exosite.com>
>> wrote:
>>
>>> These are completely normal app containers.  They are managed by deploy
>>> configs.  Whether they run as root or not seems to depend on which node
>>> they run on: the older nodes seem to run pods as random UIDs, while the
>>> newer ones run as root.  Our older nodes have docker-selinux-1.10.3
>>> installed, while the newer ones do not.  They only have
>>> docker-selinux-1.9.1 available, since the 1.10.3 package seems to have been
>>> removed from the CentOS extras repo.
>>>
>>> We are running OpenShift 1.2.1, since I haven't had time to upgrade it.
>>>
>>> On Mon, Feb 6, 2017 at 8:31 AM, Clayton Coleman <ccole...@redhat.com>
>>> wrote:
>>>
>>>> Are you running them directly (launching a pod)?  Or running them under
>>>> another controller resource.
>>>>
>>>> On Feb 6, 2017, at 2:00 AM, Alex Wauck <alexwa...@exosite.com> wrote:
>>>>
>>>> Recently, I began to notice that some of my pods on OpenShift run as
>>>> root instead of a random UID.  There does not seem to be any obvious cause
>>>> (e.g. SCC).  Any idea how this could happen or where to look for clues?
>>>>
>>>> --
>>>>
>>>> Alex Wauck // DevOps Engineer
>>>>
>>>> *E X O S I T E*
>>>> *www.exosite.com <http://www.exosite.com/>*
>>>>
>>>> Making Machines More Human.
>>>>
>>>> ___
>>>> users mailing list
>>>> users@lists.openshift.redhat.com
>>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> Alex Wauck // DevOps Engineer
>>>
>>> *E X O S I T E*
>>> *www.exosite.com <http://www.exosite.com/>*
>>>
>>> Making Machines More Human.
>>>
>>>
>>
>
>
> --
>
> Alex Wauck // DevOps Engineer
>
> *E X O S I T E*
> *www.exosite.com <http://www.exosite.com/>*
>
> Making Machines More Human.
>
>


-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Pods randomly running as root

2017-02-06 Thread Alex Wauck
A pod that IS running as root has this:

  securityContext:
fsGroup: 100037
seLinuxOptions:
  level: s0:c19,c14

Another pod in the same project that is NOT running as root has the exact
same securityContext section.

On Mon, Feb 6, 2017 at 10:25 AM, Clayton Coleman <ccole...@redhat.com>
wrote:

> Do the pods themselves have a user UID set on them?  Each pod should have
> the container "securityContext" field set and have an explicit user ID
> value set.
>
> On Mon, Feb 6, 2017 at 11:23 AM, Alex Wauck <alexwa...@exosite.com> wrote:
>
>> These are completely normal app containers.  They are managed by deploy
>> configs.  Whether they run as root or not seems to depend on which node
>> they run on: the older nodes seem to run pods as random UIDs, while the
>> newer ones run as root.  Our older nodes have docker-selinux-1.10.3
>> installed, while the newer ones do not.  They only have
>> docker-selinux-1.9.1 available, since the 1.10.3 package seems to have been
>> removed from the CentOS extras repo.
>>
>> We are running OpenShift 1.2.1, since I haven't had time to upgrade it.
>>
>> On Mon, Feb 6, 2017 at 8:31 AM, Clayton Coleman <ccole...@redhat.com>
>> wrote:
>>
>>> Are you running them directly (launching a pod)?  Or running them under
>>> another controller resource.
>>>
>>> On Feb 6, 2017, at 2:00 AM, Alex Wauck <alexwa...@exosite.com> wrote:
>>>
>>> Recently, I began to notice that some of my pods on OpenShift run as
>>> root instead of a random UID.  There does not seem to be any obvious cause
>>> (e.g. SCC).  Any idea how this could happen or where to look for clues?
>>>
>>> --
>>>
>>> Alex Wauck // DevOps Engineer
>>>
>>> *E X O S I T E*
>>> *www.exosite.com <http://www.exosite.com/>*
>>>
>>> Making Machines More Human.
>>>
>>> ___
>>> users mailing list
>>> users@lists.openshift.redhat.com
>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>>
>>>
>>
>>
>> --
>>
>> Alex Wauck // DevOps Engineer
>>
>> *E X O S I T E*
>> *www.exosite.com <http://www.exosite.com/>*
>>
>> Making Machines More Human.
>>
>>
>


-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Pods randomly running as root

2017-02-05 Thread Alex Wauck
Recently, I began to notice that some of my pods on OpenShift run as root
instead of a random UID.  There does not seem to be any obvious cause (e.g.
SCC).  Any idea how this could happen or where to look for clues?

-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: How can I scale up the number of etcd host ?

2017-01-12 Thread Alex Wauck
Are you using the built-in OpenShift etcd on that one node, or are you
using real etcd?  We're currently using the built-in OpenShift etcd on our
one master node, and we really want to switch to multiple nodes.

On Tue, Jan 10, 2017 at 3:01 PM, Stéphane Klein <cont...@stephane-klein.info
> wrote:

> 2017-01-10 20:17 GMT+01:00 Scott Dodson <sdod...@redhat.com>:
>
>> openshift-ansible doesn't currently provide this, there's an issue
>> requesting it https://github.com/openshift/openshift-ansible/issues/1772
>> which links to a blog post describing how to do it, though I've not
>> validated that myself.
>
>
>
> Thanks.
>
> > I'm curious what's your current etcd cluster size and what's leading
> > you to resize it?
>
> 1 to 2
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
>


-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Best way to use oc client

2016-12-13 Thread Alex Wauck
My understanding is that you don't need access to the kube config or
anything like that as long as you stay away from the "oc adm" subcommand,
and "oadm" is the same as "oc adm".  Anything else in "oc" should be fine.

Mind you, I'm just a user, not a developer, but I have been using OpenShift
for nearly a year now and have yet to see anything that would suggest that
I'm wrong about this.
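
In practice, the split looks something like this (hostname and usernames are
placeholders):

$ oc login https://openshift.example.com:8443 --username=dev-user   # ordinary user login; no kubeconfig from the master needed
$ oc get pods                                                        # day-to-day commands work fine with that login
$ oadm policy add-cluster-role-to-user cluster-admin some-user       # admin-only territory; keep this off shared hosts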

On Tue, Dec 13, 2016 at 10:01 AM, John Mazzitelli <m...@redhat.com> wrote:

> > I use oc from a utility server that’s not part of the cluster, which any
> > developer can access. We keep oadm on a master openshift host, which is
> only
> > accessible by openshift admins. I don’t believe oc needs access to the
> kube
> > config, or at least haven’t hit any commands for it yet. Oadm does though
> > which is why we keep it on the master.
>
> But doesn't "oc" also have admin capabilities? I thought "oadm == oc
> admin" ??
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>



-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: error during install: subnet id does not exist

2016-11-18 Thread Alex Wauck
Upon re-reading that page, I see that it does, in fact, only ask for subnet
ID.  I could have sworn there were more VPC related environment variables.
Well, I'm afraid I can't help you; that error message makes no sense to
me.  The only thing I can think of is that maybe you have multiple AWS
accounts and are using the wrong one by mistake.
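
If it helps, a couple of sanity checks (assuming you have the AWS CLI
configured with the same credentials and region the installer is using):

$ aws sts get-caller-identity                            # shows which account you're actually hitting
$ aws ec2 describe-subnets --subnet-ids subnet-c7372dfd  # should succeed if the account and region match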

I wish I could be more helpful, but that's all I've got.

On Thu, Nov 17, 2016 at 6:40 PM, Ravi <ravikapoor...@gmail.com> wrote:

>
> It only asks for VPC Subnet id. I have set that and that is giving
> trouble. No place to put VPC id itself.
>
>
> On 11/17/2016 1:06 PM, Alex Wauck wrote:
>
>>
>> On Thu, Nov 17, 2016 at 2:52 PM, Ravi Kapoor <ravikapoor...@gmail.com
>> <mailto:ravikapoor...@gmail.com>> wrote:
>>
>> The instructions do not ask for availability zone or VPC. They only
>> ask for a subnet and I have specified that.
>> Maybe it is picking some other VPC where the subnet is not available.
>>
>>
>> Have you read this?
>>  https://github.com/openshift/openshift-ansible/blob/master/README_AWS.md
>>
>> According to that document, you are supposed to specify a whole bunch of
>> stuff in environment variables, including the VPC.  I've never tried it
>> myself, so I'm not sure how well it works.
>>
>
>


-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: error during install: subnet id does not exist

2016-11-17 Thread Alex Wauck
On Thu, Nov 17, 2016 at 2:52 PM, Ravi Kapoor 
wrote:

> The instructions do not ask for availability zone or VPC. They only ask
> for a subnet and I have specified that.
> Maybe it is picking some other VPC where the subnet is not available.
>

Have you read this?
https://github.com/openshift/openshift-ansible/blob/master/README_AWS.md

According to that document, you are supposed to specify a whole bunch of
stuff in environment variables, including the VPC.  I've never tried it
myself, so I'm not sure how well it works.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: error during install: subnet id does not exist

2016-11-17 Thread Alex Wauck
On Thu, Nov 17, 2016 at 12:09 PM, Ravi Kapoor <ravikapoor...@gmail.com>
wrote:

> Question1: Is this the best way to install? So far I have been using "oc
> cluster up"; while it works, it crashes once in a while (at least the UI
> crashes, so I am forced to restart it, which kills all pods)
>

We used openshift-ansible to install our OpenShift cluster, and we fairly
regularly use it to create temporary clusters for testing purposes.  I
would consider it the best way to install.
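
Roughly, our flow looks like this (we provision the EC2 instances ourselves
first; the playbook path is from the openshift-ansible layout of that era, so
double-check it against your checkout):

$ git clone https://github.com/openshift/openshift-ansible.git
$ cd openshift-ansible
$ ansible-playbook -i /path/to/your/inventory playbooks/byo/config.yml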


> Question2:
> After I did all the configurations, my install still fails with the following
> error:
>
> exception occurred during task execution. To see the full traceback, use
> -vvv. The error was: InvalidSubnetID.NotFound: The subnet ID
> 'subnet-c7372dfd' does not exist (RequestID:
> 2b4d4256-7204-4ced-9af3-318d86a759f0)
>

Are you using openshift-ansible's AWS support to create EC2 instances for
you?  We create our instances by other means and then run openshift-ansible
on them using the BYO playbooks, so I'm not familiar with
openshift-ansible's AWS support.  Do you have the availability zone or VPC
set in your inventory file?  If so, does it match the subnet you specified?



-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Node triggered evacuation

2016-10-28 Thread Alex Wauck
Non-master nodes don't seem to have any built-in privileges.  I suggest
creating a service account with the necessary permissions and using its
token to perform actions on the nodes.
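
A rough sketch of what I mean (names are placeholders, and cluster-admin is far
broader than you really need, so pick a narrower role if you can):

$ oc create serviceaccount evacuator -n default
$ oadm policy add-cluster-role-to-user cluster-admin system:serviceaccount:default:evacuator
# from the node itself, pass that service account's token explicitly
# (assuming the node's registered name matches $(hostname)):
$ oadm manage-node $(hostname) --schedulable=false --token=<sa-token>
$ oadm manage-node $(hostname) --evacuate --token=<sa-token>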

On Fri, Oct 28, 2016 at 1:07 AM, Andrew Lau <and...@andrewklau.com> wrote:

> Hi,
>
> Is there any facility to trigger a node evacuation from within the node?
> eg. if we are in the console of a particular node or the node receives a
> signal (eg. spot termination notice).
>
> Thanks
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
>


-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Metrics - Could not connect to Cassandra cluster

2016-10-28 Thread Alex Wauck
This happened to us.  The problem is probably that you have your metrics
replication controllers set to pull the latest versions of the images.  (I
think this is the default.  Bad!)  The current latest version needs
different configuration, so your existing configuration no longer works.
You probably had this problem for a long time but didn't notice until some
component of the system restarted for some reason, triggering a new image
pull.

We fixed this by changing the images specified in the replication
controllers.  For example, in rc/hawkular-metrics, we changed

image: openshift/origin-metrics-hawkular-metrics:latest

to

image: openshift/origin-metrics-hawkular-metrics:v1.2.1
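
Concretely, that was just an in-place edit of each RC (a sketch; our metrics
components live in the openshift-infra project, yours may differ):

$ oc -n openshift-infra edit rc/hawkular-metrics
# change  image: openshift/origin-metrics-hawkular-metrics:latest
# to      image: openshift/origin-metrics-hawkular-metrics:v1.2.1
# then repeat for the hawkular-cassandra and heapster RCs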

While I was debugging, I restarted hawkular-cassandra, so it got upgraded,
too.  I don't know if it had already gotten upgraded; if yours hasn't, then
you can avoid losing data.  So, I had to set the :v1.2.1 tag on all three
components (hawkular-cassandra, hawkular-metrics, and heapster) and also
delete all data (both the data directory and the commitlog directory) on
the hawkular-cassandra PV.  In order to delete that data, I had to find the
mountpoint on the node where the hawkular-cassandra pod was running and
delete the files from the host side.  Because hawkular-cassandra was
failing, I was unable to use `oc rsh` to get in.

On Sat, Oct 22, 2016 at 2:32 PM, Miloslav Vlach <miloslav.vl...@rohlik.cz>
wrote:

> Hi,
>
> I don’t know why one server has a problem connecting to the Cassandra
> database.
>
> The hawkular pod writes:
>
> 19:27:15,354 WARN [org.hawkular.alerts.engine.impl.CassCluster]
> (ServerService Thread Pool -- 75) Could not connect to Cassandra cluster -
> assuming is not up yet. Cause: 
> com.datastax.driver.core.exceptions.NoHostAvailableException:
> All host(s) tried for query failed (tried: /127.0.0.1:9042
> (com.datastax.driver.core.exceptions.TransportException: [/127.0.0.1]
> Cannot connect))
>
>
> But the endpoint is not 127.0.0.1:9042
>
> On the other server outside cluster
>
> 19:26:54,909 WARN [org.hawkular.metrics.api.jaxrs.MetricsServiceLifecycle]
> (metricsservice-lifecycle-thread) HAWKMETRICS23: Could not connect to
> Cassandra cluster - assuming its not up yet: All host(s) tried for query
> failed (tried: hawkular-cassandra/172.30.155.228:9042
> (com.datastax.driver.core.exceptions.TransportException:
> [hawkular-cassandra/172.30.155.228] Cannot connect))
>
> but after a few seconds it connects to Cassandra.
>
> Does somebody know where the problem is?
>
> Installation was performed via Ansible. Everything worked before the restart.
>
> Thanks Mila
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
>


-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: docker version doubt

2016-09-20 Thread Alex Wauck
I've seen the same thing myself.  It seems to cause some bad interactions
with image stream tags (i.e. sha256-based references result in pull
failures), but on the plus side, you can use all those images on Docker Hub
that were pushed with 1.10 or later.  On balance, I'd say it solves more
problems than it creates.  We're running our production OpenShift cluster
with 1.10, and it's worked out pretty well for us.

On Tue, Sep 20, 2016 at 3:50 AM, Julio Saura <jsa...@hiberus.com> wrote:

> Hello
>
> I am installing a brand new OpenShift Origin cluster on CentOS 7.
>
> After installing the Docker engine (from the CentOS repo), I checked the
> version and I am concerned about the result:
>
> docker --version
> Docker version 1.10.3, build cb079f6-unsupported
>
> unsupported¿?
>
> is this normal?
>
> thanks
>
>
>
>
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>



-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Bad Elasticsearch queries

2016-08-10 Thread Alex Wauck
When I click the "View archive" link for a pod's logs, I get a Kibana page
with a query like this:

kubernetes_pod_name: some-pod-name && kubernetes_namespace_name:
random-namespace

Am I missing something, or should it instead be this:

kubernetes_pod_name: "some-pod-name" && kubernetes_namespace_name:
"random-namespace"

Seems like a bug to me.  I noticed this after clicking the "View archive"
link for a build and getting a lot of log messages from random other pods
in other namespaces.  I guess the current way works fine if you don't have
hyphens in any names anywhere.

-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: What actually is required for DNS and Origin?

2016-07-21 Thread Alex Wauck
On Thu, Jul 21, 2016 at 3:29 PM, Josh Berkus <jber...@redhat.com> wrote:

> There is no external DNS server, here.  I'm talking about a portable
> microcluster, a stack of microboard computers, self-contained.  The idea
> would be to run some kind of local DNS server so that, on directly
> connected machines, we could point to that in DNS and it would expose
> the services.
>
> I suppose I can just bootstrap that, maybe as a system container ...
>

If it's a bunch of microboard computers, I'd be tempted to just stick one
more in there and run BIND on it.  Are you running a DHCP server, or are
all IP addresses statically assigned?

-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: What actually is required for DNS and Origin?

2016-07-21 Thread Alex Wauck
On Thu, Jul 21, 2016 at 2:32 PM, Aleksandar Kostadinov <akost...@redhat.com>
wrote:

> Two things as listed in the doc. One is to have hostnames of masters and
> slaves resolvable over the configured DNS servers.
>

If you're on AWS, this is taken care of for you.  Your masters and slaves
and whatnot will all be referred to by their internal DNS names (e.g.
ip-172-31-33-101.us-west-1.compute.internal), so this aspect will just
work, even if you set up the EC2 instances yourself and use the BYO
playbooks.


> The other thing listed as "optional" is having a wildcard record(s) for
> the routes exposed to services in OpenShift. This subdomain also needs to
> be configured in master's config file.
>

I highly recommend this.  It makes it very quick and easy to set up new
services with valid DNS records.  Also, get a wildcard SSL certificate if
you can afford it.  You can configure the router to automatically use that
certificate for any service that doesn't specify one.
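
As a concrete example (hostnames are placeholders, and the config path assumes
a standard RPM/Ansible install):

$ grep -A1 routingConfig /etc/origin/master/master-config.yaml
routingConfig:
  subdomain: apps.example.com
# plus a wildcard DNS record pointing at the router node(s), e.g.
#   *.apps.example.com.  300  IN  A  <router node IP>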

-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: logging-es errors: shards failed

2016-07-15 Thread Alex Wauck
I also tried to fetch the logs from our logging-ops ES instance.  That also
met with failure.  Searching for "kubernetes_namespace_name: logging" there
led to "No results found".

On Fri, Jul 15, 2016 at 2:48 PM, Peter Portante <pport...@redhat.com> wrote:

> Well, we don't send ES logs to itself.  I think you can create a
> feedback loop that breaks the whole thing down.
> -peter
>
> On Fri, Jul 15, 2016 at 3:39 PM, Luke Meyer <lme...@redhat.com> wrote:
> > They surely do. Although it would probably be easiest here to just get
> them
> > from `oc logs` against the ES pod, especially if we can't trust ES
> storage.
> >
> > On Fri, Jul 15, 2016 at 3:26 PM, Peter Portante <pport...@redhat.com>
> wrote:
> >>
> >> Eric, Luke,
> >>
> >> Do the logs from the ES instance itself flow into that ES instance?
> >>
> >> -peter
> >>
> >> On Fri, Jul 15, 2016 at 12:14 PM, Alex Wauck <alexwa...@exosite.com>
> >> wrote:
> >> > I'm not sure that I can.  I clicked the "Archive" link for the
> >> > logging-es
> >> > pod and then changed the query in Kibana to
> "kubernetes_container_name:
> >> > logging-es-cycd8veb && kubernetes_namespace_name: logging".  I got no
> >> > results, instead getting this error:
> >> >
> >> > Index:
> unrelated-project.92c37428-11f6-11e6-9c83-020b5091df01.2016.07.12
> >> > Shard: 2 Reason: EsRejectedExecutionException[rejected execution
> (queue
> >> > capacity 1000) on
> >> >
> >> >
> org.elasticsearch.search.action.SearchServiceTransportAction$23@6b1f2699]
> >> > Index:
> unrelated-project.92c37428-11f6-11e6-9c83-020b5091df01.2016.07.14
> >> > Shard: 2 Reason: EsRejectedExecutionException[rejected execution
> (queue
> >> > capacity 1000) on
> >> >
> >> >
> org.elasticsearch.search.action.SearchServiceTransportAction$23@66b9a5fb]
> >> > Index:
> unrelated-project.92c37428-11f6-11e6-9c83-020b5091df01.2016.07.15
> >> > Shard: 2 Reason: EsRejectedExecutionException[rejected execution
> (queue
> >> > capacity 1000) on
> >> >
> org.elasticsearch.search.action.SearchServiceTransportAction$23@512820e]
> >> > Index:
> unrelated-project.f38ac6ff-3e42-11e6-ab71-020b5091df01.2016.06.29
> >> > Shard: 2 Reason: EsRejectedExecutionException[rejected execution
> (queue
> >> > capacity 1000) on
> >> >
> >> >
> org.elasticsearch.search.action.SearchServiceTransportAction$23@3dce96b9]
> >> > Index:
> unrelated-project.f38ac6ff-3e42-11e6-ab71-020b5091df01.2016.06.30
> >> > Shard: 2 Reason: EsRejectedExecutionException[rejected execution
> (queue
> >> > capacity 1000) on
> >> >
> >> >
> org.elasticsearch.search.action.SearchServiceTransportAction$23@2f774477]
> >> >
> >> > When I initially clicked the "Archive" link, I saw a lot of messages
> >> > with
> >> > the kubernetes_container_name "logging-fluentd", which is not what I
> >> > expected to see.
> >> >
> >> >
> >> > On Fri, Jul 15, 2016 at 10:44 AM, Peter Portante <pport...@redhat.com
> >
> >> > wrote:
> >> >>
> >> >> Can you go back further in the logs to the point where the errors
> >> >> started?
> >> >>
> >> >> I am thinking about possible Java HEAP issues, or possibly ES
> >> >> restarting for some reason.
> >> >>
> >> >> -peter
> >> >>
> >> >> On Fri, Jul 15, 2016 at 11:37 AM, Lukáš Vlček <lvl...@redhat.com>
> >> >> wrote:
> >> >> > Also looking at this.
> >> >> > Alex, is it possible to investigate if you were having some kind of
> >> >> > network connection issues in the ES cluster (I mean between
> >> >> > individual
> >> >> > cluster nodes)?
> >> >> >
> >> >> > Regards,
> >> >> > Lukáš
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> >> On 15 Jul 2016, at 17:08, Peter Portante <pport...@redhat.com>
> >> >> >> wrote:
> >> >> >>
> >> >> >> Just catching up on the thread, will get back to you all in a few
> >> >

Re: logging-es errors: shards failed

2016-07-15 Thread Alex Wauck
I'm not sure that I can.  I clicked the "Archive" link for the logging-es
pod and then changed the query in Kibana to "kubernetes_container_name:
logging-es-cycd8veb && kubernetes_namespace_name: logging".  I got no
results, instead getting this error:


   - *Index:*
unrelated-project.92c37428-11f6-11e6-9c83-020b5091df01.2016.07.12
   *Shard:* 2 *Reason:* EsRejectedExecutionException[rejected execution
   (queue capacity 1000) on
   org.elasticsearch.search.action.SearchServiceTransportAction$23@6b1f2699]
   - *Index:*
unrelated-project.92c37428-11f6-11e6-9c83-020b5091df01.2016.07.14
   *Shard:* 2 *Reason:* EsRejectedExecutionException[rejected execution
   (queue capacity 1000) on
   org.elasticsearch.search.action.SearchServiceTransportAction$23@66b9a5fb]
   - *Index:*
unrelated-project.92c37428-11f6-11e6-9c83-020b5091df01.2016.07.15
   *Shard:* 2 *Reason:* EsRejectedExecutionException[rejected execution
   (queue capacity 1000) on
   org.elasticsearch.search.action.SearchServiceTransportAction$23@512820e]
   - *Index:*
unrelated-project.f38ac6ff-3e42-11e6-ab71-020b5091df01.2016.06.29
   *Shard:* 2 *Reason:* EsRejectedExecutionException[rejected execution
   (queue capacity 1000) on
   org.elasticsearch.search.action.SearchServiceTransportAction$23@3dce96b9]
   - *Index:*
unrelated-project.f38ac6ff-3e42-11e6-ab71-020b5091df01.2016.06.30
   *Shard:* 2 *Reason:* EsRejectedExecutionException[rejected execution
   (queue capacity 1000) on
   org.elasticsearch.search.action.SearchServiceTransportAction$23@2f774477]

When I initially clicked the "Archive" link, I saw a lot of messages with
the kubernetes_container_name "logging-fluentd", which is not what I
expected to see.


On Fri, Jul 15, 2016 at 10:44 AM, Peter Portante <pport...@redhat.com>
wrote:

> Can you go back further in the logs to the point where the errors started?
>
> I am thinking about possible Java HEAP issues, or possibly ES
> restarting for some reason.
>
> -peter
>
> On Fri, Jul 15, 2016 at 11:37 AM, Lukáš Vlček <lvl...@redhat.com> wrote:
> > Also looking at this.
> > Alex, is it possible to investigate if you were having some kind of
> network connection issues in the ES cluster (I mean between individual
> cluster nodes)?
> >
> > Regards,
> > Lukáš
> >
> >
> >
> >
> >> On 15 Jul 2016, at 17:08, Peter Portante <pport...@redhat.com> wrote:
> >>
> >> Just catching up on the thread, will get back to you all in a few ...
> >>
> >> On Fri, Jul 15, 2016 at 10:08 AM, Eric Wolinetz <ewoli...@redhat.com>
> wrote:
> >>> Adding Lukas and Peter
> >>>
> >>> On Fri, Jul 15, 2016 at 8:07 AM, Luke Meyer <lme...@redhat.com> wrote:
> >>>>
> >>>> I believe the "queue capacity" there is the number of parallel
> searches
> >>>> that can be queued while the existing search workers operate. It
> sounds like
> >>>> it has plenty of capacity there and it has a different reason for
> rejecting
> >>>> the query. I would guess the data requested is missing given it
> couldn't
> >>>> fetch shards it expected to.
> >>>>
> >>>> The number of shards is a multiple (for redundancy) of the number of
> >>>> indices, and there is an index created per project per day. So even
> for a
> >>>> small cluster this doesn't sound out of line.
> >>>>
> >>>> Can you give a little more information about your logging deployment?
> Have
> >>>> you deployed multiple ES nodes for redundancy, and what are you using
> for
> >>>> storage? Could you attach full ES logs? How many OpenShift nodes and
> >>>> projects do you have? Any history of events that might have resulted
> in lost
> >>>> data?
> >>>>
> >>>> On Thu, Jul 14, 2016 at 4:06 PM, Alex Wauck <alexwa...@exosite.com>
> wrote:
> >>>>>
> >>>>> When doing searches in Kibana, I get error messages similar to
> "Courier
> >>>>> Fetch: 919 of 2020 shards failed".  Deeper inspection reveals errors
> like
> >>>>> this: "EsRejectedExecutionException[rejected execution (queue
> capacity 1000)
> >>>>> on
> >>>>>
> org.elasticsearch.search.action.SearchServiceTransportAction$23@14522b8e
> ]".
> >>>>>
> >>>>> A bit of investigation led me to conclude that our Elasticsearch
> server
> >>>>> was not sufficiently powerful, but I spun up a new one with four
> times the
> >&

logging-es errors: shards failed

2016-07-14 Thread Alex Wauck
When doing searches in Kibana, I get error messages similar to "Courier
Fetch: 919 of 2020 shards failed".  Deeper inspection reveals errors like
this: "EsRejectedExecutionException[rejected execution (queue capacity
1000) on
org.elasticsearch.search.action.SearchServiceTransportAction$23@14522b8e]".

A bit of investigation led me to conclude that our Elasticsearch server
was not sufficiently powerful, but I spun up a new one with four times the
CPU and RAM of the original one, but the queue capacity is still only
1000.  Also, 2020 seems like a really ridiculous number of shards.  Any
idea what's going on here?
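
In case it helps with diagnosis, here is roughly how I have been poking at
the cluster (pod name is a placeholder, this assumes curl is available in
the image, and since our ES sits behind searchguard you may need https plus
the admin cert/key from the secret mounted in the pod rather than plain
http):

$ oc exec -n logging logging-es-abc123-1-xyz12 -- \
    curl -s localhost:9200/_cat/shards | wc -l
$ oc exec -n logging logging-es-abc123-1-xyz12 -- \
    curl -s 'localhost:9200/_cat/thread_pool?v'

The first number should line up with the shard count Kibana complains
about, and the thread_pool output shows each node's search queue and
rejected counts.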

-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Preview Openshift 3 Pod Failure, System error

2016-07-13 Thread Alex Wauck
Yeah, it took us a while to figure it out, too.

On Wed, Jul 13, 2016 at 9:13 AM, Skarbek, John <john.skar...@ca.com> wrote:

> That never even occurred to me. Thank you sir.
>
>
>
> --
> John Skarbek
>
> On July 13, 2016 at 09:18:44, Alex Wauck (alexwa...@exosite.com) wrote:
>
> Your Dockerfile clobbers /run in the image.  That leads to bad things.
> Don't feel bad; we made the same mistake.
>
> On Wed, Jul 13, 2016 at 7:06 AM, Skarbek, John <john.skar...@ca.com>
> wrote:
>
>> Good Morning,
>>
>> I was messing around with a random quick application on the preview of
>> openshift 3 online. I ran into this in the log of a container that won’t
>> start:
>>
>> Timestamp: 2016-07-13 11:49:38.160398231 + UTC
>> Code: System error
>>
>> Message: lstat 
>> /var/lib/docker/devicemapper/mnt/704986103e760820b33944aaf09c2210b07e6b89f158f2053f3782307de89846/rootfs/run/secrets:
>>  not a directory
>>
>> Frames:
>> ---
>> 0: setupRootfs
>> Package: github.com/opencontainers/runc/libcontainer 
>> File: rootfs_linux.go@40
>> ---
>> 1: Init
>> Package: github.com/opencontainers/runc/libcontainer.(*linuxStandardInit) 
>> File: standard_init_linux.go@57
>> ---
>> 2: StartInitialization
>> Package: github.com/opencontainers/runc/libcontainer.(*LinuxFactory) 
>> File: factory_linux.go@242
>> ---
>> 3: initializer
>> Package: github.com/docker/docker/daemon/execdriver/native 
>> File: init.go@35
>> ---
>> 4: Init
>> Package: github.com/docker/docker/pkg/reexec 
>> File: reexec.go@26
>> ---
>> 5: main
>> Package: main
>> File: docker.go@18
>> ---
>> 6: main
>> Package: runtime
>> File: proc.go@63
>> ---
>> 7: goexit
>> Package: runtime
>> File: asm_amd64.s@2232
>>
>> The pod remains in a Crashloop. I fear something might be wrong with the
>> ability handle secrets. Despite me not using any…
>>
>> For reference here’s my quick and nasty docker image:
>> https://hub.docker.com/r/jtslear/command-check/
>>
>>
>> --
>> John Skarbek
>>
>> _______
>> users mailing list
>> users@lists.openshift.redhat.com
>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>
>>
>
>
> --
>
> Alex Wauck // DevOps Engineer
>
> *E X O S I T E*
> *www.exosite.com <http://www.exosite.com/>*
>
>
> Making Machines More Human.
>
>


-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Preview Openshift 3 Pod Failure, System error

2016-07-13 Thread Alex Wauck
Your Dockerfile clobbers /run in the image, so Docker can no longer set up
the /run/secrets mount it needs (that's the lstat "not a directory" error
in your log).  Don't feel bad; we made the same mistake.
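
If it's the usual cause, the fix is just to stop replacing /run wholesale.
A made-up sketch of the pattern to avoid and a safer variant (your actual
Dockerfile lines will differ):

# bad: if ./run is a file, /run stops being a directory and the
# /run/secrets mount can't be created
COPY run /run

# better: copy your files somewhere that isn't a runtime mount point
COPY run /opt/app/run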

On Wed, Jul 13, 2016 at 7:06 AM, Skarbek, John <john.skar...@ca.com> wrote:

> Good Morning,
>
> I was messing around with a random quick application on the preview of
> openshift 3 online. I ran into this in the log of a container that won’t
> start:
>
> Timestamp: 2016-07-13 11:49:38.160398231 + UTC
> Code: System error
>
> Message: lstat 
> /var/lib/docker/devicemapper/mnt/704986103e760820b33944aaf09c2210b07e6b89f158f2053f3782307de89846/rootfs/run/secrets:
>  not a directory
>
> Frames:
> ---
> 0: setupRootfs
> Package: github.com/opencontainers/runc/libcontainer
> File: rootfs_linux.go@40
> ---
> 1: Init
> Package: github.com/opencontainers/runc/libcontainer.(*linuxStandardInit)
> File: standard_init_linux.go@57
> ---
> 2: StartInitialization
> Package: github.com/opencontainers/runc/libcontainer.(*LinuxFactory)
> File: factory_linux.go@242
> ---
> 3: initializer
> Package: github.com/docker/docker/daemon/execdriver/native
> File: init.go@35
> ---
> 4: Init
> Package: github.com/docker/docker/pkg/reexec
> File: reexec.go@26
> ---
> 5: main
> Package: main
> File: docker.go@18
> ---
> 6: main
> Package: runtime
> File: proc.go@63
> ---
> 7: goexit
> Package: runtime
> File: asm_amd64.s@2232
>
> The pod remains in a Crashloop. I fear something might be wrong with the
> ability handle secrets. Despite me not using any…
>
> For reference here’s my quick and nasty docker image:
> https://hub.docker.com/r/jtslear/command-check/
>
>
>
> --
> John Skarbek
>
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
>


-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Adding nodes to existing origin 1.2 cluster

2016-07-12 Thread Alex Wauck
I see that your [OSEv3:children] section does not contain new_nodes.  Maybe
try adding that?  Mine contains masters, nodes, and new_nodes (we're using
built-in etcd right now).
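
In other words, the top of the inventory should look something like this
(only the children list matters here):

[OSEv3:children]
masters
nodes
etcd
new_nodes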

On Tue, Jul 12, 2016 at 1:27 PM, Den Cowboy <dencow...@hotmail.com> wrote:

> I try to add nodes to our v1.2 cluster:
> I added the new_nodes section to my /etc/ansible/host
>
> [OSEv3:children]
> masters
> nodes
> etcd
>
> # Set variables common for all OSEv3 hosts
> [OSEv3:vars]
> ansible_ssh_user=root
> deployment_type=origin
> openshift_pkg_version=-1.2.0-4.el7
>
>
> # uncomment the following to enable htpasswd authentication; defaults to
> DenyAllPasswordIdentityProvider
> openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login':
> 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider',
> 'filename': '/etc/origin/master/htpasswd'}]
>
>
> # host group for masters
> [masters]
> master
>
>
> # host group for etcd
> [etcd]
> master
>
> # host group for nodes, includes region info
> [nodes]
> node1 openshift_node_labels="{'xx'}"
> master openshift_node_labels="{'xx}"
>
> [new_nodes]
> node2 openshift_node_labels="{'xx}"
> node3 openshift_node_labels="{'xx}"
> node4 openshift_node_labels="{'xx}"
> ~
>
>
> I execute:
> ansible-playbook
> ~/openshift-ansible/playbooks/byo/openshift-node/scaleup.yml
>
> It seems to start fine but pretty fast I get the following error:
> TASK [openshift_facts : Gather Cluster facts and set is_containerized if
> needed] ***
> fatal: [node2]: FAILED! => {"failed": true, "msg": "{{ deployment_type }}:
> 'deployment_type' is undefined"}
> fatal: [node3]: FAILED! => {"failed": true, "msg": "{{ deployment_type }}:
> 'deployment_type' is undefined"}
> fatal: [node4]: FAILED! => {"failed": true, "msg": "{{ deployment_type }}:
> 'deployment_type' is undefined"}
>
> But the deployment_type is described in my /etc/ansible/hosts file? Also
> the first deployment( some weeks ago) went well.
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
>


-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: the ovs-multitenant SDN plug-in

2016-07-11 Thread Alex Wauck
You shouldn't have to.  We didn't.
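
If you want to double-check which plugin is actually in effect, one thing
that I believe works is looking for the per-project NetNamespace objects
that only the multitenant plugin creates:

$ oc get netnamespaces

If that returns a list with a NETID per project, multitenant is active;
with the subnet plugin the list should come back empty.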

On Mon, Jul 11, 2016 at 2:58 AM, Den Cowboy <dencow...@hotmail.com> wrote:

> Hi,
>
> I tried to change my SDN configuration from subnet to multitenant.
> I deleted the section on my nodes and changed in on my master and
> restarted my master.
>
> 1. Is there a way to check somewhere this plugin is used?
>
> After that I tried to join 2 projects (I want to connect 2 pods without
> going outside my OpenShift environment (so without using routes but using
> services (names)).
>
> $ oadm pod-network join-projects --to=test1 test2
>
> Project test1 contains an app. Project test2 contains my mysql database.
> I used as env variable for my app: mysql=mysql.test2
> This works fine. So it's using the right database (which is in another
> project).
> Now is my question?
> I changed the configuration now. Do I have to recreate all my old
> projects/restart all my pods or?
>
> Thanks
>
>
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
>


-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Determine when a deployment finishes programmatically

2016-07-08 Thread Alex Wauck
On Fri, Jul 8, 2016 at 12:25 PM, Rodolfo Carvalho <rcarv...@redhat.com>
wrote:

> @Alex, when using `oc get --watch`, probably you want to combine that with
> an alternative output format, like json, jsonpath or template.
> Then act upon every new output.
>

That will probably be OK.  I see that using a different output format tells
me how many replicas I should expect, so I can just wait for
status.replicas to match spec.replicas.
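
For example, going off the jsonpath output Rodolfo mentioned (expression
written from memory, so double-check it):

$ oc get rc/watcher-17 -o jsonpath='{.spec.replicas} {.status.replicas}'

and keep polling until the two numbers match (and the previous deployment's
rc drops to zero).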


> Or maybe I just interpreted you wrong and all you want is some
> programmatic way the current deployment state? (complete, failed, running)
> And not *wait for it to finish*?
>

Well, if I can get the current deployment state, I can wait for it to
finish by polling until the state is "Complete".  This is intended for use
in automated tests, since we don't really want to start testing until the
new version is fully deployed in our staging environment.
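
If polling replica counts turns out to be too fragile, the replication
controller also seems to carry a deployment phase annotation, so something
along these lines might work (flag spelling and annotation name are from
memory, please verify):

$ oc get rc/$PROJECT-12 -o template \
    --template='{{index .metadata.annotations "openshift.io/deployment.phase"}}'

and then poll until that prints Complete.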

-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Determine when a deployment finishes programmatically

2016-07-08 Thread Alex Wauck
That's kind of helpful.  I tried --watch for one of my deployments, and I
got this result:

$ oc get rc/watcher-17 --watch
NAME DESIRED   CURRENT   AGE
watcher-17   1 1 12s
watcher-17   2 1 55s
watcher-17   2 1 55s
watcher-17   2 2 55s
watcher-17   2 2 1m
watcher-17   2 2 1m

So, if I don't know how many replicas it's supposed to have, then this
output doesn't tell me when it's done.  Not very helpful.  Also, the CLI
seems to think that the old rc has 0 replicas before the web UI does.  Kind
of strange.

On Fri, Jul 8, 2016 at 11:20 AM, Rodolfo Carvalho <rcarv...@redhat.com>
wrote:

> Hi Alex,
>
> The way our tests wait for a deployment to finish is like this:
>
>
> https://github.com/openshift/origin/blob/69bd3991df7256befa0c979b6620153c44b428c1/test/extended/util/framework.go#L484-L487
> *https://github.com/openshift/origin/blob/69bd3991df7256befa0c979b6620153c44b428c1/test/extended/util/framework.go#L370-L482
> <https://github.com/openshift/origin/blob/69bd3991df7256befa0c979b6620153c44b428c1/test/extended/util/framework.go#L370-L482>*
>
>
> The key part there is using the watch API.
>
>
> I think there's no CLI command that would give you as much flexibility as
> the API today, but you could try to do something on top of
>
> $ oc get dc/... --watch / --watch-only
>
>
> You'd react to every new output until you see the desired state.
>
>
> Rodolfo Carvalho | OpenShift
>
> On Fri, Jul 8, 2016 at 6:02 PM, Alex Wauck <alexwa...@exosite.com> wrote:
>
>> No luck:
>> $ oc get rc -l deploymentconfig=$PROJECT,deployment=$PROJECT-12
>> $ oc describe rc/$PROJECT-12
>> Name: $PROJECT-12
>> Namespace: $PROJECT
>> Image(s):
>> 172.30.151.60:5000/$PROJECT/$PROJECT@sha256:91a0d57c0dca1a985c6c5f78ccad4c1e1c79db4a98832d2f5749326da0154c88
>> Selector: app=$PROJECT,deployment=$PROJECT-12,deploymentconfig=$PROJECT
>> Labels: app=$PROJECT,openshift.io/deployment-config.name=$PROJECT
>> Replicas: 2 current / 2 desired
>> Pods Status: 1 Running / 1 Waiting / 0 Succeeded / 0 Failed
>> No volumes.
>> Events:
>>   FirstSeen LastSeen Count From SubobjectPath Type Reason Message
>>   -  -  -  -- ---
>>   1m 1m 1 {replication-controller } Normal SuccessfulCreate Created pod:
>> $PROJECT-12-7udpy
>>   38s 38s 1 {replication-controller } Normal SuccessfulCreate Created
>> pod: $PROJECT-12-dvjp8
>>
>>
>> On Fri, Jul 8, 2016 at 9:57 AM, Clayton Coleman <ccole...@redhat.com>
>> wrote:
>>
>>> oc get rc -l deploymentconfig=NAME,deployment=# should show you
>>>
>>> On Jul 8, 2016, at 10:07 AM, Alex Wauck <alexwa...@exosite.com> wrote:
>>>
>>> Is there any decent way to determine when a deployment has completed?
>>> I've tried `oc get deployments`, which never shows me anything, even when I
>>> have a deployment in progress.  I can go into the web UI and see a list of
>>> deployments, but I can't find any way to access that information via the
>>> CLI aside from parsing the very machine-unfriendly output of `oc describe
>>> dc/whatever`.
>>>
>>> How does the web UI get that information?  It doesn't have any special
>>> access that the CLI doesn't, does it?
>>>
>>> --
>>>
>>> Alex Wauck // DevOps Engineer
>>>
>>> *E X O S I T E*
>>> *www.exosite.com <http://www.exosite.com/>*
>>>
>>> Making Machines More Human.
>>>
>>> ___
>>> users mailing list
>>> users@lists.openshift.redhat.com
>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>>
>>>
>>
>>
>> --
>>
>> Alex Wauck // DevOps Engineer
>>
>> *E X O S I T E*
>> *www.exosite.com <http://www.exosite.com/>*
>>
>> Making Machines More Human.
>>
>>
>> ___
>> users mailing list
>> users@lists.openshift.redhat.com
>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>
>>
>


-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Determine when a deployment finishes programmatically

2016-07-08 Thread Alex Wauck
OK, that gets me a list of replication controllers.  I then have to dig
through those, find the latest two, and then check for the second-latest
one going to zero?

On Fri, Jul 8, 2016 at 11:11 AM, Clayton Coleman <ccole...@redhat.com>
wrote:

> oh, you'll need to use -l openshift.io/deployment-config.name=$PROJECT
>
> On Fri, Jul 8, 2016 at 12:02 PM, Alex Wauck <alexwa...@exosite.com> wrote:
>
>> No luck:
>> $ oc get rc -l deploymentconfig=$PROJECT,deployment=$PROJECT-12
>> $ oc describe rc/$PROJECT-12
>> Name: $PROJECT-12
>> Namespace: $PROJECT
>> Image(s):
>> 172.30.151.60:5000/$PROJECT/$PROJECT@sha256:91a0d57c0dca1a985c6c5f78ccad4c1e1c79db4a98832d2f5749326da0154c88
>> Selector: app=$PROJECT,deployment=$PROJECT-12,deploymentconfig=$PROJECT
>> Labels: app=$PROJECT,openshift.io/deployment-config.name=$PROJECT
>> Replicas: 2 current / 2 desired
>> Pods Status: 1 Running / 1 Waiting / 0 Succeeded / 0 Failed
>> No volumes.
>> Events:
>>   FirstSeen LastSeen Count From SubobjectPath Type Reason Message
>>   -  -  -  -- ---
>>   1m 1m 1 {replication-controller } Normal SuccessfulCreate Created pod:
>> $PROJECT-12-7udpy
>>   38s 38s 1 {replication-controller } Normal SuccessfulCreate Created
>> pod: $PROJECT-12-dvjp8
>>
>>
>> On Fri, Jul 8, 2016 at 9:57 AM, Clayton Coleman <ccole...@redhat.com>
>> wrote:
>>
>>> oc get rc -l deploymentconfig=NAME,deployment=# should show you
>>>
>>> On Jul 8, 2016, at 10:07 AM, Alex Wauck <alexwa...@exosite.com> wrote:
>>>
>>> Is there any decent way to determine when a deployment has completed?
>>> I've tried `oc get deployments`, which never shows me anything, even when I
>>> have a deployment in progress.  I can go into the web UI and see a list of
>>> deployments, but I can't find any way to access that information via the
>>> CLI aside from parsing the very machine-unfriendly output of `oc describe
>>> dc/whatever`.
>>>
>>> How does the web UI get that information?  It doesn't have any special
>>> access that the CLI doesn't, does it?
>>>
>>> --
>>>
>>> Alex Wauck // DevOps Engineer
>>>
>>> *E X O S I T E*
>>> *www.exosite.com <http://www.exosite.com/>*
>>>
>>> Making Machines More Human.
>>>
>>> ___
>>> users mailing list
>>> users@lists.openshift.redhat.com
>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>>
>>>
>>
>>
>> --
>>
>> Alex Wauck // DevOps Engineer
>>
>> *E X O S I T E*
>> *www.exosite.com <http://www.exosite.com/>*
>>
>> Making Machines More Human.
>>
>>
>


-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Determine when a deployment finishes programmatically

2016-07-08 Thread Alex Wauck
Is there any decent way to determine when a deployment has completed?  I've
tried `oc get deployments`, which never shows me anything, even when I have
a deployment in progress.  I can go into the web UI and see a list of
deployments, but I can't find any way to access that information via the
CLI aside from parsing the very machine-unfriendly output of `oc describe
dc/whatever`.

How does the web UI get that information?  It doesn't have any special
access that the CLI doesn't, does it?

-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Binding service account to project-local roles

2016-07-07 Thread Alex Wauck
Note: $ROLE_PROJECT is the project containing the role that I want to
assign to the service account in $SERVICEACCOUNT_PROJECT.

Here's the YAML I used to create the policybinding:
apiVersion: v1
kind: PolicyBinding
metadata:
  name: $ROLE_PROJECT:default
policyRef:
  name: default
  namespace: $ROLE_PROJECT
roleBindings:
- name: testing
  roleBinding:
metadata:
  name: testing
  namespace: $ROLE_PROJECT
roleRef:
  name: testing
  namespace: $ROLE_PROJECT
subjects:
- kind: ServiceAccount
  name: system:serviceaccount:$SERVICEACCOUNT_PROJECT:testing
userNames: null

Terminal session after creating the above:
$ oc policy add-role-to-user --role-namespace=$ROLE_PROJECT testing -z
testing
The RoleBinding "testing" is invalid.

* metadata.resourceVersion: Invalid value: "": must be specified for an
update
* metadata.resourceVersion: Invalid value: "": must be specified for an
update
$ oc project $SERVICEACCOUNT_PROJECT
Now using project "$SERVICEACCOUNT_PROJECT" on server "
https://example.com:8443;.
$ oc policy add-role-to-user --role-namespace=$ROLE_PROJECT testing -z
testing
Error from server: policybinding "$ROLE_PROJECT:default" not found
$ oc get policybinding -n $ROLE_PROJECT
NAME ROLE BINDINGS
 LAST MODIFIED
:default admin, system:deployers, system:image-builders,
system:image-pullers   2016-06-22 01:59:45 -0500 CDT
$ROLE_PROJECT:default   testing

Looks like there's something I don't understand about policies, policy
bindings, roles, service accounts, and how they all fit together.

On Wed, Jul 6, 2016 at 8:12 PM, Jordan Liggitt <jligg...@redhat.com> wrote:

> Can you show the output of this command while your project is in that
> state?
>
> oc get policybindings -o yaml
>
> On Wed, Jul 6, 2016 at 5:51 PM, Alex Wauck <alexwa...@exosite.com> wrote:
>
>> I did create a "role-namespace:default" policy binding object.  After
>> creating it, the project disappeared from the list (but it was still
>> accessible via direct links).  After deleting it, the project reappeared.
>>
>> On Wed, Jul 6, 2016 at 4:28 PM, Jordan Liggitt <jligg...@redhat.com>
>> wrote:
>>
>>> A policy binding object name consists of the namespace where the roles
>>> are defined (or the empty string for cluster-level roles) and the name of
>>> the policy object (which is currently pinned to "default").
>>>
>>> When you created (or overwrote) the ":default" policy binding object, it
>>> removed bindings to cluster-level roles, which likely removed your user's
>>> access to the project.
>>>
>>> To bind to roles defined in "role-namespace", create a policy binding
>>> object named "role-namespace:default".
>>>
>>>
>>> On Wed, Jul 6, 2016 at 5:17 PM, Alex Wauck <alexwa...@exosite.com>
>>> wrote:
>>>
>>>> I want to bind a service account defined in one project to a role
>>>> defined in another.  Apparently, I need to create a :default policybinding
>>>> for the project containing the role.  I tried doing this, but then the
>>>> project stopped showing up in the list of projects in the Web UI.  Why do I
>>>> need this policybinding, and what does it need to contain in order to not
>>>> break the Web UI?
>>>>
>>>> I am running OpenShift Origin 1.2.0.
>>>>
>>>> --
>>>>
>>>> Alex Wauck // DevOps Engineer
>>>>
>>>> *E X O S I T E*
>>>> *www.exosite.com <http://www.exosite.com/>*
>>>>
>>>> Making Machines More Human.
>>>>
>>>>
>>>> ___
>>>> users mailing list
>>>> users@lists.openshift.redhat.com
>>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>>>
>>>>
>>>
>>
>>
>> --
>>
>> Alex Wauck // DevOps Engineer
>>
>> *E X O S I T E*
>> *www.exosite.com <http://www.exosite.com/>*
>>
>> Making Machines More Human.
>>
>>
>


-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Binding service account to project-local roles

2016-07-07 Thread Alex Wauck
 $ oc create policybinding $ROLE_PROJECT -n $SERVICEACCOUNT_PROJECT
Create a resource by filename or stdin

JSON and YAML formats are accepted.

Usage:

  oc create -f FILENAME [options]
...

Looks like I can't just go and create a policybinding object.

On Thu, Jul 7, 2016 at 7:18 AM, David Eads <de...@redhat.com> wrote:

> `oc create policybinding TARGET_POLICY_NAMESPACE` should help create a
> policybinding the correct shape.
>
>
>   # Create a policy binding in namespace "foo" that references the policy
> in namespace "bar"
>   oc create policybinding bar -n foo
>
>
> On Wed, Jul 6, 2016 at 9:12 PM, Jordan Liggitt <jligg...@redhat.com>
> wrote:
>
>> Can you show the output of this command while your project is in that
>> state?
>>
>> oc get policybindings -o yaml
>>
>> On Wed, Jul 6, 2016 at 5:51 PM, Alex Wauck <alexwa...@exosite.com> wrote:
>>
>>> I did create a "role-namespace:default" policy binding object.  After
>>> creating it, the project disappeared from the list (but it was still
>>> accessible via direct links).  After deleting it, the project reappeared.
>>>
>>> On Wed, Jul 6, 2016 at 4:28 PM, Jordan Liggitt <jligg...@redhat.com>
>>> wrote:
>>>
>>>> A policy binding object name consists of the namespace where the roles
>>>> are defined (or the empty string for cluster-level roles) and the name of
>>>> the policy object (which is currently pinned to "default").
>>>>
>>>> When you created (or overwrote) the ":default" policy binding object,
>>>> it removed bindings to cluster-level roles, which likely removed your
>>>> user's access to the project.
>>>>
>>>> To bind to roles defined in "role-namespace", create a policy binding
>>>> object named "role-namespace:default".
>>>>
>>>>
>>>> On Wed, Jul 6, 2016 at 5:17 PM, Alex Wauck <alexwa...@exosite.com>
>>>> wrote:
>>>>
>>>>> I want to bind a service account defined in one project to a role
>>>>> defined in another.  Apparently, I need to create a :default policybinding
>>>>> for the project containing the role.  I tried doing this, but then the
>>>>> project stopped showing up in the list of projects in the Web UI.  Why do 
>>>>> I
>>>>> need this policybinding, and what does it need to contain in order to not
>>>>> break the Web UI?
>>>>>
>>>>> I am running OpenShift Origin 1.2.0.
>>>>>
>>>>> --
>>>>>
>>>>> Alex Wauck // DevOps Engineer
>>>>>
>>>>> *E X O S I T E*
>>>>> *www.exosite.com <http://www.exosite.com/>*
>>>>>
>>>>> Making Machines More Human.
>>>>>
>>>>>
>>>>> ___
>>>>> users mailing list
>>>>> users@lists.openshift.redhat.com
>>>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> Alex Wauck // DevOps Engineer
>>>
>>> *E X O S I T E*
>>> *www.exosite.com <http://www.exosite.com/>*
>>>
>>> Making Machines More Human.
>>>
>>>
>>
>> ___
>> users mailing list
>> users@lists.openshift.redhat.com
>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>
>>
>


-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Binding service account to project-local roles

2016-07-06 Thread Alex Wauck
I did create a "role-namespace:default" policy binding object.  After
creating it, the project disappeared from the list (but it was still
accessible via direct links).  After deleting it, the project reappeared.

On Wed, Jul 6, 2016 at 4:28 PM, Jordan Liggitt <jligg...@redhat.com> wrote:

> A policy binding object name consists of the namespace where the roles are
> defined (or the empty string for cluster-level roles) and the name of the
> policy object (which is currently pinned to "default").
>
> When you created (or overwrote) the ":default" policy binding object, it
> removed bindings to cluster-level roles, which likely removed your user's
> access to the project.
>
> To bind to roles defined in "role-namespace", create a policy binding
> object named "role-namespace:default".
>
>
> On Wed, Jul 6, 2016 at 5:17 PM, Alex Wauck <alexwa...@exosite.com> wrote:
>
>> I want to bind a service account defined in one project to a role defined
>> in another.  Apparently, I need to create a :default policybinding for the
>> project containing the role.  I tried doing this, but then the project
>> stopped showing up in the list of projects in the Web UI.  Why do I need
>> this policybinding, and what does it need to contain in order to not break
>> the Web UI?
>>
>> I am running OpenShift Origin 1.2.0.
>>
>> --
>>
>> Alex Wauck // DevOps Engineer
>>
>> *E X O S I T E*
>> *www.exosite.com <http://www.exosite.com/>*
>>
>> Making Machines More Human.
>>
>>
>> ___
>> users mailing list
>> users@lists.openshift.redhat.com
>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>
>>
>


-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Binding service account to project-local roles

2016-07-06 Thread Alex Wauck
I want to bind a service account defined in one project to a role defined
in another.  Apparently, I need to create a :default policybinding for the
project containing the role.  I tried doing this, but then the project
stopped showing up in the list of projects in the Web UI.  Why do I need
this policybinding, and what does it need to contain in order to not break
the Web UI?

I am running OpenShift Origin 1.2.0.

-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Can't override default token age?

2016-06-29 Thread Alex Wauck
Responses below.

On Tue, Jun 28, 2016 at 7:38 PM, Jordan Liggitt <jligg...@redhat.com> wrote:

> That looks like the right config value. Some things to check:
>
> 1. Are there duplicate `oauthConfig` stanzas (or tokenConfig, etc) in your
> config file? I think the last one wins.
>
No, but there are two tokenConfig sections for some reason.  That explains
it.

As an aside, for external integrations with machine accounts, service
> account tokens are recommended (
> https://docs.openshift.org/latest/dev_guide/service_accounts.html)...
> they don't expire, but can be revoked.
>

This sounds like the right solution for Nagios.
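
For the record, the rough shape of what I am planning (names are made up,
and the exact commands may differ slightly by version):

$ oc create -f - <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nagios
EOF
$ oc policy add-role-to-user view -z nagios
$ oc describe serviceaccount nagios      # shows the name of the token secret
$ oc describe secret nagios-token-abc12  # hypothetical secret name

Nagios then sends that token as a bearer token, and the token can be
revoked later by deleting the secret.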

-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: define openshift origin version (stable 1.2.0) for Ansible install

2016-06-22 Thread Alex Wauck
This seems to be caused by the 1.2.0-2.el7 packages containing the wrong
version.  I had a conversation on IRC about this earlier (#openshift), and
somebody confirmed it.  I suspect a new release will be available soon.  At
any rate, downgrading to 1.2.0-1.el7 worked for us.
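
For anyone else hitting this, pinning the working build in the inventory
looks something like this (same variable you already have in [OSEv3:vars],
but double-check the format against your openshift-ansible version):

[OSEv3:vars]
ansible_ssh_user=root
deployment_type=origin
openshift_pkg_version=-1.2.0-1.el7

On hosts that are already installed, downgrading the origin packages to
that build by hand (yum downgrade) should get you to the same place.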

On Wed, Jun 22, 2016 at 8:55 AM, Den Cowboy <dencow...@hotmail.com> wrote:

> I tried:
> [OSEv3:vars]
> ansible_ssh_user=root
> deployment_type=origin
> openshift_pkg_version=-1.2.0
> openshift_image_tag=-1.2.0
>
> But it installed a release canidad and not v1.2.0
>
> oc v1.2.0-rc1-13-g2e62fab
> kubernetes v1.2.0-36-g4a3f9c5
>
> --
> From: dencow...@hotmail.com
> To: cont...@stephane-klein.info
> Subject: RE: define openshift origin version (stable 1.2.0) for Ansible
> install
> Date: Wed, 22 Jun 2016 12:51:18 +
> CC: users@lists.openshift.redhat.com
>
>
> Thanks for your fast reply
> This is the beginning of my playbook:
>
> [OSEv3:vars]
> ansible_ssh_user=root
> deployment_type=origin
> openshift_pkg_version=v1.2.0
> openshift_image_tag=v1.2.0
>
> But I got an error:
> TASK [openshift_master_ca : Install the base package for admin tooling]
> 
> FAILED! => {"changed": false, "failed": true, "msg": "No Package matching
> 'originv1.2.0' found available, installed or updated", "rc": 0, "results":
> []}
>
> --
> From: cont...@stephane-klein.info
> Date: Wed, 22 Jun 2016 13:53:57 +0200
> Subject: Re: define openshift origin version (stable 1.2.0) for Ansible
> install
> To: dencow...@hotmail.com
> CC: users@lists.openshift.redhat.com
>
> Personally I use this options to fix OpenShift version:
>
> openshift_pkg_version=v1.2.0
> openshift_image_tag=v1.2.0
>
>
> 2016-06-22 13:24 GMT+02:00 Den Cowboy <dencow...@hotmail.com>:
>
> Is it possible to define and origin version in your ansible install.
> At the moment we have so many issues with our newest install (while we had
> 1.1.6 pretty stable for some time)
> We want to go to a stable 1.2.0
>
> Our issues:
> version = oc v1.2.0-rc1-13-g2e62fab
> So images are pulled with tag oc v1.2.0-rc1-13-g2e62fab which doesn't
> exist in openshift. Okay we have a workaround by editing the master and
> node config's and using 'i--image' but whe don't like this approach
>
> logs on our nodes:
>  level=error msg="Error reading loginuid: open /proc/27182/loginuid: no
> such file or directory"
> level=error msg="Error reading loginuid: open /proc/27182/loginuid: no
> such file or directory"
>
> We started a mysql instance. We weren't able to use the service name to
> connect:
> mysql -u test -h mysql -p did NOT work
> mysql -u test -h 172.30.x.x (service ip) -p did work..
>
> So we have too many issues on this version of OpenShift. We've deployed in
> a team several times and are pretty confident with the setup and it was
> always working fine for us. But now this last weird versions seem really
> bad for us.
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
>
>
>
> --
> Stéphane Klein <cont...@stephane-klein.info>
> blog: http://stephane-klein.info
> cv : http://cv.stephane-klein.info
> Twitter: http://twitter.com/klein_stephane
>
> ___ users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
>


-- 

Alex Wauck // DevOps Engineer
+1 612 790 1558 (USA Mobile)

*E X O S I T E*
275 Market Street, Suite 535
Minneapolis, MN 55405
*www.exosite.com <http://www.exosite.com/>*

This communication may contain confidential information that is proprietary to
Exosite. Any unauthorized use or disclosure of this information is
strictly prohibited. If you are not the intended recipient, please delete
this message and any attachments, and advise the sender by return e-mail.

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Creating from a template: get parameters from a file

2016-06-20 Thread Alex Wauck
Oh, I hadn't thought of that.  I guess a solution for that would be
necessary after all.  Maybe have some special syntax to indicate that the
value should be read from a file?

At any rate, a simple key=value file will suffice for my needs.

On Mon, Jun 20, 2016 at 9:17 AM, Jordan Liggitt <jligg...@redhat.com> wrote:

> if you have the content of a cert file as a param, that will typically
> contain newlines
>
> On Mon, Jun 20, 2016 at 10:13 AM, Alex Wauck <alexwa...@exosite.com>
> wrote:
>
>> We do not have any parameters that require newlines, thankfully.  In my
>> opinion, that would be kind of insane.
>>
>> new-app isn't a great solution, since the template is intended to set up
>> our entire environment with multiple apps (some apps share parameters).  Or
>> current work-around is a Python script that generates Javascript that we
>> then paste into the Chrome JS console to populate the fields in the web UI.
>>
>> I'd really just like oc process to accept a file containing key=value
>> pairs, one per line.
>>
>> On Fri, Jun 17, 2016 at 5:07 PM, Clayton Coleman <ccole...@redhat.com>
>> wrote:
>>
>>> The -v flag needs to be fixed for sure (splitting flag values is bad).
>>>
>>> New-app should support both -f FILE and -p (which you can specify
>>> multiple -p, one for each param).
>>>
>>> Do you have any templates that require new lines?
>>>
>>> On Jun 17, 2016, at 5:55 PM, Alex Wauck <alexwa...@exosite.com> wrote:
>>>
>>> I need to create services from a template that has a lot of parameters.
>>> In addition to having a lot of parameters, it has parameters with values
>>> containing commas, which does not play well with the -v flag for oc
>>> process.  Is there any way to make oc process get the parameter values from
>>> a file?  I'm currently tediously copy/pasting the values into the web UI,
>>> which is not a good solution.
>>>
>>> --
>>>
>>> Alex Wauck // DevOps Engineer
>>> *E X O S I T E*
>>> *www.exosite.com <http://www.exosite.com/>*
>>> Making Machines More Human.
>>>
>>> ___
>>> users mailing list
>>> users@lists.openshift.redhat.com
>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>>
>>>
>>
>>
>> --
>>
>> Alex Wauck // DevOps Engineer
>> +1 612 790 1558 (USA Mobile)
>>
>> *E X O S I T E*
>> 275 Market Street, Suite 535
>> Minneapolis, MN 55405
>> *www.exosite.com <http://www.exosite.com/>*
>>
>> This communication may contain confidential information that is proprietary 
>> to
>> Exosite. Any unauthorized use or disclosure of this information is
>> strictly prohibited. If you are not the intended recipient, please delete
>> this message and any attachments, and advise the sender by return e-mail.
>>
>> Making Machines More Human.
>>
>>
>> ___
>> users mailing list
>> users@lists.openshift.redhat.com
>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>
>>
>


-- 

Alex Wauck // DevOps Engineer
+1 612 790 1558 (USA Mobile)

*E X O S I T E*
275 Market Street, Suite 535
Minneapolis, MN 55405
*www.exosite.com <http://www.exosite.com/>*

This communication may contain confidential information that is proprietary to
Exosite. Any unauthorized use or disclosure of this information is
strictly prohibited. If you are not the intended recipient, please delete
this message and any attachments, and advise the sender by return e-mail.

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: OpenShift Origin assistance desired

2016-04-27 Thread Alex Wauck
On Wed, Apr 27, 2016 at 12:10 PM, Jonathan Yu <jaw...@redhat.com> wrote:

> Hey Alex,
>
> On Tue, Apr 26, 2016 at 1:06 PM, Alex Wauck <alexwa...@exosite.com> wrote:
>
>> My employer (www.exosite.com) has a pilot project using OpenShift Origin
>> and we're looking for a contractor to help in configuring OpenShift and
>> Kubernetes. AWS experience would be helpful too, since we suspect the
>> issues
>>
> Not to make this sound like an ad, but... Have you considered Red Hat
> Consulting? https://www.redhat.com/en/services/consulting
>

Considering our experience attempting to get OpenShift Enterprise and the
associated support, we get the feeling that 1) we are not the kind of
customer Red Hat is particularly interested in (small, not a huge budget),
and 2) whatever they offer us will be expensive.

Right now, we just need an expert to help us figure out what's wrong with
our configuration and fix it.  (We're not doing anything weird, so we're
assuming that our problems are simple but we just don't know where to
look.)  I, for one, am not convinced that Red Hat is willing to provide
that to us, at least not at a reasonable price (rather, a "please go away,
but we don't want to actually say that" price).

-- 

Alex Wauck // DevOps Engineer
+1 612 790 1558 (USA Mobile)

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


OpenShift Origin assistance desired

2016-04-26 Thread Alex Wauck
My employer (www.exosite.com) has a pilot project using OpenShift Origin
and we're looking for a contractor to help in configuring OpenShift and
Kubernetes. AWS experience would be helpful too, since we suspect the
issues we've run into may be specific to the way we deployed Origin in AWS.
If you can help, e.g. on an hourly basis, or you can recommend anyone who
can help us, please get in touch. Thank you!

Please note that we may be switching to OpenShift Enterprise with support
from Red Hat at some point in the not-so-distant future, so this could be a
rather short-term gig.

-- 

Alex Wauck // DevOps Engineer
+1 612 790 1558 (USA Mobile)

*E X O S I T E*
275 Market Street, Suite 535
Minneapolis, MN 55405
*www.exosite.com <http://www.exosite.com/>*
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Moving from multitenant to subnet plugin results in disaster

2016-04-25 Thread Alex Wauck
One more thing: we are running Origin 1.1.6.

On Mon, Apr 25, 2016 at 9:27 AM, Alex Wauck <alexwa...@exosite.com> wrote:

> In an attempt to avoid problems[1] with the multitenant networking plugin,
> we recently switched from the multitenant plugin to the subnet plugin.
> After restarting all instances of origin-node and origin-master across our
> cluster, we found that nothing was able to communicate with anything over
> the service network.  Deploys were unable to pull from the registry, and
> apps were inaccessible.  We "fixed" this by reverting to the multitenant
> plugin, which brought us back to the previous less broken state, but only
> after rebuilding all apps (not just redeploying, but rebuilding).
>
> Does anybody know what went so horribly wrong here?  I may be able to
> provide logs if need be (not sure if logs rolled over yet).  One potential
> source of trouble: instead of shutting down all nodes, making the change,
> then bringing them back up, I changed them one-by-one.  Is that a bad thing
> to do?  Also, should we expect to need to rebuild all apps after changing
> the networking plugin?  Does that include the router and registry?
>
> [1] Pods on different machines were sometimes unable to communicate with
> each other via the service network.  Likely fixed by
> https://github.com/openshift/openshift-sdn/pull/285
> --
>
> Alex Wauck // DevOps Engineer
> +1 612 790 1558 (USA Mobile)
>
> *E X O S I T E*
> *www.exosite.com <http://www.exosite.com/>*
>
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Moving from multitenant to subnet plugin results in disaster

2016-04-25 Thread Alex Wauck
In an attempt to avoid problems[1] with the multitenant networking plugin,
we recently switched from the multitenant plugin to the subnet plugin.
After restarting all instances of origin-node and origin-master across our
cluster, we found that nothing was able to communicate with anything over
the service network.  Deploys were unable to pull from the registry, and
apps were inaccessible.  We "fixed" this by reverting to the multitenant
plugin, which brought us back to the previous less broken state, but only
after rebuilding all apps (not just redeploying, but rebuilding).
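
For reference, the change itself was just the plugin name in
master-config.yaml and each node-config.yaml, i.e. switching something like

networkConfig:
  networkPluginName: redhat/openshift-ovs-multitenant

to

networkConfig:
  networkPluginName: redhat/openshift-ovs-subnet

(key placement is from memory and may sit slightly differently in the node
config), followed by the origin-master/origin-node restarts mentioned above.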

Does anybody know what went so horribly wrong here?  I may be able to
provide logs if need be (not sure if logs rolled over yet).  One potential
source of trouble: instead of shutting down all nodes, making the change,
then bringing them back up, I changed them one-by-one.  Is that a bad thing
to do?  Also, should we expect to need to rebuild all apps after changing
the networking plugin?  Does that include the router and registry?

[1] Pods on different machines were sometimes unable to communicate with
each other via the service network.  Likely fixed by
https://github.com/openshift/openshift-sdn/pull/285
-- 

Alex Wauck // DevOps Engineer
+1 612 790 1558 (USA Mobile)

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users