Re: Method to move a single or multiple pods to a different node?

2017-07-11 Thread Per Carlson
On 12 July 2017 at 00:50, G. Jones  wrote:

> That’s just it, the masters were unschedulable. During the outage we
> restarted the masters and nodes but the nodes wouldn’t come online. While
> we were working on getting the nodes up the pods had been restarted on the
> masters but they were never set as schedulable. When everything was finally
> up and running I did an oc describe node and found that pods were spread
> across the masters and nodes without me explicitly setting the masters as
> schedulable.
>

Sounds like a bug to me. If you still have logs/forensics, you could
file a bug report.

-- 
Pelle

Research is what I'm doing when I don't know what I'm doing.
- Wernher von Braun
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: timeout expired waiting for volumes to attach/mount for pod

2017-07-11 Thread Andrew Lau
You also might want to check if your nodes have any OS updates. I found
NetworkManager 1.4.0-19.el7_3 has a memory leak which appears over time.
There was a recent devicemapper update too, I believe.
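
If you want to check quickly, something like this on each node is enough (a
rough sketch; it assumes RHEL/CentOS 7 hosts and default package names):

# versions of the suspect packages
rpm -q NetworkManager docker device-mapper-libs

# watch NetworkManager's resident memory; a slow leak shows up as ever-growing RSS
ps -o pid,rss,etime,args -C NetworkManager

# most recently installed/updated packages on the node
rpm -qa --last | head -20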

On Wed, 12 Jul 2017 at 07:59 Andrew Lau  wrote:

> Try restarting origin-node; it seemed to fix this issue for me.
>
> Also, sometimes those mount errors are actually harmless. It happens when
> one of the controllers has been restarted but didn't sync the status.
> There's a fix upstream, but I think it only landed in 1.7.
>
> The volume is already mounted, but the controller doesn't know.
>
>
> On Wed., 12 Jul. 2017, 4:19 am Philippe Lafoucrière, <
> philippe.lafoucri...@tech-angels.com> wrote:
>
>> And... it's starting again.
>> Pods are getting stuck because volumes (secrets) can't be mounted, then
>> after a few minutes, everything starts.
>> I really don't get it :(
>> ​
>> ___
>> users mailing list
>> users@lists.openshift.redhat.com
>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


OpenShift Origin Active Directory Authentication

2017-07-11 Thread Werner, Mark
I am really struggling to get Active Directory authentication to work.

The oauthConfig section of the master-config.yaml file starts out like this
and all is fine.

oauthConfig:
  assetPublicURL: https://master.domain.local:8443/console/
  grantConfig:
    method: auto
  identityProviders:
  - challenge: true
    login: true
    mappingMethod: claim
    name: allow_all
    provider:
      apiVersion: v1
      kind: AllowAllPasswordIdentityProvider
  masterCA: ca-bundle.crt
  masterPublicURL: https://master.domain.local:8443
  masterURL: https://master.domain.local:8443

Then I attempt to modify the oauthConfig section of the master-config.yaml
file to look like this.

oauthConfig:
  assetPublicURL: https://master.domain.local:8443/console/
  grantConfig:
    method: auto
  identityProviders:
  - name: Active_Directory
    challenge: true
    login: true
    mappingMethod: claim
    provider:
      apiVersion: v1
      kind: LDAPPasswordIdentityProvider
      attributes:
        id:
        - dn
        email:
        - mail
        name:
        - cn
        preferredUsername:
        - uid
      bindDN: "cn=openshift,cn=users,dc=domain,dc=local"
      bindPassword: "password"
      insecure: true
      url: ldap://dc.domain.local:389/cn=users,dc=domain,dc=local?uid
  assetPublicURL: https://master.domain.local:8443/console/
  masterPublicURL: https://master.domain.local:8443
  masterURL: https://master.domain.local:8443

Then I try to restart the origin-master service and it fails to restart, and
won't start again, not even on reboot. If I revert back to the old
master-config.yaml file everything works fine again, and origin-master
service starts with no problem.

The user "openshift" has been created in Active Directory with the correct
password.

I have even tried using url:
ldaps://dc.domain.local:686/cn=users,dc=domain,dc=local?uid

That doesn't work either. I cannot seem to figure out what I am doing wrong
and what the origin-master service does not like about the modified
master-config.yaml file that keeps it from starting.
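
In case it helps to narrow it down, a rough troubleshooting sketch (paths and
the origin-master unit name assume a default Origin RPM install, and
"someuser" is just a placeholder account to search for):

# see why the service refuses to start -- config validation errors end up here
journalctl -u origin-master.service --no-pager -n 100

# confirm the edited file is still valid YAML (a stray tab or bad indent is enough to break it)
python -c "import yaml; yaml.safe_load(open('/etc/origin/master/master-config.yaml'))"

# test the bind DN, base DN and attribute query outside of OpenShift
# (note: Active Directory usually exposes the login name as sAMAccountName rather than uid)
ldapsearch -x -H ldap://dc.domain.local:389 \
  -D "cn=openshift,cn=users,dc=domain,dc=local" -w 'password' \
  -b "cn=users,dc=domain,dc=local" "(uid=someuser)"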

 

 

Mark Werner | Senior Systems Engineer | Cloud & Infrastructure Services

Unisys | Mobile Phone 586.214.9017 | mark.wer...@unisys.com

11720 Plaza America Drive, Reston, VA 20190

THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY
MATERIAL and is for use only by the intended recipient. If you received this
in error, please contact the sender and delete the e-mail and its
attachments from all devices.



___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: timeout expired waiting for volumes to attach/mount for pod

2017-07-11 Thread Andrew Lau
Try restarting origin-node; it seemed to fix this issue for me.

Also, sometimes those mount errors are actually harmless. It happens when
one of the controllers has been restarted but didn't sync the status.
There's a fix upstream, but I think it only landed in 1.7.

The volume is already mounted, but the controller doesn't know.
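
Concretely, something like this on the affected node (a sketch; the unit name
assumes a default Origin RPM install):

systemctl restart origin-node
journalctl -u origin-node -f        # watch the mount retry succeed
mount | grep kubernetes.io~secret   # secret volumes are tmpfs mounts; check if they are already there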

On Wed., 12 Jul. 2017, 4:19 am Philippe Lafoucrière, <
philippe.lafoucri...@tech-angels.com> wrote:

> And... it's starting again.
> Pods are getting stuck because volumes (secrets) can't be mounted, then
> after a few minutes, everything starts.
> I really don't get it :(
> ​
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Fencing and pod guarantees

2017-07-11 Thread Clayton Coleman
On Thu, Jul 6, 2017 at 6:34 AM, Nicola Ferraro  wrote:

> Hi,
> I've read some discussions on fencing and pod guarantees. Most of them are
> related to stateful sets, e.g. https://github.com/
> kubernetes/community/blob/master/contributors/design-
> proposals/pod-safety.md and related threads.
> Anyway, I couldn't find an answer to the following questions...
>
> Suppose I create a DeploymentConfig (so, no statefulsets) with replicas=1.
> After a pod is scheduled on some node, that node is disconnected from the
> cluster (I block all communications with the master).
> After some time, the DC/RC tries to delete that pod and reschedule a new
> pod on another node.
>

The RC doesn't delete the pod, but the node controller will (after X
minutes).  A new pod is created - the RC does *not* block waiting for old
pods to be deleted before creating new ones.  If the Pod references a PV
that supports locking innately (GCE, AWS, Azure, Ceph, Gluster), then the
second pod will *not* start up, because the volume can't be attached to the
new node.  But this behavior depends on the storage service itself, not on
Kube.
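
If you want to watch this happen on a test cluster, a rough sketch (the DC
name "myapp" is just an example):

# the old pod goes Unknown once the node is cut off, and the RC creates a
# replacement without waiting for the old one to be deleted
oc get pods -l deploymentconfig=myapp -o wide -w

# in another terminal, watch the node flip to NotReady
oc get nodes -w

# the "X minutes" above is the node controller's --pod-eviction-timeout (5m by default)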


>
> For what I've understood, if now I reconnect the failing node, the Kubelet
> will read the cluster status and effectively delete the old pod, but,
> before that moment, both pods were running in their respective nodes and
> the old pod was allowed to access external resources (e.g. if the network
> still allowed communication with them).
>

Yes


>
> Is this scenario possible?
> Is there a mechanism by which a disconnected node can tear down its pods
> automatically after a certain timeout?
>

Run a daemonset that shuts down the instance if it loses contact with the
master API / health check for > X seconds.  Even this is best effort.  You
can also run a daemon set that uses sanlock or another tool based on a
shared RWM volume, and then self terminate if you lose the lock.  Keep in
mind these solutions aren't perfect, and it's always possible that a bug in
sanlock or another node error prevents that daemon process from running to
completion.
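
A minimal sketch of that first approach (the name, image, master URL and
thresholds below are illustrative, not from this thread; the pod has to be
allowed to run privileged, e.g. via the privileged SCC, and as noted it is
still best effort):

oc create -f - <<'EOF'
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: self-fencer
spec:
  template:
    metadata:
      labels:
        app: self-fencer
    spec:
      containers:
      - name: fencer
        image: centos:7
        securityContext:
          privileged: true
        command:
        - /bin/bash
        - -c
        - |
          # Poll the master health endpoint; if it fails for ~60s straight,
          # power the host off via sysrq (possible from a privileged container
          # because it talks to the shared kernel).
          MASTER_URL=https://master.example.com:8443/healthz
          echo 1 > /proc/sys/kernel/sysrq
          fails=0
          while true; do
            if curl -ksf --max-time 5 "$MASTER_URL" >/dev/null; then
              fails=0
            else
              fails=$((fails+1))
            fi
            if [ "$fails" -ge 12 ]; then
              echo o > /proc/sysrq-trigger
            fi
            sleep 5
          done
EOF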


> Is fencing implemented/going-to-be-implemented for normal pods, even if
> they don't belong to stateful sets?
>

It's possible that we will add attach/detach controller support to control
volumes that are RWO but don't have innate locking.  It's also possible
that someone will implement a fencer.  It should be easy to implement a
fencer today.


>
> Thanks,
> Nicola
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
>
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


RE: Method to move a single or multiple pods to a different node?

2017-07-11 Thread G. Jones
That’s just it, the masters were unschedulable. During the outage we restarted
the masters and nodes but the nodes wouldn’t come online. While we were working 
on getting the nodes up the pods had been restarted on the masters but they 
were never set as schedulable. When everything was finally up and running I did 
an oc describe node and found that pods were spread across the masters and 
nodes without me explicitly setting the masters as schedulable.

 

 


From: Per Carlson [mailto:pe...@hemmop.com] 
Sent: Tuesday, July 11, 2017 12:46 AM
To: G. Jones
Cc: openshift
Subject: Re: Method to move a single or multiple pods to a different node?

 

Hi.

 

On 8 July 2017 at 21:45, G. Jones  wrote:

I’ve got an Origin 1.5 environment where, during an outage with my nodes, pods 
got relocated to my masters fairly randomly. I need to clean it up and get the 
pods back where I want them but have not yet found a way to do this. The only 
way I see to move pods between nodes is to scale to two replicas, mark one node 
as unschedulable and evacuate to the other node, then scale down to one. Is 
there a way to move single or multiple pods between nodes?

 

​

If you don't want any PODs on the masters, why not make them unschedulable?
Having the masters act as nodes in the event of failures sounds fragile to
me (a POD could potentially consume all resources on the master and thus
cause cluster-wide instability).

 

-- 

Pelle

Research is what I'm doing when I don't know what I'm doing.
- Wernher von Braun

___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: timeout expired waiting for volumes to attach/mount for pod

2017-07-11 Thread Aleksandar Lazic


Hi Philippe.

on Tuesday, 11 July 2017 at 23:18 was written:





And... it's starting again.
Pods are getting stuck because volumes (secrets) can't be mounted, then after a few minutes, everything starts.
I really don't get it :(



Maybe it would help if you told us some basic information.

On which platform do you run OpenShift?
Since when has this behavior been happening?
What were the latest changes you made before this started?

oc version
oc project
oc export dc/
oc describe pod 
oc get events

-- 
Best Regards
Aleks


___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: timeout expired waiting for volumes to attach/mount for pod

2017-07-11 Thread Philippe Lafoucrière
And... it's starting again.
Pods are getting stuck because volumes (secrets) can't be mounted, then
after a few minutes, everything starts.
I really don't get it :(
​
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: timeout expired waiting for volumes to attach/mount for pod

2017-07-11 Thread Philippe Lafoucrière
After a lot of tests, we discovered the pending pods were always on the
same node.
There were some (usual) "thin: Deletion of thin device" messages.
After draining the node, nuking /var/lib/docker, and a hard reboot, everything
went back to normal.

I suspect devicemapper is the source of all our troubles, and we'll
certainly try overlayfs instead when 3.6 is ready.
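
For anyone hitting the same thing, roughly what that clean-up looks like as
commands (a sketch; the node name is an example, and wiping /var/lib/docker
throws away every local image and container on that node):

oc adm manage-node node1.example.com --schedulable=false
oc adm manage-node node1.example.com --evacuate
systemctl stop origin-node docker
rm -rf /var/lib/docker/*
# depending on the devicemapper setup you may also need to recreate the thin
# pool (docker-storage-setup) before starting docker again
reboot
# once the node is back:
oc adm manage-node node1.example.com --schedulable=true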
​
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: [Logging] What component forward log entries to fluentd input service?

2017-07-11 Thread Stéphane Klein
2017-07-11 15:00 GMT+02:00 Alex Wauck :

> Last I checked (OpenShift Origin 1.2), fluentd was just slurping up the
> log files produced by Docker.  It can do that because the pods it runs in
> have access to the host filesystem.
>
> On Tue, Jul 11, 2017 at 6:12 AM, Stéphane Klein <
> cont...@stephane-klein.info> wrote:
>
>> Hi,
>>
>> I see here https://github.com/openshift/origin-aggregated-logging/
>> blob/master/fluentd/configs.d/input-post-forward-mux.conf#L2
>> that the fluentd logging system uses the secure_forward input system.
>>
>> My question: what component forwards log entries to the fluentd input service?
>>
>>
Ok, it's here:

bash-4.2# cat configs.d/dynamic/input-syslog-default-syslog.conf
<source>
  @type systemd
  @label @INGRESS
  path "/var/log/journal"
  pos_file /var/log/journal.pos
  tag journal
</source>

Thanks
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


timeout expired waiting for volumes to attach/mount for pod

2017-07-11 Thread Philippe Lafoucrière
Hi,

For a few days now, we have had pods waiting for volumes to be mounted,
getting stuck for several minutes.

https://www.dropbox.com/s/9vuge2t9llr7u6h/Screenshot%202017-07-11%2011.29.19.png?dl=0

After 3-10 minutes, the pod eventually starts, for no obvious reason. Any
idea what could cause this?

Thanks
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Running sshd in a Docker Container on Openshift

2017-07-11 Thread Aleksandar Kostadinov

You sure?

Where do you read OpenSSH is dropping that mode?

Tobias Florek wrote on 07/11/17 13:07:

Hi!

I have a container (based on centos) that runs openssh's sftp server as a
random uid for use in openshift (using nss-wrapper).

Unfortunately OpenSSH is going to drop running as non-root in the next
major version because they think non-root container sshd is a bad idea
(I don't know why).

See https://github.com/ibotty/kubernetes-sftp. For running interactive
ssh you will have to change the generated ssh-config file in `start.sh`.

Cheers,
 Tobias Florek



___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users



___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: [Logging] What component forward log entries to fluentd input service?

2017-07-11 Thread Richard Megginson
Please see 
https://github.com/openshift/origin-aggregated-logging/blob/master/docs/mux-logging-service.md

- Original Message -
> Hi,
> 
> I see here
> https://github.com/openshift/origin-aggregated-logging/blob/master/fluentd/configs.d/input-post-forward-mux.conf#L2
> 
> that the fluentd logging system uses the secure_forward input system.
> 
> My question: what component forwards log entries to the fluentd input service?
> 
> Best regards,
> Stéphane
> --
> Stéphane Klein 
> blog: http://stephane-klein.info
> cv : http://cv.stephane-klein.info
> Twitter: http://twitter.com/klein_stephane
> 

___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: [Logging] What component forward log entries to fluentd input service?

2017-07-11 Thread Peter Portante
On Tue, Jul 11, 2017 at 9:00 AM, Alex Wauck  wrote:
> Last I checked (OpenShift Origin 1.2), fluentd was just slurping up the log
> files produced by Docker.  It can do that because the pods it runs in have
> access to the host filesystem.
>
> On Tue, Jul 11, 2017 at 6:12 AM, Stéphane Klein
>  wrote:
>>
>> Hi,
>>
>> I see here
>> https://github.com/openshift/origin-aggregated-logging/blob/master/fluentd/configs.d/input-post-forward-mux.conf#L2
>> that the fluentd logging system uses the secure_forward input system.
>>
>> My question: what component forwards log entries to the fluentd input service?

The "mux" service is a concentrator of sorts.

Without the mux service, each fluentd pod runs on a host in an
OpenShift cluster collecting logs and sending them to Elasticsearch
directly.  The collectors also have the responsibility of enhancing
the logs collected with the metadata that describes which
pod/container they came from.  This requires connections to the API
server to get that information.

So in a large cluster, 200+ nodes, maybe less, maybe more, the API
servers are overwhelmed by requests from all the fluentd pods.

With the mux service, all the fluentd collector pods only talk to
the mux service and DO NOT talk to the API server; they simply send
the logs they collect to the mux fluentd instance.

The mux fluentd instance in turn talks to the API server to enrich
the logs with the pod/container metadata and then sends them along to
Elasticsearch.

This scales much better.
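
To see which of the two shapes a cluster is running, a quick check (the
namespace and labels here assume a default origin-aggregated-logging
deployment; adjust to whatever "oc -n logging get all" shows):

oc -n logging get pods -o wide -l component=fluentd   # one collector pod per node
oc -n logging get dc,svc -l component=mux             # present only when the mux service is deployed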

-peter


>>
>> Best regards,
>> Stéphane
>> --
>> Stéphane Klein 
>> blog: http://stephane-klein.info
>> cv : http://cv.stephane-klein.info
>> Twitter: http://twitter.com/klein_stephane
>>
>> ___
>> users mailing list
>> users@lists.openshift.redhat.com
>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>
>
>
>
> --
>
> Alex Wauck // Senior DevOps Engineer
>
> E X O S I T E
> www.exosite.com
>
> Making Machines More Human.
>
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>

___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: [Logging] What component forward log entries to fluentd input service?

2017-07-11 Thread Alex Wauck
Last I checked (OpenShift Origin 1.2), fluentd was just slurping up the log
files produced by Docker.  It can do that because the pods it runs in have
access to the host filesystem.

On Tue, Jul 11, 2017 at 6:12 AM, Stéphane Klein  wrote:

> Hi,
>
> I see here https://github.com/openshift/origin-aggregated-
> logging/blob/master/fluentd/configs.d/input-post-forward-mux.conf#L2
> that the fluentd logging system uses the secure_forward input system.
>
> My question: what component forwards log entries to the fluentd input service?
>
> Best regards,
> Stéphane
> --
> Stéphane Klein 
> blog: http://stephane-klein.info
> cv : http://cv.stephane-klein.info
> Twitter: http://twitter.com/klein_stephane
>
> ___
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
>


-- 

Alex Wauck // Senior DevOps Engineer

*E X O S I T E*
*www.exosite.com *

Making Machines More Human.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


[Logging] What component forward log entries to fluentd input service?

2017-07-11 Thread Stéphane Klein
Hi,

I see here
https://github.com/openshift/origin-aggregated-logging/blob/master/fluentd/configs.d/input-post-forward-mux.conf#L2

that the fluentd logging system uses the secure_forward input system.

My question: what component forwards log entries to the fluentd input service?

Best regards,
Stéphane
-- 
Stéphane Klein 
blog: http://stephane-klein.info
cv : http://cv.stephane-klein.info
Twitter: http://twitter.com/klein_stephane
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Running sshd in a Docker Container on Openshift

2017-07-11 Thread Tobias Florek
Hi!

I have a container (based on centos) that runs openssh's sftp server as a
random uid for use in openshift (using nss-wrapper).

Unfortunately OpenSSH is going to drop running as non-root in the next
major version because they think non-root container sshd is a bad idea
(I don't know why).

See https://github.com/ibotty/kubernetes-sftp. For running interactive
ssh you will have to change the generated ssh-config file in `start.sh`.

Cheers,
 Tobias Florek


___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Method to move a single or multiple pods to a different node?

2017-07-11 Thread Per Carlson
Hi.

On 8 July 2017 at 21:45, G. Jones  wrote:

> I’ve got an Origin 1.5 environment where, during an outage with my nodes,
> pods got relocated to my masters fairly randomly. I need to clean it up and
> get the pods back where I want them but have not yet found a way to do
> this. The only way I see to move pods between nodes is to scale to two
> replicas, mark one node as unschedulable and evacuate to the other node,
> then scale down to one. Is there a way to move single or multiple pods
> between nodes?
>
> ​
If you don't want any PODs on the masters, why not make them
unschedulable? Having the masters act as nodes in the event of
failures sounds fragile to me (a POD could potentially consume all
resources on the master and thus cause cluster-wide instability).
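
To get back to that state, something along these lines should work (the
master names are examples; evacuated pods are recreated by their replication
controllers on whatever nodes are still schedulable):

oc adm manage-node master1.example.com master2.example.com --schedulable=false
oc adm manage-node master1.example.com master2.example.com --evacuate --dry-run   # preview what would move
oc adm manage-node master1.example.com master2.example.com --evacuate

For a single stray pod, simply deleting it with "oc delete pod" once the
masters are unschedulable is usually enough; its replication controller
recreates it and the scheduler places it on an ordinary node.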

-- 
Pelle

Research is what I'm doing when I don't know what I'm doing.
- Wernher von Braun
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users