Re: Rolling pod evacuation

2017-04-20 Thread Andrew Lau
Thanks. I did see PodDisruptionBudget in the docs, but it requires that
extra step, and users are also unable to create their own by default(?)

On Thu, 20 Apr 2017 at 19:23 Marko Lukša  wrote:

> Take a look at PodDisruptionBudget. It allows you to specify the minimum
> number of pods that must be kept running when removing pods voluntarily
> (draining nodes is an example of this). But this feature may not be in
> OpenShift yet (IIRC draining nodes in Kubernetes honors the
> PodDisruptionBudget from version 1.6 onwards).
>
> On 20. 04. 2017 10:11, Andrew Lau wrote:
>
> Is there any way to evacuate a node using the rolling deployment process,
> where the new pod can start up before the old one is deleted from the
> current node?
>
> Drain seems to just delete the pod straight away. If there is a grace
> period set, it would be nice if the new pod could at least have its image
> pulled on the new node before the old pod is deleted from the drained node.
>
>


Re: Rolling pod evacuation

2017-04-20 Thread Andrew Lau
I didn't think this was honoured during a drain, as it just deletes the pods?

On Thu, 20 Apr 2017 at 18:43 Michail Kargakis  wrote:

> If you want to scale up first and wait for the new pod to come up before
> deleting the old one, use maxSurge=1, maxUnavailable=0.
>
> On Thu, Apr 20, 2017 at 10:11 AM, Andrew Lau 
> wrote:
>
>> Is there any way to evacuate a node using the rolling deployment process,
>> where the new pod can start up before the old one is deleted from the
>> current node?
>>
>> Drain seems to just delete the pod straight away. If there is a grace
>> period set, it would be nice if the new pod could at least have its image
>> pulled on the new node before the old pod is deleted from the drained node.
>>


Re: Node report OK but every pod marked unready

2017-04-20 Thread Andrew Lau
Thanks! Hopefully we don't hit this too much until 1.5.0 is released

On Fri, 21 Apr 2017 at 01:26 Patrick Tescher 
wrote:

> We upgraded to 1.5.0 and that error went away.
>
> --
> Patrick Tescher
>
> On Apr 19, 2017, at 10:59 PM, Andrew Lau  wrote:
>
> The thin_ls error has been happening for quite some time:
> https://github.com/openshift/origin/issues/10940
>
> On Thu, 20 Apr 2017 at 15:55 Tero Ahonen  wrote:
>
>> It seems that error is related to Docker storage on that VM.
>>
>> .t
>>
>> Sent from my iPhone
>>
>> On 20 Apr 2017, at 8.53, Andrew Lau  wrote:
>>
>> Unfortunately I did not. I dumped the logs and just removed the node in
>> order to quickly restore the current containers on another node.
>>
>> At the exact time it failed I saw a lot of the following:
>>
>> ===
>> thin_pool_watcher.go:72] encountered error refreshing thin pool watcher:
>> error performing thin_ls on metadata device
>> /dev/mapper/docker_vg-docker--pool_tmeta: Error running command `thin_ls
>> --no-headers -m -o DEV,
>> EXCLUSIVE_BYTES /dev/mapper/docker_vg-docker--pool_tmeta`: exit status 127
>>
>> failed (failure): rpc error: code = 2 desc = shim error: context deadline
>> exceeded#015
>>
>> Error running exec in container: rpc error: code = 2 desc = shim error:
>> context deadline exceeded
>> ===
>>
>> Seems to match https://bugzilla.redhat.com/show_bug.cgi?id=1427212
>>
>>
>> On Thu, 20 Apr 2017 at 15:41 Tero Ahonen  wrote:
>>
>>> Hi
>>>
>>> Did you try to SSH to that node and run `sudo docker run` for some
>>> container?
>>>
>>> .t
>>>
>>> Sent from my iPhone
>>>
>>> > On 20 Apr 2017, at 8.18, Andrew Lau  wrote:
>>> >
>>> > I'm trying to debug a weird scenario where a node has had every pod
>>> crash with the error:
>>> > "rpc error: code = 2 desc = shim error: context deadline exceeded"
>>> >
>>> > The pods stayed in the Ready 0/1 state.
>>> > The Docker daemon was responding and the kubelet and all its services
>>> > were running. The node was reporting OK status.
>>> >
>>> > No resource limits were hit, with CPU almost idle and memory at 25%
>>> > utilisation.
>>> >
>>> >
>>> >
>>> >


Re: Passthrough and insecure route

2017-04-20 Thread Philippe Lafoucrière
Hmm, tested with OS 1.4.1:

Route "https" is invalid: spec.tls.insecureEdgeTerminationPolicy: Invalid
value: "Redirect": InsecureEdgeTerminationPolicy is only allowed for
edge-terminated routes

but the doc (
https://docs.openshift.com/container-platform/3.4/architecture/core_concepts/routes.html)
says the same as you:

"passthrough routes can also have an insecureEdgeTerminationPolicy; the only
valid values are None or empty (for disabled) or Redirect."
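
For reference, a route spec along these lines appears to trigger the error on
1.4.1 (a minimal sketch only; the host and service name are placeholders, not
taken from this thread):

# Minimal sketch; host and service name below are placeholders.
apiVersion: v1
kind: Route
metadata:
  name: https
spec:
  host: myapp.example.com
  to:
    kind: Service
    name: myapp
  tls:
    termination: passthrough
    insecureEdgeTerminationPolicy: Redirect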

Any idea? :(

Thanks


Re: Rolling pod evacuation

2017-04-20 Thread Marko Lukša
Take a look at PodDisruptionBudget. It allows you to specify the minimum 
number of pods that must be kept running when removing pods voluntarily 
(draining nodes is an example of this). But this feature may not be in 
OpenShift yet (IIRC draining nodes in Kubernetes honors the 
PodDisruptionBudget from version 1.6 onwards).
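
For illustration, a minimal PodDisruptionBudget sketch (the label selector and
names here are assumptions, not taken from this thread):

# Minimal sketch; the app label and names are placeholders.
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: myapp-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: myapp

With minAvailable: 1, the eviction API refuses to evict a pod if doing so
would drop the number of available matching pods below one.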



On 20. 04. 2017 10:11, Andrew Lau wrote:
Is there any way to evacuate a node using the rolling deployment
process, where the new pod can start up before the old one is deleted
from the current node?

Drain seems to just delete the pod straight away. If there is a grace
period set, it would be nice if the new pod could at least have its
image pulled on the new node before the old pod is deleted from the
drained node.







Re: Rolling pod evacuation

2017-04-20 Thread Michail Kargakis
If you want to scale up first and wait for the new pod to come up before
deleting the old one, use maxSurge=1, maxUnavailable=0.
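
For illustration, roughly how those settings look on an OpenShift
DeploymentConfig (a minimal sketch; the names, labels and image below are
placeholders, not taken from this thread):

# Minimal sketch; names, labels and image are placeholders.
apiVersion: v1
kind: DeploymentConfig
metadata:
  name: myapp
spec:
  replicas: 2
  selector:
    app: myapp
  strategy:
    type: Rolling
    rollingParams:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: registry.example.com/myapp:latest

With maxSurge=1 and maxUnavailable=0, a rollout creates the new pod first and
only removes an old pod once the new one is ready.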

On Thu, Apr 20, 2017 at 10:11 AM, Andrew Lau  wrote:

> Is there any way to evacuate a node using the rolling deployment process,
> where the new pod can start up before the old one is deleted from the
> current node?
>
> Drain seems to just delete the pod straight away. If there is a grace
> period set, it would be nice if the new pod could at least have its image
> pulled on the new node before the old pod is deleted from the drained node.
>


Rolling pod evacuation

2017-04-20 Thread Andrew Lau
Is there any way to evacuate a node using the rolling deployment process,
where the new pod can start up before the old one is deleted from the
current node?

Drain seems to just delete the pod straight away. If there is a grace
period set, it would be nice if the new pod could at least have its image
pulled on the new node before the old pod is deleted from the drained node.


Re: Node report OK but every pod marked unready

2017-04-20 Thread Andrew Lau
The thin_ls error has been happening for quite some time:
https://github.com/openshift/origin/issues/10940

On Thu, 20 Apr 2017 at 15:55 Tero Ahonen  wrote:

> It seems that error is related to Docker storage on that VM.
>
> .t
>
> Sent from my iPhone
>
> On 20 Apr 2017, at 8.53, Andrew Lau  wrote:
>
> Unfortunately I did not. I dumped the logs and just removed the node in
> order to quickly restore the current containers on another node.
>
> At the exact time it failed I saw a lot of the following:
>
> ===
> thin_pool_watcher.go:72] encountered error refreshing thin pool watcher:
> error performing thin_ls on metadata device
> /dev/mapper/docker_vg-docker--pool_tmeta: Error running command `thin_ls
> --no-headers -m -o DEV,
> EXCLUSIVE_BYTES /dev/mapper/docker_vg-docker--pool_tmeta`: exit status 127
>
> failed (failure): rpc error: code = 2 desc = shim error: context deadline
> exceeded#015
>
> Error running exec in container: rpc error: code = 2 desc = shim error:
> context deadline exceeded
> ===
>
> Seems to match https://bugzilla.redhat.com/show_bug.cgi?id=1427212
>
>
> On Thu, 20 Apr 2017 at 15:41 Tero Ahonen  wrote:
>
>> Hi
>>
>> Did you try to SSH to that node and run `sudo docker run` for some
>> container?
>>
>> .t
>>
>> Sent from my iPhone
>>
>> > On 20 Apr 2017, at 8.18, Andrew Lau  wrote:
>> >
>> > I'm trying to debug a weird scenario where a node has had every pod
>> crash with the error:
>> > "rpc error: code = 2 desc = shim error: context deadline exceeded"
>> >
>> > The pods stayed in the Ready 0/1 state.
>> > The Docker daemon was responding and the kubelet and all its services
>> > were running. The node was reporting OK status.
>> >
>> > No resource limits were hit, with CPU almost idle and memory at 25%
>> > utilisation.
>> >
>> >
>> >
>> >