Re: Rolling pod evacuation
Thanks. I did see PodDisruptionBudget in the docs, but it requires that extra step, and users are also unable to create their own by default(?)

On Thu, 20 Apr 2017 at 19:23 Marko Lukša wrote:
> Take a look at PodDisruptionBudget. It allows you to specify the minimum
> number of pods that must be kept running when removing pods voluntarily
> (draining nodes is an example of this). But this feature may not be in
> OpenShift yet (IIRC draining nodes in Kubernetes honors the
> PodDisruptionBudget from version 1.6 onwards).
>
> On 20. 04. 2017 10:11, Andrew Lau wrote:
> > Is there any way to evacuate a node using the rolling deployment process
> > where the new pod can start up first before being deleted from the
> > current node?
> >
> > Drain seems to only delete the pod straight away. If there is a grace
> > period set, it would be nice if the new pod could at least have its image
> > pulled onto a new node first before being deleted from the drained node.

___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
Re: Rolling pod evacuation
I didn't think this was honoured, as it just deletes the pods?

On Thu, 20 Apr 2017 at 18:43 Michail Kargakis wrote:
> If you want to scale up first and wait for the new pod to come up before
> deleting the old one, use maxSurge=1, maxUnavailable=0.
>
> On Thu, Apr 20, 2017 at 10:11 AM, Andrew Lau wrote:
> > Is there any way to evacuate a node using the rolling deployment process
> > where the new pod can start up first before being deleted from the
> > current node?
> >
> > Drain seems to only delete the pod straight away. If there is a grace
> > period set, it would be nice if the new pod could at least have its image
> > pulled onto a new node first before being deleted from the drained node.

___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
Re: Node report OK but every pod marked unready
Thanks! Hopefully we don't hit this too much until 1.5.0 is released.

On Fri, 21 Apr 2017 at 01:26 Patrick Tescher wrote:
> We upgraded to 1.5.0 and that error went away.
>
> On Apr 19, 2017, at 10:59 PM, Andrew Lau wrote:
> > thin_ls has been happening for quite some time:
> > https://github.com/openshift/origin/issues/10940
> >
> > On Thu, 20 Apr 2017 at 15:55 Tero Ahonen wrote:
> > > It seems that error is related to docker storage on that vm
> > >
> > > .t
> > >
> > > On 20 Apr 2017, at 8.53, Andrew Lau wrote:
> > > > Unfortunately I did not. I dumped the logs and just removed the node in
> > > > order to quickly restore the current containers on another node.
> > > >
> > > > At the exact time it failed I saw a lot of the following:
> > > >
> > > > ===
> > > > thin_pool_watcher.go:72] encountered error refreshing thin pool watcher:
> > > > error performing thin_ls on metadata device
> > > > /dev/mapper/docker_vg-docker--pool_tmeta: Error running command `thin_ls
> > > > --no-headers -m -o DEV,EXCLUSIVE_BYTES
> > > > /dev/mapper/docker_vg-docker--pool_tmeta`: exit status 127
> > > >
> > > > failed (failure): rpc error: code = 2 desc = shim error: context deadline
> > > > exceeded#015
> > > >
> > > > Error running exec in container: rpc error: code = 2 desc = shim error:
> > > > context deadline exceeded
> > > > ===
> > > >
> > > > Seems to match https://bugzilla.redhat.com/show_bug.cgi?id=1427212
> > > >
> > > > On Thu, 20 Apr 2017 at 15:41 Tero Ahonen wrote:
> > > > > Hi
> > > > >
> > > > > Did you try to ssh to that node and execute sudo docker run for some
> > > > > container?
> > > > >
> > > > > .t
> > > > >
> > > > > On 20 Apr 2017, at 8.18, Andrew Lau wrote:
> > > > > > I'm trying to debug a weird scenario where a node has had every pod
> > > > > > crash with the error:
> > > > > > "rpc error: code = 2 desc = shim error: context deadline exceeded"
> > > > > >
> > > > > > The pods stayed in the state Ready 0/1.
> > > > > > The docker daemon was responding and the kubelet and all its services
> > > > > > were running. The node was reporting with the OK status.
> > > > > >
> > > > > > No resource limits were hit, with CPU almost idle and memory at 25%
> > > > > > utilisation.

___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
Re: Passthrough and insecure route
Hmm, tested with OS 1.4.1:

Route "https" is invalid: spec.tls.insecureEdgeTerminationPolicy: Invalid value: "Redirect": InsecureEdgeTerminationPolicy is only allowed for edge-terminated routes

but the doc (https://docs.openshift.com/container-platform/3.4/architecture/core_concepts/routes.html) is telling the same as you: "passthrough routes can also have an insecureEdgeTerminationPolicy; the only valid values are None (or empty, for disabled) or Redirect."

Any idea? :( Thanks

___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
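For context, the route spec being rejected looks roughly like this (a sketch; the service name is hypothetical, and it is this combination of passthrough termination with Redirect that the 1.4.1 validator refuses even though the 3.4 docs list Redirect as valid):

```yaml
# Sketch of the rejected route: passthrough termination combined
# with insecureEdgeTerminationPolicy: Redirect.
apiVersion: v1
kind: Route
metadata:
  name: https             # name taken from the error message
spec:
  to:
    kind: Service
    name: my-service      # hypothetical service name
  tls:
    termination: passthrough
    insecureEdgeTerminationPolicy: Redirect
```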
Re: Rolling pod evacuation
Take a look at PodDisruptionBudget. It allows you to specify the minimum number of pods that must be kept running when removing pods voluntarily (draining nodes is an example of this). But this feature may not be in OpenShift yet (IIRC draining nodes in Kubernetes honors the PodDisruptionBudget from version 1.6 onwards).

On 20. 04. 2017 10:11, Andrew Lau wrote:
> Is there any way to evacuate a node using the rolling deployment process
> where the new pod can start up first before being deleted from the
> current node?
>
> Drain seems to only delete the pod straight away. If there is a grace
> period set, it would be nice if the new pod could at least have its image
> pulled onto a new node first before being deleted from the drained node.

___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
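For reference, a minimal PodDisruptionBudget might look like this (a sketch; the name, label selector, and minAvailable value are hypothetical, and the policy/v1beta1 API group is what Kubernetes 1.5/1.6-era clusters expose):

```yaml
# Hypothetical example: keep at least 2 pods matching the selector
# running during voluntary disruptions such as a node drain.
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb        # hypothetical name
spec:
  minAvailable: 2         # eviction is blocked if it would drop below this
  selector:
    matchLabels:
      app: my-app         # hypothetical label
```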
Re: Rolling pod evacuation
If you want to scale up first and wait for the new pod to come up before deleting the old one, use maxSurge=1, maxUnavailable=0.

On Thu, Apr 20, 2017 at 10:11 AM, Andrew Lau wrote:
> Is there any way to evacuate a node using the rolling deployment process
> where the new pod can start up first before being deleted from the
> current node?
>
> Drain seems to only delete the pod straight away. If there is a grace
> period set, it would be nice if the new pod could at least have its image
> pulled onto a new node first before being deleted from the drained node.

___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
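In a DeploymentConfig these parameters would go under the rolling strategy, roughly like this (a sketch with surrounding fields elided; the name and replica count are hypothetical):

```yaml
# Sketch of a DeploymentConfig fragment: with maxSurge=1 and
# maxUnavailable=0, a replacement pod is started and must become
# ready before an old pod is removed.
apiVersion: v1
kind: DeploymentConfig
metadata:
  name: my-app            # hypothetical name
spec:
  replicas: 3
  strategy:
    type: Rolling
    rollingParams:
      maxSurge: 1         # allow one extra pod above the desired count
      maxUnavailable: 0   # never drop below the desired count
```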
Rolling pod evacuation
Is there any way to evacuate a node using the rolling deployment process, where the new pod can start up first before being deleted from the current node?

Drain seems to only delete the pod straight away. If there is a grace period set, it would be nice if the new pod could at least have its image pulled onto a new node first before being deleted from the drained node.

___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
Re: Node report OK but every pod marked unready
thin_ls has been happening for quite some time:
https://github.com/openshift/origin/issues/10940

On Thu, 20 Apr 2017 at 15:55 Tero Ahonen wrote:
> It seems that error is related to docker storage on that vm
>
> .t
>
> On 20 Apr 2017, at 8.53, Andrew Lau wrote:
> > Unfortunately I did not. I dumped the logs and just removed the node in
> > order to quickly restore the current containers on another node.
> >
> > At the exact time it failed I saw a lot of the following:
> >
> > ===
> > thin_pool_watcher.go:72] encountered error refreshing thin pool watcher:
> > error performing thin_ls on metadata device
> > /dev/mapper/docker_vg-docker--pool_tmeta: Error running command `thin_ls
> > --no-headers -m -o DEV,EXCLUSIVE_BYTES
> > /dev/mapper/docker_vg-docker--pool_tmeta`: exit status 127
> >
> > failed (failure): rpc error: code = 2 desc = shim error: context deadline
> > exceeded#015
> >
> > Error running exec in container: rpc error: code = 2 desc = shim error:
> > context deadline exceeded
> > ===
> >
> > Seems to match https://bugzilla.redhat.com/show_bug.cgi?id=1427212
> >
> > On Thu, 20 Apr 2017 at 15:41 Tero Ahonen wrote:
> > > Hi
> > >
> > > Did you try to ssh to that node and execute sudo docker run for some
> > > container?
> > >
> > > .t
> > >
> > > On 20 Apr 2017, at 8.18, Andrew Lau wrote:
> > > > I'm trying to debug a weird scenario where a node has had every pod
> > > > crash with the error:
> > > > "rpc error: code = 2 desc = shim error: context deadline exceeded"
> > > >
> > > > The pods stayed in the state Ready 0/1.
> > > > The docker daemon was responding and the kubelet and all its services
> > > > were running. The node was reporting with the OK status.
> > > >
> > > > No resource limits were hit, with CPU almost idle and memory at 25%
> > > > utilisation.

___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users