Re: [Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-15 Thread Count Count
Thanks, creating the venv in a container did the trick. May I suggest adding virtualenv to the Python 3.7 image? python -m venv works, but virtualenv is more commonly used. On Mon, Jan 13, 2020 at 2:09 AM Brooke Storm wrote: > You have to create the venv in a container using 'webservice shell of

Re: [Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-13 Thread Brooke Storm
Per that ticket, I no longer think there is any issue with the images for python at least. There was an issue with some nodes (that is being/mostly fixed). I’ll take a look at glamtools Brooke Storm Senior SRE Wikimedia Cloud Services bst...@wikimedia.org IRC: bsto

Re: [Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-13 Thread Brooke Storm
I’ve created Phabricator task T242632 to track this. Please coordinate there as well for anyone with information and time. Brooke Storm Senior SRE Wikimedia Cloud Services bst...@wikimedia.org IRC: bstorm_ > On Jan 13, 2020, at 9:10 AM, Brooke Storm wrote: > > I

Re: [Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-13 Thread Brooke Storm
I suspect this will affect containers that run Debian Buster packages. I see php7.3 and python3.7. I’d suggest not even restarting web services on those runtimes until we have it fixed. For anyone who has done so, we are working on it. Any logs could be helpful. Brooke Storm Senior SRE Wikim

Re: [Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-13 Thread Brooke Storm
I see a problem with at least one container image (which has nothing to do with the new cluster; I can see it on the old cluster as well). I’m going to try to fix that now. (Magnus, this is probably what you are seeing as well.) Brooke Storm Senior SRE Wikimedia Cloud Servic

Re: [Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-13 Thread Magnus Manske via Cloud
And now I can't switch back: ...switch back commands... tools.glamtools@tools-sgebastion-07:~$ unalias kubectl tools.glamtools@tools-sgebastion-07:~$ webservice --backend=kubernetes php7.3 start Your job is already running tools.glamtools@tools-sgebastion-07:~$ webservice stop Your webservice is

Re: [Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-13 Thread Magnus Manske via Cloud
I switched a few "big ones" successfully, but ran into one that doesn't work: glamtools https://tools.wmflabs.org/glamtools/ is 503 but `webservice status` says "Your webservice of type php7.3 is running". On Thu, Jan 9, 2020 at 9:58 PM Bryan Davis wrote: > I am happy to announce that a new and

Re: [Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-12 Thread Brooke Storm
You have to create the venv in a container using 'webservice shell of the right runtime'. We support Python versions from Debian Jessie, Stretch and Buster by building in containers, so we cannot sync more than one of those to the bastion. We have moved a lot of Python tools back and forth without

Re: [Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-12 Thread Alex Monk
Interesting, uwsgi had Python 3.7.3 but `./www/python/venv/bin/python --version` says 3.7.6. Is that a big enough difference to cause problems? On Sun, 12 Jan 2020 at 23:19, Chico Venancio wrote: > Maybe a venv created in a different python version? > > Chico Venancio > > Em dom, 12 de jan de 20
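The mismatch described above (uwsgi on 3.7.3, the venv reporting 3.7.6) can be checked without launching the webservice. The following is only a minimal sketch: it assumes the venv was created by `python -m venv`, which records the creating interpreter in a `version` line in `pyvenv.cfg`. The helper names and the default path are hypothetical and chosen for illustration. It reports the difference and says nothing about whether a patch-level gap is enough to cause the crash.

```python
import sys
from pathlib import Path


def read_venv_version(venv_dir):
    """Return the 'version' value recorded in <venv_dir>/pyvenv.cfg, or None."""
    cfg = Path(venv_dir) / "pyvenv.cfg"
    for line in cfg.read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "version":
            return value.strip()
    return None


def same_minor(version_a, version_b):
    """True when both versions share major.minor (e.g. 3.7.3 and 3.7.6)."""
    return version_a.split(".")[:2] == version_b.split(".")[:2]


def describe(venv_version, runtime_version):
    """Classify how the venv's recorded version relates to the runtime's."""
    if venv_version == runtime_version:
        return "match"
    if same_minor(venv_version, runtime_version):
        return "patch-level mismatch"
    return "minor/major mismatch"


if __name__ == "__main__":
    # Hypothetical default path, relative to the tool's home directory.
    venv_dir = sys.argv[1] if len(sys.argv) > 1 else "www/python/venv"
    runtime = ".".join(str(part) for part in sys.version_info[:3])
    try:
        recorded = read_venv_version(venv_dir)
    except OSError as exc:
        print(f"could not read pyvenv.cfg under {venv_dir}: {exc}")
    else:
        if recorded is None:
            print("pyvenv.cfg has no 'version' line")
        else:
            print(f"venv {recorded} vs runtime {runtime}: {describe(recorded, runtime)}")
```

To compare against the interpreter in the image rather than the one on the bastion, the script would need to run with the container's Python.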

Re: [Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-12 Thread Count Count
> > Maybe a venv created in a different python version? > Hmm, I am using a venv with Python 3.7.6. I can try with 3.7.3 tomorrow, which is used in the image. BTW: No version of Python 3.7 is installed on the dev/bastion hosts afaics. Might be a good idea to sync the version to the one used in th

Re: [Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-12 Thread Chico Venancio
Maybe a venv created in a different python version? Chico Venancio Em dom, 12 de jan de 2020 20:14, Alex Monk escreveu: > I think I've seen that particular error that you see in stdout/stderr (via > kubectl logs) before - on pods that in fact were working. > > Meanwhile, uwsgi.log says: > > Pyt

Re: [Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-12 Thread Alex Monk
I think I've seen that particular error that you see in stdout/stderr (via kubectl logs) before - on pods that in fact were working. Meanwhile, uwsgi.log says: Python version: 3.7.3 (default, Apr 3 2019, 05:39:12) [GCC 8.3.0] Set PythonHome to /data/project/countcounttest/www/python/venv Fatal

Re: [Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-12 Thread Count Count
> > Your pod started a container and it crashed, I see a uwsgi.log file with > a python module problem and a uwsgi segfault. > Yes. It was working fine with the legacy cluster. The service is started via webservice --backend=kubernetes python3.7 start Apparently it cannot load the uwsgi share

Re: [Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-12 Thread Alex Monk
Hi Count Count, I believe I may have sorted out an issue that prevented some pods (depending partially on luck) from creating containers. Your pod started a container and it crashed; I see a uwsgi.log file with a Python module problem and a uwsgi segfault. On Sun, 12 Jan 2020 at 22:12, Alex Monk

Re: [Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-12 Thread Alex Monk
Thanks Count Count. I have identified a new issue with the new k8s cluster and am looking into it. On Sun, 12 Jan 2020 at 21:43, Count Count wrote: > Yes, I switched back to the old cluster. This is a new tool that was used > in production even if only rarely. I can't leave it offline for hours.

Re: [Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-12 Thread Yongmin Hong via Cloud
Hi. While I know what 'kubernetes' is, I have no idea whether any of the tools I maintain are affected by this k8s migration, and if so, why. I simply use `jsub` to submit jobs, then sit back and expect it to work (and it does). I have no memory of ever touching anything kubectl-re

Re: [Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-12 Thread Maciej Jaros
Russell Blau (2020-01-12 21:40): Between steps 1 and 2, did you insert “webservice stop”?  If not, try that!  :-) Yes, webservice was off. And I also did try to turn it off and on again ;-). A few times. I also tried "php7.2" and that didn't work either ¯\_(ツ)_/¯ Sent from my iPhone On Ja

Re: [Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-12 Thread Maciej Jaros
Brooke Storm (2020-01-12 20:53): Hi Nux, I took a look, and I see you have DNA running on Grid Engine.  Has it ever run ok on either Kubernetes backend (the old “default” or the new “toolforge”)? Yes, IIRC Bryan did run it on Kubernetes for me (at my request on IRC). That was just before

Re: [Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-12 Thread Count Count
Yes, I switched back to the old cluster. This is a new tool that was used in production even if only rarely. I can't leave it offline for hours. I have created a test tool as a copy with which I can reproduce the issue: tools.countcounttest@tools-sgebastion-07:~$ kubectl get pods NAME

Re: [Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-12 Thread Alex Monk
Hi Count Count, I'm afraid you seem to have no pods on the new cluster to look at: # kubectl get -n tool-flaggedrevspromotioncheck pod No resources found. Alex On Sun, 12 Jan 2020 at 21:07, Count Count wrote: > Hi! > > I don't have much luck with a webservice based on the python3.7 image. It

Re: [Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-12 Thread Count Count
Hi! I don't have much luck with a webservice based on the python3.7 image. It is running fine on the legacy K8s cluster. On the new cluster I got a segfault. After stopping the webservice and trying again to get an empty log, the pod is now stuck in ContainerCreating. A few minutes ago: tools.fla

Re: [Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-12 Thread Russell Blau
Between steps 1 and 2, did you insert “webservice stop”? If not, try that! :-) Sent from my iPhone > On Jan 11, 2020, at 5:08 PM, Maciej Jaros wrote: > >  > Hi > > I tried the migration path described here: > https://wikitech.wikimedia.org/wiki/News/2020_Kubernetes_cluster_migration#Manuall

Re: [Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-12 Thread Brooke Storm
Hi Nux, I took a look, and I see you have DNA running on Grid Engine. Has it ever run ok on either Kubernetes backend (the old “default” or the new “toolforge”)? Brooke Storm Senior SRE Wikimedia Cloud Services bst...@wikimedia.org IRC: bstorm_ > On Jan 11, 2020, a

Re: [Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-11 Thread Maciej Jaros
Hi I tried the migration path described here: https://wikitech.wikimedia.org/wiki/News/2020_Kubernetes_cluster_migration#Manually_migrate_a_webservice_to_the_new_cluster That doesn't seem to be working for me (or at least not for my /dna/ tool). Some problems: 1. `webservice status` on grid en

Re: [Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-10 Thread Chase Pettet
Amazing work folks. I'm really proud of you all. On Thu, Jan 9, 2020 at 3:58 PM Bryan Davis wrote: > I am happy to announce that a new and improved Kubernetes cluster is > now available for use by beta testers on an opt-in basis. A page has > been created on Wikitech [0] outlining the self-serv

[Cloud] [Cloud-announce] [Toolforge] New Kubernetes cluster open for beta testers

2020-01-09 Thread Bryan Davis
I am happy to announce that a new and improved Kubernetes cluster is now available for use by beta testers on an opt-in basis. A page has been created on Wikitech [0] outlining the self-service migration process. Timeline: * 2020-01-09: 2020 Kubernetes cluster available for beta testers on an opt-