Hi everybody,

some news from the Analytics team:

- The Kerberos ticket expiry time has been bumped to 48h. You can
kdestroy/kinit to get the new settings :)

- There are new memory and CPU limits on all stat/notebook hosts, which
should automatically kill big jobs that cause too much memory pressure. CPU
usage is also limited to 90% of the available cores to leave room for
system daemons. This should help a lot in avoiding recurrent alarms to the
SRE team (and me reaching out to some of you as a consequence!) and it
should be a fairer system for everybody. To apply these new settings I'd
need to shut down and restart all the notebooks running on
notebook1003/1004, but I haven't done it since I didn't want to impact any
work. If you could please take care of stopping and starting your
notebooks, it would be really appreciated :)

- We deployed jupyterhub on stat1004 and stat1006, ready for general use!
This should help in avoiding the small home size problem that many of you
are experiencing on notebook1003/1004. We are also working on setting up
jupyterhub on stat1005 with updated dependencies (jupyterhub 1.1.0, toree
0.3.0, etc.; the full list is at
https://gerrit.wikimedia.org/r/#/c/analytics/jupyterhub/deploy/+/577761/1/frozen-requirements.txt).
The plan is to eventually have the same version on all stat boxes (no
timeline yet). We didn't deploy jupyterhub on stat1007 due to some puppet
code refactoring in progress, but we hope to do it next quarter.

- A new stat host (stat1008) will be ready for general use soon. Like
stat1005, it hosts a GPU.
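
As a rough illustration of the 90% CPU limit mentioned above, here is a
minimal sketch of how many cores that leaves for user jobs on a given host.
This is not the actual configuration on the hosts: the function name
usable_cores and the rounding down to whole cores are assumptions made only
for the example, and the real limit may be enforced as a fractional quota.

```python
import os


def usable_cores(total_cores: int, percent: int = 90) -> int:
    """Estimate the whole cores left for user jobs under a percentage cap.

    Hypothetical helper for illustration only; the hosts may apply the
    limit differently (for example as a fractional CPU quota).
    """
    if total_cores < 1:
        raise ValueError("total_cores must be at least 1")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range 1..100")
    # Integer arithmetic avoids floating-point rounding surprises.
    # Round down so some capacity is always left for system daemons,
    # but never report fewer than one core.
    return max(1, total_cores * percent // 100)


if __name__ == "__main__":
    total = os.cpu_count() or 1
    print(f"{total} cores total, about {usable_cores(total)} usable at 90%")
```

For example, on a host with 32 cores this estimate gives 28 cores for user
jobs, with the rest left to system daemons.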

If you have questions, doubts, etc., please feel free to follow up with me
or any member of the Analytics team on #wikimedia-analytics :)

Luca (on behalf of the Analytics team)
_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics