On Mon, Mar 7, 2016 at 6:04 PM, Joshua Hesketh <joshua.hesk...@gmail.com> wrote:
> Hi Mikhail,
>
> Okay thanks, that's helpful.
>
> You mentioned that you might try restarting zuul periodically to see if that
> helps. Perhaps instead you could do a reload (or HUP) first to see if that
> clears the cache and alleviates the issue for you?

SIGHUP (kill -1) does reload the configuration (according to the logs), but I
saw no immediate effect on the memory footprint. At the time of the test,
zuul-server was at 3GB (24 hours earlier it was at 2GB). Unfortunately I
had to restart zuul-server due to unrelated problems, so I now need to wait
some time before I can test again. I would definitely go the periodic SIGHUP
route if it proves to work; that is a good idea.

>
> Cheers,
> Josh
>
> On Tue, Mar 8, 2016 at 10:53 AM, Mikhail Medvedev <mihail...@gmail.com>
> wrote:
>>
>> Hi Josh,
>>
>> On Mon, Mar 7, 2016 at 5:25 PM, Joshua Hesketh <joshua.hesk...@gmail.com>
>> wrote:
>> > Hi Mikhail,
>> >
>> > Thank you for the extra details. I'll continue to look into this.
>> >
>> > With the daily bumps when you do the log rotation, I assume you aren't
>> > reloading zuul at that point and the freed memory is likely due to
>> > another process?
>>
>> I was puzzled by the bumps, and checked the syslog. They are definitely
>> due to "run-parts --report /etc/cron.daily" being triggered at 06:25, and
>> not zuul reloads. The memory bumps could be due to any of the cron jobs.
>> logrotate seemed likely.
>> For the record:
>>
>> root@zuul:~# ls /etc/cron.daily
>> apache2 apport apt aptitude bsdmainutils dpkg exim4-base
>> logrotate man-db mlocate ntp passwd update-notifier-common
>> upstart
>>
>> I have also confirmed there were no changes to zuul layout for the
>> interval that the graph shows.
>>
>> >
>> > Cheers,
>> > Josh
>> >
>> > On Tue, Mar 8, 2016 at 10:17 AM, Mikhail Medvedev <mihail...@gmail.com>
>> > wrote:
>> >>
>> >> On Wed, Feb 10, 2016 at 10:57 AM, James E. Blair <cor...@inaugust.com>
>> >> wrote:
>> >> > Michael Still <mi...@stillhq.com> writes:
>> >> >
>> >> >> On Tue, Feb 9, 2016 at 4:59 AM, Joshua Hesketh
>> >> >> <joshua.hesk...@gmail.com> wrote:
>> >> >>
>> >> >>> On Thu, Feb 4, 2016 at 2:44 AM, James E. Blair
>> >> >>> <cor...@inaugust.com> wrote:
>> >> >>>>
>> >> >>>> On the subject of clearing the cache more often, I think we may
>> >> >>>> not want to wipe out the cache more often than we do now -- in
>> >> >>>> fact, I think we may want to look into ways to keep from doing
>> >> >>>> even that, because whenever we reload now, Zuul slows down
>> >> >>>> considerably as it has to query Gerrit again for all of the data
>> >> >>>> previously in its cache.
>> >> >>>>
>> >> >>>
>> >> >>> I can see a lot of 3rd parties or simpler CI's not needing to
>> >> >>> reload zuul very often so this cache would never get cleared.
>> >> >>> Perhaps cached objects should have an expiry time (of a day or so)
>> >> >>> and can be cleaned up periodically? Additionally if clearing the
>> >> >>> cache on a reload is causing pain maybe we should move the cache
>> >> >>> into the scheduler and keep it between reloads?
>> >> >>>
>> >> >>
>> >> >> Do you guys use oslo at all? I ask because the oslo memcache stuff
>> >> >> does exactly this, so it should be trivial to implement if you
>> >> >> don't mind depending on oslo.
>> >> >
>> >> > One of the main things we use the cache for is to ensure that every
>> >> > change is represented by a single Change object in Zuul's memory.
>> >> > The graph of enqueued Items link to their respective Changes which
>> >> > may link to each other due to dependencies. When something changes
>> >> > in Gerrit, we want that reflected immediately and consistently in
>> >> > all of the objects in that graph. Using the cache means that every
>> >> > time we add a new Change object to that graph, we use the same
>> >> > object for a given change.
>> >> >
>> >> > This is why we can't use time-based expiry -- we must not drop
>> >> > objects from the cache if they are still in the graph. Otherwise we
>> >> > will create new duplicative objects and the ones still in the graph
>> >> > will not be updated.
>> >> >
>> >> > Perhaps we should change these objects to something more ephemeral
>> >> > that can proxy for some other mechanism that can operate more like a
>> >> > traditional cache (with time-based expiry). But I think changes to
>> >> > this system should happen in Zuulv3 -- it works well enough for
>> >> > Zuulv2 for now.
>> >> >
>> >> > -Jim
>> >> >
>> >>
>> >> We are one of the third-party CIs and are using "Zuul version:
>> >> 2.1.1.dev123", which is one commit after [1]. That one commit after is
>> >> not in tree - I am applying [2] on top.
>> >>
>> >> The VM has 8GB of RAM. zuul-server memory footprint goes up
>> >> consistently over the course of a week. Normally it takes about 3-4
>> >> days to get over to 3GB. About a week ago I witnessed zuul-server get
>> >> to 95% of RAM, at which point the kernel started killing other
>> >> processes. The graph [3] shows memory usage, and it reflects
>> >> zuul-server consumption. The daily bumps on the graph are the daily
>> >> cron doing log rotation etc, possibly flushing caches.
>> >>
>> >> I cannot say with 100% certainty that it is still the leak. It could
>> >> simply be that zuul-server requires more RAM now.
>> >>
>> >> [1] https://review.openstack.org/#q,I81ee47524cda71a500c55a95a2280f491b1b63d9,n,z
>> >> [2] https://review.openstack.org/#q,If3a418fa2d4993a149d454e02a9b26529e4b6825,n,z
>> >> [3] http://imgur.com/SzqSA1H
>> >>
>> >> Mikhail Medvedev (mmedvede)
>> >>
>> >> _______________________________________________
>> >> OpenStack-Infra mailing list
>> >> OpenStack-Infra@lists.openstack.org
>> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra
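
For anyone wanting to repeat the SIGHUP experiment from this thread, here is a
minimal sketch that records the resident memory of a process, sends it SIGHUP,
waits, and records the memory again. It is not part of Zuul. The helper names
(parse_vmrss_kb, read_rss_kb, reload_and_measure) and the 60 second settle
delay are made up for illustration, and it assumes a Linux host where
/proc/<pid>/status exposes a VmRSS line. The PID of zuul-server has to be
supplied by the caller.

```python
import os
import signal
import time


def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:    3145728 kB".
            return int(line.split()[1])
    return None


def read_rss_kb(pid):
    """Read the resident set size of a process from Linux procfs."""
    with open("/proc/%d/status" % pid) as f:
        return parse_vmrss_kb(f.read())


def reload_and_measure(pid, settle_seconds=60, send_signal=os.kill,
                       read_rss=read_rss_kb, sleep=time.sleep):
    """Send SIGHUP to pid and return (rss_before_kb, rss_after_kb).

    The send_signal, read_rss and sleep parameters are hypothetical hooks so
    the function can be exercised without touching a real process.
    """
    before = read_rss(pid)
    send_signal(pid, signal.SIGHUP)
    sleep(settle_seconds)  # arbitrary delay to let the reload finish
    after = read_rss(pid)
    return before, after
```

Calling reload_and_measure(pid) from a daily cron job and logging the two
numbers would show over a few days whether a reload actually frees memory,
before committing to the periodic SIGHUP route.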