Re: [ovirt-devel] [VDSM][sampling] thread pool status and handling of stuck calls
(I'm continuing from here, but this probably deserves a new thread. However.)

----- Original Message -----
From: Federico Simoncelli fsimo...@redhat.com
To: devel@ovirt.org
Cc: Nir Soffer nsof...@redhat.com, Michal Skrivanek mskri...@redhat.com, Adam Litke ali...@redhat.com, Francesco Romani from...@redhat.com
Sent: Wednesday, July 9, 2014 4:57:53 PM
Subject: Re: [ovirt-devel] [VDSM][sampling] thread pool status and handling of stuck calls

> >> So basically the current threading model is the behavior we want?
> >> If some call gets stuck, stop sampling this vm. Continue when the
> >> call returns. Michal? Federico?
> >
> > Yep - but with fewer threads, and surely with a constant number of
> > them. Your schedule library (review in my queue at very high
> > priority) is indeed a nice step in this direction. Waiting for
> > Federico's ack.
>
> That looks good. Now I would like to summarize a few things.
>
> We know that when a request gets stuck on a vm, the subsequent ones
> will get stuck as well (at least until their timeout is up, except
> for the first one, which could stay there forever).
>
> We want a limited number of threads polling the statistics (trying
> to match the number of threads that libvirt has).
>
> Given those two assumptions, we want a thread pool of workers that
> pick up jobs *per-vm*. The jobs should be smart enough to:
> - understand what samples they have to take in that cycle (cpu? network? etc.)
> - resubmit themselves in the queue
>
> This will ensure that in the queue there is only one job per vm, and
> if it gets stuck it is not re-submitted (no other worker will get
> stuck).

In the last few days I have been thinking really hard and long about our last discussions, feedback and proposals, and how to properly fit all the pieces together.
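The per-vm job idea quoted above can be sketched roughly like this. This is a minimal illustration, not VDSM code; the class names (`BoundedPool`, `SamplingJob`) and the stop flag are invented for the example. The key point is that a job re-enters the queue only after it finishes, so a VM whose libvirt call hangs occupies at most one worker and is never queued twice:

```python
import queue
import threading


class BoundedPool:
    """A fixed set of worker threads draining one shared queue."""

    def __init__(self, num_workers):
        self.tasks = queue.Queue()
        for _ in range(num_workers):
            threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, job):
        self.tasks.put(job)

    def _worker(self):
        while True:
            job = self.tasks.get()
            job()


class SamplingJob:
    """One job per VM: take the samples due this cycle, then resubmit itself."""

    def __init__(self, vm_id, pool):
        self.vm_id = vm_id
        self.pool = pool
        self._stopped = False

    def stop(self):
        self._stopped = True

    def _sample(self):
        # Collect whatever samples (cpu? network? ...) are due in this
        # cycle.  This is the call that may block on an unresponsive VM.
        pass

    def __call__(self):
        try:
            self._sample()
        finally:
            # Resubmit only after finishing: at most one copy of this
            # job exists in the queue, so a stuck VM cannot pile up jobs.
            if not self._stopped:
                self.pool.submit(self)
```

With a pool of N workers, up to N VMs can be stuck before sampling of the remaining VMs stops entirely, which matches the trade-off discussed in this thread.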
Michal and I also had a chat about these topics on Friday, and eventually I came up with this new draft:

http://gerrit.ovirt.org/#/c/29977

(yes, that's it, just this) which builds on Nir's Schedule and on the existing thread pool hidden inside vdsm/storage, and which I believe provides a much, much better ground for further development or discussion.

Driving forces behind this new draft:
- minimize bloat.
- minimize changes.
- separate concerns nicely (the Scheduler schedules, the thread pool executes, Sampling cares about the actual sampling only).
- leverage the existing infrastructure as much as possible; avoid introducing new fancy stuff unless absolutely needed.

And here it is. Almost all the concepts and requirements we discussed are there. What is lacking is strong isolation among VM samplings. This new concept does nothing to recover stuck worker threads: if the pool is exhausted, everything eventually stops after a few sampling intervals. Stuck jobs are detected and the corresponding VMs are marked unresponsive (leveraging existing infrastructure). When (if?) stuck jobs eventually restart working, everything else restarts as well.

The changes are minimal, and there is still room for refactoring and cleanup, but I believe the design is nicer and cleaner.

Further steps:
* replace the existing thread pool with a fancier one which can replace stuck threads, or dynamically resize itself, to achieve better isolation among VMs or jobs?
* split the new VmStatsCollector class into smaller components?
* stale data detection. Planned but not yet there; I just need to figure out how to properly fit it into the AdvancedStatsFunction windowing sample. Shouldn't be a big deal, however.

I also already have quite a few cleanup patches for the existing thread pool and for the sampling code in the queue; some are on gerrit, some are not. I think most of them can wait until we agree on the overall design.

Nir also provided further suggestions (thanks for that!)
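The stuck-job detection described above (mark the VM unresponsive when its sampling job stops completing, clear the flag when it resumes) could look roughly like this. A hypothetical sketch: the names (`VmSamplingWatchdog`, `monitorable`) and the timeout value are illustrative, not VDSM's actual API:

```python
import time


class VmSamplingWatchdog:
    """Flag a VM as unresponsive when its sampling job has not completed
    a cycle within TIMEOUT seconds; recover automatically once it does."""

    TIMEOUT = 15.0  # assumed: a few sampling intervals

    def __init__(self, vm):
        self.vm = vm
        self.last_completed = time.monotonic()

    def job_completed(self):
        # Called by the sampling job each time it finishes a cycle.
        self.last_completed = time.monotonic()
        self.vm.monitorable = True

    def check(self, now=None):
        # Called by the scheduler every interval; it only inspects a
        # timestamp, so it never blocks even if the job itself is stuck.
        now = time.monotonic() if now is None else now
        if now - self.last_completed > self.TIMEOUT:
            self.vm.monitorable = False
```

Because the check runs in the scheduler rather than in the pool, detection keeps working even when every worker is occupied by a stuck call.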
and possible design alternatives, which I'm now evaluating carefully.

Thanks and best,
--
Francesco Romani
Red Hat Engineering Virtualization R&D
Phone: 8261328
IRC: fromani
_______________________________________________
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] ovirt-guest-agent
Subject: Re: [ovirt-devel] ovirt-guest-agent
From: michal.skriva...@redhat.com
Date: Fri, 11 Jul 2014 10:05:59 +0200
CC: vfeen...@redhat.com; in...@ovirt.org; devel@ovirt.org; cybertimber2...@hotmail.com
To: sbona...@redhat.com

On Jul 11, 2014, at 09:17, Sandro Bonazzola sbona...@redhat.com wrote:

On 11/07/2014 09:12, Vinzenz Feenstra wrote:

On 07/11/2014 09:02 AM, Sandro Bonazzola wrote:

Hi, looking at http://jenkins.ovirt.org/view/All/ I see that for ovirt-guest-agent there's only the following job: http://jenkins.ovirt.org/view/All/job/ovirt-guest-agent_master_gerrit/ This means that ovirt-guest-agent is built nightly neither for master nor for stable, and it's published neither in the nightly snapshot nor in official releases.

Yeah, it has never been built nightly.

I see that ovirt-guest-agent is shipped within Fedora and EPEL: http://koji.fedoraproject.org/koji/packageinfo?packageID=15434 so I suppose official releases are shipped there.

Yes.

Do we need nightly builds?

It would be great if we had nightly builds. Or some trigger after merges; since the guest agent doesn't have such a high volume of patches, nightly might be overkill, at least for now.

Ok, I'll open a ticket for that.

Also, I see that http://www.ovirt.org/How_to_install_the_guest_agent_in_Fedora refers to an obsolete layout / releases, and the package is not shipped within the oVirt repo for the above considerations, so it should be updated accordingly.

True, and now looking at it, I have to say that I am surprised that the guest agent builds aren't included in the oVirt releases anymore. Usually they included the latest koji builds (at least to my knowledge); that does not seem to be the case anymore, when looking at http://resources.ovirt.org/releases/3.4/rpm/fc20/noarch/ for example.

It's because RCs are composed from nightly, and if a package is not in nightly it's not in the RC, and then not in the final release.
Well, not really an issue, since official releases are available in the official Fedora and EPEL repos; this also avoids RPM duplication.

It would be nice if it could be shipped again - my version of the wiki article had instructions for two use cases: one where a Fedora VM could connect to the Fedora repos to get the agent, and one where the VM does not have internet access. It seemed like a good workaround until we can see if the *nix agents can be added to a guest agent CD. For 3.5 I think Lev is targeting just Windows systems.

And since the guest agent is supposed to go into the guest instead of being installed as part of the oVirt installation, it kind of makes sense like this indeed :)

--
Sandro Bonazzola
Better technology. Faster innovation. Powered by community collaboration.
See how it works at redhat.com
Re: [ovirt-devel] python-ioprocess for el7?
On 07/11/2014 07:46 AM, Dan Kenigsberg wrote:

On Thu, Jul 10, 2014 at 05:01:21PM -0400, Adam Litke wrote:

Hi, I am looking for python-ioprocess RPMs (new enough for the latest vdsm requirements). Can anyone point me in the right direction? Thanks!

Looking at:
https://admin.fedoraproject.org/updates/search/python-pthreading
https://admin.fedoraproject.org/updates/search/python-cpopen
https://admin.fedoraproject.org/updates/search/ioprocess
I can confirm that we are missing quite a few of our dependencies for el7. Douglas, Yaniv: can you have them built? I see that http://dl.fedoraproject.org/pub/epel/beta/7/x86_64/ already exists, and I hope to see our packages there.

Sure, please refresh; python-cpopen and python-pthreading should be there now. However, ioprocess requires interaction with Saggi.

--
Cheers
Douglas