Hi,

> On 28.08.2017 at 23:34, Michael Stauffer <[email protected]> wrote:
>
> On Tue, Aug 22, 2017 at 3:26 AM, Reuti <[email protected]> wrote:
>
>> On 22.08.2017 at 00:38, Michael Stauffer wrote:
>>
>>> On Thu, Aug 17, 2017 at 7:39 AM, Reuti <[email protected]> wrote:
>>>
>>>> My experience is, that sometimes RQS blocks the execution for unknown reasons while jobs should start according to their setting.
>>>>
>>>> A mysterious bug? I hope not. :/
>>>
>>> Unfortunately my experience is, that there is something odd with RQS. I had the effect several times, especially if one uses a load sensor, that at some point no more jobs were scheduled. Disabling the RQS worked instantly, although they should have been started before this.
>>>
>>> I'm not sure how I'd run my cluster without RQS - it'd be a free-for-all unless there's another way to limit user's resource consumption?
>>
>> Sure, but could you disable the RQS to test it? Setting the "enabled" flag shows already whether there is an issue in your case.
>
> If I disable the slot and memory RQS's, then the stuck jobs run. I've got a workaround, see other post I'll make in a minute to my other thread.
>
>>> Also, I don't believe I have any load sensor running except for the default. Should I try disabling that? How do I do that?
>>
>> It's not directly the issue of a custom load sensor, but if you use any value which is *not* computed as a consumable. E.g. a memory load.
>
> I don't believe I use anything other than the consumables. Or rather, I don't recall setting anything up other than the consumables. Are there default settings (SoGE 8.1.8) that would do what you're talking about?

AFAICT, the load sensor values for the double-faced complexes like virtual_free (when made consumable) can also affect this. But I never got a definitive indication of what is causing it. It even happened that, after some time, I could switch the RQS back on and it worked correctly for a while.

-- Reuti

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
