Hello,

in July 2010 I asked on the users mailing list back at SunSource about a peculiar regression in the master node selection behaviour of SGE 6.2u5 (see http://markmail.org/message/svuskq5qc6oe3axv). After some discussion, Andy pointed out that I was most likely hitting IZ 3148, which was fixed in 6.2u6. And indeed, I was not able to trigger the bug in 6.2u6, which only made matters worse, because I couldn't upgrade.
Today I tried a recent build of V800_BRANCH of https://github.com/gridengine/gridengine.git and was able to reproduce the bug just as with SGE 6.2u5. Does anyone here have a handle on the issue and can help track it down and fix it? Does perhaps one of the other forks fix the bug?

In short, after some jobs have been run on an empty cluster, the scheduler will start distributing, say, a two-slot $pe_fillup job over two nodes even though one of them could have accommodated the whole job. An example:

    weiser@laudrup ~ $ qhost -j
    HOSTNAME                ARCH         NCPU NSOC NCOR NTHR  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
    ----------------------------------------------------------------------------------------------
    global                  -               -    -    -    -     -       -       -       -       -
    kempes                  lx-amd64        4    0    0    0  0.00   31.4G  249.7M   33.4G     0.0
       job-ID  prior   name       user         state submit/start at     queue      master ja-task-ID
       ----------------------------------------------------------------------------------------------
            17 0.51000 STDIN      weiser       r     06/28/2011 13:52:53 normal@kem MASTER
            26 0.61000 STDIN      weiser       r     06/28/2011 13:56:53 normal@kem MASTER
    laudrup                 lx-amd64        2    0    0    0  0.04    7.7G  654.5M    1.9G  224.0K
            26 0.61000 STDIN      weiser       r     06/28/2011 13:56:53 normal@lau SLAVE
    maradonna               lx-amd64        4    0    0    0  0.00   31.4G  354.4M   33.4G     0.0

Thanks in advance,
-- 
Michael Weiser
Senior Systems Engineer
science + computing ag
www.science-computing.de
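
P.S. To make the expected behaviour concrete, here is a minimal sketch of how I understand fill-up allocation: hosts are visited in order, and each one is exhausted before the next is touched. This is an illustration only, not code from the scheduler. The function name fill_up, the host order and the free slot counts (one slot per core, job 17 holding one slot on kempes) are all assumptions; the qhost output above does not show slot counts.

```python
def fill_up(free_slots, wanted):
    """Grant `wanted` slots host by host, exhausting each host first.

    free_slots: list of (hostname, free slot count), already in the
    order in which the hosts are to be considered (assumed given).
    Returns a list of (hostname, slots granted); the first entry
    would be the master host.
    Raises ValueError if the hosts cannot hold the whole job.
    """
    granted = []
    remaining = wanted
    for host, free in free_slots:
        if remaining == 0:
            break
        take = min(free, remaining)
        if take > 0:
            granted.append((host, take))
            remaining -= take
    if remaining:
        raise ValueError("not enough free slots")
    return granted


# Hypothetical free slots at the time job 26 was scheduled.
free = [("kempes", 3), ("laudrup", 2), ("maradonna", 4)]
print(fill_up(free, 2))  # [('kempes', 2)]
```

With these assumed numbers the two-slot job should land entirely on kempes. What I observe instead corresponds to one slot on kempes and one on laudrup.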
