On Thu, Jan 12, 2012 at 9:50 PM, Rayson Ho <ray...@scalablelogic.com> wrote:
> On Fri, Jan 13, 2012 at 12:02 AM, Simon Matthews
> <simon.d.matth...@gmail.com> wrote:
> > I have an installation of SGE 6.2U4 that I downloaded some years ago that I
> > have installed on a couple of qmaster hosts.
>
> Are you using the same version of SGE (SGE 6.2u4) on both the qmaster
> & the node? You can run "qconf -help | head -1" on both the master &
> the node to show the version.
>
> Most common GDI errors are due to mismatching versions of Grid Engine
> - and if you are running the same version of SGE, then let us know, I
> will dig the code to see what can possibly go wrong.
>
> And don't worry about still using the Sun binaries, I work with sites
> that have even older versions of Grid Engine. Sun contributed the code
> to open source, and without Sun we wouldn't have this community.
>

Just in case it is related -- occasionally, I see the following on the
console:

warning many ticks lost
your time source seems to be instable or some driver is hogging interrupts.
rip default_idle+0x20/0x23

Simon

>
> Rayson
>
>
> >
> > I hope that I do not offend the users of this list by asking for help using
> > a binary installation, using binaries built by Sun.
> >
> > I hope that someone can shed some light on the problem.
> >
> > I have built some new virtualized clients using KVM on a Centos 6 host. The
> > Centos 5 client seems to work properly, but the Centos 4 client does not. I
> > need a Centos 4 execd for testing purposes.
> >
> > I cannot install sge_execd, because of the qconf problems.
> >
> > qconf -sh results in:
> >
> > qconf -sh
> > ERROR: failed receiving gdi request response for mid=1 (got no message).
> >
> > I get this message if I try this client against the new cluster and a
> > cluster that has been running for several years. Other Centos 4 clients can
> > run "qconf -ch" against both clusters without problem.
> >
> > qping works from the problematic client:
> > qping -info sgemaster 6444 qmaster 1
> > 01/12/2012 20:59:38:
> > SIRM version: 0.1
> > SIRM message id: 1
> > start time: 01/12/2012 16:31:57 (1326414717)
> > run time [s]: 16052
> > messages in read buffer: 0
> > messages in write buffer: 0
> > nr. of connected clients: 2
> > status: 2
> > info: MAIN: E (16052.50) | signaler000: E (16052.05) |
> > event_master000: E (0.58) | timer000: E (1.58) | worker000: W (41.59) |
> > worker001: W (101.61) | listener000: W (5.58) | listener001: W (5.58) |
> > scheduler000: W (5.57) | ERROR
> > malloc: arena(0) |ordblks(1) | smblks(0) | hblksr(0) |
> > hblhkd(0) usmblks(0) | fsmblks(0) | uordblks(0) | fordblks(0) | keepcost(0)
> > Monitor: disabled
> >
> > Simon
> >
> >
> >
> > _______________________________________________
> > users mailing list
> > users@gridengine.org
> > https://gridengine.org/mailman/listinfo/users
> >
>
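
For reference, the version comparison suggested in the quoted reply ("qconf -help | head -1" on both the master and the node) could be scripted roughly as below. This is only a sketch: the versions_match helper and the execnode host name are made up for illustration, and the commented usage assumes passwordless ssh with qconf on the PATH of a non-interactive shell on both hosts. Case is folded so that "6.2U4" and "6.2u4" compare as equal.

```shell
#!/bin/sh
# Hypothetical helper: compare two version banners, ignoring letter case.
# Prints "match" or "mismatch".
versions_match() {
    a=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    b=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    if [ "$a" = "$b" ]; then
        echo match
    else
        echo mismatch
    fi
}

# Usage sketch ("sgemaster" is the host from the qping command above;
# "execnode" is a placeholder for the problematic client):
#   master=$(ssh sgemaster 'qconf -help | head -1')
#   node=$(ssh execnode 'qconf -help | head -1')
#   versions_match "$master" "$node"
```

A "mismatch" result would point at the version difference described above as the most common cause of GDI errors; a "match" would rule that out and leave other causes to investigate.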
_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users