On Thu, Jan 12, 2012 at 9:50 PM, Rayson Ho <ray...@scalablelogic.com> wrote:

> On Fri, Jan 13, 2012 at 12:02 AM, Simon Matthews
> <simon.d.matth...@gmail.com> wrote:
> > I have an installation of SGE 6.2U4 that I downloaded some years ago and
> > have installed on a couple of qmaster hosts.
>
> Are you using the same version of SGE (SGE 6.2u4) on both the qmaster
> & the node? You can run "qconf -help | head -1" on both the master &
> the node to show the version.
>
> Most common GDI errors are due to mismatching versions of Grid Engine.
> If you are running the same version of SGE on both, then let us know and I
> will dig into the code to see what can possibly go wrong.
>
> And don't worry about still using the Sun binaries, I work with sites
> that have even older versions of Grid Engine. Sun contributed the code
> to open source, and without Sun we wouldn't have this community.
>
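
The version check suggested above can be scripted. The following is only a minimal sketch: the function names normalize and same_sge_version are made up for illustration, the node hostname in the usage comment is a placeholder, and it assumes that the first line of "qconf -help" is a plain version banner that can be compared as text.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Grid Engine.

# Collapse runs of whitespace and strip leading/trailing blanks so that
# banners differing only in spacing still compare equal.
normalize() {
    printf '%s\n' "$1" | tr -s '[:space:]' ' ' | sed 's/^ //; s/ $//'
}

# same_sge_version BANNER_A BANNER_B
# Exit status 0 if both banners are identical after normalization.
same_sge_version() {
    [ "$(normalize "$1")" = "$(normalize "$2")" ]
}

# Usage sketch (assumes ssh access and that the SGE environment is set up
# in the remote shell; "centos4-node" is a placeholder hostname):
#   master=$(ssh sgemaster 'qconf -help | head -1')
#   node=$(ssh centos4-node 'qconf -help | head -1')
#   if same_sge_version "$master" "$node"; then
#       echo "versions match: $master"
#   else
#       echo "MISMATCH: master='$master' node='$node'"
#   fi
```

Comparing the banners as whole strings keeps the sketch independent of the exact banner format, which is not shown in this thread.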


Just in case it is related -- occasionally, I see the following on the
console:
 warning many ticks lost
your time source seems to be instable or some driver is hogging interrupts.
rip default_idle+0x20/0x23

Simon

>
> Rayson
>
>
>
>
> > I hope that I do not offend the users of this list by asking for help with
> > a binary installation that uses binaries built by Sun.
> >
> > I hope that someone can shed some light on the problem.
> >
> > I have built some new virtualized clients using KVM on a CentOS 6 host. The
> > CentOS 5 client seems to work properly, but the CentOS 4 client does not. I
> > need a CentOS 4 execd for testing purposes.
> >
> > I cannot install sge_execd, because of the qconf problems.
> >
> > qconf -sh results in:
> >
> > qconf -sh
> > ERROR: failed receiving gdi request response for mid=1 (got no message).
> >
> > I get this message if I try this client against both the new cluster and a
> > cluster that has been running for several years. Other CentOS 4 clients can
> > run "qconf -ch" against both clusters without problem.
> >
> > qping works from the problematic client:
> >  qping -info sgemaster 6444 qmaster 1
> > 01/12/2012 20:59:38:
> > SIRM version:             0.1
> > SIRM message id:          1
> > start time:               01/12/2012 16:31:57 (1326414717)
> > run time [s]:             16052
> > messages in read buffer:  0
> > messages in write buffer: 0
> > nr. of connected clients: 2
> > status:                   2
> > info:                     MAIN: E (16052.50) | signaler000: E (16052.05) |
> > event_master000: E (0.58) | timer000: E (1.58) | worker000: W (41.59) |
> > worker001: W (101.61) | listener000: W (5.58) | listener001: W (5.58) |
> > scheduler000: W (5.57) | ERROR
> > malloc:                   arena(0) |ordblks(1) | smblks(0) | hblksr(0) |
> > hblhkd(0) usmblks(0) | fsmblks(0) | uordblks(0) | fordblks(0) | keepcost(0)
> > Monitor:                  disabled
> >
> > Simon
> >
> >
> >
> > _______________________________________________
> > users mailing list
> > users@gridengine.org
> > https://gridengine.org/mailman/listinfo/users
> >
>