Does it hang when you issue the qconf command on that node, or does it
return the error message immediately?
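
One rough way to tell the two cases apart is to time the command. The sketch below is only an illustration: `elapsed_run` is a hypothetical helper name, not part of Grid Engine, and it assumes a `date` that supports `+%s`.

```shell
#!/bin/sh
# Hypothetical helper (not part of Grid Engine): run a command and report
# its exit status and how many whole seconds it took.
elapsed_run() {
    start=$(date +%s)
    status=0
    "$@" || status=$?
    end=$(date +%s)
    echo "exit status: $status, elapsed: $((end - start))s"
    return "$status"
}

# On the problematic node one might run:
#   elapsed_run qconf -sh
```

An elapsed time of about zero seconds would suggest the error is returned immediately; a long delay before the error would suggest the client is waiting on something and then giving up.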

Rayson


On Fri, Jan 13, 2012 at 1:00 AM, Simon Matthews
<simon.d.matth...@gmail.com> wrote:
> I am running the same version. I have one installation tree that is NFS
> mounted. All clients use the same binaries.
>
> I had wanted to move to 6.2U5, but I can't find a source to download it.
>
> Simon
>
>
> On Thu, Jan 12, 2012 at 9:50 PM, Rayson Ho <ray...@scalablelogic.com> wrote:
>>
>> On Fri, Jan 13, 2012 at 12:02 AM, Simon Matthews
>> <simon.d.matth...@gmail.com> wrote:
>> > I have an installation of SGE 6.2U4 that I downloaded some years ago
>> > and have installed on a couple of qmaster hosts.
>>
>> Are you using the same version of SGE (SGE 6.2u4) on both the qmaster
>> & the node? You can run "qconf -help | head -1" on both the master &
>> the node to show the version.
>>
>> Most common GDI errors are due to mismatching versions of Grid Engine.
>> If you are running the same version of SGE, then let us know, and I
>> will dig into the code to see what can possibly go wrong.
>>
>> And don't worry about still using the Sun binaries, I work with sites
>> that have even older versions of Grid Engine. Sun contributed the code
>> to open source, and without Sun we wouldn't have this community.
>>
>> Rayson
>>
>>
>>
>>
>> > I hope that I do not offend the users of this list by asking for help
>> > with a binary installation that uses binaries built by Sun.
>> >
>> > I hope that someone can shed some light on the problem.
>> >
>> > I have built some new virtualized clients using KVM on a CentOS 6 host.
>> > The CentOS 5 client seems to work properly, but the CentOS 4 client does
>> > not. I need a CentOS 4 execd for testing purposes.
>> >
>> > I cannot install sge_execd, because of the qconf problems.
>> >
>> > qconf -sh results in:
>> >
>> > qconf -sh
>> > ERROR: failed receiving gdi request response for mid=1 (got no message).
>> >
>> > I get this message if I try this client against the new cluster and a
>> > cluster that has been running for several years. Other Centos 4 clients
>> > can run "qconf -ch" against both clusters without problem.
>> >
>> > qping works from the problematic client:
>> >  qping -info sgemaster 6444 qmaster 1
>> > 01/12/2012 20:59:38:
>> > SIRM version:             0.1
>> > SIRM message id:          1
>> > start time:               01/12/2012 16:31:57 (1326414717)
>> > run time [s]:             16052
>> > messages in read buffer:  0
>> > messages in write buffer: 0
>> > nr. of connected clients: 2
>> > status:                   2
>> > info:                     MAIN: E (16052.50) | signaler000: E (16052.05) | event_master000: E (0.58) | timer000: E (1.58) | worker000: W (41.59) | worker001: W (101.61) | listener000: W (5.58) | listener001: W (5.58) | scheduler000: W (5.57) | ERROR
>> > malloc:                   arena(0) |ordblks(1) | smblks(0) | hblksr(0) | hblhkd(0) usmblks(0) | fsmblks(0) | uordblks(0) | fordblks(0) | keepcost(0)
>> > Monitor:                  disabled
>> >
>> > Simon
>> >
>> >
>> >
>> > _______________________________________________
>> > users mailing list
>> > users@gridengine.org
>> > https://gridengine.org/mailman/listinfo/users
>> >
>
>
