Hi folks,

SGE is broken for me. IT went through and updated the domain name for our hosts, but now I can’t seem to get SGE to update anything. If I use “qconf -me <hostname>” and update the domain name, SGE puts it right back to the old domain name.

I was able to get the queues and host groups updated with the new domain name, but that’s it. If I try to delete a host, I get the following error:

Host object "knsim8" is still referenced in cluster queue "long.q".

However, as I said, I _was_ able to update the queues to the new domain name, so I don’t know why I am getting this error. I also stopped and restarted the master daemon after updating the queues, in case that was necessary.

I’m at a loss as to what to try next in order to salvage our SGE installation. Any suggestions would be welcome.

Regards,

--
Mun

From: Mun Johl <mun.j...@wdc.com>
Sent: Monday, October 28, 2019 5:17 PM
To: dpo...@gmail.com
Cc: Skylar Thompson <skyl...@uw.edu>; users@gridengine.org
Subject: RE: [gridengine users] What is the easiest/best way to update our servers' domain name?

Hi Daniel,

Thank you for your feedback. I am thinking of staying with the FQDN at this point, since that technique has been working well for us.

Regards,

--
Mun

From: Daniel Povey <dpo...@gmail.com>
Sent: Monday, October 28, 2019 3:24 PM
To: Mun Johl <mun.j...@wdc.com>
Cc: Skylar Thompson <skyl...@uw.edu>; users@gridengine.org
Subject: Re: [gridengine users] What is the easiest/best way to update our servers' domain name?

I always use the FQDN. I recall running into problems with SunRPC if not... there may be ways to get around that, e.g. have each host announce its raw hostname as its FQDN, but it might not be compatible with the hosts having normal network access. I forget what specific mechanism SunRPC uses to find the hostname.
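
One way to narrow down the "still referenced in cluster queue" error reported at the top of this thread is to search the queue and host group configurations for the host name. The following is only a rough sketch, not a confirmed fix: `find_refs` is a made-up helper name, `knsim8` is just the default taken from the error message, and the sketch assumes `qconf -sql`, `qconf -sq`, `qconf -shgrpl` and `qconf -shgrp` behave on your installation as described in the qconf man page.

```shell
#!/bin/sh
# Sketch: report every queue or host group whose configuration still
# mentions a given host. find_refs is a hypothetical helper, not part of SGE.

# Read a configuration dump on stdin and print each line that contains
# the host name, prefixed with a label saying where it came from.
find_refs() {
    label=$1
    host=$2
    # index() does a literal substring match, so dots in the host name
    # are not treated as regex wildcards.
    awk -v label="$label" -v host="$host" \
        'index($0, host) { print label ": " $0 }'
}

# Only query the cluster when qconf is actually available.
if command -v qconf >/dev/null 2>&1; then
    host=${1:-knsim8}    # assumption: short name from the error message
    for q in $(qconf -sql); do
        qconf -sq "$q" | find_refs "queue $q" "$host"
    done
    for g in $(qconf -shgrpl); do
        qconf -shgrp "$g" | find_refs "hostgroup $g" "$host"
    done
fi
```

Any line it prints shows a place where the old name is still configured; an empty result would suggest the reference lives somewhere this sketch does not look.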

On Mon, Oct 28, 2019 at 2:18 PM Mun Johl <mun.j...@wdc.com> wrote:

Hi all,

I do have a follow-up question: When I am specifying hostnames for the execution hosts, admin hosts, etc.; do I need to use the FQDN? Or can I simply use the hostname in order for grid to operate correctly? That is, do I have to use hostname.domain.com (as I am currently doing)? Or is it sufficient to simply use “hostname”?

Regards,

--
Mun

From: Mun Johl <mun.j...@wdc.com>
Sent: Friday, October 25, 2019 5:42 PM
To: dpo...@gmail.com
Cc: Skylar Thompson <skyl...@uw.edu>; users@gridengine.org
Subject: RE: [gridengine users] What is the easiest/best way to update our servers' domain name?

Hi Daniel,

Thank you for your reply.

From: Daniel Povey <dpo...@gmail.com>

You may have to write a script to do that, but it could be something like

    for exechost in $(qconf -sel); do
      qconf -se $exechost | sed s/old_domain_name/new_domain_name/ > tmp
      qconf -de $exechost
      qconf -Ae tmp
    done

but you might need to tweak that to get it to work, e.g. get rid of load_values from the tmp file.

[Mun] Understood. Since we have a fairly small set of servers currently, I may just update them by hand via “qconf -me <hostname>”; and then address the queues via “qconf -mq <qname>”. Oh, and I just noticed I can modify hostgroups via “qconf -mhgrp @name”. After that I can re-start the daemons and I “should” be good to go, right?

Thanks again Daniel.

Best regards,

--
Mun

On Fri, Oct 25, 2019 at 5:24 PM Mun Johl <mun.j...@wdc.com> wrote:

Hi Daniel and Skylar,

Thank you for your replies.

> -----Original Message-----
> I think it might depend on the setting of ignore_fqdn in the bootstrap file
> (can't remember if this just tunes load reporting or also things like which
> qmaster the execd's talk to).
> I wouldn't count on it working, though, and
> agree with Daniel that you probably want to plan on an outage.

[Mun] An outage is acceptable; but I'm not sure what is the best/easiest approach to take in order to change the domain names within SGE for all of the servers as well as update the hostgroups and queues. I mean, I know I can delete the hosts and add them back in; and the same for the queue specifications, etc. However, I'm not sure if that is an adequate solution or one that will cause problems for me. I'm also not sure if that is the best approach to take for this task.

Thanks,

--
Mun

>
> On Fri, Oct 25, 2019 at 04:12:11PM -0700, Daniel Povey wrote:
> > IIRC, GridEngine is very picky about machines having a consistent
> > hostname, e.g. that the hostname they think they have matches
> > how they were addressed. I think this is because of SunRPC. I think
> > it may be hard to do what you want without an interruption of some kind.
> > But I may be wrong.
> >
> > On Fri, Oct 25, 2019 at 3:37 PM Mun Johl
> > <mun.j...@wdc.com> wrote:
> >
> > > Hi,
> > >
> > > I need to update the domain names of our SGE servers. What is the
> > > easiest way to do that? Can I simply update the domain name somehow
> > > and have that propagate to hostgroups, queue specifications, etc.?
> > >
> > > Or do I have to delete the current hosts and add the new ones?
> > > Which I think also implies setting up the hostgroups and queues
> > > again as well for our implementation.
> > >
> > > Best regards,
> > >
> > > --
> > > Mun
> > > _______________________________________________
> > > users mailing list
> > > users@gridengine.org
> > > https://gridengine.org/mailman/listinfo/users
>
> --
> -- Skylar Thompson (skyl...@u.washington.edu)
> -- Genome Sciences Department, System Administrator
> -- Foege Building S046, (206)-685-7354
> -- University of Washington School of Medicine
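
Daniel's loop quoted earlier in the thread, together with his note about dropping load_values from the temporary file, could be sketched roughly as below. This is an untested illustration under stated assumptions, not a verified procedure: `rewrite_exechost` is a made-up helper name, and the sketch assumes that `qconf -se` wraps long fields using a trailing backslash and that `load_values` is the only field that has to go (other fields may need the same treatment on your version).

```shell
#!/bin/sh
# Sketch: turn `qconf -se <host>` output into a file suitable for
# `qconf -Ae`, with the old domain replaced by the new one.
# rewrite_exechost is a hypothetical helper, not part of SGE.

# usage: qconf -se HOST | rewrite_exechost OLD_DOMAIN NEW_DOMAIN
rewrite_exechost() {
    old=$1
    new=$2
    # Refuse an empty search string; it cannot be matched meaningfully.
    [ -n "$old" ] || return 1
    awk -v old="$old" -v new="$new" '
        # Still inside a dropped field: keep skipping while lines end in "\".
        skipping { skipping = ($0 ~ /\\$/); next }
        # Drop load_values (assumption: the only field to remove).
        /^load_values[ \t]/ { skipping = ($0 ~ /\\$/); next }
        {
            # Literal, non-regex replacement of every occurrence of old.
            out = ""
            line = $0
            while ((i = index(line, old)) > 0) {
                out = out substr(line, 1, i - 1) new
                line = substr(line, i + length(old))
            }
            print out line
        }
    '
}

# Possible use, following the loop in the thread (left commented out on
# purpose, since it deletes and re-adds hosts):
#
#   tmp=$(mktemp)
#   for exechost in $(qconf -sel); do
#       qconf -se "$exechost" | rewrite_exechost old_domain_name new_domain_name > "$tmp"
#       qconf -de "$exechost" && qconf -Ae "$tmp"
#   done
#   rm -f "$tmp"
```

As the top of the thread shows, `qconf -de` refuses to delete a host that a cluster queue still references, so the queue and host group entries would have to be dealt with before a loop like this can run.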