Re: [gridengine users] What is the easiest/best way to update our servers' domain name?

2019-10-29 Thread Mun Johl
Hi Hugh,

Thank you for your reply.
See my comments below.

What’s the output of ‘qconf -sq long.q’? Are you sure it doesn’t still 
reference the old hostname, maybe within a hostgroup?

[Mun] I did check the queue after updating the queues and host groups and 
‘qconf -sq long.q’ looks good.  It has the new domain name listed.  I’ve 
checked the host groups via ‘qconf -shgrp @name’ and each of them also lists 
the new domain name.

A new finding is that if I try to *add* a host with the new FQDN, SGE says it 
already exists, even though ‘qconf -sh’ doesn’t show the host with the new 
domain name--it still shows the old domain name for that particular host.
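
One way to hunt for a lingering copy of the old name is to scan each host list that qmaster exposes. This is only a minimal sketch: old.example.com is a placeholder domain (not taken from this thread) and stale_hosts is a made-up helper name.

```shell
#!/bin/sh
# Placeholder domain; substitute the real old domain.
OLD_DOMAIN="old.example.com"

# Hypothetical helper: print stdin lines that contain ".<domain>".
stale_hosts() {
    grep -F ".$1" || true   # "|| true": finding nothing is not an error here
}

# Only query the cluster if qconf is on the PATH.
if command -v qconf >/dev/null 2>&1; then
    for opt in -sh -ss -sel; do          # admin, submit, and execution hosts
        echo "== qconf $opt =="
        qconf "$opt" | stale_hosts "$OLD_DOMAIN"
    done
fi
```

Any name printed under one of the headings is a host entry that still carries the old domain.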

Regards,

--
Mun


-Hugh

From: users-boun...@gridengine.org On Behalf Of Mun Johl
Sent: Tuesday, October 29, 2019 1:38 PM
To: dpo...@gmail.com
Cc: users@gridengine.org
Subject: Re: [gridengine users] What is the easiest/best way to update our 
servers' domain name?

Hi folks,

SGE is broken for me.  IT went through and updated the domain name for our 
hosts; but now I can’t seem to get SGE to update anything.  If I use 
“qconf -me <hostname>” and update the domain name, SGE puts it right back to 
the old domain name.

I was able to get the queues and Host Groups updated with the new domain name, 
but that’s it.  If I try to delete a host, I get the following error:

Host object "knsim8" is still referenced in cluster queue "long.q".

However, like I said, I _was_ able to get the queues updated to the new domain 
name so I don’t know why I would be getting the aforementioned error; and I did 
stop/start the master daemon after updating the queues just in case it 
was necessary.
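
An error like this can be narrowed down by searching every cluster queue and host group for the host name. The sketch below is an illustration only: grep_refs is a made-up helper, and it simply greps the output of the standard qconf show commands. One thing worth checking in the matches is whether the name survives in a per-host override (the bracketed [host=value] form) rather than in the hostlist itself.

```shell
#!/bin/sh
# Hypothetical helper: keep stdin lines mentioning host $1, prefixed with label $2.
grep_refs() {
    grep -F -- "$1" | sed "s|^|$2: |"
}

# Only query the cluster if qconf is on the PATH.
if command -v qconf >/dev/null 2>&1; then
    host="knsim8"                                   # host name from the error message
    for q in $(qconf -sql); do                      # every cluster queue
        qconf -sq "$q" | grep_refs "$host" "queue $q"
    done
    for g in $(qconf -shgrpl); do                   # every host group
        qconf -shgrp "$g" | grep_refs "$host" "hostgroup $g"
    done
fi
```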

I’m at a loss as to what to try next in order to salvage our SGE installation.  
Any suggestions would be welcomed.

Regards,

--
Mun


From: Mun Johl <mun.j...@wdc.com>
Sent: Monday, October 28, 2019 5:17 PM
To: dpo...@gmail.com
Cc: Skylar Thompson <skyl...@uw.edu>; users@gridengine.org
Subject: RE: [gridengine users] What is the easiest/best way to update our 
servers' domain name?

Hi Daniel,

Thank you for your feedback.  I am kind of thinking of staying with the FQDN at 
this point since that technique has been working well for us.

Regards,

--
Mun


From: Daniel Povey <dpo...@gmail.com>
Sent: Monday, October 28, 2019 3:24 PM
To: Mun Johl <mun.j...@wdc.com>
Cc: Skylar Thompson <skyl...@uw.edu>; users@gridengine.org
Subject: Re: [gridengine users] What is the easiest/best way to update our 
servers' domain name?

I always use the FQDN.  I recall running into problems with SunRPC if not... 
there may be ways to get around that, e.g. have each host announce its raw 
hostname as its FQDN, but it might not be compatible with the hosts having 
normal network access.
I forget what specific mechanism SunRPC uses to find the hostname.

On Mon, Oct 28, 2019 at 2:18 PM Mun Johl <mun.j...@wdc.com> wrote:
Hi all,

I do have a follow-up question: When I am specifying hostnames for the 
execution hosts, admin hosts, etc., do I need to use the FQDN?  Or can I simply 
use the hostname in order for grid to operate correctly?  That is, do I have to 
use hostname.domain.com (as I am currently doing), or is it sufficient to 
simply use “hostname”?
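
To see which convention a cluster currently uses, the execution host list can be checked for names without a domain part. A small sketch, where is_fqdn is a made-up helper that only tests for a dot in the name (it does not resolve anything):

```shell
#!/bin/sh
# Hypothetical helper: succeed if the name contains at least one dot.
is_fqdn() {
    case "$1" in
        *.*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Only query the cluster if qconf is on the PATH.
if command -v qconf >/dev/null 2>&1; then
    for h in $(qconf -sel); do
        is_fqdn "$h" || echo "short name in exec host list: $h"
    done
fi
```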

Regards,

--
Mun


From: Mun Johl <mun.j...@wdc.com>
Sent: Friday, October 25, 2019 5:42 PM
To: dpo...@gmail.com
Cc: Skylar Thompson <skyl...@uw.edu>; users@gridengine.org
Subject: RE: [gridengine users] What is the easiest/best way to update our 
servers' domain name?

Hi Daniel,

Thank you for your reply.

From: Daniel Povey <dpo...@gmail.com>
You may have to write a script to do that, but it could be something like

# For each execution host: dump its config, rewrite the domain, then re-add it.
for exechost in $(qconf -sel); do
   qconf -se "$exechost" | sed 's/old_domain_name/new_domain_name/' > tmp
   qconf -de "$exechost"
   qconf -Ae tmp
done

but you might need to tweak that to get it to work, e.g. get rid of load_values 
from the tmp file.
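
The tweak mentioned above could look roughly like the filter below. It is a sketch under assumptions, not a tested procedure: clean_exechost is a made-up helper, the two domains are passed in by the caller, and it drops load_values and processors (the attributes commonly reported as rejected when the file is fed back with -Ae), including any backslash-continued lines.

```shell
#!/bin/sh
# Hypothetical helper: filter `qconf -se` output on stdin.
#   $1 = old domain, $2 = new domain (both supplied by the caller)
clean_exechost() {
    awk -v old="$1" -v new="$2" '
        # Still inside a dropped attribute that continues with a trailing backslash.
        skip { skip = ($0 ~ /\\$/); next }
        # Drop the reported-only attributes, remembering whether they continue.
        $1 == "load_values" || $1 == "processors" { skip = ($0 ~ /\\$/); next }
        # Swap the domain suffix on the hostname line only (literal match, no regex).
        $1 == "hostname" {
            n = length($2) - length(old)
            if (n > 0 && substr($2, n + 1) == old) $2 = substr($2, 1, n) new
        }
        { print }
    '
}
```

Inside the loop it would replace the sed call, for example: qconf -se "$exechost" | clean_exechost old_domain_name new_domain_name > tmp. Inspecting tmp before running the delete step is a sensible precaution.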

[Mun] Understood.  Since we have a fairly small set of servers currently, I may 
just update them by hand via “qconf -me <hostname>”; and then address the 
queues via “qconf -mq <queue_name>”.  Oh, and I just noticed I can modify 
hostgroups via “qconf -mhgrp @name”.
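
For the host groups, a non-interactive variant of the same idea is to dump each definition, rewrite the domain, and load the file back with the from-file option (-Mhgrp; queues have the analogous -Mq). The sketch below only generates the edited files for review and does not apply them. rewrite_domain, OLD_DOMAIN, NEW_DOMAIN, and the hgrp_*.txt file names are all made up for illustration, and the helper assumes the domains contain only letters, digits, dots, and hyphens. It does not touch the host objects themselves.

```shell
#!/bin/sh
# Hypothetical helper: replace the literal domain $1 with $2 on stdin.
rewrite_domain() {
    old_re=$(printf '%s' "$1" | sed 's/\./\\./g')   # make the dots literal
    sed "s/${old_re}/$2/g"
}

# Runs only if qconf exists and both (hypothetical) variables are set.
if command -v qconf >/dev/null 2>&1 && [ -n "${OLD_DOMAIN:-}" ] && [ -n "${NEW_DOMAIN:-}" ]; then
    for g in $(qconf -shgrpl); do
        qconf -shgrp "$g" | rewrite_domain "$OLD_DOMAIN" "$NEW_DOMAIN" > "hgrp_$g.txt"
        # After reviewing the file, apply it with: qconf -Mhgrp "hgrp_$g.txt"
    done
fi
```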

After that I can re-start the daemons and I “should” be good to go, right?

Thanks again Daniel.

Best regards,

--
Mun


On Fri, Oct 25, 2019 at 5:24 PM Mun Johl <mun.j...@wdc.com> wrote:
Hi Daniel and Skylar,

Thank you for your replies.

> -Original Message-
> I think it might depend on the setting of ignor

Re: [gridengine users] What is the easiest/best way to update our servers' domain name?

2019-10-29 Thread MacMullan IV, Hugh
What’s the output of ‘qconf -sq long.q’? Are you sure it doesn’t still 
reference the old hostname, maybe within a hostgroup?

-Hugh


Re: [gridengine users] What is the easiest/best way to update our servers' domain name?

2019-10-29 Thread Mun Johl
Hi folks,

SGE is broken for me.  IT went through and updated the domain name for our 
hosts; but now I can’t seem to get SGE to update anything.  If I use 
“qconf -me <hostname>” and update the domain name, SGE puts it right back to 
the old domain name.

I was able to get the queues and Host Groups updated with the new domain name, 
but that’s it.  If I try to delete a host, I get the following error:

Host object "knsim8" is still referenced in cluster queue "long.q".

However, like I said, I _was_ able to get the queues updated to the new domain 
name so I don’t know why I would be getting the aforementioned error; and I did 
stop/start the master daemon after updating the queues just in case it 
was necessary.

I’m at a loss as to what to try next in order to salvage our SGE installation.  
Any suggestions would be welcomed.

Regards,

--
Mun


On Fri, Oct 25, 2019 at 5:24 PM Mun Johl <mun.j...@wdc.com> wrote:
Hi Daniel and Skylar,

Thank you for your replies.

> -Original Message-
> I think it might depend on the setting of ignore_fqdn in the bootstrap file
> (can't remember if this just tunes load reporting or also things like which
> qmaster the execd's talk to). I wouldn't count on it working, though, and
> agree with Daniel that you probably want to plan on an outage.
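
The current ignore_fqdn value can be read straight from the bootstrap file, which lives under $SGE_ROOT/$SGE_CELL/common. A minimal sketch, assuming the usual "key value" layout of that file; get_ignore_fqdn is a made-up helper name and the cell name falls back to "default" when SGE_CELL is unset:

```shell
#!/bin/sh
# Hypothetical helper: print the value of ignore_fqdn from the given bootstrap file.
get_ignore_fqdn() {
    awk '$1 == "ignore_fqdn" { print $2 }' "$1"
}

# Report the setting only if the bootstrap file is actually readable.
bootstrap="${SGE_ROOT:-}/${SGE_CELL:-default}/common/bootstrap"
if [ -r "$bootstrap" ]; then
    echo "ignore_fqdn = $(get_ignore_fqdn "$bootstrap")"
fi
```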

[Mun] An outage is acceptable; but I'm not sure what is the best/easiest 
approach to take in order to change the domain names within SGE for all of the 
servers as well as update the hostgroups and queues.  I mean, I know I can 
delete the hosts and add them back in; and the same for the queue 
specifications, etc.  However, I'm not sure if that is an adequate solution or 
one that will cause problems for me.  I'm also not sure if that is the best 
approach to take for this task.

Thanks,

--
Mun


>
> On Fri, Oct 25, 2019 at 04:12:11PM -0700, Daniel Povey wrote:
> > IIRC, GridEngine is very picky about machines having a consistent
> > hostname, e.g. that what hostname they think they have matches with
> > how they were addressed.  I think this is because of SunRPC.  I think
> > it may be hard to do what you want without an interruption of some kind.
> > But I may be wrong.
> >
> > On Fri, Oct 25, 2