Ben,
When I looked through the code, the Bad UID message most likely comes from
"src/server/geteusernam.c" and is caused by either the user not existing,
or the user/host not being allowed by /etc/hosts.equiv (or some other
rexec/ruserok requirement). Also, you cannot submit as root (that is
specifically prohibited).
Anyway, look through that file, as it seems to be the most relevant one.
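
As a rough sketch of those three checks, to be run on the server: the function names below are made up for illustration, and the hosts.equiv test only looks for the host as the first field of a line (with or without a leading "+"). The real ruserok rules are more involved, so treat a pass here as a hint rather than proof.

```shell
#!/bin/sh
# Sketch only: approximates the three conditions above; it is not taken
# from the PBS source. All function names here are hypothetical.

# True if the account resolves on this host (passwd, NIS, LDAP, ...).
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

# True if the account has UID 0; root submissions are refused.
is_root_user() {
    [ "$(id -u "$1" 2>/dev/null)" = "0" ]
}

# True if the host appears as the first field of a hosts.equiv-style file,
# with or without a leading "+". The file defaults to /etc/hosts.equiv.
host_in_equiv() {
    equiv_file="${2:-/etc/hosts.equiv}"
    [ -r "$equiv_file" ] || return 1
    awk -v h="$1" '
        { name = $1; sub(/^[+]/, "", name) }
        name == h { found = 1 }
        END { exit !found }
    ' "$equiv_file"
}

# Print one line per failed check for a submitting user and host.
check_submitter() {
    user_exists "$1"   || echo "user $1 is unknown on this host"
    is_root_user "$1"  && echo "user $1 is root, which is refused"
    host_in_equiv "$2" || echo "host $2 is not listed in /etc/hosts.equiv"
    return 0
}
```

For example, after sourcing this on borg, "check_submitter bdsimmns viper.memphis.edu" prints nothing if all three checks pass, and one line for each check that fails.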
Frank
On Thu, 2004-03-04 at 05:01, Benjamin Simmons wrote:
> Ok, after a little more poking around, I looked through the source, and
> learned a few things, but I don't think any of that has made too much of
> a difference yet. I reviewed my server definition, and added ACL host
> information to the server definition. Now I can submit a job, and
> perform qstat, but I get a Bad UID message from qsub. I have synced the
> passwd, group, and shadow files across the machines. I have submitted
> the job as shown below.
>
> Any thoughts on this? I am sooo close now; thanks for everyone's help.
>
> Ben
>
> [EMAIL PROTECTED] bdsimmns]$ qsub -q [EMAIL PROTECTED] -u [EMAIL PROTECTED] surface.pbs
> qsub: Bad UID for job execution
> [EMAIL PROTECTED] bdsimmns]$
> [EMAIL PROTECTED] bdsimmns]$
> [EMAIL PROTECTED] bdsimmns]$ cat surface.pbs
> #PBS -N se_
> #PBS -o AxiCircleNoDowel.log
> #PBS -e AxiCircleNoDowel.pbs
> #PBS -q [EMAIL PROTECTED]
> cd /home/1/bdsimmns
> echo /home/1/bdsimmns/AxiCircleNoDowel.fe
> /home/1/bdsimmns/bin/evolver /home/1/bdsimmns/AxiCircleNoDowel.fe
> #All done
> [EMAIL PROTECTED] bdsimmns]$
>
> Frank Crawford wrote:
>
> >Ben,
> > I'd do this on some other workstation, rather than on the cluster
> >itself. Get the src.rpm, install it (rpm -ivh openpbs.src.rpm - check
> >the name). You may need to do this as root, if you haven't set up some
> >rpm macros. After this, do "rpmbuild -bp openpbs.spec" in the
> >directory. At that point you have the patched source, for you to grep
> >and search in other ways.
> >
> > A simple start on it is "find . -type f -exec grep PBSE_BADHOST {}
> >/dev/null \;" to find the error report and then go searching from there.
> >
> >Frank
> >
> >On Wed, 2004-03-03 at 13:06, Benjamin Simmons wrote:
> >
> >>How do I dig into the code like you suggest? I am ok to do this, just
> >>not sure how to get started.
> >>
> >>Ben
> >>
> >>Frank Crawford wrote:
> >>
> >>>Ben,
> >>> The last time I came across this sort of thing, I eventually had to
> >>>dive into the code and add extra information to find out what and why it
> >>>was complaining. There are about 16 places where the PBSE_BADHOST (i.e.
> >>>the error you are getting) can come out. A lot of those can probably be
> >>>dropped as they aren't the right function, but it could be any one of
> >>>the rest.
> >>>
> >>> I don't know if that is an option for you, but it might be something to
> >>>consider, even if it is just a code review you do.
> >>>
> >>>Frank
> >>>
> >>>On Wed, 2004-03-03 at 04:57, Benjamin Simmons wrote:
> >>>
> >>>>I added viper.memphis.edu and borg.memphis.edu to both the clienthosts
> >>>>and to the restricted parts and added the /home entries as well although
> >>>>both systems share a common home directory system.
> >>>>The same error message persists.
> >>>>Ben
> >>>>
> >>>>Frank Crawford wrote:
> >>>>
> >>>>>Ben,
> >>>>> Have a look in /var/spool/pbs/mom_priv at the file config. You may
> >>>>>need to add either a $restricted or a $clienthost line for them. See
> >>>>>the pbs_mom man page for more details.
> >>>>>
> >>>>>Frank
> >>>>>
> >>>>>On Tue, 2004-03-02 at 12:14, Benjamin Simmons wrote:
> >>>>>
> >>>>>>I added +viper.memphis.edu to the /etc/hosts.equiv on borg.memphis.edu.
> >>>>>>Viper is the machine that I want to do my job submission from, and
> >>>>>>borg.memphis.edu is the cluster server.
> >>>>>>
> >>>>>>This is the error message listed in the logs on borg.memphis.edu:
> >>>>>>
> >>>>>>03/01/2004 19:09:42;0100;PBS_Server;Req;;Type 49 request received from [EMAIL PROTECTED], sock=11
> >>>>>>03/01/2004 19:09:42;0080;PBS_Server;Req;req_reject;Reject reply code=15008, aux=0, type=49, from [EMAIL PROTECTED]
> >>>>>>
> >>>>>>This is the error message I received when I tried the qsub on viper:
> >>>>>>
> >>>>>>[EMAIL PROTECTED] bdsimmns]$ qsub -q [EMAIL PROTECTED] surface.pbs
> >>>>>>pbs_iff: error returned: 15008
> >>>>>>pbs_iff: Access from host not allowed, or unknown host
> >>>>>>No Permission.
> >>>>>>qsub: cannot connect to server borg.memphis.edu (errno=15007)
> >>>>>>[EMAIL PROTECTED] bdsimmns]$
> >>>>>>
> >>>>>>
> >>>>>>I can ssh to and from the two machines without a password, and they
> >>>>>>share a common home directory system. I can submit jobs to the queue
> >>>>>>workq while logged into borg.
> >>>>>>
> >>>>>>Any other thoughts as to things to look for?
> >>>>>>
> >>>>>>Ben
> >>>>>>
> >>>>>>Frank Crawford wrote:
> >>>>>>
> >>>>>>>Ben,
> >>>>>>> Did you add something in /etc/hosts.equiv on the other server
> >>>>>>>(borg.memphis.edu)? Anyway, I think that problems with hosts.equiv come
> >>>>>>>up with a 15023 (Bad User) error.
> >>>>>>>
> >>>>>>> What are you seeing in the logs on the other server, i.e.
> >>>>>>>borg.memphis.edu? You should either see something logged about the
> >>>>>>>connection and possibly a rejection, or you would have a deeper level
> >>>>>>>problem.
> >>>>>>>
> >>>>>>>Frank
> >>>>>>>
> >>>>>>>On Tue, 2004-03-02 at 10:41, Benjamin Simmons wrote:
> >>>>>>>
> >>>>>>>>ok, here is what I have set up right now:
> >>>>>>>>I edited /var/spool/pbs/server_name and set it to the cluster server
> >>>>>>>>I edited hosts.equiv and set it to +outside.computer.name
> >>>>>>>>This computer is connected through eth1 of the cluster server, so I
> >>>>>>>>edited the pfilter.conf to have eth1 all trusted, to ensure that would
> >>>>>>>>not play a role for now.
> >>>>>>>>
> >>>>>>>>I created the following script and executed it with qsub.
> >>>>>>>>
> >>>>>>>>The error message follows below, but is permissions based.
> >>>>>>>>Thanks,
> >>>>>>>>Ben
> >>>>>>>>
> >>>>>>>>[EMAIL PROTECTED] bdsimmns]$ vi surface.pbs
> >>>>>>>>
> >>>>>>>>#PBS -N se_
> >>>>>>>>#PBS -o AxiCircleNoDowel.log
> >>>>>>>>#PBS -e AxiCircleNoDowel.pbs
> >>>>>>>>cd /home/1/bdsimmns
> >>>>>>>>echo /home/1/bdsimmns/AxiCircleNoDowel.fe
> >>>>>>>>/home/1/bdsimmns/bin/evolver /home/1/bdsimmns/AxiCircleNoDowel.fe
> >>>>>>>>#All done
> >>>>>>>>~
> >>>>>>>>
> >>>>>>>>[EMAIL PROTECTED] bdsimmns]$ qsub -q [EMAIL PROTECTED] surface.pbs
> >>>>>>>>pbs_iff: error returned: 15008
> >>>>>>>>pbs_iff: Access from host not allowed, or unknown host
> >>>>>>>>No Permission.
> >>>>>>>>qsub: cannot connect to server borg.memphis.edu (errno=15007)
> >>>>>>>>[EMAIL PROTECTED] bdsimmns]$
> >>>>>>>>
> >>>>>>>>Jeremy Enos wrote:
> >>>>>>>>
> >>>>>>>>>The extra pbs server and routing queue shouldn't be necessary. Try
> >>>>>>>>>setting up a hosts.equiv on the pbs_server machine which includes the
> >>>>>>>>>machine you want to run qsub from. Also, on the machine you're
> >>>>>>>>>running qsub from, set the server_name file appropriately for the
> >>>>>>>>>remote server. qsub/qstat commands should work then. As I mentioned
> >>>>>>>>>before though, I'm not positive that the hosts.equiv is all that is
> >>>>>>>>>necessary, but I am sure that you don't need two pbs_servers.
> >>>>>>>>>
> >>>>>>>>> Jeremy
> >>>>>>>>>
> >>>>>>>>>At 03:03 PM 3/1/2004, Benjamin Simmons wrote:
> >>>>>>>>>
> >>>>>>>>>>I already went through that pain to get a username and password
> >>>>>>>>>>there. The way I read through the manuals is that I need a pbs_server
> >>>>>>>>>>running on the machine that I am making the submission from, and that
> >>>>>>>>>>I have a queue defined on this machine as well. This machine's queue
> >>>>>>>>>>is a routing queue, and I can define one or more destination
> >>>>>>>>>>queues. I for now only want it to go to the queue that is on my
> >>>>>>>>>>cluster server, but I will later need to direct it to other clusters
> >>>>>>>>>>around our campus.
> >>>>>>>>>>
> >>>>>>>>>>I can send you privately the pdf of the admin guide from the site you
> >>>>>>>>>>mentioned, or reference pages I think I am using correctly.
> >>>>>>>>>>
> >>>>>>>>>>Am I misunderstanding how this is supposed to work to go between
> >>>>>>>>>>different physical machines?
> >>>>>>>>>>
> >>>>>>>>>>Thanks,
> >>>>>>>>>>Ben
> >>>>>>>>>>
> >>>>>>>>>>Jeremy Enos wrote:
> >>>>>>>>>>
> >>>>>>>>>>>At 01:32 PM 3/1/2004, Benjamin Simmons wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>>Ok, I think I set everything up correctly, but when I try to
> >>>>>>>>>>>>submit a job I get an error that the job has been rejected by all
> >>>>>>>>>>>>possible destinations.
> >>>>>>>>>>>>
> >>>>>>>>>>>>Any thoughts on what I need to look at, or do I need to post the
> >>>>>>>>>>>>server and queue configs for the machine I am submitting on and the
> >>>>>>>>>>>>one that is receiving?
> >>>>>>>>>>>>
> >>>>>>>>>>>Not sure why you would have server/queue configs at all on the
> >>>>>>>>>>>machine you're submitting on... shouldn't have a server on there
> >>>>>>>>>>>at all, right? I know there are admin guides available at
> >>>>>>>>>>>http://www.openpbs.org that may also be of help. You will need to
> >>>>>>>>>>>register a user name though. (I had some trouble finding a link to
> >>>>>>>>>>>do this there though)
> >>>>>>>>>>>
> >>>>>>>>>>> Jeremy
> >>>>>>>>>>>
> >>>>>>>>>>>>Thanks for the help,
> >>>>>>>>>>>>Ben Simmons
> >>>>>>>>>>>>
> >>>>>>>>>>>>Jeremy Enos wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>>It's been a while since I looked into it, but I think a hosts.equiv
> >>>>>>>>>>>>>file is needed to allow submission from other hosts. I don't have
> >>>>>>>>>>>>>reliable detail past that at the moment.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Jeremy
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>At 07:57 PM 2/26/2004, Benjamin Simmons wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>>Has anyone tried to set up a PBS queue that routes to a different
> >>>>>>>>>>>>>>server outside their cluster? Or are most people just having
> >>>>>>>>>>>>>>users submit the job on the server that it should be run on?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>I am getting many different types of errors, but am looking to
> >>>>>>>>>>>>>>see if there is an experience base to draw on here, or if I need
> >>>>>>>>>>>>>>to look towards another list.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>Thanks to all in advance,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>Ben Simmons
> >>>>>>>>>>>>>>The University of Memphis
> >>>>>>>>>>>>>>http://viper.memphis.edu
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>-------------------------------------------------------
> >>>>>>>>>>>>>>SF.Net is sponsored by: Speed Start Your Linux Apps Now.
> >>>>>>>>>>>>>>Build and deploy apps & Web services for Linux with
> >>>>>>>>>>>>>>a free DVD software kit from IBM. Click Now!
> >>>>>>>>>>>>>>http://ads.osdn.com/?ad_id=1356&alloc_id=3438&op=click
> >>>>>>>>>>>>>>_______________________________________________
> >>>>>>>>>>>>>>Oscar-users mailing list
> >>>>>>>>>>>>>>[EMAIL PROTECTED]
> >>>>>>>>>>>>>>https://lists.sourceforge.net/lists/listinfo/oscar-users
--
ac3
Suite G16, Bay 7, Locomotive Workshop Phone: 02 9209 4600
Australian Technology Park Fax: 02 9209 4611
Eveleigh NSW 1430