Title: [Oscar-users] Torque fails creating work queue at step 7
Hi Umberto:
 
Try putting the "pbs_oscar" alias on your _external_ interface instead of your internal interface and see if it works.  There are also a number of log files in /var/spool/pbs that you can look at.
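
To find which of those log files were written most recently, something like the following Python sketch could help. It is only an illustration: the function name newest_files and the limit of 10 are made up for this example, and it assumes nothing about how the files under /var/spool/pbs are organised beyond their being ordinary files.

```python
import os

def newest_files(root, limit=10):
    """Return up to `limit` file paths under `root`, most recently modified first."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                found.append((os.path.getmtime(path), path))
            except OSError:
                continue  # file vanished or is unreadable; skip it
    found.sort(reverse=True)  # largest mtime (newest) first
    return [path for _mtime, path in found[:limit]]

if __name__ == "__main__":
    # /var/spool/pbs is the directory mentioned above; adjust if yours differs.
    for path in newest_files("/var/spool/pbs"):
        print(path)
```

Run it as root right after a failed Step 7 so that the files touched by the failure come out on top.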
 
Cheers,
 
Bernard


From: [EMAIL PROTECTED] on behalf of Umberto Amato
Sent: Mon 02/01/2006 08:53
To: [email protected]
Subject: [Oscar-users] Torque fails creating work queue at step 7

Dear all,
I'm installing OSCAR 4.1 on a cluster of dual 64-bit Opteron boards
running Scientific Linux 4.1. I have a problem that has already been
discussed on the list: Torque fails to create the work queue at
Step 7. In http://sourceforge.net/mailarchive/message.php?msg_id=11522706
the issue was closed for lack of further occurrences, so here is one.

The relevant part of the oscarinstall.log is:

Updating pbs_server nodes
/opt/pbs/bin/pbsnodes: Server has no node list
qmgr obj=lilligridfast1.na.iac.cnr.it svr=default: Unauthorized Request
create node lilligridfast1.na.iac.cnr.it np = 2 , properties = all
qmgr obj=lilligridfast2.na.iac.cnr.it svr=default: Unauthorized Request
create node lilligridfast2.na.iac.cnr.it np = 2 , properties = all
qmgr obj=lilligridfast3.na.iac.cnr.it svr=default: Unauthorized Request
create node lilligridfast3.na.iac.cnr.it np = 2 , properties = all
qmgr obj=lilligridfast4.na.iac.cnr.it svr=default: Unauthorized Request
create node lilligridfast4.na.iac.cnr.it np = 2 , properties = all
qmgr obj=lilligridfast5.na.iac.cnr.it svr=default: Unauthorized Request
create node lilligridfast5.na.iac.cnr.it np = 2 , properties = all
Shutting down TORQUE Server: [ OK ]
Starting TORQUE Server: [ OK ]
Creating torque workq queue...
Max open servers: 4
qmgr obj=workq svr=default: Unauthorized Request
create queue workq
Configuration of Torque queues failed at /opt/oscar/packages/torque/scripts/post_install line 315
Script /opt/oscar/packages/torque/scripts/post_install exitted badly with exit code '2' at ./post_install line 44
Couldn't run 'post_install' script for torque at ./post_install line 45
Some of the post install scripts failed, please check your logs for more info at ./post_install line 50
--> Step 7: Failed to properly complete the cluster install; please check the logs
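
Every qmgr request in the excerpt above is rejected with the same message. A small Python sketch for pulling those rejections out of a longer oscarinstall.log follows. The regular expression is only inferred from the lines quoted here, not from any Torque documentation, and the name qmgr_failures is made up for the example.

```python
import re

# Pattern inferred from the log lines quoted above; it is an assumption,
# not a documented qmgr output format.
QMGR_ERROR = re.compile(r"^qmgr obj=(?P<obj>\S+) svr=(?P<svr>\S+): (?P<msg>.+)$")

def qmgr_failures(log_text):
    """Return (object, server, message) for each qmgr error line in log_text."""
    failures = []
    for line in log_text.splitlines():
        match = QMGR_ERROR.match(line.strip())
        if match:
            failures.append((match["obj"], match["svr"], match["msg"]))
    return failures

# A shortened copy of the excerpt above, used as sample input.
LOG = """\
Updating pbs_server nodes
/opt/pbs/bin/pbsnodes: Server has no node list
qmgr obj=lilligridfast1.na.iac.cnr.it svr=default: Unauthorized Request
create node lilligridfast1.na.iac.cnr.it np = 2 , properties = all
Creating torque workq queue...
qmgr obj=workq svr=default: Unauthorized Request
create queue workq
"""

for obj, svr, msg in qmgr_failures(LOG):
    print(f"{obj} (server {svr}): {msg}")
```

On the full log this lists the five nodes and the workq queue, all refused by the server named "default" for the same reason.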

I also include the /etc/hosts file, because that mail exchange points to
it as the source of the problem:

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
192.168.1.100 lilligridfast100.na.iac.cnr.it lilligridfast100 oscar_server nfs_oscar pbs_oscar
140.164.12.100 lilligrid.na.iac.cnr.it lilligrid
# These entries are managed by SIS, please don't modify them.
192.168.1.1 lilligridfast1.na.iac.cnr.it lilligridfast1
192.168.1.2 lilligridfast2.na.iac.cnr.it lilligridfast2
192.168.1.3 lilligridfast3.na.iac.cnr.it lilligridfast3
192.168.1.4 lilligridfast4.na.iac.cnr.it lilligridfast4
192.168.1.5 lilligridfast5.na.iac.cnr.it lilligridfast5

Pinging any of the aliases of 192.168.1.100 (including pbs_oscar)
succeeds from the server and from the nodes, while the corresponding host
command fails.
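
Note that host queries DNS directly and does not read /etc/hosts, while ping typically goes through the system resolver, which does. So this difference is expected for names that exist only in /etc/hosts. To check which address an alias is bound to in the file itself, here is a minimal Python sketch. The name parse_hosts is made up for the example, the sample text is copied from the file above, and keeping the first entry for a repeated name is a simplifying assumption.

```python
def parse_hosts(text):
    """Map every hostname and alias in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Assumption for this sketch: keep the first entry seen for a name.
            mapping.setdefault(name, ip)
    return mapping

# Sample taken from the /etc/hosts shown above (first three entries only).
SAMPLE = """\
# Do not remove the following line, or various programs
127.0.0.1 localhost.localdomain localhost
192.168.1.100 lilligridfast100.na.iac.cnr.it lilligridfast100 oscar_server nfs_oscar pbs_oscar
140.164.12.100 lilligrid.na.iac.cnr.it lilligrid
"""

hosts = parse_hosts(SAMPLE)
print(hosts["pbs_oscar"])   # the internal address in this file
print(hosts["lilligrid"])   # the external address in this file
```

To run it against the real file, pass open("/etc/hosts").read() instead of SAMPLE.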

Any help will be greatly appreciated

Umberto Amato
Istituto per le Applicazioni del Calcolo "Mauro Picone" CNR
Via Pietro Castellino 111
80131 Napoli

E-mail: [EMAIL PROTECTED]

