From: [EMAIL PROTECTED] on behalf of Umberto Amato
Sent: Mon 02/01/2006 08:53
To: [email protected]
Subject: [Oscar-users] Torque fails creating work queue at step 7
Dear all,

I'm installing OSCAR 4.1 on a cluster built from dual 64-bit Opteron boards running Scientific Linux 4.1. I have hit a problem already discussed on this list: Torque fails to create the work queue at Step 7. In http://sourceforge.net/mailarchive/message.php?msg_id=11522706 the issue was closed for lack of further occurrences; here is one.

The relevant part of oscarinstall.log is:
Updating pbs_server nodes
/opt/pbs/bin/pbsnodes: Server has no node list
qmgr obj=lilligridfast1.na.iac.cnr.it svr=default: Unauthorized Request
create node lilligridfast1.na.iac.cnr.it np = 2 , properties = all
qmgr obj=lilligridfast2.na.iac.cnr.it svr=default: Unauthorized Request
create node lilligridfast2.na.iac.cnr.it np = 2 , properties = all
qmgr obj=lilligridfast3.na.iac.cnr.it svr=default: Unauthorized Request
create node lilligridfast3.na.iac.cnr.it np = 2 , properties = all
qmgr obj=lilligridfast4.na.iac.cnr.it svr=default: Unauthorized Request
create node lilligridfast4.na.iac.cnr.it np = 2 , properties = all
qmgr obj=lilligridfast5.na.iac.cnr.it svr=default: Unauthorized Request
create node lilligridfast5.na.iac.cnr.it np = 2 , properties = all
Shutting down TORQUE Server: [  OK  ]
Starting TORQUE Server: [  OK  ]
Creating torque workq queue...
Max open servers: 4
qmgr obj=workq svr=default: Unauthorized Request
create queue workq
Configuration of Torque queues failed at /opt/oscar/packages/torque/scripts/post_install line 315
Script /opt/oscar/packages/torque/scripts/post_install exitted badly with exit code '2' at ./post_install line 44
Couldn't run 'post_install' script for torque at ./post_install line 45
Some of the post install scripts failed, please check your logs for more info at ./post_install line 50
--> Step 7: Failed to properly complete the cluster install; please check the logs
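
To make the failure pattern in the log excerpt easier to see, here is a minimal Python sketch that lists every object qmgr refused with "Unauthorized Request". The function name rejected_objects and the regular expression are illustrative assumptions, not part of OSCAR or Torque; the pattern is only matched against the message format shown in this log.

```python
import re

# Matches qmgr rejections of the form seen in the log excerpt, e.g.
#   "qmgr obj=workq svr=default: Unauthorized Request"
# (pattern is an assumption based only on the lines quoted in this message)
REJECTED = re.compile(r"obj=(\S+)\s+svr=([^\s:]+):\s+Unauthorized Request")


def rejected_objects(log_text):
    """Return the names of objects qmgr refused, in order of appearance."""
    return [match.group(1) for match in REJECTED.finditer(log_text)]


if __name__ == "__main__":
    sample = (
        "qmgr obj=lilligridfast1.na.iac.cnr.it svr=default: Unauthorized Request\n"
        "create node lilligridfast1.na.iac.cnr.it np = 2 , properties = all\n"
        "qmgr obj=workq svr=default: Unauthorized Request\n"
        "create queue workq\n"
    )
    for name in rejected_objects(sample):
        print(name)
```

Run against the excerpt, this would show that every node creation and the workq creation are refused, which suggests the rejection applies to all qmgr requests rather than to one object.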

I also attach the /etc/hosts file, because the earlier mail exchange suggests it is the source of the problem:
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
192.168.1.100 lilligridfast100.na.iac.cnr.it lilligridfast100 oscar_server nfs_oscar pbs_oscar
140.164.12.100 lilligrid.na.iac.cnr.it lilligrid
# These entries are managed by SIS, please don't modify them.
192.168.1.1 lilligridfast1.na.iac.cnr.it lilligridfast1
192.168.1.2 lilligridfast2.na.iac.cnr.it lilligridfast2
192.168.1.3 lilligridfast3.na.iac.cnr.it lilligridfast3
192.168.1.4 lilligridfast4.na.iac.cnr.it lilligridfast4
192.168.1.5 lilligridfast5.na.iac.cnr.it lilligridfast5

Pinging any of the aliases of 192.168.1.100 (including pbs_oscar) succeeds from both the server and the nodes, while the corresponding host command fails.
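
A possible reading of this difference: ping resolves names through the system resolver, which normally consults /etc/hosts, whereas the host utility queries DNS servers directly, so a name that exists only in /etc/hosts would be expected to ping but fail under host. The Python sketch below is one way to check which names the hosts file defines and what the system resolver returns for them. The helper names parse_hosts and system_lookup are hypothetical, chosen for illustration only.

```python
import socket


def parse_hosts(text):
    """Map every hostname or alias in hosts-file text to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table.setdefault(name, ip)  # keep the first mapping seen for a name
    return table


def system_lookup(name):
    """Resolve via the system resolver (the path ping takes); None on failure."""
    try:
        return socket.gethostbyname(name)
    except OSError:
        return None


if __name__ == "__main__":
    sample = (
        "127.0.0.1 localhost.localdomain localhost\n"
        "192.168.1.100 lilligridfast100.na.iac.cnr.it lilligridfast100 "
        "oscar_server nfs_oscar pbs_oscar\n"
    )
    for name, ip in parse_hosts(sample).items():
        print(name, ip)
```

On the server, feeding the real /etc/hosts contents to parse_hosts and comparing each entry with system_lookup(name) would show whether the resolver agrees with the file; a disagreement there would be more significant than the host command failing.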

Any help will be greatly appreciated.

Umberto Amato
Istituto per le Applicazioni del Calcolo "Mauro Picone" CNR
Via Pietro Castellino 111
80131 Napoli
E-mail: [EMAIL PROTECTED]
_______________________________________________
Oscar-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/oscar-users
