Re: [gt-user] line 37: [: too many arguments

Daniel Andrzejewski Wed, 21 Nov 2007 10:26:18 -0800

Hi Charles,

Thank you for your advice. I have modified the pbs.pm file, which fixed the problem.


However, I encountered another problem:
in a job description file I specify how many nodes I want to request, e.g. 8:
<count>8</count>
but I only get half of that number.

All the compute nodes have 2 CPUs.

Right now (as I mentioned before), the variable $cpu_per_node in the pbs.pm file is set to 1. If I change it to 2, then I get 1/4 of the number of requested nodes (e.g. 8/4 = 2). For now I have a working solution, but it is not perfect, because I can never get more than half of the total number of nodes in the cluster.


Regards,

--
Daniel

Charles Bacon wrote:

On Nov 19, 2007, at 4:05 PM, Daniel Andrzejewski wrote:
I'd like to add that the name of the compute node is node10.local, not node10.local:1 as you can see in the error message.
So, possibly the PBS nodefile is coming out in a different format than expected, thus causing trouble in the following loop:
hosts=\`cat \$PBS_NODEFILE\`;
counter=0
while test \$counter -lt $count; do
    for host in \$hosts; do
        if test \$counter -lt $count; then
            $remote_shell \$host "/bin/sh $cmd_script_name" < $stdin &
            counter=\`expr \$counter + 1\`
        else
            break
        fi
    done
done


That winds up in the submit file to PBS.  You can add a line like:
system("cp $pbs_job_script_name /tmp/ws.gram.job");

right before the line reading:
chomp($job_id = `$qsub < $pbs_job_script_name $errfile`);

Then you can edit the /tmp/ws.gram.job file to see what fix is required.
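As a rough sketch of that inspection step: the two helper functions below are hypothetical (they are not part of pbs.pm or Globus). The first prints a numbered range of lines from a file, so that "line 37" from the error message can be located in the saved copy. The second strips a trailing ":N" from a host entry, which would only matter if the nodefile entries really do carry such a suffix; that is a guess based on the ssh error, not something confirmed in this thread.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting the saved job script and the nodefile.

# Print lines START..END of FILE, each prefixed with its line number.
show_lines() {
    file=$1
    start=$2
    end=$3
    awk -v s="$start" -v e="$end" \
        'NR >= s && NR <= e { printf "%d: %s\n", NR, $0 }' "$file"
}

# Remove everything from the first ":" onward in a host entry
# (assumption: an entry might look like "node10.local:1").
strip_suffix() {
    printf '%s\n' "${1%%:*}"
}
```

For example, `show_lines /tmp/ws.gram.job 30 40` would show the neighborhood of line 37, and `show_lines "$PBS_NODEFILE" 1 5` run inside a job would show what the host entries actually look like.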


Charles
The following is a piece of the ${GLOBUS_LOCATION}/lib/perl/Globus/GRAM/JobManager/pbs.pm file:
----------------------
my ($mpirun, $mpiexec, $qsub, $qstat, $qdel, $cluster, $cpu_per_node,$remote_shell);
BEGIN
{
    $mpiexec        = 'no';
    $mpirun         = '/usr/local/bin/mpirun';
    $qsub           = '/usr/local/bin/qsub';
    $qstat          = '/usr/local/bin/qstat';
    $qdel           = '/usr/local/bin/qdel';
    $cluster        = 1;
    $cpu_per_node   = 1;
    $remote_shell   = '/usr/local/bin/ssh';
    $softenv_dir    = '';
    $soft_msc       = "$softenv_dir/bin/soft-msc";
    $softenv_load   = "$softenv_dir/etc/softenv-load.sh";

}
----------------------
If I change $cluster to 0 I don't get any errors, but I don't get as many resources as I request (in a job description file, e.g. <count>10</count>).
Thank you,

--Daniel

Charles Bacon wrote:
Your client sends its hostname to the container. Are you submitting from a machine named node10.local? If so, you should set GLOBUS_HOSTNAME to the publicly visible name of your machine instead. If myhost.com is really node10.local, then you should set GLOBUS_HOSTNAME in its environment to its publicly visible name.
Charles
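A minimal sketch of that setting in a Bourne-style shell, assuming myhost.com (the name used in the submit command in this thread) is the publicly visible name; substitute whatever name actually resolves from outside:

```shell
# Assumption: myhost.com is the publicly resolvable name of this machine.
GLOBUS_HOSTNAME=myhost.com
export GLOBUS_HOSTNAME
```

The variable has to be set in the environment of the process that needs it (the client shell or the container's startup environment) before that process is started.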
On Nov 19, 2007, at 2:57 PM, Daniel Andrzejewski wrote:
Hi all,
When I submit the following job I get no problems, but no output either.
globusrun-ws -submit -F https://myhost.com:8443/wsrf/services/ManagedJobFactoryService -Ft PBS -c /bin/ls
When I add -s option I get the following error:

ssh: node10.local:1: Name or service not known
/var/torque/mom_priv/jobs/179.myhost-head.SC: line 37: [: too many arguments
I don't have any problems with ssh keys and I use Torque/Maui.

Thanks in advance.

--Daniel Andrzejewski
student IT Administrator
Electrical Engineering and Computer Science
University of Tennessee
(865) 974 - 4388 (work)
"Investment in knowledge always pays the best interest" Benjamin Franklin
--
