Hi, Galaxy Community,

Greetings from The University of Chicago; I hope all who attended the
Galaxy conference enjoyed it as much as I did.  I have searched the
mailing list archives as well as Google to resolve a problem I am
seeing, but I am somewhat at a loss as to the next course of
action I should take to bring this issue to a close.  I am hoping
that one of the bright minds on this mailing list could help me shed
some light on the solution to my problem, or at least help me identify
a root cause.  I have configured Galaxy to integrate with a TORQUE
(version 4.0.2) server, and successfully built the PBS Python egg as
specified in the Galaxy documentation.  I am using Python version 2.6
and the latest build of Galaxy.  Whenever I launch a job from the
Galaxy UI, I get the following error messages on the PBS server:

07/30/2012 15:42:48;0080;PBS_Server;Req;dis_request_read;conflicting
version numbers, 1 detected, 2 expected
07/30/2012 15:42:48;0080;PBS_Server;Req;req_reject;Reject reply
code=15058(Bad DIS based Request Protocol MSG=cannot decode message),
aux=0, type=AlternateUserAuthentication, from galaxy@
07/30/2012 15:42:48;0080;PBS_Server;Req;dis_request_read;conflicting
version numbers, 1 detected, 2 expected
07/30/2012 15:42:48;0080;PBS_Server;Req;req_reject;Reject reply
code=15058(Bad DIS based Request Protocol MSG=cannot decode message),
aux=0, type=QueueJob, from galaxy@
07/30/2012 15:42:48;0080;PBS_Server;Req;dis_request_read;conflicting
version numbers, 1 detected, 2 expected
07/30/2012 15:42:48;0080;PBS_Server;Req;req_reject;Reject reply
code=15058(Bad DIS based Request Protocol MSG=cannot decode message),
aux=0, type=Disconnect, from galaxy@
07/30/2012 15:43:01;0002;PBS_Server;Svr;PBS_Server;Torque Server
Version = 4.0.2, loglevel = 1

One thing I did notice that suggests there might be a problem is
that there is no hostname after the galaxy@; most of the other
messages in this log file have a hostname appended to the log entry,
e.g.:

07/30/2012 15:43:38;0100;PBS_Server;Req;;Type StatusJob request
received from root@sc01, sock=10
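
In case it helps anyone compare against their own server log, here is a
rough sketch of how the rejected requests could be tallied.  It assumes
each log entry sits on a single line (the entries above are wrapped by
my mail client) and that reject entries end in "type=..., from
user@host" as shown above.  The function name summarize_rejects is my
own and is not part of TORQUE or Galaxy.

```python
import re

# Assumed shape of a reject entry, based only on the excerpt above:
#   ...;req_reject;Reject reply code=..., aux=0, type=<Type>, from <user>@<host>
REJECT_RE = re.compile(r"req_reject;.*type=(\w+), from (\S+)@(\S*)")


def summarize_rejects(lines):
    """Count rejected requests by type and collect users with no host."""
    by_type = {}
    hostless = set()
    for line in lines:
        match = REJECT_RE.search(line)
        if not match:
            continue  # not a req_reject entry
        req_type, user, host = match.groups()
        by_type[req_type] = by_type.get(req_type, 0) + 1
        if not host:
            # Nothing after the "@", as in "from galaxy@"
            hostless.add(user)
    return by_type, hostless
```

Feeding it the lines of the server log (for example via
open(path).readlines(), with whatever path your server logs to) returns
the per-type reject counts and the set of users whose requests arrived
without a hostname.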

I have run a tcpdump on the scheduler node, and I can definitely
see bi-directional traffic between the Galaxy server and the scheduler
node on TCP port 15001.  In addition, I have installed the
TORQUE client tools on the Galaxy server, and can spawn an
interactive job with qsub -I, as well as check the status of queued
jobs using qstat (from the Galaxy server).  This suggests to me that
there is a potential problem with the PBS egg, although I am not certain.
Has anybody seen something like this before, or could somebody point
me in the right direction?  We do have a support contract with
Adaptive Computing, and I am opening a ticket with them as well,
but I wanted to reach out to the Galaxy community to cover all of
my bases.  Thank you so much for taking the time to read my email.

Dan Sullivan
___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/
