Hi Nate. I understand that this bug does not stem directly from the 
Galaxy code base. Munge is just a tool for passing user credentials between 
systems during job submission to the PBS server. Galaxy spools jobs 
through the PBS job runner, which presumably calls munge indirectly when it 
submits jobs to the PBS server. I'm just not sure why the munge processes 
sometimes end up defunct. This is a problem because I quickly reach the 
maximum number of threads for the Galaxy user on my head node. 
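
A `Z` / `<defunct>` entry is a child that has already exited but has not yet 
been reaped by its parent with wait(), which is why these all hang off the 
Galaxy process. To keep an eye on how fast they accumulate, something like the 
following could tally defunct processes per parent. This is only a rough 
sketch: it assumes Python 3.7+ and a procps-style `ps` that accepts 
`-eo pid,ppid,stat,comm`, and the function names are made up for illustration.

```python
import subprocess
from collections import Counter


def count_zombies(ps_text):
    """Tally zombie processes per (parent PID, command).

    Expects text in the layout of `ps -eo pid,ppid,stat,comm`.
    """
    counts = Counter()
    for line in ps_text.splitlines():
        fields = line.split(None, 3)
        # Skip the header row and anything that is not a full process line.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        _pid, ppid, stat, comm = fields
        # A STAT value starting with "Z" marks a defunct (zombie) process.
        if stat.startswith("Z"):
            counts[(int(ppid), comm.strip())] += 1
    return counts


def zombies_now():
    """Run ps (assumed: procps on Linux) and tally the current zombies."""
    out = subprocess.check_output(
        ["ps", "-eo", "pid,ppid,stat,comm"], text=True
    )
    return count_zombies(out)

# Example use on the head node:
#     for (ppid, comm), n in zombies_now().most_common():
#         print(ppid, comm, n)
```

Running `zombies_now()` periodically (from cron, say) would show whether the 
count under the Galaxy PID grows with each submitted job.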

At this point I guess I'll try downloading the latest stable version of Torque 
and building RPMs from it. Until now I have been using the packages in EPEL for 
RHEL6. Thanks for the reply, and any other thoughts are still appreciated!

On Dec 7, 2012, at 10:51 AM, Nate Coraor <n...@bx.psu.edu> wrote:

> On Dec 6, 2012, at 2:34 PM, Matthew Shirley wrote:
> 
>> I am fairly new to PBS management, so I can't rule out some 
>> misconfiguration, but I have a strange issue when running Galaxy with the 
>> PBS job runner. It seems that munge spawns a bunch of defunct processes 
>> after running Galaxy on my cluster:
>> 
>> `ps axjf`:
>> 
>>     1 25992 25991 25991 ?           -1 Sl   77777   8:48 python ./scripts/paster.py serve universe_wsgi.ini --daemon
>> 25992 26032 25991 25991 ?           -1 Z    77777   0:00  \_ [munge] <defunct>
>> 25992 26034 25991 25991 ?           -1 Z    77777   0:00  \_ [munge] <defunct>
>> 25992 26036 25991 25991 ?           -1 Z    77777   0:00  \_ [munge] <defunct>
>> 
>> Now, these processes are being spawned by Galaxy, and I can't figure out 
>> why. Can anyone provide some insight or clues about where to start debugging 
>> this? Thanks,
> 
> Hi Matt,
> 
> I'm not sure what munge is; it's not something provided with Galaxy.  
> Googling suggests it might be an authentication tool used in some HPC 
> environments.  Without having any familiarity with it, I can't say what 
> process in Galaxy would be interacting with it, especially since that 
> interaction must occur implicitly somewhere down the chain of normal Galaxy 
> operations.
> 
> --nate
> 
>> 
>> Matt
>> 
>> ___________________________________________________________
>> Please keep all replies on the list by using "reply all"
>> in your mail client.  To manage your subscriptions to this
>> and other Galaxy lists, please use the interface at:
>> 
>> http://lists.bx.psu.edu/
> 
> 


