Re: [gridengine users] SoGE Upgrade Method

2012-08-27 Thread Smith, David [EESUS]
Yes, unzipping. I believe it was because, for whatever reason, the libraries needed for Berkeley spooling were not present in the zip files after the March release, if memory serves. They previously had been. David -Original Message- From: Dave Love,,, [mailto:d.l...@liverpool.ac.uk]

[gridengine users] Segfault trying to start qmaster w/GE2011.11p1

2012-08-27 Thread Karl Vollmer
Hello, I recently tried installing GE2011.11p1 and, after configuring it, I get a segfault when trying to start the qmaster. If anyone has any insight I would appreciate the help. OS: CentOS 6.3 x86_32 I used/use the following set of commands to install GE2011, you'll note the .po

Re: [gridengine users] sge inspect

2012-08-27 Thread Chakravarthy Girda
Hi, I am looking for something like "Platform-RTM" to configure the cluster and also to maintain the history of the overall cluster and individual nodes. To my knowledge, "qmon" is just an admin tool. Thank you Chakri On Sat, 25 Aug 2012, Reuti wrote: Hi, On 25.08.2012 at 02:04, Dav

Re: [gridengine users] Segfault trying to start qmaster w/GE2011.11p1

2012-08-27 Thread Rayson Ho
Most of our users don't run 32-bit Linux, but we tested it and it worked for us (earlier RHEL versions, but not CentOS 6.3). Can you run the qmaster under gdb so that gdb would show the stack trace? Rayson On Mon, Aug 27, 2012 at 10:35 AM, Karl Vollmer wrote: > Hello, > > I recently tried in

Re: [gridengine users] sge inspect

2012-08-27 Thread Chakravarthy Girda
Morning Dave, Thank you for the suggestion. I will give it a try. Also, what are the SGE builds that you have on this site? Are they the free version of SGE? I mean, can I try that on a full production cluster with more than 300 nodes? Thank you Chakri On Sat, 25 Aug 2012, Dave Love wrote:

Re: [gridengine users] Segfault trying to start qmaster w/GE2011.11p1

2012-08-27 Thread Karl Vollmer
Rayson Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0xb51ffb70 (LWP 3441)] 0x0824ce78 in sge_get_message_id_output_implementation () (gdb) bt #0 0x0824ce78 in sge_get_message_id_output_implementation () #1 0x0824cf0d in sge_gettext_ () #2 0x08239fae in sge_monitor_i

Re: [gridengine users] Segfault trying to start qmaster w/GE2011.11p1

2012-08-27 Thread Rayson Ho
Thanks, so it was indeed an i18n / locale issue (and we don't usually compile with " -intl", as we only need English messages). Did you try binaries compiled without " -intl"? Rayson On Mon, Aug 27, 2012 at 10:56 AM, Karl Vollmer wrote: > Rayson > > Program received signal SIGSEG

Re: [gridengine users] Tightly integrated parallel environment - Cleanly stopping "qrsh -inherit" sub-processes

2012-08-27 Thread Julien Nicoulaud
Thanks for your answer, I'll deal with the clean shutdown in my application. However, do you know whether this task failure detection can be disabled? It is acceptable for me to have one worker process crash, but not if it kills the whole job as a side effect... 2012/8/26 Reuti > Hi, > > Am

Re: [gridengine users] Tightly integrated parallel environment - Cleanly stopping "qrsh -inherit" sub-processes

2012-08-27 Thread Reuti
On 27.08.2012 at 18:20, Julien Nicoulaud wrote: > Thanks for your answer, I'll deal with the clean shutdown in my application. > > However, do you know whether this task failure detection can be disabled? It > is acceptable for me to have one worker process crashing, but not if it kills > the