Thanks, so it was indeed an i18n / locale issue (we don't usually
compile with "-intl", as we only need English messages).

Did you try binaries compiled without "-intl"?
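
In case it helps, here is a minimal sketch of the build line with "-intl"
dropped. It assumes the other aimk flags from your original message stay the
same, and it only prints the command rather than running it, so you can check
it first. The variable names are just for illustration.

```shell
# Flags copied from the aimk build line quoted further down in this thread.
orig_flags="-no-java -no-jni -no-secure -spool-classic -no-dump -intl"

# Drop "-intl" and keep every other flag in its original order.
new_flags=""
for f in $orig_flags; do
    [ "$f" = "-intl" ] || new_flags="${new_flags:+$new_flags }$f"
done

# Print the resulting command instead of executing it.
echo "./aimk $new_flags"
```

The distinst step from your message would presumably need to be rerun after
rebuilding.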

Rayson



On Mon, Aug 27, 2012 at 10:56 AM, Karl Vollmer <[email protected]> wrote:
> Rayson
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0xb51ffb70 (LWP 3441)]
> 0x0824ce78 in sge_get_message_id_output_implementation ()
> (gdb) bt
> #0  0x0824ce78 in sge_get_message_id_output_implementation ()
> #1  0x0824cf0d in sge_gettext_ ()
> #2  0x08239fae in sge_monitor_init ()
> #3  0x08056463 in sge_signaler_main ()
> #4  0x00166a49 in start_thread () from /lib/libpthread.so.0
> #5  0x00259e5e in clone () from /lib/libc.so.6
>
>   Let me know if you want/need any more info
>
> -Karl
> On Mon, Aug 27, 2012 at 10:45:19AM -0400, Rayson Ho wrote:
>> Most of our users don't run 32-bit Linux, but we tested it and it
>> worked for us (earlier RHEL versions, but not CentOS 6.3). Can you run
>> the qmaster under gdb so that gdb would show the stack trace?
>>
>> Rayson
>>
>>
>>
>> On Mon, Aug 27, 2012 at 10:35 AM, Karl Vollmer <[email protected]> wrote:
>> > Hello,
>> >
>> >   I recently tried installing GE2011.11p1 and after configuring it I get a
>> >   seg-fault when trying to start the qmaster. If anyone has any insight I
>> >   would appreciate the help.
>> >
>> >   OS: CentOS 6.3 x86_32
>> >
>> >   I used/use the following set of commands to install GE2011, you'll note the
>> >   .po file mojo at the end, not sure if that's required but it did remove some
>> >   errors from the debug log.
>> >
>> > ----
>> > export JAVA_HOME=/usr/lib/jvm/java-openjdk/
>> > ./aimk -only-depend
>> > ./scripts/zerodepend
>> > ./aimk depend
>> > ./aimk -no-java -no-jni -no-secure -spool-classic -no-dump -intl
>> > ./aimk -man
>> > ./scripts/distinst -y -basedir /opt/ge -vdir GE2011.11p1  -all -noexit
>> > ln -s /usr/src/${PKGNAME}/source/scripts/mk_dist /opt/ge/GE2011.11p1/mk_dist
>> > cp -r /usr/src/${PKGNAME}/source/dist/locale /opt/ge/GE2011.11p1/
>> > mkdir -p /opt/ge/GE2011.11p1/locale/en/LC_MESSAGES/linux-x86
>> > cp /opt/ge/GE2011.11p1/locale/en/LC_MESSAGES/gridengine.po /opt/ge/GE2011.11p1/locale/en/LC_MESSAGES/linux-x86
>> > msgfmt /opt/ge/GE2011.11p1/locale/en/LC_MESSAGES/linux-x86/gridengine.po -o /opt/ge/GE2011.11p1/locale/en/LC_MESSAGES/linux-x86/gridengine.mo
>> > ----
>> >
>> >   Packages installed to support GE2011
>> >
>> >   package { 'java-1.7.0-openjdk': ensure        => installed }
>> >   package { 'java-1.7.0-openjdk-devel': ensure  => installed }
>> >   package { 'lib-X11-devel': ensure             => installed }
>> >   package { 'openssl-static': ensure            => installed }
>> >   package { 'tcsh': ensure            => installed }
>> >   package { 'pam-devel': ensure       => installed }
>> >   package { 'openmotif-devel': ensure => installed }
>> >   package { 'openssl': ensure         => installed }
>> >   package { 'openssl-devel': ensure   => installed }
>> >   package { 'libXpm-devel': ensure    => installed }
>> >   package { 'ncurses-devel': ensure   => installed }
>> >   package { 'ncurses': ensure         => installed }
>> >   package { 'texinfo': ensure         => installed }
>> >   package { 'unzip': ensure           => installed }
>> >   package { 'gettext': ensure         => installed }
>> >   package { 'gettext-devel': ensure   => installed }
>> >
>> >   Debug output from sge_qmaster
>> >
>> >   starting sge_qmaster
>> >      0   2951 -1216837952     ****** starting localization procedure ... **********
>> >      1   2951 -1216837952     could not get environment variable "GRIDPACKAGE"
>> >      2   2951 -1216837952     could not get environment variable "GRIDLOCALEDIR"
>> >      3   2951 -1216837952     setlocale() returns NULL
>> >      4   2951 -1216837952     locale directory: >/opt/ge/GE2011.1
>> >      5   2951 -1216837952     package file:     >linux-x86/gridengine.mo<
>> >      6   2951 -1216837952     language (LANG):  >en<
>> >      7   2951 -1216837952     loading message file: /opt/ge/GE2011.11p1/locale/en/LC_MESSAGES/linux-x86/gridengine.mo
>> >      8   2951 -1216837952     found message file - ok
>> >      9   2951 -1216837952     setlocale() returns NULL
>> >     10   2951 -1216837952     bindtextdomain() returns "/opt/ge/GE2011.11p1/locale"
>> >     11   2951 -1216837952     textdomain() returns "linux-x86/gridengine"
>> >     12   2951 -1216837952     error id output     : enabled
>> >     13   2951 -1216837952     ****** starting localization procedure ... success **
>> >     14   2951 -1216837952     sge_qmaster is not daemonized
>> >     15   2951 -1216837952     returning port value: 7111
>> >     16   2951 -1216837952     returning port value: 7112
>> >     17   2951 -1216837952     Getting host by name - Linux
>> >     18   2951 -1216837952     1 names in h_addr_list
>> >     19   2951 -1216837952     1 names in h_aliases
>> >     20   2951 -1216837952     Getting host by name - Linux
>> >     21   2951 -1216837952     1 names in h_addr_list
>> >     22   2951 -1216837952     1 names in h_aliases
>> >     23   2951         main     Getting host by name - Linux
>> >     24   2951         main     1 names in h_addr_list
>> >     25   2951         main     1 names in h_aliases
>> >     26   2951         main     creating QMASTER handle
>> >     27   2951         main     act_qmaster file contains local host name
>> >     28   2951         main     auid=0; agid=0
>> >     29   2951         main     uid=0; gid=0; euid=0; egid=0 auid=0; agid=0
>> >     30   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/jobs"
>> >     31   2951         main     retval = 0
>> >     32   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/zombies"
>> >     33   2951         main     retval = 0
>> >     34   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/cqueues"
>> >     35   2951         main     retval = 0
>> >     36   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/qinstances"
>> >     37   2951         main     retval = 0
>> >     38   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/exec_hosts"
>> >     39   2951         main     retval = 0
>> >     40   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/submit_hosts"
>> >     41   2951         main     retval = 0
>> >     42   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/admin_hosts"
>> >     43   2951         main     retval = 0
>> >     44   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/centry"
>> >     45   2951         main     retval = 0
>> >     46   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/job_scripts"
>> >     47   2951         main     retval = 0
>> >     48   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/pe"
>> >     49   2951         main     retval = 0
>> >     50   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/ckpt"
>> >     51   2951         main     retval = 0
>> >     52   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/usersets"
>> >     53   2951         main     retval = 0
>> >     54   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/calendars"
>> >     55   2951         main     retval = 0
>> >     56   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/hostgroups"
>> >     57   2951         main     retval = 0
>> >     58   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/users"
>> >     59   2951         main     retval = 0
>> >     60   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/projects"
>> >     61   2951         main     retval = 0
>> >     62   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/resource_quotas"
>> >     63   2951         main     retval = 0
>> >     64   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/advance_reservations"
>> >     65   2951         main     retval = 0
>> >     66   2951         main     Making dir "/opt/ge/GE2011.11p1/default/common/local_conf"
>> >     67   2951         main     retval = 0
>> >     68   2951         main     reading CONFIG "global"
>> >     69   2951         main     reading CONFIG "veli.local"
>> >     70   2951         main     qualified_hostname: 'veli.local'
>> >     71   2951         main     Complex Attributes----------------------
>> >     72   2951         main     reading COMPLEX_ENTRY "s_vmem"
>> >     73   2951         main     reading COMPLEX_ENTRY "s_core"
>> >     74   2951         main     reading COMPLEX_ENTRY "virtual_used"
>> >     75   2951         main     reading COMPLEX_ENTRY "s_rss"
>> >     76   2951         main     reading COMPLEX_ENTRY "swap_total"
>> >     77   2951         main     reading COMPLEX_ENTRY "load_long"
>> >     78   2951         main     reading COMPLEX_ENTRY "virtual_free"
>> >     79   2951         main     reading COMPLEX_ENTRY "np_load_avg"
>> >     80   2951         main     reading COMPLEX_ENTRY "calendar"
>> >     81   2951         main     reading COMPLEX_ENTRY "h_cpu"
>> >     82   2951         main     reading COMPLEX_ENTRY "min_cpu_interval"
>> >     83   2951         main     reading COMPLEX_ENTRY "h_rt"
>> >     84   2951         main     reading COMPLEX_ENTRY "h_vmem"
>> >     85   2951         main     reading COMPLEX_ENTRY "h_data"
>> >     86   2951         main     reading COMPLEX_ENTRY "m_socket"
>> >     87   2951         main     reading COMPLEX_ENTRY "mem_used"
>> >     88   2951         main     reading COMPLEX_ENTRY "s_rt"
>> >     89   2951         main     reading COMPLEX_ENTRY "virtual_total"
>> >     90   2951         main     reading COMPLEX_ENTRY "swap_free"
>> >     91   2951         main     reading COMPLEX_ENTRY "seq_no"
>> >     92   2951         main     reading COMPLEX_ENTRY "slots"
>> >     93   2951         main     reading COMPLEX_ENTRY "s_fsize"
>> >     94   2951         main     reading COMPLEX_ENTRY "h_stack"
>> >     95   2951         main     reading COMPLEX_ENTRY "h_fsize"
>> >     96   2951         main     reading COMPLEX_ENTRY "load_avg"
>> >     97   2951         main     reading COMPLEX_ENTRY "load_short"
>> >     98   2951         main     reading COMPLEX_ENTRY "hostname"
>> >     99   2951         main     reading COMPLEX_ENTRY "h_rss"
>> >    100   2951         main     reading COMPLEX_ENTRY "np_load_short"
>> >    101   2951         main     reading COMPLEX_ENTRY "arch"
>> >    102   2951         main     reading COMPLEX_ENTRY "num_proc"
>> >    103   2951         main     reading COMPLEX_ENTRY "np_load_medium"
>> >    104   2951         main     reading COMPLEX_ENTRY "m_topology_inuse"
>> >    105   2951         main     reading COMPLEX_ENTRY "h_core"
>> >    106   2951         main     reading COMPLEX_ENTRY "tmpdir"
>> >    107   2951         main     reading COMPLEX_ENTRY "s_data"
>> >    108   2951         main     reading COMPLEX_ENTRY "rerun"
>> >    109   2951         main     reading COMPLEX_ENTRY "qname"
>> >    110   2951         main     reading COMPLEX_ENTRY "s_cpu"
>> >    111   2951         main     reading COMPLEX_ENTRY "mem_total"
>> >    112   2951         main     reading COMPLEX_ENTRY "s_stack"
>> >    113   2951         main     reading COMPLEX_ENTRY "swap_rsvd"
>> >    114   2951         main     reading COMPLEX_ENTRY "m_topology"
>> >    115   2951         main     reading COMPLEX_ENTRY "cpu"
>> >    116   2951         main     reading COMPLEX_ENTRY "load_medium"
>> >    117   2951         main     reading COMPLEX_ENTRY "mem_free"
>> >    118   2951         main     reading COMPLEX_ENTRY "swap_rate"
>> >    119   2951         main     reading COMPLEX_ENTRY "np_load_long"
>> >    120   2951         main     reading COMPLEX_ENTRY "m_core"
>> >    121   2951         main     reading COMPLEX_ENTRY "swap_used"
>> >    122   2951         main     reading COMPLEX_ENTRY "display_win_gui"
>> >    123   2951         main     host_list----------------------------
>> >    124   2951         main     reading EXECHOST "global"
>> >    125   2951         main     reading EXECHOST "template"
>> >    126   2951         main     reading ADMINHOST "veli.local"
>> >    127   2951         main     manager_list----------------------------
>> >    128   2951         main     root
>> >    129   2951         main     host group definitions-----------
>> >    130   2951         main     operator_list----------------------------
>> >    131   2951         main     userset_list------------------------------
>> >    132   2951         main     reading USERSET "deadlineusers"
>> >    133   2951         main     reading USERSET "defaultdepartment"
>> >    134   2951         main     reading USERSET "arusers"
>> >    135   2951         main     calendar list ------------------------------
>> >    136   2951         main     resource quota list -----------------------
>> >    137   2951         main     cluster_queue_list---------------------------------
>> >    138   2951         main     pe_list---------------------------------
>> >    139   2951         main     ckpt_list---------------------------------
>> >    140   2951         main     advance reservation list -----------------------
>> >    141   2951         main     job_list-----------------------------------
>> >    142   2951         main     user list-----------------------------------
>> >    143   2951         main     project list-----------------------------------
>> >    144   2951         main     scheduler config -----------------------------------
>> >    145   2951         main     reading SCHEDD_CONF "sched_configuration"
>> >    146   2951         main     sconf_validate: no config to validate
>> >    147   2951         main     share tree list-----------------------------------
>> >    148   2951         main     reading SHARETREE "sharetree"
>> >    149   2951         main     received qmaster_params are: none
>> >    150   2951         main     event master functionality has been initialized
>> >    151   2951 -1275073680 /etc/init.d/sgemaster.p7111: line 760:  2951 Segmentation fault      $bin_dir/sge_qmaster
>> >
>> >    Strace (last few lines)
>> >
>> > open("qmaster.pid", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0644) = 4
>> > close(4)                                = 0
>> > open("qmaster.pid", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 4
>> > fstat64(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
>> > mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7895000
>> > write(4, "3273\n", 5)                   = 5
>> > close(4)                                = 0
>> > munmap(0xb7895000, 4096)                = 0
>> > mmap2(NULL, 10489856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0xb40ff000
>> > mprotect(0xb40ff000, 4096, PROT_NONE)   = 0
>> > clone(child_stack=0xb4aff494, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SET
>> > write(2, "   150   3273         main ", 27   150   3273         main ) = 27
>> > write(2, 0xbfc7f170, 52 <unfinished ...>
>> > +++ killed by SIGSEGV +++
>> > Segmentation fault
>> >
>> >     Long enough? :D
>> >
>> > -Karl Vollmer
>> >
>> > _______________________________________________
>> > users mailing list
>> > [email protected]
>> > https://gridengine.org/mailman/listinfo/users
