Rayson

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb51ffb70 (LWP 3441)]
0x0824ce78 in sge_get_message_id_output_implementation ()
(gdb) bt
#0  0x0824ce78 in sge_get_message_id_output_implementation ()
#1  0x0824cf0d in sge_gettext_ ()
#2  0x08239fae in sge_monitor_init ()
#3  0x08056463 in sge_signaler_main ()
#4  0x00166a49 in start_thread () from /lib/libpthread.so.0
#5  0x00259e5e in clone () from /lib/libc.so.6
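
For anyone reproducing this, here is a minimal sketch of capturing the same kind of backtrace non-interactively. The helper name qmaster_bt and the DRY_RUN switch are made up for illustration and are not part of Grid Engine; -batch, -ex and --args are standard gdb options. By default the function only prints the command so it can be reviewed before running it.

```shell
# Hypothetical helper: print (or run) a gdb command that starts the
# given binary and dumps a backtrace for every thread once it stops.
qmaster_bt() {
    bin="$1"    # path to the sge_qmaster binary, supplied by the caller
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: only show the command, do not start gdb.
        printf 'gdb -batch -ex run -ex "thread apply all bt" --args %s\n' "$bin"
    else
        gdb -batch -ex run -ex "thread apply all bt" --args "$bin"
    fi
}
```

Usage would be something like `DRY_RUN=0 qmaster_bt "$bin_dir/sge_qmaster"`, with bin_dir set to wherever the qmaster binary was installed.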

  Let me know if you want/need any more info

-Karl 
On Mon, Aug 27, 2012 at 10:45:19AM -0400, Rayson Ho wrote:
> Most of our users don't run 32-bit Linux, but we tested it on earlier
> RHEL versions (not CentOS 6.3) and it worked for us. Can you run the
> qmaster under gdb so that gdb shows the stack trace?
> 
> Rayson
> 
> 
> 
> On Mon, Aug 27, 2012 at 10:35 AM, Karl Vollmer <[email protected]> wrote:
> > Hello,
> >
> >   I recently tried installing GE2011.11p1 and after configuring it I get a
> >   seg-fault when trying to start the qmaster. If anyone has any insight I
> >   would appreciate the help.
> >
> >   OS: CentOS 6.3 x86_32
> >
> >   I used the following set of commands to install GE2011. Note the .po
> >   file mojo at the end; I'm not sure it's required, but it did remove
> >   some errors from the debug log.
> >
> > ----
> > export JAVA_HOME=/usr/lib/jvm/java-openjdk/
> > ./aimk -only-depend
> > ./scripts/zerodepend
> > ./aimk depend
> > ./aimk -no-java -no-jni -no-secure -spool-classic -no-dump -intl
> > ./aimk -man
> > ./scripts/distinst -y -basedir /opt/ge -vdir GE2011.11p1  -all -noexit
> > ln -s /usr/src/${PKGNAME}/source/scripts/mk_dist /opt/ge/GE2011.11p1/mk_dist
> > cp -r /usr/src/${PKGNAME}/source/dist/locale /opt/ge/GE2011.11p1/
> > mkdir -p /opt/ge/GE2011.11p1/locale/en/LC_MESSAGES/linux-x86
> > cp /opt/ge/GE2011.11p1/locale/en/LC_MESSAGES/gridengine.po /opt/ge/GE2011.11p1/locale/en/LC_MESSAGES/linux-x86
> > msgfmt /opt/ge/GE2011.11p1/locale/en/LC_MESSAGES/linux-x86/gridengine.po -o /opt/ge/GE2011.11p1/locale/en/LC_MESSAGES/linux-x86/gridengine.mo
> > ----
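
A small sketch for checking that the compiled catalog ends up where the commands above put it. mo_path is a hypothetical helper name; the directory layout is taken only from the cp/msgfmt commands in this message and is not a documented Grid Engine interface.

```shell
# Hypothetical helper: build the message catalog path used in the
# install commands above.
mo_path() {
    # $1 = install root, $2 = language, $3 = architecture directory
    printf '%s/locale/%s/LC_MESSAGES/%s/gridengine.mo\n' "$1" "$2" "$3"
}

mo=$(mo_path /opt/ge/GE2011.11p1 en linux-x86)
if [ -s "$mo" ]; then
    # -s: the file exists and is not empty
    echo "catalog present: $mo"
else
    echo "catalog missing or empty: $mo"
fi
```

This only confirms that the file exists and is non-empty; it says nothing about whether the catalog contents are valid.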
> >
> >   Packages installed to support GE2011
> >
> >   package { 'java-1.7.0-openjdk': ensure        => installed }
> >   package { 'java-1.7.0-openjdk-devel': ensure  => installed }
> >   package { 'lib-X11-devel': ensure             => installed }
> >   package { 'openssl-static': ensure            => installed }
> >   package { 'tcsh': ensure            => installed }
> >   package { 'pam-devel': ensure       => installed }
> >   package { 'openmotif-devel': ensure => installed }
> >   package { 'openssl': ensure         => installed }
> >   package { 'openssl-devel': ensure   => installed }
> >   package { 'libXpm-devel': ensure    => installed }
> >   package { 'ncurses-devel': ensure   => installed }
> >   package { 'ncurses': ensure         => installed }
> >   package { 'texinfo': ensure         => installed }
> >   package { 'unzip': ensure           => installed }
> >   package { 'gettext': ensure         => installed }
> >   package { 'gettext-devel': ensure   => installed }
> >
> >   Debug output from sge_qmaster
> >
> >   starting sge_qmaster
> >      0   2951 -1216837952     ****** starting localization procedure ... **********
> >      1   2951 -1216837952     could not get environment variable "GRIDPACKAGE"
> >      2   2951 -1216837952     could not get environment variable "GRIDLOCALEDIR"
> >      3   2951 -1216837952     setlocale() returns NULL
> >      4   2951 -1216837952     locale directory: >/opt/ge/GE2011.1
> >      5   2951 -1216837952     package file:     >linux-x86/gridengine.mo<
> >      6   2951 -1216837952     language (LANG):  >en<
> >      7   2951 -1216837952     loading message file: /opt/ge/GE2011.11p1/locale/en/LC_MESSAGES/linux-x86/gridengine.mo
> >      8   2951 -1216837952     found message file - ok
> >      9   2951 -1216837952     setlocale() returns NULL
> >     10   2951 -1216837952     bindtextdomain() returns "/opt/ge/GE2011.11p1/locale"
> >     11   2951 -1216837952     textdomain() returns "linux-x86/gridengine"
> >     12   2951 -1216837952     error id output     : enabled
> >     13   2951 -1216837952     ****** starting localization procedure ... success **
> >     14   2951 -1216837952     sge_qmaster is not daemonized
> >     15   2951 -1216837952     returning port value: 7111
> >     16   2951 -1216837952     returning port value: 7112
> >     17   2951 -1216837952     Getting host by name - Linux
> >     18   2951 -1216837952     1 names in h_addr_list
> >     19   2951 -1216837952     1 names in h_aliases
> >     20   2951 -1216837952     Getting host by name - Linux
> >     21   2951 -1216837952     1 names in h_addr_list
> >     22   2951 -1216837952     1 names in h_aliases
> >     23   2951         main     Getting host by name - Linux
> >     24   2951         main     1 names in h_addr_list
> >     25   2951         main     1 names in h_aliases
> >     26   2951         main     creating QMASTER handle
> >     27   2951         main     act_qmaster file contains local host name
> >     28   2951         main     auid=0; agid=0
> >     29   2951         main     uid=0; gid=0; euid=0; egid=0 auid=0; agid=0
> >     30   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/jobs"
> >     31   2951         main     retval = 0
> >     32   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/zombies"
> >     33   2951         main     retval = 0
> >     34   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/cqueues"
> >     35   2951         main     retval = 0
> >     36   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/qinstances"
> >     37   2951         main     retval = 0
> >     38   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/exec_hosts"
> >     39   2951         main     retval = 0
> >     40   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/submit_hosts"
> >     41   2951         main     retval = 0
> >     42   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/admin_hosts"
> >     43   2951         main     retval = 0
> >     44   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/centry"
> >     45   2951         main     retval = 0
> >     46   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/job_scripts"
> >     47   2951         main     retval = 0
> >     48   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/pe"
> >     49   2951         main     retval = 0
> >     50   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/ckpt"
> >     51   2951         main     retval = 0
> >     52   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/usersets"
> >     53   2951         main     retval = 0
> >     54   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/calendars"
> >     55   2951         main     retval = 0
> >     56   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/hostgroups"
> >     57   2951         main     retval = 0
> >     58   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/users"
> >     59   2951         main     retval = 0
> >     60   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/projects"
> >     61   2951         main     retval = 0
> >     62   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/resource_quotas"
> >     63   2951         main     retval = 0
> >     64   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/advance_reservations"
> >     65   2951         main     retval = 0
> >     66   2951         main     Making dir "/opt/ge/GE2011.11p1/default/common/local_conf"
> >     67   2951         main     retval = 0
> >     68   2951         main     reading CONFIG "global"
> >     69   2951         main     reading CONFIG "veli.local"
> >     70   2951         main     qualified_hostname: 'veli.local'
> >     71   2951         main     Complex Attributes----------------------
> >     72   2951         main     reading COMPLEX_ENTRY "s_vmem"
> >     73   2951         main     reading COMPLEX_ENTRY "s_core"
> >     74   2951         main     reading COMPLEX_ENTRY "virtual_used"
> >     75   2951         main     reading COMPLEX_ENTRY "s_rss"
> >     76   2951         main     reading COMPLEX_ENTRY "swap_total"
> >     77   2951         main     reading COMPLEX_ENTRY "load_long"
> >     78   2951         main     reading COMPLEX_ENTRY "virtual_free"
> >     79   2951         main     reading COMPLEX_ENTRY "np_load_avg"
> >     80   2951         main     reading COMPLEX_ENTRY "calendar"
> >     81   2951         main     reading COMPLEX_ENTRY "h_cpu"
> >     82   2951         main     reading COMPLEX_ENTRY "min_cpu_interval"
> >     83   2951         main     reading COMPLEX_ENTRY "h_rt"
> >     84   2951         main     reading COMPLEX_ENTRY "h_vmem"
> >     85   2951         main     reading COMPLEX_ENTRY "h_data"
> >     86   2951         main     reading COMPLEX_ENTRY "m_socket"
> >     87   2951         main     reading COMPLEX_ENTRY "mem_used"
> >     88   2951         main     reading COMPLEX_ENTRY "s_rt"
> >     89   2951         main     reading COMPLEX_ENTRY "virtual_total"
> >     90   2951         main     reading COMPLEX_ENTRY "swap_free"
> >     91   2951         main     reading COMPLEX_ENTRY "seq_no"
> >     92   2951         main     reading COMPLEX_ENTRY "slots"
> >     93   2951         main     reading COMPLEX_ENTRY "s_fsize"
> >     94   2951         main     reading COMPLEX_ENTRY "h_stack"
> >     95   2951         main     reading COMPLEX_ENTRY "h_fsize"
> >     96   2951         main     reading COMPLEX_ENTRY "load_avg"
> >     97   2951         main     reading COMPLEX_ENTRY "load_short"
> >     98   2951         main     reading COMPLEX_ENTRY "hostname"
> >     99   2951         main     reading COMPLEX_ENTRY "h_rss"
> >    100   2951         main     reading COMPLEX_ENTRY "np_load_short"
> >    101   2951         main     reading COMPLEX_ENTRY "arch"
> >    102   2951         main     reading COMPLEX_ENTRY "num_proc"
> >    103   2951         main     reading COMPLEX_ENTRY "np_load_medium"
> >    104   2951         main     reading COMPLEX_ENTRY "m_topology_inuse"
> >    105   2951         main     reading COMPLEX_ENTRY "h_core"
> >    106   2951         main     reading COMPLEX_ENTRY "tmpdir"
> >    107   2951         main     reading COMPLEX_ENTRY "s_data"
> >    108   2951         main     reading COMPLEX_ENTRY "rerun"
> >    109   2951         main     reading COMPLEX_ENTRY "qname"
> >    110   2951         main     reading COMPLEX_ENTRY "s_cpu"
> >    111   2951         main     reading COMPLEX_ENTRY "mem_total"
> >    112   2951         main     reading COMPLEX_ENTRY "s_stack"
> >    113   2951         main     reading COMPLEX_ENTRY "swap_rsvd"
> >    114   2951         main     reading COMPLEX_ENTRY "m_topology"
> >    115   2951         main     reading COMPLEX_ENTRY "cpu"
> >    116   2951         main     reading COMPLEX_ENTRY "load_medium"
> >    117   2951         main     reading COMPLEX_ENTRY "mem_free"
> >    118   2951         main     reading COMPLEX_ENTRY "swap_rate"
> >    119   2951         main     reading COMPLEX_ENTRY "np_load_long"
> >    120   2951         main     reading COMPLEX_ENTRY "m_core"
> >    121   2951         main     reading COMPLEX_ENTRY "swap_used"
> >    122   2951         main     reading COMPLEX_ENTRY "display_win_gui"
> >    123   2951         main     host_list----------------------------
> >    124   2951         main     reading EXECHOST "global"
> >    125   2951         main     reading EXECHOST "template"
> >    126   2951         main     reading ADMINHOST "veli.local"
> >    127   2951         main     manager_list----------------------------
> >    128   2951         main     root
> >    129   2951         main     host group definitions-----------
> >    130   2951         main     operator_list----------------------------
> >    131   2951         main     userset_list------------------------------
> >    132   2951         main     reading USERSET "deadlineusers"
> >    133   2951         main     reading USERSET "defaultdepartment"
> >    134   2951         main     reading USERSET "arusers"
> >    135   2951         main     calendar list ------------------------------
> >    136   2951         main     resource quota list -----------------------
> >    137   2951         main     cluster_queue_list---------------------------------
> >    138   2951         main     pe_list---------------------------------
> >    139   2951         main     ckpt_list---------------------------------
> >    140   2951         main     advance reservation list -----------------------
> >    141   2951         main     job_list-----------------------------------
> >    142   2951         main     user list-----------------------------------
> >    143   2951         main     project list-----------------------------------
> >    144   2951         main     scheduler config -----------------------------------
> >    145   2951         main     reading SCHEDD_CONF "sched_configuration"
> >    146   2951         main     sconf_validate: no config to validate
> >    147   2951         main     share tree list-----------------------------------
> >    148   2951         main     reading SHARETREE "sharetree"
> >    149   2951         main     received qmaster_params are: none
> >    150   2951         main     event master functionality has been initialized
> >    151   2951 -1275073680 /etc/init.d/sgemaster.p7111: line 760:  2951 Segmentation fault      $bin_dir/sge_qmaster
> >
> >    Strace (last few lines)
> >
> > open("qmaster.pid", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0644) = 4
> > close(4)                                = 0
> > open("qmaster.pid", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 4
> > fstat64(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> > mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7895000
> > write(4, "3273\n", 5)                   = 5
> > close(4)                                = 0
> > munmap(0xb7895000, 4096)                = 0
> > mmap2(NULL, 10489856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0xb40ff000
> > mprotect(0xb40ff000, 4096, PROT_NONE)   = 0
> > clone(child_stack=0xb4aff494, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SET
> > write(2, "   150   3273         main ", 27   150   3273         main ) = 27
> > write(2, 0xbfc7f170, 52 <unfinished ...>
> > +++ killed by SIGSEGV +++
> > Segmentation fault
> >
> >     Long enough? :D
> >
> > -Karl Vollmer
> >
> > _______________________________________________
> > users mailing list
> > [email protected]
> > https://gridengine.org/mailman/listinfo/users