Thanks, so it was indeed an i18n / locale issue (and we don't usually compile with "-intl", as we only need English messages).
Did you try binaries compiled without "-intl"?

Rayson

On Mon, Aug 27, 2012 at 10:56 AM, Karl Vollmer <[email protected]> wrote:
> Rayson
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0xb51ffb70 (LWP 3441)]
> 0x0824ce78 in sge_get_message_id_output_implementation ()
> (gdb) bt
> #0 0x0824ce78 in sge_get_message_id_output_implementation ()
> #1 0x0824cf0d in sge_gettext_ ()
> #2 0x08239fae in sge_monitor_init ()
> #3 0x08056463 in sge_signaler_main ()
> #4 0x00166a49 in start_thread () from /lib/libpthread.so.0
> #5 0x00259e5e in clone () from /lib/libc.so.6
>
> Let me know if you want/need any more info
>
> -Karl
> On Mon, Aug 27, 2012 at 10:45:19AM -0400, Rayson Ho wrote:
>> Most of our users don't run 32-bit Linux, but we tested it and it
>> worked for us (earlier RHEL versions, but not CentOS 6.3). Can you run
>> the qmaster under gdb so that gdb would show the stack trace?
>>
>> Rayson
>>
>>
>>
>> On Mon, Aug 27, 2012 at 10:35 AM, Karl Vollmer <[email protected]> wrote:
>> > Hello,
>> >
>> > I recently tried installing GE2011.11p1 and after configuring it I get a
>> > seg-fault when trying to start the qmaster. If anyone has any insight I
>> > would appreciate the help.
>> >
>> > OS: Centos 6.3x86_32
>> >
>> > I used/use the following set of commands to install GE2011, you'll note
>> > the .po file mojo at the end, not sure if that's required but it did
>> > remove some errors from the debug log.
>> >
>> > ----
>> > export JAVA_HOME=/usr/lib/jvm/java-openjdk/
>> > ./aimk -only-depend
>> > ./scripts/zerodepend
>> > ./aimk depend
>> > ./aimk -no-java -no-jni -no-secure -spool-classic -no-dump -intl
>> > ./aimk -man
>> > ./scripts/distinst -y -basedir /opt/ge -vdir GE2011.11p1 -all -noexit
>> > ln -s /usr/src/${PKGNAME}/source/scripts/mk_dist \
>> >     /opt/ge/GE2011.11p1/mk_dist
>> > cp -r /usr/src/${PKGNAME}/source/dist/locale /opt/ge/GE2011.11p1/
>> > mkdir -p /opt/ge/GE2011.11p1/locale/en/LC_MESSAGES/linux-x86
>> > cp /opt/ge/GE2011.11p1/locale/en/LC_MESSAGES/gridengine.po \
>> >     /opt/ge/GE2011.11p1/locale/en/LC_MESSAGES/linux-x86
>> > msgfmt /opt/ge/GE2011.11p1/locale/en/LC_MESSAGES/linux-x86/gridengine.po \
>> >     -o /opt/ge/GE2011.11p1/locale/en/LC_MESSAGES/linux-x86/gridengine.mo
>> > ----
>> >
>> > Packages installed to support GE2011
>> >
>> > package { 'java-1.7.0-openjdk': ensure => installed }
>> > package { 'java-1.7.0-openjdk-devel': ensure => installed }
>> > package { 'lib-X11-devel': ensure => installed }
>> > package { 'openssl-static': ensure => installed }
>> > package { 'tcsh': ensure => installed }
>> > package { 'pam-devel': ensure => installed }
>> > package { 'openmotif-devel': ensure => installed }
>> > package { 'openssl': ensure => installed }
>> > package { 'openssl-devel': ensure => installed }
>> > package { 'libXpm-devel': ensure => installed }
>> > package { 'ncurses-devel': ensure => installed }
>> > package { 'ncurses': ensure => installed }
>> > package { 'texinfo': ensure => installed }
>> > package { 'unzip': ensure => installed }
>> > package { 'gettext': ensure => installed }
>> > package { 'gettext-devel': ensure => installed }
>> >
>> > Debug output from sge_qmaster
>> >
>> > starting sge_qmaster
>> > 0 2951 -1216837952 ****** starting localization procedure ... **********
>> > 1 2951 -1216837952 could not get environment variable "GRIDPACKAGE"
>> > 2 2951 -1216837952 could not get environment variable "GRIDLOCALEDIR"
>> > 3 2951 -1216837952 setlocale() returns NULL
>> > 4 2951 -1216837952 locale directory: >/opt/ge/GE2011.1
>> > 5 2951 -1216837952 package file: >linux-x86/gridengine.mo<
>> > 6 2951 -1216837952 language (LANG): >en<
>> > 7 2951 -1216837952 loading message file: /opt/ge/GE2011.11p1/locale/en/LC_MESSAGES/linux-x86/gridengine.mo
>> > 8 2951 -1216837952 found message file - ok
>> > 9 2951 -1216837952 setlocale() returns NULL
>> > 10 2951 -1216837952 bindtextdomain() returns "/opt/ge/GE2011.11p1/locale"
>> > 11 2951 -1216837952 textdomain() returns "linux-x86/gridengine"
>> > 12 2951 -1216837952 error id output : enabled
>> > 13 2951 -1216837952 ****** starting localization procedure ... success **
>> > 14 2951 -1216837952 sge_qmaster is not daemonized
>> > 15 2951 -1216837952 returning port value: 7111
>> > 16 2951 -1216837952 returning port value: 7112
>> > 17 2951 -1216837952 Getting host by name - Linux
>> > 18 2951 -1216837952 1 names in h_addr_list
>> > 19 2951 -1216837952 1 names in h_aliases
>> > 20 2951 -1216837952 Getting host by name - Linux
>> > 21 2951 -1216837952 1 names in h_addr_list
>> > 22 2951 -1216837952 1 names in h_aliases
>> > 23 2951 main Getting host by name - Linux
>> > 24 2951 main 1 names in h_addr_list
>> > 25 2951 main 1 names in h_aliases
>> > 26 2951 main creating QMASTER handle
>> > 27 2951 main act_qmaster file contains local host name
>> > 28 2951 main auid=0; agid=0
>> > 29 2951 main uid=0; gid=0; euid=0; egid=0 auid=0; agid=0
>> > 30 2951 main Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/jobs"
>> > 31 2951 main retval = 0
>> > 32 2951 main Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/zombies"
>> > 33 2951 main retval = 0
>> > 34 2951 main Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/cqueues"
>> > 35 2951 main retval = 0
>> > 36 2951 main Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/qinstances"
>> > 37 2951 main retval = 0
>> > 38 2951 main Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/exec_hosts"
>> > 39 2951 main retval = 0
>> > 40 2951 main Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/submit_hosts"
>> > 41 2951 main retval = 0
>> > 42 2951 main Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/admin_hosts"
>> > 43 2951 main retval = 0
>> > 44 2951 main Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/centry"
>> > 45 2951 main retval = 0
>> > 46 2951 main Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/job_scripts"
>> > 47 2951 main retval = 0
>> > 48 2951 main Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/pe"
>> > 49 2951 main retval = 0
>> > 50 2951 main Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/ckpt"
>> > 51 2951 main retval = 0
>> > 52 2951 main Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/usersets"
>> > 53 2951 main retval = 0
>> > 54 2951 main Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/calendars"
>> > 55 2951 main retval = 0
>> > 56 2951 main Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/hostgroups"
>> > 57 2951 main retval = 0
>> > 58 2951 main Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/users"
>> > 59 2951 main retval = 0
>> > 60 2951 main Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/projects"
>> > 61 2951 main retval = 0
>> > 62 2951 main Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/resource_quotas"
>> > 63 2951 main retval = 0
>> > 64 2951 main Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/advance_reservations"
>> > 65 2951 main retval = 0
>> > 66 2951 main Making dir "/opt/ge/GE2011.11p1/default/common/local_conf"
>> > 67 2951 main retval = 0
>> > 68 2951 main reading CONFIG "global"
>> > 69 2951 main reading CONFIG "veli.local"
>> > 70 2951 main qualified_hostname: 'veli.local'
>> > 71 2951 main Complex Attributes----------------------
>> > 72 2951 main reading COMPLEX_ENTRY "s_vmem"
>> > 73 2951 main reading COMPLEX_ENTRY "s_core"
>> > 74 2951 main reading COMPLEX_ENTRY "virtual_used"
>> > 75 2951 main reading COMPLEX_ENTRY "s_rss"
>> > 76 2951 main reading COMPLEX_ENTRY "swap_total"
>> > 77 2951 main reading COMPLEX_ENTRY "load_long"
>> > 78 2951 main reading COMPLEX_ENTRY "virtual_free"
>> > 79 2951 main reading COMPLEX_ENTRY "np_load_avg"
>> > 80 2951 main reading COMPLEX_ENTRY "calendar"
>> > 81 2951 main reading COMPLEX_ENTRY "h_cpu"
>> > 82 2951 main reading COMPLEX_ENTRY "min_cpu_interval"
>> > 83 2951 main reading COMPLEX_ENTRY "h_rt"
>> > 84 2951 main reading COMPLEX_ENTRY "h_vmem"
>> > 85 2951 main reading COMPLEX_ENTRY "h_data"
>> > 86 2951 main reading COMPLEX_ENTRY "m_socket"
>> > 87 2951 main reading COMPLEX_ENTRY "mem_used"
>> > 88 2951 main reading COMPLEX_ENTRY "s_rt"
>> > 89 2951 main reading COMPLEX_ENTRY "virtual_total"
>> > 90 2951 main reading COMPLEX_ENTRY "swap_free"
>> > 91 2951 main reading COMPLEX_ENTRY "seq_no"
>> > 92 2951 main reading COMPLEX_ENTRY "slots"
>> > 93 2951 main reading COMPLEX_ENTRY "s_fsize"
>> > 94 2951 main reading COMPLEX_ENTRY "h_stack"
>> > 95 2951 main reading COMPLEX_ENTRY "h_fsize"
>> > 96 2951 main reading COMPLEX_ENTRY "load_avg"
>> > 97 2951 main reading COMPLEX_ENTRY "load_short"
>> > 98 2951 main reading COMPLEX_ENTRY "hostname"
>> > 99 2951 main reading COMPLEX_ENTRY "h_rss"
>> > 100 2951 main reading COMPLEX_ENTRY "np_load_short"
>> > 101 2951 main reading COMPLEX_ENTRY "arch"
>> > 102 2951 main reading COMPLEX_ENTRY "num_proc"
>> > 103 2951 main reading COMPLEX_ENTRY "np_load_medium"
>> > 104 2951 main reading COMPLEX_ENTRY "m_topology_inuse"
>> > 105 2951 main reading COMPLEX_ENTRY "h_core"
>> > 106 2951 main reading COMPLEX_ENTRY "tmpdir"
>> > 107 2951 main reading COMPLEX_ENTRY "s_data"
>> > 108 2951 main reading COMPLEX_ENTRY "rerun"
>> > 109 2951 main reading COMPLEX_ENTRY "qname"
>> > 110 2951 main reading COMPLEX_ENTRY "s_cpu"
>> > 111 2951 main reading COMPLEX_ENTRY "mem_total"
>> > 112 2951 main reading COMPLEX_ENTRY "s_stack"
>> > 113 2951 main reading COMPLEX_ENTRY "swap_rsvd"
>> > 114 2951 main reading COMPLEX_ENTRY "m_topology"
>> > 115 2951 main reading COMPLEX_ENTRY "cpu"
>> > 116 2951 main reading COMPLEX_ENTRY "load_medium"
>> > 117 2951 main reading COMPLEX_ENTRY "mem_free"
>> > 118 2951 main reading COMPLEX_ENTRY "swap_rate"
>> > 119 2951 main reading COMPLEX_ENTRY "np_load_long"
>> > 120 2951 main reading COMPLEX_ENTRY "m_core"
>> > 121 2951 main reading COMPLEX_ENTRY "swap_used"
>> > 122 2951 main reading COMPLEX_ENTRY "display_win_gui"
>> > 123 2951 main host_list----------------------------
>> > 124 2951 main reading EXECHOST "global"
>> > 125 2951 main reading EXECHOST "template"
>> > 126 2951 main reading ADMINHOST "veli.local"
>> > 127 2951 main manager_list----------------------------
>> > 128 2951 main root
>> > 129 2951 main host group definitions-----------
>> > 130 2951 main operator_list----------------------------
>> > 131 2951 main userset_list------------------------------
>> > 132 2951 main reading USERSET "deadlineusers"
>> > 133 2951 main reading USERSET "defaultdepartment"
>> > 134 2951 main reading USERSET "arusers"
>> > 135 2951 main calendar list ------------------------------
>> > 136 2951 main resource quota list -----------------------
>> > 137 2951 main cluster_queue_list---------------------------------
>> > 138 2951 main pe_list---------------------------------
>> > 139 2951 main ckpt_list---------------------------------
>> > 140 2951 main advance reservation list -----------------------
>> > 141 2951 main job_list-----------------------------------
>> > 142 2951 main user list-----------------------------------
>> > 143 2951 main project list-----------------------------------
>> > 144 2951 main scheduler config -----------------------------------
>> > 145 2951 main reading SCHEDD_CONF "sched_configuration"
>> > 146 2951 main sconf_validate: no config to validate
>> > 147 2951 main share tree list-----------------------------------
>> > 148 2951 main reading SHARETREE "sharetree"
>> > 149 2951 main received qmaster_params are: none
>> > 150 2951 main event master functionality has been initialized
>> > 151 2951 -1275073680 /etc/init.d/sgemaster.p7111: line 760: 2951 Segmentation fault $bin_dir/sge_qmaster
>> >
>> > Strace (last few lines)
>> >
>> > open("qmaster.pid", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0644) = 4
>> > close(4) = 0
>> > open("qmaster.pid", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 4
>> > fstat64(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
>> > mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7895000
>> > write(4, "3273\n", 5) = 5
>> > close(4) = 0
>> > munmap(0xb7895000, 4096) = 0
>> > mmap2(NULL, 10489856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0xb40ff000
>> > mprotect(0xb40ff000, 4096, PROT_NONE) = 0
>> > clone(child_stack=0xb4aff494, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SET
>> > write(2, " 150 3273 main ", 27 150 3273 main ) = 27
>> > write(2, 0xbfc7f170, 52 <unfinished ...>
>> > +++ killed by SIGSEGV +++
>> > Segmentation fault
>> >
>> > Long enough? :D
>> >
>> > -Karl Vollmer
>> >
>> > _______________________________________________
>> > users mailing list
>> > [email protected]
>> > https://gridengine.org/mailman/listinfo/users
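
A minimal sketch of the rebuild suggested in this thread, assuming the same aimk flags as in the build recipe quoted above with only -intl dropped. The helper name strip_intl and the variables orig_flags and new_flags are made up for illustration. The script only prints the resulting command line (it does not run aimk), and nothing in the thread confirms yet that this avoids the crash.

```shell
#!/bin/sh
# Flags copied from the aimk compile step quoted in the original report.
orig_flags="-no-java -no-jni -no-secure -spool-classic -no-dump -intl"

# strip_intl (hypothetical helper): print its arguments on one line,
# leaving out any "-intl" flag.
strip_intl() {
    kept=""
    for flag in "$@"; do
        if [ "$flag" != "-intl" ]; then
            kept="${kept:+$kept }$flag"
        fi
    done
    printf '%s\n' "$kept"
}

# $orig_flags is left unquoted on purpose so the shell splits it into words.
new_flags=$(strip_intl $orig_flags)

# Dry run: show the command that would be used for the rebuild.
echo "./aimk $new_flags"
```

Running the printed command in the source tree, followed by the same distinst step as before, would give binaries without the gettext code path that appears in the backtrace (sge_gettext_).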
