Most of our users don't run 32-bit Linux, but we did test it and it
worked for us (on earlier RHEL versions, not on CentOS 6.3). Can you
run the qmaster under gdb so that we can see the stack trace at the
point of the crash?
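
Something along these lines should work as a rough sketch. The paths
are assumptions taken from your install commands (SGE_ROOT under
/opt/ge/GE2011.11p1, cell "default", binaries under bin/linux-x86), the
helper name qmaster_backtrace is made up for illustration, and SGE_ND=1
is, as far as I recall, the switch that keeps the daemon in the
foreground. Adjust as needed:

```shell
#!/bin/sh
# Assumed locations, based on the install commands in the report.
SGE_ROOT="${SGE_ROOT:-/opt/ge/GE2011.11p1}"
SGE_CELL="${SGE_CELL:-default}"
QMASTER="${QMASTER:-$SGE_ROOT/bin/linux-x86/sge_qmaster}"

# Hypothetical helper: run sge_qmaster under gdb in batch mode and
# print a backtrace of every thread once the process stops.
qmaster_backtrace() {
    if [ ! -x "$QMASTER" ]; then
        echo "sge_qmaster not found at $QMASTER" >&2
        return 1
    fi
    export SGE_ROOT SGE_CELL
    # SGE_ND=1 (assumption): do not daemonize, so gdb keeps the process.
    SGE_ND=1 gdb -batch -ex run -ex "thread apply all bt" --args "$QMASTER"
}
```

Call qmaster_backtrace as root after sourcing the snippet and send the
output to the list. The backtrace of all threads matters here because
the crash happens right after a new thread is created.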

Rayson



On Mon, Aug 27, 2012 at 10:35 AM, Karl Vollmer <[email protected]> wrote:
> Hello,
>
>   I recently tried installing GE2011.11p1 and after configuring it I get a
>   seg-fault when trying to start the qmaster. If anyone has any insight I
>   would appreciate the help.
>
>   OS: CentOS 6.3 x86_32
>
>   I used the following set of commands to install GE2011. Note the .po
>   file mojo at the end; I'm not sure it's required, but it did remove some
>   errors from the debug log.
>
> ----
> export JAVA_HOME=/usr/lib/jvm/java-openjdk/
> ./aimk -only-depend
> ./scripts/zerodepend
> ./aimk depend
> ./aimk -no-java -no-jni -no-secure -spool-classic -no-dump -intl
> ./aimk -man
> ./scripts/distinst -y -basedir /opt/ge -vdir GE2011.11p1  -all -noexit
> ln -s /usr/src/${PKGNAME}/source/scripts/mk_dist /opt/ge/GE2011.11p1/mk_dist
> cp -r /usr/src/${PKGNAME}/source/dist/locale /opt/ge/GE2011.11p1/
> mkdir -p /opt/ge/GE2011.11p1/locale/en/LC_MESSAGES/linux-x86
> cp /opt/ge/GE2011.11p1/locale/en/LC_MESSAGES/gridengine.po /opt/ge/GE2011.11p1/locale/en/LC_MESSAGES/linux-x86
> msgfmt /opt/ge/GE2011.11p1/locale/en/LC_MESSAGES/linux-x86/gridengine.po -o /opt/ge/GE2011.11p1/locale/en/LC_MESSAGES/linux-x86/gridengine.mo
> ----
>
>   Packages installed to support GE2011
>
>   package { 'java-1.7.0-openjdk': ensure        => installed }
>   package { 'java-1.7.0-openjdk-devel': ensure  => installed }
>   package { 'lib-X11-devel': ensure             => installed }
>   package { 'openssl-static': ensure            => installed }
>   package { 'tcsh': ensure            => installed }
>   package { 'pam-devel': ensure       => installed }
>   package { 'openmotif-devel': ensure => installed }
>   package { 'openssl': ensure         => installed }
>   package { 'openssl-devel': ensure   => installed }
>   package { 'libXpm-devel': ensure    => installed }
>   package { 'ncurses-devel': ensure   => installed }
>   package { 'ncurses': ensure         => installed }
>   package { 'texinfo': ensure         => installed }
>   package { 'unzip': ensure           => installed }
>   package { 'gettext': ensure         => installed }
>   package { 'gettext-devel': ensure   => installed }
>
>   Debug output from sge_qmaster
>
>   starting sge_qmaster
>      0   2951 -1216837952     ****** starting localization procedure ... **********
>      1   2951 -1216837952     could not get environment variable "GRIDPACKAGE"
>      2   2951 -1216837952     could not get environment variable "GRIDLOCALEDIR"
>      3   2951 -1216837952     setlocale() returns NULL
>      4   2951 -1216837952     locale directory: >/opt/ge/GE2011.1
>      5   2951 -1216837952     package file:     >linux-x86/gridengine.mo<
>      6   2951 -1216837952     language (LANG):  >en<
>      7   2951 -1216837952     loading message file: /opt/ge/GE2011.11p1/locale/en/LC_MESSAGES/linux-x86/gridengine.mo
>      8   2951 -1216837952     found message file - ok
>      9   2951 -1216837952     setlocale() returns NULL
>     10   2951 -1216837952     bindtextdomain() returns "/opt/ge/GE2011.11p1/locale"
>     11   2951 -1216837952     textdomain() returns "linux-x86/gridengine"
>     12   2951 -1216837952     error id output     : enabled
>     13   2951 -1216837952     ****** starting localization procedure ... success **
>     14   2951 -1216837952     sge_qmaster is not daemonized
>     15   2951 -1216837952     returning port value: 7111
>     16   2951 -1216837952     returning port value: 7112
>     17   2951 -1216837952     Getting host by name - Linux
>     18   2951 -1216837952     1 names in h_addr_list
>     19   2951 -1216837952     1 names in h_aliases
>     20   2951 -1216837952     Getting host by name - Linux
>     21   2951 -1216837952     1 names in h_addr_list
>     22   2951 -1216837952     1 names in h_aliases
>     23   2951         main     Getting host by name - Linux
>     24   2951         main     1 names in h_addr_list
>     25   2951         main     1 names in h_aliases
>     26   2951         main     creating QMASTER handle
>     27   2951         main     act_qmaster file contains local host name
>     28   2951         main     auid=0; agid=0
>     29   2951         main     uid=0; gid=0; euid=0; egid=0 auid=0; agid=0
>     30   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/jobs"
>     31   2951         main     retval = 0
>     32   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/zombies"
>     33   2951         main     retval = 0
>     34   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/cqueues"
>     35   2951         main     retval = 0
>     36   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/qinstances"
>     37   2951         main     retval = 0
>     38   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/exec_hosts"
>     39   2951         main     retval = 0
>     40   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/submit_hosts"
>     41   2951         main     retval = 0
>     42   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/admin_hosts"
>     43   2951         main     retval = 0
>     44   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/centry"
>     45   2951         main     retval = 0
>     46   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/job_scripts"
>     47   2951         main     retval = 0
>     48   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/pe"
>     49   2951         main     retval = 0
>     50   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/ckpt"
>     51   2951         main     retval = 0
>     52   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/usersets"
>     53   2951         main     retval = 0
>     54   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/calendars"
>     55   2951         main     retval = 0
>     56   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/hostgroups"
>     57   2951         main     retval = 0
>     58   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/users"
>     59   2951         main     retval = 0
>     60   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/projects"
>     61   2951         main     retval = 0
>     62   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/resource_quotas"
>     63   2951         main     retval = 0
>     64   2951         main     Making dir "/opt/ge/GE2011.11p1/default/spool/qmaster/advance_reservations"
>     65   2951         main     retval = 0
>     66   2951         main     Making dir "/opt/ge/GE2011.11p1/default/common/local_conf"
>     67   2951         main     retval = 0
>     68   2951         main     reading CONFIG "global"
>     69   2951         main     reading CONFIG "veli.local"
>     70   2951         main     qualified_hostname: 'veli.local'
>     71   2951         main     Complex Attributes----------------------
>     72   2951         main     reading COMPLEX_ENTRY "s_vmem"
>     73   2951         main     reading COMPLEX_ENTRY "s_core"
>     74   2951         main     reading COMPLEX_ENTRY "virtual_used"
>     75   2951         main     reading COMPLEX_ENTRY "s_rss"
>     76   2951         main     reading COMPLEX_ENTRY "swap_total"
>     77   2951         main     reading COMPLEX_ENTRY "load_long"
>     78   2951         main     reading COMPLEX_ENTRY "virtual_free"
>     79   2951         main     reading COMPLEX_ENTRY "np_load_avg"
>     80   2951         main     reading COMPLEX_ENTRY "calendar"
>     81   2951         main     reading COMPLEX_ENTRY "h_cpu"
>     82   2951         main     reading COMPLEX_ENTRY "min_cpu_interval"
>     83   2951         main     reading COMPLEX_ENTRY "h_rt"
>     84   2951         main     reading COMPLEX_ENTRY "h_vmem"
>     85   2951         main     reading COMPLEX_ENTRY "h_data"
>     86   2951         main     reading COMPLEX_ENTRY "m_socket"
>     87   2951         main     reading COMPLEX_ENTRY "mem_used"
>     88   2951         main     reading COMPLEX_ENTRY "s_rt"
>     89   2951         main     reading COMPLEX_ENTRY "virtual_total"
>     90   2951         main     reading COMPLEX_ENTRY "swap_free"
>     91   2951         main     reading COMPLEX_ENTRY "seq_no"
>     92   2951         main     reading COMPLEX_ENTRY "slots"
>     93   2951         main     reading COMPLEX_ENTRY "s_fsize"
>     94   2951         main     reading COMPLEX_ENTRY "h_stack"
>     95   2951         main     reading COMPLEX_ENTRY "h_fsize"
>     96   2951         main     reading COMPLEX_ENTRY "load_avg"
>     97   2951         main     reading COMPLEX_ENTRY "load_short"
>     98   2951         main     reading COMPLEX_ENTRY "hostname"
>     99   2951         main     reading COMPLEX_ENTRY "h_rss"
>    100   2951         main     reading COMPLEX_ENTRY "np_load_short"
>    101   2951         main     reading COMPLEX_ENTRY "arch"
>    102   2951         main     reading COMPLEX_ENTRY "num_proc"
>    103   2951         main     reading COMPLEX_ENTRY "np_load_medium"
>    104   2951         main     reading COMPLEX_ENTRY "m_topology_inuse"
>    105   2951         main     reading COMPLEX_ENTRY "h_core"
>    106   2951         main     reading COMPLEX_ENTRY "tmpdir"
>    107   2951         main     reading COMPLEX_ENTRY "s_data"
>    108   2951         main     reading COMPLEX_ENTRY "rerun"
>    109   2951         main     reading COMPLEX_ENTRY "qname"
>    110   2951         main     reading COMPLEX_ENTRY "s_cpu"
>    111   2951         main     reading COMPLEX_ENTRY "mem_total"
>    112   2951         main     reading COMPLEX_ENTRY "s_stack"
>    113   2951         main     reading COMPLEX_ENTRY "swap_rsvd"
>    114   2951         main     reading COMPLEX_ENTRY "m_topology"
>    115   2951         main     reading COMPLEX_ENTRY "cpu"
>    116   2951         main     reading COMPLEX_ENTRY "load_medium"
>    117   2951         main     reading COMPLEX_ENTRY "mem_free"
>    118   2951         main     reading COMPLEX_ENTRY "swap_rate"
>    119   2951         main     reading COMPLEX_ENTRY "np_load_long"
>    120   2951         main     reading COMPLEX_ENTRY "m_core"
>    121   2951         main     reading COMPLEX_ENTRY "swap_used"
>    122   2951         main     reading COMPLEX_ENTRY "display_win_gui"
>    123   2951         main     host_list----------------------------
>    124   2951         main     reading EXECHOST "global"
>    125   2951         main     reading EXECHOST "template"
>    126   2951         main     reading ADMINHOST "veli.local"
>    127   2951         main     manager_list----------------------------
>    128   2951         main     root
>    129   2951         main     host group definitions-----------
>    130   2951         main     operator_list----------------------------
>    131   2951         main     userset_list------------------------------
>    132   2951         main     reading USERSET "deadlineusers"
>    133   2951         main     reading USERSET "defaultdepartment"
>    134   2951         main     reading USERSET "arusers"
>    135   2951         main     calendar list ------------------------------
>    136   2951         main     resource quota list -----------------------
>    137   2951         main     cluster_queue_list---------------------------------
>    138   2951         main     pe_list---------------------------------
>    139   2951         main     ckpt_list---------------------------------
>    140   2951         main     advance reservation list -----------------------
>    141   2951         main     job_list-----------------------------------
>    142   2951         main     user list-----------------------------------
>    143   2951         main     project list-----------------------------------
>    144   2951         main     scheduler config -----------------------------------
>    145   2951         main     reading SCHEDD_CONF "sched_configuration"
>    146   2951         main     sconf_validate: no config to validate
>    147   2951         main     share tree list-----------------------------------
>    148   2951         main     reading SHARETREE "sharetree"
>    149   2951         main     received qmaster_params are: none
>    150   2951         main     event master functionality has been initialized
>    151   2951 -1275073680 /etc/init.d/sgemaster.p7111: line 760:  2951 Segmentation fault      $bin_dir/sge_qmaster
>
>    Strace (last few lines)
>
> open("qmaster.pid", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0644) = 4
> close(4)                                = 0
> open("qmaster.pid", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 4
> fstat64(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7895000
> write(4, "3273\n", 5)                   = 5
> close(4)                                = 0
> munmap(0xb7895000, 4096)                = 0
> mmap2(NULL, 10489856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0xb40ff000
> mprotect(0xb40ff000, 4096, PROT_NONE)   = 0
> clone(child_stack=0xb4aff494, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SET
> write(2, "   150   3273         main ", 27   150   3273         main ) = 27
> write(2, 0xbfc7f170, 52 <unfinished ...>
> +++ killed by SIGSEGV +++
> Segmentation fault
>
>     Long enough? :D
>
> -Karl Vollmer
>
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users