On 29.01.2014 at 09:20, Pierre Lindenbaum wrote:
> Hi,
> I'm running a big workflow using qmake.
>
> Last night the workflow was 'frozen'. I pressed Ctrl-C to re-run the
> analysis.
>
> Now qmake doesn't work any more: there is no message on stdout/stderr, and
> the exit code seems to be '139'.
> qmake still works with some other 'Makefiles'.
And it was working before? Which `make` version are you using interactively?
AFAICS `qmake` is based on `make` 3.78.1 and there was already an issue with
this older version on this list.
-- Reuti
> $ qmake -cwd -v PATH -l arch=lx24-amd64 -- -j 50 -n all ; echo $status
> 139
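A side note on the exit code: shells conventionally report 128 plus the signal number when a process is killed by a signal, so 139 would correspond to signal 11 (SIGSEGV), which matches the core dump and the qmaster message further down. A minimal sketch of that arithmetic follows; the helper name `decode_exit_status` is only illustrative, and a program could in principle also call exit(139) itself, so treat the result as a hint rather than proof.

```python
import signal


def decode_exit_status(status: int) -> str:
    """Describe a shell-reported exit status (0-255).

    Assumes the usual shell convention: a status above 128 means the
    process was terminated by signal number (status - 128).
    """
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited normally with code {status}"


print(decode_exit_status(139))
```

On a Linux host this should report signal 11 (SIGSEGV) for the status shown above.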
>
> If I run the classic 'make', it prints the commands to be run:
>
> $ make -n all
>
> mkdir -p ../align20140124/Samples/Sample1/VCF/ALL/ && \
> gunzip -c ../align20140124/Samples//Sample1/VCF/ALL//Sample1.vcf.gz | \
> awk -F ' ' '/^#/ {print;next;} {OFS=" ";gsub(/,/,".",$6);
> if($6!="." && $6<0) $6=0; print;}' |\
> (...)
>
>
> Some core dumps have been generated:
>
> $gdb --core=core.55770
> GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6_4.1)
> Copyright (C) 2010 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>.
> [New Thread 55770]
> Core was generated by `qmake -inherit -cwd -v PATH -l arch=lx24-amd64 -- -j
> 50 -n all'.
> Program terminated with signal 11, Segmentation fault.
> #0 0x0000000000412fb1 in ?? ()
> (gdb) bt
> #0 0x0000000000412fb1 in ?? ()
> #1 0x000000000238f150 in ?? ()
> #2 0x0000000000000000 in ?? ()
>
>
> Here is some other information, but I don't know whether it is related:
>
> $ dmesg
>
> Out of memory: Kill process 21010 (hapHunt) score 975 or sacrifice child
> Killed process 21010, UID 502, (hapHunt) total-vm:104669264kB,
> anon-rss:96930100kB, file-rss:4kB
> [drm:output_poll_execute] *ERROR* delayed enqueue failed -125
> [drm:output_poll_execute] *ERROR* delayed enqueue failed -125
> [drm:output_poll_execute] *ERROR* delayed enqueue failed -125
> [drm:output_poll_execute] *ERROR* delayed enqueue failed -125
> samtools[28071]: segfault at 30 ip 000000000043f8d8 sp 00007fff1485f550 error
> 4 in samtools[400000+60000]
> udev: starting version 147
> nfsd: last server has exited, flushing export cache
> NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
> NFSD: starting 90-second grace period
> circo[26747]: segfault at 4 ip 0000003c2b60f6f1 sp 00007fffdebb83f0 error 4
> in libcairo.so.2.10800.8[3c2b600000+76000]
> dot[4620]: segfault at 4 ip 0000003c2b60f6f1 sp 00007fff701d59c0 error 4 in
> libcairo.so.2.10800.8[3c2b600000+76000]
> [drm:output_poll_execute] *ERROR* delayed enqueue failed -125
> [drm:output_poll_execute] *ERROR* delayed enqueue failed -125
> samtools[30216] trap divide error ip:40630d sp:7fffd0178dd0 error:0 in
> samtools[400000+60000]
>
>
> In /path/to/sge_root/name/spool/qmaster/messages:
>
> 01/29/2014 09:48:31|worker|master|W|job 1171361.1 failed on host node04
> assumedly after job because: job 1171361.1 died through signal SEGV (11)
>
>
> Any idea how to fix this?
>
> Thank you,
>
> Pierre
>
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users