bug#25386: Manual gc helps
I did a fairly thorough review of the thread-creation and thread-join code in the git master branch, and it looks to be just fine. Thus, some experimentation is in order. Going back to guile-2.0, I see this behavior:

    guile -v
    guile (GNU Guile) 2.0.11
    Packaged by Debian (2.0.11-deb+1-10)

If I add a manual gc to the exit of the thread, like so:

    (define (mkthr v)
       (call-with-new-thread
          (lambda ()
             (set! junk (+ junk 1))
             (gc))))

then the heap blows up, in minutes, to about 180MB but then stops growing, even after hours and millions of thread creates:

    (heap-size . 183734272) (gc-times . 1957954)

If I gc only every third thread create, it quickly blows up to about 400MB, and then stabilizes, for hours:

    (heap-size . 428638208) (gc-times . 1292663)

If I gc every 17th thread, it blows up to about 1.8GB and then is stable:

    (heap-size . 1875902464) (gc-times . 327462)

This last one is after about 5.5 million thread creates and joins. The counting is done like so:

    (define (mkthr v)
       (call-with-new-thread
          (lambda ()
             (lock-mutex mtx)
             (if (eq? 0 (modulo junk 17)) (gc))
             (set! junk (+ junk 1))
             (unlock-mutex mtx))))

In each case, it seems to hit a plateau at about (n+1)*100MB when gc is done on one out of every n threads.

This seems quite bizarre to me: why does the plateau depend inversely on the number of gcs per thread create? What's magic about 100MB? Clearly 100MB is way too large for this very simple program. I mean, even if I gc at *every* thread-exit ...

(I have not yet explored the above in guile-2.2.)

Since I cannot find any 'obvious' bugs in guile, this suggests some strange stochastic behavior in bdw-gc?
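As a quick check of the (n+1)*100MB rule of thumb against the three plateaus reported above, here is a minimal sketch. The names `observed`, `predicted-bytes`, and `ratio` are made up for illustration, and "MB" is assumed to mean 10^6 bytes.

```scheme
;; Observed plateaus from the runs above, as (n . heap-size-in-bytes),
;; where a gc is triggered on one out of every n thread exits.
(define observed
  '((1 . 183734272)
    (3 . 428638208)
    (17 . 1875902464)))

;; Rule of thumb from the message: plateau ~ (n+1) * 100MB.
;; Assumption: 1 MB = 10^6 bytes.
(define (predicted-bytes n)
  (* (+ n 1) 100 1000000))

;; Ratio of the observed plateau to the predicted one.
(define (ratio entry)
  (exact->inexact (/ (cdr entry) (predicted-bytes (car entry)))))

;; Print one comparison line per run.
(for-each
 (lambda (entry)
   (display "n=") (display (car entry))
   (display " predicted=") (display (predicted-bytes (car entry)))
   (display " observed=") (display (cdr entry))
   (display " ratio=") (display (ratio entry))
   (newline))
 observed)
```

All three ratios come out within roughly 10% of 1, so the rule fits the reported numbers, but three data points are not enough to say whether the relation is really linear in n.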
bug#25387: also crashes in guile-2.0
Also crashes in guile-2.0, but takes much longer: 5 minutes.

--linas
bug#25387: guile-2.2 multi-thread segfault in SCM_VALIDATE_WEAK_TABLE
The following program crashes immediately (within a fraction of a second) in guile-2.2, current git version (as of 29 Dec 2016, a0656ad4cf976b3845e9b9663a90b46b4cf9fc5a). It runs fine in guile-2.0.

It's doing something slightly squonky: referencing the variable 'cnt' in a thread. Note that the use of 'cnt' comes before the definition of the variable.

It's deterministic: it always crashes in the same place.

    (define junk 0)
    (define halt #f)

    (define (wtf-thr)
       (define start (- (current-time) 0.1))

       ; Create thread that does junk and exits. Yes, the increment
       ; of `junk` is not protected, and it's racy, but so what.
       (define (mkthr v)
          (call-with-new-thread
             (lambda ()
                (if (eq? 0 (modulo cnt 30)) (gc))   ; <-- crashes here!!!
                (set! junk (+ junk 1)))))

       ; thread arguments
       (define thrarg (make-list 10 0))

       (define cnt 0)
       (define (mke)
          ; Create a limited number of threads
          (define thr-list (map mkthr thrarg))
          ; (display (length (all-threads)))
          (map join-thread thr-list)

          ; Some handy debug printing.
          (set! cnt (+ cnt 1))
          (if (eq? 0 (modulo cnt 500))
             (begin
                (display "rate=")
                (display (/ cnt (- (current-time) start))) (newline)
                (display "cnt=") (display cnt) (newline)
                (display (gc-stats)) (newline)
                (newline))))

       ; tail recursive infinite loop.
       (define (aloop) (mke) (if (not halt) (aloop)))

       ; while forever.
       (aloop)
    )

    ; Run elsewhere, so that we have a shell prompt
    ; (not required for the bug)
    (call-with-new-thread wtf-thr)

    ; halt if desired.
    ; (set! halt #t)

The crash, as seen in gdb:

    Thread 621 "guile" received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 0x7fffedbe1700 (LWP 10504)]
    0x77b78af1 in scm_c_weak_table_ref (table=0x0, raw_hash=2738445758486295669,
        pred=0x77b77bb0 , closure=0x558fff00, dflt=0x904)
        at ../../libguile/weak-table.c:862
    warning: Source file is more recent than executable.
    862       SCM_VALIDATE_WEAK_TABLE (1, table);
    (gdb) bt
    #0  0x77b78af1 in scm_c_weak_table_ref (table=0x0, raw_hash=2738445758486295669,
        pred=0x77b77bb0 , closure=0x558fff00, dflt=0x904)
        at ../../libguile/weak-table.c:862
    #1  0x77b02fa4 in fluid_ref (dynamic_state=0x55f8ce60, fluid=0x558fff00)
        at ../../libguile/fluids.c:287
    #2  0x77b0325f in scm_fluid_ref (fluid=0x558fff00)
        at ../../libguile/fluids.c:308
    #3  0x77b34424 in scm_i_default_port_conversion_strategy ()
        at ../../libguile/ports.c:1015
    #4  0x77b5e4df in scm_i_default_string_failed_conversion_handler ()
        at ../../libguile/strings.c:1619
    #5  scm_from_locale_stringn (
        str=0x77b88d50 "Wrong type argument in position ~A: ~S",
        len=len@entry=18446744073709551615) at ../../libguile/strings.c:1626
    #6  0x77b5e51c in scm_from_locale_string (str=)
        at ../../libguile/strings.c:1613
    #7  0x77af76c6 in scm_error (key=0x558fa960,
        subr=subr@entry=0x77b8a080 "set-current-dynamic-state", message=,
        args=0x55c6ce30, rest=rest@entry=0x55c6ce50)
        at ../../libguile/error.c:59
    #8  0x77af7968 in scm_wrong_type_arg (
        subr=subr@entry=0x77b8a080 "set-current-dynamic-state", pos=pos@entry=1,
        bad_value=bad_value@entry=0x55c6c3b0) at ../../libguile/error.c:251
    #9  0x77b03096 in scm_set_current_dynamic_state (
        state=state@entry=0x55c6c3b0) at ../../libguile/fluids.c:496
    #10 0x77b6351a in guilify_self_2 (
        dynamic_state=dynamic_state@entry=0x55c6c3b0)
        at ../../libguile/threads.c:466
    #11 0x77b63e0c in scm_i_init_thread_for_guile (base=0x7fffedbe0ec0,
        dynamic_state=0x55c6c3b0) at ../../libguile/threads.c:595
    #12 0x77b63e59 in with_guile (base=base@entry=0x7fffedbe0ec0,
        data=data@entry=0x7fffedbe0ef0) at ../../libguile/threads.c:638
    #13 0x76c71812 in GC_call_with_stack_base (
        fn=fn@entry=0x77b63e40 , arg=arg@entry=0x7fffedbe0ef0) at misc.c:1925
    #14 0x77b635cc in scm_i_with_guile (dynamic_state=,
        data=0x55c6c410, func=0x77b635e0 ) at ../../libguile/threads.c:688
    #15 launch_thread (d=0x55c6c410) at ../../libguile/threads.c:750
    #16 0x7735f464 in start_thread (arg=0x7fffedbe1700) at pthread_create.c:333
    #17 0x770a29df in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:105
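On the use-before-definition point in the program above: inside a procedure body, a lambda may mention an internal definition that appears later, as long as the lambda is only called after that definition has been initialized. The sketch below shows this in isolation. The `demo` and `get` names are hypothetical and not part of the test program.

```scheme
;; Hypothetical demo, not taken from the original report.
(define (demo)
  ;; `get` mentions cnt before cnt is defined ...
  (define (get) cnt)
  (define cnt 42)
  ;; ... but it is only called here, after cnt has been initialized.
  (get))

(display (demo))
(newline)
```

This prints 42. If that reading is right, the ordering of `mkthr` and `cnt` is probably not the trigger by itself. The backtrace shows the wrong-type error being raised from `scm_set_current_dynamic_state` during thread initialization, before the thread body runs.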
bug#25386: guile-2.0 and 2.2 thread leakage+crash; very simple test attached.
The (very simple) program below leaks ... something, very rapidly, and then crashes after about 15-30 seconds. Last thing printed before crash:

    rate=194.80519560944032
    num threads=2
    ((gc-time-taken . 2791348254) (heap-size . 7532883968)
     (heap-free-size . 2449408) (heap-total-allocated . 23912882640)
     (heap-allocated-since-gc . 1073995264) (protected-objects . 90)
     (gc-times . 87))

    Too many heap sections: Increase MAXHINCR or MAX_HEAP_SECTS
    Aborted

There is a similar issue in guile-2.2, except it takes longer (8 minutes) and crashes in gc somewhere.

I assume that some sort of continuation is left lying about, even though the thread has exited.

    (define junk 0)
    (define halt #f)

    (define (wtf-thr)
       (define start (- (current-time) 0.1))

       ; Create thread that does junk and exits. Yes, the increment
       ; of `junk` is not protected, and it's racy, but so what.
       (define (mkthr v)
          (call-with-new-thread
             (lambda ()
                (set! junk (+ junk 1)))))

       ; thread arguments
       (define thrarg (make-list 10 0))

       (define cnt 0)
       (define (mke)
          ; Create a limited number of threads
          (define thr-list (map mkthr thrarg))
          ; (display (length (all-threads)))
          (map join-thread thr-list)

          ; Some handy debug printing.
          (set! cnt (+ cnt 1))
          (if (eq? 0 (modulo cnt 500))
             (begin
                (display "rate=")
                (display (/ cnt (- (current-time) start))) (newline)
                (display "num threads=")
                (display (length (all-threads))) (newline)
                (display (gc-stats)) (newline)
                (newline))))

       ; tail recursive infinite loop.
       (define (aloop) (mke) (if (not halt) (aloop)))

       ; while forever.
       (aloop)
    )

    ; Run elsewhere, so that we have a shell prompt
    ; (not required for the bug)
    (call-with-new-thread wtf-thr)

    ; halt if desired.
    ; (set! halt #t)
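To watch only the heap growth instead of the full `gc-stats` alist, a small helper along these lines could be called from the debug-printing branch of `mke`. This is a sketch: `heap-size-mib` and `report-heap` are hypothetical names, and it assumes `gc-stats` returns an alist with a `heap-size` key, as in the output shown above.

```scheme
;; Hypothetical helper, not part of the original test program:
;; current heap size in MiB, read from the alist returned by gc-stats.
(define (heap-size-mib)
  (/ (assq-ref (gc-stats) 'heap-size) (* 1024.0 1024.0)))

;; Print one compact line per call, e.g. once every 500 iterations.
(define (report-heap cnt)
  (display "cnt=") (display cnt)
  (display " heap-MiB=") (display (heap-size-mib))
  (newline))

(report-heap 0)
```

One line per report makes it easier to see whether the heap grows steadily or reaches a plateau.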
bug#25384: Compiler mis-identifies source location of erroneous parenthesis pairs
In guile 2.0.13, both the compiler and the interpreter fail to identify the source location of errors of the following form. Instead, they report the errors as occurring in boot-9.scm.

    (unknown-func unknown-symbol () #t)

GUILD COMPILE ERROR

    ;;; WARNING: compilation of /home/mike/projects/bug1.scm failed:
    ;;; ERROR: Syntax error:
    ;;; unknown location: unexpected syntax in form ()
    ice-9/boot-9.scm:703:29: In procedure map:
    ice-9/boot-9.scm:703:29: Syntax error:
    unknown location: unexpected syntax in form ()

INTERPRETER ERROR

    ice-9/boot-9.scm:703:29: In procedure map:
    ice-9/boot-9.scm:703:29: Syntax error:
    unknown location: unexpected syntax in form ()

    Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.
    scheme@(guile-user) [1]> ,bt
               4 (primitive-load "/home/mike/projects/bug1.scm")
    In ice-9/eval.scm:
       505:12  3 (# #)
    In ice-9/psyntax.scm:
      1116:54  2 (expand-top-sequence ((unknownfunc unknownsymbol ...)) ...)
      1346:11  1 (# unknownfunc (# # #t))
    In ice-9/boot-9.scm:
       703:29  0 (map # #)

Thanks,
Mike