Hi all, As Kooda found out when working on #1379, there's still some problems with the symbol GC stuff in CHICKEN 5. Sometimes you'll see an assertion failure in update_symbol_tables: assert(!C_persistable_symbol(sym)) will bail out due to some symbols being released while they still had a global value or plist.
Attached is a patch to fix this. First, I added the #ifdef DEBUGBUILD block in update_symbol_tables() to more aggressively check for inconsistencies. Then, one by one, I fixed the problems that were detected by this sanity check. The main problem is that originally, I thought symbols were statically allocated by C_h_intern, but it's only the *string* that's statically allocated. Making the entire symbol statically allocated is AFAICT not possible (because the GC will skip its value and plist), so instead I changed add_symbol() to check for this and allocate a weak or regular pair from the start, instead of attempting to detect statically allocated symbols during garbage collection. This should make GCs ever so slightly faster, too :) C_i_unpersist() now also needs to check for static strings, to avoid unpersisting those symbols prematurely. I also noticed a sort of race condition in eval.scm where it could persist the symbol before its value would be filled in, tripping up the sanity check. I worked around that by first performing the calculation and then persisting. I also removed ##sys#persist-symbol in favor of using a ##core#inline in eval.scm, which should also make things slightly faster and not susceptible to interrupts. Finally, there's a weird situation where some symbols will be allocated statically, but someone else might have interned the same symbol before. This means the existing symbol might be dynamically allocated, and it could be GCed, which means lf[...] will refer to dead memory (the GC can't update lf[...] entries). To fix this, I decided to go ahead and allocate the string statically, and replace the existing symbol's dynamic string with the newly allocated one, and persisting the symbol to ensure it won't get GCed. I'm reasonably sure that this weird situation can also occur in master. I don't think it can be fixed as easily there, though, due to how the weak table stuff works: if nothing currently refers to the symbol, it will always be collected, even though the C code can refer to lf[...] items later. Cheers, Peter
From 2940a1f6a74141ddb5bc7e9517778f95b5dd8b09 Mon Sep 17 00:00:00 2001 From: Peter Bex <pe...@more-magic.net> Date: Fri, 16 Jun 2017 19:54:42 +0200 Subject: [PATCH] Fix some edge cases with symbol GC Symbols are never statically allocated, their name strings are, so the permanentp() check in update_symbol_tables was bogus. It might trigger collection of a symbol even though it would be statically allocated. This could potentially cause problems when generated C code accessed a symbol global through lf[...], for example, because its symbol would have been collected. Instead, we now ensure persistence in add_symbol, based on whether the symbol's string name is in non-GCable memory, and C_i_unpersist_symbol will now also check the string for being GCable before unpersisting. This commit also adds a very paranoid check to update_symbol_tables which detects such edge cases. This check found more edge cases: - When a symbol is immediately given a value in C_intern3, it should also be persisted. - If a static symbol is to be generated by C_h_intern, but it has already been created in the heap (a practical example is the pending-finalizers symbol), there is no guarantee that it will stick around after GC. So, if a symbol is found already in the symbol table, and its string name isn't statically allocated, we replace it by a newly allocated static string, to ensure that the code which created it won't allow the symbol to get GCed. - The code in eval.scm had a problem if a GC happened in between persisting the symbol and actually assigning the value. This is not a problem in practice, but persisting it after calculating the value is cleaner. We also use the inline operator for performance and to avoid a GC in between persisting and assigning. --- eval.scm | 5 +++-- library.scm | 1 - runtime.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++++-------- 3 files changed, 61 insertions(+), 11 deletions(-) diff --git a/eval.scm b/eval.scm index 8167774..0eb395e 100644 --- a/eval.scm +++ b/eval.scm @@ -267,8 +267,9 @@ (lambda (v) (##sys#error 'eval "environment is not mutable" evalenv var)) ;XXX var? (lambda (v) - (##sys#persist-symbol var) - (##sys#setslot var 0 (##core#app val v)))))) + (let ((result (##core#app val v))) + (##core#inline "C_i_persist_symbol" var) + (##sys#setslot var 0 result)))))) ((zero? i) (lambda (v) (##sys#setslot (##sys#slot v 0) j (##core#app val v)))) (else (lambda (v) diff --git a/library.scm b/library.scm index 9da4ef9..7f0d60a 100644 --- a/library.scm +++ b/library.scm @@ -275,7 +275,6 @@ EOF (define ##sys#gc (##core#primitive "C_gc")) (define (##sys#setslot x i y) (##core#inline "C_i_setslot" x i y)) (define (##sys#setislot x i y) (##core#inline "C_i_set_i_slot" x i y)) -(define (##sys#persist-symbol s) (##core#inline "C_i_persist_symbol" s)) (define ##sys#allocate-vector (##core#primitive "C_allocate_vector")) (define (argc+argv) (##sys#values main_argc main_argv)) (define ##sys#make-structure (##core#primitive "C_make_structure")) diff --git a/runtime.c b/runtime.c index 1d2e750..64020e2 100644 --- a/runtime.c +++ b/runtime.c @@ -2279,8 +2279,11 @@ C_regparm C_word C_fcall C_intern_in(C_word **ptr, int len, C_char *str, C_SYMBO C_regparm C_word C_fcall C_h_intern_in(C_word *slot, int len, C_char *str, C_SYMBOL_TABLE *stable) { - /* Intern as usual, but remember slot, if looked up symbol is in nursery. - also: allocate in static memory. */ + /* Intern as usual, but remember slot, and allocate in static + * memory. If symbol already exists, replace its string by a fresh + * statically allocated string to ensure it never gets collected, as + * lf[] entries are not tracked by the GC. + */ int key; C_word s; @@ -2291,6 +2294,11 @@ C_regparm C_word C_fcall C_h_intern_in(C_word *slot, int len, C_char *str, C_SYM if(C_truep(s = lookup(key, len, str, stable))) { if(C_in_stackp(s)) C_mutate_slot(slot, s); + if(!C_truep(C_permanentp(C_symbol_name(s)))) { + /* Replace by statically allocated string, and persist it */ + C_set_block_item(s, 1, C_static_string(C_heaptop, len, str)); + C_i_persist_symbol(s); + } return s; } @@ -2333,6 +2341,7 @@ C_regparm C_word C_fcall C_intern3(C_word **ptr, C_char *str, C_word value) C_word s = C_intern_in(ptr, C_strlen(str), str, symbol_table); C_mutate2(&C_block_item(s,0), value); + C_i_persist_symbol(s); /* Symbol has a value now; persist it */ return s; } @@ -2385,7 +2394,8 @@ C_regparm C_word C_fcall C_i_persist_symbol(C_word sym) } /* Possibly remove "persistence" of symbol, to allowed it to be GC'ed. - * This is only done if the symbol is unbound and has an empty plist. + * This is only done if the symbol is unbound, has an empty plist and + * is allocated in managed memory. */ C_regparm C_word C_fcall C_i_unpersist_symbol(C_word sym) { @@ -2393,7 +2403,10 @@ C_regparm C_word C_fcall C_i_unpersist_symbol(C_word sym) C_i_check_symbol(sym); - if (C_persistable_symbol(sym)) return C_SCHEME_FALSE; + if (C_persistable_symbol(sym) || + C_truep(C_permanentp(C_symbol_name(sym)))) { + return C_SCHEME_FALSE; + } bucket = lookup_bucket(sym, NULL); if (C_truep(bucket)) { /* It could be an uninterned symbol(?) */ @@ -2464,7 +2477,13 @@ C_word add_symbol(C_word **ptr, C_word key, C_word string, C_SYMBOL_TABLE *stabl C_set_block_item(sym, 2, C_SCHEME_END_OF_LIST); *ptr = p; b2 = stable->table[ key ]; /* previous bucket */ - bucket = C_a_weak_pair(ptr, sym, b2); /* create new bucket */ + + /* Create new weak or strong bucket depending on persistability */ + if (C_persistable_symbol(sym) || C_truep(C_permanentp(string))) { + bucket = C_a_pair(ptr, sym, b2); + } else { + bucket = C_a_weak_pair(ptr, sym, b2); + } if(ptr != C_heaptop) C_mutate_slot(&stable->table[ key ], bucket); else { @@ -4138,10 +4157,41 @@ C_regparm void C_fcall update_symbol_tables(int mode) assert((h & C_HEADER_TYPE_BITS) == C_SYMBOL_TYPE); +#ifdef DEBUGBUILD + /* Detect inconsistencies before dropping / keeping the symbol */ + { + C_word str = C_symbol_name(sym); + int str_perm; + + h = C_block_header(str); + + while(is_fptr(h)) { + str = fptr_to_ptr(h); + h = C_block_header(str); + } + + str_perm = !C_in_stackp(str) && !C_in_heapp(str) && + !C_in_scratchspacep(str) && + (mode == GC_REALLOC ? !C_in_new_heapp(str) : 1); + + if ((C_persistable_symbol(sym) || str_perm) && + (C_block_header(bucket) == C_WEAK_PAIR_TAG)) { + C_dbg(C_text("GC"), C_text("Offending symbol: `%.*s'\n"), + (int)C_header_size(str), C_c_string(str)); + panic(C_text("Persistable symbol found in weak pair")); + } else if (!C_persistable_symbol(sym) && !str_perm && + (C_block_header(bucket) == C_PAIR_TAG)) { + C_dbg(C_text("GC"), C_text("Offending symbol: `%.*s'...\n"), + (int)C_header_size(str), C_c_string(str)); + panic(C_text("Unpersistable symbol found in strong pair")); + } + } +#endif + /* If the symbol is unreferenced, drop it: */ - if(!C_truep(C_permanentp(sym)) && (mode == GC_REALLOC ? - !C_in_new_heapp(sym) : - !C_in_fromspacep(sym))) { + if(mode == GC_REALLOC ? + !C_in_new_heapp(sym) : + !C_in_fromspacep(sym)) { if(last) C_set_block_item(last, 1, C_block_item(bucket,1)); else stp->table[ i ] = C_block_item(bucket,1); -- 2.1.4
signature.asc
Description: Digital signature
_______________________________________________ Chicken-hackers mailing list Chicken-hackers@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-hackers