Java finalization & smobs?
Hi Guile devs, We are debugging a GC issue in LilyPond when used with Guile 2.2, and could use some help. The issue and associated Merge Request are here: https://gitlab.com/lilypond/lilypond/-/issues/6218 https://gitlab.com/lilypond/lilypond/-/merge_requests/1035 We are using smobs with custom mark and free functions for interfacing with our C++ code. We are seeing that sometimes mark functions are called on smobs which have their dependencies already collected, leading to crashes. We can change our mark routines to avoid the crash, but it's unclear to us if this behavior is intended or not, and we worry that this will come back to bite us in the future. On the one hand, the docs for smobs state "must assume .. all SCM values that it references have already been freed and are thus invalid", which suggests that smob freeing happens in random order, which is consistent with what we see. On the other hand, Guile sets up BDWGC with GC_java_finalization=1, which should keep GC dependencies of an object alive until the object itself is finalized, and I think we have observed the mark calls that make this happen. Which of the two is it? (I am seeing the problem on Fedora 35, with Guile 2.2.7 and libgc 8.0.4.) -- Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
Re: Guile 1.8.9 release
On Sun, Feb 14, 2021 at 11:35 AM Han-Wen Nienhuys wrote: > >> Thanks. It turns out my previous fix introduced ABI > >> breakage, so I reworked it to not change function > >> signatures or struct sizes. It's also split up in more > >> parts, so it becomes easier to understand. Please see > >> here: [...] > > > >Any news here? Can I do anything to get this fix in? ping? -- Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
Re: Guile 1.8.9 release
On Wed, Feb 10, 2021 at 11:18 PM Thien-Thi Nguyen wrote: > > > () Han-Wen Nienhuys > () Tue, 9 Feb 2021 09:59:07 +0100 > >> Thanks. It turns out my previous fix introduced ABI >> breakage, so I reworked it to not change function >> signatures or struct sizes. It's also split up in more >> parts, so it becomes easier to understand. Please see >> here: [...] > >Any news here? Can I do anything to get this fix in? > > IIUC, the second iteration achieves the same goals as the first > one (i.e., reducing unnecessary allocation by refining the heap > monitoring machinery). Is that correct? (What am i missing?) The first iteration broke ABI compatibility. The second patch doesn't. > > I would be happy to commit the second patch, if you could refine > it to add the extensive explanation of the first. (You could > even mention the first approach, as an interesting but misguided > dead end.) That way, we have a full record. Maybe this wasn't clear, but the second patch is actually a sequence of patches, and in aggregate they have a more detailed explanation of what is going on. The following are cosmetic changes. While they don't have to be merged, per se, the formatting fixes are the basis of the bugfix changes: 701f6e2cae3dfc1e280711345fb5a75b0aae gc: cleanup DEBUGINFO printfs. ce503b481e7486a7fb5152d3075c2d475fd33e06 gc: fix formatting inconsistencies 925edffc2d4efd19333582ca588e0aebb1c7adf8 gc-segment: clarify comments on segment initialization These are the real bug fixes: 2511e1fa97558e1d0f0620489cdd7550e7d77195 gc: reinterpret scm_gc_cells_collected as garbage counter 594783b15b00133d73aed00bb7f3304a56725497 gc: use normal sweep for pre-mark sweep The following are minor follow-on cleanups: 923c41cb94462140fa07120632f5736680a0c76e gc: calculate min_yield statelessly e2d04fdd4d8a3d9cebd0d9289c5b8f9528e47d34 gc: simplify statistic keeping > I would be extremely happy to commit a test along w/ the change, > if we can figure that out. 
But it's not critical (we can do it > later). > > Re testing, i don't know how to go about setting up a test to > avoid regressions. (IIUC, this is a performance-related change > and not a functionality-related one.) Any ideas? I can add a test, but it requires adding several fields to the (gc-stats) output, which might surprise callers not expecting them. Also, as it is a performance test, it is hard to construct a test that always works. -- Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
Re: Guile 1.8.9 release
On Sun, Jan 17, 2021 at 7:37 PM Han-Wen Nienhuys wrote: > On Sun, Jan 17, 2021 at 12:10 AM Thien-Thi Nguyen wrote: > >your timing is fortuitous. I just spent the christmas > >holidays delving into GUILE 1.8's heap expansion strategy and > >found and fixed a bug with it. > > > > > https://github.com/hanwen/guile/commit/9b32504780e0b604196be866b8c36079891e3cd6 > > > >How does code review for proposed patches work these days? > > > > I don't know. I think what you did is fine. I invite experts > > to review and comment. They will surely be quicker than me (i > > will require a week or two just to read/understand the patch). > > > >I'd be happy to polish it up (ie. add a proper test) for > >inclusion in GUILE 1.8.9, > > > > Sounds good. I think the general approach for 1.8.x releases > > will be bugfixes and documentation changes primarily, so your > > change would be most welcome (once i wrap my head around it). > > Thanks. It turns out my previous fix introduced ABI breakage, so I > reworked it to not change function signatures or struct sizes. It's > also split up in more parts, so it becomes easier to understand. > Please see here: > > https://github.com/hanwen/guile/commit/8fbe3222cac4b4e9b39a6a3570ac43f160faa516 Any news here? Can I do anything to get this fix in? -- Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
Re: Guile 1.8.9 release
On Sun, Jan 17, 2021 at 12:10 AM Thien-Thi Nguyen wrote: > > > () Han-Wen Nienhuys > () Tue, 12 Jan 2021 09:20:55 +0100 > >your timing is fortuitous. I just spent the christmas >holidays delving into GUILE 1.8's heap expansion strategy and >found and fixed a bug with it. > > > https://github.com/hanwen/guile/commit/9b32504780e0b604196be866b8c36079891e3cd6 > >How does code review for proposed patches work these days? > > I don't know. I think what you did is fine. I invite experts > to review and comment. They will surely be quicker than me (i > will require a week or two just to read/understand the patch). > >I'd be happy to polish it up (ie. add a proper test) for >inclusion in GUILE 1.8.9, > > Sounds good. I think the general approach for 1.8.x releases > will be bugfixes and documentation changes primarily, so your > change would be most welcome (once i wrap my head around it). Thanks. It turns out my previous fix introduced ABI breakage, so I reworked it to not change function signatures or struct sizes. It's also split up in more parts, so it becomes easier to understand. Please see here: https://github.com/hanwen/guile/commit/8fbe3222cac4b4e9b39a6a3570ac43f160faa516 -- Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
Guile 1.8.9 release
Hi Thien-Thi, your timing is fortuitous. I just spent the Christmas holidays delving into GUILE 1.8's heap expansion strategy, and found and fixed a bug in it. See https://github.com/hanwen/guile/commit/9b32504780e0b604196be866b8c36079891e3cd6 How does code review for proposed patches work these days? I'd be happy to polish it up (i.e., add a proper test) for inclusion in GUILE 1.8.9. Cheers, -- Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
useful statprof data?
We finally have a version of LilyPond that can work with byte compiled files on GUILE 3. I had high hopes for this, given the ebullient promises of JIT in the release announcement. The results are disappointing. GUILE 3.0 is 3% faster than GUILE 2.2; both are about 30% slower than GUILE 1.8 I wanted to look more into the cause of this, and tried to run statprof. This prints a ton of ;;; (what! #) lines. Then, the results look like 97.2% #x221bfb4 27.6% #x2215134 27.4% #x221836c 15.6% #x2215138 15.6% #x220ae70 15.6% profile-signal-handler at statprof.scm:251:4 15.6% #x2212b88 6.7% #x2218430 6.7% #x220ae70 6.7% profile-signal-handler at statprof.scm:251:4 6.7% #x2212b88 Is there a way to get insight into what these hex addresses (?) mean? -- Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
Re: definitions in macros?
On Sun, Mar 22, 2020 at 10:09 PM David Kastrup wrote: > > In the code below, it looks like only one of the two definitions in > > the body of my-macro-new takes effect. Is this expected, and if so, > > why? > > > > (defmacro-public my-macro-old (command-and-args . definition) > > (module-define! (current-module) 'x1 "I am X1\n") > > (module-define! (current-module) 'x2 "I am X2\n")) > > > > (defmacro-public my-macro-new (command-and-args . definition) > > `(define p "i am P\n") > > `(define q "i am Q\n")) > > This is very much expected. The macro body contains two side-effect > free expressions (namely quoted lists) and returns the last one which is .. > You probably wanted something like > `(begin (define p ...) (define q ...)) d'oh! I am an idiot. Thanks, -- Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
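For reference, a minimal sketch of the fix suggested in the reply above: both definitions are wrapped in a single `begin` form, so the macro returns one expression that contains both. It uses plain `defmacro` rather than `defmacro-public` only to stay self-contained; the macro name and its (unused) arguments are taken from the example in this thread.

```scheme
;; The macro body must return ONE form. Two consecutive quasiquoted
;; lists would evaluate both but return only the last one.
(defmacro my-macro-new (command-and-args . definition)
  `(begin
     (define p "i am P\n")
     (define q "i am Q\n")))

(my-macro-new 1 2)

(display p)
(display q)
```

At top level, the definitions inside `begin` are spliced into the enclosing module, so both `p` and `q` should be bound after the macro call.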
definitions in macros?
Hi there, in my quest to get lilypond working with GUILE 2+, I've hit another stumbling block. In order to make compilation with GUILE 2+ working, we have to move away from runtime symbol definition (ie. module-define! calls). In the code below, it looks like only one of the two definitions in the body of my-macro-new takes effect. Is this expected, and if so, why? (defmacro-public my-macro-old (command-and-args . definition) (module-define! (current-module) 'x1 "I am X1\n") (module-define! (current-module) 'x2 "I am X2\n")) (defmacro-public my-macro-new (command-and-args . definition) `(define p "i am P\n") `(define q "i am Q\n")) (my-macro-old 1 2) (my-macro-new 1 2) (display x1) (display x2) (display q) (display p) thanks, -- Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
Re: garbage collection slowdown
On Wed, Feb 5, 2020 at 5:23 PM Ludovic Courtès wrote: > Weird. It would be interesting to see where the slowdown comes from. > Overall, my recollection of the 1.8 to 2.0 transition (where we > introduced libgc) is that GC was a bit faster, definitely not slower. > > That said, does LilyPond happen to use lots of bignums and/or lots of > finalizers? Finalizers, including those on bignums, end up being very > GC-intensive, as discussed in my recent message. Perhaps that’s what’s > happening here, for instance if you create lots of SMOBs with a free > function. > No, I think it's because in some phases of the program, there is a lot of heap growth, with little garbage generation. This causes frequent (expensive) GCs that don't reclaim anything. -- Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
Re: unhandled constant?
On Sat, Feb 1, 2020 at 11:11 AM David Kastrup wrote: > >> Here is an example that shows better how things work, and what might > >> be the cause of my problems. Is it right that programmatically set > >> contents of "current-module" are not serialized into the compiled > >> file? > >> > >>(define (run-at-compile-time cmd) > >> (module-define! (current-module) (string->symbol cmd) #t) > >> (format (current-error-port) "I-am-called-at-compile-time ~a\n" cmd)) > >> > >> (define (runtime-call cmd) > >> (format (current-error-port) "I-am-called-at-runtime ~a\n" cmd) > >> (format (current-error-port) "val ~a\n" > >> (module-ref (current-module) (string->symbol cmd > >> > >> (defmacro foo (cmd . rest) > >> (run-at-compile-time cmd) > >> `(runtime-call ,cmd)) > >> > >> (foo "xy") .. > >> ERROR: In procedure scm-error: > >> No variable named xy in # > >> > > But that is not using a local define at all. Can you point out the > actual code that failed for you? There are two independent problems. One is a problem with inner defines, which is addressed by https://codereview.appspot.com/553480044/ the symptom is compilation failing with "unhandled constant # " The other is a problem you can reproduce if you check out https://github.com/hanwen/lilypond/tree/guile22-experiment with the symptom being: ;;; compiling /home/hanwen/vc/lilypond/out/share/lilypond/current/scm/define-markup-commands.scm fatal error: Not a markup command: line This is because the LilyPond macro "markup" doesn't recognize markup command and aborts in code that is executed at compile-time. The code that triggers this are definitions in scm/define-markup-commands.scm that use (mark #:blah .. ) in the function body. You can verify this by rewriting https://github.com/lilypond/lilypond/blob/c5ffa540fdbe52486b9575567ede70be575adb47/scm/define-markup-commands.scm#L305 and seeing how the error message changes. 
I still don't understand why some code is executed compile time (the expansion of the markup macro) while other is not (defining the make-x-markup function in (current-module)) Since we recognize markup commands by looking them up in (current-module), I believe the example I showed here shows that we can never make this work, and we will have to revisit the markup macros completely. -- Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
Re: unhandled constant?
+lilypond-devel for visibility. On Sat, Feb 1, 2020 at 10:54 AM Han-Wen Nienhuys wrote: > > Here is an example that shows better how things work, and what might > be the cause of my problems. Is it right that programmatically set > contents of "current-module" are not serialized into the compiled > file? > >(define (run-at-compile-time cmd) > (module-define! (current-module) (string->symbol cmd) #t) > (format (current-error-port) "I-am-called-at-compile-time ~a\n" cmd)) > > (define (runtime-call cmd) > (format (current-error-port) "I-am-called-at-runtime ~a\n" cmd) > (format (current-error-port) "val ~a\n" > (module-ref (current-module) (string->symbol cmd > > (defmacro foo (cmd . rest) > (run-at-compile-time cmd) > `(runtime-call ,cmd)) > > (foo "xy") > > $ guile1.8 ew.scm > I-am-called-at-compile-time xy > I-am-called-at-runtime xy > val #t > > this is compatible with 2.2 without compilation, > > $ GUILE_AUTO_COMPILE=0 guile2.2 ew.scm > I-am-called-at-compile-time xy > I-am-called-at-runtime xy > val #t > > but compilation fails > > $ GUILE_AUTO_COMPILE=1 guile2.2 ew.scm > ;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0 > ;;; or pass the --no-auto-compile argument to disable. > ;;; compiling /home/hanwen/vc/lilypond/ew.scm > ;;; WARNING: compilation of /home/hanwen/vc/lilypond/ew.scm failed: > ;;; Unbound variable: run-at-compile-time > > $ guild2.2 compile ew.scm > Backtrace: > In system/base/target.scm: > 57:6 19 (with-target _ _) > In system/base/compile.scm: > .. > Unbound variable: run-at-compile-time > > > If I encapsulate the run-at-compile-time definition with > > (eval-when > (compile eval) > > it works if I remove the module manipulation, but the module-ref > doesn't work. It looks like the settings from module-define! are not > serialized into the byte code, so I can't have code that relies on > correspondence between module-define driven from macros and module-ref > during evaluation. 
> > [hanwen@localhost lilypond]$ guild2.2 compile ew.scm > I-am-called-at-compile-time xy > wrote > `/home/hanwen/.cache/guile/ccache/2.2-LE-8-3.A/home/hanwen/vc/lilypond/ew.scm.go' > [hanwen@localhost lilypond]$ GUILE_AUTO_COMPILE=0 guile2.2 ew.scm > I-am-called-at-runtime xy > Backtrace: >6 (apply-smob/1 #) > In ice-9/boot-9.scm: > 705:2 5 (call-with-prompt ("prompt") # …) > In ice-9/eval.scm: > 619:8 4 (_ #(#(#))) > In ice-9/boot-9.scm: >2312:4 3 (save-module-excursion #) > 3832:12 2 (_) > In ew.scm: > 10:10 1 (runtime-call "xy") > In unknown file: >0 (scm-error misc-error #f "~A ~S ~S ~S" ("No variabl…" …) …) > > ERROR: In procedure scm-error: > No variable named xy in # > -- Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
Re: unhandled constant?
Here is an example that shows better how things work, and what might be the cause of my problems. Is it right that programmatically set contents of "current-module" are not serialized into the compiled file? (define (run-at-compile-time cmd) (module-define! (current-module) (string->symbol cmd) #t) (format (current-error-port) "I-am-called-at-compile-time ~a\n" cmd)) (define (runtime-call cmd) (format (current-error-port) "I-am-called-at-runtime ~a\n" cmd) (format (current-error-port) "val ~a\n" (module-ref (current-module) (string->symbol cmd)))) (defmacro foo (cmd . rest) (run-at-compile-time cmd) `(runtime-call ,cmd)) (foo "xy") $ guile1.8 ew.scm I-am-called-at-compile-time xy I-am-called-at-runtime xy val #t this is compatible with 2.2 without compilation, $ GUILE_AUTO_COMPILE=0 guile2.2 ew.scm I-am-called-at-compile-time xy I-am-called-at-runtime xy val #t but compilation fails $ GUILE_AUTO_COMPILE=1 guile2.2 ew.scm ;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0 ;;; or pass the --no-auto-compile argument to disable. ;;; compiling /home/hanwen/vc/lilypond/ew.scm ;;; WARNING: compilation of /home/hanwen/vc/lilypond/ew.scm failed: ;;; Unbound variable: run-at-compile-time $ guild2.2 compile ew.scm Backtrace: In system/base/target.scm: 57:6 19 (with-target _ _) In system/base/compile.scm: .. Unbound variable: run-at-compile-time If I encapsulate the run-at-compile-time definition with (eval-when (compile eval) it works if I remove the module manipulation, but the module-ref doesn't work. It looks like the settings from module-define! are not serialized into the byte code, so I can't have code that relies on correspondence between module-define driven from macros and module-ref during evaluation. 
[hanwen@localhost lilypond]$ guild2.2 compile ew.scm I-am-called-at-compile-time xy wrote `/home/hanwen/.cache/guile/ccache/2.2-LE-8-3.A/home/hanwen/vc/lilypond/ew.scm.go' [hanwen@localhost lilypond]$ GUILE_AUTO_COMPILE=0 guile2.2 ew.scm I-am-called-at-runtime xy Backtrace: 6 (apply-smob/1 #) In ice-9/boot-9.scm: 705:2 5 (call-with-prompt ("prompt") # …) In ice-9/eval.scm: 619:8 4 (_ #(#(#))) In ice-9/boot-9.scm: 2312:4 3 (save-module-excursion #) 3832:12 2 (_) In ew.scm: 10:10 1 (runtime-call "xy") In unknown file: 0 (scm-error misc-error #f "~A ~S ~S ~S" ("No variabl…" …) …) ERROR: In procedure scm-error: No variable named xy in # On Fri, Jan 31, 2020 at 9:01 PM Linus Björnstam wrote: > > Read the docs. That seems to be a documentation bug. Try fiddling with the > arguments to eval when and see if you can make it work. > > -- > Linus Björnstam > > On Fri, 31 Jan 2020, at 20:17, Han-Wen Nienhuys wrote: > > On Fri, Jan 31, 2020 at 7:20 PM Linus Björnstam > > wrote: > > > I don't really understand your question. With defmacro and syntax-case > > > you can run arbitrary code. If you just output code that does > > > module-define! that won't be run until runtime, and thus you cannot > > > depend on the result of that module-define! during expansion. You can > > > however wrap it in an eval-when to solve that issue. That allows you to > > > specify when code gets run. With module-define! I personally find it all > > > a bit icky, but I usually stay as far away from phasing as I can :) > > > > eval-when looks like it might be a solution to the puzzle , but > > honestly, the doc at > > > > https://www.gnu.org/software/guile/manual/html_node/Eval-When.html > > > > has me mystified. When I run the example through guile 2.2 and > > display *compilation-date*, > > > > I get a different answer each time. Shouldn't it be a fixed timestamp > > (of when the compile happened?) 
> > > > [hanwen@localhost lilypond]$ guile2.2 e.scm > > ;;; note: source file /home/hanwen/vc/lilypond/e.scm > > ;;; newer than compiled > > /home/hanwen/.cache/guile/ccache/2.2-LE-8-3.A/home/hanwen/vc/lilypond/e.scm.go > > ;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0 > > ;;; or pass the --no-auto-compile argument to disable. > > ;;; compiling /home/hanwen/vc/lilypond/e.scm > > ;;; compiled > > /home/hanwen/.cache/guile/ccache/2.2-LE-8-3.A/home/hanwen/vc/lilypond/e.scm.go > > Fri Jan 31 20:15:57+0100 2020 > > [hanwen@localhost lilypond]$ guile2.2 e.scm > > Fri Jan 31 20:15:58+0100 2020 > > [hanwen@localhost lilypond]$ guile2.2 e.scm > > Fri Jan 31 20:16:00+0100 2020 > > > > -- > > Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen > > -- Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
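A hedged sketch of one way the phasing problem in this message might be worked around on Guile 2.2. It rests on two assumptions, neither confirmed in this thread: that the helper is wrapped in `(eval-when (expand load eval) ...)` so it exists both while the macro is expanded and when the compiled file is loaded, and that the macro's expansion repeats the registration at run time, since bindings created with `module-define!` during expansion do not appear to be written into the `.go` file. The name `register-name!` is hypothetical.

```scheme
;; Make the helper available at expansion time AND at load/run time.
(eval-when (expand load eval)
  (define (register-name! cmd)
    (module-define! (current-module) (string->symbol cmd) #t)))

(define (runtime-call cmd)
  (module-ref (current-module) (string->symbol cmd)))

(defmacro foo (cmd . rest)
  ;; Runs while the macro is being expanded (i.e. during compilation).
  (register-name! cmd)
  ;; The expansion registers the name again, so the binding also exists
  ;; when the compiled file is loaded in a fresh process.
  `(begin
     (register-name! ,cmd)
     (runtime-call ,cmd)))

(foo "xy")
```

The design choice here is to treat the expand-time `module-define!` as a side effect that only helps other macros during the same compilation, and to make the run-time state depend solely on code that is part of the expansion.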
Re: garbage collection slowdown
On Tue, Jan 28, 2020 at 11:41 PM Han-Wen Nienhuys wrote: > Unfortunately, it looks like the adoption of the BDW GC library caused > a ~6x slowdown, causing an overall end-to-end slowdown of 50%. > > I was wondering if you folks would have tips to further tune GC for > wall-time speed, and if there additional diagnostics to see if we're > doing something extraordinarily silly. For the record, I managed to solve this, by scaling up the heap more aggressively. See https://codereview.appspot.com/561390043/diff/557260051/lily/score-engraver.cc Ironically, this is the same problem scenario that led me to refactoring the GUILE GC in ~2002, in c8a1bdc460f892847d0fb3f1321cdeb305160bf8. -- Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
Re: unhandled constant?
On Fri, Jan 31, 2020 at 7:20 PM Linus Björnstam wrote: > I don't really understand your question. With defmacro and syntax-case you > can run arbitrary code. If you just output code that does module-define! that > won't be run until runtime, and thus you cannot depend on the result of that > module-define! during expansion. You can however wrap it in an eval-when to > solve that issue. That allows you to specify when code gets run. With > module-define! I personally find it all a bit icky, but I usually stay as far > away from phasing as I can :) eval-when looks like it might be a solution to the puzzle , but honestly, the doc at https://www.gnu.org/software/guile/manual/html_node/Eval-When.html has me mystified. When I run the example through guile 2.2 and display *compilation-date*, I get a different answer each time. Shouldn't it be a fixed timestamp (of when the compile happened?) [hanwen@localhost lilypond]$ guile2.2 e.scm ;;; note: source file /home/hanwen/vc/lilypond/e.scm ;;; newer than compiled /home/hanwen/.cache/guile/ccache/2.2-LE-8-3.A/home/hanwen/vc/lilypond/e.scm.go ;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0 ;;; or pass the --no-auto-compile argument to disable. ;;; compiling /home/hanwen/vc/lilypond/e.scm ;;; compiled /home/hanwen/.cache/guile/ccache/2.2-LE-8-3.A/home/hanwen/vc/lilypond/e.scm.go Fri Jan 31 20:15:57+0100 2020 [hanwen@localhost lilypond]$ guile2.2 e.scm Fri Jan 31 20:15:58+0100 2020 [hanwen@localhost lilypond]$ guile2.2 e.scm Fri Jan 31 20:16:00+0100 2020 -- Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
Re: unhandled constant?
On Fri, Jan 31, 2020 at 3:58 PM Linus Björnstam wrote: > > Guile1.8's macros are run-time macros: they are executed directly and not > transformed to output code that is then compiled. That is the reason why your > code works: to newer guiles the (inner ...) is only available at expansion > time. The macro output is trying to call code that does not exist at runtime! When is the code executed? I have a complex set of macros to define a special type of function (so-called markup commands). Some of these refer to other markup commands through a macro. What I can observe is that some of the functions involved are not called during the compilation, but others are. In particular, the function that registers a markup command uses something like (module-define! (current-module) (string->symbol (format #f "~a-markup" name)) defn) but this function is not called during the compile. There is a convenience macro that is called within some function bodies, and that does get called. Unfortunately, the latter convenience macro is expanded and then executed; the execution then tries to do (module-ref (current-module) (string->symbol (format #f "~a-markup" name))) which fails. > For this to be working code the (inner ...) function needs to be available in > the macro expansion. I didn't read through exactly what you are trying to do, > but try outputting a let: > > `(let ((inner (lambda (n v) (set! ... > (inner ,name ,value)) > > I doubt you can make the old code work in newer guiles, since I doubt any > scheme is as lax about expansion time and macro time separation. > -- > Linus Björnstam > > On Wed, 29 Jan 2020, at 00:08, Han-Wen Nienhuys wrote: > > Some of the lilypond Scheme files do the following: > > > > > > (define decl '()) > > (define (make-var n v) (list "var" n v)) > > (defmacro define-session (name value) > > (define (inner n v) > > (set! 
decl > > (cons > > (make-var n v) > > decl)) > > ) > > `(,inner ',name ,value)) > > (define-session foo 1) > > (display decl) > > (newline) > > > > In GUILE 2.2, this yields > > > > ;;; WARNING: compilation of /home/hanwen/vc/lilypond/q.scm failed: > > ;;; unhandled constant # > > > > What does this error message mean, and what should I do to address the > > problem? > > -- > > Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen > > > > -- Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
Re: unhandled constant?
On Fri, Jan 31, 2020 at 3:58 PM Linus Björnstam wrote: > > Guile1.8's macros are run-time macros: they are executed directly and not > transformed to output code that is then compiled. That is the reason why your > code works: to newer guiles the (inner ...) is only available at expansion > time. The macro output is trying to call code that does not exist at runtime! > > For this to be working code the (inner ...) function needs to be available in > the macro expansion. I didn't read through exactly what you are trying to do, > but try outputting a let: > > `(let ((inner (lambda (n v) (set! ... > (inner ,name ,value)) > > I doubt you can make the old code work in newer guiles, since I doubt any > scheme is as lax about expansion time and macro time separation. Thanks for the explanation. This makes sense. Is it possible to attach doc strings to (define-syntax .. ) declarations, and if so, where do they go? -- Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
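A minimal sketch of how the explanation quoted above could be applied while keeping `defmacro` (so that, presumably, the same source still loads on GUILE 1.8): the helper becomes an ordinary top-level procedure, and the expansion refers to it by name instead of splicing the procedure object itself into the generated code. The name `add-session-var!` is hypothetical and does not come from the LilyPond sources.

```scheme
(define decl '())
(define (make-var n v) (list "var" n v))

;; Top-level helper: it exists at run time, so the expansion can name it.
(define (add-session-var! n v)
  (set! decl (cons (make-var n v) decl)))

(defmacro define-session (name value)
  ;; The expansion contains only symbols and literals, no procedure
  ;; objects, so there is nothing the compiler cannot serialize.
  `(add-session-var! ',name ,value))

(define-session foo 1)
(display decl)   ; prints ((var foo 1))
(newline)
```

The only change relative to the original is where `inner` lives; the macro's observable behaviour should be the same.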
Re: unhandled constant?
On Thu, Jan 30, 2020 at 9:05 AM Han-Wen Nienhuys wrote: > > [guile1.8]$ grep -ir define-syntax-rule . > > (empty) > > I need something that works in GUILE 1.8 too. > > this is what I got from David Kastrup: > > >Got any comments about macros being sooo yesterday compared to syntax > forms? Syntax forms actually don't work in LilyPond: there was an > incompatibility because the 1.8 implementation will balk if some symbol > used in syntax forms already has a definition, and we have that. I > forgot the exact symbol at fault. I think it was some music function > name. > > Would there be a way that we can use our source code unchanged with GUILE 2.2? > > Can you explain why I get this error message? Also, how is it possible that, when disabling auto-compilation, the whole thing works? $ GUILE_AUTO_COMPILE=0 guile2.2 q.scm ((var foo 1)) Is there an evaluator in GUILE that is separate from the bytecode VM, and if so, is this evaluator guaranteed to be supported in upcoming versions of GUILE? Does the evaluation without auto-compilation benefit from JIT treatment? If I want to explore this myself, how do I hack on GUILE itself? Compiling GUILE from scratch takes more than an hour for me. I assume there must be a faster way to experiment, but what is that? It looks like the HACKING file is out of date. -- Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
Re: unhandled constant?
[guile1.8]$ grep -ir define-syntax-rule . (empty) I need something that works in GUILE 1.8 too. this is what I got from David Kastrup: >Got any comments about macros being sooo yesterday compared to syntax forms? Syntax forms actually don't work in LilyPond: there was an incompatibility because the 1.8 implementation will balk if some symbol used in syntax forms already has a definition, and we have that. I forgot the exact symbol at fault. I think it was some music function name. Would there be a way that we can use our source code unchanged with GUILE 2.2? Can you explain why I get this error message? On Wed, Jan 29, 2020 at 4:06 PM Ricardo Wurmus wrote: > > > Han-Wen Nienhuys writes: > > > Some of the lilypond Scheme files do the following: > > > > > > (define decl '()) > > (define (make-var n v) (list "var" n v)) > > (defmacro define-session (name value) > > (define (inner n v) > > (set! decl > > (cons > > (make-var n v) > > decl)) > > ) > > `(,inner ',name ,value)) > > (define-session foo 1) > > (display decl) > > (newline) > > > > In GUILE 2.2, this yields > > > > ;;; WARNING: compilation of /home/hanwen/vc/lilypond/q.scm failed: > > ;;; unhandled constant # > > > > What does this error message mean, and what should I do to address the > > problem? > > Would it be feasible to use define-syntax-rule here? > > --8<---cut here---start->8--- > (define decl '()) > (define (make-var n v) (list "var" n v)) > (define-syntax-rule (define-session name value) > (let ((inner (lambda (n v) > (set! decl >(cons > (make-var n v) > decl) > (inner 'name value))) > (define-session foo 1) > (display decl) > (newline) > --8<---cut here---end--->8--- > > > -- > Ricardo -- Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
Re: garbage collection slowdown
(please CC replies directly to me; I am not on the guile-devel list.) Arne mentioned >This might be the read-function which is slower in 2.2. You might want >to try to go directly to Guile 3, in which the read function should be >on par with the read in 1.8. I'd rather avoid making a jump to GUILE 3, as it's not in distributions yet. Just to be extra clear: if I instrument Lily with (begin (display "gc time taken: ") (display (* 1.0 (/ (cdr (assoc 'gc-time-taken (gc-stats))) internal-time-units-per-second))) (display "\n")) this number increases from 0.3 to 1.7. Parsing and compiling the .scm files in our distribution has a GC overhead of 0.3 by itself on GUILE 2.2. The release notes for 2.0 say that "Switch to the Boehm-Demers-Weiser garbage collector .. It also improves performance." I am curious about the numbers that support this; can somebody point me to them? From where I stand, it looks like a huge performance regression. On Tue, Jan 28, 2020 at 11:41 PM Han-Wen Nienhuys wrote: > > Hi folks, > > after a long hiatus I have started getting involved with LilyPond > again, and one of the things I'd like to do is get LilyPond off GUILE > 1.8. Experiments suggest that starting from GUILE 2.2, the execution > performance is on par with 1.8. There are 2 open issues: caching byte > compiled files (which I haven't looked into yet), and GC. > > Unfortunately, it looks like the adoption of the BDW GC library caused > a ~6x slowdown, causing an overall end-to-end slowdown of 50%. > > I was wondering if you folks would have tips to further tune GC for > wall-time speed, and if there additional diagnostics to see if we're > doing something extraordinarily silly. > > I already found the GC_free_space_divisor, but I already tuned to its > fastest value, 1. > > -- > Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen -- Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
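The instrumentation described in this message can be packaged as a small helper, sketched here under the assumption that `(gc-stats)` returns an alist containing a `gc-time-taken` entry, as the snippet above relies on. The names `gc-seconds` and `call-with-gc-timing` are hypothetical, and the thunk is assumed to return a single value.

```scheme
;; Cumulative GC time in seconds, as reported by (gc-stats).
(define (gc-seconds)
  (exact->inexact
   (/ (cdr (assq 'gc-time-taken (gc-stats)))
      internal-time-units-per-second)))

;; Run THUNK and report how much GC time was spent while it ran.
(define (call-with-gc-timing label thunk)
  (let* ((before (gc-seconds))
         (result (thunk))
         (after (gc-seconds)))
    (format (current-error-port) "~a: gc time taken ~a s\n"
            label (- after before))
    result))

;; Example: time the GC cost of allocating a moderately large list.
(call-with-gc-timing "build list"
  (lambda () (length (make-list 100000 0))))
```

Wrapping individual phases this way, rather than printing one total at exit, may make it easier to see which phase accounts for the increase from 0.3 to 1.7.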
unhandled constant?
Some of the lilypond Scheme files do the following: (define decl '()) (define (make-var n v) (list "var" n v)) (defmacro define-session (name value) (define (inner n v) (set! decl (cons (make-var n v) decl)) ) `(,inner ',name ,value)) (define-session foo 1) (display decl) (newline) In GUILE 2.2, this yields ;;; WARNING: compilation of /home/hanwen/vc/lilypond/q.scm failed: ;;; unhandled constant # What does this error message mean, and what should I do to address the problem? -- Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
garbage collection slowdown
Hi folks, after a long hiatus I have started getting involved with LilyPond again, and one of the things I'd like to do is get LilyPond off GUILE 1.8. Experiments suggest that starting from GUILE 2.2, the execution performance is on par with 1.8. There are 2 open issues: caching byte compiled files (which I haven't looked into yet), and GC. Unfortunately, it looks like the adoption of the BDW GC library caused a ~6x slowdown, causing an overall end-to-end slowdown of 50%. I was wondering if you folks would have tips to further tune GC for wall-time speed, and if there are additional diagnostics to see if we're doing something extraordinarily silly. I already found GC_free_space_divisor, and have already tuned it to its fastest value, 1. -- Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
Re: The usage of -dtrace-scheme-coverage in Lilypond
On Fri, Mar 11, 2011 at 12:21 AM, zhangxy wrote: > Dear hanwen, > Now I want to analyze the test coverage of Lilypond. I find the option > -dtrace-scheme-coverage. It says that the option can record coverage of > Scheme files in `FILE.cov'. Then I do the following >> >> lilypond -dtrace-scheme-coverage test.ly > > But it gives me the error >> >> throw from within critical section. trace-scheme-coverage relies on a hack in the GUILE evaluator that I added in 2007, which records a symbol's source location the moment that symbol is looked up. The glue on the lilypond side is in scm/coverage.scm. This feature was removed in 2010, when the GUILE folks rewrote the evaluator, before ever seeing the light in a GUILE release. Apparently there is a new mechanism for finding coverage. See https://www.gnu.org/software/guile/manual/html_node/Code-Coverage.html - perhaps you can work out a way with the guile folks to resurrect lilypond's test coverage code. -- Han-Wen Nienhuys - han...@xs4all.nl - http://www.xs4all.nl/~hanwen
Re: %module-public-interface
On Fri, Apr 2, 2010 at 8:58 AM, Ian Hulin wrote: >>>>>> How do they use it? >>>>>> >>>>> >>>>> Linking to the evil empire: >>>>> http://www.google.com/codesearch?hl=en&lr=&q=%25module-public-interface&sbtn=Search (ehh?) >> When is the new Lilypond release due? >> >> > > I'm not the ReleaseMeister for Lilypond; you'll get a better picture by > talking to Graham Percival (gra...@percival-music.ca). > > But FWIW it looks like we're on our few last development releases before the > stable V2.14 comes out. It's near enough for a spoof release announcement > to have gone out on the mailing list on April 1st which suckered me! > > I reckon plans are for Lilypond to stick with Guile V1.8.7 at least until > the next Lilypond stable version after V2.14, but again, mileage may vary if > you talk to more experienced Lilypond people. Is Guile 2.0 already released? I think it makes sense to forget about guile 2.0 for the 2.14 release, and require 2.0 for the 2.16 release. We could scrap lots of hairy GC code if we could move to 2.0 (2.0 supports boehm GC, right?) > 4. We've already seen the %module-public-interface thing in the Lily C++. > There's probably more smelly stuff lurking in the C++ interface, which > won't surface until we start trying to use Guile 2.0 more. There may be lots of hairiness in the module interface; I sort of made up functions as I went along, since it was largely undocumented. -- Han-Wen Nienhuys - han...@xs4all.nl - http://www.xs4all.nl/~hanwen
Re: marking overhead, and on the cost of conditionals in hot code
Andy Wingo escreveu: > I dropped into cachegrind, and it tells me this about scm_gc_mark in a > simple guile -c 1 run: > > > I think that the items on the left are cycle counts, and are of relative > importance. The => lines are the cumulative costs of the subroutines. > > The salient point for me is that the scm_i_marking check slows down > this function by about 10%! This can easily be remedied by splitting off the actual work into an internal function which skips the check. The GC module could always call the internal function. > Also, that the majority of the time in this > function is in the SCM_GC_MARK_P line. Well, GC_MARK_P is bit fiddling plus a pointer dereference, with a possible cache miss. Also, the code up to that point will get executed much more often than what follows. -- Han-Wen Nienhuys - han...@xs4all.nl - http://www.xs4all.nl/~hanwen
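The split suggested above can be sketched in a few lines of C. This is only an illustration under stated assumptions: `struct cell`, `in_mark_phase`, `mark_public`, `mark_internal` and `gc_mark_roots` are hypothetical stand-ins, not libguile's actual names or data layout. The point is that the guard is paid for only on the public entry point, while the collector calls the unchecked worker directly.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not libguile's real definitions. */
struct cell {
  bool marked;
  struct cell *next;
};

static bool in_mark_phase = false;      /* plays the role of scm_i_marking */
static unsigned long check_count = 0;   /* how often the guard was evaluated */

/* Internal worker: marks a chain of cells, no phase check at all. */
static void mark_internal(struct cell *c)
{
  while (c != NULL && !c->marked) {
    c->marked = true;
    c = c->next;
  }
}

/* Public entry point: evaluates the guard, then delegates to the worker. */
void mark_public(struct cell *c)
{
  check_count++;
  if (!in_mark_phase)
    return;  /* a real implementation would report the misuse here */
  mark_internal(c);
}

/* Collector side: it knows it is in the mark phase, so it skips the guard. */
void gc_mark_roots(struct cell **roots, size_t n)
{
  in_mark_phase = true;
  for (size_t i = 0; i < n; i++)
    mark_internal(roots[i]);
  in_mark_phase = false;
}
```

With this shape, user code calling `mark_public` outside a collection is still caught, but the hot path inside the collector never touches the flag.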
Re: Using define in multiple threads?
Linas Vepstas escreveu: > Is it "safe" or "legal" to use define in multiple threads? I guess not. Someone -I forgot who- put in the pthreads without thinking through the consequences. Look through eval.c, you'll see SCM_SETCAR (expr, SCM_IM_DO); // * SCM_SETCDR (expr, tail); which is very dubious if a thread switch happens at (*) -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: Guile release planning
Neil Jerram escreveu: > So, what do you think? There have been discussions of release > strategy in the past, which I've seen as 50/50 between the split > stable and development model (which we have now) and the steady new > feature model (described above), but I don't recall them considering > the overall community focus angle before. In my view, when we add in > that angle, the steady new feature model is better. One angle that we could take is time based release planning, like GNOME and Fedora do: plan to do one or two releases per year on a rigid schedule. The LilyPond 2.11 vs. 2.12 jump has been delayed for too long, but I generally do a biweekly release, which is stable enough to reasonably be called 'stable', and it has worked very well so far. The precondition for this is that there is a good test-suite so we can be sure that a release that passes the tests is good. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: Comparing Guile's GC with BDW-GC
Ludovic Courtès escreveu: > Hello!
>                  heap size (MiB)   execution time (s.)
> Guile            1.54 (1.00x)      6.316 (1.00x)
> BDW-GC, FSD=3    2.41 (1.57x)      4.943 (0.78x)
I wonder whether this is a useful benchmark. 1.54 MiB is small compared to the 2 MB L2 cache of a Core Duo. I think it's best to look at tests that exercise a larger working set. From casual inspection of the numbers, I'd say that going to BDW is worth it: the speed/memory usage is roughly comparable, and the ability to junk 6k lines of hairy code is definitely worth it. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: Comparing Guile's GC with BDW-GC
Ludovic Courtès escreveu: > Hello! > > I finally [0] conducted experiments to compare Guile's GC with my port > of Guile to the Boehm-Demers-Weiser GC (BDW-GC). The code for that port > is not currently available on-line but I'd be happy to push it somewhere > (would Guile's repo at Savannah be a good fit?). I recommend you push a branch to savannah, for everyone to see what is happening. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: Why bother porting Guile to BDW-GC?
Han-Wen Nienhuys escreveu: > Ludovic Courtès escreveu: >> Hello Guilers! >> >> Below are some of the points (in no particular order) that IMO can make >> it worthwhile to use the Boehm-Demers-Weiser GC [0] in Guile instead of >> Guile's historical GC, from an engineering viewpoint. >> > > I'm all for scrapping code; here are my concerns: > > - what is the performance impact? > > - does BDW GC handle weak references correctly? > > - What about various (undoubtedly little used) areas where GC interacts > with the interpreter: port de-allocation, guardians, etc. I saw that you mentioned these in your mail. I wonder if it is feasible to provide backward compatibility if we move to BDW. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: Why bother porting Guile to BDW-GC?
Ludovic Courtès escreveu: > Hello Guilers! > > Below are some of the points (in no particular order) that IMO can make > it worthwhile to use the Boehm-Demers-Weiser GC [0] in Guile instead of > Guile's historical GC, from an engineering viewpoint. > I'm all for scrapping code; here are my concerns: - what is the performance impact? - does BDW GC handle weak references correctly? - What about various (undoubtedly little used) areas where GC interacts with the interpreter: port de-allocation, guardians, etc. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
automake incantation?
How do I invoke automake for GUILE? v1.10.1 [EMAIL PROTECTED] guile-x]$ automake --add-missing configure.in:83: installing `build-aux/compile' configure.in:67: installing `build-aux/config.guess' configure.in:67: installing `build-aux/config.sub' configure.in:44: installing `build-aux/install-sh' configure.in:44: installing `build-aux/missing' doc/ref/Makefile.am:26: installing `build-aux/mdate-sh' doc/ref/Makefile.am:26: installing `build-aux/texinfo.tex' emacs/Makefile.am:24: installing `build-aux/elisp-comp' lib/Makefile.am:33: @LTALLOCA@ used but `LTALLOCA' is undefined lib/Makefile.am: installing `build-aux/depcomp' configure.in:51: required file `config.h.in' not found configure.in:86: required file `build-aux/ltmain.sh' not found [EMAIL PROTECTED] guile-x]$ echo $? 1 -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: RFD: drop the GH interface.
Ping? Han-Wen Nienhuys escreveu: > Ludovic Courtès escreveu: >>>> Yes, fine by me, but no rush. ;-) >>> Do you mean anything specific by "no rush" here? >> I just meant I'm not gonna do it Right Now but that's fine by me if >> somebody else does. > > Please see dev/hanwen on savannah. > -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: RFD: drop the GH interface.
Ludovic Courtès escreveu: >>> Yes, fine by me, but no rush. ;-) >> Do you mean anything specific by "no rush" here? > > I just meant I'm not gonna do it Right Now but that's fine by me if > somebody else does. Please see dev/hanwen on savannah. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: RFD: drop the GH interface.
Do we have a consensus? Ludovic Courtès escreveu: >> The GH interface was marked as deprecated in >> * Explain GH deprecation & plan for scm documentation. >> >> >> Let's really drop it now. > > Why? It doesn't cost much to keep it, does it? > -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: RFD: drop the GH interface.
Ludovic Courtès escreveu: > Han-Wen Nienhuys <[EMAIL PROTECTED]> writes: > >> The GH interface was marked as deprecated in >> >> commit a0143ebc24c24198e0dfce9b80f3648feb706226 >> Author: Neil Jerram <[EMAIL PROTECTED]> >> Date: Wed Jun 20 22:08:19 2001 + >> >> * Explain GH deprecation & plan for scm documentation. >> >> >> Let's really drop it now. > > Why? It doesn't cost much to keep it, does it? It is a small cost, but it is recurring. Eliminating it once will save us work over a longer period. Also, if we don't remove interfaces that we deprecate, we shouldn't bother with deprecating them in the first place. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: advice on reducing C stack frame size?
Andy Wingo escreveu: > Hi, > > With a local patch, it seems that my C stack frames are getting large > enough to start hitting the stack overflow checks. > > (In the future this won't be a terrible problem, as you won't be > recursively calling the evaluator then the vm then the evaluator etc too > much, but while we still have a fair amount of code being interpreted, > it is important.) > > So for example, just sitting at the repl, we have: > > frame (I think). It is about 20 kilobytes!!! In contrast, a deval frame > appears to be less, but still excessive: > > #19 0x0014b076 in deval (x=0xb7f3a478, env=0xb7ee2560) at eval.i.c:358 > 358 (void) EVAL (form, env); > (gdb) > #20 0x0014e72e in scm_dapply (proc=0xb7f3a6d0, arg1=<value optimized out>, args=0xb7ee25d0) at eval.i.c:1858 > 1858 RETURN (EVALCAR (proc, args)); > (gdb) p 0x0014e72e - 0x0014b076 > $5 = 14008 > > This is with gcc 4.3.0 20080428 (Red Hat 4.3.0-8). > > My question is: what should I do about this? Wait for the runtime tuning > patches to land in master and then merge them? Assume that over time, I This looks like a bug or an oversight: 14k is about 3500 SCM values; we surely don't have that many local variables, so it looks as if there might be some macro that expands into a local array. I'd have a look at the addresses of the different local variables to see where all that memory is going. Also, look at the preprocessed source and scan for array variables. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
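The "about 3500 SCM values" estimate is simply the 14008-byte figure from the gdb session divided by the word size. A minimal sketch of that arithmetic follows, assuming a 32-bit build where one SCM value occupies 4 bytes (an assumption suggested by the 32-bit addresses in the backtrace, not stated in the thread). The helper names `byte_distance` and `frame_slots` are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Absolute distance in bytes between two addresses, as in gdb's
   "p 0x0014e72e - 0x0014b076". */
size_t byte_distance(uintptr_t a, uintptr_t b)
{
  return (size_t) (a > b ? a - b : b - a);
}

/* How many word-sized slots (for example SCM values) fit into a span of
   span_bytes, given the word size of the build. */
size_t frame_slots(size_t span_bytes, size_t word_size)
{
  return span_bytes / word_size;
}
```

For the numbers in the message, the distance is 14008 bytes and `frame_slots(14008, 4)` gives 3502, which is where the rounded "3500" comes from. The same two helpers can be applied to the addresses of individual locals printed from gdb to see which variable accounts for most of a frame.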
RFD: drop the GH interface.
The GH interface was marked as deprecated in commit a0143ebc24c24198e0dfce9b80f3648feb706226 Author: Neil Jerram <[EMAIL PROTECTED]> Date: Wed Jun 20 22:08:19 2001 + * Explain GH deprecation & plan for scm documentation. Let's really drop it now. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: [PATCH] Revise GC asserts.
Ludovic Courtès escreveu: >> * libguile/private-gc.h (nil): introduce scm_i_last_marked_cell_count, >> as a private mechanism for maintaining cell counts. Previous >> versions incremented scm_cells_allocated in an inlined function, so >> loading dynamic objects of older GUILEs would break invariants. > > OTOH, if we are to change the way `scm_cells_allocated' is used and > don't want older code to interfere with that, it's safe to break the > ABI here (we're on `master' after all). If this is our stance, this looks like a good opportunity to make all these variables private, and have people use gc-stats to get the data. Providing direct access to variables is problematic from an API stability point of view. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: [PATCH] Revise GC asserts.
Ludovic Courtès escreveu: >> * libguile/gc.c (scm_i_gc): Change assert into deprecation warning. > > Why? It's not a deprecation but really an invariant, right? Yes, but it probably does not warrant crashing the program; memory allocation sizes will just be a bit off as a result. > Thus I would vote in favor of making `scm_cells_allocated' internal > (which requires that no public macro or inline function refer to it) or > renaming it, e.g., to `scm_i_cells_allocated'. Let's just remove the variable, since scm_i_last_marked_cell_count is a more exact name. > BTW, can you add a one-line summary to the log, as is done on `master', > `vm', etc.? The custom is (see git-format-patch) that the subject line is the summary. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
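The idea of downgrading a bookkeeping invariant from a hard assert to a warning can be sketched as follows. This is a simplified illustration, not the code from the patch: `check_marked_count` and `warned_once` are hypothetical names, and plain `fprintf` stands in for Guile's own warning machinery.

```c
#include <stdio.h>

static int warned_once = 0;  /* so that a broken count does not flood stderr */

/* Compare the count remembered from the last mark phase with the count
   seen now.  Returns 1 if they agree, 0 otherwise.  A mismatch is
   reported once and execution continues, on the reasoning that the only
   consequence is slightly inaccurate allocation statistics. */
int check_marked_count(long last_marked, long currently_marked)
{
  if (last_marked == currently_marked)
    return 1;

  if (!warned_once) {
    warned_once = 1;
    fprintf(stderr,
            "warning: marked cell count changed since the last GC "
            "(%ld != %ld); were objects marked outside the mark phase?\n",
            last_marked, currently_marked);
  }
  return 0;
}
```

The trade-off is the one discussed in the thread: an assert catches the problem immediately but takes the whole program down, while a warning keeps the program running at the cost of statistics that may be a bit off.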
[PATCH] Revise GC asserts.
* libguile/gc.c (scm_i_gc): Change assert into deprecation warning. * libguile/private-gc.h (nil): introduce scm_i_last_marked_cell_count, as a private mechanism for maintaining cell counts. Previous versions incremented scm_cells_allocated in an inlined function, so loading dynamic objects of older GUILEs would break invariants. --- libguile/gc.c | 18 +- libguile/private-gc.h |5 +++-- 2 files changed, 16 insertions(+), 7 deletions(-) diff --git a/libguile/gc.c b/libguile/gc.c index f3ef585..0323f87 100644 --- a/libguile/gc.c +++ b/libguile/gc.c @@ -416,7 +416,7 @@ gc_end_stats () scm_gc_cells_allocated_acc += (double) scm_i_gc_sweep_stats.collected; - scm_gc_cells_marked_acc += (double) scm_cells_allocated; + scm_gc_cells_marked_acc += (double) scm_i_last_marked_cell_count; scm_gc_cells_marked_conservatively_acc += (double) scm_i_find_heap_calls; scm_gc_cells_swept_acc += (double) scm_i_gc_sweep_stats.swept; @@ -558,6 +558,8 @@ scm_check_deprecated_memory_return () scm_i_deprecated_memory_return = 0; } +long int scm_i_last_marked_cell_count; + /* Must be called while holding scm_i_sweep_mutex. This function is fairly long, but it touches various global @@ -603,11 +605,17 @@ scm_i_gc (const char *what) /* TODO(hanwen): figure out why the stats are off on x64_64. */ /* If this was not true, someone touched mark bits outside of the mark phase. */ - assert (scm_cells_allocated == scm_i_marked_count ()); + if (scm_i_last_marked_cell_count != scm_i_marked_count ()) +{ + static char msg[] = + "The number of marked objects changed since the last GC. 
" + "Are you marking objects outside of the mark phase?"; + scm_c_issue_deprecation_warning(msg); +} assert (scm_i_gc_sweep_stats.swept == (scm_i_master_freelist.heap_total_cells + scm_i_master_freelist2.heap_total_cells)); - assert (scm_i_gc_sweep_stats.collected + scm_cells_allocated + assert (scm_i_gc_sweep_stats.collected + scm_i_last_marked_cell_count == scm_i_gc_sweep_stats.swept); #endif /* SCM_DEBUG_CELL_ACCESSES */ @@ -617,8 +625,8 @@ scm_i_gc (const char *what) scm_mark_all (); scm_gc_mark_time_taken += (scm_c_get_internal_run_time () - t_before_gc); - scm_cells_allocated = scm_i_marked_count (); - + scm_i_last_marked_cell_count = scm_cells_allocated = scm_i_marked_count (); + /* Sweep TODO: the after_sweep hook should probably be moved to just before diff --git a/libguile/private-gc.h b/libguile/private-gc.h index 93503ce..f5331ab 100644 --- a/libguile/private-gc.h +++ b/libguile/private-gc.h @@ -273,8 +273,9 @@ SCM_INTERNAL void scm_i_sweep_all_segments (char const *reason, SCM_INTERNAL SCM scm_i_all_segments_statistics (SCM hashtab); SCM_INTERNAL unsigned long *scm_i_segment_table_info(int *size); -extern long int scm_i_deprecated_memory_return; -extern long int scm_i_find_heap_calls; +SCM_INTERNAL long int scm_i_deprecated_memory_return; +SCM_INTERNAL long int scm_i_find_heap_calls; +SCM_INTERNAL long int scm_i_last_marked_cell_count; /* global init funcs. -- 1.5.5.1
Re: i18n broken on mingw cross compile
Ludovic Courtès escreveu: > Hi, > > Han-Wen Nienhuys <[EMAIL PROTECTED]> writes: > >> i686-mingw32-gcc -mms-bitfields -DHAVE_CONFIG_H >> -I/home/lilydev/vc/gub/target/mingw/src/guile-1.9.git -I.. >> -I/home/lilydev/vc/gub/target/mingw/src/guile-1.9.git/lib -I../lib -Wall >> -Wmissing-prototypes -g -O2 -MT libguile_i18n_v_0_la-i18n.lo -MD -MP -MF >> .deps/libguile_i18n_v_0_la-i18n.Tpo -c >> /home/lilydev/vc/gub/target/mingw/src/guile-1.9.git/libguile/i18n.c >> -DDLL_EXPORT -DPIC -o .libs/libguile_i18n_v_0_la-i18n.o >> In file included from >> /home/lilydev/vc/gub/target/mingw/src/guile-1.9.git/libguile/i18n.c:296: >> /home/lilydev/vc/gub/target/mingw/src/guile-1.9.git/libguile/locale-categories.h: >> In function 'get_current_locale_settings': >> /home/lilydev/vc/gub/target/mingw/src/guile-1.9.git/libguile/locale-categories.h:24: >> error: 'LC_MESSAGES' undeclared (first use in this function) >> /home/lilydev/vc/gub/target/mingw/src/guile-1.9.git/libguile/locale-categories.h:24: >> error: (Each undeclared identifier is reported only once >> /home/lilydev/vc/gub/target/mingw/src/guile-1.9.git/libguile/locale-categories.h:24: >> error: for each function it appears in.) > > Can you try out the attached patch? Could you publish your changes through savannah as well? This makes checking the patch a lot easier. For example, you could push into a dev/ludo branch. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: GUILE_MAX_HEAP_SIZE
Ludovic Courtès escreveu: > Hi, > > [EMAIL PROTECTED] (Ludovic Courtès) writes: > >> Han-Wen Nienhuys <[EMAIL PROTECTED]> writes: >>> Han-Wen Nienhuys escreveu: >>>> Ludovic Courtès escreveu: >>>>> +/* >>>>> + Classic MIT Hack, see e.g. http://www.tekpool.com/?cat=9 >>>>> + */ >>>>> +int scm_i_uint_bit_count(unsigned int u) >>>>> >>>>> (BTW, it'd make sense to use Gnulib's `count-one-bits' module, which is >>>>> able to use GCC's `__builtin_popcount ()'.) >>> Could you add the gnulib module? I'll do the rest. >> I asked for re-licensing: >> >> http://thread.gmane.org/gmane.comp.lib.gnulib.bugs/14349 >> >> Once that is done, all you need is to add it to `m4/gnulib-cache.m4', >> run "gnulib-tool --update", commit the module changes and additions, and >> hack the thing. You can post the patches before committing, too. ;-) > > I just did it (patch attached). Thanks. I'm confused though, commit 53f4876abcebf3f05d2a88bba3a898ddcda25a74 Merge: 69f2317... 242ebea... Author: Ludovic Courtès <[EMAIL PROTECTED]> Date: Tue Sep 9 22:03:42 2008 +0200 Merge branch 'master' into strftime-gnulib Conflicts: libguile/ChangeLog srfi/ChangeLog test-suite/ChangeLog I thought we were supposed to keep the history linear; did I miss something? -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
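For readers unfamiliar with the bit-counting trick mentioned in the quoted patch comment, here is one common formulation of the HAKMEM-style population count for 32-bit values. It is offered as a sketch only: the thread does not show the body of `scm_i_uint_bit_count`, so this is not claimed to be the code from the patch, and `uint32_bit_count` is a name chosen for this example.

```c
#include <stdint.h>

/* Count the 1 bits in a 32-bit value.

   Step 1: treat the word as 3-bit groups.  For a group with bits a, b, c
   (value 4a + 2b + c), subtracting (2a + b) and then a leaves a + b + c,
   i.e. the number of set bits in that group.
   Step 2: add neighbouring groups so every 6-bit field holds a partial sum.
   Step 3: because 64 is congruent to 1 modulo 63, taking the remainder
   modulo 63 adds up all 6-bit fields.  The total is at most 32, so it is
   not affected by the reduction. */
unsigned int uint32_bit_count(uint32_t u)
{
  uint32_t c = u - ((u >> 1) & 033333333333u)
                 - ((u >> 2) & 011111111111u);
  return ((c + (c >> 3)) & 030707070707u) % 63u;
}
```

On GCC, `__builtin_popcount ()`, which the quoted message mentions, computes the same result and can map to a single instruction where the hardware provides one; that is the motivation for preferring a Gnulib module that can use it.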
Re: Cleanup mark-during-GC debug checks.
Ludovic Courtès escreveu: > Hello! > > [EMAIL PROTECTED] writes: > >> Reviewers: hanwenn, >> >> Message: >> Hello guile devel, >> >> please go to >> >> http://codereview.appspot.com/4847 >> >> to review this patch. >> >> >> I hope you like it; thanks! > > A couple of notes: > > 1. I don't want to use a web interface to review code. Most free > software projects use email in one form or another, which I find > convenient. Having patches in-lined is optimal IMO. My experience is that a web interface (which tracks different versions of the same patch) is a lot easier when it is a major change with lots of revisions. In general, I find cutting & pasting patches into emails clumsy and error prone. In general, git is much better suited for sending patches around. > 2. I don't want to have a Google account. > > Thus, I'll comment on the patch here. > > * I'd name the macro `SCM_DEBUG_MARK_PHASE' rather, as it sounds more > idiomatic (but I'm not a native speaker). It's rather the reverse: ensuring that the non-mark phase is correct (in not having mark calls), but I couldn't think of a good name. > * Use "static const char msg[] = ...". done. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: Getting rid of maintainer mode
Ludovic Courtès escreveu: > Hello, > > Is anyone against getting rid of `AM_MAINTAINER_MODE'? > > If in doubt, see (info "(automake) maintainer-mode"). :-) Getting rid of anything that starts with AM_ has my full support. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: GC asserts and threads
Andy Wingo escreveu: > On Tue 09 Sep 2008 07:58, Han-Wen Nienhuys <[EMAIL PROTECTED]> writes: > >> Is there any code in GUILE that would create a thread (possibly >> leading to race conditions) when there is no explicit start-thread >> call in the code? The program (lilypond) does run through the >> regular GUILE boot procedure. > > Yep, when compiled with threads, guile spawns a separate thread to > handle signals. > but I am only seeing one $ guile guile> (all-threads) (#) > Andy -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: GC asserts and threads
On Tue, Sep 9, 2008 at 4:00 AM, Andy Wingo <[EMAIL PROTECTED]> wrote: > On Tue 09 Sep 2008 07:58, Han-Wen Nienhuys <[EMAIL PROTECTED]> writes: > >> Is there any code in GUILE that would create a thread (possibly >> leading to race conditions) when there is no explicit start-thread >> call in the code? The program (lilypond) does run through the >> regular GUILE boot procedure. > > Yep, when compiled with threads, guile spawns a separate thread to > handle signals. Where does that happen? -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
GC asserts and threads
Hi there, I'm debugging an issue here that causes the GC asserts to trigger, with the values compared being off by one. The problem disappears when I compile --without-threads. The program does not explicitly create threads. Is there any code in GUILE that would create a thread (possibly leading to race conditions) when there is no explicit start-thread call in the code? The program (lilypond) does run through the regular GUILE boot procedure. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: development goals
Ludovic Courtès escreveu: >> You might construe that I would like to turn Guile development into >> LilyPond development. That is not necessarily the case, but I keep >> misunderstanding what people expect in this community. I am assuming >> that developers in general are interested in a more lively and more >> rapid evolution of Guile, but time and again I see habits and policies that >> seem contrary to that goal. > > How many people fix bugs reported to `bug-guile'? Believe me, spending > time doing this makes you feel reluctant to large unmotivated changes. Good point. I've just added myself to that list. >> Then again, with all the back & forth porting of changes between 1.8 >> and head, it's difficult to tell what is in 1.8 and what is not. > > Surely you'll enjoy it: we have an old-fashioned tradition of updating > `NEWS' when changing something in a branch! :-) Well, yes, but we also have many commits that are identified as "Changes from Arch/CVS synchronization" and "merge from 1.8". I tend to look at development history with gitk. I realize these commits are from the CVS era, but it makes me lose track of what is happening where. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: development goals
Neil Jerram escreveu: > 2008/9/7 Han-Wen Nienhuys <[EMAIL PROTECTED]>: >> OK - I will admit that interpreter/GC hacking is cool, but on the >> downside, when I try to do anything, the inertia/resistance I feel in >> the community here is a big turnoff for me. > > Do you mean regarding releases (as you say more on below)? Or/also > the mailing list dynamic/responsiveness? Anything else? For a large patch (like the disputed GC patch), I got criticism of the process I used to push it and complaints that it had multiple fixes collated together. I was expecting criticism about what it changed, i.e. I was hoping for more insightful comments. It's hard to take criticism seriously if it is only about cosmetics. Also, the general idea that patches can only be committed if they are 'perfect' is not conducive to more rapid development. I was rather shocked that the idea of rolling back the change was considered a better option than adding further fixes on top of it. For example, there was the issue of x86_64 being broken by the asserts. I read complaints on the list, I got scolded by Ludovic, and a week later the code was still there. I posted a proposed patch (a single #ifdef!), to which I didn't receive a single comment. When I finally committed, I got a complaint that the commit message did not have the right formatting. Apparently: - Rolling back a patch is preferred over fixing actual problems. - Nobody has enough initiative to put a single strategic #ifdef in the code. - If someone finally does take initiative, it's only ok if it is perfect. I try not to goof up when pushing lilypond patches, but I do, up to committing code that does not compile (yes, sue me). When that happens, generally people push simple fixes - there are 24 people with push access, of whom 16 or so are active. 
The people that push generally ask for review if they are unsure of what they are doing, but when they feel certain, they do so without asking; I like that, because I don't have enough time to police the master branch. At ~15 commits per day I don't have time for that. At times, I have to request some corrections (missing regression tests, style issues), but that is rare, and when I do I get prompt responses, with the perpetrators pushing fixes on their own initiative. There also are people that I know and trust to have full competence. For example, Joe Neeman has rewritten all of the vertical spacing code in Lily, and I don't pretend to judge or really understand his code when he pushes. You might construe that I would like to turn Guile development into LilyPond development. That is not necessarily the case, but I keep misunderstanding what people expect in this community. I am assuming that developers in general are interested in a more lively and more rapid evolution of Guile, but time and again I see habits and policies that seem contrary to that goal. >>> FWIW, I would like to run my code on other schemes -- not the same goal >>> as this one, but it overlaps considerably. For me, I think that the path >>> will be implementation of some scheme standard that supports modules, >>> then migrating code over to that standard. I'm not sure about R6 though. >> Is there a Rx/SRFI standard for modules? I always thought that module >> system(s) was one of the unspecified areas. > > R6RS specifies libraries, which are similar to modules. (But probably > much cleverer for separate compilation, and vastly more complicated in > their semantics.) Hmm .. curious, I thought one of the objectives of Scheme was simplicity. >> When working with the devs here I continue to be puzzled by what the >> objectives are. For instance, we had 5 major (stable) releases in 11 >> years. 
I have always wanted this rate to go up, and have tried to argue >> for that, but with 1.10 (or 2.0, whatever it is called) being in >> preparation for 2.5 years at this moment, I don't see this changing. > > For me, almost all of my time since becoming a maintainer has been > absorbed by working on bug fixes, largely to do with slightly odd > platforms (e.g. Mac) or architectures (e.g. ia64). IMO it was > worthwhile to focus on such bug reports soon after they were reported, > because (i) the reporters are still around and interested enough to be > able to provide more info and test fixes, (ii) I believe that running > on more platforms will be good for the Guile community, and for Guile > applications. FWIW, I am shipping lilypond cross compiled on linux (ppc, x86_64, x86), darwin (ppc, x86), mingw and freebsd (x86, x86_64). The stable release is mostly working for that. > Basically, my feeling is that Guile users have been badly burned by > major release incompatibilities in the past, and I really don't want > that to happen again.
Re: [PATCH] Avoid `SCM_VALIDATE_LIST ()'
Neil Jerram escreveu: > 2008/9/2 Han-Wen Nienhuys <[EMAIL PROTECTED]>: >> If you are doing memq? for something you already know to be >> somewhere in front of the list [...] > > Why would you do that? In two senses: > > 1. I know memq gives you the tail of the list, but I usually use its > result only as a true/false value. Why would you use memq like that in > a situation where you already know that it will give you true? > > 2. It feels unusual to me to have a long list, but in which certain > kinds of values are known always to be near the front. That sounds > like something that should really be represented as two (or more) > separate lists. > > Have you observed this (the current usage of SCM_VALIDATE_LIST) as a > performance problem in practice? No, but it feels strange to me that a function whose intrinsic job does not require O(n) behavior nevertheless incurs it in all cases. However, I find Mikael's argument that it complicates programming a lot persuasive. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
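The cost being discussed can be made concrete with a toy model. The sketch below uses a plain C linked list of ints rather than SCM pairs; `struct pair`, `validate_list`, `memq_lazy`, `memq_validating` and the `cells_visited` counter are all hypothetical names for illustration. It contrasts a membership search that stops at the first hit with one that first walks the whole list to check that it is properly terminated, which is the role SCM_VALIDATE_LIST plays in the thread.

```c
#include <stddef.h>

/* Toy pair; stands in for a Scheme cons cell holding an int. */
struct pair {
  int car;
  struct pair *cdr;
};

static size_t cells_visited = 0;  /* instrumentation for the comparison */

/* Walk the entire list to confirm it ends in NULL.  Cycle detection is
   left out to keep the sketch short. */
static int validate_list(const struct pair *l)
{
  for (; l != NULL; l = l->cdr)
    cells_visited++;
  return 1;
}

/* Search that stops at the first match: cost depends on where x sits. */
const struct pair *memq_lazy(int x, const struct pair *l)
{
  for (; l != NULL; l = l->cdr) {
    cells_visited++;
    if (l->car == x)
      return l;
  }
  return NULL;
}

/* Search preceded by full validation: always at least O(n). */
const struct pair *memq_validating(int x, const struct pair *l)
{
  if (!validate_list(l))
    return NULL;
  return memq_lazy(x, l);
}
```

On a five-element list whose first element matches, `memq_lazy` touches one cell while `memq_validating` touches six. The flip side, as the reply concedes, is that the lazy version never notices an improper list when the match comes early, which is the programming complication raised in the thread.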
Re: [PATCH] Avoid `SCM_VALIDATE_LIST ()'
Neil Jerram escreveu: > Since you mention 'cleanups', I must say that I agree with Ludovic, > that it would have been preferable to post the patch for > review/discussion before committing it, since that is our (majority) > current practice. Sure there may have been a few exceptions, but only > for trivial changes, I believe, and I don't believe that this was - > overall - a trivial change. (I'm aware that it has lots of trivial > bits in it, but I don't think it's all trivial.) > > (I also think it's arguable that actually committing to a branch is > more convenient, for author and reviewers, than juggling emails - but > that then leads on to other questions, like what expectations people > can have of the "master" branch, and why we are using Git like CVS...) If we want code review as a policy, I'm fine with that, but let's do it in a structured way then. Guido van Rossum wrote a webtool for code review, which should be fairly easy to use and set up http://code.google.com/appengine/articles/rietveld.html and I heard people have recently added Git support to it. How about using this for Guile? (We need to figure out how to set it up, and contributors need to have gmail addresses, but that should not be a problem, should it?) -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: development goals (was: [PATCH] Avoid `SCM_VALIDATE_LIST ()')
Andy Wingo wrote: >> I am not using and enhancing GUILE primarily for fun. > > Then why do it? > > I'm serious. Because I take my (LilyPond) users seriously. OK - I will admit that interpreter/GC hacking is cool, but on the downside, when I try to do anything, the inertia/resistance I feel in the community here is a big turnoff for me. >> I feel using GUILE has been a big mistake -especially considering the >> amount of time I sank into it. I seriously looked into moving lily to >> mzscheme, but I lack the bandwidth to do that now. > > FWIW, I would like to run my code on other schemes -- not the same goal > as this one, but it overlaps considerably. For me, I think that the path > will be implementation of some scheme standard that supports modules, > then migrating code over to that standard. I'm not sure about R6 though. Is there a Rx/SRFI standard for modules? I always thought that module system(s) was one of the unspecified areas. >> I hope you can understand that I have a somewhat different basic >> attitude wrt GUILE development. > > I understand your frustrations, believe me. But excuse my being blunt, > but these frustrations seem to have made you bitter and confrontational. FWIW, I have always been a bit confrontational. > All attitudes are not equal. This one is not the best for Guile. I think Guile does not have any desire to be anything. Therefore "best for Guile" does not exist; it is the people that develop it and the people who use it who have desires. I am probably the only person who is in both camps currently. When working with the devs here I continue to be puzzled by what the objectives are. For instance, we had 5 major (stable) releases in 11 years. I have always wanted this rate to go up, and have tried to argue for that, but with 1.10 (or 2.0, whatever it is called) being in preparation for 2.5 years at this moment, I don't see this changing.
At such a glacial pace of development, you would imagine that backward compatibility would not be a concern - after all, who plans for compatibility over a five-year span? Yet Guile continues to support (by default!) the GH interface which was deprecated in 2002 (or was it with version 1.4 in 2000?). For LilyPond (and any other package that uses Guile, e.g. texmacs, snd), this is a problem, since we cannot realistically ship anything that requires the latest Git/CVS version to work. Hence, my improvements in LilyPond are held back by Guile's release scheme. For example, I wanted to use SCM rationals for various purposes in Lily for a long time, but had to wait for the 1.8 release before I could do this. If anything, my impression so far is that the objective is to produce a piece of perfect code (with pretty linear history graphs and GNU compliant commit messages!) without a desire to actually ship anything. So, what are the goals in this group? -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: [PATCH] Avoid `SCM_VALIDATE_LIST ()'
Neil Jerram wrote: > 2008/9/7 Han-Wen Nienhuys <[EMAIL PROTECTED]>: >> I am not using and enhancing GUILE primarily for fun. A large part of >> the lilypond architecture is written in it, and performance problems >> in GUILE often translate directly to problems in LilyPond. The reason >> I delved in the GC years ago was because lily was spending half of its >> time running GUILE's GC. >> >> I feel using GUILE has been a big mistake -especially considering the >> amount of time I sank into it. I seriously looked into moving lily to >> mzscheme, but I lack the bandwidth to do that now. >> >> I hope you can understand that I have a somewhat different basic >> attitude wrt GUILE development. > > I'm sorry to hear that. Personally, I hugely appreciate the time that > you've invested in Guile's GC. I wish I understood the GC fully > myself, so I could help more with that work. Actually, since the couple of cleanups (or as some on this list like to say: 'cleanups') I did, the GC has become a lot simpler. It's not really that difficult, you just have to take a more global view of the interpreter. The nice thing about GC is that if you break it, it tends to break all over the place in obvious ways. Usually, you can't even get to the 'guile>' prompt. Please feel free to dive in and bug me with questions. I am always very eager to help people that will take over code maintenance duties from me :-) -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: [PATCH] Avoid `SCM_VALIDATE_LIST ()'
Mikael Djurfeldt wrote: >> Yes, that's what I meant by "happily traverse circular lists". :-) > Having been part of Guile development for some time, it's sad to see > how much work is put into changing code back and forth due to > vacillating development goals. It's apparent how important it is to > have a written development policy with design decisions and > motivations. Probably a lot of that should also be put directly into > the code in the form of comments. +1 I participate in GUILE development because it is a foundation of LilyPond. As far as I am concerned, it should be fast, bug-free, standards-compliant and should not have any code that is so complex that it will bitrot when its developer takes off. That is why I am bothered by the half-working elisp support: my impression is that it is not really used, so once Neil stops doing his job, GUILE will be left with a bunch of weird code that no one knows how to deal with. All this also makes me wonder why there are no Gnucash developers here, since I recall they actually do use GUILE for a lot of stuff. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: [PATCH] Avoid `SCM_VALIDATE_LIST ()'
On Sat, Sep 6, 2008 at 7:45 PM, Neil Jerram <[EMAIL PROTECTED]> wrote: >>> Well, I remember having a flamewar with RMS about language agnosticism >>> and running emacs on GUILE about 8 years ago, and I don't think we >>> have progressed much since then. Extrapolating this pace, I think >>> it's a waste of time. >> >> 8 years?? Anyway. I just mentioned things that guile-vm can do. If they >> don't get done that means they're not important. > > I was going to write the same thing - then saw that Andy had already done so. > > People work on what interests them, in the time they have spare. I > know we've lost in the global scripting language competition, but I > still find this language implementation fun. I am not using and enhancing GUILE primarily for fun. A large part of the lilypond architecture is written in it, and performance problems in GUILE often translate directly to problems in LilyPond. The reason I delved in the GC years ago was because lily was spending half of its time running GUILE's GC. I feel using GUILE has been a big mistake -especially considering the amount of time I sank into it. I seriously looked into moving lily to mzscheme, but I lack the bandwidth to do that now. I hope you can understand that I have a somewhat different basic attitude wrt GUILE development. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: [PATCH] Avoid `SCM_VALIDATE_LIST ()'
Andy Wingo wrote: > On Sun 31 Aug 2008 17:12, Han-Wen Nienhuys <[EMAIL PROTECTED]> writes: > >> On a tangent, is anyone still seriously considering running Emacs atop GUILE? >> (It looks a bit like a travesty if we're trying to accommodate elisp while >> also trying to follow standards like SRFI-x and RxRS) > > I think it makes a *lot* of sense to compile elisp to the VM. I don't > plan on doing so myself, but if the VM gets good enough, it could be > enhanced with the instructions that elisp needs, if any, and it would be > possible to run emacs lisp code, and possibly even emacs itself, on > guile. > > Guile-VM already has a language-agnostic compiler, repl, etc. Scheme > compilation starts with a language-specific reader then translation to > GHIL, at which point the generic compilation proceeds. You could plug in > an elisp reader and translator (see > module/language/scheme/translate.scm) to GHIL, or compile directly to > GLIL. Well, I remember having a flamewar with RMS about language agnosticism and running emacs on GUILE about 8 years ago, and I don't think we have progressed much since then. Extrapolating this pace, I think it's a waste of time. > I don't know where the boundary lies regarding C primitives, though. I > think we'll eventually want to make VM-implemented functions as fast or > faster than the C ones, through a tracing JIT or something. So you could > make elisp reference different C primitives, or implement its primitives > in elisp (or scheme, or whatever), or make our C primitives do both. Actually, on a complete tangent: There is a large industry movement trying to optimize the hell out of JavaScript, since it is used in browsers. How much sense does it make to translate Scheme or LISP to JavaScript and have (for example) V8 or tracemonkey handle it? I'm not very familiar with JavaScript, but I recall it shares lots of characteristics with LISP. I am fairly certain that we will never out-optimize (for example) V8.
-- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: Updated HACKING
Ludovic Courtès wrote: > Hi, > > Han-Wen Nienhuys <[EMAIL PROTECTED]> writes: > >> Please check dev/hanwen for changes. > > The `HACKING' changes look good overall, thank you! The "complete > description in the commit message" should rather be "complete > ChangeLog-style description in the commit message (see the GNU Coding > Standards for details)". The sentence "Post your patch to > guile-devel@gnu.org" lacks a period. How about the following:

- Provide a description in the commit message, like so:

    1-line description of change

    More extensive discussion of your change. Document why you are changing things.

    * filename (function name): file specific change comments.

This 1-line summary plus more extensive discussion is the standard for git (AFAIK) and lilypond. Background info is very useful for later referral. I would like to propose to only have the * filename sections if there are file specific notes that are not covered by the general description and git's list of modified files. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: the new gc asserts in master
Ludovic Courtès wrote: > Hi, > > Han-Wen Nienhuys <[EMAIL PROTECTED]> writes: > >> Ludovic Courtès wrote: > >>> Note that what we agreed on was to provide ChangeLog-style comments in >>> the Git log entry, which this patch doesn't have. >> Can you explain to me exactly what you want and why? > > I'm suggesting that we keep using ChangeLog-style entries, as per (info > "(standards) Change Logs"). > > This is in accordance with GCS and has proved to be a good auditing > tool. Auditing? How so? -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: the new gc asserts in master
Ludovic Courtès wrote: > Han-Wen Nienhuys <[EMAIL PROTECTED]> writes: > >> Pushed (without changelog entry). > > Note that what we agreed on was to provide ChangeLog-style comments in > the Git log entry, which this patch doesn't have. Can you explain to me exactly what you want and why? I hope you're not suggesting that we add filenames, because git already tracks those for us. In many cases (see e.g. git.git), people feel free to put much longer messages (entire emails) into the commit message. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: Race condition in threading code?
Han-Wen Nienhuys escreveu: >>> ERROR: srfi-18.test: thread-start!: >>> thread activates only after start >>>- arguments: ((syntax-error "memoization" >>> "In file ~S, line ~S: ~A ~S in expression ~S." >>> ("/home/lilydev/vc/guile/srfi/srfi-18.scm" 135 >>>"Bad binding" ct >>> (let (ct (current-thread)) >>> [EMAIL PROTECTED] (or (hashq-ref >>> thread-exception-handlers ct) >>> (hashq-set! thread-exception-handlers ct >>> (list initial-handler) #f)) >> I'm seeing this as well, but it's a [EMAIL PROTECTED]' here (single-binding >> `let's >> are memoized as [EMAIL PROTECTED]'): >> >> ((syntax-error "memoization" >> "In file ~S, line ~S: ~A ~S in expression ~S." >> ("/home/ludo/src/guile/srfi/srfi-18.scm" 138 >> "Bad binding" >> ct >> ([EMAIL PROTECTED] (ct (#> #>)) >> ([EMAIL PROTECTED] (#> #> >> #> #> [EMAIL PROTECTED]) >> (#> hashq-set!>> #> [EMAIL >> PROTECTED] (#> >> #>)) >> ))) >> #f)) >> >> It can be reproduced, but very infrequently, with this program: >> >> (use-modules (ice-9 threads)) >> >> (define (foo x y) >> (let ((z (+ x y))) >> (let ((a (+ z 1))) >> (let ((b (- a 2))) >> (let ((c (* b 3))) >> c) >> >> (define (entry) >> (foo 1 2)) >> >> (for-each (lambda (i) (make-thread entry)) >> (iota 123)) >> >> My explanation is that the `let*' memoizer, aka. `scm_m_letstar ()', is >> not thread-safe; it's clearly not atomic, and it's of course not >> protected by a mutex or so. > > Is that the only one? > > SCM > scm_m_let (SCM expr, SCM env) > ... > /* plain let */ > SCM rvariables; > SCM inits; > transform_bindings (bindings, expr, &rvariables, &inits); > > { > const SCM new_body = m_body (SCM_IM_LET, SCM_CDR (cdr_expr)); > const SCM new_tail = scm_cons2 (rvariables, inits, new_body); > SCM_SETCAR (expr, SCM_IM_LET); > // !!! > SCM_SETCDR (expr, new_tail); > > What happens if another thread tries to evaluate expr at the place marked > !!! ? > > At the very least, we should have an atomic SCM_SETCELL() which overwrites > car and > cdr atomically. Anyone? 
Does anyone still understand how the evaluator works? (if not, let's move to the VM earlier than later.) -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Updated HACKING
Please check dev/hanwen for changes. I've also looked at renaming ChangeLogs, but I got a scare when I saw how many we have (I forgot GUILE did per-directory ones). Seeing so many deprecated files makes me want to - delete them, or - put them in a separate subdirectory where they will be out of sight. Comments? -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: the new gc asserts in master
Ludovic Courtès wrote: >>> I'm still in favor of "git revert" since the log message makes it clear >>> which patch was reverted and why. "We" can then take our time and work >>> out a proper fix, and finally re-merge the patch plus its fix. >>> Furthermore, in the eventuality where none of us eventually finds a fix, >>> `master' is left in the previous state, which is better IMO. >> 'master' in its previous state grows the heap to 600M doing the 1000-fold >> version of srfi-18 test I posted. I think it's not a good solution. >> >> Commenting out the assert for x86-64 should yield better behavior. > > Alright, then please go ahead. Pushed (without changelog entry). -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: [PATCH] Avoid `SCM_VALIDATE_LIST ()'
Neil Jerram wrote: > 2008/9/1 Ludovic Courtès <[EMAIL PROTECTED]>: >> Hello, >> >> This is a followup to this discussion: >> >> http://thread.gmane.org/gmane.lisp.guile.devel/7194 >> >> The attached patch changes several list-related functions > > reverse!, memq, memv, member, filter, filter! > SRFI-1: concatenate, concatenate!, member, remove, remove! > >> so that they >> don't validate their input with `SCM_VALIDATE_LIST ()' since it's O(n). > > I'm afraid I don't get your rationale, because all these functions are > O(n) anyway. (For reverse*, filter* and concatenate* I believe this > is obvious. For mem* and remove*, they should always be O(n) in > practice because it would be stupid to design a list structure where > the element being looked for or removed was not equally likely to be > anywhere along the list.) All these functions are O(n), but by prefixing the list check you're doubling the running time (e.g. in the case of reverse!). If you are doing memq? for something you already know to be somewhere in front of the list, the list? check will slow it down much worse. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: [PATCH] Avoid `SCM_VALIDATE_LIST ()'
Ludovic Courtès wrote: >> On a tangent, is anyone still seriously considering running Emacs atop GUILE? > > There's Ken Raeburn's attempt at http://www.mit.edu/~raeburn/guilemacs/ , > and there's also the Elisp support that's under `lang'. I don't think > the former is really maintained. The latter isn't actively maintained > either but I think it's in a pretty good shape. Neil? What is the intended use case of running Elisp in GUILE? Is anyone using it for anything? >> - Copyright (C) 2000, 2001, 2006 Free Software Foundation, Inc. >> + Copyright (C) 2000, 2001, 2006, 2008 Free Software Foundation, Inc. >> >> Can we do this in one fell swoop, adding 2008 to all files? > > I wouldn't do that. I think updating the copyright year *when* a change > is made is better: it allows people to see at a glance whether a file > has been changed at all recently I think that git log FILE is the reliable and precise way to check that. The headers are so much boilerplate that I pretty much ignore all of them. > and avoids pointless commits. We could add 2008 to all files in a single commit, and avoid polluting diffs with header blah blah for an entire year. > However, the GNU maintainer's guide (see (info "(maintain) Copyright > Notices")) prefers the other way: > > To update the list of year numbers, add each year in which you have > made nontrivial changes to the package. (Here we assume you're using a > publicly accessible revision control server, so that every revision > installed is also immediately and automatically published.) When you > add the new year, it is not required to keep track of which files have > seen significant changes in the new year and which have not. It is > recommended and simpler to add the new year to all files in the > package, and be done with it for the rest of the year. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: Race condition in threading code?
Ludovic Courtès escreveu: > Hello, > > Andy Wingo <[EMAIL PROTECTED]> writes: > >> ERROR: srfi-18.test: thread-start!: >> thread activates only after start >>- arguments: ((syntax-error "memoization" >> "In file ~S, line ~S: ~A ~S in expression ~S." >> ("/home/lilydev/vc/guile/srfi/srfi-18.scm" 135 >>"Bad binding" ct >> (let (ct (current-thread)) >> [EMAIL PROTECTED] (or (hashq-ref >> thread-exception-handlers ct) >> (hashq-set! thread-exception-handlers ct >> (list initial-handler) #f)) > > I'm seeing this as well, but it's a [EMAIL PROTECTED]' here (single-binding > `let's > are memoized as [EMAIL PROTECTED]'): > > ((syntax-error "memoization" > "In file ~S, line ~S: ~A ~S in expression ~S." > ("/home/ludo/src/guile/srfi/srfi-18.scm" 138 > "Bad binding" > ct > ([EMAIL PROTECTED] (ct (# #>)) > ([EMAIL PROTECTED] (# #> > # #> [EMAIL PROTECTED]) > (# hashq-set!>> #> [EMAIL > PROTECTED] (#> > #>)) > ))) > #f)) > > It can be reproduced, but very infrequently, with this program: > > (use-modules (ice-9 threads)) > > (define (foo x y) > (let ((z (+ x y))) > (let ((a (+ z 1))) > (let ((b (- a 2))) > (let ((c (* b 3))) > c) > > (define (entry) > (foo 1 2)) > > (for-each (lambda (i) (make-thread entry)) > (iota 123)) > > My explanation is that the `let*' memoizer, aka. `scm_m_letstar ()', is > not thread-safe; it's clearly not atomic, and it's of course not > protected by a mutex or so. Is that the only one? SCM scm_m_let (SCM expr, SCM env) ... /* plain let */ SCM rvariables; SCM inits; transform_bindings (bindings, expr, &rvariables, &inits); { const SCM new_body = m_body (SCM_IM_LET, SCM_CDR (cdr_expr)); const SCM new_tail = scm_cons2 (rvariables, inits, new_body); SCM_SETCAR (expr, SCM_IM_LET); // !!! SCM_SETCDR (expr, new_tail); What happens if another thread tries to evaluate expr at the place marked !!! ? At the very least, we should have an atomic SCM_SETCELL() which overwrites car and cdr atomically. 
-- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
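The window marked !!! above exists because the car and the cdr are written in two separate stores. As a rough illustration only, here is a minimal sketch of one way to close such a window: build the complete replacement pair first, then publish it with a single atomic pointer store. Everything in it (`pair`, `expr`, `make_pair`, `publish`, `snapshot`, the integer car/cdr fields) is a hypothetical stand-in, not Guile's cell layout or API, and it relies on an extra level of indirection that the real evaluator does not have.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical toy pair; NOT Guile's real cell representation. */
typedef struct pair { int car; int cdr; } pair;

/* An expression refers to its pair through one atomic pointer. */
typedef struct expr { _Atomic(pair *) cell; } expr;

pair *make_pair(int car, int cdr) {
    pair *p = malloc(sizeof *p);
    assert(p != NULL);
    p->car = car;
    p->cdr = cdr;
    return p;
}

/* Build the whole replacement, then make it visible in one store.
   Readers see either the old pair or the new one, never a mix.
   The old pair is deliberately not freed here; a collector (or some
   other reclamation scheme) would have to take care of it. */
void publish(expr *e, int new_car, int new_cdr) {
    pair *fresh = make_pair(new_car, new_cdr);
    atomic_store_explicit(&e->cell, fresh, memory_order_release);
}

/* Readers load the pointer once and copy both fields from that pair. */
pair snapshot(expr *e) {
    pair *p = atomic_load_explicit(&e->cell, memory_order_acquire);
    return *p;
}
```

A reader that calls `snapshot` never observes a new car paired with an old cdr. The price is one allocation per memoization and an extra pointer hop per read, which is the same trade-off raised elsewhere in this thread about duplicating the input instead of modifying it in place.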
Re: [PATCH] Avoid `SCM_VALIDATE_LIST ()'
Ludovic Courtès wrote: > Hello, > > This is a followup to this discussion: > > http://thread.gmane.org/gmane.lisp.guile.devel/7194 > > The attached patch changes several list-related functions so that they > don't validate their input with `SCM_VALIDATE_LIST ()' since it's O(n). > > A side-effect (besides performance improvements) is that all these > functions will now happily traverse circular lists, and will silently > deal with dotted lists. This is acceptable behavior IMO. > > Nevertheless, the second patch below implements the "tortoise and the > hare" in `list-copy' so that it detects circular lists; it seems > worthwhile to check that here since `list-copy' would otherwise exhaust > memory. > > (Note that SRFI-1's `list-copy' *does* accept improper lists, including > circular lists, although SRFI-1 does not explicitly mention that it > should handle improper lists.) > > Also, in some cases, the `wrong-type-arg' message is different (but the > exception key is the same). > > OK to apply to both branches? - return scm_c_memq (x, lst); + for (; !SCM_NULL_OR_NIL_P (lst); lst = SCM_CDR (lst)) +{ + SCM_VALIDATE_CONS (2, lst); Looks cleaner to use SCM_CONS_P (or whatever it is called) as loop guard, so it is obviously correct, and crash if the lst is not properly terminated after the loop (- perhaps only if we're not compiling in optimizing mode). On a tangent, is anyone still seriously considering running Emacs atop GUILE? (It looks a bit like a travesty if we're trying to accommodate elisp while also trying to follow standards like SRFI-x and RxRS) - SCM from_here; + SCM from_here, hare; You could do the init to lst right here. IMO it's neater not to have uninitialized memory locations during program execution. list.test --- tests guile's lists -*- scheme -*- - Copyright (C) 2000, 2001, 2006 Free Software Foundation, Inc. + Copyright (C) 2000, 2001, 2006, 2008 Free Software Foundation, Inc. Can we do this in one fell swoop, adding 2008 to all files?
After all, publishing a commit is a release in some sense. Then, we don't have to worry about the file headers anymore. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
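For readers unfamiliar with the "tortoise and the hare" check mentioned in the patch description, the following is a minimal sketch on a plain C linked list. It is not the code from the patch: `node`, `list_is_circular` and `list_length` are names made up for this illustration, and the real `list-copy` works on SCM pairs rather than on a struct like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical singly linked list node, standing in for a pair. */
typedef struct node { int value; struct node *next; } node;

/* The hare advances two nodes per step, the tortoise one. If the list
   loops back on itself, the hare eventually lands on the tortoise;
   if the list is properly terminated, the hare reaches NULL first. */
bool list_is_circular(const node *head) {
    const node *tortoise = head;
    const node *hare = head;
    while (hare != NULL && hare->next != NULL) {
        tortoise = tortoise->next;
        hare = hare->next->next;
        if (hare == tortoise)
            return true;
    }
    return false;
}

/* Example consumer: refuse to walk a circular list, since a plain
   traversal of one would never terminate. */
size_t list_length(const node *head) {
    assert(!list_is_circular(head));
    size_t n = 0;
    for (const node *p = head; p != NULL; p = p->next)
        n++;
    return n;
}
```

The check stays O(n) and needs no extra storage. In a function like `list-copy` the two pointers can be advanced inside the copying loop itself, so no separate validation pass is required.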
Re: Race condition in threading code?
Ludovic Courtès escreveu: > Hello, > > Andy Wingo <[EMAIL PROTECTED]> writes: > >> ERROR: srfi-18.test: thread-start!: >> thread activates only after start >>- arguments: ((syntax-error "memoization" >> "In file ~S, line ~S: ~A ~S in expression ~S." >> ("/home/lilydev/vc/guile/srfi/srfi-18.scm" 135 >>"Bad binding" ct >> (let (ct (current-thread)) >> [EMAIL PROTECTED] (or (hashq-ref >> thread-exception-handlers ct) >> (hashq-set! thread-exception-handlers ct >> (list initial-handler) #f)) > > I'm seeing this as well, but it's a [EMAIL PROTECTED]' here (single-binding > `let's > are memoized as [EMAIL PROTECTED]'): > > ((syntax-error "memoization" > "In file ~S, line ~S: ~A ~S in expression ~S." > ("/home/ludo/src/guile/srfi/srfi-18.scm" 138 > "Bad binding" > ct > ([EMAIL PROTECTED] (ct (# #>)) > ([EMAIL PROTECTED] (# #> > # #> [EMAIL PROTECTED]) > (# hashq-set!>> #> [EMAIL > PROTECTED] (#> > #>)) > ))) > #f)) > > It can be reproduced, but very infrequently, with this program: > > (use-modules (ice-9 threads)) > > (define (foo x y) > (let ((z (+ x y))) > (let ((a (+ z 1))) > (let ((b (- a 2))) > (let ((c (* b 3))) > c) > > (define (entry) > (foo 1 2)) > > (for-each (lambda (i) (make-thread entry)) > (iota 123)) > > My explanation is that the `let*' memoizer, aka. `scm_m_letstar ()', is > not thread-safe; it's clearly not atomic, and it's of course not > protected by a mutex or so. > > I can't think of any simple fix. `scm_m_letstar ()' could be made > atomic by having it duplicate the input list instead of modifying it > directly; it could then atomically update the input. However, > allocating cells during memoization wouldn't be a good idea > performance-wise. I don't understand: memoization is only supposed to happen once for each piece of code, right? So, the cost of it is not that interesting? 
I remember seeing a very scary looking explanation in eval.c about the evaluator being unlocked but still thread-safe since the result of memoizing was supposed to be confluent (ie. duplicate runs would yield independent results.) /* The Lookup Car Race - by Eva Luator This was added by Marius Vollmer, but at the time, GUILE did not support real posix threads, so any problem may not have manifested itself before. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: Compiling GUILE
Ludovic Courtès wrote: > Hello, > > "Han-Wen Nienhuys" <[EMAIL PROTECTED]> writes: > >> autoreconf: running: aclocal --force >> autoreconf: configure.in: not using Libtool >> autoreconf: running: /home/hanwen/usr/pkg/ac/bin/autoconf --force >> configure.in:20: error: possibly undefined macro: AC_LIBTOOL_WIN32_DLL >> If this token and others are legitimate, please use m4_pattern_allow. >> See the Autoconf documentation. >> configure.in:21: error: possibly undefined macro: AC_PROG_LIBTOOL >> autoreconf: /home/hanwen/usr/pkg/ac/bin/autoconf failed with exit status: 1 > > Looks like the machine is missing Libtool or at least that its M4 macros > cannot be found. You need Libtool 1.5.x, *not* Libtool 2.2. Now that we're busy dropping cruft, it would be really nice if we could drop libtool too. (major sigh). -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: Compiling GUILE
Ludovic Courtès wrote: > > Looks like the machine is missing Libtool or at least that its M4 macros > cannot be found. You need Libtool 1.5.x, *not* Libtool 2.2. > >> How do I get GUILE to actually compile now? > > How did you get it to compile last week? :-) I used my 32-bit laptop, but that had some people complaining :) -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: Race condition in threading code?
Julian Graham wrote: >> Would this also explain the 'corruption' in the evaluator we have been >> seeing ("bad bindings at .. ")? > > > I don't think so; in fact, I just got one of those errors while > looping your test code. I think it's something else, unrelated to the > patch I just sent in. I'll keep looking at it. Have you had any luck > narrowing things down with Helgrind? No, there is no consistent lock ordering in Guile at all, so Helgrind gives 1000s of errors. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: Race condition in threading code?
Julian Graham escreveu: > Okay, I think I know what the problem is: Part of the SRFI-18 thread > start / creation process involves contention for a mutex, and there's > a bug in fat_mutex_lock code that causes the locking thread to > sometimes miss an unlocking thread's notification that a mutex is > available. So it's actually a mutex bug -- specifically, in the loop > code in fat_mutex_lock that ends with the following snippet: > > ... > scm_i_pthread_mutex_unlock (&m->lock); > SCM_TICK; > scm_i_scm_pthread_mutex_lock (&m->lock); > } > block_self (m->waiting, mutex, &m->lock, timeout); > > ...which means that if the loop is entered while the mutex is still > locked but the owner unlocks it after the locking thread releases the > administrative lock to run the tick, the locking thread will sleep > forever because it doesn't re-check the state of the mutex. I've made > a small change (blocking before doing the tick instead of after) that > seems to resolve the issue (so far no lock-ups using Han-Wen's x.test > for a couple of hours). There's a patch attached. > > (Sorry, should have noticed this earlier; the problem existed before > the changes I introduced to support SRFI-18...) Would this also explain the 'corruption' in the evaluator we have been seeing ("bad bindings at .. ")? -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
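The missed notification described above is an instance of a general rule: the state being waited on has to be re-checked under the administrative lock immediately before blocking. A minimal sketch of that pattern with plain POSIX primitives follows. It is not `fat_mutex_lock`; `toy_mutex`, `toy_lock` and `toy_unlock` are invented names, and Guile's `block_self` machinery is replaced here by a condition variable purely for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical toy lock; NOT Guile's fat mutex. */
typedef struct {
    pthread_mutex_t admin;     /* administrative lock protecting `locked` */
    pthread_cond_t  released;  /* signalled whenever `locked` becomes false */
    bool            locked;
} toy_mutex;

void toy_lock(toy_mutex *m) {
    pthread_mutex_lock(&m->admin);
    /* The predicate is tested while holding `admin`, and
       pthread_cond_wait releases `admin` and starts waiting as one
       atomic step, so an unlock cannot slip in between the test and
       the sleep. The loop re-tests after every wakeup. */
    while (m->locked)
        pthread_cond_wait(&m->released, &m->admin);
    m->locked = true;
    pthread_mutex_unlock(&m->admin);
}

void toy_unlock(toy_mutex *m) {
    pthread_mutex_lock(&m->admin);
    assert(m->locked);
    m->locked = false;
    pthread_cond_signal(&m->released);
    pthread_mutex_unlock(&m->admin);
}
```

Any work that requires dropping the administrative lock, such as running a tick, would have to happen before the predicate test rather than between the test and the wait, which appears to be the spirit of the reordering described in the message.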
Compiling GUILE
Hi, I'm trying to compile GUILE on an x86_64 machine, to figure out the GC problems:

autoreconf: Entering directory `.'
autoreconf: configure.in: not using Gettext
autoreconf: running: aclocal --force -I guile-config -I m4
configure.in:815: warning: macro `AM_GNU_GETTEXT' not found in library
autoreconf: configure.in: tracing
autoreconf: configure.in: adding subdirectory guile-readline to autoreconf
autoreconf: Entering directory `guile-readline'
autoreconf: running: aclocal --force
autoreconf: configure.in: not using Libtool
autoreconf: running: /home/hanwen/usr/pkg/ac/bin/autoconf --force
configure.in:20: error: possibly undefined macro: AC_LIBTOOL_WIN32_DLL
      If this token and others are legitimate, please use m4_pattern_allow.
      See the Autoconf documentation.
configure.in:21: error: possibly undefined macro: AC_PROG_LIBTOOL
autoreconf: /home/hanwen/usr/pkg/ac/bin/autoconf failed with exit status: 1

To satisfy all the dependencies, I installed m4 (1.4.4), autoconf (2.60) and automake (1.10.1) locally. (sigh.) How do I get GUILE to actually compile now? Also, can someone tip me off how to create a 64-bit binary on a mixed 32/64 machine? I assume the default will generate a 32-bit binary. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: RFD: please drop ChangeLog updates
Ludovic Courtès wrote: > Hey! > > Han-Wen Nienhuys <[EMAIL PROTECTED]> writes: > >> Ludovic Courtès wrote: > >>> That said, the ideal would be something like `add-change-log-entry' that >>> operates on Git logs instead of ChangeLogs, but there doesn't seem to be >>> anything like this. DVC is said to support things like that, but it >>> doesn't seem to be well documented. >> Try magit (by Marius Vollmer, our previous overlord). It lets you create >> commits by marking individual patch hunks. > > I wasn't aware of that, but I just gave it a try and it seems to rock! > The doc is also nice. Good to hear from Marius! ;-) > > Then I guess I'm happy to abolish ChangeLogs. Two questions remain: > > 1. Should we remove ChangeLog files from the repo? > 2. Should we generate ChangeLogs for releases? > > I'd say "yes" to (1) and "no" to (2). That would be a departure from > the GNU Standards, but perhaps it's a sign of their age. I'd say no to (1). The information in ChangeLog and commit message can diverge, so deleting them may remove information. I recommend keeping them around, perhaps in a directory with historic files, or marking them as THESE CHANGELOGS ARE NO LONGER UPDATED in screaming letters at the top. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: RFD: please drop ChangeLog updates
Ludovic Courtès wrote: > Hi, > > Sergey Poznyakoff <[EMAIL PROTECTED]> writes: > >> <[EMAIL PROTECTED]> wrote: >> >>> Guile is distributed as a tarball, not a git repo. Does it make sense to >>> create the ChangeLog from the git log at make dist time? >>> >> FWIW, there is a gnulib module for that purpose: gitlog-to-changelog. > > Emacs' VC (since 22.2) can also do that (see (info "(emacs) Types of Log > File")). > > That said, the ideal would be something like `add-change-log-entry' that > operates on Git logs instead of ChangeLogs, but there doesn't seem to be > anything like this. DVC is said to support things like that, but it > doesn't seem to be well documented. Try magit (by Marius Vollmer, our previous overlord). It lets you create commits by marking individual patch hunks. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: RFD: please drop ChangeLog updates
[EMAIL PROTECTED] wrote: >> * Much more detailed and inherently correct information can be gotten from >> >> git log -- libguile/ >> >> git log -- test-suite/ >> >> etc. >> >> * The ChangeLog duplicates the git log information if done correctly. Hence >> it requires double work for the committer. > > Guile is distributed as a tarball, not a git repo. Does it make > sense to create the ChangeLog from the git log at make dist time? I'm not against that, but anyone who is interested in the history might as well grab the Git repository. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
RFD: please drop ChangeLog updates
Reasons: * Much more detailed and inherently correct information can be gotten from git log -- libguile/ git log -- test-suite/ etc. * The ChangeLog duplicates the git log information if done correctly. Hence it requires double work for the committer. * Since updates to the ChangeLog always happen at the top, they virtually always conflict on pulls and cherry-picks. This makes it impossible to use the power of git. For example, rebase is the standard git approach to creating linear history of changes. This is apparently something the GUILE devs think is important, but the changes to ChangeLog ensure that every cherry pick will need manual conflict resolution. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: the new gc asserts in master
Han-Wen Nienhuys wrote:
>>> The use of scm_gc_mark() outside of GC is fundamentally broken, since it
>>> creates race conditions in the presence of threads.
>> I was not aware that this was the case.
>>
>> My impression was that the mark phase is global; it requires all threads
>> that were in guile mode to go dormant, and those that were not in guile
>> mode cannot enter guile mode until the mark is complete.
>
> Yes, the mark phase is global, but the thread locking is done in
> scm_i_gc; once the marking starts, there is only one thread. Since
> scm_gc_mark is called from the smob mark functions, it does not force
> other threads to go dormant. It could, but I suspect the lock would
> be a contention point.

It would be very cool to have thread-safe marking for a different reason: marking is the expensive step in GC, so if we can do it in N threads concurrently (on an SMP machine), we can speed it up by almost a factor of N. To do it properly, you could do the bitvector marking with a compare & swap instruction.

-- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
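To make the compare & swap idea above concrete, here is a minimal sketch using C11 atomics. It is not Guile code: the name toy_mark_bit and the layout of the bitvector are assumptions made for illustration only. Each caller tries to set one mark bit, and exactly one of several racing callers is told that it was the one who set it.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define WORD_BITS (sizeof (unsigned long) * CHAR_BIT)

/* Hypothetical helper, not part of libguile: atomically set mark bit
   IDX in the bitvector BVEC.  Returns true if this caller set the bit,
   false if it was already set (possibly by another thread).  */
static bool
toy_mark_bit (_Atomic unsigned long *bvec, size_t idx)
{
  assert (bvec != NULL);

  _Atomic unsigned long *word = &bvec[idx / WORD_BITS];
  unsigned long mask = 1UL << (idx % WORD_BITS);
  unsigned long old = atomic_load (word);

  while (!(old & mask))
    {
      /* On failure, OLD is refreshed with the current word, so the loop
         condition re-checks whether somebody else set the bit.  */
      if (atomic_compare_exchange_weak (word, &old, old | mask))
        return true;
    }
  return false;
}
```

A marker thread would only descend into the children of an object when toy_mark_bit returns true, so every object is traversed once even if several threads reach it at the same time.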
Re: the new gc asserts in master
Andy Wingo escreveu: > Hi again! > > On Wed 27 Aug 2008 12:00, Andy Wingo <[EMAIL PROTECTED]> writes: > >> On Wed 27 Aug 2008 07:00, Han-Wen Nienhuys <[EMAIL PROTECTED]> writes: >> >>>>>> http://thread.gmane.org/gmane.lisp.guile.user/6372 >>> I think reference counting is the correct solution for this, as far as >>> I understand the problem from the quoted message. >> I don't think so; the use case is that (1) we don't want to prevent the >> C object from being freed, so we don't want to hold a reference on the C >> object; but (2) we do want to know when it is freed, so we can release >> our cache; but (3) we want to get the scheme object back if the object >> has not in fact been swept. As far as I understand, you are splitting the decision whether an object is live (reachable) between C and SCM - it's reachable when there is an SCM reference, but it is also reachable when there is no SCM reference anymore, but it is still reachable by looking at a C table and returning the SCM reference inside the C struct. You can never get a consistent model if you try to split it up like that. Either the memory management is entirely in SCM (using (weak) hash tables in SCM and GC which decides whether an object is reachable) or the memory management is entirely in C (eg. using ref counting). If you try to mix both, you'll get a frankenstein with cloudy semantics and edgy corner cases. > I don't think this is exactly right. I was discussing this on IRC with > Ludovic and he made me come out with a better characterization of the > problem. > > Consider a C object, `C'. We wrap it in scheme with a SCM object, `S'. S > has a reference on C, using reference counting. C has some kind of API > to associate S with it: set_ptr() and get_ptr(). Cool. > > So what happens if we get C back from a callback at some time in the > future? Well we call get_ptr(C) and return that if it's non-null. > Otherwise we make a new smob and call set_ptr (C, S) and then return S. 
>
> So what happens if the scheme object becomes collectable? Well S has a
> free function which will unref the C object and set_ptr (C, NULL). This
> is also OK.
>
> But what if it goes like this:
>
>   S becomes collectable in theory
>
>   mark phase: S is indeed marked as collectable
>
>   C is returned from a callback: get_ptr() return S
>
>   at some later time the card containing S is swept; S's free function
>   is run, and S is marked as a free cell

Here you are dividing the responsibility for liveness between SCM and C. SCM decides the object is dead, but you hold on to it in C, and then insert the dead reference back into SCM, leading to crappage.

I think it is wrong to equate (destructor has not run yet) with (object is reachable). One of the important points of GC (RAII proponents will say it's a drawback) vs. C++ smart pointers is that going out of scope and running the destructor that goes with it can happen at different points in time. Between these points, the objects are in a sort of limbo: you can't return them to the 'live' part of the program, but the destructor hasn't run yet.

-- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: the new gc asserts in master
Ludovic Courtès escreveu: > Sorry if I missed something but my understanding was that you were > referring to a GC fix, which it isn't. Re-reading the thread, it seems > I indeed missed the point, and I apologize. I hope you do realize that every time you miss the point and send out a reply you have me trembling with anger, and contemplating whether I should stop contributing to GUILE right away or after it's patched up enough for LilyPond? After some years of doing lilypond, I have become wise enough to delete the expletives before I send an email, but please think and reread before you send out something. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: the new gc asserts in master
Ludovic Courtès escreveu: >> +#if (SCM_DEBUG_CELL_ACCESSES == 0 && SCM_SIZEOF_UNSIGNED_LONG == 4) > > x86-64 is not the only arch with 4-byte long long integers. I'm not pretending it is the end-all fix. - we (I) need to understand what is happening and make the numbers match up exactly. This is just a kludge. > I'm still in favor of "git revert" since the log message makes it clear > which patch was reverted and why. "We" can then take our time and work > out a proper fix, and finally re-merge the patch plus its fix. > Furthermore, in the eventuality where none of us eventually finds a fix, > `master' is left in the previous state, which is better IMO. 'master' in its previous states grows the heap to 600M doing the 1000-fold version of srfi-18 test I posted. I think it's not a good solution. Commenting out the assert for x86-64 should yield better behavior. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: the new gc asserts in master
Andy Wingo escreveu: >> I think reference counting is the correct solution for this, as far as >> I understand the problem from the quoted message. > > I don't think so; the use case is that (1) we don't want to prevent the > C object from being freed, so we don't want to hold a reference on the C > object; but (2) we do want to know when it is freed, so we can release > our cache; but (3) we want to get the scheme object back if the object > has not in fact been swept. I don't understand how reference counting would prevent you from doing this. Reference counting only is broken if you have (lots of) cyclical references. Do you have those? With ref-counting you would loose the unique C object <-> SCM object mapping, since you could create another reference (another SCM object) for a given C object if a SCM reference was lost, but a new one created in the same GC cycle. From what I've read, this should not be a problem. Are you relying on the SCM representation of the C object being unique? If not, there should not be a problem. > But the laziness of the sweeper prevents us from knowing whether the > cache that we have is in fact accessible, because there is a time > between the mark phase (in which the object might become sweepable) and > when the smob's free function is called in the sweep phase (which would > invalidate the cache). >> The use of scm_gc_mark() outside of GC is fundamentally broken, since it >> creates race conditions in the presence of threads. > > I was not aware that this was the case. > > My impression was that the mark phase is global; it requires all threads > that were in guile mode to go dormant, and those that were not in guile > mode cannot enter guile mode until the mark is complete. Yes, the mark phase is global, but the thread locking is done in scm_i_gc; once the marking starts, there is only one thread. Since scm_gc_mark is called from the smob mark functions, it does not force other threads to go dormant. 
It could, but I suspect the lock would be a contention point. If you call scm_gc_mark() on your own schedule, you're venturing outside the established model. From what I understand from your description, you detect that an object is live outside of the mark phase, which is what triggers the assertion, and which does not force thread correctness. > So if I have a thread in guile mode, it is not in the mark phase, hence > no race. Also, it would not be sweeping; I can check the cache and > retrieve and mark the object without the thread of interest doing a > sweep(). But perhaps some other thread would sweep that card, in which > case I guess I can see where the problem would come in. An alternative is to have a GC hook, like struct does. It does tie you to the GC implementation. I think something that works regardless of the GC details is better for a user of GUILE. > It's very irksome that I missed this bit in the documentation. > > So, my proposal to fix this is to expose the sweep mutex as part of the > API somehow, perhaps as e.g. > >void* scm_with_sweep_mutex (void* (*with_mutex_func)(void*), void*); > > or so. How do you feel about this? I know it constrains your GC > implementation, but threads + lazy sweeping + integrating with C > libraries = exposing some minimal amount of low-level details. I think it is best to expose as little of the GC implementation as possible. The fact that there is a mutex already gives away implementation details. I'm a bit miffed that the current interface already gives away that we won't rewrite (ie. compact) the current heap. Let's not paint ourselves in more corners. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
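As an illustration of the "reference counting on the C side" suggestion in this message, here is a minimal sketch. Nothing in it is Guile or Guile-GNOME API: toy_object, toy_wrapper and their functions are hypothetical stand-ins. The wrapper plays the role of a smob, and toy_wrapper_free plays the role of its free function. Note that two wrappers for the same C object can coexist, which is the loss of the unique C object <-> SCM object mapping mentioned above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical C-side object with a reference count.  FREED_FLAG is
   only there so the demonstration can observe destruction.  */
struct toy_object
{
  int refcount;
  int *freed_flag;
};

static struct toy_object *
toy_object_new (int *freed_flag)
{
  struct toy_object *obj = malloc (sizeof *obj);
  assert (obj != NULL);
  obj->refcount = 1;            /* the creator holds one reference */
  obj->freed_flag = freed_flag;
  return obj;
}

static struct toy_object *
toy_object_ref (struct toy_object *obj)
{
  obj->refcount++;
  return obj;
}

static void
toy_object_unref (struct toy_object *obj)
{
  assert (obj->refcount > 0);
  if (--obj->refcount == 0)
    {
      *obj->freed_flag = 1;
      free (obj);
    }
}

/* Stand-in for a Scheme-side wrapper (a smob in real code).  */
struct toy_wrapper
{
  struct toy_object *obj;
};

/* Creating a wrapper takes a reference on the C object.  */
static struct toy_wrapper
toy_wrapper_make (struct toy_object *obj)
{
  struct toy_wrapper w = { toy_object_ref (obj) };
  return w;
}

/* What the wrapper's free function would do: drop the reference.  */
static void
toy_wrapper_free (struct toy_wrapper *w)
{
  toy_object_unref (w->obj);
  w->obj = NULL;
}
```

With this scheme the C object dies only when the last holder lets go, regardless of when the collector gets around to running a wrapper's free function, so no mark bits need to be touched outside the mark phase.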
Re: the new gc asserts in master
Ludovic Courtès escreveu: > Han-Wen Nienhuys <[EMAIL PROTECTED]> writes: > >> I am pushing a fix for this to master. > > Will you care to post and discuss your patches before pushing them? Over the last weeks I have seen little discussion on your patches, eg. commit 582a4997abc8b34ac6caf374fda8ea3ac65bd571 Author: Ludovic Courtès <[EMAIL PROTECTED]> Date: Mon Aug 25 11:20:02 2008 +0200 Use $(GCC_CFLAGS) for `-Werror' et al. so that it's not used to compile Gnulib code. commit c95514b3b41c8e335ada863f8abb99cc4af9abe1 Author: Ludovic Courtès <[EMAIL PROTECTED]> Date: Thu Aug 14 00:15:03 2008 +0200 Remove the now useless `qthreads.m4'. Were pushed without review. There was a post on commit 450be18dfffd496ef14e1c921953e6f179727ab4 Author: Ludovic Courtès <[EMAIL PROTECTED]> Date: Thu Jul 17 00:17:56 2008 +0200 Handle lack of `struct dirent64' and `readdir64_r ()' on HP-UX 11.11. but it was after the fact Hi, FYI, I committed the attached patch, which handles the lack of `struct dirent64' and `readdir64_r ()' on HP-UX 11.11 (and possibly other versions). I'm assuming here that you -in good community spirit- don't consider yourself to be above your own rules. In other words: you can't do this. If you want people to discuss before pushing you should set the good example. > This is all the more important that the patches don't seem to have any > relation with the problem at hand: > > f85ea2a85fcdd051f432964806f044c0301d0945 Merge branch 'master' of > git://git.sv.gnu.org/guile into nits > 487b9dec2ea6b88ddbc6fbd17f445ddb197aebc5 Only sanity check numbers if > SCM_DEBUG_CELL_ACCESSES is unset. > 80237dcc7783b4d94ecf1d987deb9306d61735a0 Set SRCPROP{PLIST,COPY} through a > macro, so SCM_DEBUG_CELL_ACCESSES compiles. This in reference to GUILE not compiling with SCM_DEBUG_CELL_ACCESSES The commit message says, Set SRCPROP{PLIST,COPY} through a macro, so SCM_DEBUG_CELL_ACCESSES compiles. how much clearer do you want this message to be? 
It is fixing a compilation problem for a certain preprocessor define. It doesn't pretend to fix x86-64. You were complaining before that my changes were too large and should have been split up. This patch is split up. Can you make up your mind? > Do you think you can come up with a fix within the next few days? In the spirit of your undocumented development and community standards, I am including below a patch for this problem. Let's discuss this complex change first to decide whether it is worthy for inclusion in the oh-so-active GUILE repository. > Otherwise, I'm inclined to revert the offending commits in `master' and > wait for a signal from you (i.e., a patch or merge request posted to the > mailing list, *not* a commit on `master'). It would make it easier for > us to play with `master' in the meantime. > Besides, avoid pushing from an non-up-to-date repo: this yields to > automatic merges like the one above, which is annoying as it makes > history harder to follow. Better pull first, then merge your changes, > then push. Can you document your requirements upfront instead complaining after the fact? [EMAIL PROTECTED] guile]$ grep -i merge HACKING [EMAIL PROTECTED] guile]$ [EMAIL PROTECTED] guile]$ grep -i pull HACKING [EMAIL PROTECTED] guile]$ >>> even the lazy smob case I wrote about here: >>> >>> http://thread.gmane.org/gmane.lisp.guile.user/6372 >> I would classify the use of mark bits outside of the mark phase as outside >> of the defined API. If you want to have weak pointer semantics, use >> a weak hashtable, or implement reference counting on the C side. > > That's a reasonable argument, but it's something we should not change > without discussing it first. For instance, it may be important to study > why Guile-GNOME had to resort to this, and how it could avoid it, > instead of just gratuitously breaking it. I'm not suggesting to change without discussing; this message rather is the start of the discussion. 
I think reference counting is the correct solution for this, as far as I understand the problem from the quoted message. The use of scm_gc_mark() outside of GC is fundamentally broken, since it creates race conditions in the presence of threads. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen >From ccd010e15ec0ddf285b75911739e85866d2d865c Mon Sep 17 00:00:00 2001 From: Han-Wen Nienhuys <[EMAIL PROTECTED]> Date: Wed, 27 Aug 2008 10:48:06 -0300 Subject: [PATCH] Kludge around x86-64 GC runtime checks. 2008-08-27 Han-Wen Nienhuys <[EMAIL PROTECTED]> * gc.c (scm_i_gc): Don't sanity check numbers on x64, while we investigate a real
Re: Race condition in threading code?
It could be an existing problem that is exposed by the new tighter allocation. With a larger heap, there are larger intervals between recollection of the heap. Let me run this through DEBUGINFO when I get the chance. On Tue, Aug 26, 2008 at 5:23 PM, Andy Wingo <[EMAIL PROTECTED]> wrote: > Hi, > > On Sat 16 Aug 2008 11:45, Han-Wen Nienhuys <[EMAIL PROTECTED]> writes: > >> Julian Graham escreveu: >>> Hmmm... I don't recall seeing those when I was writing that test >>> suite. Just to be clear, were you getting those errors before making >>> your changes? >> >> No, but some very unrelated changes made them go away again. > > I still get that error, having merged master into vm. Do you have other > fixes? > > The original error that you had, reflowed: > > ERROR: srfi-18.test: thread-start!: > thread activates only after start > - arguments: ((syntax-error "memoization" > "In file ~S, line ~S: ~A ~S in expression ~S." > ("/home/lilydev/vc/guile/srfi/srfi-18.scm" 135 > "Bad binding" ct >(let (ct (current-thread)) >[EMAIL PROTECTED] (or (hashq-ref thread-exception-handlers > ct) > (hashq-set! thread-exception-handlers ct >(list initial-handler) #f)) > > Bad binding with an [EMAIL PROTECTED] This would seem to indicate that a > memoized cell was not marked, and was then swept/re-used. > > Andy > -- > http://wingolog.org/ > -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: the new gc asserts in master
Han-Wen Nienhuys escreveu: >> even the lazy smob case I wrote about here: >> >> http://thread.gmane.org/gmane.lisp.guile.user/6372 > > I would classify the use of mark bits outside of the mark phase as outside > of the defined API. If you want to have weak pointer semantics, use > a weak hashtable, or implement reference counting on the C side. > > I am actuallly inclined to add add abort() for anyone who calls scm_gc_mark() > outside the marking phase. Also, you're creating a race condition: the mark bits are not protected by a lock, so you will be screwed in still more interesting ways if more threads or types would start doing this. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: Race condition in threading code?
Andy Wingo escreveu: > Hi, > > On Sat 16 Aug 2008 11:45, Han-Wen Nienhuys <[EMAIL PROTECTED]> writes: > >> Julian Graham escreveu: >>> Hmmm... I don't recall seeing those when I was writing that test >>> suite. Just to be clear, were you getting those errors before making >>> your changes? >> No, but some very unrelated changes made them go away again. > > I still get that error, having merged master into vm. Do you have other > fixes? Hi, can we have some certainty that this > > The original error that you had, reflowed: > > ERROR: srfi-18.test: thread-start!: > thread activates only after start >- arguments: ((syntax-error "memoization" > "In file ~S, line ~S: ~A ~S in expression ~S." > ("/home/lilydev/vc/guile/srfi/srfi-18.scm" 135 >"Bad binding" ct > (let (ct (current-thread)) > [EMAIL PROTECTED] (or (hashq-ref > thread-exception-handlers ct) > (hashq-set! thread-exception-handlers ct > (list initial-handler) #f)) > > Bad binding with an [EMAIL PROTECTED] This would seem to indicate that a > memoized cell was not marked, and was then swept/re-used. x.test: (define-module (test-suite test-srfi-18) #:use-module (test-suite lib) #:use-module (srfi srfi-18)) (iota 100) (for-each (lambda (x) (display x) (newline) (gc) (gc) (with-test-prefix "thread-start!" (pass-if "thread activates only after start" (let* ((started #f) (m (make-mutex 'thread-start-mutex)) (t (make-thread (lambda () (set! started #t)) 'thread-start-1))) (and (not started) (thread-start! t) (thread-join! t) started ) (iota 1000)) ... [Thread 0xb7322b90 (LWP 10865) exited] 333 [New Thread 0xb7322b90 (LWP 10866)] [Thread 0xb7322b90 (LWP 10866) exited] 334 [New Thread 0xb7322b90 (LWP 10867)] [Thread 0xb7322b90 (LWP 10867) exited] 335 [New Thread 0xb7322b90 (LWP 10868)] The large allocation and the 2 (gc) calls ensure that this test itself does not trigger gcs. The code exercised here appears to deadlock. I would be suspect of other threading problems (race conditions) in this code as well. 
^C
Program received signal SIGINT, Interrupt.
0x00110416 in __kernel_vsyscall ()
(gdb) bt
#0 0x00110416 in __kernel_vsyscall ()
#1 0x007f3ba5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2 0x001c7453 in scm_pthread_cond_wait (cond=0x955c04c, mutex=0x95f2e90) at threads.c:1804
#3 0x001c79bd in block_self (queue=0xb7397690, sleep_object=, mutex=0x95f2e90, waittime=0x0) at threads.c:248
#4 0x001c8875 in fat_mutex_lock (mutex=0xb73976c0, timeout=0x0, owner=0xb7f87a28, ret=0xbfee5348) at threads.c:1299
#5 0x001c8b6b in scm_lock_mutex_timed (m=0xb73976c0, timeout=0x204, owner=0xb7f87a28) at threads.c:1331
#6 0x001c8d54 in fat_mutex_unlock (mutex=0xb73976c0, cond=0xb7397590, waittime=0x0, relock=1) at threads.c:1449
#7 0x001c8e89 in scm_timed_wait_condition_variable (cv=0xb7397590, mx=0xb73976c0, t=0x204) at threads.c:1626
#8 0x00171ab3 in scm_gsubr_apply (args=0x404) at gsubr.c:221
#9 0x001588f3 in scm_apply (proc=0xb7f87a60, arg1=0xb7f87fb0, args=0xb73970e8) at eval.i.c:1778
#10 0x0014a2b7 in ceval (x=0x404, env=0xb7397170) at eval.i.c:1360
#11 0x0014a4cd in ceval (x=0x95a3fc0, env=0xb7397170) at eval.i.c:358
#12 0x0014dfbb in ceval (x=0xb7366e38, env=0xb73977f0) at eval.i.c:609
#13 0x0014dfbb in ceval (x=0xb73613d0, env=0x95b7020) at eval.i.c:609
#14 0x0015a21d in scm_call_0 (proc=0x95b7988) at eval.c:3053

-- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: the new gc asserts in master
Andy Wingo wrote:
> Hi,
>
> I just merged master to guile-vm, but I'm not sure if I really wanted to
> do that now. Normal test suites are failing:
>
> lt-guile: gc.c:604: scm_i_gc: Assertion `scm_cells_allocated ==
> scm_i_marked_count ()' failed.
> /home/wingo/src/guile/vm/test-suite/standalone/test-use-srfi: line 27:
> 29507 Aborted guile -q --use-srfi=1,10 > /dev/null <<EOF
> (if (and (defined? 'partition)
>          (defined? 'define-reader-ctor))
>     (exit 0) ;; good
>     (exit 1)) ;; bad
> EOF
>
> guile --use-srfi=1,10 fails to run
> FAIL: test-use-srfi
>
> This is on a core 2 duo, in 32-bit mode, configured as:
>
> CFLAGS="-g -O2" ./configure --with-threads --enable-maintainer-mode
> --prefix=/opt/guile-vm
>
> So it seems that the new gc "cleanups" don't want you to touch mark bits
> outside the mark phase. This is incompatible with other uses inside
> guile itself, e.g. the SCM_DEBUG_CELL_ACCESSES == 1 case in inline.h, or

I am pushing a fix for this to master.

> even the lazy smob case I wrote about here:
>
> http://thread.gmane.org/gmane.lisp.guile.user/6372

I would classify the use of mark bits outside of the mark phase as outside of the defined API. If you want to have weak pointer semantics, use a weak hashtable, or implement reference counting on the C side.

I am actually inclined to add an abort() for anyone who calls scm_gc_mark() outside the marking phase.

> There are more cases, rgrep for SCM_SET_GC_MARK in libguile/*.[ch].

I looked through all of these, and they all happen during the mark phase.

> I'm going to commit an #if 0 around those asserts in the vm branch,
> because I don't have the brain power to deal with it. (It is irritating
> that I have to even write this mail.) Certainly if the near-term choice
> is between inaccurate statistics and calls to abort(), I know which
> choice I prefer...

The statistics form the basis of the allocation strategy, so they are not a 'cute' feature. If these statistics go off, we overallocate, or spend too much time trying to garbage collect when there is nothing to reclaim.

-- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: load_extension tests broken
Ludovic Courtès escreveu: > I don't think there's any such problem, at least on GNU/Linux. See: > > $ ./pre-inst-guile > guile> (getenv "LTDL_LIBRARY_PATH") > > "/home/ludo/src/guile/libguile:/home/ludo/src/guile/guile-readline:/home/ludo/src/guile/srfi:" > > The first directory is the build directory. > > Just to make sure, I also ran `pre-inst-guile', typed > "(use-modules (ice-9 i18n))" and attached GDB to it: it shows that the > right `libguile-i18n' is loaded. > > If in doubt, can you try similar things on your machine? (getenv "LD_LIBRARY_PATH") "/home/lilydev/usr/lib:" I'm running Fedora 9. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: GUILE_MAX_HEAP_SIZE
Ludovic Courtès escreveu: > I don't observe this failure on my x86 (Core(TM)2 Duo) laptop. Let me try to look at this during next week. As a kludge, you can #if 0 the asserts; although the stats will be off (as well as the allocation strategy), nothing bad should happen. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
--without-threads compile broken
i686-linux-gcc -DHAVE_CONFIG_H -I/home/lilydev/vc/gub/target/linux-x86/src/guile-1.9.git -I.. -I/home/lilydev/vc/gub/target/linux-x86/src/guile-1.9.git/lib -I../lib -g -O2 -Wall -Wmissing-prototypes -MT libguile_la-scmsigs.lo -MD -MP -MF .deps/libguile_la-scmsigs.Tpo -c /home/lilydev/vc/gub/target/linux-x86/src/guile-1.9.git/libguile/scmsigs.c -fPIC -DPIC -o .libs/libguile_la-scmsigs.o
/home/lilydev/vc/gub/target/linux-x86/src/guile-1.9.git/libguile/scmsigs.c: In function 'scm_i_close_signal_pipe':
/home/lilydev/vc/gub/target/linux-x86/src/guile-1.9.git/libguile/scmsigs.c:684: error: 'signal_pipe' undeclared (first use in this function)
/home/lilydev/vc/gub/target/linux-x86/src/guile-1.9.git/libguile/scmsigs.c:684: error: (Each undeclared identifier is reported only once
/home/lilydev/vc/gub/target/linux-x86/src/guile-1.9.git/libguile/scmsigs.c:684: error: for each function it appears in.)

-- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: GUILE_MAX_HEAP_SIZE
Han-Wen Nienhuys escreveu: > Ludovic Courtès escreveu: >> Hello, >> >> Han-Wen Nienhuys <[EMAIL PROTECTED]> writes: >> >>> Ludovic Courtès escreveu: >>>> Off the top of my head: incorrect indentation, missing spaces around >>>> brackets, and more importantly comments (see (standards.info)Comments). >>> The code I went through should not have that; please point me to locations >>> where things are broken so I can fix them. >> E.g., from commit: >> >> +/* >> + Classic MIT Hack, see e.g. http://www.tekpool.com/?cat=9 >> + */ >> +int scm_i_uint_bit_count(unsigned int u) >> >> (BTW, it'd make sense to use Gnulib's `count-one-bits' module, which is >> able to use GCC's `__builtin_popcount ()'.) Could you add the gnulib module? I'll do the rest. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: further guile-vm integration
Andy Wingo escreveu: > Let me know if you have thoughts about this plan! My hope would be that > once there are no or very few and solvable regressions, we could merge > this to master and call it 1.10 or 2.0. I'm always for faster release cycles, but wouldn't it be good to push out 1.10 now, and merge the VM (which is a large change) afterwards? -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: GUILE_MAX_HEAP_SIZE
Ludovic Courtès escreveu: >> A likely candidate is the patch from you that I applied. In >> particular, >> 4c7016dc06525c7910ce6c99d97eb9c52c6b43e4 > > Well, that's a good candidate since it's the last significant change > that was done to the GC on `master'. However, Kevin's original post > compared 1.8 (which doesn't have this commit) to 1.6. I did a big rewrite of the garbage collector between 1.6 and 1.8; See commit 06e80f59f976c8dda5161804f611f489ec2948a2 Author: Han-Wen Nienhuys <[EMAIL PROTECTED]> Date: Tue Dec 10 13:26:25 2002 + and following commits. This was undoubtedly the cause for this. I hope you don't want me to figure what it was I did 6 years ago that caused the breakage. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: GUILE_MAX_HEAP_SIZE
Ludovic Courtès escreveu: > Hello, > > Han-Wen Nienhuys <[EMAIL PROTECTED]> writes: > >> Ludovic Courtès escreveu: > >>> Off the top of my head: incorrect indentation, missing spaces around >>> brackets, and more importantly comments (see (standards.info)Comments). >> The code I went through should not have that; please point me to locations >> where things are broken so I can fix them. > > E.g., from commit: > > +/* > + Classic MIT Hack, see e.g. http://www.tekpool.com/?cat=9 > + */ > +int scm_i_uint_bit_count(unsigned int u) > > (BTW, it'd make sense to use Gnulib's `count-one-bits' module, which is > able to use GCC's `__builtin_popcount ()'.) > > +/* > + Amount of cells marked in this cell, measured in 1-cells. > + */ > +int > +scm_i_card_marked_count (scm_t_cell *card, int span) > > + while (bvec < bvec_end) { > +count += scm_i_uint_bit_count(*bvec); > +bvec ++; > + } Yes, sorry about that. Being at google means that the google style is now native to me. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
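For readers who want to see what the quoted bit-counting code is doing, here is a self-contained sketch in GNU style. It is not the code from the commit: the names toy_uint32_bit_count and toy_marked_count are made up, the word type is fixed to uint32_t for portability, and the bit count uses the common parallel-sum variant rather than the exact "MIT hack" the commit refers to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the bits set in U by summing adjacent bit fields in parallel:
   first pairs, then nibbles, then bytes.  */
static int
toy_uint32_bit_count (uint32_t u)
{
  u = u - ((u >> 1) & 0x55555555u);
  u = (u & 0x33333333u) + ((u >> 2) & 0x33333333u);
  u = (u + (u >> 4)) & 0x0f0f0f0fu;
  /* The multiplication adds the four byte counts into the top byte.  */
  return (int) ((uint32_t) (u * 0x01010101u) >> 24);
}

/* Number of mark bits set in a bitvector of LEN words.  */
static int
toy_marked_count (const uint32_t *bvec, size_t len)
{
  const uint32_t *bvec_end;
  int count = 0;

  assert (bvec != NULL);
  bvec_end = bvec + len;

  while (bvec < bvec_end)
    {
      count += toy_uint32_bit_count (*bvec);
      bvec++;
    }

  return count;
}
```

On GCC, `__builtin_popcount ()` (or the Gnulib `count-one-bits' module mentioned in the quoted text) would replace toy_uint32_bit_count entirely.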
Re: GUILE_MAX_HEAP_SIZE
Ludovic Courtès escreveu: >> Ludovic Courtès escreveu: > >> Can you be more specific about this? > > Off the top of my head: incorrect indentation, missing spaces around > brackets, and more importantly comments (see (standards.info)Comments). The code I went through should not have that; please point me to locations where things are broken so I can fix them. >> See below - note that the old .scm file was pretty much broken, as it >> was using gc-live-object-stats which is only accurate just after the >> mark phase. > > Hmm, `gc-live-object-stats' may return information from the previous > cycle, but it shouldn't be *that* accurate, should it? No; the current implementation uses a similar scheme to gc-live-object-stats (counting in the bitvector) to determine the live object count. There is now no way that it can ever be larger than the total heap size. I also changed the code to not look at the penultimate GC stats, since I couldn't invent a scenario where that would help, and IMO it only confuses things. This may have been a remnant of the pre-lazy sweep code. > Interestingly, head-before-your-changes and 1.8 end up with > `cells-allocated' greater than `total-cell-heap', which I guess isn't > intended (`cells-allocated' and `scm_cells_allocated' are really the > total number of cells allocated since the last GC, right?). In the case > of 1.8, it's only slightly greater; in the case of > head-before-your-changes, it's more than 80 times greater. That does > seem to indicate brokenness... There was some confusion about cells vs. double cells vs. bytes, but I think was mostly in my head and perhaps in your stress test. If you really want to know, use git bisect. A likely candidate is the patch from you that I applied. In particular, 4c7016dc06525c7910ce6c99d97eb9c52c6b43e4 + + seg->freelist->collected += collected * seg->span; looks fishy as this code is called multiple times for a given card. 
>> The problem is that the previous state of the GC was very much confused >> with contradicting definitions within the code of what was being kept as >> statistics. This is problematic since the allocation strategy (should >> I allocate more memory?) was based on these confused statistics. > > Which statistics were confused exactly? Can you pinpoint the part of > your patch that fixes statistics computation? Otherwise, I find it hard > to say whether it's actually "fixed". The scm_t_sweep_statistics were sometimes passed into the sweep function and sometimes not; I couldn't work out what the global variables were supposed to mean exactly, and consequently, if their updates were correct. The reason I am confident about the statistics now is the assert()s I added to scm_i_gc(), which compare exactly mark bit counts, the sweep statistics and freelist statistics. Some of the changes I did were to make these numbers match up exactly. >> It's still somewhat kludgy - especially the way that gc-stats is >> constructed is asking for trouble. We should really have a scm_t_gc_stats >> struct and use nice OO patterns for dealing with that. > > What would you think of using the Boehm et al. GC? > > I'm willing to import the BGC-based Guile branch into Git when time > permits, so that we can compare them. I think we've tried this before, and IIRC it was 50~100% performance hit. Of course we could try again, but since we have a much better understanding of data is laid out, we can be faster. Unfortunately I don't have time to look into this, but we also be compact the heap (Bartlett's patents have or are soon to be expired). I'd be interested in seeing benchmarks between Guile and PLT after my cleanup. For a lot of benchmarks, GC time is an important factor, and it might be that we can now beat PLT (they use BGC). BTW, I'm attaching a new plot of the stress test, now up to iteration 1 (the large allocation). 
Interestingly, the large allocation is cleaned up only once (on iteration 1000) and remains 'live' after that, so there may still be some bugs lurking. BTW2, stress.scm says: (char-set #\.) ;; double-cell. However, char-sets are smobs and use single cells, AFAICT. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
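The kind of cross-check described in the mail above (mark bit counts vs. sweep statistics vs. freelist statistics) can be sketched in C as follows. This is only an illustration under assumptions: the struct and its field names are hypothetical, and the invariants assume a full, eager sweep has just finished, which may differ from what the assert()s in scm_i_gc() actually compare.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of one collection; field names are illustrative. */
struct toy_gc_counts
{
  size_t heap_cells;      /* total cells in the heap */
  size_t marked_cells;    /* cells whose mark bit is set */
  size_t swept_cells;     /* cells the sweep reported as garbage */
  size_t freelist_cells;  /* cells found on the freelists afterwards */
};

/* Return 1 when the three independently gathered counts agree,
   0 otherwise.  */
static int
toy_gc_counts_consistent (const struct toy_gc_counts *c)
{
  /* Live cells can never exceed the heap size. */
  if (c->marked_cells > c->heap_cells)
    return 0;
  /* Everything that is not marked must have been swept. */
  if (c->swept_cells != c->heap_cells - c->marked_cells)
    return 0;
  /* Everything that was swept must be on a freelist. */
  return c->freelist_cells == c->swept_cells;
}
```

Gathering the same quantity by independent routes and asserting equality tends to catch unit mix-ups (cells vs. double cells vs. bytes) early.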
Re: Goops & Valgrind
Mikael Djurfeldt wrote: > Unfortunately, I don't have time to fix this. I suggest that some > Guile developer removes %fast-slot-ref/set! and supplies some other > (more clean) way of supporting the code in active-slot.scm. Also, > make sure to check that these primitives are not used anywhere else. Do we have a slow-slot-ref that we could simply substitute for this code? -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: GUILE_MAX_HEAP_SIZE
Ludovic Courtès wrote: >> (I intend to squash into a single commit before pushing to master). > > First of all, thanks for your work (I know it's not so much fun to hack > the GC), but I feel unhappy with your commit to both `master' and > `branch_release-1-8'. On the contrary, I think the GC is one of the more interesting parts. Only the evaluator is more funky, but I lack the braincells to deal with that competently. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
Re: Goops & Valgrind
Ludovic Courtès wrote: > Han-Wen Nienhuys <[EMAIL PROTECTED]> writes: > >> Running the test suite through valgrind, I get some fishy errors. > > So was that fixed by this commit? > > commit 51ef99f7fa9fb766fbb48619fc5863ab9914591d > Author: Han-Wen Nienhuys <[EMAIL PROTECTED]> > Date: Sat Aug 16 02:18:51 2008 -0300 > > Fix memory corruption issue with hell[] array: realloc/calloc need to > factor in sizeof(scm_t_bits) No, this error is still there. I lack knowledge of goops to investigate this. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
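The class of bug named in that commit message (passing an element count where a byte count is expected) can be illustrated with the standard C allocators. This is a sketch, not the code from the commit: `toy_bits` is a hypothetical stand-in for scm_t_bits, and the helpers use plain calloc/realloc rather than Guile's own allocation wrappers.

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned long toy_bits;   /* hypothetical stand-in for scm_t_bits */

/* Allocate a zeroed array of N elements.  Standard calloc takes the
   element count and the element size as separate arguments.  */
static toy_bits *
toy_array_alloc (size_t n)
{
  return calloc (n, sizeof (toy_bits));
}

/* Grow ARR to NEW_N elements.  realloc takes a size in bytes, so the
   element count must be multiplied by the element size; writing
   sizeof (*arr) keeps the expression correct if the type changes.
   Returns NULL on failure, in which case ARR is left untouched.  */
static toy_bits *
toy_array_grow (toy_bits *arr, size_t new_n)
{
  return realloc (arr, new_n * sizeof (*arr));
}
```

Forgetting the multiplication allocates only `new_n` bytes, so writes to the later elements run past the end of the block, which is the sort of error valgrind reports.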
Re: GUILE_MAX_HEAP_SIZE
Ludovic Courtès wrote: > Hi Han-Wen, > > Han-Wen Nienhuys <[EMAIL PROTECTED]> writes: > >> Ludovic Courtès wrote: > >>> is kind of hard to review in a glimpse. Does it just randomly "clean >>> things up" (whatever that means---it does not follow the GCS, for >> GCS? > > "GNU Coding Standards", the thing we're supposed to adhere to when > writing code for the GNU Project: I've pretty religiously followed those for LilyPond, although Google's coding style now has me remove spaces before ( in function calls. The existing code was not following GCS by any standard (which was my fault 7 years ago or so.) Can you be more specific about this? >>> instance), or does it fix anything? It's hard to tell. Can you >>> reproduce the heap usage graphs referred earlier in this thread? Do >> No, the memory usage is more stable now. > > Can you show what the graph looks like, for comparison purposes? See below - note that the old .scm file was pretty much broken, as it was using gc-live-object-stats which is only accurate just after the mark phase. > Which assertion is it that failed? Was that due to an old > `libguile-i18n.so' being loaded? Yes - the old version had a scm_cells_allocated++ inlined, which was included in the .so file. Also, the old version was handling odd cases in scm_i_sweep_card differently. >> If you think you need to roll back this change, please revoke my >> commit privilege and sort things out yourself. > > I tried and failed, and so did Kevin > (http://thread.gmane.org/gmane.lisp.guile.devel/6699/focus=6832). AIUI, > both Kevin and I tried to identify the root of the problem (the "bug") > in a way that would allow us to fix the offending code as conservatively > as possible. Being conservative is a good basic attitude, but I just saw general brokenness all around. > Conversely, the size and scope of your patch leaves me the impression > that you rewrote parts of the GC, without actually pinpointing what > was/is wrong with the code.
I'd have been much more confident with a > one-liner along with an explanation and sample program to determine > whether the problem is there. The problem is that the previous state of the GC was very much confused with contradicting definitions within the code of what was being kept as statistics. This is problematic since the allocation strategy (should I allocate more memory?) was based on these confused statistics. This change simplifies and corrects this status. It does not introduce more clever behavior; it just trims any excess intelligence and confusion from the code. The thread you quote remarks on something odd about live-cell-heap, which is not that surprising. The gc-live-object-statistics show the state of the last GC round rather than the current state, so it tends to jump around at unpredictable times. Attached is a plot of alive/total, where you can see that it fluctuates between 0.6 and 0.95 - consistent with the 40% yield percentage. Note that summing segments is misleading: the segments are memory addresses, while the rest uses cells for measuring memory allocations. I've included the .scm (slightly revised because I trimmed the gc-stats output as well). >> The garbage collector isn't that complicated after all. > > Then the people, including me, who spent large amounts of time trying in > vain to fix the code must have been dumb. Hopefully, with this change, things will become less confused. It's still somewhat kludgy - especially the way that gc-stats is constructed is asking for trouble. We should really have a scm_t_gc_stats struct and use nice OO patterns for dealing with that.
-- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen

(use-modules (srfi srfi-1) (srfi srfi-14))

(define (total-cell-heap)
  (assoc-ref (gc-stats) 'cell-heap-size))

(define (get-yields)
  (list 0 0))

(setvbuf (current-output-port) _IOLBF)
(format #t "### plot '-' using 1:($2/$3) with lines title \"alive/total\"~%")

;; plot 'out' using 1:($2/$3) with lines title "alive/total"
(let loop ((iteration 0))
  (if (> iteration 1000)
      #t
      (begin
        ;; 100k cells
        (let ((lst (list-tabulate 1000
                                  (lambda (i)
                                    (char-set #\.) ;; double-cell
                                    (make-list 100)))))
          (if (and (= (modulo iteration 1000) 0)
                   (> iteration 0))
              ;; sporadic heap-intensive job
              ;; 1M cells.
              (make-list 100)))
        ;; every 10th iteration, print: iteration alive total
        (if (= 0 (modulo iteration 10))
            (let* ((total (total-cell-heap))
                   (alive (assoc-ref (gc-stats) 'cells-allocated)))
              (format #t "~a ~a ~a~%" iteration alive total)))
        (loop (1+ iteration)))))
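To connect the 0.6 to 0.95 range in the plot with the 40% yield percentage mentioned in the mail, here is a small C sketch of a yield-based growth decision. The function names and the exact rule are assumptions for illustration, not Guile's implementation: the idea is only that if a collection frees less than the minimum yield, the heap is considered too full and should grow.

```c
#include <assert.h>
#include <stddef.h>

/* Percentage of the heap found free after a collection. */
static unsigned
toy_yield_percent (size_t live_cells, size_t total_cells)
{
  if (total_cells == 0 || live_cells >= total_cells)
    return 0;
  return (unsigned) (((total_cells - live_cells) * 100) / total_cells);
}

/* Nonzero when the yield falls short of the minimum, i.e. when the
   heap should be grown before allocation continues.  */
static int
toy_should_grow (size_t live_cells, size_t total_cells,
                 unsigned min_yield_percent)
{
  return toy_yield_percent (live_cells, total_cells) < min_yield_percent;
}
```

Under this rule with a 40% minimum, alive/total sits at or below roughly 0.6 right after a collection (and any resulting growth), then climbs toward 1 as cells are allocated until the next collection, which matches the sawtooth described above.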
Re: GUILE_MAX_HEAP_SIZE
Ludovic Courtès wrote: >>> hell = scm_calloc (hell_size * sizeof (*hell)); >>> >>> `sizeof (*hell)' is actually `sizeof (scm_t_bits *)', which is equal >>> to `sizeof (scm_t_bits)', but using `sizeof (*hell)' is clearer and >>> less error-prone. >>> >>> Besides, is that code only used when one changes the class of an >>> instance? How did you trigger it? >> valgrind. Fixed. > > Sorry, I meant: which Scheme code leads to the execution of that code? I have no idea. Some part of the test suite exercises it, but I think it is better not to leave such an obvious error there. >>> Hmm, well, we really need an `SCM_MIN ()' somewhere. I'd rather >>> duplicate its definition than expand it as you did here, since that >>> makes the code rather unreadable. >> I called private-gc.h private for a reason. Please do not include it >> unless you are called libguile/gc*c; feel free to transplant SCM_MIN to >> someplace else. > > I agree on that, but I also think that expanding `SCM_MIN ()' in-place > is not a good idea. I agree with that; it's just that I prefer to let my responsibility end at the boundaries of the GC code. >>> * 569aa529d5379f3c942fa6eb01e8a1ad48ba9f77 >>> Use word_2 to store mark bits for freeing structs and vtables in the >>> correct order. >>> >>> Can you explain this? Suppose we have struct S whose vtable is V; >>> V cannot be swept in the same GC cycle as S since it's still >>> referenced by S. Thus, I don't understand the need for >>> special-casing here. >> Freeing S requires a function stored in V. > > Right, but my understanding is that V is still reachable (via S) when S > becomes candidate for sweeping. Is that right? No, the freeing is all for unreachable objects. The problem is that unreachable objects also may have an ordering: S needs to be freed before V, even if both are unreachable. >>> `ensure_marking ()' must be `static'. The definition of >>> `scm_i_marking' clearly doesn't belong in a header.
Besides, all >>> this is unused, so what's the point? >> I'm not sure where to put the code, perhaps in a ifdef DEBUG or something: >> the point was to extend SCM_GC_SET_MARK with ensure_marking() to catch >> illegal >> use of the mark bits. > > But it's actually unused (at least in this commit), so I'd just remove > it. Yes - it's not ideal. I would vote for keeping the is_marking variable, and perhaps dropping the ensure_marking() function. (My experience is that the #ifdef DEBUG sections are never exercised, because no one ever bothers to test and use those sections.) >> Also, if a core contributor apparently needs some sort of review process to >> push >> code they feel comfortable with, can you please post a link to the process? > > There's no such document, just an observation of what has been common > practice since I follow Guile development (c. 2004). FWIW, GUILE development seems from the outside very much stagnant, even if there are the occasional commits to the master branch. Perhaps I have various preconceptions because I also follow LilyPond development, which is more turbulent, with more mistakes going in at a higher pace, but also more discussion and more bugfixing going in at a higher pace. -- Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
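The ordering constraint discussed in this mail (S must be freed before V even when both are unreachable, because freeing S needs a function stored in V) can be sketched in C as a two-pass sweep. Note that this is a different, simpler technique than the one the commit describes (mark bits stored in word_2); all types and names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical vtable: holds the function needed to free an instance. */
struct toy_vtable
{
  void (*free_instance) (void *data);
  int freed;
};

/* Hypothetical instance pointing at its vtable. */
struct toy_instance
{
  struct toy_vtable *vtable;
  void *data;
  int freed;
};

static int toy_free_calls;

/* Example free function that only counts how often it is called. */
static void
toy_count_free (void *data)
{
  (void) data;
  toy_free_calls++;
}

/* Free a batch of unreachable objects in two passes: instances first,
   while their vtables are still intact, then the vtables.  Returns the
   number of instances that found their vtable already freed, which
   should be 0 with this ordering.  */
static size_t
toy_sweep_unreachable (struct toy_instance *insts, size_t n_insts,
                       struct toy_vtable *vtables, size_t n_vtables)
{
  size_t violations = 0;
  size_t i;

  for (i = 0; i < n_insts; i++)
    {
      if (insts[i].vtable->freed)
        violations++;
      else
        insts[i].vtable->free_instance (insts[i].data);
      insts[i].freed = 1;
    }
  for (i = 0; i < n_vtables; i++)
    vtables[i].freed = 1;
  return violations;
}
```

A sweep that walks memory in address order cannot guarantee this by itself, since V may sit at a lower address than S; that is why some explicit ordering mechanism is needed.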