Autocompilation/LilyPond
Hi, with the stable release 2.16 of LilyPond looming around the corner, it will become imminent soon to think about supporting Guile 2.0. Previous attempts have mostly exploded around the problem that we have something like (for-each ly:load init-scheme-files) in our lily.scm file, and the auto-compiler attempts to compile all of those files independently as far as I understand. Unfortunately, some of them contain macro definitions that other files rely on. Personally, I think it would make sense if we could get the autocompiler to treat the whole blob of files as _one_ unit, and recompile the unit if it gets out of date. That would save us from trying to factor out macro dependencies into separate files (and since our markup system defines a macro for every markup function, and since the macros are needed when building markups in Scheme, this is actually rather hard to do). You might say that it is LilyPond's own fault that it has reverted a bit more to macro programming than feasible for its own good. I would not actually say that you are wrong. However, there is the problem of lifting a whole bunch of working code base under active development into the Guilev2 era, and if we could tackle design or maldesign questions mostly independently and in bite-sized chunks rather than humongous patches moving material around, this would help a lot in getting Guilev2 support on track. Suggestions? -- David Kastrup
Re: Autocompilation/LilyPond
David Kastrup writes: > with the stable release 2.16 of LilyPond looming around the corner, it > will become imminent soon to think about supporting Guile 2.0. > > Previous attempts have mostly exploded around the problem that we have > something like > > (for-each ly:load init-scheme-files) > > in our lily.scm file, and the auto-compiler attempts to compile all of > those files independently as far as I understand. Unfortunately, some > of them contain macro definitions that other files rely on. > > Personally, I think it would make sense if we could get the autocompiler > to treat the whole blob of files as _one_ unit, and recompile the unit > if it gets out of date. I'm not sure that would help much. There's a deeper problem that you should be aware of. In Guile 1.x, macro uses within procedures are not expanded until the procedure is evaluated. In Guile 2, macros are expanded as soon as the procedure is defined, even if compilation is turned off. This means that functions can only use macros that were previously defined. For example, the following works in Guile 1.8 but not in Guile 2: (define (foo x) (bar x)) (define-macro (bar x) `(quote ,x)) In Guile 2, you must put the definition of 'bar' before 'foo'. So you'll need to load your code in the right order so that macros always come before their uses. One clarification is in order. You might conclude from this that it is not possible to define mutually-recursive macros, but that's not true. For example, the following is fine: (define-macro (beep x) `(boop ,x)) (define-macro (boop x) `(quote ,x)) That's because 'beep' doesn't use 'boop', it merely produces a use of 'boop' in its expansion. 'boop' is merely part of a quoted literal within 'beep'. Regards, Mark
Re: Autocompilation/LilyPond
Mark H Weaver writes: > David Kastrup writes: > >> with the stable release 2.16 of LilyPond looming around the corner, it >> will become imminent soon to think about supporting Guile 2.0. >> >> Previous attempts have mostly exploded around the problem that we have >> something like >> >> (for-each ly:load init-scheme-files) >> >> in our lily.scm file, and the auto-compiler attempts to compile all >> of those files independently as far as I understand. Unfortunately, >> some of them contain macro definitions that other files rely on. >> >> Personally, I think it would make sense if we could get the >> autocompiler to treat the whole blob of files as _one_ unit, and >> recompile the unit if it gets out of date. > > I'm not sure that would help much. There's a deeper problem that you > should be aware of. In Guile 1.x, macro uses within procedures are > not expanded until the procedure is evaluated. In Guile 2, macros are > expanded as soon as the procedure is defined, even if compilation is > turned off. This means that functions can only use macros that were > previously defined. I don't think that making this condition hold would be really hard. LilyPond has a rather carefully selected load order in several stages, so use-before-definition, whether in the context of macros or not, should be more the exception than the rule, and only require smaller rearrangements. > One clarification is in order. You might conclude from this that it > is not possible to define mutually-recursive macros, but that's not > true. I don't think we use them, anyway. Most problems are more due to small oversights rather than maliciously clever overdesign. -- David Kastrup
[PATCH] Efficient Gensym Hack
Hello all, Here's an implementation of the efficient gensym hack for stable-2.0. It makes 'gensym' about 4.7 times faster on my Yeeloong. Gensyms are not given names or even numbers until they are asked for their names or hash values (for 'equal?' hash tables only). The first patch adds an optimization for strings that is important for gensyms. It avoids locking a mutex when setting the shared flag on a stringbuf if the shared flag is already set. This is important for gensyms because when 'gensym' is called, it must save the stringbuf of the prefix and set its shared flag. In the common case where 'gensym' is called many times with the same prefix, this avoids locking any mutexes within most calls to 'gensym'. The second patch is trivial and unrelated to the efficient gensym hack, but I include it here to save everyone an additional recompile of libguile. The third patch actually implements the efficient gensym hack. It was made a bit hairier by two unfortunate facts: 1. The implementation of symbols is split between symbols.c and strings.c, and the gensym hack needs the internals of both. I had to add some new internal functions, including one to make a stringbuf from a string and one to make a string from a stringbuf. 2. The symbol table uses the symbols themselves as the keys. This was already hairy and inefficient: take a look at symbol_lookup_assoc_fn, which has to convert symbols to strings (which involves allocation) to implement the hash lookup! However, it makes things even worse when forcing lazy gensyms, because we must intern the gensym before clearing its "lazy gensym flag". This is necessary because if the name we chose already belongs to a pre-existing interned symbol, we _must_ choose another name, and we must prevent any other thread from getting our gensym's name until after we have interned it. This involved adding a new internal function to get the name of a symbol without checking its lazy gensym flag, for use by symbol_lookup_assoc_fn. IMHO, it would be much better to use a weak-value hash table, with strings as the keys and symbols as the values. Maybe we can do that for 2.2. Anyway, here are the patches. Comments and suggestions welcome. Mark >From 5f558244261f3a22217d5136d0aebb7f644d7efb Mon Sep 17 00:00:00 2001 From: Mark H Weaver Date: Mon, 5 Mar 2012 09:51:17 -0500 Subject: [PATCH 1/3] Don't lock mutex to set shared flag on stringbuf if it's already shared * libguile/strings.c (set_stringbuf_shared): New internal static function to replace the macro SET_STRINGBUF_SHARED. The macro assumed that the stringbuf_write_mutex was already locked, but this new function handles locking internally, and avoids locking if the stringbuf is already shared. (SET_STRINGBUF_SHARED): Removed. (scm_i_make_string, scm_i_substring, scm_i_substring_read_only, scm_i_make_symbol, scm_i_symbol_substring): Use set_stringbuf_shared instead of SET_STRINGBUF_SHARED. --- libguile/strings.c | 41 ++--- 1 files changed, 18 insertions(+), 23 deletions(-) diff --git a/libguile/strings.c b/libguile/strings.c index 494a658..35757f0 100644 --- a/libguile/strings.c +++ b/libguile/strings.c @@ -91,16 +91,6 @@ #define STRINGBUF_LENGTH(buf) (SCM_CELL_WORD_1 (buf)) -#define SET_STRINGBUF_SHARED(buf) \ - do \ -{ \ - /* Don't modify BUF if it's already marked as shared since it might be \ - a read-only, statically allocated stringbuf. */ \ - if (SCM_LIKELY (!STRINGBUF_SHARED (buf)))\ - SCM_SET_CELL_WORD_0 ((buf), SCM_CELL_WORD_0 (buf) | STRINGBUF_F_SHARED); \ -} \ - while (0) - #ifdef SCM_STRING_LENGTH_HISTOGRAM static size_t lenhist[1001]; #endif @@ -227,6 +217,19 @@ narrow_stringbuf (SCM buf) scm_i_pthread_mutex_t stringbuf_write_mutex = SCM_I_PTHREAD_MUTEX_INITIALIZER; +static void +set_stringbuf_shared (SCM buf) +{ + /* Don't modify BUF if it's already marked as shared since it + might be a read-only, statically allocated stringbuf. */ + if (!STRINGBUF_SHARED (buf)) +{ + scm_i_pthread_mutex_lock (&stringbuf_write_mutex); + SCM_SET_CELL_WORD_0 (buf, SCM_CELL_WORD_0 (buf) | STRINGBUF_F_SHARED); + scm_i_pthread_mutex_unlock (&stringbuf_write_mutex); +} +} + /* Copy-on-write strings. */ @@ -276,7 +279,7 @@ scm_i_make_string (size_t len, char **charsp, int read_only_p) if (SCM_UNLIKELY (scm_is_false (null_stringbuf))) { null_stringbuf = make_stringbuf (0); - SET_STRINGBUF_SHARED (null_stringbuf); + set_stringbuf_shared (null_stringbuf); } buf = null_stringbuf; } @@ -341,9 +344,7 @@ scm_i_substring (SCM str, size_t start, size_t end) SCM buf; size_t str_start; get_str_buf_start (&str, &buf, &str_start); - scm_i_pthread_mutex_lock (&stringbuf_write_mutex); - SET_STRINGBUF_SHARED (buf); - scm_i_pthread_mutex_unlock (&stringbuf_write_m
Re: Autocompilation/LilyPond
David Kastrup writes: > Mark H Weaver writes: > >> David Kastrup writes: >> >>> with the stable release 2.16 of LilyPond looming around the corner, it >>> will become imminent soon to think about supporting Guile 2.0. >>> >>> Previous attempts have mostly exploded around the problem that we have >>> something like >>> >>> (for-each ly:load init-scheme-files) >>> >>> in our lily.scm file, and the auto-compiler attempts to compile all >>> of those files independently as far as I understand. Unfortunately, >>> some of them contain macro definitions that other files rely on. >>> >>> Personally, I think it would make sense if we could get the >>> autocompiler to treat the whole blob of files as _one_ unit, and >>> recompile the unit if it gets out of date. >> >> I'm not sure that would help much. There's a deeper problem that you >> should be aware of. In Guile 1.x, macro uses within procedures are >> not expanded until the procedure is evaluated. In Guile 2, macros are >> expanded as soon as the procedure is defined, even if compilation is >> turned off. This means that functions can only use macros that were >> previously defined. > > I don't think that making this condition hold would be really hard. > LilyPond has a rather carefully selected load order in several stages, > so use-before-definition, whether in the context of macros or not, > should be more the exception than the rule, and only require smaller > rearrangements. Excellent! As long as you load everything in the right order, such that macros are defined before they are used, I don't see why there should be any other problems related to macros and compilation. Mark
Re: [PATCH] Efficient Gensym Hack
Hi Mark, A quick reaction to your summary; I'll look at the patches shortly. On Mon 05 Mar 2012 18:17, Mark H Weaver writes: > Here's an implementation of the efficient gensym hack for stable-2.0. Excellent! > It makes 'gensym' about 4.7 times faster on my Yeeloong. Gensyms are > not given names or even numbers until they are asked for their names or > hash values (for 'equal?' hash tables only). Ah, interesting :) I had always thought that you would need to number them first, but I see that you found a way to avoid that. > 1. The implementation of symbols is split between symbols.c and > strings.c, and the gensym hack needs the internals of both. I had to > add some new internal functions, including one to make a stringbuf from > a string and one to make a string from a stringbuf. Yeah, this is not good. With my dynstack work I found that functions that are internal but not static can prevent some important inlining. (I found the performance impact using "perf record", and valgrind --tool=callgrind). It's good that we have internal functions to avoid bloating our public API, but they do seem to prevent optimization. I wonder if LTO could help here. > 2. The symbol table uses the symbols themselves as the keys. This was > already hairy and inefficient: take a look at symbol_lookup_assoc_fn, > which has to convert symbols to strings (which involves allocation) to > implement the hash lookup! It uses the symbols as keys, but it uses the string hash value (not the symbol hashq value) as the hash. There are some important cases in which no string need be allocated: scm_from_utf8_symbol and scm_from_latin1_symbol. But yes, it's hairy. Note also that this has changed significantly in master. Your thoughts on that weak set mechanism would be appreciated. > IMHO, it would be much better to use a weak-value hash table, with > strings as the keys and symbols as the values. Maybe we can do that > for 2.2. Interesting idea. It's not clear to me how this would solve this problem though; but perhaps that will be clear when I read the patches. Anyway, to keep this short, I'll look at the patches in another mail. Cheers! Andy ps. An interesting benchmark (before and after) would be to time the execution of (compile-file "module/ice-9/psyntax.scm"). -- http://wingolog.org/
Re: Autocompilation/LilyPond
Mark H Weaver writes: > David Kastrup writes: > >> Mark H Weaver writes: >> >>> David Kastrup writes: >>> with the stable release 2.16 of LilyPond looming around the corner, it will become imminent soon to think about supporting Guile 2.0. Previous attempts have mostly exploded around the problem that we have something like (for-each ly:load init-scheme-files) in our lily.scm file, and the auto-compiler attempts to compile all of those files independently as far as I understand. Unfortunately, some of them contain macro definitions that other files rely on. Personally, I think it would make sense if we could get the autocompiler to treat the whole blob of files as _one_ unit, and recompile the unit if it gets out of date. >>> >>> I'm not sure that would help much. There's a deeper problem that you >>> should be aware of. In Guile 1.x, macro uses within procedures are >>> not expanded until the procedure is evaluated. In Guile 2, macros are >>> expanded as soon as the procedure is defined, even if compilation is >>> turned off. This means that functions can only use macros that were >>> previously defined. >> >> I don't think that making this condition hold would be really hard. >> LilyPond has a rather carefully selected load order in several stages, >> so use-before-definition, whether in the context of macros or not, >> should be more the exception than the rule, and only require smaller >> rearrangements. > > Excellent! As long as you load everything in the right order, such that > macros are defined before they are used, I don't see why there should be > any other problems related to macros and compilation. Because the individual files are not independent from one another? That's why I wrote: Personally, I think it would make sense if we could get the autocompiler to treat the whole blob of files as _one_ unit, and recompile the unit if it gets out of date. -- David Kastrup
Re: Autocompilation/LilyPond
David Kastrup writes: > Mark H Weaver writes: > >> Excellent! As long as you load everything in the right order, such that >> macros are defined before they are used, I don't see why there should be >> any other problems related to macros and compilation. > > Because the individual files are not independent from one another? > That's why I wrote: > > Personally, I think it would make sense if we could get the > autocompiler to treat the whole blob of files as _one_ unit, and > recompile the unit if it gets out of date. There's no problem with them being dependent on one another. When you load a file, even with auto-compilation, the macro expander will make use of whatever macros are already bound in the current module. The rest of the compiler sees only the output of the macro expander. Try the following experiment: put "(define-macro (bar x) `(quote ,x))" into "foo1.scm", and "(define (foo x) (bar x))" into "foo2.scm", and then within a REPL type: (load "foo1.scm") (load "foo2.scm") and observe that everything works as it should. If you really want everything to be compiled as one unit, you can use 'include' (which acts essentially the same as #include in C), though beware that Guile is not yet smart enough to auto-recompile when one of the included files gets updated. I don't see any compelling benefit to compiling everything as one unit, but do as you wish :) Regards, Mark
Re: [PATCH] Efficient Gensym Hack
Hi Andy, thanks for the quick response! :) Andy Wingo writes: > On Mon 05 Mar 2012 18:17, Mark H Weaver writes: > >> It makes 'gensym' about 4.7 times faster on my Yeeloong. Gensyms are >> not given names or even numbers until they are asked for their names or >> hash values (for 'equal?' hash tables only). > > Ah, interesting :) I had always thought that you would need to number > them first, but I see that you found a way to avoid that. I got the idea from http://icfp06.cs.uchicago.edu/dybvig-talk.pdf Anyway, in retrospect, I don't even see how I could make it work otherwise. The problem is that with lazy gensyms, the name you ultimately assign to the gensym _must_ not already be interned. Think about it. Suppose you assign a gensym with prefix "foo" the number 6, and that there's another symbol already interned with the name "foo6". Now you have two distinct symbols (in the sense of 'eq?'), both semantically interned, with the same name. There's no way to recover from this. The only solution I could find is to give the gensym a name that has not already been interned. In my implementation, I don't increment the counter at all until the lazy gensym is "forced". If that name is already interned, and I just keep incrementing the counter until I find a unique symbol. Only after it has been successfully interned do I _commit_ to the new name by clearing the "lazy gensym flag". >> 1. The implementation of symbols is split between symbols.c and >> strings.c, and the gensym hack needs the internals of both. I had to >> add some new internal functions, including one to make a stringbuf from >> a string and one to make a string from a stringbuf. > > Yeah, this is not good. With my dynstack work I found that functions > that are internal but not static can prevent some important inlining. That's definitely true, but fortunately these new internal functions are used only by the gensym code. Anyway, if you're concerned about this, one option would be to combine string.c and symbol.c into a single file. > Your thoughts on that weak set mechanism would be appreciated. Everything I know about weak storage mechanisms I learned from Bruno Haible. Highly recommended reading: http://www.haible.de/bruno/papers/cs/weak/WeakDatastructures-writeup.html I'll explore issues regarding the symbol table in another email. Thanks! Mark
About sweet-expression
I try to port sweet-expression to newest Guile. Fortunately, the author wrote a compatible version for old Guile in 2008. So I just try to rewrite part of it for some obvious reasons. It woks fine now. Though some guys doesn't like sweet-expression at all (OK, they're real Lispers! But me too ;-)). I'm not a big fan to write Scheme like this: --- define fibfast(n) if {n < 2} n fibup(n 2 1 0) --- But I think it maybe useful for newbies especially came from Python of C. If we can fix the problems which I'll mention later, I expect it to be added as inner language support. I put it here, if you're interested, please checkout here: git://gitorious.org/nacre/guile-sweet.git It supports Modern & Sugar both. You may read the README and try the example. But there're some problems for the original implementation, so I didn't format a patch. I think there should be something to be fixed. 1. The author's comment shows that we don't need to re-implement a reader if our reader treats "[]" and "{}" as delimiter. But seems "{}" is not delimiter in current Guile. Mark Weaver said it should be fixed. 2. And the second suggestion of Mark Weaver is we *must* try the Guile inner reader tools and keep some information if there's errors happened. I think it's a more worthy consideration. The current sweet is too weak to be a productive thing. 3. We don't have "nfx" macro in current Guile, so only simple infix expression can be evaluated. Say, all the operators are homologous: {1 + 2 + 3} is simple infix expression {1 + 2 - 3} is not. Anyway, it's not a big deal, I'll fix it later. But I'm not sure whether it should be in Guile or in app? Since the author use "nfx" directly, maybe it's a common thing we missed?
Re: About sweet-expression
sorry for stupid typo: s/Python of C/Python or C Regards
Re: About sweet-expression
Nala Ginrut writes: > I try to port sweet-expression to newest Guile. Fortunately, the > author wrote a compatible version for old Guile in 2008. So I just try > to rewrite part of it for some obvious reasons. It woks fine now. > > Though some guys doesn't like sweet-expression at all (OK, they're > real Lispers! But me too ;-)). I'm not a big fan to write Scheme like > this: > --- > define fibfast(n) > if {n < 2} > n > fibup(n 2 1 0) > --- > But I think it maybe useful for newbies especially came from Python of > C. If we can fix the problems which I'll mention later, I expect it to > be added as inner language support. > > I put it here, if you're interested, please checkout here: > git://gitorious.org/nacre/guile-sweet.git > It supports Modern & Sugar both. You may read the README and try the > example. > > But there're some problems for the original implementation, so I didn't > format a patch. I think there should be something to be fixed. > 1. The author's comment shows that we don't need to re-implement a > reader if our reader treats "[]" and "{}" as delimiter. But seems " > {}" is not delimiter in current Guile. Mark Weaver said it should be > fixed. Yes, it would be good to at least have a reader option to treat {} as delimiters. I haven't looked carefully at this, but I suspect the main requirement is to stop reading when we reach a '}', and furthermore to unget that delimiter when we reach it. Without this, the sweet expression reader will need to reimplement the entire reader from scratch, which seem suboptimal for several reasons. > 2. And the second suggestion of Mark Weaver is we *must* try the Guile > inner reader tools and keep some information if there's errors > happened. I think it's a more worthy consideration. The current sweet > is too weak to be a productive thing. To clarify, I was talking about the need to set source properties on the datums that are read. However, having thought a bit more about this, it's clear that Guile's reader can be of very little help here. The sweet expression reader itself must be responsible for setting source properties on all the resulting lists that are not written with normal parentheses. For example, in the following sweet expression: define fibfast(n) if {n < 2} n fibup(n 2 1 0) Guile's reader is of no help at all with source properties. Fortunately, Guile provides all of the interfaces you need to do this job from Scheme: 'set-source-properties!', 'port-filename', 'port-line' and 'port-column'. This will have to be implemented in the sweet expression reader. > 3. We don't have "nfx" macro in current Guile, so only simple infix > expression can be evaluated. Say, all the operators are homologous: > {1 + 2 + 3} is simple infix expression > {1 + 2 - 3} is not. > Anyway, it's not a big deal, I'll fix it later. But I'm not sure > whether it should be in Guile or in app? Since the author use "nfx" > directly, maybe it's a common thing we missed? It seems clear to me that David Wheeler intended for 'nfx' to be defined by the user, to implement whatever operator precedence they wish for their particular module. There's no universally good 'nfx' macro. Thanks, Mark
Re: About sweet-expression
Nala Ginrut writes: > I try to port sweet-expression to newest Guile. Fortunately, the > author wrote a compatible version for old Guile in 2008. So I just try > to rewrite part of it for some obvious reasons. It woks fine now. > > Though some guys doesn't like sweet-expression at all (OK, they're > real Lispers! But me too ;-)). I'm not a big fan to write Scheme like > this: > --- > define fibfast(n) > if {n < 2} > n > fibup(n 2 1 0) > --- > But I think it maybe useful for newbies especially came from Python of > C. That's like thinking the easiest path for a Franch-speaking newbie to learn English is to start by learning Latin. You'll find that kind of opinion expressed mostly by people fascinated with Latin rather than English. -- David Kastrup