Autocompilation/LilyPond

2012-03-05 Thread David Kastrup

Hi,

with the stable release 2.16 of LilyPond looming around the corner, it
will become imminent soon to think about supporting Guile 2.0.

Previous attempts have mostly exploded around the problem that we have
something like

(for-each ly:load init-scheme-files)

in our lily.scm file, and the auto-compiler attempts to compile all of
those files independently as far as I understand.  Unfortunately, some
of them contain macro definitions that other files rely on.

Personally, I think it would make sense if we could get the autocompiler
to treat the whole blob of files as _one_ unit, and recompile the unit
if it gets out of date.  That would save us from trying to factor out
macro dependencies into separate files (and since our markup system
defines a macro for every markup function, and since the macros are
needed when building markups in Scheme, this is actually rather hard to
do).  You might say that it is LilyPond's own fault that it has reverted
a bit more to macro programming than feasible for its own good.

I would not actually say that you are wrong.  However, there is the
problem of lifting a whole bunch of working code base under active
development into the Guilev2 era, and if we could tackle design or
maldesign questions mostly independently and in bite-sized chunks rather
than humongous patches moving material around, this would help a lot in
getting Guilev2 support on track.

Suggestions?

-- 
David Kastrup




Re: Autocompilation/LilyPond

2012-03-05 Thread Mark H Weaver
David Kastrup d...@gnu.org writes:

 with the stable release 2.16 of LilyPond looming around the corner, it
 will become imminent soon to think about supporting Guile 2.0.

 Previous attempts have mostly exploded around the problem that we have
 something like

 (for-each ly:load init-scheme-files)

 in our lily.scm file, and the auto-compiler attempts to compile all of
 those files independently as far as I understand.  Unfortunately, some
 of them contain macro definitions that other files rely on.

 Personally, I think it would make sense if we could get the autocompiler
 to treat the whole blob of files as _one_ unit, and recompile the unit
 if it gets out of date.

I'm not sure that would help much.  There's a deeper problem that you
should be aware of.  In Guile 1.x, macro uses within procedures are not
expanded until the procedure is evaluated.  In Guile 2, macros are
expanded as soon as the procedure is defined, even if compilation is
turned off.  This means that functions can only use macros that were
previously defined.

For example, the following works in Guile 1.8 but not in Guile 2:

  (define (foo x) (bar x))
  (define-macro (bar x) `(quote ,x))

In Guile 2, you must put the definition of 'bar' before 'foo'.

So you'll need to load your code in the right order so that macros
always come before their uses.

One clarification is in order.  You might conclude from this that it is
not possible to define mutually-recursive macros, but that's not true.
For example, the following is fine:

(define-macro (beep x) `(boop ,x))
(define-macro (boop x) `(quote ,x))

That's because 'beep' doesn't use 'boop', it merely produces a use of
'boop' in its expansion.  'boop' is merely part of a quoted literal
within 'beep'.

   Regards,
 Mark



Re: Autocompilation/LilyPond

2012-03-05 Thread David Kastrup
Mark H Weaver m...@netris.org writes:

 David Kastrup d...@gnu.org writes:

 with the stable release 2.16 of LilyPond looming around the corner, it
 will become imminent soon to think about supporting Guile 2.0.

 Previous attempts have mostly exploded around the problem that we have
 something like

 (for-each ly:load init-scheme-files)

 in our lily.scm file, and the auto-compiler attempts to compile all
 of those files independently as far as I understand.  Unfortunately,
 some of them contain macro definitions that other files rely on.

 Personally, I think it would make sense if we could get the
 autocompiler to treat the whole blob of files as _one_ unit, and
 recompile the unit if it gets out of date.

 I'm not sure that would help much.  There's a deeper problem that you
 should be aware of.  In Guile 1.x, macro uses within procedures are
 not expanded until the procedure is evaluated.  In Guile 2, macros are
 expanded as soon as the procedure is defined, even if compilation is
 turned off.  This means that functions can only use macros that were
 previously defined.

I don't think that making this condition hold would be really hard.
LilyPond has a rather carefully selected load order in several stages,
so use-before-definition, whether in the context of macros or not,
should be more the exception than the rule, and only require smaller
rearrangements.

 One clarification is in order.  You might conclude from this that it
 is not possible to define mutually-recursive macros, but that's not
 true.

I don't think we use them, anyway.  Most problems are more due to small
oversights rather than maliciously clever overdesign.

-- 
David Kastrup



[PATCH] Efficient Gensym Hack

2012-03-05 Thread Mark H Weaver
Hello all,

Here's an implementation of the efficient gensym hack for stable-2.0.
It makes 'gensym' about 4.7 times faster on my Yeeloong.  Gensyms are
not given names or even numbers until they are asked for their names or
hash values (for 'equal?' hash tables only).

The first patch adds an optimization for strings that is important for
gensyms.  It avoids locking a mutex when setting the shared flag on a
stringbuf if the shared flag is already set.  This is important for
gensyms because when 'gensym' is called, it must save the stringbuf of
the prefix and set its shared flag.  In the common case where 'gensym'
is called many times with the same prefix, this avoids locking any
mutexes within most calls to 'gensym'.

The second patch is trivial and unrelated to the efficient gensym hack,
but I include it here to save everyone an additional recompile of
libguile.

The third patch actually implements the efficient gensym hack.  It was
made a bit hairier by two unfortunate facts:

1. The implementation of symbols is split between symbols.c and
strings.c, and the gensym hack needs the internals of both.  I had to
add some new internal functions, including one to make a stringbuf from
a string and one to make a string from a stringbuf.

2. The symbol table uses the symbols themselves as the keys.  This was
already hairy and inefficient: take a look at symbol_lookup_assoc_fn,
which has to convert symbols to strings (which involves allocation) to
implement the hash lookup!  However, it makes things even worse when
forcing lazy gensyms, because we must intern the gensym before clearing
its lazy gensym flag.  This is necessary because if the name we chose
already belongs to a pre-existing interned symbol, we _must_ choose
another name, and we must prevent any other thread from getting our
gensym's name until after we have interned it.  This involved adding a
new internal function to get the name of a symbol without checking its
lazy gensym flag, for use by symbol_lookup_assoc_fn.  IMHO, it would be
much better to use a weak-value hash table, with strings as the keys and
symbols as the values.  Maybe we can do that for 2.2.

Anyway, here are the patches.  Comments and suggestions welcome.

Mark


From 5f558244261f3a22217d5136d0aebb7f644d7efb Mon Sep 17 00:00:00 2001
From: Mark H Weaver m...@netris.org
Date: Mon, 5 Mar 2012 09:51:17 -0500
Subject: [PATCH 1/3] Don't lock mutex to set shared flag on stringbuf if it's
 already shared

* libguile/strings.c (set_stringbuf_shared): New internal static
  function to replace the macro SET_STRINGBUF_SHARED.  The macro assumed
  that the stringbuf_write_mutex was already locked, but this new
  function handles locking internally, and avoids locking if the
  stringbuf is already shared.

  (SET_STRINGBUF_SHARED): Removed.

  (scm_i_make_string, scm_i_substring, scm_i_substring_read_only,
  scm_i_make_symbol, scm_i_symbol_substring): Use set_stringbuf_shared
  instead of SET_STRINGBUF_SHARED.
---
 libguile/strings.c |   41 ++---
 1 files changed, 18 insertions(+), 23 deletions(-)

diff --git a/libguile/strings.c b/libguile/strings.c
index 494a658..35757f0 100644
--- a/libguile/strings.c
+++ b/libguile/strings.c
@@ -91,16 +91,6 @@
 
 #define STRINGBUF_LENGTH(buf)   (SCM_CELL_WORD_1 (buf))
 
-#define SET_STRINGBUF_SHARED(buf)	\
-  do	\
-{	\
-  /* Don't modify BUF if it's already marked as shared since it might be \
-	 a read-only, statically allocated stringbuf.  */		\
-  if (SCM_LIKELY (!STRINGBUF_SHARED (buf)))\
-	SCM_SET_CELL_WORD_0 ((buf), SCM_CELL_WORD_0 (buf) | STRINGBUF_F_SHARED); \
-}	\
-  while (0)
-
 #ifdef SCM_STRING_LENGTH_HISTOGRAM
 static size_t lenhist[1001];
 #endif
@@ -227,6 +217,19 @@ narrow_stringbuf (SCM buf)
 
 scm_i_pthread_mutex_t stringbuf_write_mutex = SCM_I_PTHREAD_MUTEX_INITIALIZER;
 
+static void
+set_stringbuf_shared (SCM buf)
+{
+  /* Don't modify BUF if it's already marked as shared since it
+ might be a read-only, statically allocated stringbuf.  */
+  if (!STRINGBUF_SHARED (buf))
+{
+  scm_i_pthread_mutex_lock (stringbuf_write_mutex);
+  SCM_SET_CELL_WORD_0 (buf, SCM_CELL_WORD_0 (buf) | STRINGBUF_F_SHARED);
+  scm_i_pthread_mutex_unlock (stringbuf_write_mutex);
+}
+}
+
 
 /* Copy-on-write strings.
  */
@@ -276,7 +279,7 @@ scm_i_make_string (size_t len, char **charsp, int read_only_p)
   if (SCM_UNLIKELY (scm_is_false (null_stringbuf)))
 {
   null_stringbuf = make_stringbuf (0);
-  SET_STRINGBUF_SHARED (null_stringbuf);
+  set_stringbuf_shared (null_stringbuf);
 }
   buf = null_stringbuf;
 }
@@ -341,9 +344,7 @@ scm_i_substring (SCM str, size_t start, size_t end)
   SCM buf;
   size_t str_start;
   get_str_buf_start (str, buf, str_start);
-  scm_i_pthread_mutex_lock (stringbuf_write_mutex);
-  SET_STRINGBUF_SHARED (buf);
-  scm_i_pthread_mutex_unlock 

Re: Autocompilation/LilyPond

2012-03-05 Thread Mark H Weaver
David Kastrup d...@gnu.org writes:

 Mark H Weaver m...@netris.org writes:

 David Kastrup d...@gnu.org writes:

 with the stable release 2.16 of LilyPond looming around the corner, it
 will become imminent soon to think about supporting Guile 2.0.

 Previous attempts have mostly exploded around the problem that we have
 something like

 (for-each ly:load init-scheme-files)

 in our lily.scm file, and the auto-compiler attempts to compile all
 of those files independently as far as I understand.  Unfortunately,
 some of them contain macro definitions that other files rely on.

 Personally, I think it would make sense if we could get the
 autocompiler to treat the whole blob of files as _one_ unit, and
 recompile the unit if it gets out of date.

 I'm not sure that would help much.  There's a deeper problem that you
 should be aware of.  In Guile 1.x, macro uses within procedures are
 not expanded until the procedure is evaluated.  In Guile 2, macros are
 expanded as soon as the procedure is defined, even if compilation is
 turned off.  This means that functions can only use macros that were
 previously defined.

 I don't think that making this condition hold would be really hard.
 LilyPond has a rather carefully selected load order in several stages,
 so use-before-definition, whether in the context of macros or not,
 should be more the exception than the rule, and only require smaller
 rearrangements.

Excellent!  As long as you load everything in the right order, such that
macros are defined before they are used, I don't see why there should be
any other problems related to macros and compilation.

 Mark



Re: [PATCH] Efficient Gensym Hack

2012-03-05 Thread Andy Wingo
Hi Mark,

A quick reaction to your summary; I'll look at the patches shortly.

On Mon 05 Mar 2012 18:17, Mark H Weaver m...@netris.org writes:

 Here's an implementation of the efficient gensym hack for stable-2.0.

Excellent!

 It makes 'gensym' about 4.7 times faster on my Yeeloong.  Gensyms are
 not given names or even numbers until they are asked for their names or
 hash values (for 'equal?' hash tables only).

Ah, interesting :)  I had always thought that you would need to number
them first, but I see that you found a way to avoid that.

 1. The implementation of symbols is split between symbols.c and
 strings.c, and the gensym hack needs the internals of both.  I had to
 add some new internal functions, including one to make a stringbuf from
 a string and one to make a string from a stringbuf.

Yeah, this is not good.  With my dynstack work I found that functions
that are internal but not static can prevent some important inlining.
(I found the performance impact using perf record, and valgrind
--tool=callgrind).  It's good that we have internal functions to avoid
bloating our public API, but they do seem to prevent optimization.  I
wonder if LTO could help here.

 2. The symbol table uses the symbols themselves as the keys.  This was
 already hairy and inefficient: take a look at symbol_lookup_assoc_fn,
 which has to convert symbols to strings (which involves allocation) to
 implement the hash lookup!

It uses the symbols as keys, but it uses the string hash value (not the
symbol hashq value) as the hash.  There are some important cases in
which no string need be allocated: scm_from_utf8_symbol and
scm_from_latin1_symbol.  But yes, it's hairy.

Note also that this has changed significantly in master.  Your thoughts
on that weak set mechanism would be appreciated.

 IMHO, it would be much better to use a weak-value hash table, with
 strings as the keys and symbols as the values.  Maybe we can do that
 for 2.2.

Interesting idea.  It's not clear to me how this would solve this
problem though; but perhaps that will be clear when I read the patches.

Anyway, to keep this short, I'll look at the patches in another mail.

Cheers!

Andy

ps. An interesting benchmark (before and after) would be to time the
execution of (compile-file module/ice-9/psyntax.scm).
-- 
http://wingolog.org/



Re: Autocompilation/LilyPond

2012-03-05 Thread David Kastrup
Mark H Weaver m...@netris.org writes:

 David Kastrup d...@gnu.org writes:

 Mark H Weaver m...@netris.org writes:

 David Kastrup d...@gnu.org writes:

 with the stable release 2.16 of LilyPond looming around the corner, it
 will become imminent soon to think about supporting Guile 2.0.

 Previous attempts have mostly exploded around the problem that we have
 something like

 (for-each ly:load init-scheme-files)

 in our lily.scm file, and the auto-compiler attempts to compile all
 of those files independently as far as I understand.  Unfortunately,
 some of them contain macro definitions that other files rely on.

 Personally, I think it would make sense if we could get the
 autocompiler to treat the whole blob of files as _one_ unit, and
 recompile the unit if it gets out of date.

 I'm not sure that would help much.  There's a deeper problem that you
 should be aware of.  In Guile 1.x, macro uses within procedures are
 not expanded until the procedure is evaluated.  In Guile 2, macros are
 expanded as soon as the procedure is defined, even if compilation is
 turned off.  This means that functions can only use macros that were
 previously defined.

 I don't think that making this condition hold would be really hard.
 LilyPond has a rather carefully selected load order in several stages,
 so use-before-definition, whether in the context of macros or not,
 should be more the exception than the rule, and only require smaller
 rearrangements.

 Excellent!  As long as you load everything in the right order, such that
 macros are defined before they are used, I don't see why there should be
 any other problems related to macros and compilation.

Because the individual files are not independent from one another?
That's why I wrote:

Personally, I think it would make sense if we could get the
autocompiler to treat the whole blob of files as _one_ unit, and
recompile the unit if it gets out of date.

-- 
David Kastrup



Re: Autocompilation/LilyPond

2012-03-05 Thread Mark H Weaver
David Kastrup d...@gnu.org writes:

 Mark H Weaver m...@netris.org writes:

 Excellent!  As long as you load everything in the right order, such that
 macros are defined before they are used, I don't see why there should be
 any other problems related to macros and compilation.

 Because the individual files are not independent from one another?
 That's why I wrote:

 Personally, I think it would make sense if we could get the
 autocompiler to treat the whole blob of files as _one_ unit, and
 recompile the unit if it gets out of date.

There's no problem with them being dependent on one another.  When you
load a file, even with auto-compilation, the macro expander will make
use of whatever macros are already bound in the current module.  The
rest of the compiler sees only the output of the macro expander.

Try the following experiment: put (define-macro (bar x) `(quote ,x))
into foo1.scm, and (define (foo x) (bar x)) into foo2.scm, and
then within a REPL type: (load foo1.scm) (load foo2.scm) and observe
that everything works as it should.

If you really want everything to be compiled as one unit, you can use
'include' (which acts essentially the same as #include in C), though
beware that Guile is not yet smart enough to auto-recompile when one of
the included files gets updated.

I don't see any compelling benefit to compiling everything as one unit,
but do as you wish :)

   Regards,
 Mark



Re: [PATCH] Efficient Gensym Hack

2012-03-05 Thread Mark H Weaver
Hi Andy, thanks for the quick response! :)

Andy Wingo wi...@pobox.com writes:

 On Mon 05 Mar 2012 18:17, Mark H Weaver m...@netris.org writes:

 It makes 'gensym' about 4.7 times faster on my Yeeloong.  Gensyms are
 not given names or even numbers until they are asked for their names or
 hash values (for 'equal?' hash tables only).

 Ah, interesting :)  I had always thought that you would need to number
 them first, but I see that you found a way to avoid that.

I got the idea from http://icfp06.cs.uchicago.edu/dybvig-talk.pdf

Anyway, in retrospect, I don't even see how I could make it work
otherwise.  The problem is that with lazy gensyms, the name you
ultimately assign to the gensym _must_ not already be interned.  Think
about it.  Suppose you assign a gensym with prefix foo the number 6,
and that there's another symbol already interned with the name foo6.
Now you have two distinct symbols (in the sense of 'eq?'), both
semantically interned, with the same name.  There's no way to recover
from this.

The only solution I could find is to give the gensym a name that has not
already been interned.  In my implementation, I don't increment the
counter at all until the lazy gensym is forced.  If that name is
already interned, and I just keep incrementing the counter until I find
a unique symbol.  Only after it has been successfully interned do I
_commit_ to the new name by clearing the lazy gensym flag.

 1. The implementation of symbols is split between symbols.c and
 strings.c, and the gensym hack needs the internals of both.  I had to
 add some new internal functions, including one to make a stringbuf from
 a string and one to make a string from a stringbuf.

 Yeah, this is not good.  With my dynstack work I found that functions
 that are internal but not static can prevent some important inlining.

That's definitely true, but fortunately these new internal functions are
used only by the gensym code.  Anyway, if you're concerned about this,
one option would be to combine string.c and symbol.c into a single file.

 Your thoughts on that weak set mechanism would be appreciated.

Everything I know about weak storage mechanisms I learned from Bruno
Haible.  Highly recommended reading:

  http://www.haible.de/bruno/papers/cs/weak/WeakDatastructures-writeup.html

I'll explore issues regarding the symbol table in another email.

Thanks!
  Mark



About sweet-expression

2012-03-05 Thread Nala Ginrut
I try to port sweet-expression to newest Guile. Fortunately, the author
wrote a compatible version for old Guile in 2008. So I just try to rewrite
part of it for some obvious reasons. It woks fine now.
Though some guys doesn't like sweet-expression at all (OK, they're real
Lispers! But me too ;-)). I'm not a big fan to write Scheme like this:
---
define fibfast(n)
  if {n  2}
n
fibup(n 2 1 0)
---
But I think it maybe useful for newbies especially came from Python of C.
If we can fix the problems which I'll mention later, I expect it to be
added as inner language support.

I put it here, if you're interested, please checkout here:
git://gitorious.org/nacre/guile-sweet.git
It supports Modern  Sugar both. You may read the README and try the
example.

But there're some problems for the original implementation, so I didn't
format a patch. I think there should be something to be fixed.
1.  The author's comment shows that we don't need to re-implement a reader
if our reader treats [] and {} as delimiter.  But seems {} is not
delimiter in current Guile. Mark Weaver said it should be fixed.
2. And the second suggestion of Mark Weaver is we *must* try the Guile
inner reader tools and keep some information if there's errors happened. I
think it's a more worthy consideration. The current sweet is too weak to be
a productive thing.
3. We don't have nfx macro in current Guile, so only simple infix
expression can be evaluated. Say, all the operators are homologous:
{1 + 2 + 3} is simple infix expression
{1 + 2 - 3} is not.
Anyway, it's not a big deal, I'll fix it later. But I'm not sure whether it
should be in Guile or in app? Since the author use nfx directly, maybe
it's a common thing we missed?


Re: About sweet-expression

2012-03-05 Thread Nala Ginrut
sorry for stupid typo:
s/Python of C/Python or C

Regards


Re: About sweet-expression

2012-03-05 Thread Mark H Weaver
Nala Ginrut nalagin...@gmail.com writes:

 I try to port sweet-expression to newest Guile. Fortunately, the
 author wrote a compatible version for old Guile in 2008. So I just try
 to rewrite part of it for some obvious reasons. It woks fine now.

 Though some guys doesn't like sweet-expression at all (OK, they're
 real Lispers! But me too ;-)). I'm not a big fan to write Scheme like
 this:
 ---
 define fibfast(n)  
   if {n  2}       
     n
     fibup(n 2 1 0)
 ---
 But I think it maybe useful for newbies especially came from Python of
 C. If we can fix the problems which I'll mention later, I expect it to
 be added as inner language support.

 I put it here, if you're interested, please checkout here:
 git://gitorious.org/nacre/guile-sweet.git
 It supports Modern  Sugar both. You may read the README and try the
 example.

 But there're some problems for the original implementation, so I didn't
 format a patch. I think there should be something to be fixed.
 1.  The author's comment shows that we don't need to re-implement a
 reader if our reader treats [] and {} as delimiter.  But seems 
 {} is not delimiter in current Guile. Mark Weaver said it should be
 fixed. 

Yes, it would be good to at least have a reader option to treat {} as
delimiters.  I haven't looked carefully at this, but I suspect the main
requirement is to stop reading when we reach a '}', and furthermore to
unget that delimiter when we reach it.

Without this, the sweet expression reader will need to reimplement the
entire reader from scratch, which seem suboptimal for several reasons.

 2. And the second suggestion of Mark Weaver is we *must* try the Guile
 inner reader tools and keep some information if there's errors
 happened. I think it's a more worthy consideration. The current sweet
 is too weak to be a productive thing. 

To clarify, I was talking about the need to set source properties on the
datums that are read.  However, having thought a bit more about this,
it's clear that Guile's reader can be of very little help here.

The sweet expression reader itself must be responsible for setting
source properties on all the resulting lists that are not written with
normal parentheses.  For example, in the following sweet expression:

  define fibfast(n)  
    if {n  2}       
      n
      fibup(n 2 1 0)

Guile's reader is of no help at all with source properties.
Fortunately, Guile provides all of the interfaces you need to do this
job from Scheme: 'set-source-properties!', 'port-filename', 'port-line'
and 'port-column'.  This will have to be implemented in the sweet
expression reader.

 3. We don't have nfx macro in current Guile, so only simple infix
 expression can be evaluated. Say, all the operators are homologous: 
 {1 + 2 + 3} is simple infix expression
 {1 + 2 - 3} is not.
 Anyway, it's not a big deal, I'll fix it later. But I'm not sure
 whether it should be in Guile or in app? Since the author use nfx
 directly, maybe it's a common thing we missed?

It seems clear to me that David Wheeler intended for 'nfx' to be defined
by the user, to implement whatever operator precedence they wish for
their particular module.  There's no universally good 'nfx' macro.

Thanks,
  Mark



Re: About sweet-expression

2012-03-05 Thread David Kastrup
Nala Ginrut nalagin...@gmail.com writes:

 I try to port sweet-expression to newest Guile. Fortunately, the
 author wrote a compatible version for old Guile in 2008. So I just try
 to rewrite part of it for some obvious reasons. It woks fine now.

 Though some guys doesn't like sweet-expression at all (OK, they're
 real Lispers! But me too ;-)). I'm not a big fan to write Scheme like
 this:
 ---
 define fibfast(n)  
   if {n  2}       
     n
     fibup(n 2 1 0)
 ---
 But I think it maybe useful for newbies especially came from Python of
 C.

That's like thinking the easiest path for a Franch-speaking newbie to
learn English is to start by learning Latin.

You'll find that kind of opinion expressed mostly by people fascinated
with Latin rather than English.

-- 
David Kastrup