date:20120305

Autocompilation/LilyPond

2012-03-05 Thread David Kastrup


Hi,

with the stable release 2.16 of LilyPond looming around the corner, it
will become imminent soon to think about supporting Guile 2.0.

Previous attempts have mostly exploded around the problem that we have
something like

(for-each ly:load init-scheme-files)

in our lily.scm file, and the auto-compiler attempts to compile all of
those files independently as far as I understand.  Unfortunately, some
of them contain macro definitions that other files rely on.

Personally, I think it would make sense if we could get the autocompiler
to treat the whole blob of files as _one_ unit, and recompile the unit
if it gets out of date.  That would save us from trying to factor out
macro dependencies into separate files (and since our markup system
defines a macro for every markup function, and since the macros are
needed when building markups in Scheme, this is actually rather hard to
do).  You might say that it is LilyPond's own fault that it has reverted
a bit more to macro programming than feasible for its own good.

I would not actually say that you are wrong.  However, there is the
problem of lifting a whole bunch of working code base under active
development into the Guilev2 era, and if we could tackle design or
maldesign questions mostly independently and in bite-sized chunks rather
than humongous patches moving material around, this would help a lot in
getting Guilev2 support on track.

Suggestions?

-- 
David Kastrup

Re: Autocompilation/LilyPond

2012-03-05 Thread Mark H Weaver

David Kastrup  writes:

> with the stable release 2.16 of LilyPond looming around the corner, it
> will become imminent soon to think about supporting Guile 2.0.
>
> Previous attempts have mostly exploded around the problem that we have
> something like
>
> (for-each ly:load init-scheme-files)
>
> in our lily.scm file, and the auto-compiler attempts to compile all of
> those files independently as far as I understand.  Unfortunately, some
> of them contain macro definitions that other files rely on.
>
> Personally, I think it would make sense if we could get the autocompiler
> to treat the whole blob of files as _one_ unit, and recompile the unit
> if it gets out of date.

I'm not sure that would help much.  There's a deeper problem that you
should be aware of.  In Guile 1.x, macro uses within procedures are not
expanded until the procedure is evaluated.  In Guile 2, macros are
expanded as soon as the procedure is defined, even if compilation is
turned off.  This means that functions can only use macros that were
previously defined.

For example, the following works in Guile 1.8 but not in Guile 2:

  (define (foo x) (bar x))
  (define-macro (bar x) `(quote ,x))

In Guile 2, you must put the definition of 'bar' before 'foo'.

So you'll need to load your code in the right order so that macros
always come before their uses.

One clarification is in order.  You might conclude from this that it is
not possible to define mutually-recursive macros, but that's not true.
For example, the following is fine:

(define-macro (beep x) `(boop ,x))
(define-macro (boop x) `(quote ,x))

That's because 'beep' doesn't use 'boop', it merely produces a use of
'boop' in its expansion.  'boop' is merely part of a quoted literal
within 'beep'.

   Regards,
 Mark

Re: Autocompilation/LilyPond

2012-03-05 Thread David Kastrup

Mark H Weaver  writes:

> David Kastrup  writes:
>
>> with the stable release 2.16 of LilyPond looming around the corner, it
>> will become imminent soon to think about supporting Guile 2.0.
>>
>> Previous attempts have mostly exploded around the problem that we have
>> something like
>>
>> (for-each ly:load init-scheme-files)
>>
>> in our lily.scm file, and the auto-compiler attempts to compile all
>> of those files independently as far as I understand.  Unfortunately,
>> some of them contain macro definitions that other files rely on.
>>
>> Personally, I think it would make sense if we could get the
>> autocompiler to treat the whole blob of files as _one_ unit, and
>> recompile the unit if it gets out of date.
>
> I'm not sure that would help much.  There's a deeper problem that you
> should be aware of.  In Guile 1.x, macro uses within procedures are
> not expanded until the procedure is evaluated.  In Guile 2, macros are
> expanded as soon as the procedure is defined, even if compilation is
> turned off.  This means that functions can only use macros that were
> previously defined.

I don't think that making this condition hold would be really hard.
LilyPond has a rather carefully selected load order in several stages,
so use-before-definition, whether in the context of macros or not,
should be more the exception than the rule, and only require smaller
rearrangements.

> One clarification is in order.  You might conclude from this that it
> is not possible to define mutually-recursive macros, but that's not
> true.

I don't think we use them, anyway.  Most problems are more due to small
oversights rather than maliciously clever overdesign.

-- 
David Kastrup

[PATCH] Efficient Gensym Hack

2012-03-05 Thread Mark H Weaver

Hello all,

Here's an implementation of the efficient gensym hack for stable-2.0.
It makes 'gensym' about 4.7 times faster on my Yeeloong.  Gensyms are
not given names or even numbers until they are asked for their names or
hash values (for 'equal?' hash tables only).

The first patch adds an optimization for strings that is important for
gensyms.  It avoids locking a mutex when setting the shared flag on a
stringbuf if the shared flag is already set.  This is important for
gensyms because when 'gensym' is called, it must save the stringbuf of
the prefix and set its shared flag.  In the common case where 'gensym'
is called many times with the same prefix, this avoids locking any
mutexes within most calls to 'gensym'.

The second patch is trivial and unrelated to the efficient gensym hack,
but I include it here to save everyone an additional recompile of
libguile.

The third patch actually implements the efficient gensym hack.  It was
made a bit hairier by two unfortunate facts:

1. The implementation of symbols is split between symbols.c and
strings.c, and the gensym hack needs the internals of both.  I had to
add some new internal functions, including one to make a stringbuf from
a string and one to make a string from a stringbuf.

2. The symbol table uses the symbols themselves as the keys.  This was
already hairy and inefficient: take a look at symbol_lookup_assoc_fn,
which has to convert symbols to strings (which involves allocation) to
implement the hash lookup!  However, it makes things even worse when
forcing lazy gensyms, because we must intern the gensym before clearing
its "lazy gensym flag".  This is necessary because if the name we chose
already belongs to a pre-existing interned symbol, we _must_ choose
another name, and we must prevent any other thread from getting our
gensym's name until after we have interned it.  This involved adding a
new internal function to get the name of a symbol without checking its
lazy gensym flag, for use by symbol_lookup_assoc_fn.  IMHO, it would be
much better to use a weak-value hash table, with strings as the keys and
symbols as the values.  Maybe we can do that for 2.2.

Anyway, here are the patches.  Comments and suggestions welcome.

Mark


>From 5f558244261f3a22217d5136d0aebb7f644d7efb Mon Sep 17 00:00:00 2001
From: Mark H Weaver 
Date: Mon, 5 Mar 2012 09:51:17 -0500
Subject: [PATCH 1/3] Don't lock mutex to set shared flag on stringbuf if it's
 already shared

* libguile/strings.c (set_stringbuf_shared): New internal static
  function to replace the macro SET_STRINGBUF_SHARED.  The macro assumed
  that the stringbuf_write_mutex was already locked, but this new
  function handles locking internally, and avoids locking if the
  stringbuf is already shared.

  (SET_STRINGBUF_SHARED): Removed.

  (scm_i_make_string, scm_i_substring, scm_i_substring_read_only,
  scm_i_make_symbol, scm_i_symbol_substring): Use set_stringbuf_shared
  instead of SET_STRINGBUF_SHARED.
---
 libguile/strings.c |   41 ++---
 1 files changed, 18 insertions(+), 23 deletions(-)

diff --git a/libguile/strings.c b/libguile/strings.c
index 494a658..35757f0 100644
--- a/libguile/strings.c
+++ b/libguile/strings.c
@@ -91,16 +91,6 @@
 
 #define STRINGBUF_LENGTH(buf)   (SCM_CELL_WORD_1 (buf))
 
-#define SET_STRINGBUF_SHARED(buf)	\
-  do	\
-{	\
-  /* Don't modify BUF if it's already marked as shared since it might be \
-	 a read-only, statically allocated stringbuf.  */		\
-  if (SCM_LIKELY (!STRINGBUF_SHARED (buf)))\
-	SCM_SET_CELL_WORD_0 ((buf), SCM_CELL_WORD_0 (buf) | STRINGBUF_F_SHARED); \
-}	\
-  while (0)
-
 #ifdef SCM_STRING_LENGTH_HISTOGRAM
 static size_t lenhist[1001];
 #endif
@@ -227,6 +217,19 @@ narrow_stringbuf (SCM buf)
 
 scm_i_pthread_mutex_t stringbuf_write_mutex = SCM_I_PTHREAD_MUTEX_INITIALIZER;
 
+static void
+set_stringbuf_shared (SCM buf)
+{
+  /* Don't modify BUF if it's already marked as shared since it
+ might be a read-only, statically allocated stringbuf.  */
+  if (!STRINGBUF_SHARED (buf))
+{
+  scm_i_pthread_mutex_lock (&stringbuf_write_mutex);
+  SCM_SET_CELL_WORD_0 (buf, SCM_CELL_WORD_0 (buf) | STRINGBUF_F_SHARED);
+  scm_i_pthread_mutex_unlock (&stringbuf_write_mutex);
+}
+}
+
 
 /* Copy-on-write strings.
  */
@@ -276,7 +279,7 @@ scm_i_make_string (size_t len, char **charsp, int read_only_p)
   if (SCM_UNLIKELY (scm_is_false (null_stringbuf)))
 {
   null_stringbuf = make_stringbuf (0);
-  SET_STRINGBUF_SHARED (null_stringbuf);
+  set_stringbuf_shared (null_stringbuf);
 }
   buf = null_stringbuf;
 }
@@ -341,9 +344,7 @@ scm_i_substring (SCM str, size_t start, size_t end)
   SCM buf;
   size_t str_start;
   get_str_buf_start (&str, &buf, &str_start);
-  scm_i_pthread_mutex_lock (&stringbuf_write_mutex);
-  SET_STRINGBUF_SHARED (buf);
-  scm_i_pthread_mutex_unlock (&stringbuf_write_m

Re: Autocompilation/LilyPond

2012-03-05 Thread Mark H Weaver

David Kastrup  writes:

> Mark H Weaver  writes:
>
>> David Kastrup  writes:
>>
>>> with the stable release 2.16 of LilyPond looming around the corner, it
>>> will become imminent soon to think about supporting Guile 2.0.
>>>
>>> Previous attempts have mostly exploded around the problem that we have
>>> something like
>>>
>>> (for-each ly:load init-scheme-files)
>>>
>>> in our lily.scm file, and the auto-compiler attempts to compile all
>>> of those files independently as far as I understand.  Unfortunately,
>>> some of them contain macro definitions that other files rely on.
>>>
>>> Personally, I think it would make sense if we could get the
>>> autocompiler to treat the whole blob of files as _one_ unit, and
>>> recompile the unit if it gets out of date.
>>
>> I'm not sure that would help much.  There's a deeper problem that you
>> should be aware of.  In Guile 1.x, macro uses within procedures are
>> not expanded until the procedure is evaluated.  In Guile 2, macros are
>> expanded as soon as the procedure is defined, even if compilation is
>> turned off.  This means that functions can only use macros that were
>> previously defined.
>
> I don't think that making this condition hold would be really hard.
> LilyPond has a rather carefully selected load order in several stages,
> so use-before-definition, whether in the context of macros or not,
> should be more the exception than the rule, and only require smaller
> rearrangements.

Excellent!  As long as you load everything in the right order, such that
macros are defined before they are used, I don't see why there should be
any other problems related to macros and compilation.

 Mark

Re: [PATCH] Efficient Gensym Hack

2012-03-05 Thread Andy Wingo

Hi Mark,

A quick reaction to your summary; I'll look at the patches shortly.

On Mon 05 Mar 2012 18:17, Mark H Weaver  writes:

> Here's an implementation of the efficient gensym hack for stable-2.0.

Excellent!

> It makes 'gensym' about 4.7 times faster on my Yeeloong.  Gensyms are
> not given names or even numbers until they are asked for their names or
> hash values (for 'equal?' hash tables only).

Ah, interesting :)  I had always thought that you would need to number
them first, but I see that you found a way to avoid that.

> 1. The implementation of symbols is split between symbols.c and
> strings.c, and the gensym hack needs the internals of both.  I had to
> add some new internal functions, including one to make a stringbuf from
> a string and one to make a string from a stringbuf.

Yeah, this is not good.  With my dynstack work I found that functions
that are internal but not static can prevent some important inlining.
(I found the performance impact using "perf record", and valgrind
--tool=callgrind).  It's good that we have internal functions to avoid
bloating our public API, but they do seem to prevent optimization.  I
wonder if LTO could help here.

> 2. The symbol table uses the symbols themselves as the keys.  This was
> already hairy and inefficient: take a look at symbol_lookup_assoc_fn,
> which has to convert symbols to strings (which involves allocation) to
> implement the hash lookup!

It uses the symbols as keys, but it uses the string hash value (not the
symbol hashq value) as the hash.  There are some important cases in
which no string need be allocated: scm_from_utf8_symbol and
scm_from_latin1_symbol.  But yes, it's hairy.

Note also that this has changed significantly in master.  Your thoughts
on that weak set mechanism would be appreciated.

> IMHO, it would be much better to use a weak-value hash table, with
> strings as the keys and symbols as the values.  Maybe we can do that
> for 2.2.

Interesting idea.  It's not clear to me how this would solve this
problem though; but perhaps that will be clear when I read the patches.

Anyway, to keep this short, I'll look at the patches in another mail.

Cheers!

Andy

ps. An interesting benchmark (before and after) would be to time the
execution of (compile-file "module/ice-9/psyntax.scm").
-- 
http://wingolog.org/

Re: Autocompilation/LilyPond

2012-03-05 Thread David Kastrup

Mark H Weaver  writes:

> David Kastrup  writes:
>
>> Mark H Weaver  writes:
>>
>>> David Kastrup  writes:
>>>
 with the stable release 2.16 of LilyPond looming around the corner, it
 will become imminent soon to think about supporting Guile 2.0.

 Previous attempts have mostly exploded around the problem that we have
 something like

 (for-each ly:load init-scheme-files)

 in our lily.scm file, and the auto-compiler attempts to compile all
 of those files independently as far as I understand.  Unfortunately,
 some of them contain macro definitions that other files rely on.

 Personally, I think it would make sense if we could get the
 autocompiler to treat the whole blob of files as _one_ unit, and
 recompile the unit if it gets out of date.
>>>
>>> I'm not sure that would help much.  There's a deeper problem that you
>>> should be aware of.  In Guile 1.x, macro uses within procedures are
>>> not expanded until the procedure is evaluated.  In Guile 2, macros are
>>> expanded as soon as the procedure is defined, even if compilation is
>>> turned off.  This means that functions can only use macros that were
>>> previously defined.
>>
>> I don't think that making this condition hold would be really hard.
>> LilyPond has a rather carefully selected load order in several stages,
>> so use-before-definition, whether in the context of macros or not,
>> should be more the exception than the rule, and only require smaller
>> rearrangements.
>
> Excellent!  As long as you load everything in the right order, such that
> macros are defined before they are used, I don't see why there should be
> any other problems related to macros and compilation.

Because the individual files are not independent from one another?
That's why I wrote:

Personally, I think it would make sense if we could get the
autocompiler to treat the whole blob of files as _one_ unit, and
recompile the unit if it gets out of date.

-- 
David Kastrup

Re: Autocompilation/LilyPond

2012-03-05 Thread Mark H Weaver

David Kastrup  writes:

> Mark H Weaver  writes:
>
>> Excellent!  As long as you load everything in the right order, such that
>> macros are defined before they are used, I don't see why there should be
>> any other problems related to macros and compilation.
>
> Because the individual files are not independent from one another?
> That's why I wrote:
>
> Personally, I think it would make sense if we could get the
> autocompiler to treat the whole blob of files as _one_ unit, and
> recompile the unit if it gets out of date.

There's no problem with them being dependent on one another.  When you
load a file, even with auto-compilation, the macro expander will make
use of whatever macros are already bound in the current module.  The
rest of the compiler sees only the output of the macro expander.

Try the following experiment: put "(define-macro (bar x) `(quote ,x))"
into "foo1.scm", and "(define (foo x) (bar x))" into "foo2.scm", and
then within a REPL type: (load "foo1.scm") (load "foo2.scm") and observe
that everything works as it should.

If you really want everything to be compiled as one unit, you can use
'include' (which acts essentially the same as #include in C), though
beware that Guile is not yet smart enough to auto-recompile when one of
the included files gets updated.

I don't see any compelling benefit to compiling everything as one unit,
but do as you wish :)

   Regards,
 Mark

Re: [PATCH] Efficient Gensym Hack

2012-03-05 Thread Mark H Weaver

Hi Andy, thanks for the quick response! :)

Andy Wingo  writes:

> On Mon 05 Mar 2012 18:17, Mark H Weaver  writes:
>
>> It makes 'gensym' about 4.7 times faster on my Yeeloong.  Gensyms are
>> not given names or even numbers until they are asked for their names or
>> hash values (for 'equal?' hash tables only).
>
> Ah, interesting :)  I had always thought that you would need to number
> them first, but I see that you found a way to avoid that.

I got the idea from http://icfp06.cs.uchicago.edu/dybvig-talk.pdf

Anyway, in retrospect, I don't even see how I could make it work
otherwise.  The problem is that with lazy gensyms, the name you
ultimately assign to the gensym _must_ not already be interned.  Think
about it.  Suppose you assign a gensym with prefix "foo" the number 6,
and that there's another symbol already interned with the name "foo6".
Now you have two distinct symbols (in the sense of 'eq?'), both
semantically interned, with the same name.  There's no way to recover
from this.

The only solution I could find is to give the gensym a name that has not
already been interned.  In my implementation, I don't increment the
counter at all until the lazy gensym is "forced".  If that name is
already interned, and I just keep incrementing the counter until I find
a unique symbol.  Only after it has been successfully interned do I
_commit_ to the new name by clearing the "lazy gensym flag".

>> 1. The implementation of symbols is split between symbols.c and
>> strings.c, and the gensym hack needs the internals of both.  I had to
>> add some new internal functions, including one to make a stringbuf from
>> a string and one to make a string from a stringbuf.
>
> Yeah, this is not good.  With my dynstack work I found that functions
> that are internal but not static can prevent some important inlining.

That's definitely true, but fortunately these new internal functions are
used only by the gensym code.  Anyway, if you're concerned about this,
one option would be to combine string.c and symbol.c into a single file.

> Your thoughts on that weak set mechanism would be appreciated.

Everything I know about weak storage mechanisms I learned from Bruno
Haible.  Highly recommended reading:

  http://www.haible.de/bruno/papers/cs/weak/WeakDatastructures-writeup.html

I'll explore issues regarding the symbol table in another email.

Thanks!
  Mark

About sweet-expression

2012-03-05 Thread Nala Ginrut

I try to port sweet-expression to newest Guile. Fortunately, the author
wrote a compatible version for old Guile in 2008. So I just try to rewrite
part of it for some obvious reasons. It woks fine now.
Though some guys doesn't like sweet-expression at all (OK, they're real
Lispers! But me too ;-)). I'm not a big fan to write Scheme like this:
---
define fibfast(n)
  if {n < 2}
n
fibup(n 2 1 0)
---
But I think it maybe useful for newbies especially came from Python of C.
If we can fix the problems which I'll mention later, I expect it to be
added as inner language support.

I put it here, if you're interested, please checkout here:
git://gitorious.org/nacre/guile-sweet.git
It supports Modern & Sugar both. You may read the README and try the
example.

But there're some problems for the original implementation, so I didn't
format a patch. I think there should be something to be fixed.
1.  The author's comment shows that we don't need to re-implement a reader
if our reader treats "[]" and "{}" as delimiter.  But seems "{}" is not
delimiter in current Guile. Mark Weaver said it should be fixed.
2. And the second suggestion of Mark Weaver is we *must* try the Guile
inner reader tools and keep some information if there's errors happened. I
think it's a more worthy consideration. The current sweet is too weak to be
a productive thing.
3. We don't have "nfx" macro in current Guile, so only simple infix
expression can be evaluated. Say, all the operators are homologous:
{1 + 2 + 3} is simple infix expression
{1 + 2 - 3} is not.
Anyway, it's not a big deal, I'll fix it later. But I'm not sure whether it
should be in Guile or in app? Since the author use "nfx" directly, maybe
it's a common thing we missed?

Re: About sweet-expression

2012-03-05 Thread Nala Ginrut

sorry for stupid typo:
s/Python of C/Python or C

Regards

Re: About sweet-expression

2012-03-05 Thread Mark H Weaver

Nala Ginrut  writes:

> I try to port sweet-expression to newest Guile. Fortunately, the
> author wrote a compatible version for old Guile in 2008. So I just try
> to rewrite part of it for some obvious reasons. It woks fine now.
>
> Though some guys doesn't like sweet-expression at all (OK, they're
> real Lispers! But me too ;-)). I'm not a big fan to write Scheme like
> this:
> ---
> define fibfast(n)  
>   if {n < 2}       
>     n
>     fibup(n 2 1 0)
> ---
> But I think it maybe useful for newbies especially came from Python of
> C. If we can fix the problems which I'll mention later, I expect it to
> be added as inner language support.
>
> I put it here, if you're interested, please checkout here:
> git://gitorious.org/nacre/guile-sweet.git
> It supports Modern & Sugar both. You may read the README and try the
> example.
>
> But there're some problems for the original implementation, so I didn't
> format a patch. I think there should be something to be fixed.
> 1.  The author's comment shows that we don't need to re-implement a
> reader if our reader treats "[]" and "{}" as delimiter.  But seems "
> {}" is not delimiter in current Guile. Mark Weaver said it should be
> fixed. 

Yes, it would be good to at least have a reader option to treat {} as
delimiters.  I haven't looked carefully at this, but I suspect the main
requirement is to stop reading when we reach a '}', and furthermore to
unget that delimiter when we reach it.

Without this, the sweet expression reader will need to reimplement the
entire reader from scratch, which seem suboptimal for several reasons.

> 2. And the second suggestion of Mark Weaver is we *must* try the Guile
> inner reader tools and keep some information if there's errors
> happened. I think it's a more worthy consideration. The current sweet
> is too weak to be a productive thing. 

To clarify, I was talking about the need to set source properties on the
datums that are read.  However, having thought a bit more about this,
it's clear that Guile's reader can be of very little help here.

The sweet expression reader itself must be responsible for setting
source properties on all the resulting lists that are not written with
normal parentheses.  For example, in the following sweet expression:

  define fibfast(n)  
    if {n < 2}       
      n
      fibup(n 2 1 0)

Guile's reader is of no help at all with source properties.
Fortunately, Guile provides all of the interfaces you need to do this
job from Scheme: 'set-source-properties!', 'port-filename', 'port-line'
and 'port-column'.  This will have to be implemented in the sweet
expression reader.

> 3. We don't have "nfx" macro in current Guile, so only simple infix
> expression can be evaluated. Say, all the operators are homologous: 
> {1 + 2 + 3} is simple infix expression
> {1 + 2 - 3} is not.
> Anyway, it's not a big deal, I'll fix it later. But I'm not sure
> whether it should be in Guile or in app? Since the author use "nfx"
> directly, maybe it's a common thing we missed?

It seems clear to me that David Wheeler intended for 'nfx' to be defined
by the user, to implement whatever operator precedence they wish for
their particular module.  There's no universally good 'nfx' macro.

Thanks,
  Mark

Re: About sweet-expression

2012-03-05 Thread David Kastrup

Nala Ginrut  writes:

> I try to port sweet-expression to newest Guile. Fortunately, the
> author wrote a compatible version for old Guile in 2008. So I just try
> to rewrite part of it for some obvious reasons. It woks fine now.
>
> Though some guys doesn't like sweet-expression at all (OK, they're
> real Lispers! But me too ;-)). I'm not a big fan to write Scheme like
> this:
> ---
> define fibfast(n)  
>   if {n < 2}       
>     n
>     fibup(n 2 1 0)
> ---
> But I think it maybe useful for newbies especially came from Python of
> C.

That's like thinking the easiest path for a Franch-speaking newbie to
learn English is to start by learning Latin.

You'll find that kind of opinion expressed mostly by people fascinated
with Latin rather than English.

-- 
David Kastrup

Autocompilation/LilyPond

Re: Autocompilation/LilyPond

Re: Autocompilation/LilyPond

[PATCH] Efficient Gensym Hack

Re: Autocompilation/LilyPond

Re: [PATCH] Efficient Gensym Hack

Re: Autocompilation/LilyPond

Re: Autocompilation/LilyPond

Re: [PATCH] Efficient Gensym Hack

About sweet-expression

Re: About sweet-expression

Re: About sweet-expression

Re: About sweet-expression

13 matches

Site Navigation

Mail list logo

Footer information