bug#19883: Correction for backtrace

2016-06-23 Thread Ludovic Courtès
Andy Wingo  skribis:

> On Thu 23 Jun 2016 11:50, Andy Wingo  writes:
>
>> On Thu 26 Feb 2015 16:30, David Kastrup  writes:
>>
>>> Try ./test 2 2000 200
>>
>> I can reproduce the crash with your test case, thanks :) The patch below
>> fixes the bug for me.  WDYT Ludovic?
>
> Here's a patch with a test case.

Perfect, you can ignore my previous message.  :-)

Ludo’.





bug#19883: Correction for backtrace

2016-06-23 Thread Ludovic Courtès
Andy Wingo  skribis:

> On Thu 26 Feb 2015 16:30, David Kastrup  writes:
>
>> Try ./test 2 2000 200
>
> I can reproduce the crash with your test case, thanks :) The patch below
> fixes the bug for me.  WDYT Ludovic?
>
> Andy
>
> commit db30120fc3a1727d8f221cbb014314f2babf841e
> Author: Andy Wingo 
> Date:   Thu Jun 23 11:47:42 2016 +0200
>
> Fix race between SMOB marking and finalization
> 
> * libguile/smob.c (clear_smobnum): New helper.
>   (finalize_smob): Re-set the smobnum to the "finalized smob" type
>   before finalizing.  Fixes #19883.
>   (scm_smob_prehistory): Pre-register a "finalized smob" type, which has
>   no mark procedure.

This LGTM, nice hack!

Do you think the test case could be added to the test suite somehow?

Thank you,
Ludo’.





bug#19883: Correction for backtrace

2016-06-23 Thread Andy Wingo
On Thu 23 Jun 2016 11:50, Andy Wingo  writes:

> On Thu 26 Feb 2015 16:30, David Kastrup  writes:
>
>> Try ./test 2 2000 200
>
> I can reproduce the crash with your test case, thanks :) The patch below
> fixes the bug for me.  WDYT Ludovic?

Here's a patch with a test case.  I'm going to apply it, as it seems to be
obviously the right thing and the test case does reproduce what I think
is the bug: racing mark and finalize procedures.  Even if it's only
happening in one thread, finalizers and mark procedures do introduce
concurrency.  We trigger the concurrency in a simple way, via
allocation in the finalizer.  The patch does fix the original test.  GC
could happen due to another thread of course.  I'm actually not sure
where the concurrency is coming from in David's test though :/
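
The race described above can be modelled without libguile or libgc. The following is only a toy sketch under stated assumptions: every name in it (ToySmob, toy_gc, toy_finalize, run_scenario) is hypothetical, type number 0 stands in for the "finalized smob" type, and the collection that a real allocation inside a finalizer might trigger is replaced by a direct call.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model only: none of these names come from libguile or libgc.
struct ToySmob {
  std::uintptr_t type_word;  // low byte: generic tag, next byte: type number
  int *payload;              // data released by the type's free step
};

static int unsafe_marks = 0;  // mark calls that observed already-freed data

// Per-type mark procedure. Like a real SMOB mark procedure it assumes the
// payload is valid; here we only count the cases where it is not.
static void toy_mark(const ToySmob &s) {
  if (s.payload == nullptr)
    ++unsafe_marks;  // in real code this would be a use-after-free
}

// The "collector": call the mark procedure of every object whose type
// number is non-zero. Type number 0 has no mark procedure.
static void toy_gc(const std::vector<ToySmob *> &live) {
  for (const ToySmob *s : live)
    if (((s->type_word >> 8) & 0xff) != 0)
      toy_mark(*s);
}

// Finalize one object. With clear_type_first set, the type number is reset
// to 0 before the payload is freed, mirroring the idea of the patch.
static void toy_finalize(ToySmob &s, const std::vector<ToySmob *> &live,
                         bool clear_type_first) {
  if (clear_type_first)
    s.type_word &= ~static_cast<std::uintptr_t>(0xff00);
  delete s.payload;
  s.payload = nullptr;
  toy_gc(live);  // stands in for a GC triggered by allocating in the finalizer
}

// Returns how many times a mark procedure ran on freed data.
int run_scenario(bool clear_type_first) {
  unsafe_marks = 0;
  ToySmob s{0x1b7f, new int(42)};
  std::vector<ToySmob *> live{&s};  // the object is still considered alive
  toy_finalize(s, live, clear_type_first);
  assert(s.payload == nullptr);
  return unsafe_marks;
}
```

In this model, run_scenario(false) reports one unsafe mark call and run_scenario(true) reports none, because the collector no longer dispatches to the type's mark procedure once the type number has been cleared.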

I'm very interested in any feedback you might have!

Andy

From 8dff3af087c6eaa83ae0d72aa8b22aef5c65d65d Mon Sep 17 00:00:00 2001
From: Andy Wingo 
Date: Thu, 23 Jun 2016 11:47:42 +0200
Subject: [PATCH] Fix race between SMOB marking and finalization

* libguile/smob.c (clear_smobnum): New helper.
  (finalize_smob): Re-set the smobnum to the "finalized smob" type
  before finalizing.  Fixes #19883.
  (scm_smob_prehistory): Pre-register a "finalized smob" type, which has
  no mark procedure.
* test-suite/standalone/test-smob-mark-race.c: New file.
* test-suite/standalone/Makefile.am: Arrange to build and run the new
  test.
---
 libguile/smob.c | 33 +--
 test-suite/standalone/Makefile.am   |  6 +++
 test-suite/standalone/test-smob-mark-race.c | 65 +
 3 files changed, 101 insertions(+), 3 deletions(-)
 create mode 100644 test-suite/standalone/test-smob-mark-race.c

diff --git a/libguile/smob.c b/libguile/smob.c
index 90849a8..ed9d91a 100644
--- a/libguile/smob.c
+++ b/libguile/smob.c
@@ -374,20 +374,43 @@ scm_gc_mark (SCM o)
 }
 
 
+static void*
+clear_smobnum (void *ptr)
+{
+  SCM smob;
+  scm_t_bits smobnum;
+
+  smob = PTR2SCM (ptr);
+
+  smobnum = SCM_SMOBNUM (smob);
+  /* Frob the object's type in place, re-setting it to be the "finalized
+     smob" type.  This will prevent other routines from accessing its
+     internals in a way that assumes that the smob data is valid.  This
+     is notably the case for SMOB's own "mark" procedure, if any; as the
+     finalizer runs without the alloc lock, it's possible for a GC to
+     occur while it's running, in which case the object is alive and yet
+     its data is invalid.  */
+  SCM_SET_SMOB_DATA_0 (smob, SCM_SMOB_DATA_0 (smob) & ~(scm_t_bits) 0xff00);
+
+  return (void *) smobnum;
+}
+
 /* Finalize SMOB by calling its SMOB type's free function, if any.  */
 static void
 finalize_smob (void *ptr, void *data)
 {
   SCM smob;
+  scm_t_bits smobnum;
   size_t (* free_smob) (SCM);
 
   smob = PTR2SCM (ptr);
+  smobnum = (scm_t_bits) GC_call_with_alloc_lock (clear_smobnum, ptr);
+
 #if 0
-  printf ("finalizing SMOB %p (smobnum: %u)\n",
- ptr, SCM_SMOBNUM (smob));
+  printf ("finalizing SMOB %p (smobnum: %u)\n", ptr, smobnum);
 #endif
 
-  free_smob = scm_smobs[SCM_SMOBNUM (smob)].free;
+  free_smob = scm_smobs[smobnum].free;
   if (free_smob)
     free_smob (smob);
 }
@@ -470,6 +493,7 @@ void
 scm_smob_prehistory ()
 {
   long i;
+  scm_t_bits finalized_smob_tc16;
 
   scm_i_pthread_key_create (&current_mark_stack_pointer, NULL);
   scm_i_pthread_key_create (&current_mark_stack_limit, NULL);
@@ -493,6 +517,9 @@ scm_smob_prehistory ()
   scm_smobs[i].apply  = 0;
   scm_smobs[i].apply_trampoline_objcode = SCM_BOOL_F;
 }
+
+  finalized_smob_tc16 = scm_make_smob_type ("finalized smob", 0);
+  if (SCM_TC2SMOBNUM (finalized_smob_tc16) != 0) abort ();
 }
 
 /*
diff --git a/test-suite/standalone/Makefile.am b/test-suite/standalone/Makefile.am
index 712418a..aec7418 100644
--- a/test-suite/standalone/Makefile.am
+++ b/test-suite/standalone/Makefile.am
@@ -275,4 +275,10 @@ test_smob_mark_LDADD = $(LIBGUILE_LDADD)
 check_PROGRAMS += test-smob-mark
 TESTS += test-smob-mark
 
+test_smob_mark_race_SOURCES = test-smob-mark-race.c
+test_smob_mark_race_CFLAGS = ${test_cflags}
+test_smob_mark_race_LDADD = $(LIBGUILE_LDADD)
+check_PROGRAMS += test-smob-mark-race
+TESTS += test-smob-mark-race
+
 EXTRA_DIST += ${check_SCRIPTS}
diff --git a/test-suite/standalone/test-smob-mark-race.c b/test-suite/standalone/test-smob-mark-race.c
new file mode 100644
index 000..eca0325
--- /dev/null
+++ b/test-suite/standalone/test-smob-mark-race.c
@@ -0,0 +1,65 @@
+/* Copyright (C) 2016 Free Software Foundation, Inc.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public License
+ * as published by the Free Software Foundation; either version 3 of
+ * the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABI

bug#19883: Correction for backtrace

2016-06-23 Thread Andy Wingo
On Thu 26 Feb 2015 16:30, David Kastrup  writes:

> Try ./test 2 2000 200

I can reproduce the crash with your test case, thanks :) The patch below
fixes the bug for me.  WDYT Ludovic?

Andy

commit db30120fc3a1727d8f221cbb014314f2babf841e
Author: Andy Wingo 
Date:   Thu Jun 23 11:47:42 2016 +0200

Fix race between SMOB marking and finalization

* libguile/smob.c (clear_smobnum): New helper.
  (finalize_smob): Re-set the smobnum to the "finalized smob" type
  before finalizing.  Fixes #19883.
  (scm_smob_prehistory): Pre-register a "finalized smob" type, which has
  no mark procedure.

diff --git a/libguile/smob.c b/libguile/smob.c
index 6a97caa..43ea613 100644
--- a/libguile/smob.c
+++ b/libguile/smob.c
@@ -372,20 +372,43 @@ scm_gc_mark (SCM o)
 }
 
 
+static void*
+clear_smobnum (void *ptr)
+{
+  SCM smob;
+  scm_t_bits smobnum;
+
+  smob = SCM_PACK_POINTER (ptr);
+
+  smobnum = SCM_SMOBNUM (smob);
+  /* Frob the object's type in place, re-setting it to be the "finalized
+     smob" type.  This will prevent other routines from accessing its
+     internals in a way that assumes that the smob data is valid.  This
+     is notably the case for SMOB's own "mark" procedure, if any; as the
+     finalizer runs without the alloc lock, it's possible for a GC to
+     occur while it's running, in which case the object is alive and yet
+     its data is invalid.  */
+  SCM_SET_SMOB_DATA_0 (smob, SCM_SMOB_DATA_0 (smob) & ~(scm_t_bits) 0xff00);
+
+  return (void *) smobnum;
+}
+
 /* Finalize SMOB by calling its SMOB type's free function, if any.  */
 static void
 finalize_smob (void *ptr, void *data)
 {
   SCM smob;
+  scm_t_bits smobnum;
   size_t (* free_smob) (SCM);
 
   smob = SCM_PACK_POINTER (ptr);
+  smobnum = (scm_t_bits) GC_call_with_alloc_lock (clear_smobnum, ptr);
+
 #if 0
-  printf ("finalizing SMOB %p (smobnum: %u)\n",
- ptr, SCM_SMOBNUM (smob));
+  printf ("finalizing SMOB %p (smobnum: %u)\n", ptr, smobnum);
 #endif
 
-  free_smob = scm_smobs[SCM_SMOBNUM (smob)].free;
+  free_smob = scm_smobs[smobnum].free;
   if (free_smob)
     free_smob (smob);
 }
@@ -460,6 +483,7 @@ void
 scm_smob_prehistory ()
 {
   long i;
+  scm_t_bits finalized_smob_tc16;
 
   scm_i_pthread_key_create (&current_mark_stack_pointer, NULL);
   scm_i_pthread_key_create (&current_mark_stack_limit, NULL);
@@ -483,6 +507,9 @@ scm_smob_prehistory ()
   scm_smobs[i].apply  = 0;
   scm_smobs[i].apply_trampoline = SCM_BOOL_F;
 }
+
+  finalized_smob_tc16 = scm_make_smob_type ("finalized smob", 0);
+  if (SCM_TC2SMOBNUM (finalized_smob_tc16) != 0) abort ();
 }
 
 /*
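
To make the mask in clear_smobnum concrete, here is a small standalone sketch of the bit arithmetic. It assumes that the low 8 bits of the type word hold the generic SMOB tag and that bits 8 to 15 hold the SMOB type number; toy_bits, smobnum_of, clear_smobnum_bits and finalized_word are hypothetical names, not libguile API. The sample value 7039 is the tc argument visible in one of the backtraces in this thread.

```cpp
#include <cassert>
#include <cstdint>

using toy_bits = std::uintptr_t;  // stand-in for scm_t_bits (assumption)

// Extract the type number, assuming it lives in bits 8..15 of the word.
inline toy_bits smobnum_of(toy_bits word) { return (word >> 8) & 0xff; }

// Same mask as in the patch: clear bits 8..15 and keep every other bit.
inline toy_bits clear_smobnum_bits(toy_bits word) {
  return word & ~static_cast<toy_bits>(0xff00);
}

// Retag a word as type number 0, which the patch reserves for the
// "finalized smob" type by registering it first and aborting otherwise.
inline toy_bits finalized_word(toy_bits word) {
  toy_bits cleared = clear_smobnum_bits(word);
  assert(smobnum_of(cleared) == 0);
  return cleared;
}
```

With these definitions, 7039 (0x1b7f) decodes to type number 27, and clearing it leaves 0x7f, so only the type number changes while the tag and any bits above bit 15 are preserved.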





bug#19883: Correction for backtrace

2015-02-27 Thread David Kastrup
l...@gnu.org (Ludovic Courtès) writes:

> David Kastrup  skribis:
>
>> Three years ago in August there was a meeting of developers at my house,
>> a lot of information was passed around and in a concerted effort
>> LilyPond-2.16.0 was released.
>>
>> Since then all developments would properly be labelled a one-person
>> project or side-project.
>
> OK.  I viewed LilyPond as a project with more people involved.

There are a number of them involved, but with regard to coding,
everybody is working on his own projects.  The largest amount of
"teamwork" is probably done in the translation team (which translates
documentation and web pages to several different languages) and with
documentation writers.

There are also people compiling the releases and running the test
suites.  And there is feedback on the bug trackers.

> Sure.  There’s quite a number of names showing up in the
> lilypond-devel archive though.  We need them to feel concerned about
> this.

The only concern you'll get is people piping up and saying they cannot
believe that not more progress has been made.  Which is not all that
dissimilar to what I see with GUILE and other projects even when sending
ready-made patches.

In this particular case however, the respective people do not have the
low-level skills to actually contribute significantly.  Which is the
reason I try to condense the ugly stuff into well-encapsulated and
controlled places instead of keeping them distributed across the code
base.

The memory management stuff is of that "nobody else treads here" kind.

Now the encoding stuff is wildly infuriating in its pointless erection
and shifting around of road blocks for people needing to actually work
with strings and buffers in their original encoding, but would possibly
be suitable as material for more C programmers to fight with.

-- 
David Kastrup





bug#19883: Correction for backtrace

2015-02-26 Thread David Kastrup
l...@gnu.org (Ludovic Courtès) writes:

> David Kastrup  skribis:
>
>> Shrug.  I'll put a link to this bug report to a suitable LilyPond issue.
>
> Thank you.  Though I want other LilyPond developers to get involved, and
> I’m afraid it would be easy for them to just ignore a side bug report.
>
> It’s a vital task for LilyPond, it cannot be a one-person side-project
> on the LilyPond side.

Three years ago in August there was a meeting of developers at my house,
a lot of information was passed around and in a concerted effort
LilyPond-2.16.0 was released.

Since then all developments would properly be labelled a one-person
project or side-project.

Now if you take a look at GUILE 2.1 development, in particular all
commits that are _not_ merges from the 2.0 branch, you'll find that
quite more than 90% have come from the same person.

LilyPond development is quite more diverse, but the various changes in
the last years have invariably been one-person projects.

Of the three large GUILE-based applications Gnucash, TeXmacs, and
LilyPond, the only successful migration to GUILEv2 so far has been
Gnucash, and its integration with GUILE is much smaller than that of
LilyPond.

The one-person projects concerning finding and/or fixing language and
compiler problems or restructuring the related code areas have been
exclusively mine.  That is not necessarily a good match since my
productivity drops to near zero when I lose interest in a problem, and
without anybody else keeping the ball rolling, it can easily stay there.

But I have no resources I could call upon to change that.

-- 
David Kastrup





bug#19883: Correction for backtrace

2015-02-26 Thread Ludovic Courtès
David Kastrup  skribis:

> l...@gnu.org (Ludovic Courtès) writes:
>
>> David Kastrup  skribis:

[...]

>>> It would not help since many of the references are stored in STL
>>> containers (like std::vector ) which have their data
>>> allocated/deallocated separately from the memory area of the
>>> structure itself.
>>
>> Oh, OK.  Still, I don’t think this is a problem because each C++
>> object has a corresponding SMOB, and said SMOB is GC-protected; thus
>> the C++ object itself is also GC-protected until the SMOB is
>> unprotected.
>
> The code given in test.cc is representative for LilyPond: most of the
> C++ objects refer to other C++ objects via pointers, and the protection
> of SMOB and C++ objects is managed through the mark callbacks.  Complex
> C++ objects contain their own SCM value as part of the Smob base class.
> Simple C++ objects (derived from Simple_smob) don't and are only
> optionally managed by GUILE.

I don’t think this contradicts what I wrote above, does it?

>> Here’s the patch I’ve ended up with:
>>
>> diff --git a/smobs.hh b/smobs.hh
>> index 3701280..a41a645 100644
>> --- a/smobs.hh
>> +++ b/smobs.hh
>> @@ -263,6 +263,20 @@ private:
>>  protected:
>>Smob () : self_scm_ (SCM_UNDEFINED), protection_cons_ (SCM_EOL) { };
>>  public:
>> +  static void *operator new (std::size_t size)
>> +  {
>> +/* This C++ object is referred to by the corresponding SMOB, which is
>> +   itself GC-protected.  Thus, no need to protect the C++ object.  */
>> +return scm_gc_malloc (size, "lily-smob");
>> +  }
>> +
>> +  static void operator delete (void *thing)
>> +  {
>> +/* Nothing to do: the GC will reclaim memory for THING when it deems
>> +   appropriate.  */
>> +// printf ("delete %p\n", thing);
>> +  }
>> +
>
> As I stated: this will not help with STL containers which are
> extensively used in pretty much every data structure of LilyPond.

Sorry, I don’t understand how it doesn’t help.

It would be a problem if ‘Smob’ objects could be copied, thus ending up
in non-GC-scanned storage, but IIUC they cannot be copied because their
copy constructor is private.

What am I missing?

At any rate, I don’t see any failure with the test program.

>> I think it would help to get everyone involved on both sides.  Thus,
>> could you Cc: this bug report to the LilyPond developer list, or the
>> corresponding LilyPond bug report?  This is really important to me.
>
> Shrug.  I'll put a link to this bug report to a suitable LilyPond issue.

Thank you.  Though I want other LilyPond developers to get involved, and
I’m afraid it would be easy for them to just ignore a side bug report.

It’s a vital task for LilyPond, it cannot be a one-person side-project
on the LilyPond side.

Thanks in advance,
Ludo’.





bug#19883: Correction for backtrace

2015-02-26 Thread David Kastrup
l...@gnu.org (Ludovic Courtès) writes:

> David Kastrup  skribis:
>
>> l...@gnu.org (Ludovic Courtès) writes:
>>
>>> David Kastrup  skribis:
>>>
>>>> This is embarrassing: I used the wrong executable in connection with the
>>>> core dump.  With the matching executable, the coredump makes a lot more
>>>> sense:
>>>>
>>>> #0  0x in ?? ()
>>>> #1  0x0804aee0 in Smob_base::mark_trampoline (arg=0x9fbb000)
>>>> at smobs.tcc:34
>>>> #2  0xb761b2da in ?? () from /usr/lib/libguile-2.0.so.22
>>>> #3  0xb72751f8 in GC_mark_from () from /usr/lib/i386-linux-gnu/libgc.so.1
>>>
>>> Could you try commenting out all the SMOB mark functions in LilyPond?
>>>
>>> This doesn’t fix the bug, of course, but it’s probably a good
>>> workaround: user-provided mark functions are not needed in Guile 2.0
>>> since libgc scans the whole heap for live pointers.
>>
>> Even the test program crashes at the end (when `count' is called in
>> order to traverse the created hierarchy) when you disable the setting of
>> the mark function in the init method in smobs.tcc.
>
> Could you add debugging symbols for libguile?  I don’t understand how
> ‘count’ gets called.

Figure me surprised.  Here is the recursive walk:

int
Family::count ()
{
  int sum = 1;
  for (int i = 0; i < kids.size (); i++)
    sum += kids[i]->count ();
  return sum;
}

and here is the starting call in workload():

  cout << "last has " << Family::unsmob (k)->count () << endl;

> Do you know if this is a use-after-free error?

Sure.  Nothing else would clobber the kids[] array to contain bad
pointers.

> If this is the case, Andy had the idea of turning on topological
> finalization in the GC.  This may help for this particular case, but I
> vaguely recall that this breaks other finalizer-related things.

I don't see why.  Topological finalization might help with
mark-after-free.  But why would it help if there is not even any mark
call involved?  This is clearly use-after-free.

> (I would check by myself, but ISTR that building LilyPond “on one’s
> own” is not recommended.  What would you suggest?  A Guix recipe would
> be sweet.)

Is there a reason you are not using the test program provided with this
bug report?  There is no real point in experimenting with LilyPond's
complexity when a simple test program using its memory management
classes already crashes.

LilyPond's GUILEv2 branch is currently out of order again since 2.0.11
changed encoding mechanisms _again_ in an incompatible manner (what
GUILE calls "stable" is anything but).  It is becoming harder and harder
to work around GUILE's attempts of wresting encoding control from the
application, while GUILE has no byte-transparent decoding of UTF-8, does
not support strings encoded in UTF-8, and (as of 2.0.11 or 2.0.10)
supports _only_ string ports redecoded to UTF-8.

So dealing with memory-mapped UTF-8 encoded files which are multiplexed
between reading by GUILE and reading by an UTF-8 decoding parser has
again been thwarted.  While I try figuring out how to repair the damage
this time, testing with LilyPond itself is hard to interpret since a
number of problems are not related to the memory management.

As long as this simple test program can show the memory management
related crashes, I don't see the point in throwing people at LilyPond:
that has not delivered any results the last several times I tried it.

>> A pointer to a C++ structure does not appear to protect the
>> corresponding SMOB data and free_smob calls the delete operator which
>> calls destructors and clobbers the memory area.
>
> Oh, I was mistaken in my previous message.  GC scans the stack and the
> GC-managed heap (stuff allocated with GC_MALLOC/scm_gc_malloc et al.),
> but it does *not* scan the malloc/new heap.
>
> So indeed, C++ objects that hold references to ‘SCM’ objects, such as
> instances of ‘Smob’, must either have a mark function, or they must
> be allocated with scm_gc_malloc.
>
> Would it be possible to add a ‘new’ operator to ‘Smob’ that uses
> ‘scm_gc_malloc’, and a ‘delete’ operator that uses ‘scm_gc_free’?

It would not help since many of the references are stored in STL
containers (like std::vector ) which have their data
allocated/deallocated separately from the memory area of the structure
itself.
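
The point about separately allocated container storage can be illustrated with a toy reachability model. This is a sketch under assumptions, not a description of how libgc works internally: Block, mark_reachable and child_survives are hypothetical names, and a "scanned" flag stands in for whether a block was obtained from the collector (scanned) or from malloc/new (not scanned).

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy model: a block is any allocation; the collector only looks for
// pointers inside blocks that are flagged as scanned.
struct Block {
  bool scanned;               // true: contents are traced by the collector
  std::vector<Block *> refs;  // pointers stored inside this block
};

// Mark everything reachable from the roots, following pointers only out of
// scanned blocks.
std::set<const Block *> mark_reachable(const std::vector<Block *> &roots) {
  std::set<const Block *> marked;
  std::vector<const Block *> work(roots.begin(), roots.end());
  while (!work.empty()) {
    const Block *b = work.back();
    work.pop_back();
    assert(b != nullptr);
    if (!marked.insert(b).second)
      continue;  // already visited
    if (!b->scanned)
      continue;  // contents are invisible to the collector
    for (const Block *r : b->refs)
      work.push_back(r);
  }
  return marked;
}

// parent is the object itself, buffer plays the container's element
// storage, child is an object referenced only from that storage.
bool child_survives(bool buffer_is_scanned) {
  Block child{true, {}};
  Block buffer{buffer_is_scanned, {&child}};
  Block parent{true, {&buffer}};
  return mark_reachable({&parent}).count(&child) == 1;
}
```

In this model the child is found only when the intermediate buffer is itself scanned, which is the situation the objection above is about: moving the object into collector-managed memory does not move the container's element storage along with it.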

Frankly, I don't get the current strategy of GUILE: basically any use of
scm_set_smob_mark will result in a function that can be called with
garbage from a smob that has already been deallocated via the function
registered with scm_set_smob_free.

GUILEv2 developers have resisted fixing this bug for years by trying to
stop people from using scm_set_smob_mark and instead telling people to
have their entire heap scanned by a conservative garbage collector.

For an application like LilyPond which can easily have the heap cover
more than half of the available address space and run for half an hour
(when generating docs) processing independent files with large
individual memory requirements, this strategy will have both
conside

bug#19883: Correction for backtrace

2015-02-26 Thread Ludovic Courtès
David Kastrup  skribis:

> l...@gnu.org (Ludovic Courtès) writes:
>
>> David Kastrup  skribis:
>>
>>> This is embarrassing: I used the wrong executable in connection with the
>>> core dump.  With the matching executable, the coredump makes a lot more
>>> sense:
>>>
>>> #0  0x in ?? ()
>>> #1  0x0804aee0 in Smob_base::mark_trampoline (arg=0x9fbb000)
>>> at smobs.tcc:34
>>> #2  0xb761b2da in ?? () from /usr/lib/libguile-2.0.so.22
>>> #3  0xb72751f8 in GC_mark_from () from /usr/lib/i386-linux-gnu/libgc.so.1
>>
>> Could you try commenting out all the SMOB mark functions in LilyPond?
>>
>> This doesn’t fix the bug, of course, but it’s probably a good
>> workaround: user-provided mark functions are not needed in Guile 2.0
>> since libgc scans the whole heap for live pointers.
>
> Even the test program crashes at the end (when `count' is called in
> order to traverse the created hierarchy) when you disable the setting of
> the mark function in the init method in smobs.tcc.

Could you add debugging symbols for libguile?  I don’t understand how
‘count’ gets called.

Do you know if this is a use-after-free error?  Perhaps setting
MALLOC_CHECK_=1 would give a hint.

If this is the case, Andy had the idea of turning on topological
finalization in the GC.  This may help for this particular case, but I
vaguely recall that this breaks other finalizer-related things.

(I would check by myself, but ISTR that building LilyPond “on one’s own”
is not recommended.  What would you suggest?  A Guix recipe would be
sweet.)

> A pointer to a C++ structure does not appear to protect the
> corresponding SMOB data and free_smob calls the delete operator which
> calls destructors and clobbers the memory area.

Oh, I was mistaken in my previous message.  GC scans the stack and the
GC-managed heap (stuff allocated with GC_MALLOC/scm_gc_malloc et al.),
but it does *not* scan the malloc/new heap.

So indeed, C++ objects that hold references to ‘SCM’ objects, such as
instances of ‘Smob’, must either have a mark function, or they must
be allocated with scm_gc_malloc.

Would it be possible to add a ‘new’ operator to ‘Smob’ that uses
‘scm_gc_malloc’, and a ‘delete’ operator that uses ‘scm_gc_free’?

Thanks,
Ludo’.





bug#19883: Correction for backtrace

2015-02-25 Thread David Kastrup
l...@gnu.org (Ludovic Courtès) writes:

> David Kastrup  skribis:
>
>> This is embarrassing: I used the wrong executable in connection with the
>> core dump.  With the matching executable, the coredump makes a lot more
>> sense:
>>
>> #0  0x in ?? ()
>> #1  0x0804aee0 in Smob_base::mark_trampoline (arg=0x9fbb000)
>> at smobs.tcc:34
>> #2  0xb761b2da in ?? () from /usr/lib/libguile-2.0.so.22
>> #3  0xb72751f8 in GC_mark_from () from /usr/lib/i386-linux-gnu/libgc.so.1
>
> Could you try commenting out all the SMOB mark functions in LilyPond?
>
> This doesn’t fix the bug, of course, but it’s probably a good
> workaround: user-provided mark functions are not needed in Guile 2.0
> since libgc scans the whole heap for live pointers.

Even the test program crashes at the end (when `count' is called in
order to traverse the created hierarchy) when you disable the setting of
the mark function in the init method in smobs.tcc.

A pointer to a C++ structure does not appear to protect the
corresponding SMOB data and free_smob calls the delete operator which
calls destructors and clobbers the memory area.

Program received signal SIGSEGV, Segmentation fault.
0x08049b0a in std::vector >::size (
this=0x1b8b) at /usr/include/c++/4.9/bits/stl_vector.h:655
655   { return size_type(this->_M_impl._M_finish - this->_M_impl._M_start); }
(gdb) bt
#0  0x08049b0a in std::vector >::size (
this=0x1b8b) at /usr/include/c++/4.9/bits/stl_vector.h:655
#1  0x08049498 in Family::count (this=0x1b7f) at test.cc:53
#2  0x0804947c in Family::count (this=0x834f350) at test.cc:54
#3  0x0804947c in Family::count (this=0x8297d40) at test.cc:54
#4  0x0804947c in Family::count (this=0x828a9f8) at test.cc:54
#5  0x0804947c in Family::count (this=0x817d768) at test.cc:54
#6  0x0804947c in Family::count (this=0x828d588) at test.cc:54
#7  0x0804947c in Family::count (this=0x83298b8) at test.cc:54
#8  0x0804947c in Family::count (this=0x817fe58) at test.cc:54
#9  0x080495df in workload (avv=0xb074) at test.cc:73
#10 0xb7e66dfd in ?? () from /usr/lib/libguile-2.0.so.22
#11 0xb7ef08e7 in ?? () from /usr/lib/libguile-2.0.so.22
#12 0xb7ec9fb9 in ?? () from /usr/lib/libguile-2.0.so.22
#13 0xb7f08f20 in ?? () from /usr/lib/libguile-2.0.so.22
#14 0xb7f09539 in ?? () from /usr/lib/libguile-2.0.so.22
#15 0xb7e714f3 in scm_call_4 () from /usr/lib/libguile-2.0.so.22
#16 0xb7ef0acf in scm_catch_with_pre_unwind_handler ()
   from /usr/lib/libguile-2.0.so.22
#17 0xb7ef0bd4 in scm_c_catch () from /usr/lib/libguile-2.0.so.22
#18 0xb7e675d1 in ?? () from /usr/lib/libguile-2.0.so.22
#19 0xb7e676d3 in scm_c_with_continuation_barrier ()
   from /usr/lib/libguile-2.0.so.22
#20 0xb7eedf7e in ?? () from /usr/lib/libguile-2.0.so.22
#21 0xb7b272c1 in GC_call_with_stack_base ()
   from /usr/lib/i386-linux-gnu/libgc.so.1
#22 0xb7eee3e6 in scm_with_guile () from /usr/lib/libguile-2.0.so.22
#23 0x08049685 in main (ac=4, av=0xb074) at test.cc:85


-- 
David Kastrup





bug#19883: Correction for backtrace

2015-02-25 Thread Ludovic Courtès
David Kastrup  skribis:

> This is embarrassing: I used the wrong executable in connection with the
> core dump.  With the matching executable, the coredump makes a lot more
> sense:
>
> #0  0x in ?? ()
> #1  0x0804aee0 in Smob_base::mark_trampoline (arg=0x9fbb000)
> at smobs.tcc:34
> #2  0xb761b2da in ?? () from /usr/lib/libguile-2.0.so.22
> #3  0xb72751f8 in GC_mark_from () from /usr/lib/i386-linux-gnu/libgc.so.1

Could you try commenting out all the SMOB mark functions in LilyPond?

This doesn’t fix the bug, of course, but it’s probably a good
workaround: user-provided mark functions are not needed in Guile 2.0
since libgc scans the whole heap for live pointers.

Thanks,
Ludo’.





bug#19883: Correction for backtrace

2015-02-16 Thread David Kastrup

This is embarrassing: I used the wrong executable in connection with the
core dump.  With the matching executable, the coredump makes a lot more
sense:

#0  0x in ?? ()
#1  0x0804aee0 in Smob_base::mark_trampoline (arg=0x9fbb000)
at smobs.tcc:34
#2  0xb761b2da in ?? () from /usr/lib/libguile-2.0.so.22
#3  0xb72751f8 in GC_mark_from () from /usr/lib/i386-linux-gnu/libgc.so.1
#4  0xb72766ca in GC_mark_some () from /usr/lib/i386-linux-gnu/libgc.so.1
#5  0xb726cd16 in GC_stopped_mark () from /usr/lib/i386-linux-gnu/libgc.so.1
#6  0xb726d730 in GC_try_to_collect_inner ()
   from /usr/lib/i386-linux-gnu/libgc.so.1
#7  0xb726e12a in GC_collect_or_expand ()
   from /usr/lib/i386-linux-gnu/libgc.so.1
#8  0xb726e2b1 in GC_allocobj () from /usr/lib/i386-linux-gnu/libgc.so.1
#9  0xb72731a4 in GC_generic_malloc_inner ()
   from /usr/lib/i386-linux-gnu/libgc.so.1
#10 0xb72732be in GC_generic_malloc () from /usr/lib/i386-linux-gnu/libgc.so.1
#11 0xb761b81e in scm_i_new_smob () from /usr/lib/libguile-2.0.so.22
#12 0x0804985a in scm_new_smob (tc=7039, data=158987224)
at /usr/include/guile/2.0/libguile/smob.h:91
#13 0x0804a6fc in Smob_base::register_ptr (p=0x979f3d8)
at smobs.tcc:53
#14 0x0804a03d in Smob::unprotected_smobify_self (this=0x979f3dc)
at smobs.hh:273
#15 0x08049a73 in Smob::smobify_self (this=0x979f3dc) at smobs.hh:286
#16 0x0804934d in Family::Family (this=0x979f3d8, totals=0, branch=2)
at test.cc:30
#17 0x0804938f in Family::Family (this=0x98b7c98, totals=3, branch=2)
at test.cc:35
#18 0x0804938f in Family::Family (this=0x979e598, totals=8, branch=2)
at test.cc:35
#19 0x0804938f in Family::Family (this=0x98b65c8, totals=17, branch=2)
at test.cc:35
#20 0x0804938f in Family::Family (this=0x9934d78, totals=37, branch=2)
at test.cc:35
#21 0x0804938f in Family::Family (this=0x99377c8, totals=76, branch=2)
at test.cc:35
#22 0x0804938f in Family::Family (this=0x96921e0, totals=154, branch=2)
at test.cc:35
#23 0x0804938f in Family::Family (this=0xa0c3998, totals=311, branch=2)
at test.cc:35
#24 0x0804938f in Family::Family (this=0x9d0cc88, totals=623, branch=2)
at test.cc:35
#25 0x0804938f in Family::Family (this=0x9b8b0a8, totals=1248, branch=2)
at test.cc:35
#26 0x0804938f in Family::Family (this=0x9e84808, totals=2498, branch=2)
at test.cc:35
#27 0x0804938f in Family::Family (this=0x9bf3d48, totals=4998, branch=2)
at test.cc:35
#28 0x0804938f in Family::Family (this=0x9d05de8, totals=9998, branch=2)
at test.cc:35
#29 0x0804938f in Family::Family (this=0x9b8dcb8, totals=1, branch=2)
at test.cc:35
#30 0x080495e7 in workload (avv=0xbfca7aa4) at test.cc:71
#31 0xb75b7dfd in ?? () from /usr/lib/libguile-2.0.so.22
#32 0xb76418e7 in ?? () from /usr/lib/libguile-2.0.so.22
#33 0xb761afb9 in ?? () from /usr/lib/libguile-2.0.so.22
#34 0xb7659f20 in ?? () from /usr/lib/libguile-2.0.so.22
#35 0xb765a539 in ?? () from /usr/lib/libguile-2.0.so.22
#36 0xb75c24f3 in scm_call_4 () from /usr/lib/libguile-2.0.so.22
#37 0xb7641acf in scm_catch_with_pre_unwind_handler ()
   from /usr/lib/libguile-2.0.so.22
#38 0xb7641bd4 in scm_c_catch () from /usr/lib/libguile-2.0.so.22
#39 0xb75b85d1 in ?? () from /usr/lib/libguile-2.0.so.22
#40 0xb75b86d3 in scm_c_with_continuation_barrier ()
   from /usr/lib/libguile-2.0.so.22
#41 0xb763ef7e in ?? () from /usr/lib/libguile-2.0.so.22
#42 0xb72782c1 in GC_call_with_stack_base ()
   from /usr/lib/i386-linux-gnu/libgc.so.1
#43 0xb763f3e6 in scm_with_guile () from /usr/lib/libguile-2.0.so.22
#44 0x080496c5 in main (ac=4, av=0xbfca7aa4) at test.cc:85
(gdb) 

So the core indeed occurs when trying to call scm_gc_mark on a smob no
longer (or not yet?) associated with a valid structure in memory, in a
garbage collection apparently triggered during normal allocation of smob
data.

Sorry for the nonsensical core dump previously.

-- 
David Kastrup