Re: GUILE_MAX_HEAP_SIZE

Ludovic Courtès Thu, 21 Aug 2008 11:36:42 -0700

Hello,

Han-Wen Nienhuys <[EMAIL PROTECTED]> writes:


> Ludovic Courtès escreveu:

>> Off the top of my head: incorrect indentation, missing spaces around
>> brackets, and more importantly comments (see (standards.info)Comments).
>
> The code I went through should not have that; please point me to locations
> where things are broken so I can fix them.

E.g., from commit:

+/*
+  Classic MIT Hack, see e.g. http://www.tekpool.com/?cat=9
+ */
+int scm_i_uint_bit_count(unsigned int u)

(BTW, it'd make sense to use Gnulib's `count-one-bits' module, which is
able to use GCC's `__builtin_popcount ()'.)
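
To illustrate the point, here is a minimal sketch of the two approaches: a portable bit-twiddling count and one that defers to the compiler builtin. The names `popcount_swar` and `popcount_fast` are hypothetical and come from neither the commit nor Gnulib. The portable version assumes a 32-bit word, and it is one common variant of the trick, not necessarily the one used in the commit.

```c
#include <stdint.h>

/* Portable parallel bit count for a 32-bit word (assumption: the
   caller only needs 32 bits).  Hypothetical name, not Guile's.  */
static unsigned int
popcount_swar (uint32_t v)
{
  /* Sum adjacent bits into 2-bit fields.  */
  v = v - ((v >> 1) & UINT32_C (0x55555555));
  /* Sum adjacent 2-bit fields into 4-bit fields.  */
  v = (v & UINT32_C (0x33333333)) + ((v >> 2) & UINT32_C (0x33333333));
  /* Sum adjacent 4-bit fields into bytes.  */
  v = (v + (v >> 4)) & UINT32_C (0x0F0F0F0F);
  /* Add the four bytes together; the total lands in the top byte.  */
  return (unsigned int) ((uint32_t) (v * UINT32_C (0x01010101)) >> 24);
}

/* Use GCC's builtin when available, else fall back to the portable
   version.  Hypothetical wrapper for illustration only.  */
static unsigned int
popcount_fast (uint32_t v)
{
#if defined (__GNUC__)
  return (unsigned int) __builtin_popcount (v);
#else
  return popcount_swar (v);
#endif
}
```

With a wrapper like this, callers never need to know which implementation was selected at compile time.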

+/*
+  Amount of cells marked in this cell, measured in 1-cells.
+ */
+int
+scm_i_card_marked_count (scm_t_cell *card, int span)

+  while (bvec < bvec_end) {
+    count += scm_i_uint_bit_count(*bvec);
+    bvec ++;
+  }

Other than that, the new `gc-segment-table.c' does look nice to the
eye.  ;-)
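
For comparison, a loop like the one quoted above could be laid out along the lines of the GNU coding standards roughly as follows. This is only a sketch: `bit_count` and `count_marked_bits` are hypothetical stand-ins, not the functions from the commit, and the bit count here uses a simple clear-lowest-bit loop for brevity.

```c
#include <stddef.h>

/* Return the number of bits set in U.  Hypothetical helper.  */
static unsigned int
bit_count (unsigned int u)
{
  unsigned int count = 0;

  /* Clear the lowest set bit until none remain.  */
  while (u != 0)
    {
      u &= u - 1;
      count++;
    }

  return count;
}

/* Return the total number of bits set in the words from BVEC up to,
   but not including, BVEC_END.  Hypothetical name and signature.  */
size_t
count_marked_bits (const unsigned int *bvec, const unsigned int *bvec_end)
{
  size_t count = 0;

  while (bvec < bvec_end)
    {
      count += bit_count (*bvec);
      bvec++;
    }

  return count;
}
```

Note the space before each opening parenthesis, the braces on their own lines, and the comments written as full sentences that name the arguments in upper case.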

>>> See below - note that the old .scm file was pretty much broken, as it 
>>> was using gc-live-object-stats which is only accurate just after the
>>> mark phase.
>> 
>> Hmm, `gc-live-object-stats' may return information from the previous
>> cycle, but it shouldn't be *that* accurate, should it?

Sorry, that should have read "that inaccurate"...

> No; the current implementation uses a similar scheme to
> gc-live-object-stats (counting in the bitvector) to determine the live
> object count.  There is now no way that it can ever be larger than the
> total heap size.

OK.

> I also changed the code to not look at the penultimate GC stats, since
> I couldn't invent a scenario where that would help, and IMO it only
> confuses things.  This may have been a remnant of the pre-lazy sweep
> code.

Well, it's actually hard to "invent" things in that area without any
measurement to back them up.

> There was some confusion about cells vs. double cells vs. bytes, but I
> think it was mostly in my head and perhaps in your stress test.
>
> If you really want to know, use git bisect.

I would have expected you to use such an approach when you volunteered
to fix things.

> A likely candidate is the patch from you that I applied. In
> particular,
> 4c7016dc06525c7910ce6c99d97eb9c52c6b43e4

Well, that's a good candidate since it's the last significant change
that was done to the GC on `master'.  However, Kevin's original post
compared 1.8 (which doesn't have this commit) to 1.6.

> +  seg->freelist->collected += collected * seg->span;
>
> looks fishy as this code is called multiple times for a given
> card. 

This very line was already there before the patch (see the diff).

> The scm_t_sweep_statistics were sometimes passed into the sweep
> function and sometimes not; I couldn't work out what the global
> variables were supposed to mean exactly, and consequently, if their
> updates were correct.  The reason I am confident about the statistics
> now is the assert()s I added to scm_i_gc(), which compare exactly mark
> bit counts, the sweep statistics and freelist statistics.  Some of the
> changes I did were to make these numbers match up exactly.

OK, let's hope for the best.  ;-)

> I'd be interested in seeing benchmarks between Guile and PLT after my 
> cleanup.  For a lot of benchmarks, GC time is an important factor, and
> it might be that we can now beat PLT (they use BGC).

Hmm, that seems unlikely to me, but that'd be good news.

> BTW, I'm attaching a new plot of the stress test, now up to iteration
> 10000 (the large allocation).  Interestingly, the large allocation is
> cleaned up only once - (on iteration 1000), and remains 'live' after
> that, so there may still be some bugs lurking.

Eh, how fun.

> char-sets are smobs and use single cells, AFAICT.

Right (though `SCM_NEWSMOB{2,3} ()' use double cells).

Thanks,
Ludo'.
