On Nov 8, 2011, at 5:32 PM, Marvin Humphrey wrote:
> Greets,
>
> In Perl 5.15 (current "blead" Perl -- the developer release), Lucy fails most
> of its tests because of an exception thrown during global destruction:
>
> (in cleanup) Insane attempt to destroy VTable for class 'Lucy::Object::Obj'
> lucy_VTable_destroy at
> /home/sts/cpansmoke/perl-5.15.2/cpan/build/Lucy-0.2.2-o_YHcb/core/Lucy/Object/VTable.c
> line 44
> at t/018-host.t line 0
>
> That's a tripwire that I set because VTable's destructor should *never* be
> invoked. We leak VTables on purpose.
>
> What has changed in Perl 5.15 is that destructors are now called during global
> destruction; previously, Perl freed all SVs during global destruction but did
> not call DESTROY on objects.
Perl previously did call DESTROY on objects during global destruction, and the
order was non-deterministic, but a few objects would escape the purge, in
particular:
• blessed array elements (bless \$_[0])
• blessed closure variables (bless my \$x; sub foo { $x ... })
• any other unreferenced SVs (not referenced by RVs or GVs)
The VTables belong to the third category.
>
>
> http://search.cpan.org/~stevan/perl-5.15.3/pod/perlobj.pod#Global_Destruction
>
> This change to Perl is going to require a corresponding change
> to Lucy's Perl bindings. Consider the following code:
>
> my %hash = (
> searcher => Lucy::Search::IndexSearcher->new(index => $path),
> );
> $hash{circular_reference} = \%hash;
>
> Because of the circular reference, that Perl hash, the Searcher it refers to,
> and crucially, the Searcher's inner PolyReader will not be deallocated until
> global destruction. During global destruction, though, refcounting goes out
> the window and destruction order is effectively random.
How has Lucy worked before, seeing that the order was already
non-deterministic? Do Lucy's objects simply depend on the presence of the
VTable?
>
> What we would ordinarily want to see is destruction moving from the outermost
> object to the innermost:
>
> Perl hash
> IndexSearcher
> PolyReader
> SegReaders
> DataReaders
> InStreams
> FileHandles
> ...
>
> This is important because when we get to the IndexSearcher's destructor, its
> subcomponents still need to be valid:
>
> void
> IxSearcher_destroy(IndexSearcher *self) {
> DECREF(self->reader);
This seems to answer my question in the negative.
From reading this code superficially, it looks as though the Searcher object
has an internal (non-Perl) reference count on the reader. The Perl object will
also have a reference count on the reader. That should prevent the reader from
being destroyed before the searcher is.
> // ...
> SUPER_DESTROY(self, INDEXSEARCHER);
> }
>
> If self->reader has already been freed when this destructor gets called,
> that's bad news -- we're going to be invoking DECREF on freed memory.
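To spell out the reference-count reasoning in my comment on the DECREF line,
here is a toy C model. It is not Lucy's code: toy_obj, toy_incref, toy_decref
and toy_demo are names I made up, and it assumes both holders release their
counts through the normal refcounting path (which is exactly what may not hold
during global destruction).

```c
#include <assert.h>
#include <stdlib.h>

/* Toy refcounted object; NOT Lucy's real Obj layout (hypothetical). */
typedef struct toy_obj {
    int refcount;
    int *freed_flag;   /* set to 1 at the moment the object is released */
} toy_obj;

toy_obj *toy_new(int *freed_flag) {
    toy_obj *o = malloc(sizeof *o);
    if (!o) abort();
    o->refcount = 1;
    o->freed_flag = freed_flag;
    *freed_flag = 0;
    return o;
}

toy_obj *toy_incref(toy_obj *o) {
    o->refcount++;
    return o;
}

void toy_decref(toy_obj *o) {
    if (--o->refcount == 0) {
        *o->freed_flag = 1;
        free(o);
    }
}

/* Returns 1 if the reader outlived the Perl-side handle and was
 * released only when the searcher let go of it. */
int toy_demo(void) {
    int reader_freed;
    toy_obj *reader = toy_new(&reader_freed);    /* count held by the Perl side */
    toy_obj *searcher_ref = toy_incref(reader);  /* count held by the searcher */

    toy_decref(reader);                  /* Perl-level handle goes away first */
    int alive_in_between = !reader_freed;
    toy_decref(searcher_ref);            /* searcher's destructor does its DECREF */
    return alive_in_between && reader_freed;
}
```

If both counts are honoured, the order in which the two holders let go does not
matter. The trouble would only start if something frees the reader without
going through the count.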
>
> As far as I can tell, the only solution is to disconnect our DESTROY methods
> when Perl enters global destruction and leak everything. Here's sample XS
> code to get the point across:
>
> void
> DESTROY(self)
> lucy_IndexSearcher *self;
> PPCODE:
> if (PL_phase != PERL_PHASE_DESTRUCT) {
> lucy_IxSearcher_destroy(self);
> }
>
> Of course, this defeats the purpose of the change that was made in Perl 5.15.
> The rationale for the new behavior is to support situations where for example,
> you could guarantee that when a Perl interpreter in an embedded system shuts
> down, *everything* gets reclaimed. But I believe that architecture is only
> feasible when you control all memory allocation (as when the OS closes a
> process) and thus Perl's new global destruction model is flawed as it cannot
> encompass external resources.
Perl’s global destruction has always necessarily been flawed. It cannot but be
non-deterministic, due to the way circular references work. There is simply no
way to know which thing is the ‘outer’ object, and which is the ‘inner’, as
they are all just linked, rather than ‘inner’ or ‘outer’.
I can’t say I fully understand why destroying the Perl-level Reader before the
Searcher would be a problem. But you do seem to be implying that VTables need
to be present for anything to work. If that is the case, then Lucy was already
relying on an implementation detail, so why not continue to?
Let’s look at the relevant code from the perl source:
> void
> Perl_sv_clean_objs(pTHX)
> {
> dVAR;
> GV *olddef, *olderr;
> PL_in_clean_objs = TRUE;
This line goes through all scalars that are references to objects and calls
undef() on them:
> visit(do_clean_objs, SVf_ROK, SVf_ROK);
The next two function calls eliminate all blessed GV slots. I think the GV
slots are nulled and the SVs in them have their reference count lowered, but I
haven’t actually read the code.
> /* Some barnacles may yet remain, clinging to typeglobs.
> * Run the non-IO destructors first: they may want to output
> * error messages, close files etc */
> visit(do_clean_named_objs, SVt_PVGV|SVpgv_GP, SVTYPEMASK|SVp_POK|SVpgv_GP);
> visit(do_clean_named_io_objs, SVt_PVGV|SVpgv_GP,
> SVTYPEMASK|SVp_POK|SVpgv_GP);
This is the bit added in 5.15. It looks for any objects remaining. Since they
may be referenced by other objects (indirectly, through closures or array
elements), whose destructors have not fired yet, they are not actually freed,
but simply cursed; that is, they revert to non-object status (something you
cannot do from Perl or XS, even though the core has the facility to do it).
> /* And if there are some very tenacious barnacles clinging to arrays,
> closures, or what have you.... */
> visit(do_curse, SVs_OBJECT, SVs_OBJECT);
> olddef = PL_defoutgv;
> PL_defoutgv = NULL; /* disable skip of PL_defoutgv */
> if (olddef && isGV_with_GP(olddef))
> do_clean_named_io_objs(aTHX_ MUTABLE_SV(olddef));
> olderr = PL_stderrgv;
> PL_stderrgv = NULL; /* disable skip of PL_stderrgv */
> if (olderr && isGV_with_GP(olderr))
> do_clean_named_io_objs(aTHX_ MUTABLE_SV(olderr));
> SvREFCNT_dec(olddef);
> PL_in_clean_objs = FALSE;
> }
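As a rough picture of what I mean by "cursed", here is a toy C model of that
last pass. It is only a sketch of the behaviour described above, not perl's
implementation: toy_sv, toy_curse and toy_curse_all are invented names, and the
is_object field merely stands in for the SVs_OBJECT flag.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an SV (hypothetical, not perl's struct). */
typedef struct toy_sv {
    int is_object;      /* stands in for the SVs_OBJECT flag */
    int destroy_calls;  /* how many times the destructor has run */
    int freed;          /* never set by the curse pass */
} toy_sv;

/* Run the destructor once and strip the blessing; the memory is left alone. */
void toy_curse(toy_sv *sv) {
    if (!sv->is_object) return;
    sv->destroy_calls++;
    sv->is_object = 0;
}

/* Stand-in for visit(do_curse, SVs_OBJECT, SVs_OBJECT):
 * curse every slot that is still an object and report how many were hit. */
int toy_curse_all(toy_sv *arena, size_t n) {
    int visited = 0;
    for (size_t i = 0; i < n; i++) {
        if (arena[i].is_object) {
            toy_curse(&arena[i]);
            visited++;
        }
    }
    return visited;
}
```

A second sweep finds nothing left to do, because a cursed slot no longer counts
as an object, yet nothing has been freed.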
So based on that it looks as though you simply need to remove the destructor on
VTables, since they will be destroyed last. Or create a destructor that makes
sure all other Lucy objects have been purged.
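For the second option, something along these lines is what I have in mind. It
is a toy C sketch under my own assumptions: the live-object counter and the
names toy_obj_created, toy_obj_destroyed and toy_vtable_destroy are made up,
and Lucy may well track its objects differently.

```c
#include <assert.h>

/* Hypothetical global count of non-VTable objects still alive. */
static int toy_live_objects = 0;

void toy_obj_created(void)   { toy_live_objects++; }
void toy_obj_destroyed(void) { toy_live_objects--; }

/* Tear the vtable down only when nothing can still depend on it.
 * Returns 1 if it was torn down, 0 if it was deliberately leaked. */
int toy_vtable_destroy(int *vtable_alive) {
    if (toy_live_objects > 0) {
        return 0;           /* objects remain: leak rather than throw */
    }
    *vtable_alive = 0;
    return 1;
}
```

The first option is the degenerate case: a destructor that always takes the
leak branch.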
Now I hope I have you thoroughly confused. :-)