Re: Finalizers: conclusion?

Antony Courtney Thu, 23 Jan 2003 08:30:42 -0800

Hi Manuel,

Sorry for the delay in replying. I'll quote a little more of the previous messages than usual to help refresh context.

Manuel M T Chakravarty wrote:

Antony Courtney <[EMAIL PROTECTED]> wrote,
You indicated that you were somewhat unclear why we need liveness dependencies. I'll attempt to clarify by sketching some of the details of the particular C library for which I am writing FFI wrappers.

I have a C library for 2D vector graphics. Two of the abstract types provided by this C library are:
Pixmap -- A handle to an actual buffer of raster data
RenderContext -- A handle that encapsulates all state associated with rendering, such as the current color, current font, target pixmap, etc.

Note that it is possible to create many RenderContexts that all render onto the same underlying Pixmap.

To see why we need liveness dependencies, consider the following typical usage scenario in Haskell:
  do pm <- createPixmap            -- 1
     rc <- createRenderContext pm  -- 2
     drawBox rc                    -- 3
     ...

Note that, in the above, it's possible that the call to createRenderContext in line 2 could be the last Haskell reference to pm, making it a candidate for collection. But we don't actually want the Pixmap to be collected (and its finalizer invoked) until both the Pixmap *and* all associated rendering contexts which refer to the Pixmap become unreachable.

We need liveness dependencies because, internally, the RenderContext maintains a pointer to the target Pixmap. Because this pointer exists only in the C heap, we need some way to inform Haskell's garbage collector that whenever a particular RenderContext is reachable, its target Pixmap is also reachable.
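To make the problem concrete, here is a minimal sketch that approximates the dependency on the Haskell side. It is not the liveness-dependency primitive under discussion: it simply keeps a Haskell reference to the Pixmap inside the RenderContext wrapper and calls touchForeignPtr after each use. The buffer layout, pixmapSize, readPixel and demoForeign are hypothetical stand-ins for the real C library, which is simulated here with memory from mallocForeignPtrBytes.

```haskell
import Data.Word (Word8)
import Foreign.ForeignPtr
  (ForeignPtr, mallocForeignPtr, mallocForeignPtrBytes, touchForeignPtr, withForeignPtr)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peek, peekByteOff, poke, pokeByteOff)

-- Hypothetical stand-in for the C library's Pixmap handle.
newtype Pixmap = Pixmap (ForeignPtr Word8)

data RenderContext = RenderContext
  { rcState  :: ForeignPtr (Ptr Word8)  -- simulated C-side state: a raw pointer to the buffer
  , rcTarget :: Pixmap                  -- Haskell-side reference that keeps the target reachable
  }

-- Assumed buffer size, for illustration only.
pixmapSize :: Int
pixmapSize = 16

createPixmap :: IO Pixmap
createPixmap = do
  fp <- mallocForeignPtrBytes pixmapSize
  withForeignPtr fp $ \p ->
    mapM_ (\i -> pokeByteOff p i (0 :: Word8)) [0 .. pixmapSize - 1]
  return (Pixmap fp)

-- The raw pointer is stored where the collector cannot see it, mimicking
-- a pointer that exists only in the C heap.
createRenderContext :: Pixmap -> IO RenderContext
createRenderContext pm@(Pixmap fp) = do
  st <- mallocForeignPtr
  withForeignPtr fp $ \raw -> withForeignPtr st $ \p -> poke p raw
  return (RenderContext st pm)

-- Draws through the hidden raw pointer, then touches the Pixmap so that it
-- stays alive at least until the drawing has finished.
drawBox :: RenderContext -> IO ()
drawBox (RenderContext st (Pixmap fp)) = do
  withForeignPtr st $ \p -> do
    raw <- peek p
    mapM_ (\i -> pokeByteOff raw i (255 :: Word8)) [0 .. 3]
  touchForeignPtr fp

readPixel :: Pixmap -> Int -> IO Word8
readPixel (Pixmap fp) i = withForeignPtr fp $ \p -> peekByteOff p i

demoForeign :: IO Word8
demoForeign = do
  pm <- createPixmap
  rc <- createRenderContext pm
  drawBox rc
  readPixel (rcTarget rc) 0
```

The cost of this workaround is that every operation on a RenderContext has to remember to touch the Pixmap, which is the bookkeeping a declared dependency between the two foreign pointers would make unnecessary.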

IMHO you are trying to compensate for a flaw in the whole
setup:

* Line 1: You get a pointer to a C object assuming it is the
last reference to that C object.

* Line 2: You pass this pointer back to C without copying
it; ie, the only reference to the C object is in C land.

At this moment, the pointer obtained on Line 1 is no longer
the business of the Haskell system. It is a pointer in C
land to a C object; so, memory management of that structure
should be left to the C library.

I'm sorry, but I simply don't agree with your rationale here (nor do I see a "flaw in the whole setup").

Yes, your observations about when references are live in C and when references are live in Haskell in the above code fragment are correct. However, in my opinion, this is an implementation detail. The user of my Haskell library should not know or care whether the library is implemented in Haskell, in C, or in some combination of the two.

In this case, Pixmap and RenderContext could very easily be implemented entirely in Haskell (i.e. just make Pixmap a byte array, and RenderContext a record type that maintains a Pixmap in one of its fields). If it were implemented this way, then of course any live reference to a RenderContext will ensure that the Pixmap it refers to will not be GC'ed, since the field of the RenderContext record would contain a reference to the Pixmap. I see liveness dependencies as a way for me (as a library implementor) to use an external (C language) representation for a Haskell data structure, whilst retaining one of the most important benefits of programming in Haskell (garbage collection).
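As a rough illustration of the pure-Haskell alternative described above, the following sketch stores the pixel data in an IORef holding a list of bytes (a simplification of a real byte array) and makes RenderContext a record with a field for its target Pixmap. The field names, the size argument, the pixel-count argument to drawBox, and demoPure are assumptions made for the example.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import Data.Word (Word8)

-- Simplified stand-in for a byte array of raster data.
newtype Pixmap = Pixmap (IORef [Word8])

data RenderContext = RenderContext
  { rcColor  :: Word8   -- current color
  , rcTarget :: Pixmap  -- ordinary Haskell reference to the target Pixmap
  }

createPixmap :: Int -> IO Pixmap
createPixmap n = Pixmap <$> newIORef (replicate n 0)

createRenderContext :: Pixmap -> IO RenderContext
createRenderContext pm = return RenderContext { rcColor = 255, rcTarget = pm }

-- "Draws" by painting the first n pixels with the current color.
drawBox :: RenderContext -> Int -> IO ()
drawBox rc n = case rcTarget rc of
  Pixmap ref -> modifyIORef' ref $ \px ->
    let k = min n (length px)
    in replicate k (rcColor rc) ++ drop k px

demoPure :: IO [Word8]
demoPure = do
  pm <- createPixmap 4
  rc <- createRenderContext pm  -- pm may never be mentioned again after this line
  drawBox rc 2
  let Pixmap ref = rcTarget rc
  readIORef ref
```

Here the collector sees the rcTarget field directly, so the Pixmap remains reachable for as long as any RenderContext that refers to it does, with no extra effort from the library author or the user.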

> Assume the following C function

  RenderContext *createPixmapWithContext ()
  {
    Pixmap *pm = createPixmap ();
    return createRenderContext (pm);
  }

in conjunction with

  do
    rc <- createPixmapWithContext
    drawBox rc

How is this different from your Haskell code in a way that
requires a foreign pointer dependency in one case, but not
in the other?

For starters, I would never, ever write the createPixmapWithContext() function in C because it is an obvious memory leak. It allocates two objects (via createPixmap and createRenderContext), but returns a pointer to only one of them. You could potentially get away with this if you happen to use some reference counting scheme in C, but I never suggested I was doing any such thing (more on this below).

To be honest, I don't really see your point here. I am implementing a Haskell library that happens to use some external (C language) representations for some data structures, and I would like to use Haskell's garbage collector to ensure that this Haskell library works as a Haskell programmer would expect. What you have presented above is an arbitrary C function that does some heap allocation that is never visible to the Haskell runtime. I would never expect liveness dependencies (or anything else) to enable the Haskell runtime to track heap allocation in arbitrary C code.

[...]

> As `createPixmapWithContext()' demonstrates, C land
> must free `pm' when the last render context referring to
> `pm' dies.

Not necessarily true! What you are suggesting (reference counting) is one possible memory management strategy in C, but is by no means the only option.

Another possibility (the one I actually use) is simply for RenderContexts to do absolutely no memory management of the underlying Pixmaps whatsoever. Then it is up to whoever created the Pixmap to free the Pixmap, and up to whoever created a RenderContext to ensure that the RenderContext will not be used after its underlying Pixmap has been freed. This is relatively easy to document in prose in a library manual page. I see liveness dependencies as a way of exporting exactly such informal requirements to a high-level language's garbage collector.
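Viewed from Haskell, the explicit scheme might look like the sketch below, in which the creator of the buffer is also the one who frees it. The helper withPixmap, the Ptr-based RenderContext, and demoOwned are hypothetical, and mallocBytes/free stand in for the library's own allocation calls.

```haskell
import Control.Exception (bracket)
import Data.Word (Word8)
import Foreign.Marshal.Alloc (free, mallocBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekByteOff, pokeByteOff)

-- Hypothetical context that only borrows the pixel buffer and never frees it.
newtype RenderContext = RenderContext (Ptr Word8)

-- Whoever creates the Pixmap is responsible for freeing it; bracket ties the
-- buffer's lifetime to the scope of the callback, even if an exception occurs.
withPixmap :: Int -> (Ptr Word8 -> IO a) -> IO a
withPixmap n = bracket (mallocBytes n) free

-- Paints the first four bytes of the borrowed buffer.
drawBox :: RenderContext -> IO ()
drawBox (RenderContext buf) =
  mapM_ (\i -> pokeByteOff buf i (255 :: Word8)) [0 .. 3]

demoOwned :: IO Word8
demoOwned = withPixmap 16 $ \buf -> do
  let rc = RenderContext buf  -- must not be used after withPixmap returns
  drawBox rc
  peekByteOff buf 0
```

Nothing in the types stops a RenderContext from escaping the callback and being used after the buffer is freed. That rule lives only in the documentation, which is the kind of informal requirement a liveness dependency would hand over to the collector.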

For C libraries I prefer this kind of explicit memory management scheme over reference counting because:
(a) it is simpler to implement,
(b) it provides a foundation for implementing higher level memory management schemes in C (it is easy to wrap these primitive objects in higher level constructs that provide reference counting or arena-based allocation if you want that),
and
(c) the library can be exported to a garbage collected language, without any "impedance mismatch" between reference counting on the C side (with its known flaws collecting cyclic structures) and the calling language's GC scheme.

IMO the only clean way to approach this problem is to add a
reference counting scheme to `pm' in C land.

Obviously I disagree. I hope I've clearly articulated why, and that there is a reasonable, simple alternative.

BTW,
this is exactly how this problem is solved in the GTK+ GUI
toolkit.

Reference counting is a decent crutch (with some known flaws) for those who are stuck programming in C. But for those of us working with a true high level language, I think using foreign pointers, finalizers and liveness dependencies to enable use of the high level language's collector when programming in the high level language is a far better alternative.

-antony

--
Antony Courtney
Grad. Student, Dept. of Computer Science, Yale University
[EMAIL PROTECTED] http://www.apocalypse.org/pub/u/antony

_______________________________________________
FFI mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/ffi
