On Wed, Feb 8, 2017 at 8:40 PM, Thomas Munro <thomas.mu...@enterprisedb.com> wrote:
> On Tue, Feb 7, 2017 at 5:43 PM, Peter Geoghegan <p...@bowt.ie> wrote:
>> Does anyone have any suggestions on how to tackle this?
>
> Hmm.  One approach might be like this:
>
> [hand-wavy stuff]

Thinking a bit harder about this, I suppose there could be a kind of object called a SharedBufFileManager (insert better name) which you can store in a DSM segment. The leader backend that initialises a DSM segment containing one of these would then call a constructor function that sets an internal refcount to 1 and registers an on_dsm_detach callback for its on-detach function. All worker backends that attach to the DSM segment would need to call an attach function for the SharedBufFileManager to increment a refcount and also register the on_dsm_detach callback, before any chance that an error might be thrown (is that difficult?); failure to do so could result in file leaks.

Then, when a BufFile is to be shared (AKA exported, made unifiable), a SharedBufFile object can be initialised somewhere in the same DSM segment and registered with the SharedBufFileManager. Internally all registered SharedBufFile objects would be linked together using offsets from the start of the DSM segment for link pointers. Now when SharedBufFileManager's on-detach function runs, it decrements the refcount in the SharedBufFileManager, and if that reaches zero then it runs a destructor that spins through the list of SharedBufFile objects deleting files that haven't already been deleted explicitly.

I retract the pin/unpin and per-file refcounting stuff I mentioned earlier. You could make the default that all files registered with a SharedBufFileManager survive until the containing DSM segment is detached everywhere using that single refcount in the SharedBufFileManager object, but also provide a 'no really delete this particular shared file now' operation for client code that knows it's safe to do that sooner (which would be the case for me, I think). I don't think per-file refcounts are needed.

There are a couple of problems with the above though.
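
To make the lifecycle above a bit more concrete, here is a rough standalone C sketch. It is not PostgreSQL code and uses no PostgreSQL APIs: the DSM segment is simulated with an ordinary struct (FakeSegment), every type field and function name is made up for illustration, the on_dsm_detach callback is modelled as a direct call to manager_detach(), "deleting a file" just sets a flag, and there is no locking, which real shared memory would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* All names below are hypothetical; this only simulates the idea. */
#define NO_OFFSET SIZE_MAX      /* marks the end of the offset-linked list */

typedef struct SharedBufFile
{
    size_t  next;               /* offset of next registered file from segment start */
    int     deleted;            /* nonzero once the backing file is gone */
    char    name[32];           /* stand-in for the temporary file's path */
} SharedBufFile;

typedef struct SharedBufFileManager
{
    int     refcount;           /* number of attached backends */
    size_t  head;               /* offset of first registered file, or NO_OFFSET */
    int     files_deleted;      /* counter kept only for demonstration */
} SharedBufFileManager;

/* Simulated segment: the manager plus a fixed number of file slots. */
typedef struct FakeSegment
{
    SharedBufFileManager mgr;
    SharedBufFile        files[4];
} FakeSegment;

/* Turn an offset from the segment start back into a pointer. */
static inline SharedBufFile *
file_at(char *base, size_t off)
{
    return (SharedBufFile *) (base + off);
}

/* Leader: constructor sets the refcount to 1 (and would register the detach hook). */
void
manager_init(SharedBufFileManager *mgr)
{
    mgr->refcount = 1;
    mgr->head = NO_OFFSET;
    mgr->files_deleted = 0;
}

/* Worker: must run before anything that can throw an error. */
void
manager_attach(SharedBufFileManager *mgr)
{
    mgr->refcount++;
}

/* Register a file by pushing it onto the list, linking by offset, not pointer. */
void
manager_register(char *base, SharedBufFileManager *mgr,
                 SharedBufFile *file, const char *name)
{
    snprintf(file->name, sizeof(file->name), "%s", name);
    file->deleted = 0;
    file->next = mgr->head;
    mgr->head = (size_t) ((char *) file - base);
}

/* The 'no really delete this particular shared file now' operation. */
void
file_delete_now(SharedBufFileManager *mgr, SharedBufFile *file)
{
    if (!file->deleted)
    {
        file->deleted = 1;      /* a real version would unlink the file here */
        mgr->files_deleted++;
    }
}

/* Stand-in for the on-detach callback: the last one out cleans up. */
void
manager_detach(char *base, SharedBufFileManager *mgr)
{
    assert(mgr->refcount > 0);
    if (--mgr->refcount > 0)
        return;

    for (size_t off = mgr->head; off != NO_OFFSET; off = file_at(base, off)->next)
        file_delete_now(mgr, file_at(base, off));
}
```

With one leader and one worker attached, two files registered and one of them deleted explicitly, the first manager_detach() leaves the remaining file alone and the second one deletes it. The sketch has the same problems as the design itself, of course.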

Firstly, doing reference counting in DSM segment on-detach hooks is really a way to figure out when the DSM segment is about to be destroyed, by keeping a separate refcount in sync with the DSM segment's own refcount, but it doesn't account for pinned DSM segments. It's not your use-case or mine currently, but someone might want a DSM segment to live even when it's not attached anywhere, to be reattached later. If we're trying to use DSM segment lifetime as a scope, we'd be ignoring this detail. Perhaps instead of adding our own refcount we need a new kind of hook, on_dsm_destroy.

Secondly, I might not want to be constrained by a fixed-size DSM segment to hold my SharedBufFile objects... there are cases where I need to share a number of batch files that is unknown at the start of execution, when the DSM segment is sized (I'll write about that shortly on the Parallel Shared Hash thread). Maybe I can find a way to get rid of that requirement. Or maybe it could support DSA memory too, but I don't think it's possible to use on_dsm_detach-based cleanup routines that refer to DSA memory, because by the time any given DSM segment's detach hook runs, there's no telling which other DSM segments have been detached already, so the DSA area may already have partially vanished; some other kind of hook that runs earlier would be needed... Hmm.

-- 
Thomas Munro
http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers