On Thu, Dec 5, 2013 at 8:57 AM, Heikki Linnakangas
<hlinnakan...@vmware.com> wrote:
> On 12/05/2013 06:32 AM, Robert Haas wrote:
>> During development of the dynamic shared memory facility, Noah and I
>> spent a lot of time arguing about whether it was practical to ensure
>> that a dynamic shared memory segment got mapped at the same address in
>> every backend that used it.
>
> My vote goes for not trying to map at same address. I don't see how you
> could do that reliably, and I don't see much need for it anyway.
>
> That said, it naturally depends on what you're going to use the dynamic
> shared memory facility for. It's the same problem I have with reviewing the
> already-committed DSM patch and the message queue patch. The patches look
> fine as far as they go, but I have the nagging feeling that there are a
> bunch of big patches coming up later that use the facilities, and I can't
> tell if the facilities are over-engineered for what's actually needed, or
> not sufficient.

Sure, well, that's one of the challenges of doing any sort of
large software development project.  One rarely knows at the outset
all of the problems that one will encounter before finishing said
project.  For small projects, you can usually predict these things
pretty well, but as the scope of the project grows, it becomes more and
more difficult to know whether you've made the right initial steps.
That having been said, I'm pretty confident in the steps taken thus
far - but if you're imagining that I have all the answers completely
worked out and am choosing to reveal them only bit by bit, it's not
like that.

If you want to see the overall vision for this project, see
https://wiki.postgresql.org/wiki/Parallel_Sort .  Here, I expect to
use dynamic shared memory for three purposes.  First, I expect to use
a shared memory message queue to propagate any error or warning
messages generated in the workers back to the user backend.  That's
the point of introducing that infrastructure now, though I hope it
will eventually also be suitable for streaming tuples between
backends, so that you can run one part of the query tree in one
backend and stream the output to a different backend that picks it up
and processes it further.  We could surely contrive a simpler solution
just for error messages, but I viewed that as short-sighted.  Second,
I expect to store the SortTuple array, or some analogue of it, in
dynamic shared memory.  Third, I expect to store the tuples themselves
in dynamic shared memory.

Down the road, I imagine wanting to put hash tables in shared memory,
so we can parallelize things like hash joins and hash aggregates.
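
If a segment is not mapped at the same address in every backend, a
shared hash table cannot store raw pointers. One common workaround is
to store byte offsets from the segment base and convert them on
access. A minimal sketch in C, under stated assumptions: relptr,
relptr_store, relptr_access, entry, chain_lookup and relptr_demo are
hypothetical names made up for illustration, not existing PostgreSQL
APIs, and the two "mappings" are simulated with two malloc'd buffers.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical relative pointer: byte offset from the segment base.
 * Offset 0 is reserved to mean NULL, so nothing useful may live at
 * the very start of the segment. */
typedef struct { size_t off; } relptr;

static inline relptr relptr_store(const void *base, const void *p)
{
    relptr r;
    r.off = (p == NULL) ? 0 : (size_t) ((const char *) p - (const char *) base);
    return r;
}

static inline void *relptr_access(void *base, relptr r)
{
    return (r.off == 0) ? NULL : (void *) ((char *) base + r.off);
}

/* One entry in a hash bucket chain, linked by offset, not address. */
typedef struct
{
    uint32_t key;
    uint32_t value;
    relptr   next;
} entry;

/* Walk a chain using this backend's own base address. */
static int chain_lookup(void *base, relptr head, uint32_t key, uint32_t *value)
{
    for (entry *e = relptr_access(base, head); e != NULL;
         e = relptr_access(base, e->next))
    {
        if (e->key == key)
        {
            *value = e->value;
            return 1;
        }
    }
    return 0;
}

/* Build a two-entry chain in one buffer, copy the bytes to a second
 * buffer at a different address, and look up a key through the copy.
 * Returns the value found, or -1 on failure. */
int relptr_demo(void)
{
    enum { SEG_SIZE = 256 };
    char    *map_a = malloc(SEG_SIZE);
    char    *map_b = malloc(SEG_SIZE);
    uint32_t value = 0;
    int      found = 0;

    if (map_a != NULL && map_b != NULL)
    {
        memset(map_a, 0, SEG_SIZE);
        entry *e1 = (entry *) (map_a + 16);
        entry *e2 = (entry *) (map_a + 64);

        e1->key = 1; e1->value = 10; e1->next = relptr_store(map_a, e2);
        e2->key = 2; e2->value = 20; e2->next = relptr_store(map_a, NULL);
        relptr head = relptr_store(map_a, e1);

        memcpy(map_b, map_a, SEG_SIZE);     /* "second backend's" view */
        found = chain_lookup(map_b, head, 2, &value);
    }
    free(map_a);
    free(map_b);
    return found ? (int) value : -1;
}
```

The cost is an extra addition on every pointer dereference, and every
data structure placed in the segment has to be written with offsets
in mind.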

> As a side-note, I've been thinking that we don't really need same-address
> mapping for shared_buffers either. Getting rid of it wouldn't buy us
> anything right now, but if we wanted e.g to make shared_buffers changeable
> without a restart, that would be useful.

Very true.  One major obstacle to that is that changing the size of
shared_buffers also means resizing the LWLock array and the buffer
descriptor array.  If we got rid of the idea of having lwlocks in
their own data structure and moved the buffer lwlocks into the buffer
descriptors, that would get us down to two segments, but that still
feels like one too many.  There's also the problem of the buffer
mapping hash table, which would need to grow somehow as well.
Technically, the size of the fsync absorb queue also depends on
shared_buffers, but we could decide not to care about that, I think.
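
To make the "two segments" count concrete, here is a rough sketch in
C of what embedding the per-buffer locks in the descriptor might look
like. Everything in it is an assumption for illustration: SketchLWLock
is a stand-in rather than the real LWLock, the descriptor fields are
heavily simplified, and SKETCH_BLCKSZ just assumes the default 8kB
block size.

```c
#include <stddef.h>

#define SKETCH_BLCKSZ 8192          /* assumed default block size */

/* Stand-in for a lightweight lock; not the real LWLock layout. */
typedef struct
{
    int state;
} SketchLWLock;

/* Simplified descriptor with its two locks stored inline, instead of
 * holding indexes into a separately allocated lock array. */
typedef struct
{
    int          tag;               /* placeholder for the buffer tag */
    int          refcount;
    SketchLWLock io_in_progress_lock;
    SketchLWLock content_lock;
} SketchBufferDesc;

/* Segment 1: the descriptors, locks included. */
size_t descriptor_segment_size(int nbuffers)
{
    return (size_t) nbuffers * sizeof(SketchBufferDesc);
}

/* Segment 2: the buffer blocks themselves. */
size_t block_segment_size(int nbuffers)
{
    return (size_t) nbuffers * SKETCH_BLCKSZ;
}
```

Both sizes scale with the buffer count, so both segments would still
have to be resized together. The sketch does nothing about the buffer
mapping hash table.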

The other problem here is that once you do implement all this, a
reference to a buffer beyond what your backend has mapped will
necessitate an unmap and remap of the shared-buffers segment.  If the
remap fails, and you hold any buffer pins, you will have to PANIC.
There could be some performance overhead from inserting bounds checks
in all the right places too, although there might not be enough places
to matter, since a too-high buffer number can only come from a limited
number of places - either a lookup in the buffer mapping table, or a
buffer allocation event.
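
Roughly, the check in question might look like the following C
sketch. All names here (BackendMapState, AccessResult,
check_buffer_access, the remap callbacks) are hypothetical and exist
only for illustration. Reporting an ordinary error when the remap
fails with no pins held is also an assumption on my part; the only
firm case is PANIC when pins are held.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-backend view of the shared-buffers mapping. */
typedef struct
{
    int mapped_nbuffers;    /* buffers covered by the current mapping */
    int pins_held;          /* buffer pins this backend holds */
} BackendMapState;

typedef enum
{
    ACCESS_OK,              /* buffer already within the mapping */
    ACCESS_REMAPPED,        /* mapping had to be extended first */
    ACCESS_ERROR,           /* remap failed, no pins held (assumed) */
    ACCESS_PANIC            /* remap failed while holding pins */
} AccessResult;

/* Hypothetical remap hook: unmap and remap so that at least
 * needed_nbuffers are covered; returns true on success. */
typedef bool (*remap_fn) (BackendMapState *st, int needed_nbuffers);

AccessResult check_buffer_access(BackendMapState *st, int buf_id, remap_fn remap)
{
    assert(buf_id >= 0);

    /* Fast path: the bounds check that runs on every access. */
    if (buf_id < st->mapped_nbuffers)
        return ACCESS_OK;

    /* Slow path: the buffer lies beyond what this backend has mapped. */
    if (remap(st, buf_id + 1))
        return ACCESS_REMAPPED;

    return (st->pins_held > 0) ? ACCESS_PANIC : ACCESS_ERROR;
}

/* Stub remap hooks, for exercising the sketch only. */
bool remap_succeeds(BackendMapState *st, int needed_nbuffers)
{
    st->mapped_nbuffers = needed_nbuffers;
    return true;
}

bool remap_always_fails(BackendMapState *st, int needed_nbuffers)
{
    (void) st;
    (void) needed_nbuffers;
    return false;
}
```

The fast path is a single comparison, and it would only be needed at
the two places a new buffer number can come from: the buffer mapping
table lookup and buffer allocation.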

I don't mention any of these things to discourage you from working on
the problem, but rather because I've thought about it too - and the
aforementioned problems are the things that have stumped me so
far.  If you've got ideas about how to solve them, or even better yet
want to implement something, great!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

