On Thu, Dec 5, 2013 at 8:57 AM, Heikki Linnakangas <hlinnakan...@vmware.com> wrote:
> On 12/05/2013 06:32 AM, Robert Haas wrote:
>> During development of the dynamic shared memory facility, Noah and I
>> spent a lot of time arguing about whether it was practical to ensure
>> that a dynamic shared memory segment got mapped at the same address in
>> every backend that used it.
>
> My vote goes for not trying to map at same address. I don't see how you
> could do that reliably, and I don't see much need for it anyway.
>
> That said, it naturally depends on what you're going to use the dynamic
> shared memory facility for. It's the same problem I have with reviewing the
> already-committed DSM patch and the message queue patch. The patches look
> fine as far as they go, but I have the nagging feeling that there are a
> bunch of big patches coming up later that use the facilities, and I can't
> tell if the facilities are over-engineered for what's actually needed, or
> not sufficient.

Sure, well, that's one of the challenges of doing any sort of large software development project. One rarely knows at the outset all of the problems that one will encounter before finishing said project. For small projects, you can usually predict these things pretty well, but as the scope of the project grows, it becomes more and more difficult to know whether you've made the right initial steps. That having been said, I'm pretty confident in the steps taken thus far - but if you're imagining that I have all the answers completely worked out and am choosing to reveal them only bit by bit, it's not like that.

If you want to see the overall vision for this project, see https://wiki.postgresql.org/wiki/Parallel_Sort . Here, I expect to use dynamic shared memory for three purposes. First, I expect to use a shared memory message queue to propagate any error or warning messages generated in the workers back to the user backend. That's the point of introducing that infrastructure now, though I hope it will eventually also be suitable for streaming tuples between backends, so that you can run one part of the query tree in one backend and stream the output to a different backend that picks it up and processes it further. We could surely contrive a simpler solution just for error messages, but I viewed that as short-sighted. Second, I expect to store the SortTuple array, or some analogue of it, in dynamic shared memory. Third, I expect to store the tuples themselves in dynamic shared memory.

Down the road, I imagine wanting to put hash tables in shared memory, so we can parallelize things like hash joins and hash aggregates.

> As a side-note, I've been thinking that we don't really need same-address
> mapping for shared_buffers either. Getting rid of it wouldn't buy us
> anything right now, but if we wanted e.g to make shared_buffers changeable
> without a restart, that would be useful.

Very true.
One major obstacle to that is that changing the size of shared_buffers also means resizing the LWLock array and the buffer descriptor array. If we got rid of the idea of having lwlocks in their own data structure and moved the buffer lwlocks into the buffer descriptors, that would get us down to two segments, but that still feels like one too many. There's also the problem of the buffer mapping hash table, which would need to grow somehow as well. Technically, the size of the fsync absorb queue also depends on shared_buffers, but we could decide not to care about that, I think.

The other problem here is that once you do implement all this, a reference to a buffer beyond what your backend has mapped will necessitate an unmap and remap of the shared-buffers segment. If the remap fails, and you hold any buffer pins, you will have to PANIC. There could be some performance overhead from inserting bounds checks in all the right places too, although there might not be enough places to matter, since a too-high buffer number can only come from a limited number of places - either a lookup in the buffer mapping table, or a buffer allocation event.

I don't mention any of these things to discourage you from working on the problem, but rather because I've thought about it too - and the aforementioned problems are the things that have stumped me so far. If you've got ideas about how to solve them, or even better yet want to implement something, great!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
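
To make the bounds-check-and-remap problem described above a bit more concrete, here is a minimal sketch in C. It is not PostgreSQL code: BackendBufferMap, MapResult, ensure_buffer_mapped, and the two remap callbacks are all hypothetical names invented for illustration, and the unmap/remap step is simulated by a callback rather than by any real mapping call. It only shows where the check would sit and why a failed remap is fatal when pins are held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-backend view of a resizable shared-buffers segment. */
typedef struct BackendBufferMap
{
    int mapped_nbuffers;    /* buffers covered by this backend's mapping */
    int pins_held;          /* buffer pins this backend currently holds */
} BackendBufferMap;

typedef enum
{
    MAP_OK,         /* buffer was already inside the current mapping */
    MAP_REMAPPED,   /* mapping had to be extended, and that worked */
    MAP_ERROR,      /* bad buffer number, or remap failed with no pins held */
    MAP_PANIC       /* remap failed while pins were held */
} MapResult;

/* Stand-in for "unmap and remap the segment at its new size". */
typedef bool (*remap_fn) (BackendBufferMap *map, int new_nbuffers);

/* Simulated remap that succeeds: the backend now sees every buffer. */
bool
remap_ok(BackendBufferMap *map, int new_nbuffers)
{
    map->mapped_nbuffers = new_nbuffers;
    return true;
}

/* Simulated remap that fails after the old mapping is already gone. */
bool
remap_fail(BackendBufferMap *map, int new_nbuffers)
{
    (void) new_nbuffers;
    map->mapped_nbuffers = 0;
    return false;
}

/*
 * Bounds check for a buffer number that came from a buffer mapping table
 * lookup or a buffer allocation. shared_nbuffers is the current size of
 * the segment as seen by the whole system.
 */
MapResult
ensure_buffer_mapped(BackendBufferMap *map, int buf_id,
                     int shared_nbuffers, remap_fn remap)
{
    assert(map != NULL && remap != NULL);

    if (buf_id < 0 || buf_id >= shared_nbuffers)
        return MAP_ERROR;       /* not a valid buffer at any size */

    if (buf_id < map->mapped_nbuffers)
        return MAP_OK;          /* fast path: one comparison */

    if (remap(map, shared_nbuffers))
        return MAP_REMAPPED;

    /* Pinned buffers are no longer addressable, so there is no way back. */
    return (map->pins_held > 0) ? MAP_PANIC : MAP_ERROR;
}
```

In this sketch the common case costs a single comparison, which is in line with the guess above that the overhead might not matter much. The ugly part is the last line: once the old mapping is gone and the new one cannot be established, a backend holding pins has nothing sensible left to do.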