On Tue, May 6, 2014 at 10:40 AM, Simon Riggs <si...@2ndquadrant.com> wrote:
>> 1. Segments are relocatable, so you can't actually use absolute
>> pointers.  Maybe someday we'll have a facility for dynamic shared
>> memory segments that are mapped at the same address in every process,
>> or maybe not, but right now we sure don't.
>
> Sounds like a problem for static allocations, not dynamic ones.

Maybe I didn't say that well: the dynamic shared memory segment isn't
guaranteed to be mapped at the same address in every backend.  So if
you use absolute pointers in your data structures, they might be
incorrect from the point of view of some other backend accessing the
same shared memory segment.  This is true regardless of how the
allocation is done; it's an artifact of the decision not to try to map
dynamic shared memory segments at a constant address in every process
using them.
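
To make that concrete, here is a minimal sketch of the usual workaround: store offsets from the start of the segment instead of addresses, and turn them back into addresses against whatever base the local process got. All names here (relptr, relptr_access, relptr_store, relptr_demo) are hypothetical and not an existing PostgreSQL API, and two heap buffers stand in for the same segment mapped at two different addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: byte offset from the segment start; 0 means NULL. */
typedef size_t relptr;

typedef struct Node
{
    int     value;
    relptr  next;       /* offset of the next Node, not an address */
} Node;

/* Turn an offset into an address valid in this particular mapping. */
static void *
relptr_access(void *base, relptr off)
{
    return off == 0 ? NULL : (char *) base + off;
}

/* Turn a local address back into an offset that any mapping can use. */
static relptr
relptr_store(void *base, void *ptr)
{
    if (ptr == NULL)
        return 0;
    assert((char *) ptr > (char *) base);
    return (relptr) ((char *) ptr - (char *) base);
}

/* Walk a list whose head sits at offset "head" in the mapping at "base". */
static int
sum_list(void *base, relptr head)
{
    int     sum = 0;
    Node   *n;

    for (n = relptr_access(base, head); n != NULL;
         n = relptr_access(base, n->next))
        sum += n->value;
    return sum;
}

/*
 * Assumption for the demo: two heap buffers play the role of one segment
 * seen at two different addresses. Returns the list sum as seen through
 * the second "mapping", or -1 on allocation failure or disagreement.
 */
static int
relptr_demo(void)
{
    enum { SEG_SIZE = 256 };
    char   *seg_a = calloc(1, SEG_SIZE);
    char   *seg_b = malloc(SEG_SIZE);
    int     result = -1;

    if (seg_a != NULL && seg_b != NULL)
    {
        Node   *n1 = (Node *) (seg_a + 16);
        Node   *n2 = (Node *) (seg_a + 64);

        n1->value = 3;
        n1->next = relptr_store(seg_a, n2);
        n2->value = 4;
        n2->next = 0;

        /* Same bytes, different address: the offsets still resolve. */
        memcpy(seg_b, seg_a, SEG_SIZE);
        if (sum_list(seg_a, 16) == sum_list(seg_b, 16))
            result = sum_list(seg_b, 16);
    }
    free(seg_a);
    free(seg_b);
    return result;
}
```

Had Node.next held an absolute pointer, the copy in seg_b would still point back into seg_a, which is exactly the failure mode described above.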

> It makes a lot of sense to use dynamic shared memory for sorts
> especially, since you can just share the base pointer and other info
> and a "blind worker" can then do the sort for you without needing
> transactions, snapshots etc..

That would be nice, but I think we're actually going to need a lot of
that stuff to make parallel sort work, because the workers need to
have enough of an environment set up to run the comparison functions, and
those may try to de-TOAST.

> I'd also like to consider putting common reference tables as hash
> tables into shmem.

I'm not sure exactly what you have in mind here, but some kind of hash
table facility for dynamic shared memory is on my bucket list.

>> 2. You've got to decide up-front how much memory to set aside for
>> dynamic allocation, and you can't easily change your mind later.  Some
>> of the DSM implementations support growing the segment, but you've got
>> to somehow get everyone who is using it to remap it, possibly at a
>> different address, so it's a long way from being transparent.
>
> Again, depends on the algorithm. If we program the sort to work in
> fixed size chunks, we can then use a merge sort at end to link the
> chunks together. So we just use an array of fixed size chunks. We
> might need to dynamically add more chunks, but nobody needs to remap.

This is a possible approach, but since each segment might end up in a
different spot in each process that maps it, a pointer now needs to
consist of a segment identifier and an offset within that segment.
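
As a rough sketch of what such a two-part pointer might look like: each process keeps its own table of where every segment landed locally, and a shared "pointer" is just a (segment, offset) pair resolved through that table. SegPtr, SegMap, segptr_resolve and segptr_demo are made-up names for illustration, and heap buffers stand in for the mapped chunks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_SEGMENTS 8

/* A "pointer" that means the same thing in every process. */
typedef struct SegPtr
{
    uint32_t    segment;    /* which segment (chunk) */
    uint32_t    offset;     /* byte offset within that segment */
} SegPtr;

/* Per-process state: where each segment happens to be mapped locally. */
typedef struct SegMap
{
    char       *base[MAX_SEGMENTS];
    size_t      size[MAX_SEGMENTS];
} SegMap;

/* Resolve a SegPtr to an address valid in the calling process only. */
static void *
segptr_resolve(const SegMap *map, SegPtr p)
{
    assert(p.segment < MAX_SEGMENTS);
    assert(map->base[p.segment] != NULL);
    assert(p.offset < map->size[p.segment]);
    return map->base[p.segment] + p.offset;
}

/*
 * Assumption for the demo: malloc'd buffers play the role of fixed-size
 * chunks. Returns the value read back through the SegPtr after another
 * chunk has been added, or -1 on allocation failure.
 */
static int
segptr_demo(void)
{
    enum { CHUNK = 64 };
    SegMap      map = {{0}, {0}};
    SegPtr      p = {1, 8};
    int         result = -1;
    int         i;

    for (i = 0; i < 2; i++)
    {
        map.base[i] = malloc(CHUNK);
        map.size[i] = CHUNK;
    }
    if (map.base[0] != NULL && map.base[1] != NULL)
    {
        *(int *) segptr_resolve(&map, p) = 42;

        /* Add a chunk later: existing SegPtr values stay valid as-is. */
        map.base[2] = malloc(CHUNK);
        map.size[2] = CHUNK;
        if (map.base[2] != NULL)
            result = *(int *) segptr_resolve(&map, p);
    }
    for (i = 0; i < 3; i++)
        free(map.base[i]);
    return result;
}
```

Nobody has to remap when a chunk is added, but every dereference now pays for a table lookup, and each process must learn about a new chunk before it can resolve pointers into it.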

> Doing it that way means we do *not* need to change situation if it
> becomes an external sort. We just mix shmem and external files, all
> merged together at the end.

Huh.  Interesting idea.  I haven't researched external sort much yet.

> We need to take account of the amount of memory locally available per
> CPU, so there is a maximum size for these things. Not sure what tho'

I agree that NUMA effects could be significant, but so far I don't
know enough to have any idea what, if anything, makes sense to do in
that area.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
