Hi Alexey,

thanks for your feedback; these are interesting points.

Alexey Klyukin wrote:
> In Replicator we avoided the need for the postmaster to read or write
> a backend's shmem data by using it as a signal forwarder. When a
> backend wants to inform a special process (i.e. the queue monitor)
> about a replication-related event (such as a commit), it sends
> SIGUSR1 to the postmaster with a related "reason" flag, and the
> postmaster, upon receiving this signal, forwards it to the
> destination process. Termination of backends and special processes
> is handled by the postmaster itself.
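
(If I understand that correctly, the scheme looks roughly like the
following sketch. All names are invented for illustration; this is not
the actual Replicator source:)

    #include <signal.h>
    #include <sys/types.h>

    /* Sketch of the forwarding idea, invented names throughout. */
    typedef enum { REASON_COMMIT, REASON_ABORT, NUM_REASONS } NotifyReason;

    /* lives in shared memory, one flag per reason */
    static volatile sig_atomic_t *notify_flags;

    static pid_t postmaster_pid;     /* known to every backend */
    static pid_t queue_monitor_pid;  /* known to the postmaster */

    /* backend side: set the reason flag, then poke the postmaster */
    static void
    notify_special_process(NotifyReason reason)
    {
        notify_flags[reason] = 1;
        kill(postmaster_pid, SIGUSR1);
    }

    /* postmaster side, in its SIGUSR1 handler: forward the signal
     * to the destination process for each flagged reason */
    static void
    sigusr1_handler(int signo)
    {
        (void) signo;
        if (notify_flags[REASON_COMMIT])
        {
            notify_flags[REASON_COMMIT] = 0;
            kill(queue_monitor_pid, SIGUSR1);
        }
    }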

Hm.. how about larger data chunks, like change sets? In Postgres-R, those need to travel between the backends and the replication manager, which then sends them on to the GCS.

> Hm... what would happen to new data under heavy load, when the queue
> eventually fills up with messages? Would the relevant transactions be
> aborted, or would they wait for the manager to release the queue
> space occupied by already processed messages? ISTM that having a
> fixed-size buffer limits the maximum transaction rate.

That's why the replication manager is a very simple forwarder: it does not block on messages, but consumes them from shared memory immediately. It already features a message cache, which holds messages it cannot currently forward to a backend because all backends are busy.
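
(Conceptually, the manager's handling of each incoming message is just
the following; a simplified sketch with invented names, not the actual
Postgres-R code:)

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdlib.h>
    #include <string.h>

    /* Simplified sketch, invented names. */
    typedef struct Message
    {
        size_t len;
        char   data[1];          /* variable-length payload */
    } Message;

    typedef struct CacheEntry
    {
        struct CacheEntry *next;
        Message           *msg;
    } CacheEntry;

    static CacheEntry *cache_head, *cache_tail;

    /* assumed to exist elsewhere in this sketch */
    extern bool idle_helper_available(void);
    extern void forward_to_idle_helper(Message *msg);
    extern void release_shmem_slot(Message *msg);

    /* Consume a message from shared memory immediately: forward it
     * to an idle helper backend if possible, otherwise copy it into
     * the manager's local cache.  Either way, the shmem slot is
     * released right away, so senders are never blocked for long. */
    static void
    consume_message(Message *shmem_msg)
    {
        if (idle_helper_available())
            forward_to_idle_helper(shmem_msg);
        else
        {
            size_t      sz = offsetof(Message, data) + shmem_msg->len;
            Message    *copy = malloc(sz);
            CacheEntry *e = malloc(sizeof(CacheEntry));

            memcpy(copy, shmem_msg, sz);
            e->next = NULL;
            e->msg = copy;
            if (cache_tail)
                cache_tail->next = e;
            else
                cache_head = e;
            cache_tail = e;
        }
        release_shmem_slot(shmem_msg);
    }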

And it takes care to send change sets only to helper backends which are not busy and can process the remote transaction immediately. That way, I don't think the limit on shared memory is the bottleneck. However, I haven't measured it.

WRT waiting vs. aborting: I don't think I handle this situation gracefully at the moment; I've never encountered it. ;-) But the simpler option is letting the sender wait until there is enough room in the queue for its message. To avoid deadlocks, each process should consume its own pending messages before trying to send one. (Which is currently done correctly only for the replication manager, not for the backends, IIRC.)
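
(i.e. something along these lines; again just a sketch with invented
names:)

    #include <stdbool.h>

    typedef struct Queue Queue;
    typedef struct Message Message;

    /* assumed to exist elsewhere in this sketch */
    extern void consume_pending_messages(void);
    extern bool try_enqueue(Queue *q, Message *msg);  /* false if full */
    extern void wait_for_queue_space(Queue *q);       /* sleeps */

    /* The key point: drain our own inbox before blocking on a send,
     * so that two processes which send to each other cannot deadlock
     * on full queues. */
    static void
    send_message_blocking(Queue *out, Message *msg)
    {
        for (;;)
        {
            consume_pending_messages();  /* never block with a full inbox */
            if (try_enqueue(out, msg))
                return;
            wait_for_queue_space(out);
        }
    }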

> What about keeping the per-process message queue in the local memory
> of the process, and exporting only the queue head to the shmem, thus
> having only one message per process there?

The replication manager already does just that with its cache. No other process needs to send messages large enough that they cannot be consumed immediately, so such a local cache would not make much sense for any other process.

Even for the replication manager, I find it dubious to require such a cache, because it introduces unnecessary copying of data within memory.

> When the queue manager gets a message from the process, it may signal
> that process to copy the next message from its local memory into the
> shmem. To keep a correct ordering of queue messages, an additional
> shared memory queue of pid_t can be maintained, containing one pid
> per message.
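
(If I read the proposal correctly, it would look roughly like this;
all names invented for illustration:)

    #include <signal.h>
    #include <sys/types.h>

    /* Sketch of the proposed scheme: one exported message slot per
     * process, plus a shared ring of pids that records the global
     * ordering of pending messages. */
    #define MAX_PROCS   64
    #define RING_DEPTH 256

    typedef struct Message Message;

    typedef struct
    {
        Message *head[MAX_PROCS];    /* one exported message per process */
        pid_t    order[RING_DEPTH];  /* sender pids, in arrival order */
        int      read_pos;
        int      write_pos;
    } SharedQueue;

    /* assumed to exist elsewhere: maps a pid to its head[] index */
    extern int slot_for_pid(pid_t pid);

    /* queue manager side: take the globally oldest message, then
     * signal its sender to export its next message into shmem */
    static Message *
    dequeue_next(SharedQueue *q)
    {
        pid_t    sender = q->order[q->read_pos++ % RING_DEPTH];
        int      slot = slot_for_pid(sender);
        Message *msg = q->head[slot];

        q->head[slot] = NULL;
        kill(sender, SIGUSR1);  /* "please copy your next message in" */
        return msg;
    }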

The replication manager takes care of the ordering for cached messages.

Regards

Markus Wanner

