On Tue, Nov 23, 2010 at 3:43 PM, Eliot Gable <egable+pgsql-hack...@gmail.com> wrote:
<snip>
> Other than that, is there anything else I am missing? Wouldn't this type of
> setup be far simpler to implement and provide better scalability than trying
> to do multi-master replication using log shipping or binary object shipping
> or any other techniques? Wouldn't it also be far more efficient since you
> don't need to have a copy of your data on each master node and therefore also
> don't have to ship your data to each node and have each node process it?
>
> I am mostly asking for educational purposes, and I would appreciate
> technical (and hopefully specific) explanations as to what in Postgres would
> need to change to support this.
>

Now that I think about this more, it seems you would still need to ship the transactions to your other nodes and have some form of processing system on each that knows which node is supposed to execute each transaction and whether that node is currently online. It would also need designated backup nodes for each transaction. Otherwise, you could end up waiting forever for a transaction that was sent to one node right before that node lost power.

However, if a transaction manager on each node can work out the ordering of the transactions for itself based on some globally incrementing transaction ID, and can work out which node will execute each transaction and which node is the backup if the first one fails, then a backup that sees the primary for a transaction go offline could execute the transaction instead.

Then, I suppose you also need some system in Postgres that allows concurrent processing of transactions, such that a transaction does not process anything that depends on a transaction that has not yet been committed, but can process everything else. So, evaluation of deterministic functions could take place, but anything volatile could not until all previous transactions had finished.
I assume Postgres already has something like this in order to scale across multiple cores in a single box. This setup would basically make all the master nodes for the database look like just extra memory and CPU cores.
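
To make the assignment idea above a bit more concrete, here is a rough sketch in Python of how each node could independently work out who executes a given transaction. Everything in it is an assumption for illustration: the names `failover_order` and `executor`, the node labels, and the round-robin rule based on the transaction ID are all hypothetical and are not anything that exists in Postgres. It also assumes every node sees the same list of nodes and the same set of online nodes, which is itself a hard problem that this sketch does not address.

```python
from typing import List, Optional, Sequence, Set


def failover_order(txid: int, nodes: Sequence[str]) -> List[str]:
    """Return the nodes in the order they should try to run transaction txid.

    Hypothetical rule: the primary is picked round-robin from the globally
    incrementing transaction ID, and the backups follow in ring order.
    """
    start = txid % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(len(nodes))]


def executor(txid: int, nodes: Sequence[str], online: Set[str]) -> Optional[str]:
    """Pick the first online node in the failover order, or None if all are down."""
    for node in failover_order(txid, nodes):
        if node in online:
            return node
    return None


if __name__ == "__main__":
    nodes = ["node_a", "node_b", "node_c"]
    # All nodes up: the primary for txid 7 runs it.
    print(executor(7, nodes, {"node_a", "node_b", "node_c"}))
    # Primary for txid 7 is down: the next node in the ring takes over.
    print(executor(7, nodes, {"node_a", "node_c"}))
```

Because the rule depends only on the transaction ID and the shared node list, every node reaches the same answer without asking the others, which is the property the backup needs in order to take over when it sees the primary go offline.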