On 11/14/2014 08:21 PM, Simon Riggs wrote:
The requested information is already available, as discussed. Logical
decoding adds commit ordering for *exactly* the purpose of using it
for replication, available to all solutions. This often requested
feature has now been added and doesn't need to be added twice.

So what we are discussing is adding a completely superfluous piece of
information.

Not including the LSN info does nothing to trigger based replication,
which will no doubt live on happily for many years. But adding LSN
will slow down logical replication, for no purpose at all.


Simon,
The use cases I'm talking about aren't really replication related. Often I have come across systems that want to do something such as 'select * from orders where X > the_last_row_I_saw order by X' and then do further processing on the order.

This is kind of awkard to do today because you don't have a good candidate for 'X' to order on. Using either a sequence or insert-row timestamp doesn't work well because a transaction with a lower value for X might end up committing after the higher value in in a query result.

Yes you could setup a logical wal slot and listen on the stream of inserts into your order table but thats a lot of administration overhead compared to just issuing an SQL query for what really is a query type operation.

Using the commit timestamp for my X sounded very tempting but could allow duplicates.

One could argue that this patch is about replication features, and providing commit ordering for query purposes should be a separate patch to add that on top of this infrastructure. I see merit to smaller more focused patches but that requires leaving the door open to easily extending things later.

It could also be that I'm the only one who wants to order and filter queries in this manner (but that would surprise me). If the commit lsn has limited appeal and we decide we don't want it at all then we shouldn't add it. I've seen this type of requirement in a number of different systems at a number of different companies. I've generally seen it dealt with by either selecting rows behind the last now() timestamp seen and then filtering out already processed rows or by tracking the 'processed' state of each row individually (ie performing an update on each row once its been processed) which performs poorly.

Steve







--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to