Re: [GENERAL] Comparing txid and xmin (under BDR)

2015-05-12 Thread Peter Mogensen



On 2015-05-12 06:06, Craig Ringer wrote:

On 11 May 2015 at 21:10, Peter Mogensen a...@one.com wrote:



So ... I can easily get the current txid of the SELECT transaction by
calling txid_current().


Note that by doing so, you force txid allocation for a read-only query that
might otherwise not need one, which increases your txid burn rate and
decreases time until you need to do wraparound-protection vacuuming.




The same is not true for txid_snapshot_xmin() is it?

I mean ... I really don't need the actual txid of a SELECT statement.
Only to ensure that it's never than any invalidation event.
So it's enough to just use txid_snapshot_xmin() ... at the cost of 
possibly not caching new values in a small window after invalidation.


/Peter



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Comparing txid and xmin (under BDR)

2015-05-11 Thread Peter Mogensen

Hi,

I would really like to be able to externally to Postgres at some point 
in time later, be able to compare the txid of 2 queries.


Namely:
The INSERT transaction for a certain row in a table,  and
The SELECT transaction reading some other data.

With the one caveat that this has to work with BDR.

So ... I can easily get the current txid of the SELECT transaction by 
calling txid_current().


However, - I can't in general compare it to the xmin of the table row.
BDR does ensure (it seems) that xmin is meaningfull within the local 
node xid context.


But after a xid wrap-around all txids will be larger than any xmin. And 
I can't get the epoch of the xmin value.


BUT ... will this work: ?

Comparing
txid_current() of the SELECT transaction,
to
txid-current()-age(xmin) of the table row?


/Peter



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Comparing txid and xmin (under BDR)

2015-05-11 Thread Peter Mogensen



On 2015-05-12 06:06, Craig Ringer wrote:

On 11 May 2015 at 21:10, Peter Mogensen a...@one.com wrote:


So ... I can easily get the current txid of the SELECT transaction by
calling txid_current().


Note that by doing so, you force txid allocation for a read-only query that
might otherwise not need one, which increases your txid burn rate and
decreases time until you need to do wraparound-protection vacuuming.


Good point.

Or, rather, it doesn't do anything different to what PostgreSQL its 
self does. I'm still not at all convinced that your desire to use 
transaction IDs for the cache invalidation stuff you're doing makes 
sense or will work sensibly even in a single-node environment. It's 
certainly not going to when comparing between nodes, especially in an 
async system. 


You could be right.

Let's recap

So - for an external cache in a system with high read-to-write ratio we 
can get very high cache hit ratios by having no TTL, but doing cache 
invalidation.
(about 98% i practice, so xid burn rate is probably not as bad as is may 
sound)


Invalidation events have 1 problem. (apart from how they are generated). 
There's a race condition.


We need to ensure that this sequence of events does not happen:
1) cache miss
2) DB read
3) DB invalidation event
4) cache invalidation applied
5) caching of the value read in 2)

 In a standard PostgreSQl setup, we can stop 5) from happening, by 
attaching txid_current() to the invalidation event in 3) and 
txid_snapshot_xmin() to the cache read and put a tombstone in the cache 
when doing invalidations.

(with a relative long TTL)

So ... when the value in 5) is about to get cached and if hits a 
tombstone in the cache it is only cached if the tombstone txid is older 
than the txid_snapshot_xmin of the new read - ie. if were sure that the 
invalidation took place before the value we're about to cache.


This scheme should work with standard Postgres. Also read-only slaves. 
But it won't work with BDR, since each node got its' own txid namespace.
Attaching txid_current() to an invalidation event has no meaning on 
other nodes.


On the other hand. ... it the invalidation event is stored in a BDR 
replicated table, then the xmin of the event row will have a meaning to 
the local node. Which was the only way I found to get the invalidation 
event placed in the local sequence of transactions.


/Peter






--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] BDR - triggers on receiving node?

2015-03-26 Thread Peter Mogensen



On 2015-03-26 11:57, Craig Ringer wrote:

If that's the case then BDR shouldn't make any difference.


It does. Because now with BDR you can't compare txid_current() as saved 
on the master with txid_snapshot_xmin() as read by the replica.
If however, you could save the txid associated with the application of 
the BDR replication on the target, you would again be able to compare.


Earlier you were speaking of (presumably not synchronous) streaming 
replicas, and writes on the master vs reads from the replica, and a 
way you avoid caching stale data from the read-replica using the 
application's cache manager to co-ordinate visibility horizons. It 
sounded like you were looking for a way to translate that into 
something that could work with BDR. Did I misunderstand? 


No. I think you understood, but the goal of using txid from the master 
was not to co-ordinate with the master. Only to match changes on the 
slave with reads on the slave.

The extra property of txid's being the same on the master was not used.
The problem when using BDR is that the slave has it's own txid 
sequence and I can't get the txid of all transactions doing changes, - 
since some of them are via BDR and doesn't cause triggers.




One way to handle that is to SELECT ... FOR SHARE in your reads, then
update the cache and not commit until the cache has been updated.

A concurrent UPDATE can then not proceed until the cache write has been
completed.


Yes, that would serialize the selects...
It's very complicated to integrate into the cache however, since it 
doesn't really know about how the values to cache are retrieved.



/Peter



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] BDR - triggers on receiving node?

2015-03-26 Thread Peter Mogensen



On 2015-03-26 12:56, Craig Ringer wrote:
My comment was with regards to it being on the local node. A master 
and synchronous replica isn't a local-node to local-node scenario. 


No. But all I'm exploiting is that change events to the local node see 
the same logical clock as SELECT statements to the same node.





As previously mentioned it's likely to be possible to add per-row apply
callbacks that can be written in plpgsql or other function languages, but
no such feature currently exists in BDR.


I'll put it on my wish list then :)


Firing real FOR EACH ROW ... ENABLE REPLICA triggers may also be
possible, but I haven't done any significant looking into it, and I'd be
concerned about the correctness of doing so for row triggers but not
statement triggers.


That too :)


One issue is that it'd probably have to be able to restrict writes to
unlogged tables or tables that aren't part of a replication set. If it
could write to replicated tables it'd be doing so in the context of the
apply worker, so the writes would not get replicated to other nodes. That'd
mean the triggers would be creating rows only on one node - and that's a
fast track to data divergence that can cause replication stalls and so on.


For the use case at hand, that's not a problem. The trigger would only 
need to update a fully local state. - like a table not part of the 
replication set and not related to any other tables. - like an FIFO 
queue of invalidation events.


/Peter



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] BDR - triggers on receiving node?

2015-03-26 Thread Peter Mogensen



On 2015-03-26 12:56, Craig Ringer wrote:
At this point I think commit timestamps are likely to be your best 
bet, and certainly what you should start looking into first. 


I've thought about this, but it seems that since these timestamps are 
made on the node doing the change and you have no way on knowing if one 
node in the cluster is far behind in it's replication to you, then you 
could make a select on the local node being much newer in timestamp than 
the transaction changing the value - which would only arrive later  (and 
making pg_get_latest_transaction_committime_data() go backwards).


But I think I might have found a way to make this work with BDR:

It seems that BDR translates the xmin column of tables from the sending 
node to the target node txid values.


So - instead of having a local FIFO table on each node, I could actually 
still (like with single-master replication) have a global FIFO table of 
invalidation events, being inserted into at the node actually making the 
change.


When that invalidation event reaches a node the row gets a local xmin 
value which can be compared to txid_snapshot_xmin() kept along every select.


So if I'm correct in observing that the xmin column gives me the a local 
txid of the transaction inserting the invalidation event - even if it 
was originally inserted on another node, then I think it'll work.


/Peter



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] BDR - triggers on receiving node?

2015-03-25 Thread Peter Mogensen



On 2015-03-25 12:32, Craig Ringer wrote:

On 25 March 2015 at 19:15, Peter Mogensen a...@one.com wrote:


Say ... I have a table in a BDR replicated database with an ON UPDATE
trigger. - and that trigger wants to locally find out the local
txid_snapshot_xmin() when a change was applied to the local node.


Why would you want to do that? Just out of interest?




Because I have a bunch of SELECT queries to the local node, which 
includes the txid_current() in their result, and I want to be able to 
recognize a result which was obtained prior to a change reaching the 
local node.


/Peter




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] BDR - triggers on receiving node?

2015-03-25 Thread Peter Mogensen

Hi,

Traditionally it hadn't made much sense to fire a trigger on a receiving 
replica node (slave) - for many reasons, including it being read-only.


But with BDRs multi parter, partial replication and the possibility that 
some tables are either actually or logically local to a single node, 
there's scenarios where this makes sense.


Is it possible?

Say ... I have a table in a BDR replicated database with an ON UPDATE 
trigger. - and that trigger wants to locally find out the local 
txid_snapshot_xmin() when a change was applied to the local node.
Not the upstream txid of the transaction which actually made the change 
originally (which only makes sense on that node) - but on the local node 
receiving the change via BDR LLSR.


Can that be done?

/Peter



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] BDR consensus algorithm

2014-12-19 Thread Peter Mogensen

Hi,

Can anyone provide a pointer a description of the bidirectional 
replication global sequences consensus algorithm?


The documentation says:
BDR uses a voting/election system to assign blocks from global 
sequences. The details are in the source code.

https://wiki.postgresql.org/wiki/BDR_Global_Sequences#How_global_sequences_are_managed_internally

But I can't find anything in the source remotely resembling an algorithm 
like PAXOS or RAFT.


kind regards,
Peter Mogensen


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Plan rows - 1 or many

2014-03-12 Thread Peter Mogensen

Hi,

I have an application where I would really like to be able to look at en 
SQL query and answer the question:


Is this query capable of returning more than 1 row?

So basically, queries are divided into 2 categories. Those that look up 
a single row (if it exists) and those who return a (possible empty) set 
of rows. (which could also be a set of 1 due to current database content.)


Example:
It's obvious to see that this query can never return more than one row, 
no matter what's in the DB:

SELECT * FROM mytable WHERE uniquecolumn = 17;

But only because we know the column has a unique constraint.
Now, unique constraints might be more complex. It could be composite 
(A,B) where A is a foreign key to another table (Ta) where we can 
determine that both row referenced by A is unique wrt. the query, so 
filtering on B and a unique column in Ta would still only ever result in 
1 row.


SELECT * FROM Ta,Tb WHERE Ta.id = Tb.A AND Tb.B = 17 AND Ta.uniquecolumn 
= 42;


This seems to be about the same as the query planners plan_rows try to 
do. I'm aware that plan_rows is only an estimate and can't answer the 
question of exactly how many rows in general a query returns, but the 
question is whether it can be trusted to answer the above more narrow 
question?


I tried look at the source, but could quite figure out if plan_rows were 
intended to be strict about when it concluded that there was only one 
row. (like when it evaluated a GROUP BY with only 1 aggregate).


Can I conclude that when plan_rows is 1 then there will never be more 
than 1 row returned by the query?


/Peter



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Plan rows - 1 or many

2014-03-12 Thread Peter Mogensen

On 2014-03-12 09:28, Albe Laurenz wrote:

Peter Mogensen wrote:

I have an application where I would really like to be able to look at en
SQL query and answer the question:

Is this query capable of returning more than 1 row?



Can I conclude that when plan_rows is 1 then there will never be more
than 1 row returned by the query?


I would say no.

If the planner estimates one row, that means that it guesses that
it will be one or less.


I think guesses is the operative word here. Because, if the planner 
can actually promise that for the example queries a gave there would be 
one or less rows, then I'm happy.



And a query that can potentially return many rows can certainly
also return only one row or none at all.


But that's fine... I have no problem with an answer telling me that this 
query can return any number of rows.


Hmm.. I think my question is answered by posing it in another way:

Could plan rows tell me whether the result of a given query could be 
changed by later INSERT statements?


And the answer becomes no, because there's a difference between 
queries of the type I gave as example and queries using aggregates and 
GROUP BY. ... for which the planner would also return plan_rows = 1.
However... GROUP BY queries could have their result changed by later 
INSERT statements.


/Peter



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general