Hi!

I want to review this patch set, though I understand that it will probably be 
quite a long process.

I like the idea that with this patch set all postgres instances are universally 
bound into a single distributed DB, even if they have never heard about each 
other before :) This is just amazing. Or am I getting something wrong?

I've got a few questions:
1. If we coordinate HA clusters with replicas, can a replica participate if 
its part of the transaction is read-only?
2. How does an InDoubt transaction behave when we add or subtract leap seconds?

Also, I could not understand some notes from Arseny:

> On 25 July 2018, at 16:35, Arseny Sher <a.s...@postgrespro.ru> wrote:
> 
> * One drawback of these patches is that only REPEATABLE READ is
>   supported. For READ COMMITTED, we must export every new snapshot
>   generated on coordinator to all nodes, which is fairly easy to
>   do. SERIALIZABLE will definitely require chattering between nodes,
>   but that's much less demanded isolevel (e.g. we still don't support
>   it on replicas).

If all shards execute their parts of a transaction in SERIALIZABLE, what 
anomalies does this still permit?

Say there are two distributed transactions, 1 and 2, each touching servers A 
and B. Server A serializes transaction 1 before transaction 2, but server B 
serializes 2 before 1, so there is no single global serial order, right?

Maybe we can somehow detect this and abort transaction 1 or 2?
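
To check my understanding, here is a toy interleaving of the anomaly I have in 
mind, just a Python sketch, nothing from the patch:

    node_a = {"x": 0}          # node A stores x
    node_b = {"y": 0}          # node B stores y

    t1_read_x = node_a["x"]    # T1 on A: reads x = 0
    t2_read_y = node_b["y"]    # T2 on B: reads y = 0
    node_b["y"] = 1            # T1 on B: writes y
    node_a["x"] = 1            # T2 on A: writes x

    # Node A saw only r1(x) before w2(x): locally serializable as T1 -> T2.
    # Node B saw only r2(y) before w1(y): locally serializable as T2 -> T1.
    # No global serial order lets both reads return 0, yet neither node
    # alone sees a dependency cycle to abort on.
    assert t1_read_x == 0 and t2_read_y == 0

Each node's local SSI checker is happy here, so if anything is to be aborted, 
some cross-node information seems unavoidable.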

> 
> * Another somewhat serious issue is that there is a risk of recency
>   guarantee violation. If client starts transaction at node with
>   lagging clocks, its snapshot might not include some recently
>   committed transactions; if client works with different nodes, she
>   might not even see her own changes. CockroachDB describes at [1] how
>   they and Google Spanner overcome this problem. In short, both set
>   hard limit on maximum allowed clock skew.  Spanner uses atomic
>   clocks, so this skew is small and they just wait it at the end of
>   each transaction before acknowledging the client. In CockroachDB, if
>   tuple is not visible but we are unsure whether it is truly invisible
>   or it's just the skew (the difference between snapshot and tuple's
>   csn is less than the skew), transaction is restarted with advanced
>   snapshot. This process is not infinite because the upper border
>   (initial snapshot + max skew) stays the same; this is correct as we
>   just want to ensure that our xact sees all the committed ones before
>   it started. We can implement the same thing.
I think this situation is also covered by Clock-SI, since transactions do not 
leave the InDoubt state before they become visible. But I'm not sure; chances 
are I'm getting something wrong, so I'll think more about it. I'd be happy to 
hear comments from Stas about this.
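
For my own understanding, here is the CockroachDB restart logic as I read it, 
as a Python sketch (MAX_CLOCK_SKEW and the csn list are made-up placeholders):

    MAX_CLOCK_SKEW = 250  # assumed hard limit on clock skew, in ms

    def advance_snapshot(start_ts, tuple_csns):
        """A csn in (snapshot, start_ts + MAX_CLOCK_SKEW] may be invisible
        only because of skew, so we retry with an advanced snapshot.  The
        upper border never moves, hence the loop terminates."""
        snapshot = start_ts
        upper = start_ts + MAX_CLOCK_SKEW
        while True:
            ambiguous = [c for c in tuple_csns if snapshot < c <= upper]
            if not ambiguous:
                return snapshot        # every csn is clearly in or out
            snapshot = max(ambiguous)  # restart with advanced snapshot

If that reading is right, the snapshot can only advance up to the fixed upper 
border, which is why the process is not infinite.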
> 
> 
> * 003_bank_shared.pl test is removed. In current shape (loading one
>   node) it is useless, and if we bombard both nodes, deadlock surely
>   appears. In general, global snapshots are not needed for such
>   multimaster-like setup -- either there are no conflicts and we are
>   fine, or there is a conflict, in which case we get a deadlock.
Can we do something about this deadlock? Would placing an upper limit on the 
time a transaction can stay in the InDoubt state fix the issue? I understand 
that aborting automatically is kind of dangerous...
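
To spell out what I mean by an upper limit, a purely hypothetical sketch (the 
knob name and numbers are invented, not anything from the patch):

    import time

    INDOUBT_TIMEOUT = 5.0  # seconds; a made-up knob

    def wait_for_resolution(is_resolved):
        """Instead of blocking on an InDoubt transaction forever (and
        possibly deadlocking), give up after a deadline so that one side
        of the deadlock backs off with an error."""
        deadline = time.monotonic() + INDOUBT_TIMEOUT
        while not is_resolved():
            if time.monotonic() > deadline:
                raise TimeoutError("transaction still InDoubt after %.1fs"
                                   % INDOUBT_TIMEOUT)
            time.sleep(0.01)

That converts the deadlock into a timeout error on one side, which may be more 
palatable than aborting a transaction outright.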

Also, a hanging 2PC transaction can currently cause a lot of headache for the 
DBA. Can we have some kind of protection for the case when one node goes away 
permanently in the middle of a transaction?
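
For plain 2PC today one can at least watch pg_prepared_xacts; a hypothetical 
watchdog along these lines (the DSN and the 10-minute threshold are arbitrary) 
could be a starting point:

    import psycopg2

    # Flag prepared transactions that have been sitting around for too
    # long so the DBA can look at them.  It only reports; automatic
    # ROLLBACK PREPARED is exactly the dangerous part.
    conn = psycopg2.connect("dbname=postgres")  # placeholder DSN
    with conn, conn.cursor() as cur:
        cur.execute("""
            SELECT gid, prepared, owner, database
            FROM pg_prepared_xacts
            WHERE prepared < now() - interval '10 minutes'
        """)
        for gid, prepared, owner, database in cur.fetchall():
            print("stale 2PC transaction %s in %s (prepared %s by %s)"
                  % (gid, database, prepared, owner))

But with this patch set the InDoubt state adds a new way to hang, so maybe it 
deserves similar visibility.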

Thanks!

Best regards, Andrey Borodin.
