Hi,

On Sunday, July 01, 2012 05:51:54 PM Alexander Korotkov wrote:
> On Sun, Jul 1, 2012 at 3:11 PM, Alexander Korotkov 
<aekorot...@gmail.com>wrote:
> > 1) Patches don't apply cleanly to head. So I used commit
> > bed88fceac04042f0105eb22a018a4f91d64400d as the base for patches, then
> > all the patches close to this apply cleanly. Regression tests pass OK,
> > but it seems that new functionality isn't covered by regression tests.
I don't think the designs is stable enough yet to justify regression tests 
just yet. We also don't really have the infrastructure for testing things like 
this :(

> > 2) Patch needs more comments. I think we need at least one comment in
> > head of each function describing its behaviour, even if it is evident
> > from function name.
I can do that.

> > 4) There is significant code duplication in APPLY_CACHE_CHANGE_UPDATE and
> > APPLY_CACHE_CHANGE_DELETE branches of case in apply_change function. I
> > think this could be refactored to reduce code duplication.
Yes. I did a quick stab at that when writing the code but its different enough 
in the details (i.e. fastgetattr(&change->oldtuple->tuple, i + 1, index_desc, 
&isnull) vs fastgetattr(&change->newtuple->tuple, pkattnum[i], desc, &isnull)) 
and some others that it wasn't totally obvious how a good api would look like.

> > 5) Apply mechanism requires PK from each table. So, throwing error here
> > if we don't find PK is necessary. But we need to prevent user from run
> > logical replication earlier than failing applying received messages.
> > AFACS patch which is creating corresponding log messages is here:
> > http://archives.postgresql.org/message-id/1339586927-13156-7-git-send-ema
> > il-and...@2ndquadrant.com. And it throws any warning if it fails to find
> > PK. On which stage we prevent user from running logical replication on
> > tables which doesn't have PK?
We don't have that layer yet ;). As we currently don't have any of the 
configuration nailed down I annot answer that yet.
The reason it doesn't throw an error when generating the additional data is 
that it can easily get you in a situation where you cannot get out of.

> > 6) I've seen comment /* FIXME: locking */. But you open index with
> > command index_rel = index_open(indexoid, AccessShareLock);
> > and close it with command
> > heap_close(index_rel, NoLock);
> > Shouldn't we use same level of locking on close as on open?
We cannot release the locks before the transaction ends. If you do an (heap|
index)_close with NoLock specified the relation will be rleeased at the end of 
transaction.

> > Also,
> > heap_close doesn't looks good to me for closing index. Why don't use
> > index_close or relation_close?
Good point. index_close and heap_close currently do exactly the same thing but 
that doesn't mean it will always do so.

> > 7) We find each updated and deleted tuple by PK. Imagine we update
> > significant part of the table (for example, 10%) in single query and
> > planner choose sequential scan for it. Then applying of this changes
> > could be more expensive than doing original changes. This it probably
> > ok. But, we could do some heuristics: detect that sequential scan is
> > cheaper because of large amount of updates or deletes in one table.
I don't think thats a good idea. We only ever update a single row at once. You 
need *loads* of non-HOT updates to the same row for a seqscan to be faster 
than a single, one-row, index lookup.
Now if we would update the whole table at once and if its smaller than RAM it 
might be beneficial to preload the whole table into the cache. But thats 
nothing we should at this level imo.

> 8) If we can't find tuple for update or delete we likely need to put PK
> value into the log message.
Yes, thats a good idea.

Greetings,

Andres
-- 
Andres Freund           http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to