Andreas Pflug wrote:
> Jan Wieck wrote:
>> On 6/6/2006 12:04 PM, Christopher Browne wrote:
>>
>>> Either way, it would substantially complicate the subscription
>>> process :-(.
>>
>> Since we are now substantially speeding up the copy_set, I don't see
>> how Slony is more of a problem than pg_dump.
>
> While Slony may be as fast as pg_dump now, it could be faster in some
> situations: when copying binary data, COPY is substantially slower
> than COPY BINARY. An option to advise slon to use COPY BINARY would be
> a first step, even better if this could be defined per table (and even
> better, if pgsql had a COPY option that's equally efficient for text
> and binary data).
>
> Regards,
> Andreas
>
Unfortunately, COPY BINARY is not portable across machine architectures
and PostgreSQL versions, thereby making it a completely unacceptable step.

Interestingly, the last time I set up large subscriptions, I found that
reindexing was a pretty material portion of the time involved.

---> Copying across a WAN link

2006-05-30 01:12:16 UTC DEBUG2 remoteWorkerThread_36: copy table "public"."trans_log"
2006-05-30 01:12:17 UTC DEBUG2 remoteWorkerThread_36: Begin COPY of table "public"."trans_log"
2006-05-30 02:26:10 UTC DEBUG2 remoteWorkerThread_36: 6158933843 bytes copied for table "public"."trans_log"
2006-05-30 03:00:47 UTC DEBUG2 remoteWorkerThread_36: 6510.279 seconds to copy table "public"."trans_log"

---> Copying the same data (populating another node) across the local LAN

2006-05-30 06:54:10 UTC DEBUG2 remoteWorkerThread_36: copy table "public"."trans_log"
2006-05-30 06:54:10 UTC DEBUG2 remoteWorkerThread_36: Begin COPY of table "public"."trans_log"
2006-05-30 07:19:32 UTC DEBUG2 remoteWorkerThread_36: 6169849504 bytes copied for table "public"."trans_log"
2006-05-30 07:53:20 UTC DEBUG2 remoteWorkerThread_36: 3550.417 seconds to copy table "public"."trans_log"

Note that over the relatively slow WAN link, the COPY itself took about
1:14 (hours:minutes), as compared to about 0:25 across the faster local
link.

But the time required to generate indexes was relatively fixed: about
0:34 in both cases.
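
For anyone who wants to check the arithmetic, here is a small Python
sketch that recomputes the two phases from the log timestamps quoted
above. It assumes that everything between the "bytes copied" line and
the final "seconds to copy table" line is index regeneration, which is
my reading of the logs rather than something slon reports directly. The
names (RUNS, phases, summarize) are made up for this illustration.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# (label, "Begin COPY" time, "bytes copied" time, "seconds to copy table" time)
# Timestamps are taken from the slon log excerpts quoted in this message.
RUNS = [
    ("WAN", "2006-05-30 01:12:17", "2006-05-30 02:26:10", "2006-05-30 03:00:47"),
    ("LAN", "2006-05-30 06:54:10", "2006-05-30 07:19:32", "2006-05-30 07:53:20"),
]

def phases(begin, copied, done):
    """Return (copy_phase, post_copy_phase) as timedeltas."""
    t0, t1, t2 = (datetime.strptime(s, FMT) for s in (begin, copied, done))
    return t1 - t0, t2 - t1

def summarize():
    """Map each run label to its (copy_phase, post_copy_phase) pair."""
    return {label: phases(b, c, d) for label, b, c, d in RUNS}

if __name__ == "__main__":
    for label, (copy, post) in summarize().items():
        print(f"{label}: COPY {copy}, post-COPY {post}")
    # WAN: COPY 1:13:53, post-COPY 0:34:37
    # LAN: COPY 0:25:22, post-COPY 0:33:48
```

The post-COPY phase differs by under a minute between the two runs,
while the COPY phase differs by roughly a factor of three.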

I don't see it being too likely that changing to COPY BINARY would
provide a material improvement in either case:

- In the first case, the speed of the link was the bottleneck, and
changing to a non-portable binary representation won't change this.

- In the second case, it took longer to regenerate indexes than it did
to copy the data.  Changing to a non-portable binary representation
wouldn't change this.
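
To put rough numbers on the two cases, the following Python sketch
divides the byte counts from the logs by the length of the COPY phase
(the gap between the "Begin COPY" and "bytes copied" timestamps). This
only measures the text-format COPY stream as slon logged it; it says
nothing about what COPY BINARY would have achieved. RUNS and
throughput_mb_per_s are names invented for this example.

```python
# Byte counts come from the "bytes copied" log lines; the COPY-phase seconds
# are the gaps between the "Begin COPY" and "bytes copied" timestamps.
RUNS = {
    "WAN": (6158933843, 4433),  # 01:12:17 -> 02:26:10
    "LAN": (6169849504, 1522),  # 06:54:10 -> 07:19:32
}

def throughput_mb_per_s(nbytes, seconds):
    """Average throughput in MB/s (1 MB = 10**6 bytes)."""
    return nbytes / seconds / 1e6

for label, (nbytes, seconds) in RUNS.items():
    print(f"{label}: {throughput_mb_per_s(nbytes, seconds):.2f} MB/s")
# WAN: 1.39 MB/s
# LAN: 4.05 MB/s
```

The WAN figure works out to roughly 11 Mbit/s.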

I don't see there being a material benefit to be found in switching over
to a BINARY format.  Certainly not when it would lead to a loss of
portability.
_______________________________________________
Slony1-general mailing list
[email protected]
http://gborg.postgresql.org/mailman/listinfo/slony1-general
