Andreas Pflug wrote:
> Jan Wieck wrote:
>> On 6/6/2006 12:04 PM, Christopher Browne wrote:
>>
>>> Either way, it would substantially complicate the subscription
>>> process :-(.
>>
>> Since we are now substantially speeding up the copy_set, I don't see
>> how Slony is more of a problem than pg_dump.
>
> While Slony may be as fast as pg_dump now, it could be faster in some
> situations: when copying binary data, COPY is substantially slower
> than COPY BINARY. An option to advise slon to use COPY BINARY would be
> a first step, even better if this could be defined per table (and even
> better, if pgsql had a COPY option that's equally efficient for text
> and binary data).
>
> Regards,
> Andreas

Unfortunately, COPY BINARY is not portable across machine architectures and PostgreSQL versions, thereby making it a completely unacceptable step.

Interestingly, the last time I set up large subscriptions, I found that reindexing was a pretty material portion of the time involved.

--> Copying across a WAN link

2006-05-30 01:12:16 UTC DEBUG2 remoteWorkerThread_36: copy table "public"."trans_log"
2006-05-30 01:12:17 UTC DEBUG2 remoteWorkerThread_36: Begin COPY of table "public"."trans_log"
2006-05-30 02:26:10 UTC DEBUG2 remoteWorkerThread_36: 6158933843 bytes copied for table "public"."trans_log"
2006-05-30 03:00:47 UTC DEBUG2 remoteWorkerThread_36: 6510.279 seconds to copy table "public"."trans_log"

--> Copying the same data (populating another node) across local LAN

2006-05-30 06:54:10 UTC DEBUG2 remoteWorkerThread_36: copy table "public"."trans_log"
2006-05-30 06:54:10 UTC DEBUG2 remoteWorkerThread_36: Begin COPY of table "public"."trans_log"
2006-05-30 07:19:32 UTC DEBUG2 remoteWorkerThread_36: 6169849504 bytes copied for table "public"."trans_log"
2006-05-30 07:53:20 UTC DEBUG2 remoteWorkerThread_36: 3550.417 seconds to copy table "public"."trans_log"

Note that when the link was relatively slow, the COPY itself took about 1:14 (hours:minutes), as compared to about 0:25 across the faster local link. But the time required to generate indexes was relatively fixed: roughly 0:34 to 0:35 in both cases.

I don't see it being too likely that changing to COPY BINARY would provide a material improvement in either case:

- In the first case, the speed of the link was the bottleneck, and changing to a non-portable binary representation won't change this.

- In the second case, it took longer to regenerate indexes than it did to copy the data. Changing to a non-portable binary representation wouldn't change this.

I don't see there being a material benefit to be found in switching over to a BINARY format. Certainly not when it would lead to a loss of portability.
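
For anyone who wants to check the split between the COPY phase and the work that follows it, the durations can be derived by subtracting the log timestamps quoted above. This is only a rough sketch: the names `RUNS`, `PHASES` and `phase_seconds` are made up for illustration, and it assumes that everything between the "bytes copied" line and the "seconds to copy table" line is index regeneration, which is how the logs are being read in this post.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# (Begin COPY, bytes copied, "seconds to copy table") timestamps,
# taken from the two log excerpts quoted in this post.
RUNS = {
    "WAN": ("2006-05-30 01:12:17", "2006-05-30 02:26:10", "2006-05-30 03:00:47"),
    "LAN": ("2006-05-30 06:54:10", "2006-05-30 07:19:32", "2006-05-30 07:53:20"),
}


def phase_seconds(begin, copied, done):
    """Return (copy_seconds, post_copy_seconds) for one table copy."""
    t0, t1, t2 = (datetime.strptime(s, FMT) for s in (begin, copied, done))
    return int((t1 - t0).total_seconds()), int((t2 - t1).total_seconds())


PHASES = {name: phase_seconds(*stamps) for name, stamps in RUNS.items()}

for name, (copy_s, post_s) in PHASES.items():
    # Assumption: the post-COPY time is attributed to rebuilding indexes.
    print(f"{name}: COPY {copy_s / 60:.1f} min, "
          f"post-COPY (indexes) {post_s / 60:.1f} min")
```

On these timestamps the COPY phase comes out at about 74 minutes over the WAN and about 25 minutes over the LAN, while the post-COPY phase is about 35 and 34 minutes respectively.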
