Hi,

I am trying to create a version of the COPY command that can scatter/replicate 
data to different nodes based on some distribution method.
There is a master process, which has information about the data distribution 
and to which all clients are connected.
This master process should receive the copied data from the client and scatter 
the tuples to the nodes.
Maybe somebody can recommend the best way of implementing such a COPY agent?

The obvious plan is the following:

1. Register utility callback
2. Handle T_CopyStmt in this callback
3. Use BeginCopyFrom/NextCopyFrom to receive tuples from client
4. Calculate distribution function for the received tuple
5. Establish a connection with the corresponding node (if not yet established) 
and start the same COPY command on this node (if not started yet).
6. Send data to this node using PQputCopyData.
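
To make step 4 concrete, here is a minimal sketch of one possible distribution 
function. It assumes hash distribution and that the distribution key is already 
available as a byte string. The names fnv1a and choose_node are hypothetical; 
they are not PostgreSQL functions, and FNV-1a is only a stand-in for whatever 
hash the real system would use.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a over the key bytes (stand-in hash, an assumption). */
static uint32_t fnv1a(const unsigned char *data, size_t len)
{
    uint32_t h = 2166136261u;           /* FNV offset basis */

    for (size_t i = 0; i < len; i++)
    {
        h ^= data[i];
        h *= 16777619u;                 /* FNV prime */
    }
    return h;
}

/* Map a distribution key to a node index in [0, nnodes); nnodes must be > 0. */
static int choose_node(const unsigned char *key, size_t len, int nnodes)
{
    return (int) (fnv1a(key, len) % (uint32_t) nnodes);
}
```

The returned index would then select the connection (and the already started 
COPY) used in steps 5 and 6.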

The problem is with step 6: I do not see any way to copy the received data to 
the destination node.
NextCopyFrom returns an array of values (Datums) for the tuple's columns, but 
there are no public functions to write a tuple to the copy stream.
All this logic is implemented in src/backend/commands/copy.c and is not 
available outside this module.

It is more or less clear how to do it in text or CSV mode: I can use 
NextCopyFromRawFields and then construct a line from the comma-separated list 
of values.
But how to handle binary mode? Also, I suspect that COPY in text mode is 
significantly slower than in binary mode, isn't it?
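
For the text case, the line construction could look roughly like the sketch 
below. It covers COPY text format only (backslash escapes, \N for NULL), not 
CSV quoting, and it assumes the raw fields have already been de-escaped, so 
they must be escaped again. build_text_line and put are hypothetical helper 
names.

```c
#include <stddef.h>

/* Append one byte to buf if there is room; returns 0 when the buffer is full. */
static int put(char *buf, size_t *pos, size_t cap, char c)
{
    if (*pos >= cap)
        return 0;
    buf[(*pos)++] = c;
    return 1;
}

/*
 * Build one line in COPY text format from nfields C strings.
 * A NULL pointer is written as \N. Backslash, newline, carriage return,
 * tab and the delimiter are backslash-escaped.
 * Returns the line length, or (size_t) -1 if buf is too small.
 */
static size_t build_text_line(char *buf, size_t cap,
                              const char *const *fields, int nfields,
                              char delim)
{
    size_t pos = 0;

    for (int i = 0; i < nfields; i++)
    {
        if (i > 0 && !put(buf, &pos, cap, delim))
            return (size_t) -1;

        if (fields[i] == NULL)
        {
            if (!put(buf, &pos, cap, '\\') || !put(buf, &pos, cap, 'N'))
                return (size_t) -1;
            continue;
        }

        for (const char *p = fields[i]; *p; p++)
        {
            char out = *p;
            int  escape = 1;

            if (*p == '\n')
                out = 'n';
            else if (*p == '\r')
                out = 'r';
            else if (*p == '\t')
                out = 't';
            else if (*p != '\\' && *p != delim)
                escape = 0;             /* ordinary character */

            if (escape && !put(buf, &pos, cap, '\\'))
                return (size_t) -1;
            if (!put(buf, &pos, cap, out))
                return (size_t) -1;
        }
    }

    if (!put(buf, &pos, cap, '\n'))
        return (size_t) -1;
    return pos;
}
```

The resulting buffer and length are what would be passed to PQputCopyData for 
the chosen node.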

The dirty solution is just to cut and paste the copy.c code. But maybe there is 
a more elegant way?
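
For reference, the binary framing itself is small, as far as I understand the 
documented COPY BINARY format: a 19-byte header, then for each tuple a 16-bit 
field count followed by a 32-bit length and the bytes of each field (length -1 
for NULL), then a 16-bit -1 trailer. The sketch below assumes each field has 
already been converted to its binary representation (in the backend that would 
presumably mean calling the type's send function). BinField, write_header, 
write_tuple and write_trailer are hypothetical names, not part of copy.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One column value already in binary (send function) form; len < 0 is NULL. */
typedef struct BinField
{
    const void *data;
    int32_t     len;
} BinField;

/* Store integers in network byte order, as the binary format requires. */
static size_t put_u16(unsigned char *out, uint16_t v)
{
    out[0] = (unsigned char) (v >> 8);
    out[1] = (unsigned char) (v & 0xFF);
    return 2;
}

static size_t put_u32(unsigned char *out, uint32_t v)
{
    out[0] = (unsigned char) (v >> 24);
    out[1] = (unsigned char) ((v >> 16) & 0xFF);
    out[2] = (unsigned char) ((v >> 8) & 0xFF);
    out[3] = (unsigned char) (v & 0xFF);
    return 4;
}

/* Header: 11-byte signature, 32-bit flags (0), 32-bit extension length (0). */
static size_t write_header(unsigned char *out)
{
    memcpy(out, "PGCOPY\n\377\r\n\0", 11);
    memset(out + 11, 0, 8);
    return 19;
}

/* Returns the number of bytes written, or 0 if the tuple does not fit. */
static size_t write_tuple(unsigned char *out, size_t cap,
                          const BinField *fields, int nfields)
{
    size_t need = 2;
    size_t pos;

    for (int i = 0; i < nfields; i++)
        need += 4 + (fields[i].len > 0 ? (size_t) fields[i].len : 0);
    if (need > cap)
        return 0;

    pos = put_u16(out, (uint16_t) nfields);
    for (int i = 0; i < nfields; i++)
    {
        if (fields[i].len < 0)
        {
            pos += put_u32(out + pos, 0xFFFFFFFFu);   /* NULL marker */
            continue;
        }
        pos += put_u32(out + pos, (uint32_t) fields[i].len);
        if (fields[i].len > 0)
        {
            memcpy(out + pos, fields[i].data, (size_t) fields[i].len);
            pos += (size_t) fields[i].len;
        }
    }
    return pos;
}

/* Trailer: a 16-bit word containing -1. */
static size_t write_trailer(unsigned char *out)
{
    return put_u16(out, 0xFFFF);
}
```

Each destination node would need its own header and trailer around the tuples 
routed to it. This only covers the framing; getting from a Datum to the field 
bytes is still the part that copy.c keeps private.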

Thanks in advance,
Konstantin






--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
