Greg Smith wrote:
Robert Haas wrote:
I'm fuzzy on what problem this is attempting to solve... as mentioned
in the above guidelines, it's usually good to start with some design
discussions before writing/submitting code.
This has been through some heavy design discussions with a few PG
hackers you know and some you don't; they just couldn't release the
result until now. As for what it's good for, if you look at what you
can do now with dblink, you can easily move rows between nodes using
dblink_build_sql_insert. This is perfectly fine for small bits of
work, but the performance isn't nearly good enough to do serious
replication with it. The upper level patch here allows using COPY as
the mechanism to move things between them, which is much faster for
some use cases (which include most of the really useful ones). It
dramatically increases the scale of what you can move around using
dblink as the replication transport.
I recently found myself trying to push data through dblink() and ended
up writing code that calls a function on the target, which in turn
calls back to the source to select the data and insert it. The speedup
was massive, so I'll be interested to dig into the details here.
The lower level patch is needed to build that layer, which is an
immediate proof of its utility. In addition, adding a user-defined
function as a COPY target opens up all sorts of possibilities for
things like efficient ETL implementation. And if this approach is
accepted as a reasonable one, then, as Dan suggested, a next step might even
be to similarly allow passing COPY FROM through a UDF, which has the
potential to provide a new efficient implementation path for some of
the custom input filter requests that pop up here periodically.
I'm also interested in COPY returning rows as text[], which was
discussed recently. Does this help move us towards that?
cheers
andrew