I have extended COPY to support using a UDF as a target instead of the
normal 'file' or STDOUT targets.  This dovetails nicely with a couple
of extensions I have also written for dblink for the purposes of
enabling direct cross-node bulk loading and replication.  Please
peruse the patches (the non-test containing patches also possess
robust human-readable summaries and explanations) that are In-Reply-To
this email for more details.

You can also get these patches from a git repo.  These patches are
applied against the history tracked by git.postgresql.org:

  git fetch http://fdr.lolrus.org/postgresql.git                     \
    copy-to-function:copy-to-function
  git checkout copy-to-function

While the functionality exposed in these patches has appeared robust
and efficient to us at Truviso, the code had ease-of-upstream merging
as a central design point, and as such I have shied away from adding
more invasive functionality that would make the interface less
byzantine/more usable.  This was intended to be the most surgical cut
before it seemed likely that this might be interesting to the
PostgreSQL project.

At least one additional datapoint of someone else wanting such a
functionality is seen in this thread:

  http://archives.postgresql.org/pgsql-hackers/2009-08/msg00428.php

Open Issues:

 * setup/copy/teardown and error handling: as-is it is unhealthily
   tempting to use a global variable (as seen in the dblink patches)
   to track state between setup/copy/cleanup steps.  I'm not sure what
   the right aesthetic is to make this a little more controlled than
   calling specific functions in exactly the right order.

 * 'transition state': similar to an aggregate, it may make sense for
   the target of TO FUNCTION to have a context in which it can stash
   state, or at least have access to a few constant parameters as it
   accepts records.  If such functionality existed one might be able
   to conveniently rewrite the current COPY ... TO (STDOUT|'file')
   behavior to be syntactic sugar for TO FUNCTION behavior, which is
   somewhat aesthetically pleasing to me.

 * It might be interesting to increase the symmetry of this operation
   allowing COPY to bulk load into UDFs.  With that in mind, the
   design the interfaces may change...or not.

This work is released under the BSD license as utilized by the
PostgreSQL project.  The copyright owner is Truviso, Inc in 2009.

Daniel Farina (4):
  Add "COPY ... TO FUNCTION ..." support
  Add tests for "COPY ... TO FUNCTION ..."
  Add dblink functions for use with COPY ... TO FUNCTION ...
  Add tests to dblink covering use of COPY TO FUNCTION

 contrib/dblink/dblink.c             |  190 ++++++++++++++++++++++++
 contrib/dblink/dblink.h             |    5 +
 contrib/dblink/dblink.sql.in        |   20 +++
 contrib/dblink/expected/dblink.out  |  272 +++++++++++++++++++++++++++++++++++
 contrib/dblink/sql/dblink.sql       |  112 ++++++++++++++
 contrib/dblink/uninstall_dblink.sql |    8 +
 src/backend/catalog/namespace.c     |   21 +++
 src/backend/commands/copy.c         |  190 +++++++++++++++++++++----
 src/backend/executor/spi.c          |    2 +-
 src/backend/nodes/copyfuncs.c       |    2 +-
 src/backend/nodes/equalfuncs.c      |    2 +-
 src/backend/parser/gram.y           |   30 +++--
 src/include/catalog/namespace.h     |    1 +
 src/include/nodes/parsenodes.h      |    3 +-
 src/test/regress/input/copy.source  |   38 +++++
 src/test/regress/output/copy.source |   69 +++++++++
 src/test/regress/regress.c          |   56 +++++++
 17 files changed, 982 insertions(+), 39 deletions(-)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to