Tom Lane wrote:
> Ultimately, there's always going to be a tradeoff between speed and
> flexibility.  It may be that we should just say "if you want to import
> dirty data, it's gonna cost ya" and not worry about the speed penalty
> of subtransaction-per-row.  But that still leaves us with the 2^32
> limit.  I wonder whether we could break down COPY into sub-sub
> transactions to work around that...
Regarding that tradeoff between speed and flexibility, I think we could propose multiple options:
- maximum speed: the current implementation, which fails on the first error
- speed with error logging: the COPY command fails if there is an error but continues to log all errors
- speed with error logging, best effort: no use of sub-transactions, but errors that can safely be trapped with PG_TRY/PG_CATCH (no index violation, no BEFORE INSERT trigger, etc.) are logged and the command can complete
- pre-loading (2-phase copy): phase 1 copies good tuples into a [temp] table and bad tuples into an error table; phase 2 pushes the good tuples to the destination table. Note that if phase 2 fails, it can be retried since the temp table would be dropped only on success of phase 2 (see the sketch after this list)
- slow but flexible: have every row in a sub-transaction -> is there any real benefit compared to pg_loader?
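A minimal sketch of what the pre-loading (2-phase copy) option could look like if done by hand in SQL today; the staging, errors and destination tables, the file path, and the cast-based validity check are all placeholders for illustration:

  -- Phase 1: load raw lines into a one-column staging table so that
  -- malformed values cannot abort the load.
  CREATE TEMP TABLE staging (raw_line text);
  COPY staging FROM '/path/to/data.csv';

  -- Separate the tuples that would fail the integer cast into an error table.
  CREATE TEMP TABLE errors AS
      SELECT raw_line FROM staging
      WHERE split_part(raw_line, ',', 2) !~ '^[0-9]+$';

  -- Phase 2: push the good tuples to the destination table.  The temp
  -- tables only go away once this succeeds, so this phase can be retried.
  INSERT INTO destination (name, qty)
      SELECT split_part(raw_line, ',', 1),
             split_part(raw_line, ',', 2)::int
      FROM staging
      WHERE split_part(raw_line, ',', 2) ~ '^[0-9]+$';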

Tom also suggested 'refactoring COPY into a series of steps that the user can control'. What would these steps be? Would they be per row and allow discarding a bad tuple?
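For what it's worth, the "slow but flexible" per-row behavior can already be approximated in PL/pgSQL, since each EXCEPTION block runs in its own subtransaction; the sketch below reuses the placeholder staging/errors/destination tables from above and is only meant to illustrate the per-row subtransaction cost versus the ability to discard bad tuples:

  -- Every iteration pays for a subtransaction (the EXCEPTION block),
  -- which is the per-row overhead discussed above, but bad tuples are
  -- diverted to the error table instead of aborting the whole load.
  CREATE OR REPLACE FUNCTION load_with_skip() RETURNS void AS $$
  DECLARE
      r record;
  BEGIN
      FOR r IN SELECT raw_line FROM staging LOOP
          BEGIN
              INSERT INTO destination (name, qty)
              VALUES (split_part(r.raw_line, ',', 1),
                      split_part(r.raw_line, ',', 2)::int);
          EXCEPTION WHEN others THEN
              INSERT INTO errors VALUES (r.raw_line);
          END;
      END LOOP;
  END;
  $$ LANGUAGE plpgsql;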

Emmanuel

--
Emmanuel Cecchet
FTO @ Frog Thinker Open Source Development & Consulting
--
Web: http://www.frogthinker.org
email: m...@frogthinker.org
Skype: emmanuel_cecchet


