Hi,

>
> Another approach would be to distinguish between errors that require a
> subtransaction to recover to a consistent state, and less serious errors
> that don't have this requirement (e.g. invalid input to a data type
> input function). If all the errors that we want to tolerate during a
> bulk load fall into the latter category, we can do without
> subtransactions.
>

I think the errors which occur after we have done a heap_insert of the
tuple generated from the current input row are the ones which would require
a subtransaction to recover. Examples could be unique/primary key
violation errors or FKey/trigger related errors. Any errors which occur
before doing the heap_insert should not, in my view, require any recovery.

The overhead of having a subtransaction per row is a very valid concern. But
instead of using a per-insert or per-batch subtransaction, I am
thinking that we can start off a subtransaction and continue it until we
encounter a failure. The moment an error is encountered, since we have the
offending (already in heap) tuple around, we can call simple_heap_delete
on it and commit (instead of aborting) this subtransaction after doing
some minor cleanup. The current input data row can also be logged to a
bad file. Recall that we only need to handle those errors in which the
simple_heap_insert is successful, but the index insertion or the after-row
insert trigger causes an error. The rest of the load can then go ahead with
the start of a new subtransaction.
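
To make the idea concrete, here is a rough sketch of the control flow I have
in mind. It is only illustrative pseudo-C: it uses the usual backend
primitives (PG_TRY/PG_CATCH, simple_heap_insert, simple_heap_delete,
BeginInternalSubTransaction, ReleaseCurrentSubTransaction), while
next_input_tuple(), insert_indexes_and_fire_triggers() and log_bad_row() are
placeholders for this sketch only. The real code would of course need more
care with error state, memory contexts and resource cleanup:

#include "postgres.h"

#include "access/heapam.h"      /* simple_heap_insert, simple_heap_delete */
#include "access/xact.h"        /* BeginInternalSubTransaction, ReleaseCurrentSubTransaction */
#include "utils/elog.h"         /* PG_TRY/PG_CATCH, FlushErrorState */

/* Placeholders for this sketch only -- not real backend functions */
extern HeapTuple next_input_tuple(void);
extern void insert_indexes_and_fire_triggers(Relation rel, HeapTuple tup);
extern void log_bad_row(HeapTuple tup);

static void
load_with_error_logging(Relation rel)
{
    HeapTuple   tuple;

    /* One subtransaction covers all rows until the first failure */
    BeginInternalSubTransaction(NULL);

    while ((tuple = next_input_tuple()) != NULL)
    {
        PG_TRY();
        {
            /* Errors raised before this call need no recovery at all */
            simple_heap_insert(rel, tuple);

            /* Unique/PK, FKey or after-row trigger errors can fire here */
            insert_indexes_and_fire_triggers(rel, tuple);
        }
        PG_CATCH();
        {
            /*
             * The offending tuple is already in the heap: delete it, log
             * the bad row, then COMMIT (not abort) the current
             * subtransaction and resume under a fresh one.
             */
            FlushErrorState();
            simple_heap_delete(rel, &tuple->t_self);
            log_bad_row(tuple);
            ReleaseCurrentSubTransaction();

            BeginInternalSubTransaction(NULL);
        }
        PG_END_TRY();
    }

    /* Commit the subtransaction covering the trailing good rows */
    ReleaseCurrentSubTransaction();
}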

Regards,
Nikhils
-- 
EnterpriseDB               http://www.enterprisedb.com
