Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling

2018-03-03 Thread Tom Lane
Peter Eisentraut  writes:
> On 3/3/18 00:48, Tom Lane wrote:
>> I don't think that can possibly work.  It would only be safe if, between
>> the thrower and the catcher, there were no other levels of control
>> operating according to the normal error-handling rules.  But input
>> functions certainly cannot assume that they are only called by COPY,
>> so how could they safely throw a "soft error"?

> That assumes that throwing a soft error in a context that does not
> handle it specially is not safe.  I'd imagine in such situations the
> soft error just behaves like a normal exception.

My point is that neither the thrower of the error, nor the catcher of it,
can possibly guarantee that there were no levels of control in between
that were expecting their resource leakages to be cleaned up by normal
error recovery.  As a counterexample consider the chain of control

COPY -> domain_in -> SQL function in CHECK -> int4in

If int4in throws a "soft" error, and COPY catches it and skips
error cleanup, we've likely got problems, depending on what the SQL
function did.  This could even break domain_in, although probably any
such effort would've taken care to harden domain_in against the issue.
But the possibility of SQL or PL functions having done nearly-arbitrary
things in between means that most of the backend is at hazard.
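
Tom's counterexample can be made concrete. The following is a minimal, hypothetical Python sketch (all names invented; it simulates resource tracking, not PostgreSQL internals) of the chain COPY -> domain_in -> SQL function -> int4in, where a "soft" error thrown by the innermost parser and caught at the top level leaks a resource acquired by the intermediate layer, which expected normal error recovery to clean up after it:

```python
# Hypothetical simulation of COPY -> domain_in -> SQL function -> int4in.
# A "soft" error caught without full cleanup skips the intermediate
# layer's release step, leaking its resource.

class SoftError(Exception):
    """A 'soft' error the top-level caller catches without cleanup."""

leaked_locks = []

def int4in(value):
    # Innermost parser: throws a "soft" error on malformed input.
    if not value.lstrip("-").isdigit():
        raise SoftError(f"invalid input syntax for integer: {value!r}")
    return int(value)

def sql_check_function(value):
    # Intermediate layer: acquires a resource and relies on the normal
    # error machinery to release it if anything below throws.
    leaked_locks.append("some_lock")
    result = int4in(value) > 0
    leaked_locks.pop()          # never runs if int4in throws
    return result

def copy_row(value):
    # Top level: catches the soft error and "skips the bad line"
    # without running any cleanup for the layers in between.
    try:
        return sql_check_function(value)
    except SoftError:
        return None

copy_row("42")
copy_row("not-a-number")        # soft error: the lock is leaked
print(leaked_locks)             # ['some_lock'] -- cleanup was skipped
```

The intermediate layer never gets a chance to release what it acquired, which is exactly why neither the thrower nor the catcher can make this safe on their own.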

You could imagine making something like this work, but it's gonna take
very invasive and bug-inducing changes to our error cleanup rules,
affecting a ton of places that have no connection to the goal.

regards, tom lane



Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling

2018-03-03 Thread Pavel Stehule
2018-03-03 15:02 GMT+01:00 Peter Eisentraut <peter.eisentr...@2ndquadrant.com>:

> On 3/3/18 00:48, Tom Lane wrote:
> > I don't think that can possibly work.  It would only be safe if, between
> > the thrower and the catcher, there were no other levels of control
> > operating according to the normal error-handling rules.  But input
> > functions certainly cannot assume that they are only called by COPY,
> > so how could they safely throw a "soft error"?
>
> That assumes that throwing a soft error in a context that does not
> handle it specially is not safe.  I'd imagine in such situations the
> soft error just behaves like a normal exception.
>

What does this soft error solve? If somebody uses fault-tolerant COPY, they
will not be satisfied when only format errors are ignored. Constraint
violations, and maybe trigger errors, should be ignored too.

It is true, though, that some other databases raise a soft error on format
issues instead of an exception. Isn't it better to use some alternative
algorithm, like:

under a subtransaction, try to insert a block of 1000 lines. When some line
fails, roll back and retry the import in blocks of 100 lines, and so on, to
find the wrong line?

Or, another way: for this specific case, reduce the cost of subtransactions?
This is a special case: a subtransaction under one command.
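
The block-retry idea above can be sketched as follows. This is a hypothetical Python simulation (all names invented; subtransaction rollback is stood in for by truncating a list), not backend code:

```python
# Sketch of the batch-bisection idea: insert in blocks of 1000 lines under a
# simulated subtransaction; if a block fails, roll it back and retry at a
# tenth of the block size (1000 -> 100 -> 10 -> 1) until the bad line is
# isolated and skipped.

def insert_block(table, lines):
    """Simulated subtransaction: all-or-nothing insert of a block."""
    snapshot = len(table)            # subtransaction start (savepoint)
    try:
        for line in lines:
            table.append(int(line))  # raises ValueError on a bad line
        return True
    except ValueError:
        del table[snapshot:]         # roll back to the savepoint
        return False

def copy_ignore_errors(table, lines, block_size=1000):
    bad = []
    for start in range(0, len(lines), block_size):
        block = lines[start:start + block_size]
        if insert_block(table, block):
            continue
        if len(block) == 1:
            bad.append(block[0])     # isolated the wrong line; skip it
        else:
            # retry the failed block at a tenth of the size
            bad += copy_ignore_errors(table, block, max(1, block_size // 10))
    return bad

table = []
bad = copy_ignore_errors(table, ["1", "2", "oops", "4"], block_size=1000)
print(table, bad)   # [1, 2, 4] ['oops']
```

For mostly-clean input, the common case stays at roughly one subtransaction per 1000 lines; only failed blocks pay the extra per-level retries.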

Regards

Pavel



>
> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling

2018-03-03 Thread Peter Eisentraut
On 3/3/18 00:48, Tom Lane wrote:
> I don't think that can possibly work.  It would only be safe if, between
> the thrower and the catcher, there were no other levels of control
> operating according to the normal error-handling rules.  But input
> functions certainly cannot assume that they are only called by COPY,
> so how could they safely throw a "soft error"?

That assumes that throwing a soft error in a context that does not
handle it specially is not safe.  I'd imagine in such situations the
soft error just behaves like a normal exception.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling

2018-03-02 Thread Tom Lane
Craig Ringer  writes:
> On 3 March 2018 at 13:08, Peter Eisentraut  wrote:
>> I think one thing to try would be to define a special kind of exception
>> that can safely be caught and ignored.  Then, input functions can
>> communicate benign parse errors by doing their own cleanup first, then
>> throwing this exception, and then the COPY subsystem can deal with it.

> That makes sense. We'd only use the error code range in question when it
> was safe to catch without re-throw, and we'd have to enforce rules around
> using a specific memory context.

I don't think that can possibly work.  It would only be safe if, between
the thrower and the catcher, there were no other levels of control
operating according to the normal error-handling rules.  But input
functions certainly cannot assume that they are only called by COPY,
so how could they safely throw a "soft error"?

regards, tom lane



Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling

2018-03-02 Thread Craig Ringer
On 3 March 2018 at 13:08, Peter Eisentraut  wrote:

> On 1/22/18 21:33, Craig Ringer wrote:
> > We don't have much in the way of rules about what input functions can or
> > cannot do, so you can't assume much about their behaviour and what must
> > / must not be cleaned up. Nor can you just reset the state in a heavy
> > handed manner like (say) plpgsql does.
>
> I think one thing to try would be to define a special kind of exception
> that can safely be caught and ignored.  Then, input functions can
> communicate benign parse errors by doing their own cleanup first, then
> throwing this exception, and then the COPY subsystem can deal with it.
>

That makes sense. We'd only use the error code range in question when it
was safe to catch without re-throw, and we'd have to enforce rules around
using a specific memory context. Of course, no LWLocks could be held, but
IIRC that's true when throwing anyway, unless you plan to proc_exit() in
your handler.

People will immediately ask for it to handle RI errors too, so something
similar would be needed there. But frankly, Pg's RI handling for bulk
loading desperately needs a major change in how it works to make it
efficient anyway; the current model of individual row triggers is horrible
for bulk-load performance.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling

2018-03-02 Thread Peter Eisentraut
On 1/22/18 21:33, Craig Ringer wrote:
> We don't have much in the way of rules about what input functions can or
> cannot do, so you can't assume much about their behaviour and what must
> / must not be cleaned up. Nor can you just reset the state in a heavy
> handed manner like (say) plpgsql does.

I think one thing to try would be to define a special kind of exception
that can safely be caught and ignored.  Then, input functions can
communicate benign parse errors by doing their own cleanup first, then
throwing this exception, and then the COPY subsystem can deal with it.

Anyway, this patch is clearly not doing this, so I'm setting it to
returned with feedback.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling

2018-01-22 Thread Craig Ringer
On 23 January 2018 at 15:21, Stephen Frost  wrote:

> Alexey,
>
> * Alexey Kondratov (kondratov.alek...@gmail.com) wrote:
> > On Fri, Dec 1, 2017 at 1:58 AM, Alvaro Herrera wrote:
> > > On a *very* quick look, please use an enum to return from NextCopyFrom
> > > rather than 'int'.  The chunks that change bool to int are very
> > > odd-looking.  This would move the comment that explains the value from
> > > copy.c to copy.h, obviously.  Also, you seem to be using non-ASCII
> dashes
> > > in the descriptions of those values; please don't.
> >
> > I will fix it, thank you.
> >
> > > Or maybe I misunderstood the patch completely.
> >
> > I hope so. Here are my thoughts on how it all works; please correct me
> > where I am wrong:
> >
> > 1) First, I have simply changed the ereport level to WARNING for specific
> > validations (extra or missing columns, etc.) if the IGNORE_ERRORS option
> > is used. All these checks are inside NextCopyFrom. Thus, the patch
> > performs here pretty much the same as before, except that it is possible
> > to skip bad lines, and this part should be safe as well.
> >
> > 2) About PG_TRY/CATCH: I use it to catch only one specific function call
> > inside NextCopyFrom, InputFunctionCall, which is used just to parse the
> > datatype from the input string. I have no idea how WAL write or trigger
> > errors could get here.
> >
> > All of this is done before actually forming a tuple, putting it into
> > the heap, firing insert-related triggers, etc. I am not trying to catch
> > all errors during row processing, only input data errors. So why is it
> > unsafe?
>
> The issue is that InputFunctionCall is actually calling out to other
> functions and they could be allocating memory and doing other things.
> What you're missing by using PG_TRY() here without doing a PG_RE_THROW()
> is all the cleanup work that AbortTransaction() and friends do: ensuring
> there's a valid memory context (cleaning up the ones that might have
> been allocated by the input function), releasing locks, resetting user
> and security contexts, and a whole slew of other things.
>
> Is all of that always needed for an error thrown by an input function?
> Hard to say, really, since input functions can be provided by
> extensions.  You might get away with doing this for int4in(), but we
> also have to be thinking about extensions like PostGIS which introduce
> their own geometry and geography data types that are a great deal more
> complicated.
>
>
For example, an input function could conceivably take a lwlock, do some
work in the heapam, or whatever.

We don't have much in the way of rules about what input functions can or
cannot do, so you can't assume much about their behaviour and what must /
must not be cleaned up. Nor can you just reset the state in a heavy handed
manner like (say) plpgsql does.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling

2018-01-22 Thread Stephen Frost
Alexey,

* Alexey Kondratov (kondratov.alek...@gmail.com) wrote:
> On Fri, Dec 1, 2017 at 1:58 AM, Alvaro Herrera wrote:
> > On a *very* quick look, please use an enum to return from NextCopyFrom
> > rather than 'int'.  The chunks that change bool to int are very
> > odd-looking.  This would move the comment that explains the value from
> > copy.c to copy.h, obviously.  Also, you seem to be using non-ASCII dashes
> > in the descriptions of those values; please don't.
> 
> I will fix it, thank you.
> 
> > Or maybe I misunderstood the patch completely.
> 
> I hope so. Here are my thoughts on how it all works; please correct me
> where I am wrong:
> 
> 1) First, I have simply changed the ereport level to WARNING for specific
> validations (extra or missing columns, etc.) if the IGNORE_ERRORS option is
> used. All these checks are inside NextCopyFrom. Thus, the patch performs
> here pretty much the same as before, except that it is possible to skip
> bad lines, and this part should be safe as well.
> 
> 2) About PG_TRY/CATCH: I use it to catch only one specific function call
> inside NextCopyFrom, InputFunctionCall, which is used just to parse the
> datatype from the input string. I have no idea how WAL write or trigger
> errors could get here.
> 
> All of this is done before actually forming a tuple, putting it into
> the heap, firing insert-related triggers, etc. I am not trying to catch
> all errors during row processing, only input data errors. So why is it
> unsafe?

The issue is that InputFunctionCall is actually calling out to other
functions and they could be allocating memory and doing other things.
What you're missing by using PG_TRY() here without doing a PG_RE_THROW()
is all the cleanup work that AbortTransaction() and friends do: ensuring
there's a valid memory context (cleaning up the ones that might have
been allocated by the input function), releasing locks, resetting user
and security contexts, and a whole slew of other things.
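
To make the point concrete, here is a hypothetical Python sketch (invented names; `cleanup_stack` stands in for memory contexts, locks, and resource owners, and `abort_transaction` for AbortTransaction() and friends) contrasting a bare catch with one that runs the abort-style cleanups first:

```python
# The error machinery runs registered cleanups on abort. A bare
# "catch and move on" (PG_TRY without PG_RE_THROW) silently skips them.

cleanup_stack = []   # stands in for contexts, locks, resource owners

def abort_transaction():
    """Run every registered cleanup, innermost first, abort-style."""
    while cleanup_stack:
        cleanup_stack.pop()()

def input_function(value, released):
    # An input function (possibly from an extension) may allocate
    # resources, registering cleanups it expects error recovery to run.
    cleanup_stack.append(lambda: released.append("lock"))
    result = int(value)      # raises ValueError on malformed input
    cleanup_stack.pop()      # success path: deregister its own cleanup
    return result

def unsafe_skip(value, released):
    try:
        return input_function(value, released)
    except ValueError:
        return None          # cleanup never runs: the resource is leaked

def safe_skip(value, released):
    try:
        return input_function(value, released)
    except ValueError:
        abort_transaction()  # release everything, as a re-throw would
        return None

leaks = []
unsafe_skip("x", leaks)
print(len(cleanup_stack), leaks)   # 1 [] -- leaked, never released
```

Whether skipping `abort_transaction()` is survivable depends entirely on what the input function registered, which the COPY-level catcher cannot know.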

Is all of that always needed for an error thrown by an input function?
Hard to say, really, since input functions can be provided by
extensions.  You might get away with doing this for int4in(), but we
also have to be thinking about extensions like PostGIS which introduce
their own geometry and geography data types that are a great deal more
complicated.

In the end, this isn't an acceptable approach to this problem.  The
approach outlined previously using sub-transactions, which attempted to
insert some number of records and then, on failure, restarted, inserted up
until the failed record, skipped it, and started over again with the next
batch, seemed pretty reasonable to me.  If you have time to work on that,
that'd be great; otherwise we'll probably need to mark this as returned
with feedback.
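
The subtransaction approach described above might be sketched like this, in hypothetical Python (invented names; list truncation stands in for subtransaction rollback):

```python
# Sketch of the restart-and-skip approach: try a whole batch in a simulated
# subtransaction; on failure at row i, roll back, re-insert rows 0..i-1
# (known good), skip row i, and carry on with the remainder.

def try_insert(table, lines):
    """Simulated subtransaction; returns index of first bad row, or None."""
    snapshot = len(table)
    for i, line in enumerate(lines):
        try:
            table.append(int(line))
        except ValueError:
            del table[snapshot:]     # roll the subtransaction back
            return i
    return None

def copy_skip_errors(table, lines):
    skipped = []
    while lines:
        failed_at = try_insert(table, lines)
        if failed_at is None:
            break
        # Restart: rows before the failure are known good; re-insert them,
        # then drop the failed row and continue with the remainder.
        assert try_insert(table, lines[:failed_at]) is None
        skipped.append(lines[failed_at])
        lines = lines[failed_at + 1:]
    return skipped

table = []
skipped = copy_skip_errors(table, ["1", "x", "3", "y", "5"])
print(table, skipped)   # [1, 3, 5] ['x', 'y']
```

Clean input costs one subtransaction per batch; each bad row costs one extra rollback-and-replay, which is the trade-off this approach accepts in exchange for never catching errors without full cleanup.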

Thanks!

Stephen




Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling

2017-12-01 Thread Alexey Kondratov
On Fri, Dec 1, 2017 at 1:58 AM, Alvaro Herrera  wrote:
> On a *very* quick look, please use an enum to return from NextCopyFrom
> rather than 'int'.  The chunks that change bool to int are very
> odd-looking.  This would move the comment that explains the value from
> copy.c to copy.h, obviously.  Also, you seem to be using non-ASCII dashes
> in the descriptions of those values; please don't.

I will fix it, thank you.

>
> Or maybe I misunderstood the patch completely.
>

I hope so. Here are my thoughts on how it all works; please correct me
where I am wrong:

1) First, I have simply changed the ereport level to WARNING for specific
validations (extra or missing columns, etc.) if the IGNORE_ERRORS option is
used. All these checks are inside NextCopyFrom. Thus, the patch performs
here pretty much the same as before, except that it is possible to skip
bad lines, and this part should be safe as well.

2) About PG_TRY/CATCH: I use it to catch only one specific function call
inside NextCopyFrom, InputFunctionCall, which is used just to parse the
datatype from the input string. I have no idea how WAL write or trigger
errors could get here.

All of this is done before actually forming a tuple, putting it into the
heap, firing insert-related triggers, etc. I am not trying to catch all
errors during row processing, only input data errors. So why is it unsafe?


Best,

Alexey



Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling

2017-11-30 Thread Alvaro Herrera
Alexey Kondratov wrote:

> I have rebased my branch and squashed all commits into one, since the
> patch is quite small. Everything seems to be working fine now, patch
> passes all regression tests.

On a *very* quick look, please use an enum to return from NextCopyFrom
rather than 'int'.  The chunks that change bool to int are very
odd-looking.  This would move the comment that explains the value from
copy.c to copy.h, obviously.  Also, you seem to be using non-ASCII dashes
in the descriptions of those values; please don't.

But those are really just minor nitpicks while paging through.  After
looking at the way you're handling errors, it seems that you've chosen
to go precisely the way people were saying that was not admissible: use
a PG_TRY block, and when things go south, simply log a WARNING and move
on.  This is unsafe.  For example: what happens if the error being
raised is out-of-memory?  Failing to re-throw means you don't release
current transaction memory context.  What if the error is that WAL
cannot be written because the WAL disk ran out of space?  This would
just be reported as a warning over and over until the server log disk is
also full.  What if your insertion fired a trigger that caused an update
which got a serializability exception?  You cannot just move on, because
at that point session state (serializability session state) might be
busted until you abort the current transaction.

Or maybe I misunderstood the patch completely.

-- 
Álvaro Herrera        https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling

2017-11-29 Thread Michael Paquier
On Wed, Jun 21, 2017 at 11:37 PM, Alex K  wrote:
>> On 16 Jun 2017, at 21:30, Alexey Kondratov wrote:
>
>> > On 13 Jun 2017, at 01:44, Peter Geoghegan  wrote:
>
>
>> > Speculative insertion has the following special entry points to
>> > heapam.c and execIndexing.c, currently only called within
>> > nodeModifyTable.c
>
>> > Offhand, it doesn't seem like it would be that hard to teach another
>> > heap_insert() caller the same tricks.
>
>
>> I went through the nodeModifyTable.c code and it seems not to be so
>> difficult to do the same inside COPY.
>
> After a closer look, I have figured out at least one difficulty: COPY
> and INSERT follow different execution paths. INSERT goes through the
> planner, while COPY does not. This leads to the absence of some required
> attributes, like arbiterIndexes, which are available during INSERT via
> PlanState/ModifyTableState. It is probably possible to get the same in
> COPY, but it is not clear to me how.
>
> Anyway, adding 'speculative insertion' to COPY is worth a separate
> patch, and I would be glad to try implementing it.
>
> At the same time, I have prepared a complete working patch with:
>
> - ignoring of the input data formatting errors
> - IGNORE_ERRORS parameter in the COPY options
> - updated regression tests
>
> Please, find the patch attached or check the web UI diff on GitHub as always:
> https://github.com/ololobus/postgres/pull/1/files

"git diff master --check" complains heavily, and the patch does not
apply anymore. The last patch is 5-month old as well, so I am marking
the patch as returned with feedback.
-- 
Michael