Re: [HACKERS] DO with a large number of statements gets stuck with high memory consumption

2016-07-18 Thread Jan Wieck
On Mon, Jul 18, 2016 at 10:05 AM, Tom Lane wrote:

> Jan Wieck writes:
> > In the meantime, would it be appropriate to backpatch the double linking
> > of memory context children at this time? I believe it has had plenty of
> > testing in the 9.6 cycle to be sure it didn't break anything.
>
> I'm concerned about the ABI breakage risk from changing a data structure
> as fundamental as MemoryContext.  Yeah, code outside utils/mmgr probably
> *shouldn't* be looking at that struct, but that doesn't mean it isn't.
> In the past we've generally only taken that sort of risk when there was
> no other way to fix a bug; and this patch isn't a bug fix.  While this
> does help performance in some corner cases, I don't think it's enough of
> an across-the-board win to justify the risk of back-patching.
>

I would consider mucking with the linked lists of memory context children
inside of 3rd party code a really bad idea, but I concede. It isn't a bug
fix and there is that small risk that somebody did precisely that, so no
backpatch.


Regards, Jan

-- 
Jan Wieck
Senior Postgres Architect
http://pgblog.wi3ck.info


Re: [HACKERS] DO with a large number of statements gets stuck with high memory consumption

2016-07-18 Thread Tom Lane
Merlin Moncure writes:
> Hm, maybe, instead of trying to figure out if we're in a loop, set a
> 'called' flag with each statement and only cache when it is touched the
> second time.  (If that's easier, dunno).

Well, then you just guarantee to lose once.  I think Jan's sketch of
marking potentially-cacheable expressions at compile time sounds
promising.  I'm envisioning a counter that starts at 1 normally or 0 in a
DO block, increment at beginning of parsing a loop construct, decrement at
end; then mark expressions/statements as cacheable if counter>0.
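
In rough C, that scheme might look about like this (all type and function
names here are invented for illustration; this is a sketch of the idea,
not the actual sources):

#include <stdbool.h>

/* toy stand-in for a compiled PL/pgSQL statement */
typedef struct Stmt
{
    const char *query;
    bool        cacheable;      /* decided once, at compile time */
} Stmt;

/* starts at 1 for a regular function, 0 at the top of a DO block */
static int loop_depth;

static void
parse_loop_start(void)
{
    loop_depth++;               /* entering a LOOP/FOR/WHILE body */
}

static void
parse_loop_end(void)
{
    loop_depth--;               /* and leaving it again */
}

static void
mark_statement(Stmt *stmt)
{
    /* anything inside a loop may run repeatedly, so a cached
     * plan has a chance to pay for itself */
    stmt->cacheable = (loop_depth > 0);
}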

Of course the next question is what exactly to do differently for a
noncacheable expression.  My recollection is that that's all tied
pretty tightly to the plancache these days, so it might take a little
work.

regards, tom lane




Re: [HACKERS] DO with a large number of statements gets stuck with high memory consumption

2016-07-18 Thread Tom Lane
Jan Wieck writes:
> In the meantime, would it be appropriate to backpatch the double linking
> of memory context children at this time? I believe it has had plenty of
> testing in the 9.6 cycle to be sure it didn't break anything.

I'm concerned about the ABI breakage risk from changing a data structure
as fundamental as MemoryContext.  Yeah, code outside utils/mmgr probably
*shouldn't* be looking at that struct, but that doesn't mean it isn't.
In the past we've generally only taken that sort of risk when there was
no other way to fix a bug; and this patch isn't a bug fix.  While this
does help performance in some corner cases, I don't think it's enough of
an across-the-board win to justify the risk of back-patching.

regards, tom lane




Re: [HACKERS] DO with a large number of statements gets stuck with high memory consumption

2016-07-18 Thread Merlin Moncure
On Mon, Jul 18, 2016 at 8:59 AM, Jan Wieck wrote:
>
>
> On Mon, Jul 18, 2016 at 9:43 AM, Tom Lane wrote:
>>
>> Merlin Moncure writes:
>> > BTW, while the fix does address the cleanup performance issue, it's
>> > still the case that anonymous code blocks burn up lots of resident
>> > memory (my 315k example I tested with ate around 8gb IIRC) when run
>> > like this.  My question is, if the pl/pgsql code block is anonymous
>> > and not in some kind of a loop, why bother caching the plan at all?
>>
>> Nobody got around to it.  Also, as you note, it's not as simple as
>> "don't cache if in a DO block".  You'd need to track whether you were
>> inside any sort of looping construct.  Depending on how difficult
>> that turned out to be, it might add overhead to regular functions
>> that we don't want.
>
> Agreed. And from the structures themselves it is not really easy to detect
> whether we are inside of a loop: the toplevel, WHILE, FOR and IF all use
> the same statement block and call exec_stmts(), which in turn calls
> exec_stmt() for each element in that list. It is not impossible to add a
> flag, set at PL compile time, to each element and check it every time the
> statement is executed. But such a change definitely needs more testing and
> probably won't qualify for backpatching.

Right. Note, not arguing for backpatch here, just some open
speculation and some evidence that we still have a problem (although
not nearly as nasty a one -- the pre-patch behavior of not responding
to cancel was very dangerous, and that part is solved).

Hm, maybe, instead of trying to figure out if we're in a loop, set a
'called' flag with each statement and only cache when it is touched the
second time.  (If that's easier, dunno).

merlin




Re: [HACKERS] DO with a large number of statements gets stuck with high memory consumption

2016-07-18 Thread Jan Wieck
On Mon, Jul 18, 2016 at 9:43 AM, Tom Lane wrote:

> Merlin Moncure writes:
> > BTW, while the fix does address the cleanup performance issue, it's
> > still the case that anonymous code blocks burn up lots of resident
> > memory (my 315k example I tested with ate around 8gb IIRC) when run
> > like this.  My question is, if the pl/pgsql code block is anonymous
> > and not in some kind of a loop, why bother caching the plan at all?
>
> Nobody got around to it.  Also, as you note, it's not as simple as
> "don't cache if in a DO block".  You'd need to track whether you were
> inside any sort of looping construct.  Depending on how difficult
> that turned out to be, it might add overhead to regular functions
> that we don't want.


Agreed. And from the structures themselves it is not really easy to detect
whether we are inside of a loop: the toplevel, WHILE, FOR and IF all use the
same statement block and call exec_stmts(), which in turn calls exec_stmt()
for each element in that list. It is not impossible to add a flag, set at PL
compile time, to each element and check it every time the statement is
executed. But such a change definitely needs more testing and probably won't
qualify for backpatching.
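
As a sketch, with invented stand-ins for the SPI machinery (think
SPI_prepare, SPI_execute_plan and SPI_freeplan; none of this is the real
plpgsql code), the execution-time check could look like:

#include <stdbool.h>
#include <stddef.h>

typedef struct Plan Plan;       /* opaque; stands in for an SPI plan */

extern Plan *plan_prepare(const char *query);
extern void  plan_execute(Plan *plan);
extern void  plan_free(Plan *plan);

typedef struct Stmt
{
    const char *query;
    bool        cacheable;      /* set once by the PL compiler */
    Plan       *plan;           /* cached plan, or NULL */
} Stmt;

static void
exec_stmt(Stmt *stmt)
{
    if (stmt->cacheable)
    {
        /* prepare on first use, keep the plan for later iterations */
        if (stmt->plan == NULL)
            stmt->plan = plan_prepare(stmt->query);
        plan_execute(stmt->plan);
    }
    else
    {
        /* one-shot statement: plan, run, free immediately, so memory
         * does not accumulate across thousands of statements */
        Plan *plan = plan_prepare(stmt->query);
        plan_execute(plan);
        plan_free(plan);
    }
}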

In the meantime, would it be appropriate to backpatch the double linking
of memory context children at this time? I believe it has had plenty of
testing in the 9.6 cycle to be sure it didn't break anything.


Regards, Jan

-- 
Jan Wieck
Senior Postgres Architect
http://pgblog.wi3ck.info


Re: [HACKERS] DO with a large number of statements gets stuck with high memory consumption

2016-07-18 Thread Tom Lane
Merlin Moncure writes:
> BTW, while the fix does address the cleanup performance issue, it's
> still the case that anonymous code blocks burn up lots of resident
> memory (my 315k example I tested with ate around 8gb IIRC) when run
> like this.  My question is, if the pl/pgsql code block is anonymous
> and not in some kind of a loop, why bother caching the plan at all?

Nobody got around to it.  Also, as you note, it's not as simple as
"don't cache if in a DO block".  You'd need to track whether you were
inside any sort of looping construct.  Depending on how difficult
that turned out to be, it might add overhead to regular functions
that we don't want.

regards, tom lane




Re: [HACKERS] DO with a large number of statements gets stuck with high memory consumption

2016-07-18 Thread Merlin Moncure
On Sat, Jul 16, 2016 at 2:47 PM, Jan Wieck wrote:
> On Tue, Jul 12, 2016 at 3:29 PM, Merlin Moncure wrote:
>>
>> I've noticed that pl/pgsql functions/do commands do not behave well
>> when the statement resolves and frees memory.  To be clear:
>>
>> FOR i in 1..100
>> LOOP
>>   INSERT INTO foo VALUES (i);
>> END LOOP;
>>
>> ...runs just fine while
>>
>> BEGIN
>>   INSERT INTO foo VALUES (1);
>>   INSERT INTO foo VALUES (2);
>>   ...
>>   INSERT INTO foo VALUES (100);
>> END;
>
>
> This sounds very much like what led to commit
> 25c539233044c235e97fd7c9dc600fb5f08fe065.
>
> It seems that patch was only applied to master and never backpatched to 9.5
> or earlier.

You're right; thanks (my bad for missing that).  For those following
along, the case that turned this up was:
DO $$
<create temp table>
<insert>
<insert>
...;
$$;

Where the insertion step was a large number of standalone insert statements.

(Temp table creation isn't necessary to turn up this bug, but it's a
common pattern when sending batch updates to a server.)

The workaround I recommend would be to do this:

do $d$
begin
  <create temp table>
  create function doit() returns void as
  $$
    <statements>
  $$ language sql;
  perform doit();
end;
$d$;

BTW, while the fix does address the cleanup performance issue, it's
still the case that anonymous code blocks burn up lots of resident
memory (my 315k example I tested with ate around 8gb IIRC) when run
like this.  My question is, if the pl/pgsql code block is anonymous
and not in some kind of a loop, why bother caching the plan at all?

merlin




Re: [HACKERS] DO with a large number of statements gets stuck with high memory consumption

2016-07-17 Thread Jan Wieck
BTW, here is the email thread about the patch double-linking MemoryContext
children, which Kevin eventually committed to master.

https://www.postgresql.org/message-id/55F2D834.8040106%40wi3ck.info
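
The gist of it, as a toy sketch (field names are illustrative, not the
actual MemoryContextData layout): with only a forward sibling link,
unlinking one child means walking the parent's child list to find its
predecessor, so tearing down n children one at a time is O(n^2).  The
added back-link makes every unlink O(1):

#include <stddef.h>

typedef struct Ctx
{
    struct Ctx *parent;
    struct Ctx *firstchild;
    struct Ctx *prevchild;      /* the back-link the patch adds */
    struct Ctx *nextchild;
} Ctx;

static void
unlink_child(Ctx *node)
{
    /* O(1): no search of the sibling list required */
    if (node->prevchild != NULL)
        node->prevchild->nextchild = node->nextchild;
    else
        node->parent->firstchild = node->nextchild;

    if (node->nextchild != NULL)
        node->nextchild->prevchild = node->prevchild;
}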


Regards, Jan


On Sat, Jul 16, 2016 at 3:47 PM, Jan Wieck wrote:

>
>
> On Tue, Jul 12, 2016 at 3:29 PM, Merlin Moncure wrote:
>
>> I've noticed that pl/pgsql functions/do commands do not behave well
>> when the statement resolves and frees memory.  To be clear:
>>
>> FOR i in 1..100
>> LOOP
>>   INSERT INTO foo VALUES (i);
>> END LOOP;
>>
>> ...runs just fine while
>>
>> BEGIN
>>   INSERT INTO foo VALUES (1);
>>   INSERT INTO foo VALUES (2);
>>   ...
>>   INSERT INTO foo VALUES (100);
>> END;
>>
>
> This sounds very much like what led to
> commit 25c539233044c235e97fd7c9dc600fb5f08fe065.
>
> It seems that patch was only applied to master and never backpatched to
> 9.5 or earlier.
>
>
> Regards, Jan
>
>
>
>
>>
>> (for the curious, create a script yourself via
>> copy (
>>   select
>> 'do $$begin create temp table foo(i int);'
>>   union all select
>> format('insert into foo values (%s);', i) from
>> generate_series(1,100) i
>>   union all select 'raise notice ''abandon all hope!''; end; $$;'
>> ) to '/tmp/breakit.sql';
>>
>> ...will consume amounts of resident memory proportional to the number
>> of statements and eventually crash the server.  The problem is obvious;
>> each statement causes a plan to get created and the server gets stuck
>> in a loop where SPI_freeplan() is called repeatedly.  Everything is
>> working as designed I guess, but when this happens it's really
>> unpleasant: the query is uncancellable and unterminatable, nicht gut.
>> A pg_ctl kill ABRT <pid> will do the trick but I was quite astonished
>> to see Linux take a few minutes to clean up the mess (!) on a somewhat
>> pokey virtualized server with lots of memory.  With even as little as
>> ten thousand statements the cleanup time far exceeds the runtime of the
>> statement block.
>>
>> I guess the key takeaway here is, "don't do that"; pl/pgsql
>> aggressively generates plans and turns out to be a poor choice for
>> bulk loading because of all the plan caching.  Having said that, I
>> can't help but wonder if there should be a (perhaps user configurable)
>> limit to the number of SPI plans a single function call should be able
>> to acquire, on the basis that you are going to smack into very poor
>> behaviors in the memory subsystem.
>>
>> Stepping back, I can't help but wonder what the value of all the plan
>> caching is for statement blocks at all.  Loops might comprise a notable
>> exception, noted.  I'd humbly submit though that (relative to functions)
>> it's much more likely that you want to do something like insert a lot of
>> statements, where it is impossible to utilize any cached plans.
>>
>> This is not an academic gripe -- I just exploded production :-D.
>>
>> merlin



-- 
Jan Wieck
Senior Postgres Architect
http://pgblog.wi3ck.info


Re: [HACKERS] DO with a large number of statements gets stuck with high memory consumption

2016-07-16 Thread Jan Wieck
On Tue, Jul 12, 2016 at 3:29 PM, Merlin Moncure wrote:

> I've noticed that pl/pgsql functions/do commands do not behave well
> when the statement resolves and frees memory.  To be clear:
>
> FOR i in 1..100
> LOOP
>   INSERT INTO foo VALUES (i);
> END LOOP;
>
> ...runs just fine while
>
> BEGIN
>   INSERT INTO foo VALUES (1);
>   INSERT INTO foo VALUES (2);
>   ...
>   INSERT INTO foo VALUES (100);
> END;
>

This sounds very much like what led to
commit 25c539233044c235e97fd7c9dc600fb5f08fe065.

It seems that patch was only applied to master and never backpatched to 9.5
or earlier.


Regards, Jan




>
> (for the curious, create a script yourself via
> copy (
>   select
> 'do $$begin create temp table foo(i int);'
>   union all select
> format('insert into foo values (%s);', i) from
> generate_series(1,100) i
>   union all select 'raise notice ''abandon all hope!''; end; $$;'
> ) to '/tmp/breakit.sql';
>
> ...will consume amounts of resident memory proportional to the number
> of statements and eventually crash the server.  The problem is obvious;
> each statement causes a plan to get created and the server gets stuck
> in a loop where SPI_freeplan() is called repeatedly.  Everything is
> working as designed I guess, but when this happens it's really
> unpleasant: the query is uncancellable and unterminatable, nicht gut.
> A pg_ctl kill ABRT <pid> will do the trick but I was quite astonished
> to see Linux take a few minutes to clean up the mess (!) on a somewhat
> pokey virtualized server with lots of memory.  With even as little as
> ten thousand statements the cleanup time far exceeds the runtime of the
> statement block.
>
> I guess the key takeaway here is, "don't do that"; pl/pgsql
> aggressively generates plans and turns out to be a poor choice for
> bulk loading because of all the plan caching.  Having said that, I
> can't help but wonder if there should be a (perhaps user configurable)
> limit to the number of SPI plans a single function call should be able
> to acquire, on the basis that you are going to smack into very poor
> behaviors in the memory subsystem.
>
> Stepping back, I can't help but wonder what the value of all the plan
> caching is for statement blocks at all.  Loops might comprise a notable
> exception, noted.  I'd humbly submit though that (relative to functions)
> it's much more likely that you want to do something like insert a lot of
> statements, where it is impossible to utilize any cached plans.
>
> This is not an academic gripe -- I just exploded production :-D.
>
> merlin



-- 
Jan Wieck
Senior Postgres Architect


[HACKERS] DO with a large number of statements gets stuck with high memory consumption

2016-07-12 Thread Merlin Moncure
I've noticed that pl/pgsql functions/do commands do not behave well
when the statement resolves and frees memory.  To be clear:

FOR i in 1..100
LOOP
  INSERT INTO foo VALUES (i);
END LOOP;

...runs just fine while

BEGIN
  INSERT INTO foo VALUES (1);
  INSERT INTO foo VALUES (2);
  ...
  INSERT INTO foo VALUES (100);
END;

(for the curious, create a script yourself via
copy (
  select
'do $$begin create temp table foo(i int);'
  union all select
format('insert into foo values (%s);', i) from generate_series(1,100) i
  union all select 'raise notice ''abandon all hope!''; end; $$;'
) to '/tmp/breakit.sql';

...will consume amounts of resident memory proportional to the number
of statements and eventually crash the server.  The problem is obvious;
each statement causes a plan to get created and the server gets stuck
in a loop where SPI_freeplan() is called repeatedly.  Everything is
working as designed I guess, but when this happens it's really
unpleasant: the query is uncancellable and unterminatable, nicht gut.
A pg_ctl kill ABRT <pid> will do the trick but I was quite astonished
to see Linux take a few minutes to clean up the mess (!) on a somewhat
pokey virtualized server with lots of memory.  With even as little as
ten thousand statements the cleanup time far exceeds the runtime of the
statement block.

I guess the key takeaway here is, "don't do that"; pl/pgsql
aggressively generates plans and turns out to be a poor choice for
bulk loading because of all the plan caching.  Having said that, I
can't help but wonder if there should be a (perhaps user configurable)
limit to the number of SPI plans a single function call should be able
to acquire, on the basis that you are going to smack into very poor
behaviors in the memory subsystem.
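
Conceptually something like this (entirely hypothetical -- no such knob
exists today):

#include <stdbool.h>

/* imagined GUC: cap on cached SPI plans per function invocation */
static int max_cached_plans = 1000;

static int ncached_plans;       /* plans cached so far in this call */

static bool
may_cache_another_plan(void)
{
    /* past the cap, fall back to one-shot planning rather than
     * letting resident memory grow without bound */
    return ncached_plans < max_cached_plans;
}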

Stepping back, I can't help but wonder what the value of all the plan
caching is for statement blocks at all.  Loops might comprise a notable
exception, noted.  I'd humbly submit though that (relative to functions)
it's much more likely that you want to do something like insert a lot of
statements, where it is impossible to utilize any cached plans.

This is not an academic gripe -- I just exploded production :-D.

merlin

