Re: [GENERAL] max_stack_depth problem though query is substantially smaller

2016-04-10 Thread Bannert Matthias
Eureka! Thanks so much for your help, patience and the right hunch. Actually I am 
glad now that I ran into that stack issue (and you), because the entire thing is 
also much faster now. 
I changed my app to emit strings like you suggested and it works, also with a 
smaller max_stack_depth. 
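
For the record, the statement the app emits now looks roughly like this (key and 
value taken from the backtrace below and heavily abbreviated; the real literal 
carries all 40K pairs):

    INSERT INTO ts_updates(ts_key, ts_data, ts_frequency)
    VALUES ('some_id.sector_all.news_all_d',
            '"1900-01-01"=>"-0.395131869823009", "1900-01-02"=>"-0.395131869823009"'::hstore,
            NULL);

So the parser now sees a single hstore literal instead of 40K nested || operators.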

Fwiw, I was not stubbornly insisting on nesting operators. Actually I switched 
from "=>" to the hstore function because a note in the manual said the operator 
was deprecated (http://www.postgresql.org/docs/9.0/static/hstore.html). Somehow I 
must have understood that note the wrong way. 
What's your take on that operator being deprecated? 

regards, matt bannert




From: Tom Lane [t...@sss.pgh.pa.us]
Sent: Saturday, April 09, 2016 5:25 PM
To: Bannert  Matthias
Cc: Charles Clavadetscher; pgsql-general@postgresql.org
Subject: Re: [GENERAL] max_stack_depth problem though query is substantially 
smaller

"Bannert  Matthias" <bann...@kof.ethz.ch> writes:
> [ very deep stack of parser transformExprRecurse calls ]

> #20137 0x7fe7fb80ab8c in pg_analyze_and_rewrite 
> (parsetree=parsetree@entry=0x7fe7fffdb2a0, 
> query_string=query_string@entry=0x7fe7fdf606b0 "INSERT INTO 
> ts_updates(ts_key, ts_data, ts_frequency) VALUES 
> ('some_id.sector_all.news_all_d',hstore('1900-01-01','-0.395131869823009')||hstore('1900-01-02','-0.395131869823009')||hstore('1"...,
>  paramTypes=paramTypes@entry=0x0, numParams=numParams@entry=0) at 
> /build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/tcop/postgres.c:640

The SQL fragment we can see here suggests that your "40K entry hstore" is
getting built up by stringing together 40K hstore concatenation operators.
Don't do that.  Even without the parser stack depth issue, it's uselessly
inefficient.  I presume you're generating this statement mechanically,
not by hand, so you could equally well have the app emit

'1900-01-01 => -0.395131869823009, 1900-01-02 => -0.395131869823009, 
...'::hstore

which would look like a single hstore literal to the parser, and be
processed much more quickly.

If you insist on emitting SQL statements that have operators nested
to such depths, then yes you'll need to increase max_stack_depth to
whatever it takes to allow it.

regards, tom lane




Re: [GENERAL] max_stack_depth problem though query is substantially smaller

2016-04-10 Thread Bannert Matthias
ostmasterMain (argc=5, argv=) at 
/build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/postmaster/postmaster.c:1315
#20144 0x7fe7fb5ff0a3 in main (argc=5, argv=0x7fe7fdd25190) at 
/build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/main/main.c:227
Detaching from program: /usr/lib/postgresql/9.3/bin/postgres, process 1521


Is this of any help at all? I really wonder how you can get almost 70 MB done and I 
can't even get 2-5 MB done. Maybe you can share a brief part of your 70 MB query?

regards, matt bannert




From: Tom Lane [t...@sss.pgh.pa.us]
Sent: Friday, April 08, 2016 9:39 PM
To: Bannert  Matthias
Cc: Charles Clavadetscher; pgsql-general@postgresql.org
Subject: Re: [GENERAL] max_stack_depth problem though query is substantially 
smaller

"Bannert  Matthias" <bann...@kof.ethz.ch> writes:
> Thanks for your reply. I do think it is a Postgres rather than an R issue, 
> here's why:
> a) R simply puts an SQL string together. What Charles had posted was an 
> excerpt of that string.
> Basically we have 1.7 MB of that string. Everything else is the same; just the 
> hstore contains 40K key/value pairs.

Well, as a test I ran a query that included an hstore literal with 4
million key/value pairs (a bit shy of 70MB of query text).  I didn't see
any misbehavior on a machine with 2MB max_stack_depth.  So there's
something else going on in your situation.

I concur with the suggestion to try to get a stack backtrace from the
point of the error.  Setting a breakpoint at errfinish() is usually
an effective strategy when you know that the query will provoke a SQL
error report.

https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend

regards, tom lane




Re: [GENERAL] max_stack_depth problem though query is substantially smaller

2016-04-10 Thread Bannert Matthias
I guess you are right. I have narrowed the query down to a simple CREATE TABLE 
followed by an INSERT: one text field, one hstore field and one integer field. 
No temporary table, no BEGIN etc. One record, yet the hstore has 40K key/value 
pairs, no R involved, and I still end up with the same error. 
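
Schematically the narrowed-down script is just the following (table and column 
names as in the earlier mails, the hstore() chain heavily abbreviated):

    CREATE TABLE ts_updates(ts_key varchar, ts_data hstore, ts_frequency integer);

    INSERT INTO ts_updates(ts_key, ts_data, ts_frequency)
    VALUES ('some_id.sector_all.news_all_d',
            hstore('1900-01-01','-0.395131869823009') ||
            hstore('1900-01-02','-0.395131869823009') ||
            -- ... roughly 40'000 more concatenated hstore() calls ...
            hstore('1900-01-03','-0.395131869823009'),
            NULL);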

Thanks for the pointer on generating a stack trace of the backend. I'll try to set 
that up and report what I find. 





From: Tom Lane [t...@sss.pgh.pa.us]
Sent: Friday, April 08, 2016 9:39 PM
To: Bannert  Matthias
Cc: Charles Clavadetscher; pgsql-general@postgresql.org
Subject: Re: [GENERAL] max_stack_depth problem though query is substantially 
smaller

"Bannert  Matthias" <bann...@kof.ethz.ch> writes:
> Thanks for your reply. I do think it is a Postgres rather than an R issue, 
> here's why:
> a) R simply puts an SQL string together. What Charles had posted was an 
> excerpt of that string.
> Basically we have 1.7 MB of that string. Everything else is the same; just the 
> hstore contains 40K key/value pairs.

Well, as a test I ran a query that included an hstore literal with 4
million key/value pairs (a bit shy of 70MB of query text).  I didn't see
any misbehavior on a machine with 2MB max_stack_depth.  So there's
something else going on in your situation.

I concur with the suggestion to try to get a stack backtrace from the
point of the error.  Setting a breakpoint at errfinish() is usually
an effective strategy when you know that the query will provoke a SQL
error report.

https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend

regards, tom lane




Re: [GENERAL] max_stack_depth problem though query is substantially smaller

2016-04-10 Thread Tom Lane
"Bannert  Matthias"  writes:
> Fwiw, I was not stubbornly insisting on nesting operators. Actually I switched 
> from "=>" to the hstore function because a note in the manual said the operator 
> was deprecated (http://www.postgresql.org/docs/9.0/static/hstore.html). Somehow 
> I must have understood that note the wrong way. 
> What's your take on that operator being deprecated? 

That's the old SQL operator (which is not even there anymore) that's
equivalent to the hstore(text,text) constructor function, ie
"text => text returning hstore".  It's quite a different concept
from the => notation inside an hstore literal.  That is:
'foo'::text => 'bar'::text
is not like
'"foo" => "bar"'::hstore
even though they have the same end result.
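
For instance (a quick sketch, assuming the hstore extension is installed), both of 
these produce the same one-pair value:

    SELECT hstore('foo', 'bar');       -- constructor function, the replacement for the removed operator
    SELECT '"foo" => "bar"'::hstore;   -- here => is just the key/value separator inside the literal
    -- both return: "foo"=>"bar"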

regards, tom lane




Re: [GENERAL] max_stack_depth problem though query is substantially smaller

2016-04-09 Thread Tom Lane
"Bannert  Matthias"  writes:
> [ very deep stack of parser transformExprRecurse calls ]

> #20137 0x7fe7fb80ab8c in pg_analyze_and_rewrite 
> (parsetree=parsetree@entry=0x7fe7fffdb2a0, 
> query_string=query_string@entry=0x7fe7fdf606b0 "INSERT INTO 
> ts_updates(ts_key, ts_data, ts_frequency) VALUES 
> ('some_id.sector_all.news_all_d',hstore('1900-01-01','-0.395131869823009')||hstore('1900-01-02','-0.395131869823009')||hstore('1"...,
>  paramTypes=paramTypes@entry=0x0, numParams=numParams@entry=0) at 
> /build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/tcop/postgres.c:640

The SQL fragment we can see here suggests that your "40K entry hstore" is
getting built up by stringing together 40K hstore concatenation operators.
Don't do that.  Even without the parser stack depth issue, it's uselessly
inefficient.  I presume you're generating this statement mechanically,
not by hand, so you could equally well have the app emit

'1900-01-01 => -0.395131869823009, 1900-01-02 => -0.395131869823009, 
...'::hstore

which would look like a single hstore literal to the parser, and be
processed much more quickly.

If you insist on emitting SQL statements that have operators nested
to such depths, then yes you'll need to increase max_stack_depth to
whatever it takes to allow it.

regards, tom lane




Re: [GENERAL] max_stack_depth problem though query is substantially smaller

2016-04-08 Thread Tom Lane
"Bannert  Matthias"  writes:
> Thanks for your reply. I do think it is a Postgres rather than an R issue, 
> here's why:
> a) R simply puts an SQL string together. What Charles had posted was an 
> excerpt of that string.
> Basically we have 1.7 MB of that string. Everything else is the same; just the 
> hstore contains 40K key/value pairs.

Well, as a test I ran a query that included an hstore literal with 4
million key/value pairs (a bit shy of 70MB of query text).  I didn't see
any misbehavior on a machine with 2MB max_stack_depth.  So there's
something else going on in your situation.

I concur with the suggestion to try to get a stack backtrace from the
point of the error.  Setting a breakpoint at errfinish() is usually
an effective strategy when you know that the query will provoke a SQL
error report.

https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend

regards, tom lane




Re: [GENERAL] max_stack_depth problem though query is substantially smaller

2016-04-08 Thread Bannert Matthias
Thanks for your reply. I do think it is a Postgres rather than an R issue, 
here's why:

a) R simply puts an SQL string together. What Charles had posted was an excerpt 
of that string. 
Basically we have 1.7 MB of that string. Everything else is the same; just the 
hstore contains 40K key/value pairs. 

b) The error message clearly mentions max_stack_depth which is a postgres 
parameter. 

c) If I just take that SQL string (only the first part of it, i.e. the CREATE TEMP 
TABLE and INSERT INTO part, without all the update and join gibberish), put it 
into a .sql file and simply run it through a psql client like this: 
\i myquery.sql

I get exactly the same error message (without any R involved at any stage)

psql:query.sql:3: ERROR:  stack depth limit exceeded
HINT:  Increase the configuration parameter "max_stack_depth" (currently 
7168kB), after ensuring the platform's stack depth limit is adequate.

d) I have run into quite a few R stack errors before and they look different... 
(C_STACK_SIZE)

conclusion:
We are running a simple insert. Nothing special, except that the hstore has 40K 
key/value pairs. Could it be that the indexing of that hstore gets rather large, 
and thus a query string of only 1.7 MB exceeds the stack? 







From: Tom Lane [t...@sss.pgh.pa.us]
Sent: Friday, April 08, 2016 4:20 PM
To: Charles Clavadetscher
Cc: pgsql-general@postgresql.org; Bannert  Matthias
Subject: Re: [GENERAL] max_stack_depth problem though query is substantially 
smaller

"Charles Clavadetscher" <clavadetsc...@swisspug.org> writes:
> When R processes the daily time series we get a stack size exceeded
> error, followed by the hint to increase the max_stack_depth.

Postgres doesn't generally allocate large values on the stack, and I doubt
that R does either.  Almost certainly, what is causing this is not data
size per se but unreasonable call nesting depth in your R code.  You may
have a function that's actually in infinite recursion, or maybe it's
recursing to a depth governed by the number of data elements.  If so,
consider revising it into iteration with an explicitly-represented state
stack.

regards, tom lane




Re: [GENERAL] max_stack_depth problem though query is substantially smaller

2016-04-08 Thread Tom Lane
"Charles Clavadetscher"  writes:
> When R processes the daily time series we get a stack size exceeded
> error, followed by the hint to increase the max_stack_depth.

Postgres doesn't generally allocate large values on the stack, and I doubt
that R does either.  Almost certainly, what is causing this is not data
size per se but unreasonable call nesting depth in your R code.  You may
have a function that's actually in infinite recursion, or maybe it's
recursing to a depth governed by the number of data elements.  If so,
consider revising it into iteration with an explicitly-represented state
stack.

regards, tom lane




Re: [GENERAL] max_stack_depth problem though query is substantially smaller

2016-04-08 Thread Albe Laurenz
Charles Clavadetscher wrote:
> We have a process in R which reads statistical raw data from a table and 
> computes time series values
> from them.
> The time series values are in a hstore field with the date as the key and the 
> value as the value.
> The process writes the computed value into a temporary table and locks the 
> corresponding row(s) of the
> target table for update.
> Finally the row(s) are updated if they already exist or inserted if they do 
> not exist.
> 
> This process runs nightly and processes many GB of data without generating 
> any error. Normally these
> are low frequency time series
> (monthly or quarterly data).
> 
> Now we have a daily time series with about 46'000 key/value pairs. In the near 
> future this will increase to 86'000, including data from previous years.
> 
> When R processes the daily time series we get a stack size exceeded error, 
> followed by the hint to increase the max_stack_depth. My colleague, who wrote 
> the R function and performed the test, read the docs and increased 
> max_stack_depth to 7MB, in line with ulimit -s.
> 
> Here are the details of OS and PG:
> OS: osx 10.10.5
> PG: 9.3.3
> 
> ulimit -s = 8192
> 
> The resize did work, as "show max_stack_depth;" has shown. After this change, 
> however, the query produces the same error as before, just with the new limit 
> of 7 MB.
> 
> The query itself was written to a file in order to verify its size. The size 
> turned out to be 1.7MB, i.e. below even the conservative default limit of 
> 2 MB, let alone the 7 MB we set.
> 
> Apart from the fact that we could consider using a different strategy to 
> store time series, we would
> like to understand what is
> causing the problem.
> 
> Here is the query as it looks in the R code:
> sql_query_data <- sprintf("BEGIN;
>     CREATE TEMPORARY TABLE ts_updates(ts_key varchar, ts_data hstore, ts_frequency integer) ON COMMIT DROP;
>     INSERT INTO ts_updates(ts_key, ts_data) VALUES %s;
>     LOCK TABLE %s.timeseries_main IN EXCLUSIVE MODE;
> 
>     UPDATE %s.timeseries_main
>     SET ts_data = ts_updates.ts_data
>     FROM ts_updates
>     WHERE ts_updates.ts_key = %s.timeseries_main.ts_key;
> 
>     INSERT INTO %s.timeseries_main
>     SELECT ts_updates.ts_key, ts_updates.ts_data, ts_updates.ts_frequency
>     FROM ts_updates
>     LEFT OUTER JOIN %s.timeseries_main ON (%s.timeseries_main.ts_key = ts_updates.ts_key)
>     WHERE %s.timeseries_main.ts_key IS NULL;
>     COMMIT;",
>     values, schema, schema, schema, schema, schema, schema, schema)
> 
> And here is how it looks at the end:
> 
> INSERT INTO ts_updates(ts_key, ts_data, ts_frequency) VALUES 
> ('somekey',hstore('1900-01-01','-0.395131869823009')||
>            hstore('1900-01-02','-0.595131869823009')||
>            hstore('1900-01-03','-0.395131869823009')||
> [...] 46'000 times
>            hstore('1900-01-04','-0.395131869823009'),NULL);

I don't understand which query causes the error.
The queries you quote above are smaller than 1.7 MB...
You could log errors to find out which statement causes the error.

One idea would be to attach a debugger to the backend, set a breakpoint in 
check_stack_depth()
where the error is thrown, and take a stack trace when you hit the error.
Maybe that can show you what is going on.

Yours,
Laurenz Albe



[GENERAL] max_stack_depth problem though query is substantially smaller

2016-04-08 Thread Charles Clavadetscher
Hello

We have a process in R which reads statistical raw data from a table and 
computes time series values from them.
The time series values are in a hstore field with the date as the key and the 
value as the value.
The process writes the computed value into a temporary table and locks the 
corresponding row(s) of the target table for update.
Finally the row(s) are updated if they already exist or inserted if they do not 
exist.

This process runs nightly and processes many GB of data without generating any 
error. Normally these are low frequency time series
(monthly or quarterly data).

Now we have a daily time series with about 46'000 key/value pairs. In the near 
future this will increase to 86'000, including data from previous years.

When R processes the daily time series we get a stack size exceeded error, 
followed by the hint to increase the max_stack_depth. My colleague, who wrote the 
R function and performed the test, read the docs and increased max_stack_depth 
to 7MB, in line with ulimit -s.

Here are the details of OS and PG:
OS: osx 10.10.5
PG: 9.3.3

ulimit -s = 8192

The resize did work, as "show max_stack_depth;" has shown. After this change, 
however, the query produces the same error as before, just with the new limit 
of 7 MB.
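
For completeness, a rough sketch of how the setting was checked (whether it is 
raised per session or in postgresql.conf should not matter):

    SHOW max_stack_depth;          -- reports 7MB after the change (the default is 2MB)
    SET max_stack_depth = '7MB';   -- superuser only; should stay somewhat below ulimit -s (8192 kB here)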

The query itself was written to a file in order to verify its size. The size 
turned out to be 1.7MB, i.e. below even the conservative default limit of 2 MB, 
let alone the 7 MB we set.

Apart from the fact that we could consider using a different strategy to store 
time series, we would like to understand what is
causing the problem.

Here is the query as it looks in the R code:
sql_query_data <- sprintf("BEGIN;
    CREATE TEMPORARY TABLE ts_updates(ts_key varchar, ts_data hstore, ts_frequency integer) ON COMMIT DROP;
    INSERT INTO ts_updates(ts_key, ts_data) VALUES %s;
    LOCK TABLE %s.timeseries_main IN EXCLUSIVE MODE;

    UPDATE %s.timeseries_main
    SET ts_data = ts_updates.ts_data
    FROM ts_updates
    WHERE ts_updates.ts_key = %s.timeseries_main.ts_key;

    INSERT INTO %s.timeseries_main
    SELECT ts_updates.ts_key, ts_updates.ts_data, ts_updates.ts_frequency
    FROM ts_updates
    LEFT OUTER JOIN %s.timeseries_main ON (%s.timeseries_main.ts_key = ts_updates.ts_key)
    WHERE %s.timeseries_main.ts_key IS NULL;
    COMMIT;",
    values, schema, schema, schema, schema, schema, schema, schema)

And here is how it looks at the end:

INSERT INTO ts_updates(ts_key, ts_data, ts_frequency) VALUES 
('somekey',hstore('1900-01-01','-0.395131869823009')||
           hstore('1900-01-02','-0.595131869823009')||
           hstore('1900-01-03','-0.395131869823009')||
[...] 46'000 times
           hstore('1900-01-04','-0.395131869823009'),NULL);

The computer where my colleague made the test is local. There are no other 
concurrent users.
We thank you for hints on what the problem may be and/or how to investigate it 
further.

Please reply to all, as my colleague is not yet subscribed to the mailing list.

Regards,
Charles and Matthias



