Re: [GENERAL] max_stack_depth problem though query is substantially smaller
Eureka! Thanks so much for your help, patience, and the right hunch. Actually I am glad now that I ran into that stack issue (and you), because the entire thing is also much faster now. I changed my app to emit strings like you suggested and it works, even with a smaller max_stack_depth.

Fwiw, I was not stubbornly insisting on nesting operators. I actually switched from "=>" to the hstore function because a note in the manual said the operator was deprecated (http://www.postgresql.org/docs/9.0/static/hstore.html). Somehow I must have understood that note the wrong way. What's your take on that operator being deprecated?

regards, matt bannert

From: Tom Lane [t...@sss.pgh.pa.us]
Sent: Saturday, April 09, 2016 5:25 PM
To: Bannert Matthias
Cc: Charles Clavadetscher; pgsql-general@postgresql.org
Subject: Re: [GENERAL] max_stack_depth problem though query is substantially smaller

"Bannert Matthias" <bann...@kof.ethz.ch> writes:
> [ very deep stack of parser transformExprRecurse calls ]
> #20137 0x7fe7fb80ab8c in pg_analyze_and_rewrite (parsetree=parsetree@entry=0x7fe7fffdb2a0,
> query_string=query_string@entry=0x7fe7fdf606b0 "INSERT INTO ts_updates(ts_key, ts_data, ts_frequency) VALUES
> ('some_id.sector_all.news_all_d',hstore('1900-01-01','-0.395131869823009')||hstore('1900-01-02','-0.395131869823009')||hstore('1"...,
> paramTypes=paramTypes@entry=0x0, numParams=numParams@entry=0) at
> /build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/tcop/postgres.c:640

The SQL fragment we can see here suggests that your "40K entry hstore" is getting built up by stringing together 40K hstore concatenation operators. Don't do that. Even without the parser stack depth issue, it's uselessly inefficient.

I presume you're generating this statement mechanically, not by hand, so you could equally well have the app emit

    '1900-01-01 => -0.395131869823009, 1900-01-02 => -0.395131869823009, ...'::hstore

which would look like a single hstore literal to the parser, and be processed much more quickly. If you insist on emitting SQL statements that have operators nested to such depths, then yes, you'll need to increase max_stack_depth to whatever it takes to allow it.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
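Tom's suggestion above amounts to a string-building change on the client side: render one hstore literal instead of chaining thousands of hstore() calls with ||. A minimal sketch of that change, in Python rather than the thread's R (function and variable names are illustrative, not from the thread):

```python
def hstore_literal(pairs):
    """Render (key, value) pairs as a single quoted hstore literal."""
    def q(s):
        # hstore allows double-quoted keys/values; escape \ and "
        return '"' + s.replace('\\', '\\\\').replace('"', '\\"') + '"'
    body = ', '.join(f'{q(k)}=>{q(v)}' for k, v in pairs)
    # double any single quotes so the literal can sit inside an SQL string
    return "'" + body.replace("'", "''") + "'::hstore"

pairs = [("1900-01-01", "-0.395131869823009"),
         ("1900-01-02", "-0.395131869823009")]
sql = ("INSERT INTO ts_updates(ts_key, ts_data, ts_frequency) "
       f"VALUES ('some_id', {hstore_literal(pairs)}, NULL);")
```

With this approach the parser sees a single constant of roughly the same byte size, so the nesting depth no longer grows with the number of key/value pairs.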
Re: [GENERAL] max_stack_depth problem though query is substantially smaller
PostmasterMain (argc=5, argv=) at /build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/postmaster/postmaster.c:1315
#20144 0x7fe7fb5ff0a3 in main (argc=5, argv=0x7fe7fdd25190) at /build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/main/main.c:227
Detaching from program: /usr/lib/postgresql/9.3/bin/postgres, process 1521

Is this of any help at all? I really wonder how you can get almost 70 MB done and I can't even get 2-5 MB done. Maybe you can share a brief part of your 70 MB query?

regards, matt bannert

From: Tom Lane [t...@sss.pgh.pa.us]
Sent: Friday, April 08, 2016 9:39 PM
To: Bannert Matthias
Cc: Charles Clavadetscher; pgsql-general@postgresql.org
Subject: Re: [GENERAL] max_stack_depth problem though query is substantially smaller

"Bannert Matthias" <bann...@kof.ethz.ch> writes:
> Thanks for your reply. I do think it is rather a postgres than an R issue, here's why:
> a) R simply puts an SQL string together. What Charles had posted was an excerpt of that string.
> Basically we have 1.7 MB of that string. Everything else is equal, just the hstore contains 40K key/value pairs.

Well, as a test I ran a query that included an hstore literal with 4 million key/value pairs (a bit shy of 70MB of query text). I didn't see any misbehavior on a machine with 2MB max_stack_depth. So there's something else going on in your situation.

I concur with the suggestion to try to get a stack backtrace from the point of the error. Setting a breakpoint at errfinish() is usually an effective strategy when you know that the query will provoke a SQL error report.

https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend

regards, tom lane
Re: [GENERAL] max_stack_depth problem though query is substantially smaller
I guess you are right. I have narrowed the query down to a simple CREATE TABLE followed by an INSERT: one text field, one hstore field, and one integer field. No temporary table, no BEGIN, etc. One record, yet the hstore has 40K key/value pairs. No R involved. And I still end up with the same error.

Thanks for the pointer to the backend stack trace instructions. I'll try to set that up and report what I find.

From: Tom Lane [t...@sss.pgh.pa.us]
Sent: Friday, April 08, 2016 9:39 PM
To: Bannert Matthias
Cc: Charles Clavadetscher; pgsql-general@postgresql.org
Subject: Re: [GENERAL] max_stack_depth problem though query is substantially smaller

"Bannert Matthias" <bann...@kof.ethz.ch> writes:
> Thanks for your reply. I do think it is rather a postgres than an R issue, here's why:
> a) R simply puts an SQL string together. What Charles had posted was an excerpt of that string.
> Basically we have 1.7 MB of that string. Everything else is equal, just the hstore contains 40K key/value pairs.

Well, as a test I ran a query that included an hstore literal with 4 million key/value pairs (a bit shy of 70MB of query text). I didn't see any misbehavior on a machine with 2MB max_stack_depth. So there's something else going on in your situation.

I concur with the suggestion to try to get a stack backtrace from the point of the error. Setting a breakpoint at errfinish() is usually an effective strategy when you know that the query will provoke a SQL error report.

https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend

regards, tom lane
Re: [GENERAL] max_stack_depth problem though query is substantially smaller
"Bannert Matthias"writes: > Fwiw, I was not stubbornly insisting on nesting operators. Actually I > switched from "=>" to the hstore function cause > a note in the manual said it was deprecated > (http://www.postgresql.org/docs/9.0/static/hstore.html). Somehow I must have > understand that note the wrong way. > What's your take on that operator being deprecated? That's the old SQL operator (which is not even there anymore) that's equivalent to the hstore(text,text) constructor function, ie "text => text returning hstore". It's quite a different concept from the => notation inside an hstore literal. That is: 'foo'::text => 'bar'::text is not like '"foo" => "bar"'::hstore even though they have the same end result. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] max_stack_depth problem though query is substantially smaller
"Bannert Matthias"writes: > [ very deep stack of parser transformExprRecurse calls ] > #20137 0x7fe7fb80ab8c in pg_analyze_and_rewrite > (parsetree=parsetree@entry=0x7fe7fffdb2a0, > query_string=query_string@entry=0x7fe7fdf606b0 "INSERT INTO > ts_updates(ts_key, ts_data, ts_frequency) VALUES > ('some_id.sector_all.news_all_d',hstore('1900-01-01','-0.395131869823009')||hstore('1900-01-02','-0.395131869823009')||hstore('1"..., > paramTypes=paramTypes@entry=0x0, numParams=numParams@entry=0) at > /build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/tcop/postgres.c:640 The SQL fragment we can see here suggests that your "40K entry hstore" is getting built up by stringing together 40K hstore concatenation operators. Don't do that. Even without the parser stack depth issue, it's uselessly inefficient. I presume you're generating this statement mechanically, not by hand, so you could equally well have the app emit '1900-01-01 => -0.395131869823009, 1900-01-02 => -0.395131869823009, ...'::hstore which would look like a single hstore literal to the parser, and be processed much more quickly. If you insist on emitting SQL statements that have operators nested to such depths, then yes you'll need to increase max_stack_depth to whatever it takes to allow it. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] max_stack_depth problem though query is substantially smaller
"Bannert Matthias"writes: > Thanks for your reply. I do think it is rather a postgres than an R issue, > here's why: > a) R simply puts an SQL string together. What Charles had posted was an > excerpt of that string. > Basically we have 1.7 MB of that string. Everything else is equal just the > hstore contains 40K key value pairs. Well, as a test I ran a query that included an hstore literal with 4 million key/value pairs (a bit shy of 70MB of query text). I didn't see any misbehavior on a machine with 2MB max_stack_depth. So there's something else going on in your situation. I concur with the suggestion to try to get a stack backtrace from the point of the error. Setting a breakpoint at errfinish() is usually an effective strategy when you know that the query will provoke a SQL error report. https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] max_stack_depth problem though query is substantially smaller
Thanks for your reply. I do think it is rather a postgres than an R issue, and here's why:

a) R simply puts an SQL string together. What Charles had posted was an excerpt of that string. Basically we have 1.7 MB of that string. Everything else is equal, just the hstore contains 40K key/value pairs.

b) The error message clearly mentions max_stack_depth, which is a postgres parameter.

c) If I just take that SQL string (only the first part of it, i.e. the create temp table and insert into part, without all the update and join gibberish), put it into a .sql file, and simply run it through a psql client like this: \i myquery.sql, I get exactly the same error message (without any R involved at any stage):

psql:query.sql:3: ERROR:  stack depth limit exceeded
HINT:  Increase the configuration parameter "max_stack_depth" (currently 7168kB), after ensuring the platform's stack depth limit is adequate.

d) I have run into quite a few R stack errors and they look different... (C_STACK_SIZE)

Conclusion: we are running a simple insert. Nothing special except for the fact that the hstore has 40K key/value pairs. Could it be that the indexing of that hstore gets kind of large, and thus a query string of only 1.7 MB exceeds the stack?

From: Tom Lane [t...@sss.pgh.pa.us]
Sent: Friday, April 08, 2016 4:20 PM
To: Charles Clavadetscher
Cc: pgsql-general@postgresql.org; Bannert Matthias
Subject: Re: [GENERAL] max_stack_depth problem though query is substantially smaller

"Charles Clavadetscher" <clavadetsc...@swisspug.org> writes:
> When R processes the daily time series we get a stack size exceeded error, followed by the hint to increase the max_stack_depth.

Postgres doesn't generally allocate large values on the stack, and I doubt that R does either. Almost certainly, what is causing this is not data size per se but unreasonable call nesting depth in your R code. You may have a function that's actually in infinite recursion, or maybe it's recursing to a depth governed by the number of data elements. If so, consider revising it into iteration with an explicitly-represented state stack.

regards, tom lane
Re: [GENERAL] max_stack_depth problem though query is substantially smaller
"Charles Clavadetscher"writes: > When R processes the daily time serie we get a stack size exceeded error, followed by the hint to increase the max_stack_depth. Postgres doesn't generally allocate large values on the stack, and I doubt that R does either. Almost certainly, what is causing this is not data size per se but unreasonable call nesting depth in your R code. You may have a function that's actually in infinite recursion, or maybe it's recursing to a depth governed by the number of data elements. If so, consider revising it into iteration with an explicitly-represented state stack. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] max_stack_depth problem though query is substantially smaller
Charles Clavadetscher wrote:
> We have a process in R which reads statistical raw data from a table and computes time series values from them.
> The time series values are in an hstore field with the date as the key and the value as the value.
> The process writes the computed values into a temporary table and locks the corresponding row(s) of the target table for update.
> Finally the row(s) are updated if they already exist or inserted if they do not exist.
>
> This process runs nightly and processes many GB of data without generating any error. Normally these are low frequency time series (monthly or quarterly data).
>
> Now we have a daily time series with about 46'000 key/value pairs. In the near future this will increase to 86'000, including data from previous years.
>
> When R processes the daily time series we get a stack size exceeded error, followed by the hint to increase the max_stack_depth. My colleague, who wrote the R function and performed the test, read the docs and increased, according to ulimit -s, the max_stack_depth to 7MB.
>
> Here are the details of OS and PG:
> OS: osx 10.10.5
> PG: 9.3.3
> ulimit -s = 8192
>
> The resize did work, as "show max_stack_depth;" has shown. After this change, however, the query states the same error as before, just with the new limit of 7 MB.
>
> The query itself was written to a file in order to verify its size. The size turned out to be 1.7 MB, i.e. even below the conservative default limit of 2 MB, let alone substantially below 7 MB.
>
> Apart from the fact that we could consider using a different strategy to store time series, we would like to understand what is causing the problem.
>
> Here is the query as it looks in the R code:
> sql_query_data <- sprintf("BEGIN;
>     CREATE TEMPORARY TABLE ts_updates(ts_key varchar, ts_data hstore, ts_frequency integer) ON COMMIT DROP;
>     INSERT INTO ts_updates(ts_key, ts_data) VALUES %s;
>     LOCK TABLE %s.timeseries_main IN EXCLUSIVE MODE;
>
>     UPDATE %s.timeseries_main
>     SET ts_data = ts_updates.ts_data
>     FROM ts_updates
>     WHERE ts_updates.ts_key = %s.timeseries_main.ts_key;
>
>     INSERT INTO %s.timeseries_main
>     SELECT ts_updates.ts_key, ts_updates.ts_data, ts_updates.ts_frequency
>     FROM ts_updates
>     LEFT OUTER JOIN %s.timeseries_main ON (%s.timeseries_main.ts_key = ts_updates.ts_key)
>     WHERE %s.timeseries_main.ts_key IS NULL;
>     COMMIT;",
>     values, schema, schema, schema, schema, schema, schema, schema)
>
> And here is how it looks at the end:
>
> INSERT INTO ts_updates(ts_key, ts_data, ts_frequency) VALUES ('somekey',hstore('1900-01-01','-0.395131869823009')||
>     hstore('1900-01-02','-0.595131869823009')||
>     hstore('1900-01-03','-0.395131869823009')||
>     [...] 46'000 times
>     hstore('1900-01-04','-0.395131869823009'),NULL);

I don't understand which query causes the error. The queries you quote above are smaller than 1.7 MB... You could log errors to find out which statement causes the error.

One idea would be to attach a debugger to the backend, set a breakpoint in check_stack_depth() where the error is thrown, and take a stack trace when you hit the error. Maybe that can show you what is going on.

Yours,
Laurenz Albe
[GENERAL] max_stack_depth problem though query is substantially smaller
Hello

We have a process in R which reads statistical raw data from a table and computes time series values from them. The time series values are in an hstore field with the date as the key and the value as the value. The process writes the computed values into a temporary table and locks the corresponding row(s) of the target table for update. Finally the row(s) are updated if they already exist, or inserted if they do not.

This process runs nightly and processes many GB of data without generating any error. Normally these are low frequency time series (monthly or quarterly data). Now we have a daily time series with about 46'000 key/value pairs. In the near future this will increase to 86'000, including data from previous years.

When R processes the daily time series we get a stack size exceeded error, followed by the hint to increase the max_stack_depth. My colleague, who wrote the R function and performed the test, read the docs and increased, according to ulimit -s, the max_stack_depth to 7MB.

Here are the details of OS and PG:
OS: osx 10.10.5
PG: 9.3.3
ulimit -s = 8192

The resize did work, as "show max_stack_depth;" has shown. After this change, however, the query states the same error as before, just with the new limit of 7 MB.

The query itself was written to a file in order to verify its size. The size turned out to be 1.7 MB, i.e. even below the conservative default limit of 2 MB, let alone substantially below 7 MB.

Apart from the fact that we could consider using a different strategy to store time series, we would like to understand what is causing the problem.
Here is the query as it looks in the R code:

sql_query_data <- sprintf("BEGIN;
    CREATE TEMPORARY TABLE ts_updates(ts_key varchar, ts_data hstore, ts_frequency integer) ON COMMIT DROP;
    INSERT INTO ts_updates(ts_key, ts_data) VALUES %s;
    LOCK TABLE %s.timeseries_main IN EXCLUSIVE MODE;

    UPDATE %s.timeseries_main
    SET ts_data = ts_updates.ts_data
    FROM ts_updates
    WHERE ts_updates.ts_key = %s.timeseries_main.ts_key;

    INSERT INTO %s.timeseries_main
    SELECT ts_updates.ts_key, ts_updates.ts_data, ts_updates.ts_frequency
    FROM ts_updates
    LEFT OUTER JOIN %s.timeseries_main ON (%s.timeseries_main.ts_key = ts_updates.ts_key)
    WHERE %s.timeseries_main.ts_key IS NULL;
    COMMIT;",
    values, schema, schema, schema, schema, schema, schema, schema)

And here is how it looks at the end:

INSERT INTO ts_updates(ts_key, ts_data, ts_frequency) VALUES ('somekey',hstore('1900-01-01','-0.395131869823009')||
    hstore('1900-01-02','-0.595131869823009')||
    hstore('1900-01-03','-0.395131869823009')||
    [...] 46'000 times
    hstore('1900-01-04','-0.395131869823009'),NULL);

The computer where my colleague made the test is local. There are no other concurrent users.

We thank you for hints on what the problem may be and/or how to investigate it further. Please reply to all, as my colleague is not yet subscribed to the mailing list.

Regards,
Charles and Matthias
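For what it's worth, the nesting the parser sees can be illustrated with a hypothetical Python version of the VALUES assembly above (mirroring the R sprintf; names are illustrative, not from the thread): each additional key/value pair adds one more || concatenation, i.e. one more operator for the parser to nest, independent of the byte size of the query.

```python
def values_clause_nested(key, pairs):
    """Build the VALUES fragment the way the sprintf above does: one
    hstore(k, v) call per pair, chained together with ||."""
    chained = '||'.join(f"hstore('{k}','{v}')" for k, v in pairs)
    return f"('{key}',{chained},NULL)"

pairs = [("1900-01-01", "-0.395131869823009"),
         ("1900-01-02", "-0.595131869823009"),
         ("1900-01-03", "-0.395131869823009")]
fragment = values_clause_nested("somekey", pairs)
# Each extra pair adds one more || operator for the parser to nest:
assert fragment.count("||") == len(pairs) - 1
```

With 46'000 pairs this form yields 45'999 chained || operators, which is the kind of operator nesting that a stack depth limit guards against.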