Re: [HACKERS] plpython crash (PG 92)
On lör, 2012-04-28 at 00:32 -0400, Tom Lane wrote: I'm inclined to think that the best fix is for PLy_spi_execute_fetch_result to copy the tupledesc into TopMemoryContext, not the current context. This is a tad scary from a memory leakage standpoint, but I suppose that if python fails to recover the PLyResultObject, this isn't the only memory that's going to be leaked. This area appears to be shy a regression test case or two, in any event. Fixed like that. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython crash (PG 92)
On tor, 2012-04-26 at 17:32 +0500, Asif Naeem wrote: PFA test case. It used simple select statement to retrieve data via plpython. It crashes latest pg 9.2 with the following stack trace i.e. Apparently it is being crashed because of invalid related pointer value of pfree() *header-context-methods-free_p. It is reproducible with latest version of python i.e. Python-2.7.3 and Python-3.2.3. Thanks. This is because of this code: static void PLy_result_dealloc(PyObject *arg) { PLyResultObject *ob = (PLyResultObject *) arg; Py_XDECREF(ob-nrows); Py_XDECREF(ob-rows); Py_XDECREF(ob-status); if (ob-tupdesc) { FreeTupleDesc(ob-tupdesc); // -- dies here ob-tupdesc = NULL; } arg-ob_type-tp_free(arg); } I must have been confused about the tuple descriptor APIs. ob-tupdesc is created using CreateTupleDescCopy(), which copies the refcount of the original tuple descriptor, thus causing a failure when the (seemingly still referenced) tupdesc is freed. Is this behavior correct and useful? The dominant coding practice appears to be to not explicitly free copied tuple descriptors, so maybe that should be done here as well. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython crash (PG 92)
Peter Eisentraut pete...@gmx.net writes: I must have been confused about the tuple descriptor APIs. ob-tupdesc is created using CreateTupleDescCopy(), which copies the refcount of the original tuple descriptor, Um, surely not. That would be nonsensical, and anyway a look at the code shows it isn't doing that. thus causing a failure when the (seemingly still referenced) tupdesc is freed. Is this behavior correct and useful? I think there must be some mistake in your analysis. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython crash (PG 92)
Peter Eisentraut pete...@gmx.net writes: On tor, 2012-04-26 at 17:32 +0500, Asif Naeem wrote: PFA test case. It used simple select statement to retrieve data via plpython. It crashes latest pg 9.2 with the following stack trace i.e. Apparently it is being crashed because of invalid related pointer value of pfree() *header-context-methods-free_p. It is reproducible with latest version of python i.e. Python-2.7.3 and Python-3.2.3. Thanks. This is because of this code: I traced through this test case. The situation is: (1) PLy_spi_execute_fetch_result is executed with CurrentMemoryContext being the SPI Proc context for the current SPI invocation, so that's where CreateTupleDescCopy creates the tupledesc that's placed into the PLyResultObject. (2) As we fall out of the SETOF function with the first result row, PLy_exec_function does SPI_finish() (plpy_exec.c:146). That causes the SPI contexts to go away. The PLyResultObject's tupdesc pointer is now pointing at freed memory, and in a cassert build, that memory has been actively wiped. (3) The main executor calls back to plpython for the next value-per-call result. Control goes to the PyIter_Next call at plpy_exec.c:108, which decides that we're done iterating (since this example only returns one row), and evidently it tries to deallocate the PLyResultObject immediately. Whether that happens immediately or later, though, you're screwed because FreeTupleDesc will be invoked on garbage. I'm inclined to think that the best fix is for PLy_spi_execute_fetch_result to copy the tupledesc into TopMemoryContext, not the current context. This is a tad scary from a memory leakage standpoint, but I suppose that if python fails to recover the PLyResultObject, this isn't the only memory that's going to be leaked. This area appears to be shy a regression test case or two, in any event. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] plpython crash (PG 92)
Hi, PFA test case. It used simple select statement to retrieve data via plpython. It crashes latest pg 9.2 with the following stack trace i.e. #0 0x0073021f in pfree () #1 0x7fa74b632f7a in PLy_result_dealloc () from /home/masif/work/postgresql/postgresql/inst/lib/plpython2.so #2 0x7fa74b2c710b in iter_iternext (iterator=0x1ad7150) at Objects/iterobject.c:74 #3 0x7fa74b2934db in PyIter_Next (iter=0x1b3c5f0) at Objects/abstract.c:3107 #4 0x7fa74b630245 in PLy_exec_function () from /home/masif/work/postgresql/postgresql/inst/lib/plpython2.so #5 0x7fa74b630c57 in plpython_call_handler () from /home/masif/work/postgresql/postgresql/inst/lib/plpython2.so #6 0x00583907 in ExecMakeFunctionResult () #7 0x0057f146 in ExecProject () #8 0x00596740 in ExecResult () #9 0x0057e708 in ExecProcNode () #10 0x0057d582 in standard_ExecutorRun () #11 0x0064f477 in PortalRunSelect () #12 0x00650778 in PortalRun () #13 0x0064ceca in exec_simple_query () #14 0x0064ddc7 in PostgresMain () #15 0x0060bdd9 in ServerLoop () #16 0x0060e9d7 in PostmasterMain () #17 0x005ad360 in main () Apparently it is being crashed because of invalid related pointer value of pfree() *header-context-methods-free_p. It is reproducible with latest version of python i.e. Python-2.7.3 and Python-3.2.3. Thanks. Best Regards, Muhammad Asif Naeem plpython_crash.sql Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython crash (PG 92)
FYI, I have observed this crash on Linux64. Thanks. Best Regards, Muhammad Asif Naeem On Thu, Apr 26, 2012 at 5:32 PM, Asif Naeem asif.na...@enterprisedb.comwrote: Hi, PFA test case. It used simple select statement to retrieve data via plpython. It crashes latest pg 9.2 with the following stack trace i.e. #0 0x0073021f in pfree () #1 0x7fa74b632f7a in PLy_result_dealloc () from /home/masif/work/postgresql/postgresql/inst/lib/plpython2.so #2 0x7fa74b2c710b in iter_iternext (iterator=0x1ad7150) at Objects/iterobject.c:74 #3 0x7fa74b2934db in PyIter_Next (iter=0x1b3c5f0) at Objects/abstract.c:3107 #4 0x7fa74b630245 in PLy_exec_function () from /home/masif/work/postgresql/postgresql/inst/lib/plpython2.so #5 0x7fa74b630c57 in plpython_call_handler () from /home/masif/work/postgresql/postgresql/inst/lib/plpython2.so #6 0x00583907 in ExecMakeFunctionResult () #7 0x0057f146 in ExecProject () #8 0x00596740 in ExecResult () #9 0x0057e708 in ExecProcNode () #10 0x0057d582 in standard_ExecutorRun () #11 0x0064f477 in PortalRunSelect () #12 0x00650778 in PortalRun () #13 0x0064ceca in exec_simple_query () #14 0x0064ddc7 in PostgresMain () #15 0x0060bdd9 in ServerLoop () #16 0x0060e9d7 in PostmasterMain () #17 0x005ad360 in main () Apparently it is being crashed because of invalid related pointer value of pfree() *header-context-methods-free_p. It is reproducible with latest version of python i.e. Python-2.7.3 and Python-3.2.3. Thanks. Best Regards, Muhammad Asif Naeem
Re: [HACKERS] plpython crash
On 17/08/11 23:10, Tom Lane wrote: =?UTF-8?B?SmFuIFVyYmHFhHNraQ==?= wulc...@wulczer.org writes: Here are two patches that fix two separate bugs that you found simultaneously. Because they're actually separate issues, it turned out fixing them was a bit more tricky than I expected (fixing one was unmasking the other one etc). Applied with one non-cosmetic change: I got rid of the test on TransactionIdIsValid(arg-typrel_xmin) in PLy_input_tuple_funcs, as well as where you'd copied that logic in PLy_output_tuple_funcs. AFAICS skipping the update on the xmin/tid, if we're coming through there a second time, would be simply wrong. Thanks! The way things are set up now I think you never go through PLy_input_tuple_funcs twice, unless the cache is determined to be invalid and then you recreate the function from scratch. But of course it's better to be safe than sorry and even if I'm right and it was never executed twice, any refactoring effort might have broken it easily. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython crash
On 17/08/11 11:40, Jan Urbański wrote: On 16/08/11 19:12, Jan Urbański wrote: On 16/08/11 19:07, Jean-Baptiste Quenot wrote: [plpython is buggy] I'll have a patch ready soon. Here are two patches that fix two separate bugs that you found simultaneously. Because they're actually separate issues, it turned out fixing them was a bit more tricky than I expected (fixing one was unmasking the other one etc). Thanks for the report! Jan From 3c0bf7519cad735160d9d222d6f86f84987b38b5 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jan=20Urba=C5=84ski?= wulc...@wulczer.org Date: Wed, 17 Aug 2011 16:07:54 +0200 Subject: [PATCH 2/2] Guard against return type changing in PL/Python functions. Functions cache their I/O routines and in case their return type is composite, a change of the underlying type can cause the cache to become invalid. PL/Python was already checking for composite type changes for input arguments, now the check is extended to cover the return type as well. Per bug report from Jean-Baptiste Quenot. --- src/pl/plpython/expected/plpython_record.out | 21 ++ src/pl/plpython/plpython.c | 93 ++--- src/pl/plpython/sql/plpython_record.sql | 15 3 files changed, 103 insertions(+), 26 deletions(-) diff --git a/src/pl/plpython/expected/plpython_record.out b/src/pl/plpython/expected/plpython_record.out index 7c60089..0bcc46c 100644 --- a/src/pl/plpython/expected/plpython_record.out +++ b/src/pl/plpython/expected/plpython_record.out @@ -308,6 +308,27 @@ SELECT * FROM test_inout_params('test_in'); test_in_inout (1 row) +-- try changing the return types and call functions again +ALTER TABLE table_record DROP COLUMN first; +ALTER TABLE table_record DROP COLUMN second; +ALTER TABLE table_record ADD COLUMN first text; +ALTER TABLE table_record ADD COLUMN second int4; +SELECT * FROM test_table_record_as('obj', 'one', 1, false); + first | second +---+ + one | 1 +(1 row) + +ALTER TYPE type_record DROP ATTRIBUTE first; +ALTER TYPE type_record DROP ATTRIBUTE second; +ALTER TYPE type_record ADD ATTRIBUTE first text; +ALTER TYPE type_record ADD ATTRIBUTE second int4; +SELECT * FROM test_type_record_as('obj', 'one', 1, false); + first | second +---+ + one | 1 +(1 row) + -- errors cases CREATE FUNCTION test_type_record_error1() RETURNS type_record AS $$ return { 'first': 'first' } diff --git a/src/pl/plpython/plpython.c b/src/pl/plpython/plpython.c index 90d3c47..a254ffa 100644 --- a/src/pl/plpython/plpython.c +++ b/src/pl/plpython/plpython.c @@ -1489,6 +1489,42 @@ PLy_function_delete_args(PLyProcedure *proc) PyDict_DelItemString(proc-globals, proc-argnames[i]); } +static bool +PLy_procedure_argument_valid(PLyTypeInfo *arg) +{ + Oid relid; + HeapTuple relTup; + bool valid; + + /* Only check input arguments that are composite */ + if (arg-is_rowtype != 1) { + return true; + } + + /* An uninitialised typ_relid means that we got called on an output + * argument of a function returning a unnamed record type */ + if (!OidIsValid(arg-typ_relid)) { + return true; + } + + Assert(TransactionIdIsValid(arg-typrel_xmin)); + Assert(ItemPointerIsValid(arg-typrel_tid)); + + /* Get the pg_class tuple for the argument type */ + relid = arg-typ_relid; + relTup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid)); + if (!HeapTupleIsValid(relTup)) + elog(ERROR, cache lookup failed for relation %u, relid); + + /* If it has changed, the function is not valid */ + valid = (arg-typrel_xmin == HeapTupleHeaderGetXmin(relTup-t_data) + ItemPointerEquals(arg-typrel_tid, relTup-t_self)); + + ReleaseSysCache(relTup); + + return valid; +} + /* * Decide whether a cached PLyProcedure struct is still valid */ @@ -1509,33 +1545,16 @@ PLy_procedure_valid(PLyProcedure *proc, HeapTuple procTup) /* If there are composite input arguments, they might have changed */ for (i = 0; i proc-nargs; i++) { - Oid relid; - HeapTuple relTup; - /* Short-circuit on first changed argument */ if (!valid) break; - /* Only check input arguments that are composite */ - if (proc-args[i].is_rowtype != 1) - continue; - - Assert(OidIsValid(proc-args[i].typ_relid)); - Assert(TransactionIdIsValid(proc-args[i].typrel_xmin)); - Assert(ItemPointerIsValid(proc-args[i].typrel_tid)); - - /* Get the pg_class tuple for the argument type */ - relid = proc-args[i].typ_relid; - relTup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid)); - if (!HeapTupleIsValid(relTup)) - elog(ERROR, cache lookup failed for relation %u, relid); - - /* If it has changed, the function is not valid */ - if (!(proc-args[i].typrel_xmin == HeapTupleHeaderGetXmin(relTup-t_data) - ItemPointerEquals(proc-args[i].typrel_tid, relTup-t_self))) - valid = false; + valid = PLy_procedure_argument_valid(proc-args[i]); + } - ReleaseSysCache(relTup); + /* if the output argument is composite, it might have changed */ + if (valid) { + valid =
Re: [HACKERS] plpython crash
=?UTF-8?B?SmFuIFVyYmHFhHNraQ==?= wulc...@wulczer.org writes: On 16/08/11 19:07, Jean-Baptiste Quenot wrote: [plpython is buggy] Here are two patches that fix two separate bugs that you found simultaneously. Because they're actually separate issues, it turned out fixing them was a bit more tricky than I expected (fixing one was unmasking the other one etc). These look generally sane although I have some minor stylistic gripes. Will clean them up and apply in a few hours (I have to leave for an appointment shortly). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython crash
=?UTF-8?B?SmFuIFVyYmHFhHNraQ==?= wulc...@wulczer.org writes: Here are two patches that fix two separate bugs that you found simultaneously. Because they're actually separate issues, it turned out fixing them was a bit more tricky than I expected (fixing one was unmasking the other one etc). Applied with one non-cosmetic change: I got rid of the test on TransactionIdIsValid(arg-typrel_xmin) in PLy_input_tuple_funcs, as well as where you'd copied that logic in PLy_output_tuple_funcs. AFAICS skipping the update on the xmin/tid, if we're coming through there a second time, would be simply wrong. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython crash
After backporting plpython.c from HEAD, this is the error message I get: ERROR: key pg.dropped.6 not found in mapping HINT: To return null in a column, add the value None to the mapping with the key named after the column. CONTEXT: while creating return value PL/Python function myfunc What does it mean? -- Jean-Baptiste Quenot -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython crash
On 16/08/11 16:52, Jean-Baptiste Quenot wrote: After backporting plpython.c from HEAD, this is the error message I get: ERROR: key pg.dropped.6 not found in mapping HINT: To return null in a column, add the value None to the mapping with the key named after the column. CONTEXT: while creating return value PL/Python function myfunc What does it mean? Ah, interesting, I think that this means that you are returning a table type and that table has a dropped column. The code should skip over dropped columns, but apparently it does not and tries to find a value for that column in the mapping you are returning. I'll try to reproduce it here. Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython crash
On 16/08/11 17:06, Jan Urbański wrote: On 16/08/11 16:52, Jean-Baptiste Quenot wrote: After backporting plpython.c from HEAD, this is the error message I get: ERROR: key pg.dropped.6 not found in mapping HINT: To return null in a column, add the value None to the mapping with the key named after the column. CONTEXT: while creating return value PL/Python function myfunc What does it mean? I did a couple of simple tests and can't see how can the code not skip dropped columns. It seems like you're missing this commit: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=41282111e6cc73aca4b63dffe950ba7a63e4bd8a Could you try running this query? (assuming your function is called 'myfync') select proname, relname, attname, attisdropped from pg_proc join pg_class on (prorettype = reltype) join pg_attribute on (attrelid = pg_class.oid) where proname = 'myfunc'; Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython crash
Dear Jan, Sorry I typed the wrong git commands. With latest plpython from branch master I got the same gdb backtrace as reported before. I managed to wrap up a testcase that fails 100% of times on my setup: https://gist.github.com/1149512 Hope it crashes on your side too :-) This is the result on PG 9.0.4: https://gist.github.com/1149543 This is the result on PG 9.0.4 with plpython.c backported from HEAD: https://gist.github.com/1149558 Cheers, -- Jean-Baptiste Quenot -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython crash
On 16/08/11 19:07, Jean-Baptiste Quenot wrote: Dear Jan, Sorry I typed the wrong git commands. With latest plpython from branch master I got the same gdb backtrace as reported before. I managed to wrap up a testcase that fails 100% of times on my setup: https://gist.github.com/1149512 Hope it crashes on your side too :-) Awesome, it segfaults for me with HEAD ;) Now it's just a simple matter of programming... I'll take a look at it this evening. Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython crash
On 11/08/11 18:01, Jean-Baptiste Quenot wrote: Hi there, plpython crashes on me on various 64-bit Ubuntu hosts, see the gdb backtrace at: https://gist.github.com/1140005 Do you believe there was recent bugfixes regarding PLyMapping_ToTuple() ? This is PG 9.0.4 with HEAD of plpython taken in march 2011 and backported. Please tell me if you need more information. Hi, there were no changes to that area of plpython after March 2011. Could you try to see if the error also appears if you run your app with current PostgreSQL HEAD (both the server and plpython)? Which Python version is that? You can get that info by running: do $$ import sys; plpy.info(sys.version) $$ language plpythonu; Could you try to extract a self-contained example of how to reproduce it? If the bug only appears under your application's specific workload, perhaps you could try running it with Postgres compiled with -O0, because compiling with -O2 causes the gdb backtrace to be missing optimised out values and inlined functions? Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython crash
Here is the same with -O0: https://gist.github.com/1140005 sys.version reports this: INFO: 2.6.6 (r266:84292, Sep 15 2010, 16:41:53) [GCC 4.4.5] -- Jean-Baptiste Quenot -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython crash
On 12/08/11 13:55, Jean-Baptiste Quenot wrote: Here is the same with -O0: https://gist.github.com/1140005 sys.version reports this: INFO: 2.6.6 (r266:84292, Sep 15 2010, 16:41:53) [GCC 4.4.5] I'm still at a loss. Did you reproduce it with git HEAD? I see that the query being execute is select * from myfunc(); would it be possible to share the code of myfunc? Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] plpython crash
Hi there, plpython crashes on me on various 64-bit Ubuntu hosts, see the gdb backtrace at: https://gist.github.com/1140005 Do you believe there was recent bugfixes regarding PLyMapping_ToTuple() ? This is PG 9.0.4 with HEAD of plpython taken in march 2011 and backported. Please tell me if you need more information. -- Jean-Baptiste Quenot -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers