On 11-11-23 01:58 PM, Jan Urbański wrote:
On 20/11/11 19:14, Steve Singer wrote:
On 11-10-15 07:28 PM, Jan Urbański wrote:
Hi,

attached is a patch implementing the usage of SPI cursors in PL/Python.
Currently, when trying to process a large table in PL/Python you have
to slurp it all into memory (that's what plpy.execute does).
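
To give an idea of the intended usage, here's a rough sketch (the table
and column names are made up):

    # plpy.execute materializes the entire result set in memory at once
    rv = plpy.execute("SELECT num FROM largetable")

    # plpy.cursor fetches rows incrementally through an SPI cursor
    cur = plpy.cursor("SELECT num FROM largetable")
    while True:
        rows = cur.fetch(100)        # next batch of up to 100 rows
        if not rows:
            break
        for row in rows:
            plpy.notice(row["num"])
    cur.close()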

J

I found a few bugs (see my testing section below) that will need fixing,
plus a few questions about the code.

Responding now to all questions and attaching a revised patch based on your comments.


I've looked over the revised version of the patch and it now seems fine.

Ready for committer.



Do we like the name plpy.cursor, or would we rather call it something like
plpy.execute_cursor(...), plpy.cursor_open(...), or plpy.create_cursor(...)?
Since we will be mostly stuck with this API once we release 9.2, it's worth
gathering some opinions. I like cursor(), but if anyone disagrees now is the time.

We use plpy.subtransaction() to create Subxact objects, so I thought plpy.cursor() would be most appropriate.
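
The parallel is roughly this (a sketch; the table and column names are made up):

    with plpy.subtransaction():               # subtransaction context manager from plpy
        plpy.execute("INSERT INTO t VALUES (1)")

    cur = plpy.cursor("SELECT i FROM t")      # cursor object, created the same way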

This patch does not provide a wrapper around SPI_cursor_move. The patch
is useful without that, and I don't see anything that precludes someone else
from adding it later if they see a need.

My idea is to add keyword arguments to plpy.cursor() that let you request a scrollable cursor, and after that to provide a move() method.
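
Something along these lines, to be clear that none of this is in the current
patch (the keyword argument and the move() semantics are hypothetical):

    cur = plpy.cursor("SELECT i FROM largetable", scroll=True)  # hypothetical keyword argument
    first = cur.fetch(10)
    cur.move(-10)        # hypothetical wrapper around SPI_cursor_move
    again = cur.fetch(10)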

The patch includes documentation updates that describe the new feature.
The Database Access page doesn't provide an API-style list of database
access functions like the PL/Perl page at
http://www.postgresql.org/docs/9.1/interactive/plperl-builtins.html
does. I think the organization of the Perl page is clearer than the
Python one, and we should think about doing some documentation
refactoring. That should be done as a separate patch and shouldn't be a
barrier to committing this one.

Yeah, the PL/Python docs are a bit chaotic right now. I haven't yet summoned the energy to overhaul them.

In PLy_cursor_plan, line 4080:
+ PG_TRY();
+ {
+     Portal portal;
+     char *volatile nulls;
+     volatile int j;

I am probably not seeing a code path, or misunderstanding something
about the setjmp/longjmp usage, but I don't see why nulls and j need to be
volatile here.

It looked like you could drop volatile there (and in PLy_spi_execute_plan, which this code was copied from; did I mention there's quite some code duplication in PL/Python?), but digging through git I found this commit:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=2789b7278c11785750dd9d2837856510ffc67000

which added the original volatile qualification, so I guess there's a reason.

Line 444:
PLy_cursor(PyObject *self, PyObject *args)
+ {
+     char *query;
+     PyObject *plan;
+     PyObject *planargs = NULL;
+
+     if (PyArg_ParseTuple(args, "s", &query))
+         return PLy_cursor_query(query);
+

Should query be freed with PyMem_Free()?

No, with the "s" format PyArg_ParseTuple returns a pointer into the Python string object's internal buffer, which Python owns and which must not be freed. I checked that repeatedly creating a cursor with a plan argument does not leak memory, and that adding PyMem_Free there promptly leads to a segfault.


I tested with both Python 2.6 and Python 3 on a Linux system.

[test cases demonstrating bugs]

Turns out it's a really bad idea to store pointers to Portal structures, because they get invalidated by the subtransaction abort hooks.

I switched to storing the cursor name and looking it up in the appropriate hash table every time it's used. The examples you sent (which I included as regression tests) now cause a ValueError to be raised with a message stating that the cursor has been created in an aborted subtransaction.

Not sure about the wording of the error message, though.
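
The scenario is roughly this (a simplified sketch of what the regression
tests do, not the exact code):

    try:
        with plpy.subtransaction():
            cur = plpy.cursor("SELECT 1")
            raise Exception("force the subtransaction to roll back")
    except Exception:
        pass

    cur.fetch(1)    # raises ValueError: the cursor was created in an aborted subtransaction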

Thanks again for the review!

Cheers,
Jan



