subject:"\[HACKERS\] pl\/python improvements"

Re: [HACKERS] pl/python improvements

2010-12-24 Thread James William Pye

On Dec 23, 2010, at 3:38 AM, Jan Urbański wrote:
 Oh, didn't know that. I see that it does some more fancy things, like
 defining a inheritance hierarchy for these exceptions and adding some
 more into the mix.

Right, there were some cases that appeared to benefit from larger
buckets than what the existing code classes provided. Also, some of the
exceptions in there are strictly for py-postgresql/client-side things.

 The names I used are not really invented, they're just plpgsql condition
 names from
 http://www.postgresql.org/docs/current/static/errcodes-appendix.html
 with underscores changed to camel case. Also, since they're
 autogenerated from utils/errcodes.h they don't have any hierarchy, they
 just all inherit from SPIError.

For the backend setting, I think this is quite appropriate.

However, for pg-python, I had mixed feelings about this as I wanted to
be able to leverage py-postgresql's hierarchy, but still have the projects
independent. I ended up punting on this one by using a single error class,
and forcing the user to compare the codes. =(

 Sticking Error to every one of them will result in things like
 SubstringErrorError, so I'm not really sold on that.

There was some creativity applied to the names in postgresql.exceptions
to accommodate for things like that. (Like no redundant Error)

 Basically I think
 more PL/Python users will be familiar with condition names as you use
 them in pl/pgsql than with the names from py-postgresql.

I think that's fair assumption. In fact, I think that might make a good
TODO for py-postgresql/pg-python. Provide a plpgsql-code-name to
exception class mapping.

cheers, jwp
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pl/python improvements

2010-12-23 Thread Marti Raudsepp

On Thu, Dec 23, 2010 at 04:08, Jan Urbański wulc...@wulczer.org wrote:
  * providing custom exceptions for SPI errors, so you can catch only
 UniqueViolations and not have to muck around with SQLCODE

py-postgresql already has a mapping from error codes to Python
exceptions. I think it makes sense to re-use that, instead of
inventing new names.
https://github.com/jwp/py-postgresql/blob/v1.1/postgresql/exceptions.py

It also follows the Python convention of ending exception classes with
Error, so instead of UniqueViolation they have UniqueError, instead
of InvalidTextRepresentation, they have TextRepresentationError

 Meanwhile the code
 is available at https://github.com/wulczer/postgres. You will find 10
 branches there, 9 correspond to these features, and the plpython
 branch is the sum of them all.

I tried building the plpython branch, but got an unrelated error. I
didn't investigate further for now...

make[3]: Entering directory
`/home/marti/src/postgresql-py/src/backend/bootstrap'
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing
-fwrapv -I. -I. -I../../../src/include -D_GNU_SOURCE   -c -o
bootparse.o bootparse.c
bootparse.y: In function ‘boot_yyparse’:
bootparse.y:224:16: error: too few arguments to function ‘heap_create’
../../../src/include/catalog/heap.h:37:17: note: declared here
bootparse.y:249:16: error: too few arguments to function
‘heap_create_with_catalog’
../../../src/include/catalog/heap.h:48:12: note: declared here
make[3]: *** [bootparse.o] Error 1


Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pl/python improvements

2010-12-23 Thread Jan Urbański

On 23/12/10 12:16, Marti Raudsepp wrote:
 On Thu, Dec 23, 2010 at 04:08, Jan Urbański wulc...@wulczer.org wrote:
  * providing custom exceptions for SPI errors, so you can catch only
 UniqueViolations and not have to muck around with SQLCODE
 
 py-postgresql already has a mapping from error codes to Python
 exceptions. I think it makes sense to re-use that, instead of
 inventing new names.
 https://github.com/jwp/py-postgresql/blob/v1.1/postgresql/exceptions.py
 
 It also follows the Python convention of ending exception classes with
 Error, so instead of UniqueViolation they have UniqueError, instead
 of InvalidTextRepresentation, they have TextRepresentationError

Oh, didn't know that. I see that it does some more fancy things, like
defining a inheritance hierarchy for these exceptions and adding some
more into the mix.

The names I used are not really invented, they're just plpgsql condition
names from
http://www.postgresql.org/docs/current/static/errcodes-appendix.html
with underscores changed to camel case. Also, since they're
autogenerated from utils/errcodes.h they don't have any hierarchy, they
just all inherit from SPIError.

Sticking Error to every one of them will result in things like
SubstringErrorError, so I'm not really sold on that. Basically I think
more PL/Python users will be familiar with condition names as you use
them in pl/pgsql than with the names from py-postgresql.

 Meanwhile the code
 is available at https://github.com/wulczer/postgres. You will find 10
 branches there, 9 correspond to these features, and the plpython
 branch is the sum of them all.
 
 I tried building the plpython branch, but got an unrelated error. I
 didn't investigate further for now...
 
 make[3]: Entering directory
 `/home/marti/src/postgresql-py/src/backend/bootstrap'
 gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith
 -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing
 -fwrapv -I. -I. -I../../../src/include -D_GNU_SOURCE   -c -o
 bootparse.o bootparse.c
 bootparse.y: In function ‘boot_yyparse’:
 bootparse.y:224:16: error: too few arguments to function ‘heap_create’
 ../../../src/include/catalog/heap.h:37:17: note: declared here
 bootparse.y:249:16: error: too few arguments to function
 ‘heap_create_with_catalog’
 ../../../src/include/catalog/heap.h:48:12: note: declared here
 make[3]: *** [bootparse.o] Error 1

I'm pretty sure it'll go away it you do make maintainer-clean or git
clean -dfx after switching to the plpython branch.

Cheers,
Jan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pl/python improvements

2010-12-22 Thread Jan Urbański

On 08/12/10 22:41, Peter Eisentraut wrote:
 On tis, 2010-12-07 at 23:56 +0100, Jan Urbański wrote:
 Peter suggested having a mail/patch per feature and the way I intend
 to do that is instead of having a dozen branches, have one and after
 I'm done rebase it interactively to produce incremental patches that
 apply to master, each one implementing one feature.
 
 Fair enough if you want to do it that way, but I'd encourage you to just
 send in any self-contained features/changes that you have finished.

I finished work on the PL/Python improvements. There 9 separate patches,
with some interdependencies.

The features implemented are:
 * general refactoring, dropping the global error state, using dynahash
instead of PyDicts for procedure caching, etc
 * a validator function
 * executing SPI calls in subtransactions
 * starting explicit subtransactions to have atomic behaviour of
multiple SPI calls
 * providing custom exceptions for SPI errors, so you can catch only
UniqueViolations and not have to muck around with SQLCODE
 * invalidate functions accepting composite types if the type changes
after the functions has been defined
 * traceback support
 * table functions (ability to return RECORD and SETOF RECORD)
 * using custom parsers for datatypes + an example application with
dict-hstore parsing

The dependencies are:

refactor - validator
 - SPI in subxacts
  - explicit subxacts
 - custom exceptions for SPI
 - invalidate composites
 - tracebacks
 - table functions
 - custom parsers

so everything depends on the refactoring patch, and the SPI changes are
incremental.

I will generate these patches and send them in in separate threads/add
them to the commitfest app before Christmas (I hope). Meanwhile the code
is available at https://github.com/wulczer/postgres. You will find 10
branches there, 9 correspond to these features, and the plpython
branch is the sum of them all.

From now on I will stop rebasing these branches, and instead just commit
fixes to the appropriate branch and merge upwards (from refactor to
everything, from spi-in-subxacts to explicit-subxacts and to
custom-spi-exceptions, etc). I'll also try to independently maintain the
plpython branch with the sum to make sure nothing has been left out as
the feature branches get merged into mainline.

Here's to hoping that this will not turn into a conflict hell.

Cheers,
Jan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pl/python improvements

2010-12-08 Thread Peter Eisentraut

On tis, 2010-12-07 at 23:56 +0100, Jan Urbański wrote:
 Peter suggested having a mail/patch per feature and the way I intend
 to do that is instead of having a dozen branches, have one and after
 I'm done rebase it interactively to produce incremental patches that
 apply to master, each one implementing one feature.

Fair enough if you want to do it that way, but I'd encourage you to just
send in any self-contained features/changes that you have finished.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] pl/python improvements

2010-12-07 Thread Jan Urbański

Hi,

no, no patch(es) yet. I'm going through plpython.c trying as best I can
to improve things there. I'll have a patch (or patches) ready for the
January commitfest, but I thought I'd open up a discussion already to
spare me having to redo features because the way I attacked the problem
is a dead end. Be warned, this ended up being a very long email...

The code is on https://github.com/wulczer/postgres, in the plpython
branch. I'll be rebasing it regularly, so don't be surprised by commit
hashes changing.

Currently the improvements are:
 * added a validator, so syntax errors are caught on function creation,
and you don't pay compile overhead when first calling the function
 * use HTABs instead of Python dictionaries to cache procedures,
maintaing two separate HTABs for procedures and triggers (this way you
can use the procedure's OID as the hash key)
 * get rid of the global PLy_error_in_progress, hopefuly things that
cause errors are reported immediately
 * use palloc(TopMemoryContext) instead of malloc() for things that need
to live until the end of the session
 * execute SPI calls in a subtransaction, report errors back to Python
as exceptions that can be caught etc.
 * do not prefix error messages with PL/Python, just use the error
string as errmsg
 * if a function has composite type inputs, check if the underlying
types have not changed when calling the function and recompile it if
necessary (fixes a TODO item)
 * support tracebacks and make them look just like Python tracebacks (as
much as possible that is)
 * fix plpy.error and plpy.fatal to actually raise a
plpy.Error/plpy.Fatal exception, that can be caught etc.
 * remove volatile modifiers from variables (AIUI they are completely
not necessary anywhere in plpython.c)
 * general small refactorings, removing unnecessary code, splitting big
procedures into smaller ones

Planned improvements include:
 * explicit subxact starting, so you can issue consecutive plpy.execute
calls atomically
 * split plpython.c into smaller modules, like plpython_traceback.c,
plpython_type_conversion.c, plpython_plpy.c etc.
 * variadic argument handling (is this possible for a non-plpgsql pl?)
 * multiple OUT parameters support (a TODO item, I think table function
support meant to say that)
 * add a DB-API interface on top of SPI (a TODO item)

Not sure if I'll make it in time with all of them, I have put them in
more-or-less priority order.

Now that you're excited, the questions.

Q1

To check for composite types changing after the procedure I/O functions
have been cached, I store the relid of the base relation for the type
(typeidTypeRelid(desc-tdtypeid), where desc is the type's TupleDesc),
as well as the xmin and tid of the relation's tuple from pg_class. Then
when the function is invoked, for each composite parameter I fetch the
relation's tuple from pg_class again and compare xmin and tid. This
means that functions with composite parameters are slower, because
there's a syscache search for each parameter for each invocation. I
roughly measured the slowdown for a function with one composite param to
be 4%. OTOH this works now:

CREATE TABLE employee (
name text,
basesalary integer,
bonus integer
);

INSERT INTO employee VALUES ('John', 100, 10), ('Mary', 200, 10);

CREATE OR REPLACE FUNCTION test_composite_table_input(e employee)
RETURNS integer AS $$
return e['basesalary'] + e['bonus']
$$ LANGUAGE plpythonu;

SELECT name, test_composite_table_input(employee.*) FROM employee;

ALTER TABLE employee DROP bonus;

ALTER TABLE employee ADD bonus integer;
UPDATE employee SET bonus = 10;

-- will recompile the function and recache I/O functions
SELECT name, test_composite_table_input(employee.*) FROM employee;


PL/Perl has the same problem, BTW. The question is: am I going the right
way checking if the composite type is up-to-date? The TODO item says:
(is this possible, easy, quick?). My answer is: I guess, kinda, dunno.

Q2

The format of error messages changes, because the PL/Python prefix
gets removed (I want raise plpy.Error(foo) to result in ERROR:
plpy.ERROR: foo, not in PL/Python: plpy.ERROR: foo) and because of
other things. Like for instance since plpy.error(msg) now just raises a
plpy.Error exception, you will get ERROR: plpy.Error: msg and not just
ERROR: msg. This potentially breaks clients parsing error messages,
but I don't buy that argument really. Still, is it acceptable?

Q3

Tracebacks! The way I implemented them (based on the patch linked from
the TODO item) they look like this:

CREATE FUNCTION nested_error() RETURNS text AS
'def fun1():
plpy.error(boom)

def fun2():
fun1()

def fun3():
fun2()

fun3()
return not reached
' LANGUAGE plpythonu;
SELECT nested_error();
ERROR:  plpy.Error: boom
DETAIL:  Traceback (most recent call last):
  PL/Python function nested_error, line 10, in module
fun3()
  PL/Python function nested_error, line 8, in fun3
fun2()
  PL/Python function nested_error, line 5, in fun2

Re: [HACKERS] pl/python improvements

2010-12-07 Thread Andres Freund

On Tuesday 07 December 2010 20:17:57 Jan Urbański wrote:
 Hi,
 
 no, no patch(es) yet. I'm going through plpython.c trying as best I can
 to improve things there. I'll have a patch (or patches) ready for the
 January commitfest, but I thought I'd open up a discussion already to
 spare me having to redo features because the way I attacked the problem
 is a dead end. Be warned, this ended up being a very long email...
Nice to see improvements there.

 Currently the improvements are:
  * execute SPI calls in a subtransaction, report errors back to Python
 as exceptions that can be caught etc.
Youre doing that unconditionally? I think the performance impact of this will 
be too severe... Subtransactions can get quite expensive - especially in the 
face of HS.

  * remove volatile modifiers from variables (AIUI they are completely
 not necessary anywhere in plpython.c)
I havent read the code recently but they might be needed for local variables 
changed inside a PG_TRY catching an error and also accessed in the CATCH block 
or afterwards if youre not rethrowing.

See the man page of longjmp.

  * explicit subxact starting, so you can issue consecutive plpy.execute
 calls atomically
That likely obsoletes my comment from above.

  * split plpython.c into smaller modules, like plpython_traceback.c,
 plpython_type_conversion.c, plpython_plpy.c etc.
  * variadic argument handling (is this possible for a non-plpgsql pl?)
  * multiple OUT parameters support (a TODO item, I think table function
 support meant to say that)
  * add a DB-API interface on top of SPI (a TODO item)
I am not hot on that as dbapi simply is too unspecified in too many areas 
(prepared statements anyone).

Thanks,

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pl/python improvements

2010-12-07 Thread Jan Urbański

On 07/12/10 21:33, Andres Freund wrote:
 On Tuesday 07 December 2010 20:17:57 Jan Urbański wrote:
  * execute SPI calls in a subtransaction, report errors back to Python
 as exceptions that can be caught etc.
 Youre doing that unconditionally? I think the performance impact of this will 
 be too severe... Subtransactions can get quite expensive - especially in the 
 face of HS.

See
http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg163004.html :(

  * explicit subxact starting, so you can issue consecutive plpy.execute
 calls atomically
 That likely obsoletes my comment from above.

Not really, I'm afraid :( You can pay the subxact overhead once with
explicit subxact starting, but you always have to pay it...

with plpy.subxact():
plpy.execute(select 1)
plpy.execute(select 2)
plpy.execute(select 3)

will start only 1 subxact, as opposed to 3.

Cheers,
Jan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pl/python improvements

2010-12-07 Thread Peter Eisentraut

On tis, 2010-12-07 at 20:17 +0100, Jan Urbański wrote:
 no, no patch(es) yet. I'm going through plpython.c trying as best I can
 to improve things there. I'll have a patch (or patches) ready for the
 January commitfest, but I thought I'd open up a discussion already to
 spare me having to redo features because the way I attacked the problem
 is a dead end. Be warned, this ended up being a very long email...

This all sounds very nice (especially since I had many of the same items
on my todo list to tackle after Christmas or so).  But I think this
would be massively simpler if you created a separate patch/branch/email
thread for each feature or change.

 The code is on https://github.com/wulczer/postgres, in the plpython
 branch. I'll be rebasing it regularly, so don't be surprised by commit
 hashes changing.

I think rebasing published repositories isn't encouraged.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pl/python improvements

2010-12-07 Thread Andrew Dunstan




On 12/07/2010 04:50 PM, Peter Eisentraut wrote:



The code is on https://github.com/wulczer/postgres, in the plpython
branch. I'll be rebasing it regularly, so don't be surprised by commit
hashes changing.

I think rebasing published repositories isn't encouraged.



Indeed. See http://progit.org/book/ch3-6.html for example, which says:

   *Do not rebase commits that you have pushed to a public repository.*

   If you follow that guideline, you’ll be fine. If you don’t, people
   will hate you, and you’ll be scorned by friends and family.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pl/python improvements

2010-12-07 Thread Jan Urbański

On 07/12/10 23:00, Andrew Dunstan wrote:
 On 12/07/2010 04:50 PM, Peter Eisentraut wrote:

 The code is on https://github.com/wulczer/postgres, in the plpython
 branch. I'll be rebasing it regularly, so don't be surprised by commit
 hashes changing.
 I think rebasing published repositories isn't encouraged.
 
 Indeed. See http://progit.org/book/ch3-6.html for example, which says:
 
*Do not rebase commits that you have pushed to a public repository.*
 
If you follow that guideline, you’ll be fine. If you don’t, people
will hate you, and you’ll be scorned by friends and family.

If someone is interested in merging from that branch, I can refrain from
rebasing (just tell me). But I'm really using it just as a patch queue
and pushing to github is a way having an online backup for the changes,
so it's not really a public repository. It's meant for people who want
to try out the code as it stands today, or see the changes that have
been made and don't care about having to rebase their checkout themselves.

I find developing that way much easier than merging from the master
branch. If I get conflicts I get them on a single commit of mine, not on
the merge commit, which makes them easier to resolve. See
https://github.com/diaspora/diaspora/wiki/Git-Workflow for an example of
that kind of workflow.

Peter suggested having a mail/patch per feature and the way I intend to
do that is instead of having a dozen branches, have one and after I'm
done rebase it interactively to produce incremental patches that apply
to master, each one implementing one feature.

Cheers,
Jan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pl/python improvements

Re: [HACKERS] pl/python improvements

Re: [HACKERS] pl/python improvements

Re: [HACKERS] pl/python improvements

Re: [HACKERS] pl/python improvements

[HACKERS] pl/python improvements

Re: [HACKERS] pl/python improvements

Re: [HACKERS] pl/python improvements

Re: [HACKERS] pl/python improvements

Re: [HACKERS] pl/python improvements

Re: [HACKERS] pl/python improvements

11 matches

Site Navigation

Mail list logo

Footer information