Re: [HACKERS] libpq's multi-threaded SSL callback handling is busted
Peter Eisentraut writes:

On 2/12/15 7:28 AM, Jan Urbański wrote:
+#if OPENSSL_VERSION_NUMBER < 0x10000000L
+/* OpenSSL 1.0.0 deprecates the CRYPTO_set_id_callback function and provides a
+ * default implementation, so there's no need for our own. */

I have some additional concerns about this. It is true that OpenSSL 1.0.0 deprecates CRYPTO_set_id_callback(), but it replaces it with CRYPTO_THREADID_set_callback(). There is no indication that you don't need to set a callback anymore. The man page (https://www.openssl.org/docs/crypto/threads.html) still says you need to set two callbacks, and points to the new interface.

Truly, no good deed can ever go unpunished. I spent around an hour tracking down why setting both callbacks for OpenSSL >= 1.0.0 brought back the PHP segfaults. Turns out, in OpenSSL there's *no way* to unregister a callback registered with CRYPTO_THREADID_set_callback()! Observe:

https://github.com/openssl/openssl/blob/35a1cc90bc1795e8893c11e442790ee7f659fffb/crypto/thr_id.c#L174-L180

Once you set a callback, game over - unloading your library will cause a segfault. I cannot express how joyful I felt when I discovered it.

I suggest you remove this part from your patch and submit a separate patch for consideration if you want to.

At this point I will propose an even simpler patch (attached). I gave up on trying to use the more modern CRYPTO_THREADID_* functions. Right now, HEAD libpq won't compile against OpenSSL compiled with OPENSSL_NO_DEPRECATED, which I think is fine, given the lack of complaints. So let's just keep using the older, saner functions and ignore the THREADID crap.

By the way, might I take the opportunity to broach the subject of backpatching this change, should it get accepted?
Cheers,
Jan

diff --git a/src/interfaces/libpq/fe-secure-openssl.c b/src/interfaces/libpq/fe-secure-openssl.c
index a32af34..02c9177 100644
--- a/src/interfaces/libpq/fe-secure-openssl.c
+++ b/src/interfaces/libpq/fe-secure-openssl.c
@@ -806,9 +806,12 @@ pgtls_init(PGconn *conn)
 		if (ssl_open_connections++ == 0)
 		{
-			/* These are only required for threaded libcrypto applications */
-			CRYPTO_set_id_callback(pq_threadidcallback);
-			CRYPTO_set_locking_callback(pq_lockingcallback);
+			/* These are only required for threaded libcrypto applications, but
+			 * make sure we don't stomp on them if they're already set. */
+			if (CRYPTO_get_id_callback() == NULL)
+				CRYPTO_set_id_callback(pq_threadidcallback);
+			if (CRYPTO_get_locking_callback() == NULL)
+				CRYPTO_set_locking_callback(pq_lockingcallback);
 		}
 	}
 #endif   /* ENABLE_THREAD_SAFETY */
@@ -885,9 +888,12 @@ destroy_ssl_system(void)
 	if (pq_init_crypto_lib && ssl_open_connections == 0)
 	{
-		/* No connections left, unregister libcrypto callbacks */
-		CRYPTO_set_locking_callback(NULL);
-		CRYPTO_set_id_callback(NULL);
+		/* No connections left, unregister libcrypto callbacks, if no one
+		 * registered different ones in the meantime. */
+		if (CRYPTO_get_id_callback() == pq_threadidcallback)
+			CRYPTO_set_id_callback(NULL);
+		if (CRYPTO_get_locking_callback() == pq_lockingcallback)
+			CRYPTO_set_locking_callback(NULL);

 	/*
 	 * We don't free the lock array or the SSL_context.  If we get another

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
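The guard logic in the patch above can be illustrated with a toy model (in Python, with stand-in names - this is not libpq or OpenSSL code): a library only installs its callback if the slot is empty, and only clears the slot on shutdown if it still holds the library's own callback.

```python
# Toy model of the "don't stomp" guard from the patch: a single global slot
# stands in for OpenSSL's locking-callback pointer.

_locking_callback = None  # stand-in for OpenSSL's global callback slot


def get_locking_callback():
    return _locking_callback


def set_locking_callback(cb):
    global _locking_callback
    _locking_callback = cb


def pq_lockingcallback():
    return "libpq"


def app_callback():
    return "app"


def pgtls_init():
    # Install our callback only if nobody else has set one.
    if get_locking_callback() is None:
        set_locking_callback(pq_lockingcallback)


def destroy_ssl_system():
    # Unset only if the installed callback is still ours; if the application
    # (or a VM like Python's ssl module) registered its own in the meantime,
    # leave it alone.
    if get_locking_callback() is pq_lockingcallback:
        set_locking_callback(None)
```

With an application callback already installed, init and shutdown leave it untouched; with an empty slot, the library installs and later removes its own. The race the thread discusses is visible here too: the check and the set are two separate operations, not one atomic test-and-set.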
Re: [HACKERS] libpq's multi-threaded SSL callback handling is busted
Peter Eisentraut writes:

On 4/2/15 4:32 AM, Jan Urbański wrote:

Peter Eisentraut writes:
I don't think this patch would actually fix the problem that was described after the original bug report (http://www.postgresql.org/message-id/5436991b.5020...@vmware.com), namely that another thread acquires a lock while the libpq callbacks are set and then cannot release the lock if libpq has been shut down in the meantime.

I did test both the Python and the PHP repro scripts and the patch fixed both the deadlock and the segfault. What happens is that Python (for instance) stomps on the callback unconditionally. So when libpq gets unloaded, it sees that the currently set callback is not the one it originally set and refrains from NULLing it.

So this works because Python is just as broken as libpq right now. What happens if Python is fixed as well? Then we'll have the problem I described above.

Well, not really, libpq sets and unsets the callbacks every time an SSL connection is opened and closed. Python sets the callbacks once when the ssl module is imported and never touches them again. Arguably, it should set them even earlier, but it's still saner than flip-flopping them. AFAIK you can't unload Python, so setting the callback and keeping it there is a sound strategy.

The change I'm proposing is not to set the callback in libpq if one is already set and not remove it if the one that's set is not libpq's. That's as sane as it gets. With multiple libraries wanting to use OpenSSL in threaded code, the mechanism seems to be last one wins. It doesn't matter *whose* callbacks are used, as long as they're there and they stay there - and that's Python's stance. This doesn't work if you want to be able to unload the library, so the next best thing is not touching the callback if someone else set theirs in the meantime. What's broken is OpenSSL for offering such a bad way of dealing with concurrency.

To reiterate: I recognise that not handling the callbacks is not the right answer.
But not stomping on callbacks others might have set sounds like a minimal and safe improvement.

Cheers,
Jan
Re: [HACKERS] libpq's multi-threaded SSL callback handling is busted
Peter Eisentraut writes:

On 2/12/15 7:28 AM, Jan Urbański wrote:
For the sake of discussion, here's a patch to prevent stomping on previously-set callbacks, racy as it looks. FWIW, it does fix the Python deadlock and doesn't cause the PHP segfault...

I don't think this patch would actually fix the problem that was described after the original bug report (http://www.postgresql.org/message-id/5436991b.5020...@vmware.com), namely that another thread acquires a lock while the libpq callbacks are set and then cannot release the lock if libpq has been shut down in the meantime.

I did test both the Python and the PHP repro scripts and the patch fixed both the deadlock and the segfault. What happens is that Python (for instance) stomps on the callback unconditionally. So when libpq gets unloaded, it sees that the currently set callback is not the one it originally set and refrains from NULLing it. There's a small race window there, to be sure, but it's a lot better than what we have now.

Cheers,
Jan
Re: [HACKERS] GSoC idea - Simulated annealing to search for query plans
Josh Berkus writes:

On 02/26/2015 05:50 PM, Fabrízio de Royes Mello wrote:

On Thu, Feb 26, 2015 at 10:27 PM, Andres Freund and...@2ndquadrant.com wrote:

On 2015-02-26 20:23:33 -0500, Tom Lane wrote:
I seem to recall somebody demo'ing a simulated-annealing GEQO replacement at PGCon a couple years back. It never got to the point of being a submitted patch though.

Yea, it was Jan Urbański (CCed).

And the project link: https://github.com/wulczer/saio

So what we're saying, Grzegorz, is that we would love to see someone pick this up and get it to the point of making it a feature as a GSOC project. I think if you can start from where Jan left off, you could actually complete it.

Sorry, late to the party.

Yes, I wrote a GEQO replacement that used simulated annealing for my Master's thesis. It got to a point where it was generating plans similar to what GEQO outputs for small queries and better plans for very large ones.

The thesis itself is here: https://wulczer.org/saio.pdf and the linked GitHub repo contains the source for the PGCon presentation, which gives a higher-level overview.

The big problem turned out to be writing the step function that generates a new join order from a previous one. Look for the Simulated Annealing challenges and Moves generator chapters in my thesis, which are the only interesting ones :)

If you'd like to pick up where I left off, I'd be more than happy to help in any way I can.

Best,
Jan
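For readers unfamiliar with the technique, here is a generic simulated-annealing skeleton (a Python sketch, not the SAIO code): always accept moves that lower the cost, and accept uphill moves with a probability that shrinks as the temperature drops. In SAIO the state would be a join order and the cost function the planner's cost estimate; here we just minimize f(x) = x^2 over the integers.

```python
import math
import random

# Generic simulated-annealing skeleton (illustrative only, not SAIO).


def cost(state):
    return state * state


def random_move(state, rng):
    # Produce a neighbouring state. In SAIO this is the hard part the thesis
    # discusses: generating a new *valid* join order from the previous one.
    return state + rng.choice([-1, 1])


def anneal(initial, rng, temp=10.0, cooling=0.95, steps_per_temp=20):
    state, best = initial, initial
    while temp > 0.01:
        for _ in range(steps_per_temp):
            candidate = random_move(state, rng)
            delta = cost(candidate) - cost(state)
            # Accept improvements unconditionally; accept regressions with
            # probability exp(-delta/temp), which vanishes as temp cools.
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                state = candidate
            if cost(state) < cost(best):
                best = state
        temp *= cooling
    return best


print(anneal(40, random.Random(0)))
```

The hard engineering is all hidden in random_move: for join orders it must produce a neighbouring plan that is still executable, which is exactly the "Moves generator" problem mentioned above.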
Re: [HACKERS] libpq's multi-threaded SSL callback handling is busted
Andres Freund writes:

On 2015-02-12 09:31:27 +0100, Jan Urbański wrote:
That doesn't solve the problem of the Python deadlock, where you're not at leisure to call a C function at the beginning of your module.

We could just never unload the hooks...

That's what we did before 4e816286533dd34c10b368487d4079595a3e1418 :) And it got changed after http://www.postgresql.org/message-id/48620925.6070...@pws.com.au

* If there's already callbacks set: Remember that fact and don't overwrite. In the next major version: warn.

So yeah, that was my initial approach - check if callbacks are set, don't do the dance if they are. It felt like a crutch, though, and racy at that. There's no atomic way to test-and-set those callbacks. The window for raciness is small, though.

If you do that check during library initialization instead of every connection it shouldn't be racy - if that part is run in a multithreaded fashion you're doing something crazy.

Yes, that's true. The problem is that there's no real libpq initialisation function. The docs say that:

If your application initializes libssl and/or libcrypto libraries and libpq is built with SSL support, you should call PQinitOpenSSL

So most apps will just not bother. The moment you know you'll need SSL is only when you get an 'S' message from the server...

Cheers,
Jan
Re: [HACKERS] libpq's multi-threaded SSL callback handling is busted
Andres Freund writes:

On 2015-02-09 18:17:14 +0100, Jan Urbański wrote:
First of all, the current behaviour is crazy. We're setting and unsetting the locking callback every time a connection is made/closed, which is not how OpenSSL is supposed to be used. The *application* using libpq should set a callback before it starts threads, it's no business of the library's.

I don't buy this argument at all. Delivering a libpq that's not threadsafe in some circumstances is a really bad idea.

I knew this would be a hard sell :( What I know is that the current situation is not good and leaving it as-is causes real grief. The old behaviour was slightly less insane (set callbacks the first time we're engaging OpenSSL code, never touch them again). The real sane solution is to leave it up to the application.

The real solution would be for openssl to do it itself.

I think that the OpenSSL API is the real culprit there - requiring threads to register a callback is what's causing all the issues - but I don't think this will ever get changed.

Note that the PHP pgsql extension doesn't set the OpenSSL callbacks, because libpq was setting them on its own. If we remove the callback handling from libpq, PHP will need to add them. By the way, the MySQL extension for PHP also does not set those callbacks.

I think this shows pretty clearly how insane it would be to change this in a minor release. Do you really expect people to update libpq and php in unison, after a minor release?

Well, I haven't found reports of threaded PHP+MySQL+SSL programs crashing, and the MySQL extension also doesn't care about the callbacks. I think that's either because it's a rare combination, or because almost everyone loads the cURL extension, which *does* set up callbacks. Like I said, it's not a good situation.

Note that we *already* provide applications with the ability to set the callbacks themselves and prevent us from doing so: PQinitSSL(false).
Ah, I only now saw that with PQinitOpenSSL(true, false) you can work around the problem, if you set up your own callbacks first. That seems to at least provide a way to make libpq not do the callbacks dance in released versions. Thank you.

Let me reiterate: I now believe the callbacks should be set by the application, libraries should not touch them, since they don't know what they will be stomping on. If the application is run through a VM like Python or PHP, it's the VM that should make sure the callbacks are set.

I fail to see why it's any better to do it in the VM. That relies on always loading the VM's openssl module, even if you don't actually need it because the library you use deals with openssl. It prevents other implementations of openssl in the VM.

The callbacks are a thing that's owned by the application, so libraries have no business in setting them up. The way OpenSSL's API is specified (very unfortunately IMHO) is that before you use OpenSSL from threads, you need to set the callbacks. If you don't control your application's startup (like when you're a script executed through a VM), you assume the VM took care of it at startup. If you're a library, you assume the user took care of it, as they should.

Now I know that a lot of apps get this wrong. Python almost does the right thing, but you're right, it only sets up callbacks if you load the 'ssl' module. It still feels like setting them up in library code is stomping on things not owned by the library - like libperl setting up signal handlers, which caused problems in Postgres. They're resources not owned by the library!

What I think we should do is the following:

* Emphasize the fact that it's important to use PQinitSSL(false) if you did things yourself.

PQinitOpenSSL(true, false) if anything.
The reason that function got introduced is precisely because PQinitSSL(false) isn't exactly right, see http://www.postgresql.org/message-id/49820d7d.7030...@esilo.com

That doesn't solve the problem of the Python deadlock, where you're not at leisure to call a C function at the beginning of your module.

* If there's already callbacks set: Remember that fact and don't overwrite. In the next major version: warn.

So yeah, that was my initial approach - check if callbacks are set, don't do the dance if they are. It felt like a crutch, though, and racy at that. There's no atomic way to test-and-set those callbacks. The window for raciness is small, though.

* Assert or something if the callback when unregistering isn't the same as it was when registering. That's pretty much guaranteed to cause issues.

So let me try another proposal and see if it doesn't set alarm bells off:

* When opening a connection, if there's a callback set, don't overwrite it (small race window).
* When closing a connection, if the callback set is not pq_lockingcallback/pq_threadidcallback, don't NULL it (small race window).

Asserts in frontend code are impossible, right? There's no way
Re: [HACKERS] libpq's multi-threaded SSL callback handling is busted
Jan Urbański writes:

Andres Freund writes:

On 2015-02-12 09:31:27 +0100, Jan Urbański wrote:
That doesn't solve the problem of the Python deadlock, where you're not at leisure to call a C function at the beginning of your module.

We could just never unload the hooks...

That's what we did before 4e816286533dd34c10b368487d4079595a3e1418 :) And it got changed after http://www.postgresql.org/message-id/48620925.6070...@pws.com.au

* If there's already callbacks set: Remember that fact and don't overwrite. In the next major version: warn.

So yeah, that was my initial approach - check if callbacks are set, don't do the dance if they are. It felt like a crutch, though, and racy at that. There's no atomic way to test-and-set those callbacks. The window for raciness is small, though.

If you do that check during library initialization instead of every connection it shouldn't be racy - if that part is run in a multithreaded fashion you're doing something crazy.

Yes, that's true. The problem is that there's no real libpq initialisation function. The docs say that:

If your application initializes libssl and/or libcrypto libraries and libpq is built with SSL support, you should call PQinitOpenSSL

So most apps will just not bother. The moment you know you'll need SSL is only when you get an 'S' message from the server...

For the sake of discussion, here's a patch to prevent stomping on previously-set callbacks, racy as it looks. FWIW, it does fix the Python deadlock and doesn't cause the PHP segfault...
J

diff --git a/src/interfaces/libpq/fe-secure-openssl.c b/src/interfaces/libpq/fe-secure-openssl.c
index a32af34..54312a5 100644
--- a/src/interfaces/libpq/fe-secure-openssl.c
+++ b/src/interfaces/libpq/fe-secure-openssl.c
@@ -710,6 +710,9 @@ verify_peer_name_matches_certificate(PGconn *conn)
  * Callback functions for OpenSSL internal locking
  */

+#if OPENSSL_VERSION_NUMBER < 0x10000000L
+/* OpenSSL 1.0.0 deprecates the CRYPTO_set_id_callback function and provides a
+ * default implementation, so there's no need for our own. */
 static unsigned long
 pq_threadidcallback(void)
 {
@@ -720,6 +723,7 @@ pq_threadidcallback(void)
 	 */
 	return (unsigned long) pthread_self();
 }
+#endif   /* OPENSSL_VERSION_NUMBER < 0x10000000L */

 static pthread_mutex_t *pq_lockarray;

@@ -806,9 +810,14 @@ pgtls_init(PGconn *conn)
 		if (ssl_open_connections++ == 0)
 		{
-			/* These are only required for threaded libcrypto applications */
-			CRYPTO_set_id_callback(pq_threadidcallback);
-			CRYPTO_set_locking_callback(pq_lockingcallback);
+			/* These are only required for threaded libcrypto applications, but
+			 * make sure we don't stomp on them if they're already set. */
+#if OPENSSL_VERSION_NUMBER < 0x10000000L
+			if (CRYPTO_get_id_callback() == NULL)
+				CRYPTO_set_id_callback(pq_threadidcallback);
+#endif
+			if (CRYPTO_get_locking_callback() == NULL)
+				CRYPTO_set_locking_callback(pq_lockingcallback);
 		}
 	}
 #endif   /* ENABLE_THREAD_SAFETY */
@@ -885,9 +894,14 @@ destroy_ssl_system(void)
 	if (pq_init_crypto_lib && ssl_open_connections == 0)
 	{
-		/* No connections left, unregister libcrypto callbacks */
-		CRYPTO_set_locking_callback(NULL);
-		CRYPTO_set_id_callback(NULL);
+		/* No connections left, unregister libcrypto callbacks, if no one
+		 * registered different ones in the meantime. */
+#if OPENSSL_VERSION_NUMBER < 0x10000000L
+		if (CRYPTO_get_id_callback() == pq_threadidcallback)
+			CRYPTO_set_id_callback(NULL);
+#endif
+		if (CRYPTO_get_locking_callback() == pq_lockingcallback)
+			CRYPTO_set_locking_callback(NULL);

 	/*
 	 * We don't free the lock array or the SSL_context.  If we get another
Re: [HACKERS] libpq's multi-threaded SSL callback handling is busted
Jan Urbański writes:

I did some more digging on bug http://www.postgresql.org/message-id/cahul3dpwyfnugdgo95ohydq4kugdnbkptjq0mnbtubhcmg4...@mail.gmail.com which describes a deadlock when using libpq with SSL in a multi-threaded environment with other threads doing SSL independently. [reproducing instructions]

I posit we should remove all CRYPTO_set_*_callback functions and associated cruft from libpq.

I could submit a patch to get rid of the crazy CRYPTO_*_callback dance in libpq, but at the very least this will require a warning in the release notes.

Attached is a patch doing just that. I would very much like to have this change back-patched, since setting and resetting the callback makes using libpq in a threaded OpenSSL-enabled app arguably less safe than if it didn't use any locking.

Also attached is a patch for 9.4 and all previous supported releases, which is the same thing, but adjusted for when we didn't have a separate fe-secure.c and fe-secure-openssl.c.

If committed, this change will warrant a notice in the release notes. I could try drafting it, if that'd be helpful.

Cheers,
Jan

commit 62f7433f697d49ab005ad22822dc943661a4e48a
Author: Jan Urbański wulc...@wulczer.org
Date: Wed Feb 11 17:47:09 2015 +0100

    Do not attempt to manage OpenSSL locking callbacks in libpq.

    According to OpenSSL documentation, threaded programs using SSL should
    register locking callbacks before starting work. In
    6daf396879b6502c41146cdb1013568654f52ff6 libpq started doing that,
    registering its callbacks when the connection would switch to SSL mode.
    This would lead to overwriting any locking callback a properly written
    threaded SSL application might have already set, but only got uncovered
    as a bug in http://www.postgresql.org/message-id/48620925.6070...@pws.com.au
    where it turned out that unloading libpq would leave a dangling function
    pointer. Specifically, the PHP runtime would segfault after unloading
    libpq, as reported in https://bugs.php.net/bug.php?id=40926

    Commit 4e816286533dd34c10b368487d4079595a3e1418 made libpq unset the
    callback after the last SSL connection was closed. This solved the
    segfault issue, but introduced a potential for deadlocks when one thread
    is using libpq with SSL and another is doing an unrelated SSL operation:

    T1 (libpq)                     T2 (other)
    start libpq connection
      add locking callback
                                   start SSL operation
                                     take lock
    finish libpq connection
      remove locking callback (!)
                                     release lock, noop since no callback

    This got reported in bug http://www.postgresql.org/message-id/cahul3dpwyfnugdgo95ohydq4kugdnbkptjq0mnbtubhcmg4...@mail.gmail.com

    The reality is that it's not libpq's responsibility to manage those
    OpenSSL callbacks. It's up to application code to set them up before
    using OpenSSL in threads; trying to handle them in libpq is just
    stomping on the ones that should already be in place by the time we
    start using shared OpenSSL structures.

    This commit gets rid of all OpenSSL callback handling from libpq. For
    well-behaved applications, this should increase reliability, since they
    will now have certainty that the callbacks they set up before attempting
    to use OpenSSL will be used throughout the program execution.
    Applications that relied on libpq to set up locking callbacks will need
    to be fixed.

diff --git a/src/interfaces/libpq/fe-secure-openssl.c b/src/interfaces/libpq/fe-secure-openssl.c
index a32af34..5d0b132 100644
--- a/src/interfaces/libpq/fe-secure-openssl.c
+++ b/src/interfaces/libpq/fe-secure-openssl.c
@@ -67,7 +67,6 @@ static int verify_cb(int ok, X509_STORE_CTX *ctx);
 static int verify_peer_name_matches_certificate_name(PGconn *conn,
 								ASN1_STRING *name, char **store_name);
-static void destroy_ssl_system(void);
 static int initialize_SSL(PGconn *conn);
 static PostgresPollingStatusType open_client_SSL(PGconn *);
 static char *SSLerrmessage(void);
@@ -90,8 +89,6 @@ static bool pq_init_crypto_lib = true;
 static SSL_CTX *SSL_context = NULL;

 #ifdef ENABLE_THREAD_SAFETY
-static long ssl_open_connections = 0;
-
 #ifndef WIN32
 static pthread_mutex_t ssl_config_mutex = PTHREAD_MUTEX_INITIALIZER;
 #else
@@ -112,16 +109,6 @@ static long win32_ssl_create_mutex = 0;
 void
 pgtls_init_library(bool do_ssl, int do_crypto)
 {
-#ifdef ENABLE_THREAD_SAFETY
-
-	/*
-	 * Disallow changing the flags while we have open connections, else we'd
-	 * get completely confused.
-	 */
-	if (ssl_open_connections != 0)
-		return;
-#endif
-
 	pq_init_ssl_lib = do_ssl;
 	pq_init_crypto_lib = do_crypto;
 }
@@ -705,40 +692,6 @@ verify_peer_name_matches_certificate(PGconn *conn)
 	return found_match && !got_error;
 }

-#ifdef ENABLE_THREAD_SAFETY
-/*
- * Callback functions
[HACKERS] libpq's multi-threaded SSL callback handling is busted
I did some more digging on bug http://www.postgresql.org/message-id/cahul3dpwyfnugdgo95ohydq4kugdnbkptjq0mnbtubhcmg4...@mail.gmail.com which describes a deadlock when using libpq with SSL in a multi-threaded environment with other threads doing SSL independently. Attached is a reproducing Python script which in my experience is faster at showing the problem. Run it with:

python -u pg_lock.py

As Heikki correctly diagnosed, the problem is with libpq unsetting the OpenSSL locking callback while another thread is holding one of the locks. The other thread then never releases the lock, and the next time anyone tries to take it, the application deadlocks. The exact way it goes down is like this:

T1 (libpq)                     T2 (Python)
start libpq connection
  init ssl system
    add locking callback
                               start ssl operation
                                 take lock
finish libpq connection
  destroy ssl system
    remove locking callback (!)
                                 release lock, noop since no callback

And the next time any thread tries to take the lock, it deadlocks.

We added unsetting the locking callback in 4e816286533dd34c10b368487d4079595a3e1418 due to this bug report: http://www.postgresql.org/message-id/48620925.6070...@pws.com.au

Indeed, commenting out the CRYPTO_set_locking_callback(NULL) call in fe-secure-openssl.c gets rid of the deadlock. However, it makes php segfault with the (attached) reproduction script from the original 2008 bug report. If your php.ini loads both the pgsql and curl extensions, reproduce the segfault with:

php -f pg_segfault.php

The most difficult part about fixing this bug is to determine *who's at fault*. I now lean towards the opinion that we shouldn't be messing with OpenSSL callbacks *at all*.

First of all, the current behaviour is crazy. We're setting and unsetting the locking callback every time a connection is made/closed, which is not how OpenSSL is supposed to be used. The *application* using libpq should set a callback before it starts threads, it's no business of the library's.
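The T1/T2 interleaving above can be reduced to a tiny model (Python, with made-up names - not the real OpenSSL API): locks are only touched through a registered callback, so if the callback is removed between a thread's acquire and its release, the release silently becomes a no-op and the lock stays held forever.

```python
import threading

# Toy model of OpenSSL's callback-based locking (hypothetical names).
# crypto_lock() does nothing when no locking callback is registered - which
# is exactly why yanking the callback mid-flight leaks a held lock.

locks = [threading.Lock()]
locking_callback = None


def set_locking_callback(cb):
    global locking_callback
    locking_callback = cb


def default_callback(acquire, n):
    if acquire:
        locks[n].acquire()
    else:
        locks[n].release()


def crypto_lock(acquire, n):
    # Mirrors OpenSSL's behaviour: a silent no-op without a callback.
    if locking_callback is not None:
        locking_callback(acquire, n)


# T2 takes the lock while a callback is installed...
set_locking_callback(default_callback)
crypto_lock(True, 0)
# ...then T1 (libpq finishing its last connection) unregisters it...
set_locking_callback(None)
# ...so T2's matching release is silently a no-op:
crypto_lock(False, 0)
print(locks[0].locked())  # prints True: still held, the next acquire hangs
```

The lock ends up permanently held even though T2 "released" it, so the next thread that tries to acquire it blocks forever - the deadlock described in the bug report.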
The old behaviour was slightly less insane (set callbacks the first time we're engaging OpenSSL code, never touch them again). The real sane solution is to leave it up to the application. I posit we should remove all CRYPTO_set_*_callback functions and associated cruft from libpq.

This unfortunately means that multi-threaded applications using libpq and SSL will break if they haven't been setting their own callbacks (if they have, well, tough luck! libpq will just stomp over them the first time it connects to Postgres, but at least *some* callbacks are left present after that). However, AFAICS if your app is not in C, then runtimes already handle that for you (as they should).

Python: https://hg.python.org/cpython/file/dc820b44ce21/Modules/_ssl.c#l4284
PHP: https://github.com/php/php-src/blob/master/ext/curl/interface.c#L1235

Note that the PHP pgsql extension doesn't set the OpenSSL callbacks, because libpq was setting them on its own. If we remove the callback handling from libpq, PHP will need to add them. By the way, the MySQL extension for PHP also does not set those callbacks.

Let me reiterate: I now believe the callbacks should be set by the application, libraries should not touch them, since they don't know what they will be stomping on. If the application is run through a VM like Python or PHP, it's the VM that should make sure the callbacks are set.

I could submit a patch to get rid of the crazy CRYPTO_*_callback dance in libpq, but at the very least this will require a warning in the release notes about how you can't assume that libpq will take care of making sure your app is multi-threaded safe when using OpenSSL. I also don't know how far that's back-patcheable.

I would very much like to have this change back-patched, since setting and resetting the callback makes using libpq in a threaded OpenSSL-enabled app arguably less safe than if it didn't use any locking. If the app is written correctly, it will have set locking callbacks before starting.
Then libpq will happily stomp on them. If the app hasn't set callbacks, it wasn't written correctly in the first place and it will get segfaults instead of deadlocks.

Thanks,
Jan

#!/usr/bin/env python
import sys
import time
import threading
import urllib

import psycopg2

DB_HOST = '127.0.0.1'
DB_USER = 'postgres'
DB_NAME = 'postgres'
HTTPS_URL = 'https://google.com'
NUM_HTTPS_THREADS = 10

def connect():
    conn = psycopg2.connect(
        host=DB_HOST, user=DB_USER, database=DB_NAME, sslmode='require')
    return conn

class Worker(threading.Thread):
    def __init__(self):
        super(Worker, self).__init__()
        self._stop = threading.Event()

    def stop(self):
        self._stop.set()

    def stopped(self):
        return self._stop.isSet()

    def run(self):
        i = 0
        while not self.stopped():
            i += 1
            self.do_work(i)

class
Re: [HACKERS] Min value for port
On 27/06/13 15:11, Magnus Hagander wrote:

On Thu, Jun 27, 2013 at 2:16 PM, Peter Eisentraut pete...@gmx.net wrote:

On 6/27/13 6:34 AM, Magnus Hagander wrote:
Is there a reason why we have set the min allowed value for port to 1, not 1024? Given that you can't actually start postgres with a value < 1024, shouldn't the entry in pg_settings reference that as well?

Are you thinking of the restriction that you need to be root to use ports < 1024? That restriction is not necessarily universal. We can let the kernel tell us at run time if it doesn't like our port.

Yes, that's the restriction I was talking about. It's just a bit annoying that if you look at pg_settings.min_value it doesn't actually tell you the truth. But yeah, I believe Windows actually lets you use a lower port number, so it'd at least have to be #ifdef'ed for that if we wanted to change it.

There's also authbind and CAP_NET_BIND_SERVICE.

Jan
Re: [HACKERS] updated emacs configuration
On 26/06/13 10:51, Dimitri Fontaine wrote:

Peter Eisentraut pete...@gmx.net writes:
$ git clone git://git.postgresql.org/git/postgresql.git
Cloning into 'postgresql'...

I can reproduce that here. I don't know why I have those postgres dirs then, and I'm pretty confused about my round of testing now.

Maybe you cloned from GitHub, where the mirrored repository is called 'postgres'?

Cheers,
Jan
Re: [HACKERS] storing plpython global pointer
On 28/05/13 14:04, Szymon Guz wrote:

Hi, I need to store a global pointer for plpython usage. This is a PyObject* which can be initialized per session, I think. Where should I keep such a pointer?

Hi, you probably could use a global variable, similar to PLy_interp_globals that's defined in plpy_main.c.

Another method would be to expose the Decimal constructor in the plpy module. You could modify plpy_plpymodule.c to import decimal and expose the Decimal constructor as plpy.Decimal.

Best,
Jan
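The second suggestion amounts to importing decimal once at module initialization and publishing its constructor as a module attribute. A rough Python-level sketch of the idea (the real change would live in plpy_plpymodule.c; the "plpy" object here is a stand-in module, not the actual PL/Python one):

```python
import decimal
import types

# Sketch: cache the Decimal constructor on a module object once, the way
# plpy_plpymodule.c could expose it. "plpy" here is a hypothetical stand-in.
plpy = types.ModuleType("plpy")
plpy.Decimal = decimal.Decimal  # imported and bound once, at module init

# Callers then construct decimals through the module, with no per-call import:
total = plpy.Decimal("1.10") + plpy.Decimal("2.20")
print(total)  # prints 3.30
```

This mirrors the global-pointer approach: the constructor is looked up once per session and reused, rather than re-imported on every call.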
Re: [HACKERS] plpgsql versus SPI plan abstraction
On 30/01/13 22:23, Tom Lane wrote:

BTW, I'm also wondering if it's really necessary for plpython/plpy_spi.c to be looking into spi_priv.h ...

As far as I can tell, it's not necessary, spi.h would be perfectly fine.

Cheers,
Jan
Re: [HACKERS] pg_dump --pretty-print-views
On 28/01/13 12:31, Marko Tiikkaja wrote:

On 1/28/13 12:14 PM, Jeevan Chalke wrote:
I could not apply the patch with git apply, but was able to apply it with the patch -p1 command.

IME that's normal for patches that went through filterdiff. I do:

git diff | filterdiff --format=context

to re-format the patches to the context diff preferred on the mailing list. Maybe if I somehow told git to produce context diffs it would work..

Try this, it worked for me: http://wiki.postgresql.org/wiki/Working_with_Git#Context_diffs_with_Git

Cheers,
Jan
[HACKERS] fix SQL example syntax in file comment
Hi,

Here's a trivial patch that fixes a comment in execProcNode.c. For archeological interest, that comment dates back to when it was written in POSTQUEL... The cleanup in a9b1ff4c1d699c8aa615397d47bb3071275c64ef changed RETRIEVE to SELECT, but forgot to add a FROM clause :)

Cheers,
Jan

diff --git a/src/backend/executor/execProcnode.c b/src/backend/executor/execProcnode.c
index 4c1189e..76dd62f 100644
--- a/src/backend/executor/execProcnode.c
+++ b/src/backend/executor/execProcnode.c
@@ -32,6 +32,7 @@
  *		the number of employees in that department.  So we have the query:
  *
  *		select DEPT.no_emps, EMP.age
+ *		from DEPT, EMP
  *		where EMP.name = DEPT.mgr and
  *			  DEPT.name = "shoe"
  *
Re: [HACKERS] [PATCH] PL/Python: Add spidata to all spiexceptions
On 12/12/12 20:21, Karl O. Pinc wrote: There were 2 outstanding issues raised: Is this useful enough/proceeding in the right direction to commit now? Should some of the logic be moved out of a subroutine and into the calling code? I can see arguments to be made for both sides. I'm asking that you think it through to the extent you deem necessary and make changes or not. At that point it should be ready to send to a committer. (I'll re-test first, if you make any changes.) Oh my, I have dropped the ball on this one in the worst manner possible. Sorry! I actually still prefer to keep the signatures of PLy_get_spi_sqlerrcode and PLy_get_spi_error_data similar, so I'd opt for keeping the patch as-is :) Thanks, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] PL/Python: Add spidata to all spiexceptions
Though not the original patch author, I did modify and submit it to the CF app, so I'll reply here :) On 10/12/12 19:20, Karl O. Pinc wrote: On 12/09/2012 10:33:59 PM, Karl O. Pinc wrote: I've gone ahead and signed up to review this patch. Thanks! There were 2 outstanding issues raised: Is this useful enough/proceeding in the right direction to commit now? I believe the process would be to mark it as Ready for Committer if you feel like it's ready and then a committer would make the final call. Should some of the logic be moved out of a subroutine and into the calling code? I think I structured PLy_get_spi_sqlerrcode to accept the same kind of arguments as PLy_get_spi_error_data, which means a SPIException object and a pointer to whatever values it can fill in. That said, I haven't really thought about that too much, so I'm perfectly fine with moving that bit of logic to the caller. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] PL/Python: Add spidata to all spiexceptions
On 05/11/12 18:35, Robert Haas wrote: On Wed, Oct 31, 2012 at 5:33 AM, Jan Urbański wulc...@wulczer.org wrote: On 30/10/12 22:06, Oskari Saarenmaa wrote: PL/Python maps Python SPIError exceptions with 'spidata' attribute into SQL errors. Here's an alternative patch that takes advantage of the already present (and documented) sqlstate variable to set the error code when handling SPIError exceptions. I also used your test case and added another one, just in case. You should probably add this to the next CF so we don't forget about it. I will, as soon as I recover my community account. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] PL/Python: Add spidata to all spiexceptions
On 05/11/12 19:07, Jan Urbański wrote: On 05/11/12 18:35, Robert Haas wrote: You should probably add this to the next CF so we don't forget about it. I will, as soon as I recover my community account. Added as https://commitfest.postgresql.org/action/patch_view?id=971 J -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] PL/Python: Add spidata to all spiexceptions
On 30/10/12 22:06, Oskari Saarenmaa wrote: PL/Python maps Python SPIError exceptions with 'spidata' attribute into SQL errors. PL/Python also creates classes in plpy.spiexceptions for all known errors but does not initialize their spidata, so when a PL/Python function raises such an exception it is not recognized properly and is always reported as an internal error. You're right, I never thought of the possibility of user code explicitly throwing SPIError exceptions. The root issue is that PLy_elog will only set errcode if it finds the spidata attribute, but I think passing error details through that attribute is a kludge more than something more code should rely on. Here's an alternative patch that takes advantage of the already present (and documented) sqlstate variable to set the error code when handling SPIError exceptions. I also used your test case and added another one, just in case. Thanks, Jan From 633fe3cf5872f57fcc33c5462048c0cfea81d907 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jan=20Urba=C5=84ski?= wulc...@wulczer.org Date: Wed, 31 Oct 2012 10:27:59 +0100 Subject: [PATCH] Set SQL error code for SPI errors raised from userland Python code. If a PL/Python function was terminated by a SPIError raised from user code, the Postgres-level error code was always set to internal_error. This makes it pay attention to the exception's sqlstate attribute and set the error code accordingly. 
--- src/pl/plpython/expected/plpython_error.out | 26 src/pl/plpython/expected/plpython_error_0.out | 26 src/pl/plpython/plpy_elog.c | 40 + src/pl/plpython/sql/plpython_error.sql| 30 +++ 4 files changed, 117 insertions(+), 5 deletions(-) diff --git a/src/pl/plpython/expected/plpython_error.out b/src/pl/plpython/expected/plpython_error.out index e1ec9c2..be2ec97 100644 --- a/src/pl/plpython/expected/plpython_error.out +++ b/src/pl/plpython/expected/plpython_error.out @@ -400,3 +400,29 @@ CONTEXT: Traceback (most recent call last): PL/Python function manual_subxact_prepared, line 4, in module plpy.execute(save) PL/Python function manual_subxact_prepared +/* raising plpy.spiexception.* from python code should preserve sqlstate + */ +CREATE FUNCTION plpy_raise_spiexception() RETURNS void AS $$ +raise plpy.spiexceptions.DivisionByZero() +$$ LANGUAGE plpythonu; +DO $$ +BEGIN + SELECT plpy_raise_spiexception(); +EXCEPTION WHEN division_by_zero THEN + -- NOOP +END +$$ LANGUAGE plpgsql; +/* setting a custom sqlstate should be handled + */ +CREATE FUNCTION plpy_raise_spiexception_override() RETURNS void AS $$ +exc = plpy.spiexceptions.DivisionByZero() +exc.sqlstate = 'SILLY' +raise exc +$$ LANGUAGE plpythonu; +DO $$ +BEGIN + SELECT plpy_raise_spiexception_override(); +EXCEPTION WHEN SQLSTATE 'SILLY' THEN + -- NOOP +END +$$ LANGUAGE plpgsql; diff --git a/src/pl/plpython/expected/plpython_error_0.out b/src/pl/plpython/expected/plpython_error_0.out index 6f74a50..39c63c5 100644 --- a/src/pl/plpython/expected/plpython_error_0.out +++ b/src/pl/plpython/expected/plpython_error_0.out @@ -400,3 +400,29 @@ CONTEXT: Traceback (most recent call last): PL/Python function manual_subxact_prepared, line 4, in module plpy.execute(save) PL/Python function manual_subxact_prepared +/* raising plpy.spiexception.* from python code should preserve sqlstate + */ +CREATE FUNCTION plpy_raise_spiexception() RETURNS void AS $$ +raise plpy.spiexceptions.DivisionByZero() +$$ LANGUAGE plpythonu; +DO $$ 
+BEGIN + SELECT plpy_raise_spiexception(); +EXCEPTION WHEN division_by_zero THEN + -- NOOP +END +$$ LANGUAGE plpgsql; +/* setting a custom sqlstate should be handled + */ +CREATE FUNCTION plpy_raise_spiexception_override() RETURNS void AS $$ +exc = plpy.spiexceptions.DivisionByZero() +exc.sqlstate = 'SILLY' +raise exc +$$ LANGUAGE plpythonu; +DO $$ +BEGIN + SELECT plpy_raise_spiexception_override(); +EXCEPTION WHEN SQLSTATE 'SILLY' THEN + -- NOOP +END +$$ LANGUAGE plpgsql; diff --git a/src/pl/plpython/plpy_elog.c b/src/pl/plpython/plpy_elog.c index c375ac0..e3044e7 100644 --- a/src/pl/plpython/plpy_elog.c +++ b/src/pl/plpython/plpy_elog.c @@ -336,6 +336,29 @@ PLy_traceback(char **xmsg, char **tbmsg, int *tb_depth) Py_DECREF(e); } + +static void +PLy_get_spi_sqlerrcode(PyObject *exc, int *sqlerrcode) +{ + PyObject *sqlstate; + char *buffer; + + sqlstate = PyObject_GetAttrString(exc, "sqlstate"); + if (sqlstate == NULL) + return; + + buffer = PyString_AsString(sqlstate); + if (strlen(buffer) == 5 && + strspn(buffer, "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ") == 5) + { + *sqlerrcode = MAKE_SQLSTATE(buffer[0], buffer[1], buffer[2], + buffer[3], buffer[4]); + } + + Py_DECREF(sqlstate); +} + + /* * Extract the error data from a SPIError */ @@ -345,13 +368,20 @@ PLy_get_spi_error_data(PyObject *exc, int *sqlerrcode, char **detail, char **hin PyObject *spidata = NULL; spidata =
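The check inside PLy_get_spi_sqlerrcode — accept a sqlstate only if it is exactly five characters, all drawn from [0-9A-Z] — is compact enough to restate in plain Python. This is an illustrative re-implementation of that validation, not code from the patch:

```python
def sqlstate_is_valid(sqlstate):
    """Mirror of the C-side check: exactly five chars, each in [0-9A-Z]."""
    allowed = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    return len(sqlstate) == 5 and all(c in allowed for c in sqlstate)

# 'SILLY' passes, which is why the custom-sqlstate regression test works;
# lowercase or short codes fail and the errcode stays internal_error.
print(sqlstate_is_valid("SILLY"), sqlstate_is_valid("silly"))
```

This also explains why the test case deliberately picks 'SILLY': any five uppercase alphanumerics form a syntactically valid SQLSTATE.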
[HACKERS] trivial typo in src/tools/RELEASE_CHANGES
There's a typo in src/tools/RELEASE_CHANGES It just ticked off my OCD I guess... Cheers, Jan diff --git a/src/tools/RELEASE_CHANGES b/src/tools/RELEASE_CHANGES new file mode 100644 index 5f1277a..aba1630 *** a/src/tools/RELEASE_CHANGES --- b/src/tools/RELEASE_CHANGES *** Then doing it like this: *** 164,170 void print_stuff(int arg1, int arg2) { ! print_stuff(arg1, arg2, 0); } would maintain binary compatibility. Obviously this would add a fair --- 164,170 void print_stuff(int arg1, int arg2) { ! print_stuff2(arg1, arg2, 0); } would maintain binary compatibility. Obviously this would add a fair -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PostgreSQL 9.2beta4 ( git HEAD) server crash on creating extension plpython3u
On 21/08/12 20:13, Josh Berkus wrote: No. I get the same backtrace when I try against the 9.1.5 (REL9_1_STABLE) branch. I have reproduced this on Linux; it seems the fix is to run the postmaster with this env variable exported: PYTHONHOME=/opt/ActivePython-3.2/ (or wherever you installed ActivePython). To give credit, I found the decisive clue here: http://manojadinesh.blogspot.com/2012/06/fatal-python-error-pyinitialize-unable.html Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PostgreSQL 9.2beta4 ( git HEAD) server crash on creating extension plpython3u
On 22/08/12 13:28, Sachin Srivastava wrote: Yes, it worked for me also. So will this be a workaround? Or do we intend to use something like Py_SetPythonHome() before calling Py_Initialize()? I think the best we can do is to document that for some installations you might need to set PYTHONHOME. I don't think we can figure it out at runtime (we could then use setenv to fix it). This is similar to having to set PYTHONPATH if you want to import code installed in virtualenvs or other nonstandard locations. I'm leaning towards just documenting it; mucking around with environment variables from inside of Postgres' shared libraries seems like a recipe for disaster. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
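As a side note on finding the right value to document: asking the target interpreter for its sys.prefix gives the directory PYTHONHOME should point at for an embedded libpython. A small sketch — the subprocess call and variable names are mine, not from the thread:

```python
import subprocess
import sys

# Ask an interpreter where its installation lives; that directory is the
# value an embedded libpython expects to find in PYTHONHOME. Here we just
# ask the interpreter running this script, but any python binary works.
prefix = subprocess.check_output(
    [sys.executable, "-c", "import sys; print(sys.prefix)"],
    text=True,
).strip()
print(prefix)
```

For the ActivePython case above, running its own binary this way would print something under /opt/ActivePython-3.2/.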
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 06/08/12 13:59, Heikki Linnakangas wrote: On 20.07.2012 10:13, Jan Urbański wrote: On 20/07/12 08:59, Jan Urbański wrote: On 18/07/12 17:17, Heikki Linnakangas wrote: On 14.07.2012 17:50, Jan Urbański wrote: If pg_do_encoding_conversion() throws an error, you don't get a chance to call Py_DECREF() to release the string. Is that a problem? If an error occurs in PLy_traceback(), after incrementing recursion_depth, you don't get a chance to decrement it again. I'm not sure if the Py* function calls can fail, but at least seemingly trivial things like initStringInfo() can throw an out-of-memory error. Of course you're right (on both accounts). Here's a version with a bunch of PG_TRies thrown in. Silly me, playing tricks with postincrements before fully waking up. Here's v3, with a correct inequality test for exceeding the traceback recursion test. Committed the convert-via-UTF-8 part of this. I'll take a closer look at the recursion check next. Thanks! Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Passing tabular data around using python functions
On 30/07/12 14:33, Achim Domma wrote: Hi, Hi Achim, this list is meant for discussing the development of PostgreSQL; in the future you might want to ask your question on pgsql-general. However, to answer your question: I call the function like this: select * from vectormatch(array(select (docid,conceptid,rank)::fps from fps where docid = 4205591)) and get the following output: NOTICE:type 'list' CONTEXT: PL/Python function vectormatch NOTICE: ['(4205591,1,1)', '(4205591,1219,1)', ...] CONTEXT: PL/Python function vectormatch I'm quite surprised that there are strings in the list and not tuples!? I tried my best, but I have no idea what I might be doing wrong. The main purpose of my sample/experiment is to pass the results of a query to a function and to process it there. Any hint would be very appreciated. Yes, it's a missing feature of PL/Python, but in your case you could work around it by writing your function like this: create or replace function vectormatch(docid integer[], conceptid integer[], rank float4[]) returns table(docid integer, weight float4) as $$ data = zip(docid, conceptid, rank) plpy.notice(data) ... $$ language plpythonu; and then calling it like this: select vectormatch(array_agg(docid), array_agg(conceptid), array_agg(rank)) from fps where docid = 4205591; Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
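The zip() call in the workaround just stitches the three parallel arrays (as produced by array_agg) back into rows. Outside PL/Python the same reconstruction looks like this — the sample values are made up for illustration:

```python
# Three parallel arrays, as array_agg would deliver them to the function.
docid = [4205591, 4205591]
conceptid = [1, 1219]
rank = [1.0, 1.0]

# zip() rebuilds one (docid, conceptid, rank) tuple per row, which is what
# the function body then iterates over instead of parsing row strings.
data = list(zip(docid, conceptid, rank))
print(data)  # [(4205591, 1, 1.0), (4205591, 1219, 1.0)]
```

This sidesteps the missing feature entirely: the function never sees composite-type values, only plain arrays it can recombine itself.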
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 18/07/12 17:17, Heikki Linnakangas wrote: On 14.07.2012 17:50, Jan Urbański wrote: If pg_do_encoding_conversion() throws an error, you don't get a chance to call Py_DECREF() to release the string. Is that a problem? If an error occurs in PLy_traceback(), after incrementing recursion_depth, you don't get a chance to decrement it again. I'm not sure if the Py* function calls can fail, but at least seemingly trivial things like initStringInfo() can throw an out-of-memory error. Of course you're right (on both accounts). Here's a version with a bunch of PG_TRies thrown in. Cheers, Jan diff --git a/src/pl/plpython/plpy_elog.c b/src/pl/plpython/plpy_elog.c new file mode 100644 index c375ac0..9944f72 *** a/src/pl/plpython/plpy_elog.c --- b/src/pl/plpython/plpy_elog.c *** static char *get_source_line(const char *** 28,33 --- 28,41 /* + * Guard against re-entrant calls to PLy_traceback, which can happen if + * traceback formatting functions raise Python errors. + */ + #define TRACEBACK_RECURSION_LIMIT 2 + static int recursion_depth = 0; + + + /* * Emit a PG error or notice, together with any available info about * the current Python error, previously set by PLy_exception_set(). * This should be used to propagate Python errors into PG. If fmt is *** static char *get_source_line(const char *** 38,46 void PLy_elog(int elevel, const char *fmt,...) { ! char *xmsg; ! char *tbmsg; ! int tb_depth; StringInfoData emsg; PyObject *exc, *val, --- 46,54 void PLy_elog(int elevel, const char *fmt,...) { ! char *xmsg = NULL; ! char *tbmsg = NULL; ! int tb_depth = 0; StringInfoData emsg; PyObject *exc, *val, *** PLy_elog(int elevel, const char *fmt,... *** 62,68 } PyErr_Restore(exc, val, tb); ! PLy_traceback(&xmsg, &tbmsg, &tb_depth); if (fmt) { --- 70,90 } PyErr_Restore(exc, val, tb); ! if (recursion_depth++ <= TRACEBACK_RECURSION_LIMIT) ! { ! PG_TRY(); ! { ! PLy_traceback(&xmsg, &tbmsg, &tb_depth); ! } ! PG_CATCH(); ! { ! recursion_depth--; ! PG_RE_THROW(); ! } ! PG_END_TRY(); ! } ! !
recursion_depth--; if (fmt) { diff --git a/src/pl/plpython/plpy_util.c b/src/pl/plpython/plpy_util.c new file mode 100644 index 4aabafc..94c035e *** a/src/pl/plpython/plpy_util.c --- b/src/pl/plpython/plpy_util.c *** PLy_free(void *ptr) *** 61,126 PyObject * PLyUnicode_Bytes(PyObject *unicode) { ! PyObject *rv; ! const char *serverenc; ! /* ! * Map PostgreSQL encoding to a Python encoding name. ! */ ! switch (GetDatabaseEncoding()) { ! case PG_SQL_ASCII: ! /* ! * Mapping SQL_ASCII to Python's 'ascii' is a bit bogus. Python's ! * 'ascii' means true 7-bit only ASCII, while PostgreSQL's ! * SQL_ASCII means that anything is allowed, and the system doesn't ! * try to interpret the bytes in any way. But not sure what else ! * to do, and we haven't heard any complaints... ! */ ! serverenc = "ascii"; ! break; ! case PG_WIN1250: ! serverenc = "cp1250"; ! break; ! case PG_WIN1251: ! serverenc = "cp1251"; ! break; ! case PG_WIN1252: ! serverenc = "cp1252"; ! break; ! case PG_WIN1253: ! serverenc = "cp1253"; ! break; ! case PG_WIN1254: ! serverenc = "cp1254"; ! break; ! case PG_WIN1255: ! serverenc = "cp1255"; ! break; ! case PG_WIN1256: ! serverenc = "cp1256"; ! break; ! case PG_WIN1257: ! serverenc = "cp1257"; ! break; ! case PG_WIN1258: ! serverenc = "cp1258"; ! break; ! case PG_WIN866: ! serverenc = "cp866"; ! break; ! case PG_WIN874: ! serverenc = "cp874"; ! break; ! default: ! /* Other encodings have the same name in Python. */ ! serverenc = GetDatabaseEncodingName(); ! break; } ! rv = PyUnicode_AsEncodedString(unicode, serverenc, "strict"); ! if (rv == NULL) ! PLy_elog(ERROR, "could not convert Python Unicode object to PostgreSQL server encoding"); return rv; } --- 61,106 PyObject * PLyUnicode_Bytes(PyObject *unicode) { ! PyObject *bytes, *rv; ! char *utf8string, *encoded; ! /* first encode the Python unicode object with UTF-8 */ ! bytes = PyUnicode_AsUTF8String(unicode); ! if (bytes == NULL) ! PLy_elog(ERROR, "could not convert Python Unicode object to bytes"); ! !
utf8string = PyBytes_AsString(bytes); ! if (utf8string == NULL) { ! Py_DECREF(bytes); ! PLy_elog(ERROR, "could not extract bytes from encoded string"); ! } ! ! /* then convert to server encoding */ ! PG_TRY(); { ! encoded = (char *) pg_do_encoding_conversion( ! (unsigned char *) utf8string, ! strlen(utf8string), ! PG_UTF8, ! GetDatabaseEncoding()); ! } ! PG_CATCH(); ! { ! Py_DECREF(bytes); ! PG_RE_THROW(); } + PG_END_TRY(); ! /* finally, build a bytes object in the server encoding */ ! rv = PyBytes_FromStringAndSize(encoded, strlen(encoded
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 20/07/12 08:59, Jan Urbański wrote: On 18/07/12 17:17, Heikki Linnakangas wrote: On 14.07.2012 17:50, Jan Urbański wrote: If pg_do_encoding_conversion() throws an error, you don't get a chance to call Py_DECREF() to release the string. Is that a problem? If an error occurs in PLy_traceback(), after incrementing recursion_depth, you don't get a chance to decrement it again. I'm not sure if the Py* function calls can fail, but at least seemingly trivial things like initStringInfo() can throw an out-of-memory error. Of course you're right (on both accounts). Here's a version with a bunch of PG_TRies thrown in. Silly me, playing tricks with postincrements before fully waking up. Here's v3, with a correct inequality test for exceeding the traceback recursion limit. J diff --git a/src/pl/plpython/plpy_elog.c b/src/pl/plpython/plpy_elog.c new file mode 100644 index c375ac0..0ad8358 *** a/src/pl/plpython/plpy_elog.c --- b/src/pl/plpython/plpy_elog.c *** static char *get_source_line(const char *** 28,33 --- 28,41 /* + * Guard against re-entrant calls to PLy_traceback, which can happen if + * traceback formatting functions raise Python errors. + */ + #define TRACEBACK_RECURSION_LIMIT 2 + static int recursion_depth = 0; + + + /* * Emit a PG error or notice, together with any available info about * the current Python error, previously set by PLy_exception_set(). * This should be used to propagate Python errors into PG. If fmt is *** static char *get_source_line(const char *** 38,46 void PLy_elog(int elevel, const char *fmt,...) { ! char *xmsg; ! char *tbmsg; ! int tb_depth; StringInfoData emsg; PyObject *exc, *val, --- 46,54 void PLy_elog(int elevel, const char *fmt,...) { ! char *xmsg = NULL; ! char *tbmsg = NULL; ! int tb_depth = 0; StringInfoData emsg; PyObject *exc, *val, *** PLy_elog(int elevel, const char *fmt,... *** 62,68 } PyErr_Restore(exc, val, tb); ! PLy_traceback(&xmsg, &tbmsg, &tb_depth); if (fmt) { --- 70,90 } PyErr_Restore(exc, val, tb); !
if (recursion_depth++ < TRACEBACK_RECURSION_LIMIT) ! { ! PG_TRY(); ! { ! PLy_traceback(&xmsg, &tbmsg, &tb_depth); ! } ! PG_CATCH(); ! { ! recursion_depth--; ! PG_RE_THROW(); ! } ! PG_END_TRY(); ! } ! ! recursion_depth--; if (fmt) { diff --git a/src/pl/plpython/plpy_util.c b/src/pl/plpython/plpy_util.c new file mode 100644 index 4aabafc..94c035e *** a/src/pl/plpython/plpy_util.c --- b/src/pl/plpython/plpy_util.c *** PLy_free(void *ptr) *** 61,126 PyObject * PLyUnicode_Bytes(PyObject *unicode) { ! PyObject *rv; ! const char *serverenc; ! /* ! * Map PostgreSQL encoding to a Python encoding name. ! */ ! switch (GetDatabaseEncoding()) { ! case PG_SQL_ASCII: ! /* ! * Mapping SQL_ASCII to Python's 'ascii' is a bit bogus. Python's ! * 'ascii' means true 7-bit only ASCII, while PostgreSQL's ! * SQL_ASCII means that anything is allowed, and the system doesn't ! * try to interpret the bytes in any way. But not sure what else ! * to do, and we haven't heard any complaints... ! */ ! serverenc = "ascii"; ! break; ! case PG_WIN1250: ! serverenc = "cp1250"; ! break; ! case PG_WIN1251: ! serverenc = "cp1251"; ! break; ! case PG_WIN1252: ! serverenc = "cp1252"; ! break; ! case PG_WIN1253: ! serverenc = "cp1253"; ! break; ! case PG_WIN1254: ! serverenc = "cp1254"; ! break; ! case PG_WIN1255: ! serverenc = "cp1255"; ! break; ! case PG_WIN1256: ! serverenc = "cp1256"; ! break; ! case PG_WIN1257: ! serverenc = "cp1257"; ! break; ! case PG_WIN1258: ! serverenc = "cp1258"; ! break; ! case PG_WIN866: ! serverenc = "cp866"; ! break; ! case PG_WIN874: ! serverenc = "cp874"; ! break; ! default: ! /* Other encodings have the same name in Python. */ ! serverenc = GetDatabaseEncodingName(); ! break; } ! rv = PyUnicode_AsEncodedString(unicode, serverenc, "strict"); ! if (rv == NULL) ! PLy_elog(ERROR, "could not convert Python Unicode object to PostgreSQL server encoding"); return rv; } --- 61,106 PyObject * PLyUnicode_Bytes(PyObject *unicode) { ! PyObject *bytes, *rv; ! char *utf8string, *encoded; !
/* first encode the Python unicode object with UTF-8 */ ! bytes = PyUnicode_AsUTF8String(unicode); ! if (bytes == NULL) ! PLy_elog(ERROR, "could not convert Python Unicode object to bytes"); ! ! utf8string = PyBytes_AsString(bytes); ! if (utf8string == NULL) { ! Py_DECREF(bytes); ! PLy_elog(ERROR, "could not extract bytes from encoded string"); ! } ! ! /* then convert to server encoding */ ! PG_TRY(); { ! encoded = (char *) pg_do_encoding_conversion( ! (unsigned char *) utf8string, ! strlen(utf8string), ! PG_UTF8, ! GetDatabaseEncoding()); ! } ! PG_CATCH
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 13/07/12 13:38, Jan Urbański wrote: On 12/07/12 11:08, Heikki Linnakangas wrote: On 07.07.2012 00:12, Jan Urbański wrote: So you're in favour of doing unicode -> bytes by encoding with UTF-8 and then using the server's encoding functions? Sounds reasonable to me. The extra conversion between UTF-8 and UCS-2 should be quite fast, and it would be good to be consistent in the way we do conversions in both directions. I'll implement that then (sorry for not following up on that earlier). Here's a patch that always encodes Python unicode objects using UTF-8 and then uses Postgres's internal functions to produce bytes in the server encoding. Cheers, Jan diff --git a/src/pl/plpython/plpy_elog.c b/src/pl/plpython/plpy_elog.c new file mode 100644 index c375ac0..c2b3cb8 *** a/src/pl/plpython/plpy_elog.c --- b/src/pl/plpython/plpy_elog.c *** static char *get_source_line(const char *** 28,33 --- 28,41 /* + * Guard against re-entrant calls to PLy_traceback, which can happen if + * traceback formatting functions raise Python errors. + */ + #define TRACEBACK_RECURSION_LIMIT 2 + static int recursion_depth = 0; + + + /* * Emit a PG error or notice, together with any available info about * the current Python error, previously set by PLy_exception_set(). * This should be used to propagate Python errors into PG. If fmt is *** PLy_traceback(char **xmsg, char **tbmsg, *** 147,166 StringInfoData xstr; StringInfoData tbstr; /* * get the current exception */ PyErr_Fetch(&e, &v, &tb); /* ! * oops, no exception, return */ ! if (e == NULL) { *xmsg = NULL; *tbmsg = NULL; *tb_depth = 0; return; } --- 155,177 StringInfoData xstr; StringInfoData tbstr; + recursion_depth++; + /* * get the current exception */ PyErr_Fetch(&e, &v, &tb); /* ! * oops, no exception or recursion depth exceeded, return */ !
if (e == NULL || recursion_depth > TRACEBACK_RECURSION_LIMIT) { *xmsg = NULL; *tbmsg = NULL; *tb_depth = 0; + recursion_depth--; return; } *** PLy_traceback(char **xmsg, char **tbmsg, *** 326,331 --- 337,344 (*tb_depth)++; } + recursion_depth--; + /* Return the traceback. */ *tbmsg = tbstr.data; diff --git a/src/pl/plpython/plpy_util.c b/src/pl/plpython/plpy_util.c new file mode 100644 index 4aabafc..fe43312 *** a/src/pl/plpython/plpy_util.c --- b/src/pl/plpython/plpy_util.c *** PLy_free(void *ptr) *** 61,126 PyObject * PLyUnicode_Bytes(PyObject *unicode) { ! PyObject *rv; ! const char *serverenc; ! /* ! * Map PostgreSQL encoding to a Python encoding name. ! */ ! switch (GetDatabaseEncoding()) { ! case PG_SQL_ASCII: ! /* ! * Mapping SQL_ASCII to Python's 'ascii' is a bit bogus. Python's ! * 'ascii' means true 7-bit only ASCII, while PostgreSQL's ! * SQL_ASCII means that anything is allowed, and the system doesn't ! * try to interpret the bytes in any way. But not sure what else ! * to do, and we haven't heard any complaints... ! */ ! serverenc = "ascii"; ! break; ! case PG_WIN1250: ! serverenc = "cp1250"; ! break; ! case PG_WIN1251: ! serverenc = "cp1251"; ! break; ! case PG_WIN1252: ! serverenc = "cp1252"; ! break; ! case PG_WIN1253: ! serverenc = "cp1253"; ! break; ! case PG_WIN1254: ! serverenc = "cp1254"; ! break; ! case PG_WIN1255: ! serverenc = "cp1255"; ! break; ! case PG_WIN1256: ! serverenc = "cp1256"; ! break; ! case PG_WIN1257: ! serverenc = "cp1257"; ! break; ! case PG_WIN1258: ! serverenc = "cp1258"; ! break; ! case PG_WIN866: ! serverenc = "cp866"; ! break; ! case PG_WIN874: ! serverenc = "cp874"; ! break; ! default: ! /* Other encodings have the same name in Python. */ ! serverenc = GetDatabaseEncodingName(); ! break; } ! rv = PyUnicode_AsEncodedString(unicode, serverenc, "strict"); ! if (rv == NULL) ! PLy_elog(ERROR, "could not convert Python Unicode object to PostgreSQL server encoding"); return rv; } --- 61,96 PyObject * PLyUnicode_Bytes(PyObject *unicode) { !
PyObject *bytes, *rv; ! char *utf8string, *encoded; ! /* first encode the Python unicode object with UTF-8 */ ! bytes = PyUnicode_AsUTF8String(unicode); ! if (bytes == NULL) ! PLy_elog(ERROR, "could not convert Python Unicode object to bytes"); ! ! utf8string = PyBytes_AsString(bytes); ! if (utf8string == NULL) { ! Py_DECREF(bytes); ! PLy_elog(ERROR, "could not extract bytes from encoded string"); ! } ! ! /* then convert to server encoding */ ! encoded = (char *) pg_do_encoding_conversion((unsigned char *) utf8string, ! strlen(utf8string), ! PG_UTF8, ! GetDatabaseEncoding()); ! ! /* finally, build a bytes object in the server encoding */ ! rv = PyBytes_FromStringAndSize(encoded, strlen(encoded
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 12/07/12 11:08, Heikki Linnakangas wrote: On 07.07.2012 00:12, Jan Urbański wrote: On 06/07/12 22:47, Peter Eisentraut wrote: On fre, 2012-07-06 at 18:53 +0300, Heikki Linnakangas wrote: What shall we do about those? Ignore them? Document that if you're using one of these encodings then PL/Python with Python 2 will be crippled and with Python 3 just won't work? We could convert to UTF-8, and use the PostgreSQL functions to convert from UTF-8 to the server encoding. Double conversion might be slow, but I think it would be better than failing. Actually, we already do the other direction that way (PLyUnicode_FromStringAndSize), so maybe it would be more consistent to always use this. I would hesitate to use this as a kind of fallback, because then we would sometimes be using PostgreSQL's recoding tables and sometimes Python's recoding tables, which could become confusing. So you're in favour of doing unicode -> bytes by encoding with UTF-8 and then using the server's encoding functions? Sounds reasonable to me. The extra conversion between UTF-8 and UCS-2 should be quite fast, and it would be good to be consistent in the way we do conversions in both directions. I'll implement that then (sorry for not following up on that earlier). J -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
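The agreed scheme — encode to UTF-8 first, then recode into the server encoding — can be mimicked with Python's own codecs. Here cp1250 stands in for the server encoding, and the second hop is done by Python rather than pg_do_encoding_conversion, so this only illustrates the data flow, not the actual patch:

```python
# A string with non-ASCII characters ("żółw", Polish for "turtle").
text = "\u017c\u00f3\u0142w"

# Hop 1: the unicode object is always encoded to UTF-8 first.
utf8_bytes = text.encode("utf-8")

# Hop 2: the UTF-8 byte string is recoded into the target (server)
# encoding; in the patch this step uses PostgreSQL's recoding tables.
server_bytes = utf8_bytes.decode("utf-8").encode("cp1250")
print(len(utf8_bytes), len(server_bytes))  # 8 4
```

The payoff is consistency: both directions go through PostgreSQL's recoding tables, instead of trusting Python's tables for one direction and PostgreSQL's for the other.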
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 06/07/12 10:05, Heikki Linnakangas wrote: On 06.07.2012 00:54, Jan Urbański wrote: On 05/07/12 23:30, Peter Eisentraut wrote: On tor, 2012-07-05 at 22:53 +0200, Jan Urbański wrote: The problem is that PLyUnicode_Bytes is (via an ifdef) used as PyString_ToString on Python3, which means that there are numerous call sites and new ones might appear at any moment. I'm not that keen on invoking the traceback machinery on low-level encoding errors. Why not? Because it can lead to recursion errors, like the one this patch was supposed to fix. The traceback machinery calls into the encoding functions, because it converts Python strings (like function names) into C strings. In the backend elog routines, there is a global variable 'recursion_depth', which is incremented when an error-handling routine is entered, and decremented afterwards. Can we use a similar mechanism in PLy_elog() to detect and stop recursion? I guess we can; I'll try to do some tests in order to see if there's an easy user-triggerable way of causing PLy_elog to recurse, and if not then a guard like this should be enough as a safety measure against as yet unknown conditions (as opposed to something we expect to happen regularly). Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 06/07/12 10:14, Jan Urbański wrote: On 06/07/12 10:05, Heikki Linnakangas wrote: In the backend elog routines, there is a global variable 'recursion_depth', which is incremented when an error-handling routine is entered, and decremented afterwards. Can we use a similar mechanism in PLy_elog() to detect and stop recursion? I guess we can; I'll try to do some tests in order to see if there's an easy user-triggerable way of causing PLy_elog to recurse, and if not then a guard like this should be enough as a safety measure against as yet unknown conditions (as opposed to something we expect to happen regularly). Attached is a patch that stores the recursion level of PLy_traceback and prevents it from running if it's too deep (PLy_traceback is the one doing the heavy lifting, that's why I chose to put the logic to skip running there). I tried a few things and was not able to easily invoke the infinite recursion condition, but I did notice that there are two more encodings that have different names in Postgres and in Python (KOI8-R and KOI8-U) and added them to the switch. There's still trouble with EUC_TW and MULE_INTERNAL which don't have Python equivalents. EUC-TW has been discussed in http://bugs.python.org/issue2066 and rejected (see http://bugs.python.org/issue2066#msg113731). If you use any of these encodings, you *will* get into the recursion trouble described earlier, just as before the patch you'd get into it with CP1252 as your encoding. What shall we do about those? Ignore them? Document that if you're using one of these encodings then PL/Python with Python 2 will be crippled and with Python 3 just won't work?
Cheers, Jan diff --git a/src/pl/plpython/plpy_elog.c b/src/pl/plpython/plpy_elog.c new file mode 100644 index c375ac0..c2b3cb8 *** a/src/pl/plpython/plpy_elog.c --- b/src/pl/plpython/plpy_elog.c *** static char *get_source_line(const char *** 28,33 --- 28,41 /* + * Guard against re-entrant calls to PLy_traceback, which can happen if + * traceback formatting functions raise Python errors. + */ + #define TRACEBACK_RECURSION_LIMIT 2 + static int recursion_depth = 0; + + + /* * Emit a PG error or notice, together with any available info about * the current Python error, previously set by PLy_exception_set(). * This should be used to propagate Python errors into PG. If fmt is *** PLy_traceback(char **xmsg, char **tbmsg, *** 147,166 StringInfoData xstr; StringInfoData tbstr; /* * get the current exception */ PyErr_Fetch(&e, &v, &tb); /* ! * oops, no exception, return */ ! if (e == NULL) { *xmsg = NULL; *tbmsg = NULL; *tb_depth = 0; return; } --- 155,177 StringInfoData xstr; StringInfoData tbstr; + recursion_depth++; + /* * get the current exception */ PyErr_Fetch(&e, &v, &tb); /* ! * oops, no exception or recursion depth exceeded, return */ ! if (e == NULL || recursion_depth > TRACEBACK_RECURSION_LIMIT) { *xmsg = NULL; *tbmsg = NULL; *tb_depth = 0; + recursion_depth--; return; } *** PLy_traceback(char **xmsg, char **tbmsg, *** 326,331 --- 337,344 (*tb_depth)++; } + recursion_depth--; + /* Return the traceback. */ *tbmsg = tbstr.data; diff --git a/src/pl/plpython/plpy_util.c b/src/pl/plpython/plpy_util.c new file mode 100644 index bf29532..ea4ecdf *** a/src/pl/plpython/plpy_util.c --- b/src/pl/plpython/plpy_util.c *** PLyUnicode_Bytes(PyObject *unicode) *** 112,117 --- 112,123 case PG_WIN874: serverenc = "cp874"; break; + case PG_KOI8R: + serverenc = "koi8-r"; + break; + case PG_KOI8U: + serverenc = "koi8-u"; + break; default: /* Other encodings have the same name in Python.
*/ serverenc = GetDatabaseEncodingName(); *** PLyUnicode_Bytes(PyObject *unicode) *** 120,135 rv = PyUnicode_AsEncodedString(unicode, serverenc, "strict"); if (rv == NULL) ! { ! /* ! * Use a plain ereport instead of PLy_elog to avoid recursion, if ! * the traceback formatting functions try to do unicode to bytes ! * conversion again. ! */ ! ereport(ERROR, ! (errcode(ERRCODE_INTERNAL_ERROR), ! errmsg("could not convert Python Unicode object to PostgreSQL server encoding"))); ! } return rv; } --- 126,132 rv = PyUnicode_AsEncodedString(unicode, serverenc, "strict"); if (rv == NULL) ! PLy_elog(ERROR, "could not convert Python Unicode object to PostgreSQL server encoding"); return rv; } -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 06/07/12 22:47, Peter Eisentraut wrote: On fre, 2012-07-06 at 18:53 +0300, Heikki Linnakangas wrote: What shall we do about those? Ignore them? Document that if you're using one of these encodings then PL/Python with Python 2 will be crippled and with Python 3 just won't work? We could convert to UTF-8, and use the PostgreSQL functions to convert from UTF-8 to the server encoding. Double conversion might be slow, but I think it would be better than failing. Actually, we already do the other direction that way (PLyUnicode_FromStringAndSize), so maybe it would be more consistent to always use this. I would hesitate to use this as a kind of fallback, because then we would sometimes be using PostgreSQL's recoding tables and sometimes Python's recoding tables, which could become confusing. So you're in favour of doing unicode-to-bytes conversion by encoding with UTF-8 and then using the server's encoding functions?
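The double-conversion idea discussed above can be sketched in Python. This is only an illustration of the proposal, not the implementation: `server_recode` here is a hypothetical stand-in for PostgreSQL's server-side UTF-8-to-server-encoding conversion (`pg_do_encoding_conversion` on the C side); the point is that the Python half always uses one codec Python is guaranteed to have.

```python
def unicode_to_server(s, server_recode):
    """Encode via UTF-8 first, then hand off to the server's recoding tables.

    s             -- a Python unicode string
    server_recode -- callable standing in for PostgreSQL's
                     UTF-8 -> server-encoding conversion
    """
    utf8_bytes = s.encode("utf-8")   # Python's tables; cannot fail for valid str
    return server_recode(utf8_bytes)  # PostgreSQL's tables take over from here
```

The trade-off named in the thread is visible in the sketch: the first step always uses Python's tables and the second always uses PostgreSQL's, so behavior is uniform, at the cost of converting twice.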
Re: [HACKERS] plpython issue with Win64 (PG 9.2)
On 05/07/12 21:37, Heikki Linnakangas wrote: Committed. This bug was present in versions >= 9.0, so backpatched. Thanks! I used ereport() rather than elog() in the error message. Correct me if that was wrong, but the point was to avoid PLy_elog(), because that might cause recursion, and ereport() should be ok. I believe the message should be translated, as it's quite possible to get that error, at least if you use SQL_ASCII, so ereport() is more appropriate than elog(). Yes, you're absolutely right. Cheers, Jan
[HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 05/07/12 22:37, Heikki Linnakangas wrote: On 05.07.2012 23:31, Tom Lane wrote: Heikki Linnakangas <heikki.linnakan...@iki.fi> writes: Fix mapping of PostgreSQL encodings to Python encodings. The buildfarm doesn't like this --- did you check for side effects on regression test results? Hmm, I ran the regression tests, but not with C encoding. With the patch, you no longer get the errdetail you used to, when an encoding conversion fails: *** *** 41,47 SELECT unicode_plan1(); ERROR: spiexceptions.InternalError: could not convert Python Unicode object to PostgreSQL server encoding - DETAIL: UnicodeEncodeError: 'ascii' codec can't encode character u'\x80' in position 0: ordinal not in range(128) CONTEXT: Traceback (most recent call last): PL/Python function "unicode_plan1", line 3, in <module> rv = plpy.execute(plan, [u"\x80"], 1) --- 39,44 We could just update the expected output; there are two expected outputs for this test case and one of them is now wrong. But it'd actually be quite a shame to lose that extra information, that's quite valuable. Perhaps we should go back to using PLy_elog() here, and find some other way to avoid the recursion. Seems that the problem is that LC_ALL=C makes Postgres use SQL_ASCII as the database encoding and, as the comment states, translating PG's SQL_ASCII to Python's "ascii" is not ideal. The problem is that PLyUnicode_Bytes is (via an ifdef) used as PyString_ToString on Python 3, which means that there are numerous call sites and new ones might appear at any moment. I'm not that keen on invoking the traceback machinery on low-level encoding errors. Hm, since PLyUnicode_Bytes should get a unicode object and return bytes in the server encoding, we might just say that for SQL_ASCII we arbitrarily choose UTF-8 to encode the unicode codepoints, so we'd just set serverenc = "utf-8" in the first switch case. That doesn't solve the problem of the missing error detail, though.
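The failure quoted in that regression diff can be reproduced in plain Python. A minimal sketch (the helper name `encode_for_server` is illustrative): mapping SQL_ASCII to the "ascii" codec means any codepoint above U+007F raises, with "strict" being the same error mode PLyUnicode_Bytes passes to PyUnicode_AsEncodedString; the arbitrary-UTF-8 fallback proposed above always succeeds.

```python
def encode_for_server(s, server_codec):
    """Encode a Python str the way PLyUnicode_Bytes does: strict errors."""
    return s.encode(server_codec, "strict")

# encode_for_server(u"\x80", "ascii")  raises UnicodeEncodeError --
# the exact error shown in the DETAIL line of the old expected output.
# encode_for_server(u"\x80", "utf-8")  succeeds: the proposed fallback
# for SQL_ASCII, where any byte sequence is acceptable anyway.
```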
Jan
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 05/07/12 23:30, Peter Eisentraut wrote: On tor, 2012-07-05 at 22:53 +0200, Jan Urbański wrote: The problem is that PLyUnicode_Bytes is (via an ifdef) used as PyString_ToString on Python3, which means that there are numerous call sites and new ones might appear in any moment. I'm not that keen on invoking the traceback machinery on low-level encoding errors. Why not? Because it can lead to recursion errors, like the one this patch was supposed to fix. The traceback machinery calls into the encoding functions, because it converts Python strings (like function names) into C strings. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython issue with Win64 (PG 9.2)
On 04/07/12 13:58, Asif Naeem wrote: Patch attached. Asif, could you try a few things on a CP1252 database? First verify if your original test case now works and then try this: I have tested the patch on Win64. The postgres server is working fine now for WIN1252. Thanks. create function enctest() returns text as $$ return b'tr\xc3\xb3spido'.decode('utf-8') $$ language plpython3u; select enctest(), encode(convert_to(enctest(), 'utf-8'), 'hex'); create function enctest() returns text as $$ return b'tr\xc3\xb3spido'.decode('utf-8') $$ language plpython3u; select enctest(), encode(convert_to(enctest(), 'utf-8'), 'hex'); enctest | encode --+ tróspido | 7472c3b3737069646f (1 row) Please do let me know if you have any other query. Thanks. Great, this looks correct. Can we apply this to 9.2? Cheers, Jan
Re: [HACKERS] plpython issue with Win64 (PG 9.2)
On 29/06/12 00:36, Jan Urbański wrote: On 27/06/12 13:57, Jan Urbański wrote: On 27/06/12 11:51, Asif Naeem wrote: Hi, On Windows 7 64bit, plpython is causing server crash with the following test case i.e. So: I'd add code to translate WINxxx into CPxxx when choosing the Python to use, change PLy_elog to elog in PLyUnicode_Bytes and leave the SQL_ASCII case alone, as there were no complaints and people using SQL_ASCII are asking for it anyway. Since no one commented, I'll produce a patch to that effect. I believe this should go into 9.2 given that otherwise PL/Python will basically crash any database using the CP12xx encoding. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython issue with Win64 (PG 9.2)
On 03/07/12 17:45, Jan Urbański wrote: On 29/06/12 00:36, Jan Urbański wrote: On 27/06/12 13:57, Jan Urbański wrote: On 27/06/12 11:51, Asif Naeem wrote: Hi, On Windows 7 64bit, plpython is causing server crash with the following test case i.e. So: I'd add code to translate WINxxx into CPxxx when choosing the Python encoding to use, change PLy_elog to elog in PLyUnicode_Bytes and leave the SQL_ASCII case alone, as there were no complaints and people using SQL_ASCII are asking for it anyway. Since no one commented, I'll produce a patch to that effect. I believe this should go into 9.2 given that otherwise PL/Python will basically crash any database using the CP12xx encoding. Patch attached. Asif, could you try a few things on a CP1252 database? First verify if your original test case now works and then try this: create function enctest() returns text as $$ return b'tr\xc3\xb3spido'.decode('utf-8') $$ language plpython3u; select enctest(), encode(convert_to(enctest(), 'utf-8'), 'hex'); Thanks, Jan diff --git a/src/pl/plpython/plpy_util.c b/src/pl/plpython/plpy_util.c new file mode 100644 index 9a4901e..5fafbd1 *** a/src/pl/plpython/plpy_util.c --- b/src/pl/plpython/plpy_util.c *** PLyUnicode_Bytes(PyObject *unicode) *** 66,80 /* * Python understands almost all PostgreSQL encoding names, but it doesn't ! * know SQL_ASCII. */ ! if (GetDatabaseEncoding() == PG_SQL_ASCII) ! serverenc = "ascii"; ! else ! serverenc = GetDatabaseEncodingName(); rv = PyUnicode_AsEncodedString(unicode, serverenc, "strict"); ! if (rv == NULL) ! PLy_elog(ERROR, "could not convert Python Unicode object to PostgreSQL server encoding"); return rv; } --- 66,125 /* * Python understands almost all PostgreSQL encoding names, but it doesn't ! * know SQL_ASCII and calls the Windows encodings differently. */ ! switch (GetDatabaseEncoding()) ! { ! case PG_SQL_ASCII: ! serverenc = "ascii"; ! break; ! case PG_WIN1250: ! serverenc = "cp1250"; ! break; ! case PG_WIN1251: ! serverenc = "cp1251"; ! break; ! case PG_WIN1252: ! serverenc = "cp1252"; ! break; ! case PG_WIN1253: ! serverenc = "cp1253"; ! break; ! case PG_WIN1254: ! serverenc = "cp1254"; ! break; ! case PG_WIN1255: ! serverenc = "cp1255"; ! break; ! case PG_WIN1256: ! serverenc = "cp1256"; ! break; ! case PG_WIN1257: ! serverenc = "cp1257"; ! break; ! case PG_WIN1258: ! serverenc = "cp1258"; ! break; ! case PG_WIN866: ! serverenc = "cp866"; ! break; ! case PG_WIN874: ! serverenc = "cp874"; ! break; ! default: ! serverenc = GetDatabaseEncodingName(); ! break; ! } ! rv = PyUnicode_AsEncodedString(unicode, serverenc, "strict"); ! if (rv == NULL) { ! /* ! * Use a plain elog instead of PLy_elog here to avoid getting into ! * recursion trouble when the traceback formatting functions try doing ! * unicode to bytes conversion. ! */ ! elog(ERROR, "could not convert Python Unicode object to PostgreSQL server encoding"); ! } return rv; }
Re: [HACKERS] plpython issue with Win64 (PG 9.2)
On 27/06/12 13:57, Jan Urbański wrote: On 27/06/12 11:51, Asif Naeem wrote: Hi, On Windows 7 64bit, plpython is causing server crash with the following test case i.e. CREATE PROCEDURAL LANGUAGE 'plpython3u'; CREATE OR REPLACE FUNCTION pymax (a integer, b integer) RETURNS integer AS $$ if a > b: return a return b $$ LANGUAGE plpython3u; SELECT pymax(1, 2); I think the primary reason that triggers this issue is that when the function PLyUnicode_Bytes() calls PyUnicode_AsEncodedString( , "WIN1252" /* server encoding */, ), it fails with null. I built latest pg 9.2 source code with python 3.2.2.3 by using Visual Studio 2010. Thanks. I'll try to reproduce this on Linux, which should be possible given the results of your investigation. Your analysis is correct, I managed to reproduce this by injecting serverenc = "win1252"; into PLyUnicode_Bytes. The comment in that function says that Python understands all PostgreSQL encoding names except for SQL_ASCII, but that's not really true. In your case GetDatabaseEncodingName() returns WIN1252, whereas Python accepts CP1252. I'm wondering how this should be fixed. Just by adding more special cases in PLyUnicode_Bytes? Even if we add a switch statement that would convert PG_WIN1250 into CP1250, Python can still raise an exception when encoding (for various reasons). How about replacing the PLy_elog there with just an elog? This loses traceback support and the Python exception message, which could be helpful for debugging (something like "invalid character foo for encoding cp1250"). OTOH, I'm uneasy about invoking the entire PLy_elog machinery from a function that's as low-level as PLyUnicode_Bytes. Lastly, we map SQL_ASCII to "ascii", which is arguably wrong. The function is supposed to return bytes in the server encoding, and under SQL_ASCII that probably means we can return anything (i.e. use any encoding we deem useful). Using "ascii" as the Python codec name will raise an error on anything that has the high bit set.
So: I'd add code to translate WINxxx into CPxxx when choosing the Python to use, change PLy_elog to elog in PLyUnicode_Bytes and leave the SQL_ASCII case alone, as there were no complaints and people using SQL_ASCII are asking for it anyway. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
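The WINxxx-to-CPxxx translation proposed above is a simple table lookup. A Python sketch of the mapping (the `server_codec` helper and dict name are illustrative; the actual fix is the C switch statement in the patch downthread): PostgreSQL's WINnnn names are unknown to Python's codec registry, which calls the same Windows code pages "cpnnn".

```python
# PostgreSQL server-encoding name -> Python codec name, for the encodings
# where the two disagree. All other names pass through unchanged.
PG_TO_PYTHON = {"WIN%d" % n: "cp%d" % n
                for n in (866, 874, 1250, 1251, 1252, 1253,
                          1254, 1255, 1256, 1257, 1258)}
PG_TO_PYTHON["SQL_ASCII"] = "ascii"

def server_codec(pg_encoding_name):
    """Map a PostgreSQL encoding name to one Python's codecs accept."""
    return PG_TO_PYTHON.get(pg_encoding_name, pg_encoding_name)
```

Handing `"WIN1252"` straight to `str.encode` raises LookupError (the crash trigger in this thread), while the mapped `"cp1252"` works.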
Re: [HACKERS] plpython issue with Win64 (PG 9.2)
On 27/06/12 11:51, Asif Naeem wrote: Hi, On Windows 7 64bit, plpython is causing server crash with the following test case i.e. CREATE PROCEDURAL LANGUAGE 'plpython3u'; CREATE OR REPLACE FUNCTION pymax (a integer, b integer) RETURNS integer AS $$ if a > b: return a return b $$ LANGUAGE plpython3u; SELECT pymax(1, 2); I think the primary reason that triggers this issue is that when the function PLyUnicode_Bytes() calls PyUnicode_AsEncodedString( , "WIN1252" /* server encoding */, ), it fails with null. I built latest pg 9.2 source code with python 3.2.2.3 by using Visual Studio 2010. Thanks. I'll try to reproduce this on Linux, which should be possible given the results of your investigation. Cheers, Jan
Re: [HACKERS] PL/Python result set slicing broken in Python 3
On 10/05/12 19:45, Peter Eisentraut wrote: On lör, 2012-05-05 at 22:45 +0200, Jan Urbański wrote: Apparently once you implement PyMappingMethods.mp_subscript you can drop PySequenceMethods.sq_slice, but I guess there's no harm in keeping it (and I'm not sure it'd work on Python 2.3 with only mp_subscript implemented). Committed this now. From test coverage reports, I now see that PLy_result_ass_item() is no longer called. That's probably OK, if assignments are now handled through the mapping methods. But should we remove the function then? Have you tried on Python 2.3 as well? People on #python said that if you implement the mapping functions, the sequence slicing functions are no longer used, but maybe we should revisit for the next release, rather than risk introducing a regression for the benefit of removing a few dead lines. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
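What `mp_subscript`/`mp_ass_subscript` amount to at the Python level can be sketched as a single `__getitem__`/`__setitem__` pair that receives either an int or a slice object and proxies both to a wrapped list — which is the shape of the committed fix, where the result object forwards to its list of row dictionaries. `ResultProxy` below is an illustrative stand-in, not the actual PLyResult type.

```python
class ResultProxy:
    """Mimics the mapping-protocol approach of the committed patch."""

    def __init__(self, rows):
        self.rows = rows  # list of row dicts, like PLyResultObject->rows

    def __getitem__(self, item):
        # item is either an int index or a slice object; the wrapped
        # list handles both, so no separate sq_slice-style hook is needed.
        return self.rows[item]

    def __setitem__(self, item, value):
        # mirrors mp_ass_subscript: assignment with int or slice keys
        self.rows[item] = value
```

This is why implementing the mapping protocol makes the old sequence-slicing hooks redundant: Python 3 routes `rv[0:1]` through `__getitem__` with a slice object.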
Re: [HACKERS] PL/Python result set slicing broken in Python 3
On 04/05/12 20:00, Jan Urbański wrote: On 03/05/12 11:04, Jan Urbański wrote: On 02/05/12 20:18, Peter Eisentraut wrote: This doesn't work anymore with Python 3: rv = plpy.execute(...) do_something(rv[0:1]) Sounds ugly. I'll take a look. I found some instructions on how to deal with the Python 2/Python 3 slicing mess: http://renesd.blogspot.com/2009/07/python3-c-api-simple-slicing-sqslice.html Thanks to the helpful folk at #python I found out that the fix is much easier. Attached is a patch that fixes the bug and passes regression tests on Pythons 2.3 through 3.2. Apparently once you implement PyMappingMethods.mp_subscript you can drop PySequenceMethods.sq_slice, but I guess there's no harm in keeping it (and I'm not sure it'd work on Python 2.3 with only mp_subscript implemented). Do we want to backpatch this? If so, I'd need to produce a version that applies to the monolithic plpython.c file from the previous releases. Cheers, Jan From 000a1285d66c65c36ae6fa064266f00def5ee9d7 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jan=20Urba=C5=84ski?= wulc...@wulczer.org Date: Sat, 5 May 2012 22:39:26 +0200 Subject: [PATCH] Fix slicing support for result objects for Python 3. The old way of implementing slicing support by implementing PySequenceMethods.sq_slice has been deprecated in Python 3; you should now implement PyMappingMethods.mp_subscript. Do this by simply proxying the call to the wrapped list of result dictionaries. While at it, fix an incorrect comment about PLyResultObject->rows being None if the result set is empty (it actually is an empty list in that case).
--- src/pl/plpython/expected/plpython_spi.out | 55 + src/pl/plpython/plpy_resultobject.c | 27 +- src/pl/plpython/plpy_resultobject.h |2 +- src/pl/plpython/sql/plpython_spi.sql | 36 +++ 4 files changed, 118 insertions(+), 2 deletions(-) diff --git a/src/pl/plpython/expected/plpython_spi.out b/src/pl/plpython/expected/plpython_spi.out index 671c24e..54ef512 100644 --- a/src/pl/plpython/expected/plpython_spi.out +++ b/src/pl/plpython/expected/plpython_spi.out @@ -228,6 +228,61 @@ SELECT result_len_test($$UPDATE foo3 SET b= '' WHERE a = 2$$); 0 (1 row) +CREATE FUNCTION result_subscript_test() RETURNS void +AS $$ +result = plpy.execute("SELECT 1 AS c UNION SELECT 2 " + "UNION SELECT 3 UNION SELECT 4") + +plpy.info(result[1]['c']) +plpy.info(result[-1]['c']) + +plpy.info([item['c'] for item in result[1:3]]) +plpy.info([item['c'] for item in result[::2]]) + +result[-1] = {'c': 1000} +result[:2] = [{'c': 10}, {'c': 100}] +plpy.info([item['c'] for item in result[:]]) + +# raises TypeError, but the message differs on Python 2.6, so silence it +try: +plpy.info(result['foo']) +except TypeError: +pass +else: +assert False, "TypeError not raised" + +$$ LANGUAGE plpythonu; +SELECT result_subscript_test(); +INFO: 2 +CONTEXT: PL/Python function "result_subscript_test" +INFO: 4 +CONTEXT: PL/Python function "result_subscript_test" +INFO: [2, 3] +CONTEXT: PL/Python function "result_subscript_test" +INFO: [1, 3] +CONTEXT: PL/Python function "result_subscript_test" +INFO: [10, 100, 3, 1000] +CONTEXT: PL/Python function "result_subscript_test" + result_subscript_test +--- + +(1 row) + +CREATE FUNCTION result_empty_test() RETURNS void +AS $$ +result = plpy.execute("select 1 where false") + +plpy.info(result[:]) + +$$ LANGUAGE plpythonu; +SELECT result_empty_test(); +INFO: [] +CONTEXT: PL/Python function "result_empty_test" + result_empty_test +--- + +(1 row) + -- cursor objects CREATE FUNCTION simple_cursor_test() RETURNS int AS $$ res = plpy.cursor("select fname, lname from users") diff --git
a/src/pl/plpython/plpy_resultobject.c b/src/pl/plpython/plpy_resultobject.c index fcf8074..06ba2ee 100644 --- a/src/pl/plpython/plpy_resultobject.c +++ b/src/pl/plpython/plpy_resultobject.c @@ -23,6 +23,9 @@ static PyObject *PLy_result_item(PyObject *arg, Py_ssize_t idx); static PyObject *PLy_result_slice(PyObject *arg, Py_ssize_t lidx, Py_ssize_t hidx); static int PLy_result_ass_item(PyObject *arg, Py_ssize_t idx, PyObject *item); static int PLy_result_ass_slice(PyObject *arg, Py_ssize_t lidx, Py_ssize_t hidx, PyObject *slice); +static PyObject *PLy_result_subscript(PyObject *arg, PyObject *item); +static int PLy_result_ass_subscript(PyObject *self, PyObject *item, PyObject *value); static char PLy_result_doc[] = { "Results of a PostgreSQL query" @@ -38,6 +41,12 @@ static PySequenceMethods PLy_result_as_sequence = { PLy_result_ass_slice, /* sq_ass_slice */ }; +static PyMappingMethods PLy_result_as_mapping = { + PLy_result_length, /* mp_length */ + PLy_result_subscript, /* mp_subscript */ + PLy_result_ass_subscript, /* mp_ass_subscript */ +}; + static PyMethodDef PLy_result_methods[] = { {colnames
Re: [HACKERS] PL/Python result set slicing broken in Python 3
On 03/05/12 11:04, Jan Urbański wrote: On 02/05/12 20:18, Peter Eisentraut wrote: This doesn't work anymore with Python 3: rv = plpy.execute(...) do_something(rv[0:1]) Apparently, they changed the C API for doing slicing, or rather made one of the two APIs for it silently do nothing. Details are difficult to find, but this email message seems to contain something: http://mail.python.org/pipermail/python-3000/2007-August/009851.html. I'll try to sort this out sometime, but if someone wants to take a shot at it, go ahead. Sounds ugly. I'll take a look. I found some instructions on how to deal with the Python 2/Python 3 slicing mess: http://renesd.blogspot.com/2009/07/python3-c-api-simple-slicing-sqslice.html Apparently you need that egregious hack in order to avoid code duplication. I'll try to produce a patch over the weekend. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PL/Python result set slicing broken in Python 3
On 02/05/12 20:18, Peter Eisentraut wrote: This doesn't work anymore with Python 3: rv = plpy.execute(...) do_something(rv[0:1]) Apparently, they changed the C API for doing slicing, or rather made one of the two APIs for it silently do nothing. Details are difficult to find, but this email message seems to contain something: http://mail.python.org/pipermail/python-3000/2007-August/009851.html. I'll try to sort this out sometime, but if someone wants to take a shot at it, go ahead. Sounds ugly. I'll take a look. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython triggers are broken for composite-type columns
On 10/04/12 21:47, Jan Urbański wrote: On 10/04/12 21:27, Tom Lane wrote: =?UTF-8?B?SmFuIFVyYmHFhHNraQ==?=wulc...@wulczer.org writes: Yes, that would be ideal, even though not backwards-compatible. Back-patching is out of the question, but do we want to change trigger functions to receive dictionaries in NEW? Hm, I was not thinking of this as being trigger-specific, but more a general principle that composite columns of tuples ought to be handled in a recursive fashion. Sure, that would be the way. If so, should this be 9.2 material, or just a TODO? If it can be done quickly and with not much risk, I'd vote for squeezing it into 9.2, because it seems to me to be a clear bug that the two directions are not handled consistently. If you don't have time for it now or you don't think it would be a small/safe patch, we'd better just put it on TODO. It turned out not to be as straightforward as I thought :( The I/O code in PL/Python is a bit of a mess, and that's something I'd like to address somewhere in the 9.3 development cycle. Right now making the conversion function recursive is not possible without some deep surgery (or kludgery...) so I limited myself to producing regression-fixing patches for 9.2 and 9.1 (attached). Cheers, Jan From 513e8c484f32599a8753035fad638c8712339480 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jan=20Urba=C5=84ski?= wulc...@wulczer.org Date: Mon, 23 Apr 2012 02:07:46 +0200 Subject: [PATCH] Accept strings in PL/Python functions returning composite types. Before 9.1, PL/Python functions returning composite types could return a string and it would be parsed using record_in. The 9.1 changes made PL/Python only expect dictionaries, tuples or objects supporting getattr as output of composite functions, resulting in a regression and a confusing error message, as the strings were interpreted as sequences and the code for transforming lists into Postgres tuples was being used.
The reason why it's important is that trigger functions on tables with composite columns get the composite row passed in as a string (from record_out). This makes it impossible to implement passthrough behaviour for these columns, as PL/Python no longer accepts strings for composite values. A better solution would be to fix the code that transforms composite inputs into Python objects to produce dictionaries that would then be correctly interpreted by the Python-to-Postgres counterpart code. This would be too invasive to backpatch in 9.1 and it is too late in the 9.2 cycle to do it in HEAD. It should get revisited in 9.3, though. Reported as bug #6559 by Kirill Simonov. --- src/pl/plpython/expected/plpython_record.out | 16 src/pl/plpython/expected/plpython_trigger.out | 37 + src/pl/plpython/plpython.c| 94 --- src/pl/plpython/regression.diffs | 99 + src/pl/plpython/regression.out| 20 + src/pl/plpython/sql/plpython_record.sql | 10 +++ src/pl/plpython/sql/plpython_trigger.sql | 36 + 7 files changed, 268 insertions(+), 44 deletions(-) create mode 100644 src/pl/plpython/regression.diffs create mode 100644 src/pl/plpython/regression.out diff --git a/src/pl/plpython/expected/plpython_record.out b/src/pl/plpython/expected/plpython_record.out index 0bcc46c..4583307 100644 --- a/src/pl/plpython/expected/plpython_record.out +++ b/src/pl/plpython/expected/plpython_record.out @@ -38,6 +38,8 @@ elif typ == 'obj': type_record.first = first type_record.second = second return type_record +elif typ == 'str': + return "('%s',%r)" % (first, second) $$ LANGUAGE plpythonu; CREATE FUNCTION test_in_out_params(first in text, second out text) AS $$ return first + '_in_to_out'; @@ -290,6 +292,12 @@ SELECT * FROM test_type_record_as('obj', null, null, true); | (1 row) +SELECT * FROM test_type_record_as('str', 'one', 1, false); + first | second +---+ + 'one' | 1 +(1 row) + SELECT * FROM test_in_out_params('test_in'); second --- @@ -355,3 +363,11 @@ ERROR: attribute "second" does not exist in Python
object HINT: To return null in a column, let the returned object have an attribute named after column with value None. CONTEXT: while creating return value PL/Python function test_type_record_error3 +CREATE FUNCTION test_type_record_error4() RETURNS type_record AS $$ +return 'foo' +$$ LANGUAGE plpythonu; +SELECT * FROM test_type_record_error4(); +ERROR: malformed record literal: foo +DETAIL: Missing left parenthesis. +CONTEXT: while creating return value +PL/Python function test_type_record_error4 diff --git a/src/pl/plpython/expected/plpython_trigger.out b/src/pl/plpython/expected/plpython_trigger.out index 2ef66a8..ea4d500 100644 --- a/src/pl/plpython/expected/plpython_trigger.out +++ b/src/pl/plpython/expected/plpython_trigger.out @@ -567,3 +567,40 @@ SELECT * FROM composite_trigger_test; (3,f
Re: [HACKERS] plpython triggers are broken for composite-type columns
On 10/04/12 07:35, Jan Urbański wrote: On 10/04/12 04:20, Tom Lane wrote: Don't know if anybody noticed bug #6559 http://archives.postgresql.org/pgsql-bugs/2012-03/msg00180.php I've confirmed that the given test case works in 9.0 but fails in 9.1 and HEAD. So, I know what's going on, I still don't know what's the best way to handle it. The function that converts Python objects to PG data checks what type it's supposed to produce and acts accordingly. In 9.0 it checked for bool, bytea and arrays, in 9.1 it also takes composite types into account. This has been done to support functions returning composite types - to do that they need to return a dictionary or a list, for instance {'col1': 1, 'col2': 2}. The problem is that the routine that converts PG data into Python objects does not handle composite type inputs all that well - it just bails and returns the string representation, hence '(3)' appearing in Python land. Now previously, the Python-PG function did not see that the given conversion is supposed to return a composite so it also bailed and used a default text-composite conversion, so '(3)' was converted to ROW(3) and all went well. The new code tries to treat what it gets as a dictionary/list/tuple and fails in a more or less random way. Now that I understand what's been going on, I'll try to think of a non-invasive way of fixing that... Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
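The "more or less random" failure mode described above has a one-line Python explanation: the textual composite value `'(3)'` produced by record_out satisfies the sequence protocol, so conversion code that only checks for a sequence happily iterates over its characters instead of treating it as one record literal. A minimal illustration:

```python
# The composite row arrives from record_out as its text form.
value = "(3)"

# A str is itself a sequence, so sequence-based conversion "succeeds"
# at splitting it into three pseudo-columns rather than one value.
columns = list(value)   # ['(', '3', ')'] -- not the single column ROW(3) has
```

This is exactly why the eventual fix has to distinguish strings from other sequences before applying the list-to-tuple conversion path.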
Re: [HACKERS] plpython triggers are broken for composite-type columns
On 10/04/12 20:47, Tom Lane wrote: I wrote: =?UTF-8?B?SmFuIFVyYmHFhHNraQ==?=wulc...@wulczer.org writes: Now that I understand what's been going on, I'll try to think of a non-invasive way of fixing that... ISTM that conversion of a composite value to Python ought to produce a dict, now that the other direction expects a dict. I can see that this is probably infeasible for compatibility reasons in 9.1, but it's not too late to fix it for 9.2. We might have to leave the bug unfixed in 9.1, since anything we do about it will represent a compatibility break. On reflection, can't we fix this as follows: if the value coming in from Python is a string, just feed it to record_in, the same as we used to. When I traced through the logic before, it seemed like it was failing to distinguish strings from sequences, but I would hope that Python is more strongly typed than that. Yeah, we can fix PLyObject_ToTuple to check for strings too and use the default PG input function. The reason it was complaining about length is that we're checking whether the object passed implements the sequence protocol, which Python strings do (length, iteration, etc). Sticking an if branch that catches the string case above that should be sufficient. I still think the conversion in the other direction ought to yield a dict, but that's clearly not back-patch material. Yes, that would be ideal, even though not backwards-compatible. Back-patching is out of the question, but do we want to change trigger functions to receive dictionaries in NEW? If so, should this be 9.2 material, or just a TODO?
Re: [HACKERS] plpython triggers are broken for composite-type columns
On 10/04/12 21:27, Tom Lane wrote: =?UTF-8?B?SmFuIFVyYmHFhHNraQ==?=wulc...@wulczer.org writes: Yes, that would be ideal, even though not backwards-compatible. Back-patching is out of the question, but do we want to change trigger functions to receive dictionaries in NEW? Hm, I was not thinking of this as being trigger-specific, but more a general principle that composite columns of tuples ought to be handled in a recursive fashion. Sure, that would be the way. If so, should this be 9.2 material, or just a TODO? If it can be done quickly and with not much risk, I'd vote for squeezing it into 9.2, because it seems to me to be a clear bug that the two directions are not handled consistently. If you don't have time for it now or you don't think it would be a small/safe patch, we'd better just put it on TODO. I'll see if making the conversion function recursive is easy and independently whip up a patch to check for strings and route them through InputFunctionCall, for back-patching purposes. Cheers, Jan
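The "check for strings first" fix being discussed is a small reordering of type checks. A Python sketch of the idea (illustrative only — the real fix lives in PLyObject_ToTuple in C, and `parse_record` here is a hypothetical stand-in for routing through record_in / InputFunctionCall):

```python
def to_tuple(obj, parse_record):
    """Convert a PL/Python return value into column values.

    Strings must be intercepted *before* the generic sequence check,
    because str satisfies the sequence protocol (length, iteration)
    and would otherwise be split into characters.
    """
    if isinstance(obj, str):
        return parse_record(obj)   # hand the record literal to the server
    return tuple(obj)              # genuine sequences map to columns
```

With the isinstance check in front, `'(3)'` reaches the record parser intact, while lists and tuples keep their per-column behavior.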
Re: [HACKERS] plpython triggers are broken for composite-type columns
On 10/04/12 04:20, Tom Lane wrote: Don't know if anybody noticed bug #6559 http://archives.postgresql.org/pgsql-bugs/2012-03/msg00180.php I've confirmed that the given test case works in 9.0 but fails in 9.1 and HEAD. I find this code pretty unreadable, though, and know nothing to speak of about the Python side of things anyhow. So somebody else had better pick this up. I'll look into that. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Potential reference miscounts and segfaults in plpython.c
On 21/02/12 18:28, Jan Urbański wrote: On 21/02/12 18:05, Peter Eisentraut wrote: it might be better to use ereport, to expose the message for translation. After giving it some thought, some of these elogs could be changed into PLy_elogs (which is meant to propagate a Python error into Postgres) and the others made into ereports. I'll send updated patches this evening (CET). Inevitably, this turned into the next morning CET. Here are the updated patches, which use PLy_elog instead of plain elog. The difference is that they will get marked for translation and that the original Python exception will show up in the errdetail field. Cheers, Jan diff --git a/src/pl/plpython/plpy_elog.c b/src/pl/plpython/plpy_elog.c index 741980c..332898d 100644 *** a/src/pl/plpython/plpy_elog.c --- b/src/pl/plpython/plpy_elog.c *** get_source_line(const char *src, int lin *** 365,370 --- 365,374 const char *next = src; int current = 0; + /* sanity check */ + if (lineno <= 0) + return NULL; + while (current < lineno) { s = next; diff --git a/src/pl/plpython/plpy_main.c b/src/pl/plpython/plpy_main.c index ae9d87e..f259ecf 100644 *** a/src/pl/plpython/plpy_main.c --- b/src/pl/plpython/plpy_main.c *** PLy_init_interp(void) *** 133,138 --- 133,140 Py_INCREF(mainmod); PLy_interp_globals = PyModule_GetDict(mainmod); PLy_interp_safe_globals = PyDict_New(); + if (PLy_interp_safe_globals == NULL) + PLy_elog(ERROR, "could not create globals"); PyDict_SetItemString(PLy_interp_globals, "GD", PLy_interp_safe_globals); Py_DECREF(mainmod); if (PLy_interp_globals == NULL || PyErr_Occurred()) diff --git a/src/pl/plpython/plpy_plpymodule.c b/src/pl/plpython/plpy_plpymodule.c index d2d0a2a..0c96f03 100644 *** a/src/pl/plpython/plpy_plpymodule.c --- b/src/pl/plpython/plpy_plpymodule.c *** PLy_init_plpy(void) *** 173,181 main_mod = PyImport_AddModule("__main__"); main_dict = PyModule_GetDict(main_mod); plpy_mod = PyImport_AddModule("plpy"); PyDict_SetItemString(main_dict, "plpy", plpy_mod); if (PyErr_Occurred()) !
elog(ERROR, "could not initialize plpy"); } static void --- 173,183 main_mod = PyImport_AddModule("__main__"); main_dict = PyModule_GetDict(main_mod); plpy_mod = PyImport_AddModule("plpy"); + if (plpy_mod == NULL) + PLy_elog(ERROR, "could not initialize plpy"); PyDict_SetItemString(main_dict, "plpy", plpy_mod); if (PyErr_Occurred()) ! PLy_elog(ERROR, "could not initialize plpy"); } static void *** PLy_add_exceptions(PyObject *plpy) *** 208,213 --- 210,220 PLy_exc_fatal = PyErr_NewException("plpy.Fatal", NULL, NULL); PLy_exc_spi_error = PyErr_NewException("plpy.SPIError", NULL, NULL); + if (PLy_exc_error == NULL || + PLy_exc_fatal == NULL || + PLy_exc_spi_error == NULL) + PLy_elog(ERROR, "could not create the base SPI exceptions"); + Py_INCREF(PLy_exc_error); PyModule_AddObject(plpy, "Error", PLy_exc_error); Py_INCREF(PLy_exc_fatal); *** PLy_generate_spi_exceptions(PyObject *mo *** 241,247 --- 248,260 PyObject *sqlstate; PyObject *dict = PyDict_New(); + if (dict == NULL) + PLy_elog(ERROR, "could not generate SPI exceptions"); + sqlstate = PyString_FromString(unpack_sql_state(exception_map[i].sqlstate)); + if (sqlstate == NULL) + PLy_elog(ERROR, "could not generate SPI exceptions"); + PyDict_SetItemString(dict, "sqlstate", sqlstate); Py_DECREF(sqlstate); exc = PyErr_NewException(exception_map[i].name, base, dict); *** PLy_output(volatile int level, PyObject *** 369,376 * decoration. */ PyObject *o; - PyArg_UnpackTuple(args, "plpy.elog", 1, 1, &o); so = PyObject_Str(o); } else --- 382,393 * decoration. */ PyObject *o; + int result; + + result = PyArg_UnpackTuple(args, "plpy.elog", 1, 1, &o); + if (!result) + PLy_elog(ERROR, "could not unpack arguments in plpy.elog"); so = PyObject_Str(o); } else diff --git a/src/pl/plpython/plpy_spi.c b/src/pl/plpython/plpy_spi.c index 0d63c4f..20b93cd 100644 *** a/src/pl/plpython/plpy_spi.c --- b/src/pl/plpython/plpy_spi.c *** PLy_spi_execute_query(char *query, long *** 338,344 int rv; volatile MemoryContext oldcontext; volatile ResourceOwner oldowner; ! 
PyObject *ret; oldcontext = CurrentMemoryContext; oldowner = CurrentResourceOwner; --- 338,344 int rv; volatile MemoryContext oldcontext; volatile ResourceOwner oldowner; ! PyObject *ret = NULL; oldcontext = CurrentMemoryContext; oldowner = CurrentResourceOwner; *** PLy_spi_execute_query(char *query, long *** 362,367 --- 362,368 if (rv < 0) { + Py_XDECREF(ret); PLy_exception_set(PLy_exc_spi_error, "SPI_execute failed: %s", SPI_result_code_string(rv)); diff --git a/src/pl/plpython/plpy_typeio.c b/src/pl
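The Py_XDECREF(ret) hunk above fixes a reference leak: a result object was created, but on the SPI error path the reference held in ret was never released. A hypothetical Python-level sketch of the same bookkeeping (names invented for illustration, not from the patch), using sys.getrefcount to make the leaked reference visible:

```python
import sys

cache = []

def leaky(obj):
    """Simulates the buggy path: a reference is stashed (analogous to
    holding "ret") and never dropped before bailing out on an error."""
    cache.append(obj)  # takes a reference that is never released
    raise RuntimeError("SPI_execute failed")

def fixed(obj):
    """Simulates the patched path: the held reference is dropped
    (analogous to Py_XDECREF(ret)) before the error propagates."""
    holder = [obj]
    try:
        raise RuntimeError("SPI_execute failed")
    finally:
        holder.clear()  # release our reference before the error exit

marker = object()
baseline = sys.getrefcount(marker)

try:
    leaky(marker)
except RuntimeError:
    pass
after_leaky = sys.getrefcount(marker)   # one reference higher than baseline

try:
    fixed(marker)
except RuntimeError:
    pass
after_fixed = sys.getrefcount(marker)   # unchanged: no further leak
```

In CPython the C-level equivalent is exactly this discipline: every owned reference must be released on every exit path, including error exits.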
Re: [HACKERS] Potential reference miscounts and segfaults in plpython.c
On 21/02/12 18:05, Peter Eisentraut wrote: On sön, 2012-02-19 at 22:29 -0500, Tom Lane wrote: My only comment is whether elog(ERROR) is appropriate, ie, do we consider these to be internal errors that users will never see in practice? If there's a significant risk of the error being thrown in the field, it might be better to use ereport, to expose the message for translation. I find the wording of the error messages a bit inappropriate. For example, list = PyList_New(length); + if (list == NULL) + elog(ERROR, "could not transform Python list to array"); The error is not about the transforming, it's about creating a new list. Well, what I tried to convey here was that the process of transforming a Python list to a Postgres array failed. Which, by the way, is wrong since this function transforms a Postgres array to a Python list... After giving it some thought some of these elogs could be changed into PLy_elogs (which is meant to propagate a Python error into Postgres) and the others made into ereports. I'll send updated patches this evening (CET). Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Potential reference miscounts and segfaults in plpython.c
On 20/02/12 04:29, Tom Lane wrote: Jan Urbański <wulc...@wulczer.org> writes: On 18/02/12 21:17, Tom Lane wrote: Dave Malcolm at Red Hat has been working on a static code analysis tool for Python-related C code. He reports here on some preliminary results for plpython.c: https://bugzilla.redhat.com/show_bug.cgi?id=795011 Here's a patch that fixes everything I was sure was an actual bug. The rest of the warnings seem to be caused by the tool not knowing that elog(ERROR) throws a longjmp and things like we never unref this object, so it can't disappear mid-execution. My only comment is whether elog(ERROR) is appropriate, ie, do we consider these to be internal errors that users will never see in practice? AFAICS these errors can only happen on out of memory conditions or other internal errors (like trying to create a list with a negative length). Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Potential reference miscounts and segfaults in plpython.c
On 18/02/12 21:18, Jan Urbański wrote: On 18/02/12 21:17, Tom Lane wrote: Jan Urbański <wulc...@wulczer.org> writes: On 18/02/12 20:30, Tom Lane wrote: Dave Malcolm at Red Hat has been working on a static code analysis tool for Python-related C code. He reports here on some preliminary results for plpython.c: https://bugzilla.redhat.com/show_bug.cgi?id=795011 If you find any live bugs, it'd likely be better to deal with them as a separate patch so that we can back-patch ... Sure, I meant to say I'll look at these as well, but will make them into a separate patch. Here's a patch that fixes everything I was sure was an actual bug. The rest of the warnings seem to be caused by the tool not knowing that elog(ERROR) throws a longjmp and things like we never unref this object, so it can't disappear mid-execution. Attached are patches for HEAD and for 9.1.x (since splitting plpython.c in 9.2 was kind of my idea I felt bad about you having to back-patch this so I tried to do the necessary legwork myself; I hope the attached is what you need). BTW, that tool is quite handy, I'll have to try running it over psycopg2. Cheers, Jan diff --git a/src/pl/plpython/plpython.c b/src/pl/plpython/plpython.c index 530b5f0..2b064c5 100644 *** a/src/pl/plpython/plpython.c --- b/src/pl/plpython/plpython.c *** PLyList_FromArray(PLyDatumToOb *arg, Dat *** 2345,2350 --- 2345,2352 length = ARR_DIMS(array)[0]; lbound = ARR_LBOUND(array)[0]; list = PyList_New(length); + if (list == NULL) + elog(ERROR, "could not transform Python list to array"); for (i = 0; i < length; i++) { *** PLy_spi_execute_query(char *query, long *** 3664,3670 int rv; volatile MemoryContext oldcontext; volatile ResourceOwner oldowner; ! 
PyObject *ret = NULL; oldcontext = CurrentMemoryContext; oldowner = CurrentResourceOwner; *** PLy_spi_execute_query(char *query, long *** 3727,3732 --- 3729,3735 if (rv < 0) { + Py_XDECREF(ret); PLy_exception_set(PLy_exc_spi_error, "SPI_execute failed: %s", SPI_result_code_string(rv)); *** PLy_generate_spi_exceptions(PyObject *mo *** 3967,3973 --- 3970,3982 PyObject *sqlstate; PyObject *dict = PyDict_New(); + if (dict == NULL) + elog(ERROR, "could not generate SPI exceptions"); + sqlstate = PyString_FromString(unpack_sql_state(exception_map[i].sqlstate)); + if (sqlstate == NULL) + elog(ERROR, "could not generate SPI exceptions"); + PyDict_SetItemString(dict, "sqlstate", sqlstate); Py_DECREF(sqlstate); exc = PyErr_NewException(exception_map[i].name, base, dict); *** PLy_add_exceptions(PyObject *plpy) *** 4008,4013 --- 4017,4027 PLy_exc_fatal = PyErr_NewException("plpy.Fatal", NULL, NULL); PLy_exc_spi_error = PyErr_NewException("plpy.SPIError", NULL, NULL); + if (PLy_exc_error == NULL || + PLy_exc_fatal == NULL || + PLy_exc_spi_error == NULL) + elog(ERROR, "could not create the base SPI exceptions"); + Py_INCREF(PLy_exc_error); PyModule_AddObject(plpy, "Error", PLy_exc_error); Py_INCREF(PLy_exc_fatal); *** PLy_init_interp(void) *** 4124,4129 --- 4138,4145 Py_INCREF(mainmod); PLy_interp_globals = PyModule_GetDict(mainmod); PLy_interp_safe_globals = PyDict_New(); + if (PLy_interp_safe_globals == NULL) + PLy_elog(ERROR, "could not create globals"); PyDict_SetItemString(PLy_interp_globals, "GD", PLy_interp_safe_globals); Py_DECREF(mainmod); if (PLy_interp_globals == NULL || PyErr_Occurred()) *** PLy_init_plpy(void) *** 4164,4169 --- 4180,4187 main_mod = PyImport_AddModule("__main__"); main_dict = PyModule_GetDict(main_mod); plpy_mod = PyImport_AddModule("plpy"); + if (plpy_mod == NULL) + elog(ERROR, "could not initialize plpy"); PyDict_SetItemString(main_dict, "plpy", plpy_mod); if (PyErr_Occurred()) elog(ERROR, "could not initialize plpy"); *** PLy_output(volatile int level, PyObject *** 4231,4238 * 
decoration. */ PyObject *o; - PyArg_UnpackTuple(args, "plpy.elog", 1, 1, &o); so = PyObject_Str(o); } else --- 4249,4260 * decoration. */ PyObject *o; + int result; + + result = PyArg_UnpackTuple(args, "plpy.elog", 1, 1, &o); + if (!result) + elog(ERROR, "could not unpack arguments in plpy.elog"); so = PyObject_Str(o); } else *** get_source_line(const char *src, int lin *** 4554,4559 --- 4576,4585 const char *next = src; int current = 0; + /* sanity check */ + if (lineno <= 0) + return NULL; + while (current < lineno) { s = next; diff --git a/src/pl/plpython/plpy_elog.c b/src
Re: [HACKERS] pl/python long-lived allocations in datum-dict transformation
On 14/02/12 01:35, Tom Lane wrote: Jan Urbański <wulc...@wulczer.org> writes: It's not very comfortable, but I think PLyDict_FromTuple can be allowed to be non-reentrant. I think that's pretty short-sighted. Even if it's safe today (which I am not 100% convinced of), there are plenty of foreseeable reasons why it might^Wwill break in the future. OTOH if we want to make it reentrant, some more tinkering would be in order. I think that's in order. Here are the results of the tinkering. I came up with a stack of context structures that gets pushed when a PL/Python function starts being executed and popped when it returns. At first they contained just a scratch memory context used by PLyDict_FromTuple. Then under the premise of confirming the usefulness of introducing such contexts I removed the global PLy_curr_procedure variable and changed all users to get the current procedure from the context. It seems to have worked, so the total count of global variables is unchanged - hooray! While testing I found one more leak, this time caused by allocating a structure for caching array type I/O functions and never freeing it. Attached as separate patch. Cheers, Jan diff --git a/src/pl/plpython/plpy_cursorobject.c b/src/pl/plpython/plpy_cursorobject.c index 4226dc7..46930b0 100644 *** a/src/pl/plpython/plpy_cursorobject.c --- b/src/pl/plpython/plpy_cursorobject.c *** *** 14,19 --- 14,20 #include "plpy_cursorobject.h" #include "plpy_elog.h" + #include "plpy_main.h" #include "plpy_planobject.h" #include "plpy_procedure.h" #include "plpy_resultobject.h" *** PLy_cursor_query(const char *query) *** 121,126 --- 122,128 { SPIPlanPtr plan; Portal portal; + PLyExecutionContext *exec_ctx = PLy_current_execution_context(); pg_verifymbstr(query, strlen(query), false); *** PLy_cursor_query(const char *query) *** 129,136 elog(ERROR, "SPI_prepare failed: %s", SPI_result_code_string(SPI_result)); portal = SPI_cursor_open(NULL, plan, NULL, NULL, ! 
PLy_curr_procedure->fn_readonly); SPI_freeplan(plan); if (portal == NULL) --- 131,140 elog(ERROR, "SPI_prepare failed: %s", SPI_result_code_string(SPI_result)); + Assert(exec_ctx->curr_proc != NULL); + portal = SPI_cursor_open(NULL, plan, NULL, NULL, ! exec_ctx->curr_proc->fn_readonly); SPI_freeplan(plan); if (portal == NULL) *** PLy_cursor_plan(PyObject *ob, PyObject * *** 210,215 --- 214,220 Portal portal; char *volatile nulls; volatile int j; + PLyExecutionContext *exec_ctx = PLy_current_execution_context(); if (nargs > 0) nulls = palloc(nargs * sizeof(char)); *** PLy_cursor_plan(PyObject *ob, PyObject * *** 252,259 } } portal = SPI_cursor_open(NULL, plan->plan, plan->values, nulls, ! PLy_curr_procedure->fn_readonly); if (portal == NULL) elog(ERROR, "SPI_cursor_open() failed: %s", SPI_result_code_string(SPI_result)); --- 257,266 } } + Assert(exec_ctx->curr_proc != NULL); + portal = SPI_cursor_open(NULL, plan->plan, plan->values, nulls, ! exec_ctx->curr_proc->fn_readonly); if (portal == NULL) elog(ERROR, "SPI_cursor_open() failed: %s", SPI_result_code_string(SPI_result)); diff --git a/src/pl/plpython/plpy_elog.c b/src/pl/plpython/plpy_elog.c index 741980c..9909f23 100644 *** a/src/pl/plpython/plpy_elog.c --- b/src/pl/plpython/plpy_elog.c *** *** 12,17 --- 12,18 #include "plpy_elog.h" + #include "plpy_main.h" #include "plpy_procedure.h" *** PLy_traceback(char **xmsg, char **tbmsg, *** 260,265 --- 261,267 char *line; char *plain_filename; long plain_lineno; + PLyExecutionContext *exec_ctx = PLy_current_execution_context(); /* * The second frame points at the internal function, but to mimick *** PLy_traceback(char **xmsg, char **tbmsg, *** 270,276 else fname = PyString_AsString(name); ! proname = PLy_procedure_name(PLy_curr_procedure); plain_filename = PyString_AsString(filename); plain_lineno = PyInt_AsLong(lineno); --- 272,280 else fname = PyString_AsString(name); ! Assert(exec_ctx->curr_proc != NULL); ! ! 
proname = PLy_procedure_name(exec_ctx->curr_proc); plain_filename = PyString_AsString(filename); plain_lineno = PyInt_AsLong(lineno); *** PLy_traceback(char **xmsg, char **tbmsg, *** 287,293 * function code object was compiled with "<string>" as the * filename */ ! if (PLy_curr_procedure && plain_filename != NULL && strcmp(plain_filename, "<string>") == 0) { /* --- 291,297 * function code object was compiled with "<string>" as the * filename */ ! if (exec_ctx->curr_proc && plain_filename != NULL
Re: [HACKERS] Potential reference miscounts and segfaults in plpython.c
On 18/02/12 20:30, Tom Lane wrote: Dave Malcolm at Red Hat has been working on a static code analysis tool for Python-related C code. He reports here on some preliminary results for plpython.c: https://bugzilla.redhat.com/show_bug.cgi?id=795011 I'm not enough of a Python hacker to evaluate the significance of these issues, but somebody who does work on that code ought to take a look and see what we ought to patch. Very cool! Some of them look like legitimate bugs, some seem to stem from the fact that the tool does not know that PLy_elog(ERROR) does not return. I wonder if it could be taught that. I'll try to fix these while reworking the I/O functions memory leak patch I sent earlier. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Potential reference miscounts and segfaults in plpython.c
On 18/02/12 21:17, Tom Lane wrote: Jan Urbański <wulc...@wulczer.org> writes: On 18/02/12 20:30, Tom Lane wrote: Dave Malcolm at Red Hat has been working on a static code analysis tool for Python-related C code. He reports here on some preliminary results for plpython.c: https://bugzilla.redhat.com/show_bug.cgi?id=795011 I'll try to fix these while reworking the I/O functions memory leak patch I sent earlier. If you find any live bugs, it'd likely be better to deal with them as a separate patch so that we can back-patch ... Sure, I meant to say I'll look at these as well, but will make them into a separate patch. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pl/python long-lived allocations in datum-dict transformation
On 12/02/12 00:48, Tom Lane wrote: Jan Urbański <wulc...@wulczer.org> writes: This is annoying for functions that plough through large tables, doing some calculation. Attached is a patch that does the conversion of PostgreSQL Datums into Python dict objects in a scratch memory context that gets reset every time. As best I can tell, this patch proposes creating a new, separate context (chewing up 8KB+) for every plpython procedure that's ever used in a given session. This cure could easily be worse than the disease as far Yeah, that's not ideal. What's more, it's unclear that it won't malfunction altogether if the function is used recursively (ie, what if PLyDict_FromTuple ends up calling the same function again?) I was a bit worried about that, but the only place where PLyDict_FromTuple calls into some other code is when it calls the type's specific I/O function, which AFAICT can't call back into user code (except for user-defined C I/O routines). It's not very comfortable, but I think PLyDict_FromTuple can be allowed to be non-reentrant. Can't you fix it so that the temp context is associated with a particular function execution, rather than being statically allocated per-function? That would be cool, but I failed to easily get a handle on something that's like the execution context of a PL/Python function... Actually, if we assume that PLyDict_FromTuple (which is quite a low-level thing) never calls PL/Python UDFs we could keep a single memory context in top-level PL/Python memory and pay the overhead once in a session, not once per function. OTOH if we want to make it reentrant, some more tinkering would be in order. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pl/python long-lived allocations in datum-dict transformation
Consider this: create table arrays as select array[random(), random(), random(), random(), random(), random()] as a from generate_series(1, 100); create or replace function plpython_outputfunc() returns void as $$ c = plpy.cursor('select a from arrays') for row in c: pass $$ language plpythonu; When running the function, every datum will get transformed into a Python dict, which includes calling the type's output function, resulting in a memory allocation. The memory is allocated in the SPI context, so it accumulates until the function is finished. This is annoying for functions that plough through large tables, doing some calculation. Attached is a patch that does the conversion of PostgreSQL Datums into Python dict objects in a scratch memory context that gets reset every time. Cheers, Jan diff --git a/src/pl/plpython/plpy_main.c b/src/pl/plpython/plpy_main.c index ae9d87e..79d7784 100644 *** a/src/pl/plpython/plpy_main.c --- b/src/pl/plpython/plpy_main.c *** *** 12,17 --- 12,18 #include "executor/spi.h" #include "miscadmin.h" #include "utils/guc.h" + #include "utils/memutils.h" #include "utils/syscache.h" #include "plpython.h" *** plpython_inline_handler(PG_FUNCTION_ARGS *** 268,273 --- 269,279 MemSet(&proc, 0, sizeof(PLyProcedure)); proc.pyname = PLy_strdup("__plpython_inline_block"); + proc.tmp_ctx = AllocSetContextCreate(TopMemoryContext, + "PL/Python temporary ctx", + ALLOCSET_DEFAULT_MINSIZE, + ALLOCSET_DEFAULT_INITSIZE, + ALLOCSET_DEFAULT_MAXSIZE); proc.result.out.d.typoid = VOIDOID; PG_TRY(); diff --git a/src/pl/plpython/plpy_procedure.c b/src/pl/plpython/plpy_procedure.c index 229966a..f539cec 100644 *** a/src/pl/plpython/plpy_procedure.c --- b/src/pl/plpython/plpy_procedure.c *** *** 12,17 --- 12,18 #include "catalog/pg_type.h" #include "utils/builtins.h" #include "utils/hsearch.h" + #include "utils/memutils.h" #include "utils/syscache.h" #include "plpython.h" *** PLy_procedure_create(HeapTuple procTup, *** 169,174 --- 170,180 proc->setof = NULL; proc->src = NULL; proc->argnames = 
NULL; + proc->tmp_ctx = AllocSetContextCreate(TopMemoryContext, + "PL/Python temporary ctx", + ALLOCSET_DEFAULT_MINSIZE, + ALLOCSET_DEFAULT_INITSIZE, + ALLOCSET_DEFAULT_MAXSIZE); PG_TRY(); { *** PLy_procedure_delete(PLyProcedure *proc) *** 411,416 --- 417,424 PLy_free(proc->src); if (proc->argnames) PLy_free(proc->argnames); + if (proc->tmp_ctx) + MemoryContextDelete(proc->tmp_ctx); } /* diff --git a/src/pl/plpython/plpy_procedure.h b/src/pl/plpython/plpy_procedure.h index e986c7e..5ee73fa 100644 *** a/src/pl/plpython/plpy_procedure.h --- b/src/pl/plpython/plpy_procedure.h *** typedef struct PLyProcedure *** 30,35 --- 30,36 PyObject *code; /* compiled procedure code */ PyObject *statics; /* data saved across calls, local scope */ PyObject *globals; /* data saved across calls, global scope */ + MemoryContext tmp_ctx; } PLyProcedure; /* the procedure cache entry */ diff --git a/src/pl/plpython/plpy_typeio.c b/src/pl/plpython/plpy_typeio.c index d5cac9f..6f0eb46 100644 *** a/src/pl/plpython/plpy_typeio.c --- b/src/pl/plpython/plpy_typeio.c *** *** 23,28 --- 23,29 #include "plpy_typeio.h" #include "plpy_elog.h" + #include "plpy_procedure.h" /* I/O function caching */ *** PLy_output_record_funcs(PLyTypeInfo *arg *** 258,268 Assert(arg->is_rowtype == 1); } PyObject * PLyDict_FromTuple(PLyTypeInfo *info, HeapTuple tuple, TupleDesc desc) { ! PyObject *volatile dict; ! int i; if (info->is_rowtype != 1) elog(ERROR, "PLyTypeInfo structure describes a datum"); --- 259,274 Assert(arg->is_rowtype == 1); } + /* + * Transform a tuple into a Python dict object. The transformation can result + * in memory allocation, so it's done in a scratch memory context. + */ PyObject * PLyDict_FromTuple(PLyTypeInfo *info, HeapTuple tuple, TupleDesc desc) { ! PyObject *volatile dict; ! MemoryContext oldcontext; ! 
int i; if (info->is_rowtype != 1) elog(ERROR, "PLyTypeInfo structure describes a datum"); *** PLyDict_FromTuple(PLyTypeInfo *info, Hea *** 271,276 --- 277,284 if (dict == NULL) PLy_elog(ERROR, "could not create new dictionary"); + oldcontext = MemoryContextSwitchTo(PLy_curr_procedure->tmp_ctx); + PG_TRY(); { for (i = 0; i < info->in.r.natts; i++) *** PLyDict_FromTuple(PLyTypeInfo *info, Hea *** 298,308 --- 306,322 } PG_CATCH(); { + MemoryContextSwitchTo(oldcontext); + MemoryContextReset(PLy_curr_procedure->tmp_ctx); + Py_DECREF(dict); PG_RE_THROW(); } PG_END_TRY(); + MemoryContextSwitchTo(oldcontext); +
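The leak described above comes from doing per-row allocations in a context that lives for the whole query; the patch instead allocates in a scratch context that is reset after each tuple. A hypothetical Python sketch of that reset-per-row pattern, with an explicit arena object standing in for the MemoryContext (names invented for illustration):

```python
class ScratchContext:
    """Stand-in for a resettable MemoryContext: allocations are tracked
    and dropped wholesale by reset()."""
    def __init__(self):
        self.allocations = []

    def alloc(self, size):
        buf = bytearray(size)
        self.allocations.append(buf)
        return buf

    def reset(self):
        self.allocations.clear()

def convert_rows(rows, scratch):
    """Convert each row, doing per-row work in the scratch context and
    resetting it afterwards, so memory stays bounded by one row instead
    of accumulating for the whole result set."""
    results = []
    peak = 0
    for row in rows:
        try:
            tmp = scratch.alloc(len(row))        # per-row working storage
            tmp[:] = row
            results.append(bytes(tmp).decode())  # result copied out of scratch
            peak = max(peak, len(scratch.allocations))
        finally:
            scratch.reset()                      # like MemoryContextReset
    return results, peak

ctx = ScratchContext()
rows = [b"alpha", b"beta", b"gamma"]
converted, peak_allocs = convert_rows(rows, ctx)
```

The try/finally mirrors the PG_TRY/PG_CATCH in the patch: the scratch context must be reset even when conversion of a row fails.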
[HACKERS] plpgsql leaking memory when stringifying datums
While chasing a PL/Python memory leak, I did a few tests with PL/pgSQL and I think there are places where memory is not freed sufficiently early. Attached are two functions that when run will make the backend's memory consumption increase until they finish. With both, the cause is convert_value_to_string that calls a datum's output function, which for toasted data results in an allocation. The proposed patch changes convert_value_to_string to call the output function in the per-tuple memory context and then copy the result string back to the original context. The comment in that function says that callers generally pfree its result, but that wasn't the case with exec_stmt_raise, so I added a pfree() there as well. With that I was still left with a leak in the typecast() test function and it turns out that sticking an exec_eval_cleanup into exec_move_row fixed it. The regression tests pass, but I'm not 100% sure if it's actually safe. Since convert_value_to_string needed to access the PL/pgSQL execution state to get its hands on the per-tuple context, I needed to pass it to every caller that didn't have it already, which means exec_cast_value and exec_simple_cast_value. Does anyone have a better idea? The initial diagnosis and proposed solution are by Andres Freund - thanks! Cheers, Jan diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index bf952b6..a691905 100644 *** a/src/pl/plpgsql/src/pl_exec.c --- b/src/pl/plpgsql/src/pl_exec.c *** static void exec_move_row(PLpgSQL_execst *** 188,201 static HeapTuple make_tuple_from_row(PLpgSQL_execstate *estate, PLpgSQL_row *row, TupleDesc tupdesc); ! static char *convert_value_to_string(Datum value, Oid valtype); ! static Datum exec_cast_value(Datum value, Oid valtype, Oid reqtype, FmgrInfo *reqinput, Oid reqtypioparam, int32 reqtypmod, bool isnull); ! 
static Datum exec_simple_cast_value(Datum value, Oid valtype, Oid reqtype, int32 reqtypmod, bool isnull); static void exec_init_tuple_store(PLpgSQL_execstate *estate); --- 188,204 static HeapTuple make_tuple_from_row(PLpgSQL_execstate *estate, PLpgSQL_row *row, TupleDesc tupdesc); ! static char *convert_value_to_string(PLpgSQL_execstate *estate, ! Datum value, Oid valtype); ! static Datum exec_cast_value(PLpgSQL_execstate *estate, ! Datum value, Oid valtype, Oid reqtype, FmgrInfo *reqinput, Oid reqtypioparam, int32 reqtypmod, bool isnull); ! static Datum exec_simple_cast_value(PLpgSQL_execstate *estate, ! Datum value, Oid valtype, Oid reqtype, int32 reqtypmod, bool isnull); static void exec_init_tuple_store(PLpgSQL_execstate *estate); *** plpgsql_exec_function(PLpgSQL_function * *** 430,436 else { /* Cast value to proper type */ ! estate.retval = exec_cast_value(estate.retval, estate.rettype, func->fn_rettype, &(func->fn_retinput), func->fn_rettypioparam, --- 433,440 else { /* Cast value to proper type */ ! estate.retval = exec_cast_value(&estate, estate.retval, ! estate.rettype, func->fn_rettype, &(func->fn_retinput), func->fn_rettypioparam, *** exec_stmt_fori(PLpgSQL_execstate *estate *** 1757,1763 * Get the value of the lower bound */ value = exec_eval_expr(estate, stmt->lower, &isnull, &valtype); ! value = exec_cast_value(value, valtype, var->datatype->typoid, &(var->datatype->typinput), var->datatype->typioparam, var->datatype->atttypmod, isnull); --- 1761,1767 * Get the value of the lower bound */ value = exec_eval_expr(estate, stmt->lower, &isnull, &valtype); ! value = exec_cast_value(estate, value, valtype, var->datatype->typoid, &(var->datatype->typinput), var->datatype->typioparam, var->datatype->atttypmod, isnull); *** exec_stmt_fori(PLpgSQL_execstate *estate *** 1772,1778 * Get the value of the upper bound */ value = exec_eval_expr(estate, stmt->upper, &isnull, &valtype); ! 
value = exec_cast_value(value, valtype, var->datatype->typoid, &(var->datatype->typinput), var->datatype->typioparam, var->datatype->atttypmod, isnull); --- 1776,1782 * Get the value of the upper bound */ value = exec_eval_expr(estate, stmt->upper, &isnull, &valtype); ! value = exec_cast_value(estate, value, valtype, var->datatype->typoid, &(var->datatype->typinput), var->datatype->typioparam, var->datatype->atttypmod, isnull); *** exec_stmt_fori(PLpgSQL_execstate *estate *** 1789,1795 if (stmt->step) { value = exec_eval_expr(estate, stmt->step, &isnull, &valtype); ! value = exec_cast_value(value, valtype, var->datatype->typoid, &(var->datatype->typinput), var->datatype->typioparam, var->datatype->atttypmod,
[HACKERS] unfriendly error when accessing NEW in statement-level trigger
When you try to use the NEW variable in a statement-level trigger, you get ERROR: record "new" is not assigned yet DETAIL: The tuple structure of a not-yet-assigned record is indeterminate. which is not too friendly, given that people sometimes forget to specify FOR EACH sth at all, get the default behaviour of FOR EACH STATEMENT and scratch their heads. A quick search on the error detail reveals a few such confused users already. What's more, the documentation for PL/pgSQL triggers says that "This variable is NULL in statement-level triggers" but it's not really a NULL, for instance you can't execute "x IS NULL" with it (you get the same error). I'm not that familiar with PL/pgSQL code, so I'm not sure how or if this should be fixed. From a quick look it seems that the NEW and OLD variables could be defined as row variables instead of record, since the format of the tuple is known. Would that make sense? Cheers, Jan PS: If changing them to row types would induce too much code churn then maybe we could use some trick to check if the error comes from using the OLD or NEW variable and add a HINT to the error message? J -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PL/Python result metadata
On 11/01/12 22:52, Dimitri Fontaine wrote: Peter Eisentraut pete...@gmx.net writes: .colnames() returns a list of column names (strings) .coltypes() returns a list of type OIDs (integers) I just made that up because there is no guidance in the other standard PLs for this sort of thing, AFAICT. What about having the same or comparable API as in psycopg or DB API http://initd.org/psycopg/docs/cursor.html You could expose a py.description structure? +1 for providing a read-only result.description. Not sure if it's worth it to follow DB-API there, but maybe yes. Perhaps we could have a result.description_ex information that's PG-specific or just not present in PEP 249, like the typmod, collation and so on. J -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
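For reference, the DB-API 2.0 (PEP 249) cursor.description being discussed is a sequence of 7-item tuples per column, of which only the first two members (name, type_code) are required; the rest may be None. Python's stdlib sqlite3 driver shows the shape (it fills in only the name):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.execute("SELECT 1 AS id, 'x' AS label")

# Each column is described by a 7-tuple; PEP 249 mandates name and
# type_code, the remaining five slots may be None (sqlite3 leaves them so).
names = [col[0] for col in cur.description]
tuple_widths = {len(col) for col in cur.description}
conn.close()
```

A PG-specific result.description could follow the same 7-tuple shape for familiarity, with any extra attributes (typmod, collation) carried separately as suggested above.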
Re: [HACKERS] pgsphere
- Original message - On Saturday, January 07, 2012 04:43:43 PM Dave Cramer wrote: Well I've sent Teodor a personal email asking him if he was interested and so far no response, so I interpret that as he no longer has interest in the project. I dimly remember him mentioning traveling/hiking planning to travel around the Himalayas somewhere on -hackers. That sounds more like Oleg :) Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
On 18/12/11 04:21, Robert Haas wrote: On Sat, Dec 17, 2011 at 7:50 PM, David E. Wheeler da...@kineticode.com wrote: Love having the start here. I forwarded this message to Claes Jakobsson, creator of the jansson-using pg-json extension. He’s a bit less supportive. He gave me permission to quote him here: Frankly I see the inclusion of a JSON datatype in core as unnecessary. Stuff should be moved out of core rather than in, as we do in Perl. Also, does this patch mean that the 'json' type is forever claimed and can't be replaced by extensions? There's little reason to reimplement JSON parsing, comparison and other routines when there's a multitude of already good libraries. That's fair enough, but we've had *many* requests for this functionality in core, I don't see what we lose by having at least some basic functionality built in. I think having a JSON data type in core would drastically limit the exposure third-party JSON extensions would get and that's bad. There are tons of interesting features a JSON type could have and tying its development to a one year release cycle might be a disservice both for people who are willing to provide these features earlier, the users which are faced with a choice between a fast-moving third-party addon and a blessed core type and would cause overall confusion. How about we try the tsearch way and let JSON extensions live outside core for some time and perhaps if one emerges dominant and would benefit from inclusion then consider it? If we keep treating extensions as second-class citizens, they'll never get the mindshare and importance we seem to want for them (or otherwise why go through all the trouble to provide an infrastructure for them). Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] splitting plpython into smaller parts
On 18/12/11 20:53, Peter Eisentraut wrote: On tis, 2011-12-06 at 00:58 +0100, Jan Urbański wrote: Rebased against master after the SPI cursor patch has been committed. The first patch removes SPI boilerplate from the cursor functions as well and the second patch creates a plpython_cursor.c file. A side effect of creating a separate file for cursors is that I had to make PLy_spi_transaction_{begin,commit,abort} helper functions external since they're used both by regular SPI execution functions and the cursor functions. They live in plpython_spi.c which is not an ideal place for them, but IMHO it's not bad either. Committed now. Great, thanks! I hope this will make for a more maintainable PL/Python. By the way, the buildfarm is turning red because it's missing the attached patch. Cheers, Jan diff --git a/src/pl/plpython/Makefile b/src/pl/plpython/Makefile index 12ce26e..a31328d 100644 *** a/src/pl/plpython/Makefile --- b/src/pl/plpython/Makefile *** endif # can't build *** 189,195 # distprep and maintainer-clean rules should be run even if we can't build. # Force this dependency to be known even without dependency info built: ! plpython_plpy.o: spiexceptions.h spiexceptions.h: $(top_srcdir)/src/backend/utils/errcodes.txt generate-spiexceptions.pl $(PERL) $(srcdir)/generate-spiexceptions.pl $< $@ --- 189,195 # distprep and maintainer-clean rules should be run even if we can't build. # Force this dependency to be known even without dependency info built: ! plpy_plpymodule.o: spiexceptions.h spiexceptions.h: $(top_srcdir)/src/backend/utils/errcodes.txt generate-spiexceptions.pl $(PERL) $(srcdir)/generate-spiexceptions.pl $< $@ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
- Original message - On Dec 12, 2011, at 4:51 PM, Peter van Hardenberg wrote: Because we haven't heard from him in a while we've been using PL/V8 to validate a JSON datatype simulated by a DOMAIN with a simple acceptance function. (See below.) This is not ideally performant but thanks to V8's JIT the JSON parser is actually reasonably good. Note that Claes Jakobsson has been working on a JSON data type using the Jansson JSON library. http://pgxn.org/dist/pg-json/ We recently needed to store/validate JSON data and be able to do some trivial extraction of values from it and went with pg-json. The great benefit of having JSON as an extension type is being able to use an external library (Jansson is very small, MIT licensed, looks really well written and has been actively maintained for years) and not being tied to a yearly release cycle. Postgres jumps through a lot of hoops to be extensible and I think a JSON type is a kind of thing that fits the bill of an extension perfectly. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: plpython: Add SPI cursor support
On 06/12/11 19:33, Jan Urbański wrote: On 06/12/11 19:23, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: plpython: Add SPI cursor support Buildfarm member narwhal does not like this patch. It looks like PyObject_SelfIter is not a compile-time constant on its version of python (2.5, apparently). I'll try to dig around to see what can be causing that... Seems that PyObject_GenericGetIter has been renamed to PyObject_SelfIter at some point: http://hg.python.org/cpython/rev/fd5ef7003469 I'm trying to muster whatever mercurial-foo I have to check if there's been a 2.5 version tagged without that patch... Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: plpython: Add SPI cursor support
On 07/12/11 11:16, Jan Urbański wrote: On 06/12/11 19:33, Jan Urbański wrote: On 06/12/11 19:23, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: plpython: Add SPI cursor support Buildfarm member narwhal does not like this patch. It looks like PyObject_SelfIter is not a compile-time constant on its version of python (2.5, apparently). I'll try to dig around to see what can be causing that... Seems that PyObject_GenericGetIter has been renamed to PyObject_SelfIter at some point: http://hg.python.org/cpython/rev/fd5ef7003469 I'm trying to muster whatever mercurial-foo I have to check if there's been a 2.5 version tagged without that patch... So no, the renaming happened on Mar 17 2003 and 2.5 was tagged on Sep 18 2006. The source tarball for 2.5 contains PyObject_SelfIter. Both fennec and mastodon build OK with Python 2.5. Is it possible that narwhal's Python installation is in some way broken? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: plpython: Add SPI cursor support
On 06/12/11 19:23, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: plpython: Add SPI cursor support Buildfarm member narwhal does not like this patch. It looks like PyObject_SelfIter is not a compile-time constant on its version of python (2.5, apparently). Hm, I quickly tried with a self-compiled Python 2.5.6 from the upstream tarball and it compiled on my Linux machine. I see that PyObject_SelfIter is defined here: http://hg.python.org/cpython/file/b48e1b48e670/Include/object.h#l407 I'll try to dig around to see what can be causing that... Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython SPI cursors
On 05/12/11 18:58, Peter Eisentraut wrote: On ons, 2011-11-23 at 19:58 +0100, Jan Urbański wrote: On 20/11/11 19:14, Steve Singer wrote: Responding now to all questions and attaching a revised patch based on your comments. Committed Please refresh the other patch. Great, thanks! I'll try to send an updated version of the other patch this evening. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython SPI cursors
On 05/12/11 19:12, Bruce Momjian wrote: Jan Urbański wrote: On 05/12/11 18:58, Peter Eisentraut wrote: On ons, 2011-11-23 at 19:58 +0100, Jan Urbański wrote: On 20/11/11 19:14, Steve Singer wrote: Responding now to all questions and attaching a revised patch based on your comments. Committed Please refresh the other patch. Great, thanks! I'll try to send an updated version of the other patch this evening. I assume this is _not_ related to this TODO item: Add a DB-API compliant interface on top of the SPI interface No, not related. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Java LISTEN/NOTIFY client library work-around
On 30/11/11 13:07, Joel Jacobson wrote: Hi, As you know, LISTEN/NOTIFY is broken in the Java client library. You have to do a SELECT 1 in a while-loop to receive the notifications. http://jdbc.postgresql.org/documentation/head/listennotify.html Is there some other library with a proper implementation where you don't have to spam the database with queries to get the notifications? You mean a Java library? If Java is not a requirement, the psycopg2 Python library supports notifies well. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
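The reason a driver like psycopg2 can deliver notifications without a query loop is that the client can simply block on the connection's socket until the server sends something, instead of polling with `SELECT 1`. A minimal sketch of that event-driven pattern — using a `socketpair` as a stand-in for a real database connection, so no server or driver is involved; the channel name and payload are made up for illustration:

```python
import select
import socket

# The "connection" is a plain socketpair standing in for the file
# descriptor a real driver exposes; the server end sending bytes plays
# the role of a NOTIFY arriving on the wire.
server_end, client_end = socket.socketpair()
server_end.sendall(b"NOTIFY my_channel")

# Block until the descriptor is readable -- no query polling needed.
readable, _, _ = select.select([client_end], [], [], 1.0)
payload = client_end.recv(64) if readable else None
```

The same `select()` call can multiplex many connections, which is what makes this approach cheap compared to a query-per-poll loop.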
Re: [HACKERS] splitting plpython into smaller parts
On 28/11/11 11:00, Greg Smith wrote: On 11/13/2011 09:45 AM, Jan Urbański wrote: The second one is the actual split. plpython.c has been split into 11 separate files and one header. Could you comment a bit more about what the goal of this is? We don't have a reviewer for this patch yet, and I think part of the reason is because it's not really obvious what it's supposed to be doing, and why that's useful. The idea of splitting plpython.c (an almost 5k-line file) into something more modular has been floated around here: http://postgresql.1045698.n5.nabble.com/Large-C-files-tt4766446.html#a4773493 and I think on other occasions, too. The patch introduces no behavioural changes; it's only shuffling code around. The only goal is to improve maintainability. I guess the reviewer could verify that a) I haven't botched the split and it all still compiles and works, and b) the choice of modules is sensible. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PL/Python SQL error code pass-through
On 23/11/11 17:24, Mika Eloranta wrote: Hi all, [PL/Python in 9.1 does not preserve SQLSTATE of errors] Oops, you're right, it's a regression from 9.0 behaviour. The fix looks good to me, I changed one place to indent with tabs instead of spaces and added a regression test. I think this should be backpatched to 9.1, no? Thanks for the report and the patch! Cheers, Jan diff --git a/src/pl/plpython/expected/plpython_error.out b/src/pl/plpython/expected/plpython_error.out index dbf19fd..bab07fb 100644 *** a/src/pl/plpython/expected/plpython_error.out --- b/src/pl/plpython/expected/plpython_error.out *** CONTEXT: PL/Python function specific_e *** 351,356 --- 351,378 (1 row) + /* SPI errors in PL/Python functions should preserve the SQLSTATE value + */ + CREATE FUNCTION python_unique_violation() RETURNS void AS $$ + plpy.execute("insert into specific values (1)") + plpy.execute("insert into specific values (1)") + $$ LANGUAGE plpythonu; + CREATE FUNCTION catch_python_unique_violation() RETURNS text AS $$ + begin + begin + perform python_unique_violation(); + exception when unique_violation then + return 'ok'; + end; + return 'not reached'; + end; + $$ language plpgsql; + SELECT catch_python_unique_violation(); + catch_python_unique_violation + --- + ok + (1 row) + /* manually starting subtransactions - a bad idea */ CREATE FUNCTION manual_subxact() RETURNS void AS $$ diff --git a/src/pl/plpython/expected/plpython_error_0.out b/src/pl/plpython/expected/plpython_error_0.out index b2194ff..6cb2ed0 100644 *** a/src/pl/plpython/expected/plpython_error_0.out --- b/src/pl/plpython/expected/plpython_error_0.out *** CONTEXT: PL/Python function specific_e *** 351,356 --- 351,378 (1 row) + /* SPI errors in PL/Python functions should preserve the SQLSTATE value + */ + CREATE FUNCTION python_unique_violation() RETURNS void AS $$ + plpy.execute("insert into specific values (1)") + plpy.execute("insert into specific values (1)") + $$ LANGUAGE plpythonu; + CREATE FUNCTION
catch_python_unique_violation() RETURNS text AS $$ + begin + begin + perform python_unique_violation(); + exception when unique_violation then + return 'ok'; + end; + return 'not reached'; + end; + $$ language plpgsql; + SELECT catch_python_unique_violation(); + catch_python_unique_violation + --- + ok + (1 row) + /* manually starting subtransactions - a bad idea */ CREATE FUNCTION manual_subxact() RETURNS void AS $$ diff --git a/src/pl/plpython/plpython.c b/src/pl/plpython/plpython.c index 93e8043..afd5dfc 100644 *** a/src/pl/plpython/plpython.c --- b/src/pl/plpython/plpython.c *** static char *PLy_procedure_name(PLyProce *** 383,389 static void PLy_elog(int, const char *,...) __attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 3))); ! static void PLy_get_spi_error_data(PyObject *exc, char **detail, char **hint, char **query, int *position); static void PLy_traceback(char **, char **, int *); static void *PLy_malloc(size_t); --- 383,389 static void PLy_elog(int, const char *,...) __attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 3))); ! static void PLy_get_spi_error_data(PyObject *exc, int *sqlerrcode, char **detail, char **hint, char **query, int *position); static void PLy_traceback(char **, char **, int *); static void *PLy_malloc(size_t); *** PLy_spi_exception_set(PyObject *excclass *** 4441,4447 if (!spierror) goto failure; ! spidata = Py_BuildValue("(zzzi)", edata->detail, edata->hint, edata->internalquery, edata->internalpos); if (!spidata) goto failure; --- 4441,4447 if (!spierror) goto failure; ! spidata = Py_BuildValue("(izzzi)", edata->sqlerrcode, edata->detail, edata->hint, edata->internalquery, edata->internalpos); if (!spidata) goto failure; *** PLy_elog(int elevel, const char *fmt,... *** 4481,4486 --- 4481,4487 *val, *tb; const char *primary = NULL; + int sqlerrcode = 0; char *detail = NULL; char *hint = NULL; char *query = NULL; *** PLy_elog(int elevel, const char *fmt,... *** 4490,4496 if (exc != NULL) { if (PyErr_GivenExceptionMatches(val, PLy_exc_spi_error)) !
PLy_get_spi_error_data(val, &detail, &hint, &query, &position); else if (PyErr_GivenExceptionMatches(val, PLy_exc_fatal)) elevel = FATAL; } --- 4491,4497 if (exc != NULL) { if (PyErr_GivenExceptionMatches(val, PLy_exc_spi_error)) ! PLy_get_spi_error_data(val, &sqlerrcode, &detail, &hint, &query, &position); else if (PyErr_GivenExceptionMatches(val, PLy_exc_fatal)) elevel = FATAL; } *** PLy_elog(int elevel, const char *fmt,... *** 4531,4537 PG_TRY(); { ereport(elevel, ! (errmsg_internal("%s", primary ? primary : "no
Re: [HACKERS] PL/Python SQL error code pass-through
On 24/11/11 16:15, Heikki Linnakangas wrote: On 24.11.2011 10:07, Jan Urbański wrote: On 23/11/11 17:24, Mika Eloranta wrote: Hi all, [PL/Python in 9.1 does not preserve SQLSTATE of errors] Oops, you're right, it's a regression from 9.0 behaviour. The fix looks good to me, I changed one place to indent with tabs instead of spaces and added a regression test. Thank you, both. Is there some other fields that we should propagate from the original error message that we're missing? Like, context and file/line information? Or are those left out on purpose? I wonder if we should have a more wholesale approach, and store the whole ErrorData struct somewhere, and only add some extra context information with errcontext(). In case of SPI errors we're preserving the following from the original ErrorData: * sqlerrcode (as of Mika's patch) * detail * hint * query * internalpos that leaves us with the following which are not preserved: * message * context * detail_log The message is being constructed from the Python exception name and I think that's useful. The context is being taken by the traceback string. I'm not sure if detail_log is ever set in these types of errors, probably not? So I guess we're safe. The problem with storing the entire ErrorData struct is that this information has to be transformed to Python objects, because we attach it to the Python exception that gets raised and in case it bubbles all the way up to the topmost PL/Python function, we recover these Python objects and use them to construct the ereport call. While the exception is inside Python, user code can interact with it, so it'd be hard to have C pointers to non-Python stuff there. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
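The round-trip described above — pack the original error fields into a tuple on the exception, let the exception travel through user code, then unpack the tuple at the topmost PL/Python frame to rebuild the ereport — can be sketched in pure Python. This is an illustration of the mechanism only, not the actual C implementation; the class name, field values, and helper names here are all made up:

```python
class SPIError(Exception):
    """Stand-in for PLy_exc_spi_error; carries the original error fields."""

def set_spi_error(sqlerrcode, detail, hint, query, position):
    # Mirrors the idea of attaching a spidata tuple to the raised exception.
    exc = SPIError("message rebuilt from the Python exception name")
    exc.spidata = (sqlerrcode, detail, hint, query, position)
    return exc

def rebuild_ereport(exc):
    # At the outermost PL/Python frame the tuple is unpacked again, so the
    # original error code, detail and hint survive the trip through Python.
    sqlerrcode, detail, hint, query, position = exc.spidata
    return {"sqlerrcode": sqlerrcode, "detail": detail, "hint": hint,
            "query": query, "position": position}

err = set_spi_error(42, "duplicate key value", None, None, 0)
try:
    raise err  # user code may inspect or re-raise the exception meanwhile
except SPIError as caught:
    recovered = rebuild_ereport(caught)
```

Because everything attached to the exception is a plain Python object, user code can interact with it freely while it propagates — which is exactly why the message argues against storing raw C pointers there.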
Re: [HACKERS] plpython SPI cursors
On 20/11/11 19:14, Steve Singer wrote: On 11-10-15 07:28 PM, Jan Urbański wrote: Hi, attached is a patch implementing the usage of SPI cursors in PL/Python. Currently when trying to process a large table in PL/Python you have to slurp it all into memory (that's what plpy.execute does). I found a few bugs (see my testing section below) that will need fixing + a few questions about the code Responding now to all questions and attaching a revised patch based on your comments. Do we like the name plpy.cursor or would we rather call it something like plpy.execute_cursor(...) or plpy.cursor_open(...) or plpy.create_cursor(...) Since we will be mostly stuck with the API once we release 9.2 this is worth some opinions on. I like cursor() but if anyone disagrees now is the time. We use plpy.subtransaction() to create Subxact objects, so I thought plpy.cursor() would be most appropriate. This patch does not provide a wrapper around SPI_cursor_move. The patch is useful without that and I don't see anything that precludes someone else adding that later if they see a need. My idea is to add keyword arguments to plpy.cursor() that will allow you to decide whether you want a scrollable cursor and after that provide a move() method. The patch includes documentation updates that describe the new feature. The Database Access page doesn't provide an API-style list of database access functions like the plperl http://www.postgresql.org/docs/9.1/interactive/plperl-builtins.html page does. I think the organization of the perl page is clearer than the python one and we should think about doing some documentation refactoring. That should be done as a separate patch and shouldn't be a barrier to committing this one. Yeah, the PL/Python docs are a bit chaotic right now. I haven't yet summoned the energy to overhaul them.
in PLy_cursor_plan line 4080 + PG_TRY(); + { + Portal portal; + char *volatile nulls; + volatile int j; I am probably not seeing a code path or misunderstanding something about the setjmp/longjmp usages but I don't see why nulls and j need to be volatile here. It looked like you could drop volatile there (and in PLy_spi_execute_plan, where this is copied from (did I mention there's quite some code duplication in PL/Python?)) but digging in git I found this commit: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=2789b7278c11785750dd9d2837856510ffc67000 that added the original volatile qualification, so I guess there's a reason. line 444 PLy_cursor(PyObject *self, PyObject *args) + { + char *query; + PyObject *plan; + PyObject *planargs = NULL; + + if (PyArg_ParseTuple(args, "s", &query)) + return PLy_cursor_query(query); + Should query be freed with PyMem_free() No, PyArg_ParseTuple with "s" returns a pointer into the argument object's internal buffer, which must not be freed. I checked that repeatedly creating a cursor with a plan argument does not leak memory and that adding PyMem_Free there promptly leads to a segfault. I tested both python 2.6 and 3 on a Linux system [test cases demonstrating bugs] Turns out it's a really bad idea to store pointers to Portal structures, because they get invalidated by the subtransaction abort hooks. I switched to storing the cursor name and looking it up in the appropriate hash table every time it's used. The examples you sent (which I included as regression tests) now cause a ValueError to be raised with a message stating that the cursor has been created in an aborted subtransaction. Not sure about the wording of the error message, though. Thanks again for the review!
Cheers, Jan diff --git a/doc/src/sgml/plpython.sgml b/doc/src/sgml/plpython.sgml index eda2bbf..d08c3d1 100644 --- a/doc/src/sgml/plpython.sgml +++ b/doc/src/sgml/plpython.sgml @@ -892,6 +892,15 @@ $$ LANGUAGE plpythonu; /para para +Note that calling literalplpy.execute/literal will cause the entire +result set to be read into memory. Only use that function when you are sure +that the result set will be relatively small. If you don't want to risk +excessive memory usage when fetching large results, +use literalplpy.cursor/literal rather +than literalplpy.execute/literal. + /para + + para For example: programlisting rv = plpy.execute(SELECT * FROM my_table, 5) @@ -958,6 +967,77 @@ $$ LANGUAGE plpythonu; /sect2 + sect2 +titleAccessing data with cursors/title + + para +The literalplpy.cursor/literal function accepts the same arguments +as literalplpy.execute/literal (except for literallimit/literal) +and returns a cursor object, which allows you to process large result sets +in smaller chunks. As with literalplpy.execute/literal, either a query +string or a plan object along with a list of arguments can be used. The +cursor object provides a literalfetch/literal method that accepts an +integer paramter and returns a result object. Each time you +call literalfetch/literal, the returned object will contain the next
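The fix described in this message — storing the cursor's *name* and re-resolving it on every use, instead of caching a Portal pointer that a subtransaction abort can invalidate — can be sketched in pure Python. All names here are illustrative; the real implementation is C code looking the portal up in SPI's hash table:

```python
# Stand-in for the backend's hash table of open portals.
portals = {}

class Cursor:
    def __init__(self, name, rows):
        self.name = name                  # remember the name, not the portal
        portals[name] = iter(rows)

    def fetch_one(self):
        portal = portals.get(self.name)   # re-lookup on every single use
        if portal is None:
            raise ValueError(
                "iterating a cursor created in an aborted subtransaction")
        return next(portal)

cur = Cursor("c1", [1, 2, 3])
first = cur.fetch_one()

del portals["c1"]   # simulate a subtransaction abort dropping the portal
try:
    cur.fetch_one()
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

The key design point is that the handle held by user code never dangles: if the underlying resource is gone, the lookup fails cleanly with an error instead of dereferencing freed memory.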
Re: [HACKERS] plpython SPI cursors
On 20/11/11 19:14, Steve Singer wrote: On 11-10-15 07:28 PM, Jan Urbański wrote: Hi, attached is a patch implementing the usage of SPI cursors in PL/Python. I found a few bugs (see my testing section below) that will need fixing + a few questions about the code Hi Steve, thanks a lot for the review, I'll investigate the errors you were getting and post a follow-up. Good catch on trying cursors with explicit subtransactions, I didn't think about how they would interact. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [9.1] unusable for large views
On 24/10/11 10:57, Omar Bettin wrote: [monster query] I see that your problem is already solved, but incidentally I'm working on a join order planning module and I'm looking for real-life examples of humongous queries like that to benchmark against. Any chance you could share the schema, or at least part of it, that goes with this query? Or perhaps you have more of these queries? Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Monster query
On 24/10/11 19:16, Omar Bettin wrote: Hi, Since the data are of third parties, will prepare a database suitable for the purpose. In any case, the compressed backup will be around 20 MB. If you are able to prepare a database dump that doesn't contain private data, it would be awesome. If it's 20 MB please just post a link to the archive. Thank you very much! Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: set GUC variables for single query
On 17/10/11 02:53, Robert Haas wrote: On Sun, Oct 16, 2011 at 4:58 PM, Tom Lane t...@sss.pgh.pa.us wrote: Dimitri Fontaine dimi...@2ndquadrant.fr writes: Now that you mention it, the following might actually already work: WITH settings AS ( SELECT set_config('timezone', 'Europe/Amsterdam', t), set_config('work_mem', '1 GB', t) ), foo AS ( SELECT … ) INSERT INTO bar SELECT * FROM foo; Only for small values of work ... you won't be able to affect planner settings that way, nor can you assume that that WITH item is executed before all else. See recent thread pointing out that setting values mid-query is unsafe. I previously floated the idea of using a new keyword, possibly LET, for this, like this: LET var = value [, ...] IN query LET was something I thought about, although you'd have to use something like parentheses around the GUC assignments because value can contain commas, leading to shift/reduce conflicts (that sucks, unfortunately). But before whipping out the paint bucket I wanted to see if there's enough buy-in to justify rehashing the syntax details. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] proposal: set GUC variables for single query
Hi, this idea has cropped up at last PGCon - the ability to set GUC variables for the duration of a single query. It would work by setting the GUCs for the duration of the query and setting them back to what they were after it has terminated. By setting them back I mean respecting the previously set values, regardless of their source (set at run-time, per-role settings, postgresql.conf settings). An example of where this would be useful: an application maintains a persistent connection to the database and answers requests for data from a bunch of clients. Each connected client has a preferred timezone and would like to get results in that timezone. Currently the application has to either sprinkle each query with AT TIME ZONE or wrap the queries in BEGIN; SET LOCAL TIMEZONE ..; query; COMMIT. It gets more complex when things like pgbouncer come into play. Another example is a one-off query that should use a different statement_timeout than the server has configured, or a REINDEX command that would like to use more maintenance_work_mem. It mostly falls into the realm of syntactic sugar, but as more than one person felt it's a good idea, I thought I'd float it around here. I poked a little bit at the grammar to see where it could fit and didn't have much success doing it without a new reserved keyword. Supposing the idea gets some traction, any suggestions for the syntax? Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: set GUC variables for single query
On 16/10/11 17:49, Tom Lane wrote: Jan Urbański wulc...@wulczer.org writes: this idea has cropped up last PGCon - the ability to set GUC variables for the duration of a single query. It would work by setting the GUCs for the duration of the query and setting them back to what they were after it has terminated. Doesn't SET LOCAL cover this use-case pretty well already? It does to a certain degree. If you have a bunch of statements in a transaction and want to execute one of them with a different timezone setting, you have to do the SET/RESET dance. In theory you should also first grab the current value to set it back afterwards, in case someone else did SET LOCAL before you, but I'll admit that's far-fetched. The main use case would be apps running behind pgbouncer and using statement pooling, and plain convenience. I'd find it useful myself, but for now I'm making do with SET LOCAL and it works fine. I'm bringing it up because it appears in the TODO created at the PL Summit: * Further discussion of per-statement config parameters for things like timezone - Jan Urbanski Tryin' to do my bit and all ;) Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
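The SET/RESET dance described above — grab the current value first, set the new one, and put the old one back afterwards even if the query fails — maps naturally onto a save/restore pattern. A pure-Python sketch of that pattern over an in-memory dict standing in for a session's GUC state (no server involved; the setting names and values are made up):

```python
from contextlib import contextmanager

# In-memory stand-in for the session's GUC state.
session_gucs = {"timezone": "UTC", "work_mem": "4MB"}

@contextmanager
def guc(name, value):
    old = session_gucs[name]        # grab the current value first
    session_gucs[name] = value
    try:
        yield
    finally:
        session_gucs[name] = old    # restored even if the "query" raises

with guc("timezone", "Europe/Amsterdam"):
    during = session_gucs["timezone"]   # the per-query value
restored = session_gucs["timezone"]     # back to whatever it was before
```

Restoring the *previous* value (rather than RESETting to the default) is exactly the subtlety the message points out: someone else may have done SET LOCAL before you.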
[HACKERS] plpython SPI cursors
Hi, attached is a patch implementing the usage of SPI cursors in PL/Python. Currently when trying to process a large table in PL/Python you have to slurp it all into memory (that's what plpy.execute does). This patch allows reading the result set in smaller chunks, using a SPI cursor behind the scenes. Example usage: cursor = plpy.cursor("select a, b from hugetable") for row in cursor: plpy.info("a is %s and b is %s" % (row['a'], row['b'])) The patch itself is simple, but there's a lot of boilerplate dedicated to opening a subtransaction and handling prepared plans. I'd like to do some refactoring of the way PL/Python uses SPI to reduce the amount of boilerplate needed, but that'll come as a separate patch (just before the patch to split plpython.c into smaller chunks). This feature has been sponsored by Nomao. Cheers, Jan PS: I already added it to the November CF. From 9ad14957e7b4ae19667df3bb8cc2aa5ef5bf96c8 Mon Sep 17 00:00:00 2001 From: Jan Urbański wulc...@wulczer.org Date: Tue, 13 Sep 2011 14:42:41 +0200 Subject: [PATCH] Add cursor support to plpythonu. Exposes SPI cursors as plpythonu objects allowing processing large result sets without loading them entirely into memory, as plpy.execute is doing. --- doc/src/sgml/plpython.sgml | 80 src/pl/plpython/expected/plpython_spi.out | 151 +++ src/pl/plpython/expected/plpython_test.out |6 +- src/pl/plpython/plpython.c | 605 src/pl/plpython/sql/plpython_spi.sql | 116 ++ 5 files changed, 955 insertions(+), 3 deletions(-) diff --git a/doc/src/sgml/plpython.sgml b/doc/src/sgml/plpython.sgml index eda2bbf..d08c3d1 100644 *** a/doc/src/sgml/plpython.sgml --- b/doc/src/sgml/plpython.sgml *** $$ LANGUAGE plpythonu; *** 892,897 --- 892,906 /para para + Note that calling literalplpy.execute/literal will cause the entire + result set to be read into memory. Only use that function when you are sure + that the result set will be relatively small.
If you don't want to risk + excessive memory usage when fetching large results, + use literalplpy.cursor/literal rather + than literalplpy.execute/literal. + /para + + para For example: programlisting rv = plpy.execute(SELECT * FROM my_table, 5) *** $$ LANGUAGE plpythonu; *** 958,963 --- 967,1043 /sect2 + sect2 + titleAccessing data with cursors/title + + para + The literalplpy.cursor/literal function accepts the same arguments + as literalplpy.execute/literal (except for literallimit/literal) + and returns a cursor object, which allows you to process large result sets + in smaller chunks. As with literalplpy.execute/literal, either a query + string or a plan object along with a list of arguments can be used. The + cursor object provides a literalfetch/literal method that accepts an + integer paramter and returns a result object. Each time you + call literalfetch/literal, the returned object will contain the next + batch of rows, never larger than the parameter value. Once all rows are + exhausted, literalfetch/literal starts returning an empty result + object. Cursor objects also provide an + ulink url=http://docs.python.org/library/stdtypes.html#iterator-types;iterator + interface/ulink, yielding one row at a time until all rows are exhausted. + Data fetched that way is not returned as result objects, but rather as + dictionaries, each dictionary corresponding to a single result row. + /para + + para + Cursors are automatically disposed of, but if you want to explicitly + release all resources held by a cursor, use the literalclose/literal + method. Once closed, a cursor cannot be fetched from anymore. + /para + +note + para + Do not confuse objects created by literalplpy.cursor/literal with + DBAPI cursors as defined by + the ulink url=http://www.python.org/dev/peps/pep-0249/;Python Database API specification/ulink. + They don't have anything in common except for the name. 
+ /para +/note + + para + An example of two ways of processing data from a large table would be: + programlisting + CREATE FUNCTION count_odd_iterator() RETURNS integer AS $$ + odd = 0 + for row in plpy.cursor("select num from largetable"): + if row['num'] % 2: + odd += 1 + return odd + $$ LANGUAGE plpythonu; + + CREATE FUNCTION count_odd_fetch(batch_size integer) RETURNS integer AS $$ + odd = 0 + cursor = plpy.cursor("select num from largetable") + while True: + rows = cursor.fetch(batch_size) + if not rows: + break + for row in rows: + if row['num'] % 2: + odd += 1 + return odd + $$ LANGUAGE plpythonu; + + CREATE FUNCTION count_odd_prepared() RETURNS integer AS $$ + odd = 0 + plan = plpy.prepare("select num from largetable where num % $1 != 0,
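The two access styles shown in the documentation patch — batched `fetch()` and row-at-a-time iteration — can be modelled in plain Python to see how they relate. This is a sketch only; `plpy.cursor` itself is implemented in C on top of SPI, and `FakeCursor` is a made-up name:

```python
class FakeCursor:
    """Models plpy.cursor's interface: fetch(n) returns batches, and the
    object is iterable one row at a time until rows are exhausted."""
    def __init__(self, rows):
        self._rows = list(rows)
        self._pos = 0

    def fetch(self, n):
        batch = self._rows[self._pos:self._pos + n]
        self._pos += len(batch)
        return batch              # empty once exhausted, like plpy's result

    def __iter__(self):
        while True:
            batch = self.fetch(1)
            if not batch:
                return
            yield batch[0]

# Same shape as count_odd_iterator, against fake in-memory rows 0..9.
cursor = FakeCursor({"num": i} for i in range(10))
odd = sum(1 for row in cursor if row["num"] % 2)

# And the count_odd_fetch style, in batches of 3.
cursor2 = FakeCursor({"num": i} for i in range(10))
odd2 = 0
while True:
    rows = cursor2.fetch(3)
    if not rows:
        break
    odd2 += sum(1 for row in rows if row["num"] % 2)
```

Both styles visit every row exactly once; the difference is only how much of the result set is materialised at a time, which is the whole point of the cursor interface.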
Re: [HACKERS] Error building v9.1.1 (git) with python 3.2.2
On 05/10/11 15:24, Lou Picciano wrote: Hackers, Have been following with some interest the dialog recently about getting pl/python built within the v9 tree; most recently, I've been trying v9.1.1 (git) and python 3.2.2. Configure and make complete without error, but 'make install' does not produce any plpython.so I have since added a symlink to /usr/lib/config-3.2mu, as suggested in that dialog - No Joy. Can anyone point me in the right direction? Could you attach the configure and build logs? It's hard to diagnose with so little info. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Large C files
On 07/09/11 01:13, Peter Geoghegan wrote: On 6 September 2011 08:29, Peter Eisentraut pete...@gmx.net wrote: I was thinking about splitting up plpython.c, but it's not even on that list. ;-) IIRC the obesity of that file is something that Jan Urbański intends to fix, or is at least concerned about. Perhaps you should compare notes. Yeah, plpython.c could easily be split into submodules for builtin functions, SPI support, subtransaction support, traceback support etc. If there is consensus that this should be done, I'll see if I can find time to submit a giant-diff-no-behaviour-changes patch for the next CF. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] any results from PL summit?
On 29/08/11 08:21, Pavel Stehule wrote: Hello Hi Pavel, is there some some result, report from PL summit? The notes taken during the summit are here: http://wiki.postgresql.org/wiki/PgCon_2011_PL_Summit I think Selena also sent a summary and next steps mail to the participants, perhaps it could be reposted on the mailing list. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython crash
On 17/08/11 23:10, Tom Lane wrote: Jan Urbański wulc...@wulczer.org writes: Here are two patches that fix two separate bugs that you found simultaneously. Because they're actually separate issues, it turned out fixing them was a bit more tricky than I expected (fixing one was unmasking the other one etc). Applied with one non-cosmetic change: I got rid of the test on TransactionIdIsValid(arg->typrel_xmin) in PLy_input_tuple_funcs, as well as where you'd copied that logic in PLy_output_tuple_funcs. AFAICS skipping the update on the xmin/tid, if we're coming through there a second time, would be simply wrong. Thanks! The way things are set up now I think you never go through PLy_input_tuple_funcs twice, unless the cache is determined to be invalid and then you recreate the function from scratch. But of course it's better to be safe than sorry and even if I'm right and it was never executed twice, any refactoring effort might have broken it easily. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Caching Python modules
On 17/08/11 14:09, PostgreSQL - Hans-Jürgen Schönig wrote:
> CREATE OR REPLACE FUNCTION textprocess.add_to_corpus(lang text, t text) RETURNS float4 AS $$
>
> from SecondCorpus import SecondCorpus
> from SecondDocument import SecondDocument
>
> i am doing some intense text mining here. the problem is: is it possible to cache those imported modules from function to function call. GD works nicely for variables but can this actually be done with imported modules as well? the import takes around 95% of the total time so it is definitely something which should go away somehow. i have checked the docs but i am not more clever now.

After a module is imported in a backend, it stays in the interpreter's sys.modules dictionary, and importing it again will not cause the module's Python code to be executed again. As long as you are using the same backend, you should be able to call add_to_corpus repeatedly, and the import statements should take a long time only the first time you call them.

This simple test demonstrates it:

$ cat /tmp/slow.py
import time
time.sleep(5)

$ PYTHONPATH=/tmp/ bin/postgres -p 5433 -D data/
LOG:  database system was shut down at 2011-08-17 14:16:18 CEST
LOG:  database system is ready to accept connections

$ bin/psql -p 5433 postgres
Timing is on.
psql (9.2devel)
Type "help" for help.

postgres=# select slow();
 slow
------

(1 row)

Time: 5032.835 ms
postgres=# select slow();
 slow
------

(1 row)

Time: 1.051 ms

Cheers,
Jan
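[Editor's note: the caching described above is plain CPython behaviour, so it can also be reproduced outside the backend. A self-contained sketch; the throwaway module name "slowmod" is made up for the demo and stands in for SecondCorpus/SecondDocument:]

```python
import os
import sys
import tempfile
import time

# Create a throwaway module whose import takes noticeable time, as a
# stand-in for the expensive SecondCorpus/SecondDocument imports.
moddir = tempfile.mkdtemp()
with open(os.path.join(moddir, "slowmod.py"), "w") as f:
    f.write("import time\ntime.sleep(1)\n")
sys.path.insert(0, moddir)

start = time.time()
import slowmod          # first import executes the module body (sleeps)
first = time.time() - start

start = time.time()
import slowmod          # found in sys.modules, body is not re-run
second = time.time() - start

print(first, second)    # the second import is orders of magnitude faster
```

The same mechanism is what makes the second `select slow()` above return in about a millisecond: the interpreter embedded in the backend keeps `sys.modules` alive for the life of the session.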
Re: [HACKERS] Caching Python modules
On 17/08/11 14:19, Jan Urbański wrote:
> [...]
> This simple test demonstrates it:
> [example missing the slow() function code]

Oops, forgot to show the CREATE statement of the test function:

postgres=# create or replace function slow() returns void language plpythonu as $$
import slow
$$;

Jan
Re: [HACKERS] plpython crash
On 17/08/11 11:40, Jan Urbański wrote:
> On 16/08/11 19:12, Jan Urbański wrote:
>> On 16/08/11 19:07, Jean-Baptiste Quenot wrote:
>>> [plpython is buggy]
>>
>> I'll have a patch ready soon.

Here are two patches that fix two separate bugs that you found simultaneously. Because they're actually separate issues, it turned out fixing them was a bit more tricky than I expected (fixing one was unmasking the other one etc).

Thanks for the report!

Jan

From 3c0bf7519cad735160d9d222d6f86f84987b38b5 Mon Sep 17 00:00:00 2001
From: Jan Urbański <wulc...@wulczer.org>
Date: Wed, 17 Aug 2011 16:07:54 +0200
Subject: [PATCH 2/2] Guard against return type changing in PL/Python functions.

Functions cache their I/O routines, and in case their return type is composite, a change of the underlying type can cause the cache to become invalid. PL/Python was already checking for composite type changes in input arguments; now the check is extended to cover the return type as well.

Per bug report from Jean-Baptiste Quenot.
---
 src/pl/plpython/expected/plpython_record.out |   21 ++
 src/pl/plpython/plpython.c                   |   93 ++---
 src/pl/plpython/sql/plpython_record.sql      |   15
 3 files changed, 103 insertions(+), 26 deletions(-)

diff --git a/src/pl/plpython/expected/plpython_record.out b/src/pl/plpython/expected/plpython_record.out
index 7c60089..0bcc46c 100644
--- a/src/pl/plpython/expected/plpython_record.out
+++ b/src/pl/plpython/expected/plpython_record.out
@@ -308,6 +308,27 @@ SELECT * FROM test_inout_params('test_in');
  test_in_inout
 (1 row)
 
+-- try changing the return types and call functions again
+ALTER TABLE table_record DROP COLUMN first;
+ALTER TABLE table_record DROP COLUMN second;
+ALTER TABLE table_record ADD COLUMN first text;
+ALTER TABLE table_record ADD COLUMN second int4;
+SELECT * FROM test_table_record_as('obj', 'one', 1, false);
+ first | second
+-------+--------
+ one   |      1
+(1 row)
+
+ALTER TYPE type_record DROP ATTRIBUTE first;
+ALTER TYPE type_record DROP ATTRIBUTE second;
+ALTER TYPE type_record ADD ATTRIBUTE first text;
+ALTER TYPE type_record ADD ATTRIBUTE second int4;
+SELECT * FROM test_type_record_as('obj', 'one', 1, false);
+ first | second
+-------+--------
+ one   |      1
+(1 row)
+
 -- errors cases
 CREATE FUNCTION test_type_record_error1() RETURNS type_record AS $$
 return { 'first': 'first' }
diff --git a/src/pl/plpython/plpython.c b/src/pl/plpython/plpython.c
index 90d3c47..a254ffa 100644
--- a/src/pl/plpython/plpython.c
+++ b/src/pl/plpython/plpython.c
@@ -1489,6 +1489,42 @@ PLy_function_delete_args(PLyProcedure *proc)
 		PyDict_DelItemString(proc->globals, proc->argnames[i]);
 }
 
+static bool
+PLy_procedure_argument_valid(PLyTypeInfo *arg)
+{
+	Oid			relid;
+	HeapTuple	relTup;
+	bool		valid;
+
+	/* Only check input arguments that are composite */
+	if (arg->is_rowtype != 1) {
+		return true;
+	}
+
+	/* An uninitialised typ_relid means that we got called on an output
+	 * argument of a function returning an unnamed record type */
+	if (!OidIsValid(arg->typ_relid)) {
+		return true;
+	}
+
+	Assert(TransactionIdIsValid(arg->typrel_xmin));
+	Assert(ItemPointerIsValid(&arg->typrel_tid));
+
+	/* Get the pg_class tuple for the argument type */
+	relid = arg->typ_relid;
+	relTup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+	if (!HeapTupleIsValid(relTup))
+		elog(ERROR, "cache lookup failed for relation %u", relid);
+
+	/* If it has changed, the function is not valid */
+	valid = (arg->typrel_xmin == HeapTupleHeaderGetXmin(relTup->t_data) &&
+			 ItemPointerEquals(&arg->typrel_tid, &relTup->t_self));
+
+	ReleaseSysCache(relTup);
+
+	return valid;
+}
+
 /*
  * Decide whether a cached PLyProcedure struct is still valid
  */
@@ -1509,33 +1545,16 @@ PLy_procedure_valid(PLyProcedure *proc, HeapTuple procTup)
 	/* If there are composite input arguments, they might have changed */
 	for (i = 0; i < proc->nargs; i++)
 	{
-		Oid			relid;
-		HeapTuple	relTup;
-
 		/* Short-circuit on first changed argument */
 		if (!valid)
 			break;
 
-		/* Only check input arguments that are composite */
-		if (proc->args[i].is_rowtype != 1)
-			continue;
-
-		Assert(OidIsValid(proc->args[i].typ_relid));
-		Assert(TransactionIdIsValid(proc->args[i].typrel_xmin));
-		Assert(ItemPointerIsValid(&proc->args[i].typrel_tid));
-
-		/* Get the pg_class tuple for the argument type */
-		relid = proc->args[i].typ_relid;
-		relTup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
-		if (!HeapTupleIsValid(relTup))
-			elog(ERROR, "cache lookup failed for relation %u", relid);
-
-		/* If it has changed, the function is not valid */
-		if (!(proc->args[i].typrel_xmin == HeapTupleHeaderGetXmin(relTup->t_data) &&
-			  ItemPointerEquals(&proc->args[i].typrel_tid, &relTup->t_self)))
-			valid = false;
+		valid = PLy_procedure_argument_valid(&proc->args[i]);
+	}
 
-		ReleaseSysCache(relTup);
+	/* if the output argument is composite, it might have changed */
+	if (valid) {
+		valid
Re: [HACKERS] rc1 or beta4?
On 17/08/11 15:00, Dave Page wrote:
> The current plan (or, the last one I recall) is to push another 9.1 release tomorrow, for Monday release. Are we going with beta4 or rc1?

Sorry to butt in, but it would probably be good to include fixes for the two segfault plpython bugs[1] before wrapping up the release.

Cheers,
Jan

[1] http://archives.postgresql.org/message-id/4e4bcd52.90...@wulczer.org
Re: [HACKERS] rc1 or beta4?
On 17/08/11 17:50, Thom Brown wrote:
> On 17 August 2011 16:47, Jan Urbański <wulc...@wulczer.org> wrote:
>> Sorry to butt in, but it would probably be good to include fixes for the two segfault plpython bugs[1] before wrapping up the release.
>
> It's not listed as a beta-blocker yet. I take it that it should?

Oh, in the wiki? I don't know; it is a segfault-causing bug, but all I wanted was to draw some attention in case the people wrapping the release missed that thread.

Jan
Re: [HACKERS] plpython crash
On 16/08/11 16:52, Jean-Baptiste Quenot wrote:
> After backporting plpython.c from HEAD, this is the error message I get:
>
> ERROR:  key "pg.dropped.6" not found in mapping
> HINT:  To return null in a column, add the value None to the mapping with the key named after the column.
> CONTEXT:  while creating return value
> PL/Python function "myfunc"
>
> What does it mean?

Ah, interesting. I think this means that you are returning a table type, and that table has a dropped column. The code should skip over dropped columns, but apparently it does not, and tries to find a value for that column in the mapping you are returning.

I'll try to reproduce it here.

Jan
Re: [HACKERS] plpython crash
On 16/08/11 17:06, Jan Urbański wrote:
> [...]
> I'll try to reproduce it here.

I did a couple of simple tests and can't see how the code could fail to skip dropped columns. It seems like you're missing this commit: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=41282111e6cc73aca4b63dffe950ba7a63e4bd8a

Could you try running this query? (assuming your function is called 'myfunc')

select proname, relname, attname, attisdropped
  from pg_proc
  join pg_class on (prorettype = reltype)
  join pg_attribute on (attrelid = pg_class.oid)
 where proname = 'myfunc';

Cheers,
Jan
Re: [HACKERS] plpython crash
On 16/08/11 19:07, Jean-Baptiste Quenot wrote:
> Dear Jan,
>
> Sorry, I typed the wrong git commands. With the latest plpython from branch master I got the same gdb backtrace as reported before. I managed to wrap up a testcase that fails 100% of the time on my setup: https://gist.github.com/1149512
>
> Hope it crashes on your side too :-)

Awesome, it segfaults for me with HEAD ;) Now it's just a simple matter of programming... I'll take a look at it this evening.

Jan
Re: [HACKERS] plpython crash
On 11/08/11 18:01, Jean-Baptiste Quenot wrote:
> Hi there,
>
> plpython crashes on me on various 64-bit Ubuntu hosts, see the gdb backtrace at: https://gist.github.com/1140005
>
> Do you believe there were recent bugfixes regarding PLyMapping_ToTuple()? This is PG 9.0.4 with HEAD of plpython taken in March 2011 and backported. Please tell me if you need more information.

Hi,

there were no changes to that area of plpython after March 2011. Could you try to see if the error also appears if you run your app with current PostgreSQL HEAD (both the server and plpython)?

Which Python version is that? You can get that info by running:

do $$ import sys; plpy.info(sys.version) $$ language plpythonu;

Could you try to extract a self-contained example of how to reproduce it? If the bug only appears under your application's specific workload, perhaps you could try running it with Postgres compiled with -O0, because compiling with -O2 leaves the gdb backtrace with "optimised out" values and missing inlined functions.

Cheers,
Jan
Re: [HACKERS] plpython crash
On 12/08/11 13:55, Jean-Baptiste Quenot wrote:
> Here is the same with -O0: https://gist.github.com/1140005
>
> sys.version reports this:
>
> INFO:  2.6.6 (r266:84292, Sep 15 2010, 16:41:53) [GCC 4.4.5]

I'm still at a loss. Did you reproduce it with git HEAD? I see that the query being executed is "select * from myfunc();"; would it be possible to share the code of myfunc?

Jan
Re: Initial Review: JSON contrib modul was: Re: [HACKERS] Another swing at JSON
On 23/07/11 01:12, Robert Haas wrote:
> On Fri, Jul 22, 2011 at 6:04 PM, Joey Adams <joeyadams3.14...@gmail.com> wrote:
>> On another matter, should the JSON type guard against duplicate member keys? The JSON RFC says "The names within an object SHOULD be unique", meaning JSON with duplicate members can be considered valid. JavaScript interpreters (the ones I tried), PHP, and Python all have the same behavior: discard the first member in favor of the second. That is, {"key":1,"key":2} becomes {"key":2}.
>>
>> The XML type throws an error if a duplicate attribute is present (e.g. '<a href="b" href="c"/>'::xml).
>
> Hmm. That's tricky. I lean mildly toward throwing an error as being more consistent with the general PG philosophy.

OTOH:

regression=# select 'key=>1,key=>2'::hstore;
   hstore
------------
 "key"=>"1"
(1 row)

Cheers,
Jan
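[Editor's note: the "discard the first member in favor of the second" behaviour claimed above for Python is easy to check with the standard json module:]

```python
import json

# Duplicate members in a JSON object: the second value wins,
# matching the JavaScript/PHP behaviour described in the thread,
# and the opposite of hstore's keep-the-first behaviour shown above.
doc = json.loads('{"key":1,"key":2}')
print(doc)  # {'key': 2}
```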
Re: [HACKERS] PL/Python: No stack trace for an exception
On 21/07/11 15:27, Sushant Sinha wrote:
> I am using plpythonu on postgres 9.0.2. One of my python functions was throwing a TypeError exception. However, I only see the exception in the database and not the stack trace. It becomes difficult to debug if the stack trace is absent in Python.
>
> logdb=# select get_words(forminput) from fi;
> ERROR:  PL/Python: TypeError: an integer is required
> CONTEXT:  PL/Python function get_words
>
> And here is the error if I run that function on the same data in python:
> [traceback]
>
> Is this a known problem or does this need addressing?

Yes, traceback support in PL/Python has already been implemented; it is a new feature that will be available in PostgreSQL 9.1.

Cheers,
Jan
Re: [HACKERS] pgbench--new transaction type
On 13/06/11 06:38, Greg Smith wrote:
> On 06/11/2011 03:21 PM, Jeff Janes wrote:
>> I wouldn't expect IPC chatter to show up in profiling, because it costs wall time, but not CPU time. The time spent might be attributed to the kernel, or to pgbench, or to nothing at all.
>
> Profilers aren't necessarily just accumulating raw CPU time though. If the approach includes sampling "what code is active right now?" periodically, you might be able to separate this out even though it's not using CPU time in the normal fashion. I think you might just need to use a better profiler.

I got surprisingly insightful results in the past using http://poormansprofiler.org/

I never used it with Postgres, but it might be worth a try.

J
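[Editor's note: the poor man's profiler works by attaching gdb to the target process in a loop (roughly `gdb -batch -ex "thread apply all bt" -p PID`), then counting which stacks recur most often across the samples. A sketch of the aggregation step in Python; the `aggregate_stacks` helper and its frame-parsing assumptions are illustrative, not part of the original script:]

```python
from collections import Counter


def aggregate_stacks(gdb_output: str, top: int = 5):
    """Collapse raw 'thread apply all bt' samples into (stack, count)
    pairs, most frequent first.  Frame lines are assumed to look like
    '#0  0xdeadbeef in recv () from libc.so' or
    '#1  secure_read (...) at be-secure.c:304'."""
    stacks = []
    current = []
    for line in gdb_output.splitlines():
        if line.startswith("Thread"):
            # A new backtrace sample begins; flush the previous one.
            if current:
                stacks.append(",".join(current))
            current = []
        elif line.startswith("#"):
            parts = line.split()
            # With an address, the function name follows "in";
            # without one, it comes right after the frame number.
            func = parts[3] if parts[1].startswith("0x") else parts[1]
            current.append(func)
    if current:
        stacks.append(",".join(current))
    return Counter(stacks).most_common(top)
```

Wall-clock waits (like the IPC chatter discussed above) show up in these samples because a blocked process is still sitting in some stack, even though it accumulates no CPU time.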
Re: [HACKERS] increasing collapse_limits?
On 01/05/11 21:16, Joshua Berkus wrote:
> Speaking of which, what happened to replacing GEQO with Simulated Annealing? Where did that project go?

It stayed on GitHub (https://github.com/wulczer/saio) and stagnated a bit after I got my degree. It's at the top of my list of things to pick up after the summer (or maybe even during the summer).

Cheers,
Jan