Re: [HACKERS] libpq's multi-threaded SSL callback handling is busted
Peter Eisentraut writes:

On 2/12/15 7:28 AM, Jan Urbański wrote:
+#if OPENSSL_VERSION_NUMBER < 0x10000000L
+/* OpenSSL 1.0.0 deprecates the CRYPTO_set_id_callback function and provides a
+ * default implementation, so there's no need for our own. */

I have some additional concerns about this. It is true that OpenSSL 1.0.0 deprecates CRYPTO_set_id_callback(), but it replaces it with CRYPTO_THREADID_set_callback(). There is no indication that you don't need to set a callback anymore. The man page (https://www.openssl.org/docs/crypto/threads.html) still says you need to set two callbacks, and points to the new interface.

Truly, no good deed can ever go unpunished. I spent around an hour tracking down why setting both callbacks for OpenSSL >= 1.0.0 brought back the PHP segfaults. Turns out, in OpenSSL there's *no way* to unregister a callback registered with CRYPTO_THREADID_set_callback()! Observe:

https://github.com/openssl/openssl/blob/35a1cc90bc1795e8893c11e442790ee7f659fffb/crypto/thr_id.c#L174-L180

Once you set a callback, game over - unloading your library will cause a segfault. I cannot express how joyful I felt when I discovered it.

I suggest you remove this part from your patch and submit a separate patch for consideration if you want to.

At this point I will propose an even simpler patch (attached). I gave up on trying to use the more modern CRYPTO_THREADID_* functions. Right now, HEAD libpq won't compile against OpenSSL compiled with OPENSSL_NO_DEPRECATED, which I think is fine, given the lack of complaints. So let's just keep using the older, saner functions and ignore the THREADID crap.

By the way, might I take the opportunity to broach the subject of backpatching this change, should it get accepted?
Cheers,
Jan

diff --git a/src/interfaces/libpq/fe-secure-openssl.c b/src/interfaces/libpq/fe-secure-openssl.c
index a32af34..02c9177 100644
--- a/src/interfaces/libpq/fe-secure-openssl.c
+++ b/src/interfaces/libpq/fe-secure-openssl.c
@@ -806,9 +806,12 @@ pgtls_init(PGconn *conn)
 		if (ssl_open_connections++ == 0)
 		{
-			/* These are only required for threaded libcrypto applications */
-			CRYPTO_set_id_callback(pq_threadidcallback);
-			CRYPTO_set_locking_callback(pq_lockingcallback);
+			/* These are only required for threaded libcrypto applications, but
+			 * make sure we don't stomp on them if they're already set. */
+			if (CRYPTO_get_id_callback() == NULL)
+				CRYPTO_set_id_callback(pq_threadidcallback);
+			if (CRYPTO_get_locking_callback() == NULL)
+				CRYPTO_set_locking_callback(pq_lockingcallback);
 		}
 	}
 #endif   /* ENABLE_THREAD_SAFETY */
@@ -885,9 +888,12 @@ destroy_ssl_system(void)
 	if (pq_init_crypto_lib && ssl_open_connections == 0)
 	{
-		/* No connections left, unregister libcrypto callbacks */
-		CRYPTO_set_locking_callback(NULL);
-		CRYPTO_set_id_callback(NULL);
+		/* No connections left, unregister libcrypto callbacks, if no one
+		 * registered different ones in the meantime. */
+		if (CRYPTO_get_id_callback() == pq_threadidcallback)
+			CRYPTO_set_id_callback(NULL);
+		if (CRYPTO_get_locking_callback() == pq_lockingcallback)
+			CRYPTO_set_locking_callback(NULL);

 	/*
 	 * We don't free the lock array or the SSL_context.  If we get another

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
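The guard logic in the patch above can be illustrated with a toy model (in Python, with stand-in names - this is not libpq or OpenSSL code): a library only installs its callback if the slot is empty, and only clears the slot on shutdown if it still holds the library's own callback.

```python
# Toy model of the "don't stomp" guard from the patch: a single global slot
# stands in for OpenSSL's locking-callback pointer.

_locking_callback = None  # stand-in for OpenSSL's global callback slot


def get_locking_callback():
    return _locking_callback


def set_locking_callback(cb):
    global _locking_callback
    _locking_callback = cb


def pq_lockingcallback():
    return "libpq"


def app_callback():
    return "app"


def pgtls_init():
    # Install our callback only if nobody else has set one.
    if get_locking_callback() is None:
        set_locking_callback(pq_lockingcallback)


def destroy_ssl_system():
    # Unset only if the installed callback is still ours; if the application
    # (or a VM like Python's ssl module) registered its own in the meantime,
    # leave it alone.
    if get_locking_callback() is pq_lockingcallback:
        set_locking_callback(None)
```

With an application callback already installed, init and shutdown leave it untouched; with an empty slot, the library installs and later removes its own. The race the thread discusses is visible here too: the check and the set are two separate operations, not one atomic test-and-set.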
Re: [HACKERS] libpq's multi-threaded SSL callback handling is busted
Peter Eisentraut writes:

On 4/2/15 4:32 AM, Jan Urbański wrote:

Peter Eisentraut writes:
I don't think this patch would actually fix the problem that was described after the original bug report (http://www.postgresql.org/message-id/5436991b.5020...@vmware.com), namely that another thread acquires a lock while the libpq callbacks are set and then cannot release the lock if libpq has been shut down in the meantime.

I did test both the Python and the PHP repro scripts and the patch fixed both the deadlock and the segfault. What happens is that Python (for instance) stomps on the callback unconditionally. So when libpq gets unloaded, it sees that the currently set callback is not the one it originally set and refrains from NULLing it.

So this works because Python is just as broken as libpq right now. What happens if Python is fixed as well? Then we'll have the problem I described above.

Well, not really, libpq sets and unsets the callbacks every time an SSL connection is opened and closed. Python sets the callbacks once when the ssl module is imported and never touches them again. Arguably, it should set them even earlier, but it's still saner than flip-flopping them. AFAIK you can't unload Python, so setting the callback and keeping it there is a sound strategy.

The change I'm proposing is not to set the callback in libpq if one is already set and not remove it if the one that's set is not libpq's. That's as sane as it gets. With multiple libraries wanting to use OpenSSL in threaded code, the mechanism seems to be last one wins. It doesn't matter *whose* callbacks are used, as long as they're there and they stay there - and that's Python's stance. This doesn't work if you want to be able to unload the library, so the next best thing is not touching the callback if someone else set theirs in the meantime. What's broken is OpenSSL for offering such a bad way of dealing with concurrency.

To reiterate: I recognise that not handling the callbacks is not the right answer.
But not stomping on callbacks others might have set sounds like a minimal and safe improvement.

Cheers,
Jan
Re: [HACKERS] libpq's multi-threaded SSL callback handling is busted
Peter Eisentraut writes:

On 2/12/15 7:28 AM, Jan Urbański wrote:
For the sake of discussion, here's a patch to prevent stomping on previously-set callbacks, racy as it looks. FWIW, it does fix the Python deadlock and doesn't cause the PHP segfault...

I don't think this patch would actually fix the problem that was described after the original bug report (http://www.postgresql.org/message-id/5436991b.5020...@vmware.com), namely that another thread acquires a lock while the libpq callbacks are set and then cannot release the lock if libpq has been shut down in the meantime.

I did test both the Python and the PHP repro scripts and the patch fixed both the deadlock and the segfault. What happens is that Python (for instance) stomps on the callback unconditionally. So when libpq gets unloaded, it sees that the currently set callback is not the one it originally set and refrains from NULLing it. There's a small race window there, to be sure, but it's a lot better than what we have now.

Cheers,
Jan
Re: [HACKERS] GSoC idea - Simulated annealing to search for query plans
Josh Berkus writes:

On 02/26/2015 05:50 PM, Fabrízio de Royes Mello wrote:

On Thu, Feb 26, 2015 at 10:27 PM, Andres Freund and...@2ndquadrant.com wrote:

On 2015-02-26 20:23:33 -0500, Tom Lane wrote:
I seem to recall somebody demo'ing a simulated-annealing GEQO replacement at PGCon a couple years back. It never got to the point of being a submitted patch though.

Yea, it was Jan Urbański (CCed).

And the project link: https://github.com/wulczer/saio

So what we're saying, Grzegorz, is that we would love to see someone pick this up and get it to the point of making it a feature as a GSOC project. I think if you can start from where Jan left off, you could actually complete it.

Sorry, late to the party.

Yes, I wrote a GEQO replacement that used simulated annealing for my Master's thesis. It got to a point where it was generating plans similar to what GEQO outputs for small queries and better plans for very large ones.

The thesis itself is here: https://wulczer.org/saio.pdf and the linked GitHub repo contains the source for the PGCon presentation, which gives a higher-level overview.

The big problem turned out to be writing the step function that generates a new join order from a previous one. Look for the Simulated Annealing challenges and Moves generator chapters in my thesis, which are the only interesting ones :)

If you'd like to pick up where I left off, I'd be more than happy to help in any way I can.

Best,
Jan
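For readers unfamiliar with the technique, here is a generic simulated-annealing skeleton (a Python sketch, not the SAIO code): always accept moves that lower the cost, and accept uphill moves with a probability that shrinks as the temperature drops. In SAIO the state would be a join order and the cost function the planner's cost estimate; here we just minimize f(x) = x^2 over the integers.

```python
import math
import random

# Generic simulated-annealing skeleton (illustrative only, not SAIO).


def cost(state):
    return state * state


def random_move(state, rng):
    # Produce a neighbouring state. In SAIO this is the hard part the thesis
    # discusses: generating a new *valid* join order from the previous one.
    return state + rng.choice([-1, 1])


def anneal(initial, rng, temp=10.0, cooling=0.95, steps_per_temp=20):
    state, best = initial, initial
    while temp > 0.01:
        for _ in range(steps_per_temp):
            candidate = random_move(state, rng)
            delta = cost(candidate) - cost(state)
            # Accept improvements unconditionally; accept regressions with
            # probability exp(-delta/temp), which vanishes as temp cools.
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                state = candidate
            if cost(state) < cost(best):
                best = state
        temp *= cooling
    return best


print(anneal(40, random.Random(0)))
```

The hard engineering is all hidden in random_move: for join orders it must produce a neighbouring plan that is still executable, which is exactly the "Moves generator" problem mentioned above.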
Re: [HACKERS] libpq's multi-threaded SSL callback handling is busted
Andres Freund writes:

On 2015-02-12 09:31:27 +0100, Jan Urbański wrote:
That doesn't solve the problem of the Python deadlock, where you're not at leisure to call a C function at the beginning of your module.

We could just never unload the hooks...

That's what we did before 4e816286533dd34c10b368487d4079595a3e1418 :) And it got changed after http://www.postgresql.org/message-id/48620925.6070...@pws.com.au

* If there's already callbacks set: Remember that fact and don't overwrite. In the next major version: warn.

So yeah, that was my initial approach - check if callbacks are set, don't do the dance if they are. It felt like a crutch, though, and racy at that. There's no atomic way to test-and-set those callbacks. The window for raciness is small, though.

If you do that check during library initialization instead of every connection it shouldn't be racy - if that part is run in a multithreaded fashion you're doing something crazy.

Yes, that's true. The problem is that there's no real libpq initialisation function. The docs say that:

If your application initializes libssl and/or libcrypto libraries and libpq is built with SSL support, you should call PQinitOpenSSL

So most apps will just not bother. The moment you know you'll need SSL is only when you get an 'S' message from the server...

Cheers,
Jan
Re: [HACKERS] libpq's multi-threaded SSL callback handling is busted
Andres Freund writes:

On 2015-02-09 18:17:14 +0100, Jan Urbański wrote:
First of all, the current behaviour is crazy. We're setting and unsetting the locking callback every time a connection is made/closed, which is not how OpenSSL is supposed to be used. The *application* using libpq should set a callback before it starts threads, it's no business of the library's.

I don't buy this argument at all. Delivering a libpq that's not threadsafe in some circumstances is a really bad idea.

I knew this would be a hard sell :( What I know is that the current situation is not good and leaving it as-is causes real grief. The old behaviour was slightly less insane (set callbacks the first time we're engaging OpenSSL code, never touch them again). The real sane solution is to leave it up to the application.

The real solution would be for openssl to do it itself.

I think that the OpenSSL API is the real culprit there - requiring threads to register a callback is what's causing all the issues - but I don't think this will ever get changed.

Note that the PHP pgsql extension doesn't set the OpenSSL callbacks, because libpq was setting them on its own. If we remove the callback handling from libpq, PHP will need to add them. By the way, the MySQL extension for PHP also does not set those callbacks.

I think this shows pretty clearly how insane it would be to change this in a minor release. Do you really expect people to update libpq and php in unison, after a minor release?

Well, I haven't found reports of threaded PHP+MySQL+SSL programs crashing, and the MySQL extension also doesn't care about the callbacks. I think that's either because it's a rare combination, or because almost everyone loads the cURL extension, which *does* set up callbacks. Like I said, it's not a good situation.

Note that we *already* provide applications with the ability to set the callbacks themselves and prevent us from doing so: PQinitSSL(false).
Ah, I only now saw that with PQinitOpenSSL(true, false) you can work around the problem, if you set up your own callbacks first. That seems to at least provide a way to make libpq not do the callbacks dance in released versions. Thank you.

Let me reiterate: I now believe the callbacks should be set by the application, libraries should not touch them, since they don't know what they will be stomping on. If the application is run through a VM like Python or PHP, it's the VM that should make sure the callbacks are set.

I fail to see why it's any better to do it in the VM. That relies on always loading the VM's openssl module, even if you don't actually need it because the library you use deals with openssl. It prevents other implementations of openssl in the VM.

The callbacks are a thing that's owned by the application, so libraries have no business in setting them up. The way OpenSSL's API is specified (very unfortunately IMHO) is that before you use OpenSSL from threads, you need to set the callbacks. If you don't control your application's startup (like when you're a script executed through a VM), you assume the VM took care of it at startup. If you're a library, you assume the user took care of it, as they should.

Now I know that a lot of apps get this wrong. Python almost does the right thing, but you're right, it only sets up callbacks if you load the 'ssl' module. It still feels like setting them up in library code is stomping on things not owned by the library - like libperl setting up signal handlers, which caused problems in Postgres. They're resources not owned by the library!

What I think we should do is the following:

* Emphasize the fact that it's important to use PQinitSSL(false) if you did things yourself.

PQinitOpenSSL(true, false) if anything.
The reason that function got introduced is precisely because PQinitSSL(false) isn't exactly right, see http://www.postgresql.org/message-id/49820d7d.7030...@esilo.com

That doesn't solve the problem of the Python deadlock, where you're not at leisure to call a C function at the beginning of your module.

* If there's already callbacks set: Remember that fact and don't overwrite. In the next major version: warn.

So yeah, that was my initial approach - check if callbacks are set, don't do the dance if they are. It felt like a crutch, though, and racy at that. There's no atomic way to test-and-set those callbacks. The window for raciness is small, though.

* Assert or something if the callback when unregistering isn't the same as it was when registering. That's pretty much guaranteed to cause issues.

So let me try another proposal and see if it doesn't set alarm bells off:

* When opening a connection, if there's a callback set, don't overwrite it (small race window).
* When closing a connection, if the callback set is not pq_lockingcallback/pq_threadidcallback, don't NULL it (small race window).

Asserts in frontend code are impossible, right? There's no way
Re: [HACKERS] libpq's multi-threaded SSL callback handling is busted
Jan Urbański writes:

Andres Freund writes:

On 2015-02-12 09:31:27 +0100, Jan Urbański wrote:
That doesn't solve the problem of the Python deadlock, where you're not at leisure to call a C function at the beginning of your module.

We could just never unload the hooks...

That's what we did before 4e816286533dd34c10b368487d4079595a3e1418 :) And it got changed after http://www.postgresql.org/message-id/48620925.6070...@pws.com.au

* If there's already callbacks set: Remember that fact and don't overwrite. In the next major version: warn.

So yeah, that was my initial approach - check if callbacks are set, don't do the dance if they are. It felt like a crutch, though, and racy at that. There's no atomic way to test-and-set those callbacks. The window for raciness is small, though.

If you do that check during library initialization instead of every connection it shouldn't be racy - if that part is run in a multithreaded fashion you're doing something crazy.

Yes, that's true. The problem is that there's no real libpq initialisation function. The docs say that:

If your application initializes libssl and/or libcrypto libraries and libpq is built with SSL support, you should call PQinitOpenSSL

So most apps will just not bother. The moment you know you'll need SSL is only when you get an 'S' message from the server...

For the sake of discussion, here's a patch to prevent stomping on previously-set callbacks, racy as it looks. FWIW, it does fix the Python deadlock and doesn't cause the PHP segfault...
J

diff --git a/src/interfaces/libpq/fe-secure-openssl.c b/src/interfaces/libpq/fe-secure-openssl.c
index a32af34..54312a5 100644
--- a/src/interfaces/libpq/fe-secure-openssl.c
+++ b/src/interfaces/libpq/fe-secure-openssl.c
@@ -710,6 +710,9 @@ verify_peer_name_matches_certificate(PGconn *conn)
  * Callback functions for OpenSSL internal locking
  */

+#if OPENSSL_VERSION_NUMBER < 0x10000000L
+/* OpenSSL 1.0.0 deprecates the CRYPTO_set_id_callback function and provides a
+ * default implementation, so there's no need for our own. */
 static unsigned long
 pq_threadidcallback(void)
 {
@@ -720,6 +723,7 @@ pq_threadidcallback(void)
 	 */
 	return (unsigned long) pthread_self();
 }
+#endif   /* OPENSSL_VERSION_NUMBER < 0x10000000L */

 static pthread_mutex_t *pq_lockarray;

@@ -806,9 +810,14 @@ pgtls_init(PGconn *conn)
 		if (ssl_open_connections++ == 0)
 		{
-			/* These are only required for threaded libcrypto applications */
-			CRYPTO_set_id_callback(pq_threadidcallback);
-			CRYPTO_set_locking_callback(pq_lockingcallback);
+			/* These are only required for threaded libcrypto applications, but
+			 * make sure we don't stomp on them if they're already set. */
+#if OPENSSL_VERSION_NUMBER < 0x10000000L
+			if (CRYPTO_get_id_callback() == NULL)
+				CRYPTO_set_id_callback(pq_threadidcallback);
+#endif
+			if (CRYPTO_get_locking_callback() == NULL)
+				CRYPTO_set_locking_callback(pq_lockingcallback);
 		}
 	}
 #endif   /* ENABLE_THREAD_SAFETY */
@@ -885,9 +894,14 @@ destroy_ssl_system(void)
 	if (pq_init_crypto_lib && ssl_open_connections == 0)
 	{
-		/* No connections left, unregister libcrypto callbacks */
-		CRYPTO_set_locking_callback(NULL);
-		CRYPTO_set_id_callback(NULL);
+		/* No connections left, unregister libcrypto callbacks, if no one
+		 * registered different ones in the meantime. */
+#if OPENSSL_VERSION_NUMBER < 0x10000000L
+		if (CRYPTO_get_id_callback() == pq_threadidcallback)
+			CRYPTO_set_id_callback(NULL);
+#endif
+		if (CRYPTO_get_locking_callback() == pq_lockingcallback)
+			CRYPTO_set_locking_callback(NULL);

 	/*
 	 * We don't free the lock array or the SSL_context.  If we get another
Re: [HACKERS] libpq's multi-threaded SSL callback handling is busted
Jan Urbański writes:

I did some more digging on bug http://www.postgresql.org/message-id/cahul3dpwyfnugdgo95ohydq4kugdnbkptjq0mnbtubhcmg4...@mail.gmail.com which describes a deadlock when using libpq with SSL in a multi-threaded environment with other threads doing SSL independently. [reproducing instructions]

I posit we should remove all CRYPTO_set_*_callback functions and associated cruft from libpq.

I could submit a patch to get rid of the crazy CRYPTO_*_callback dance in libpq, but at the very least this will require a warning in the release notes.

Attached is a patch doing just that. I would very much like to have this change back-patched, since setting and resetting the callback makes using libpq in a threaded OpenSSL-enabled app arguably less safe than if it didn't use any locking.

Also attached is a patch for 9.4 and all previous supported releases, which is the same thing, but adjusted for when we didn't have a separate fe-secure.c and fe-secure-openssl.c.

If committed, this change will warrant a notice in the release notes. I could try drafting it, if that'd be helpful.

Cheers,
Jan

commit 62f7433f697d49ab005ad22822dc943661a4e48a
Author: Jan Urbański wulc...@wulczer.org
Date: Wed Feb 11 17:47:09 2015 +0100

    Do not attempt to manage OpenSSL locking callbacks in libpq.

    According to OpenSSL documentation, threaded programs using SSL should
    register locking callbacks before starting work. In
    6daf396879b6502c41146cdb1013568654f52ff6 libpq started doing that,
    registering its callbacks when the connection would switch to SSL mode.
    This would lead to overwriting any locking callback a properly written
    threaded SSL application might have already set, but only got uncovered
    as a bug in http://www.postgresql.org/message-id/48620925.6070...@pws.com.au
    where it turned out that unloading libpq would leave a dangling function
    pointer. Specifically, the PHP runtime would segfault after unloading
    libpq, as reported in https://bugs.php.net/bug.php?id=40926

    Commit 4e816286533dd34c10b368487d4079595a3e1418 made libpq unset the
    callback after the last SSL connection was closed. This solved the
    segfault issue, but introduced a potential for deadlocks when one thread
    is using libpq with SSL and another is doing an unrelated SSL operation:

    T1 (libpq)                     T2 (other)
    start libpq connection
      add locking callback
                                   start SSL operation
                                     take lock
    finish libpq connection
      remove locking callback (!)
                                     release lock, noop since no callback

    This got reported in bug http://www.postgresql.org/message-id/cahul3dpwyfnugdgo95ohydq4kugdnbkptjq0mnbtubhcmg4...@mail.gmail.com

    The reality is that it's not libpq's responsibility to manage those
    OpenSSL callbacks. It's up to application code to set them up before
    using OpenSSL in threads; trying to handle them in libpq is just
    stomping on the ones that should already be in place by the time we
    start using shared OpenSSL structures.

    This commit gets rid of all OpenSSL callback handling from libpq. For
    well-behaved applications, this should increase reliability, since they
    will now have certainty that the callbacks they set up before attempting
    to use OpenSSL will be used throughout the program execution.
    Applications that relied on libpq to set up locking callbacks will need
    to be fixed.

diff --git a/src/interfaces/libpq/fe-secure-openssl.c b/src/interfaces/libpq/fe-secure-openssl.c
index a32af34..5d0b132 100644
--- a/src/interfaces/libpq/fe-secure-openssl.c
+++ b/src/interfaces/libpq/fe-secure-openssl.c
@@ -67,7 +67,6 @@ static int verify_cb(int ok, X509_STORE_CTX *ctx);
 static int verify_peer_name_matches_certificate_name(PGconn *conn,
 								ASN1_STRING *name, char **store_name);
-static void destroy_ssl_system(void);
 static int initialize_SSL(PGconn *conn);
 static PostgresPollingStatusType open_client_SSL(PGconn *);
 static char *SSLerrmessage(void);
@@ -90,8 +89,6 @@ static bool pq_init_crypto_lib = true;
 static SSL_CTX *SSL_context = NULL;

 #ifdef ENABLE_THREAD_SAFETY
-static long ssl_open_connections = 0;
-
 #ifndef WIN32
 static pthread_mutex_t ssl_config_mutex = PTHREAD_MUTEX_INITIALIZER;
 #else
@@ -112,16 +109,6 @@ static long win32_ssl_create_mutex = 0;
 void
 pgtls_init_library(bool do_ssl, int do_crypto)
 {
-#ifdef ENABLE_THREAD_SAFETY
-
-	/*
-	 * Disallow changing the flags while we have open connections, else we'd
-	 * get completely confused.
-	 */
-	if (ssl_open_connections != 0)
-		return;
-#endif
-
 	pq_init_ssl_lib = do_ssl;
 	pq_init_crypto_lib = do_crypto;
 }
@@ -705,40 +692,6 @@ verify_peer_name_matches_certificate(PGconn *conn)
 	return found_match && !got_error;
 }

-#ifdef ENABLE_THREAD_SAFETY
-/*
- * Callback functions
[HACKERS] libpq's multi-threaded SSL callback handling is busted
I did some more digging on bug http://www.postgresql.org/message-id/cahul3dpwyfnugdgo95ohydq4kugdnbkptjq0mnbtubhcmg4...@mail.gmail.com which describes a deadlock when using libpq with SSL in a multi-threaded environment with other threads doing SSL independently. Attached is a reproducing Python script which in my experience is faster at showing the problem. Run it with:

python -u pg_lock.py

As Heikki correctly diagnosed, the problem is with libpq unsetting the OpenSSL locking callback while another thread is holding one of the locks. The other thread then never releases the lock, and the next time anyone tries to take it, the application deadlocks. The exact way it goes down is like this:

T1 (libpq)                     T2 (Python)
start libpq connection
  init ssl system
    add locking callback
                               start ssl operation
                                 take lock
finish libpq connection
  destroy ssl system
    remove locking callback (!)
                                 release lock, noop since no callback

And the next time any thread tries to take the lock, it deadlocks.

We added unsetting the locking callback in 4e816286533dd34c10b368487d4079595a3e1418 due to this bug report: http://www.postgresql.org/message-id/48620925.6070...@pws.com.au

Indeed, commenting out the CRYPTO_set_locking_callback(NULL) call in fe-secure-openssl.c gets rid of the deadlock. However, it makes php segfault with the (attached) reproduction script from the original 2008 bug report. If your php.ini loads both the pgsql and curl extensions, reproduce the segfault with:

php -f pg_segfault.php

The most difficult part about fixing this bug is to determine *who's at fault*. I now lean towards the opinion that we shouldn't be messing with OpenSSL callbacks *at all*.

First of all, the current behaviour is crazy. We're setting and unsetting the locking callback every time a connection is made/closed, which is not how OpenSSL is supposed to be used. The *application* using libpq should set a callback before it starts threads, it's no business of the library's.
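The T1/T2 interleaving above can be reduced to a tiny model (Python, with made-up names - not the real OpenSSL API): locks are only touched through a registered callback, so if the callback is removed between a thread's acquire and its release, the release silently becomes a no-op and the lock stays held forever.

```python
import threading

# Toy model of OpenSSL's callback-based locking (hypothetical names).
# crypto_lock() does nothing when no locking callback is registered - which
# is exactly why yanking the callback mid-flight leaks a held lock.

locks = [threading.Lock()]
locking_callback = None


def set_locking_callback(cb):
    global locking_callback
    locking_callback = cb


def default_callback(acquire, n):
    if acquire:
        locks[n].acquire()
    else:
        locks[n].release()


def crypto_lock(acquire, n):
    # Mirrors OpenSSL's behaviour: a silent no-op without a callback.
    if locking_callback is not None:
        locking_callback(acquire, n)


# T2 takes the lock while a callback is installed...
set_locking_callback(default_callback)
crypto_lock(True, 0)
# ...then T1 (libpq finishing its last connection) unregisters it...
set_locking_callback(None)
# ...so T2's matching release is silently a no-op:
crypto_lock(False, 0)
print(locks[0].locked())  # prints True: still held, the next acquire hangs
```

The lock ends up permanently held even though T2 "released" it, so the next thread that tries to acquire it blocks forever - the deadlock described in the bug report.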
The old behaviour was slightly less insane (set callbacks the first time we're engaging OpenSSL code, never touch them again). The real sane solution is to leave it up to the application. I posit we should remove all CRYPTO_set_*_callback functions and associated cruft from libpq.

This unfortunately means that multi-threaded applications using libpq and SSL will break if they haven't been setting their own callbacks (if they have, well, tough luck! libpq will just stomp over them the first time it connects to Postgres, but at least *some* callbacks are left present after that). However, AFAICS if your app is not in C, then runtimes already handle that for you (as they should).

Python: https://hg.python.org/cpython/file/dc820b44ce21/Modules/_ssl.c#l4284
PHP: https://github.com/php/php-src/blob/master/ext/curl/interface.c#L1235

Note that the PHP pgsql extension doesn't set the OpenSSL callbacks, because libpq was setting them on its own. If we remove the callback handling from libpq, PHP will need to add them. By the way, the MySQL extension for PHP also does not set those callbacks.

Let me reiterate: I now believe the callbacks should be set by the application, libraries should not touch them, since they don't know what they will be stomping on. If the application is run through a VM like Python or PHP, it's the VM that should make sure the callbacks are set.

I could submit a patch to get rid of the crazy CRYPTO_*_callback dance in libpq, but at the very least this will require a warning in the release notes about how you can't assume that libpq will take care of making sure your app is multi-threaded safe when using OpenSSL. I also don't know how far that's back-patcheable.

I would very much like to have this change back-patched, since setting and resetting the callback makes using libpq in a threaded OpenSSL-enabled app arguably less safe than if it didn't use any locking. If the app is written correctly, it will have set locking callbacks before starting.
Then libpq will happily stomp on them. If the app hasn't set callbacks, it wasn't written correctly in the first place and it will get segfaults instead of deadlocks.

Thanks,
Jan

#!/usr/bin/env python
import sys
import time
import threading
import urllib

import psycopg2

DB_HOST = '127.0.0.1'
DB_USER = 'postgres'
DB_NAME = 'postgres'
HTTPS_URL = 'https://google.com'
NUM_HTTPS_THREADS = 10

def connect():
    conn = psycopg2.connect(
        host=DB_HOST, user=DB_USER, database=DB_NAME, sslmode='require')
    return conn

class Worker(threading.Thread):
    def __init__(self):
        super(Worker, self).__init__()
        self._stop = threading.Event()

    def stop(self):
        self._stop.set()

    def stopped(self):
        return self._stop.isSet()

    def run(self):
        i = 0
        while not self.stopped():
            i += 1
            self.do_work(i)

class
Re: [HACKERS] Min value for port
On 27/06/13 15:11, Magnus Hagander wrote:

On Thu, Jun 27, 2013 at 2:16 PM, Peter Eisentraut pete...@gmx.net wrote:

On 6/27/13 6:34 AM, Magnus Hagander wrote:
Is there a reason why we have set the min allowed value for port to 1, not 1024? Given that you can't actually start postgres with a value < 1024, shouldn't the entry in pg_settings reference that as well?

Are you thinking of the restriction that you need to be root to use ports < 1024? That restriction is not necessarily universal. We can let the kernel tell us at run time if it doesn't like our port.

Yes, that's the restriction I was talking about. It's just a bit annoying that if you look at pg_settings.min_value it doesn't actually tell you the truth. But yeah, I believe Windows actually lets you use a lower port number, so it'd at least have to be #ifdef'ed for that if we wanted to change it.

There's also authbind and CAP_NET_BIND_SERVICE.

Jan
Re: [HACKERS] updated emacs configuration
On 26/06/13 10:51, Dimitri Fontaine wrote:

Peter Eisentraut pete...@gmx.net writes:
$ git clone git://git.postgresql.org/git/postgresql.git
Cloning into 'postgresql'...

I can reproduce that here. I don't know why I have those postgres dirs then, and I'm pretty confused about my round of testing now.

Maybe you cloned from GitHub, where the mirrored repository is called 'postgres'?

Cheers,
Jan
Re: [HACKERS] storing plpython global pointer
On 28/05/13 14:04, Szymon Guz wrote:

Hi, I need to store a global pointer for plpython usage. This is a PyObject* which can be initialized per session, I think. Where should I keep such a pointer?

Hi, you probably could use a global variable, similar to PLy_interp_globals that's defined in plpy_main.c.

Another method would be to expose the Decimal constructor in the plpy module. You could modify plpy_plpymodule.c to import decimal and expose the Decimal constructor as plpy.Decimal.

Best,
Jan
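The second suggestion amounts to importing decimal once at module initialization and publishing its constructor as a module attribute. A rough Python-level sketch of the idea (the real change would live in plpy_plpymodule.c; the "plpy" object here is a stand-in module, not the actual PL/Python one):

```python
import decimal
import types

# Sketch: cache the Decimal constructor on a module object once, the way
# plpy_plpymodule.c could expose it. "plpy" here is a hypothetical stand-in.
plpy = types.ModuleType("plpy")
plpy.Decimal = decimal.Decimal  # imported and bound once, at module init

# Callers then construct decimals through the module, with no per-call import:
total = plpy.Decimal("1.10") + plpy.Decimal("2.20")
print(total)  # prints 3.30
```

This mirrors the global-pointer approach: the constructor is looked up once per session and reused, rather than re-imported on every call.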
Re: [HACKERS] plpgsql versus SPI plan abstraction
On 30/01/13 22:23, Tom Lane wrote:

BTW, I'm also wondering if it's really necessary for plpython/plpy_spi.c to be looking into spi_priv.h ...

As far as I can tell, it's not necessary, spi.h would be perfectly fine.

Cheers,
Jan
Re: [HACKERS] pg_dump --pretty-print-views
On 28/01/13 12:31, Marko Tiikkaja wrote:

On 1/28/13 12:14 PM, Jeevan Chalke wrote:
I could not apply the patch with git apply, but was able to apply it with the patch -p1 command.

IME that's normal for patches that went through filterdiff. I do:

git diff | filterdiff --format=context

to re-format the patches to the context diff preferred on the mailing list. Maybe if I somehow told git to produce context diffs it would work..

Try this, it worked for me: http://wiki.postgresql.org/wiki/Working_with_Git#Context_diffs_with_Git

Cheers,
Jan
[HACKERS] fix SQL example syntax in file comment
Hi,

Here's a trivial patch that fixes a comment in execProcNode.c. For archeological interest, that comment dates back to when it was written in POSTQUEL... The cleanup in a9b1ff4c1d699c8aa615397d47bb3071275c64ef changed RETRIEVE to SELECT, but forgot to add a FROM clause :)

Cheers,
Jan

diff --git a/src/backend/executor/execProcnode.c b/src/backend/executor/execProcnode.c
index 4c1189e..76dd62f 100644
--- a/src/backend/executor/execProcnode.c
+++ b/src/backend/executor/execProcnode.c
@@ -32,6 +32,7 @@
  *		the number of employees in that department.  So we have the query:
  *
  *		select DEPT.no_emps, EMP.age
+ *		from DEPT, EMP
  *		where EMP.name = DEPT.mgr and
  *			  DEPT.name = "shoe"
  *
Re: [HACKERS] [PATCH] PL/Python: Add spidata to all spiexceptions
On 12/12/12 20:21, Karl O. Pinc wrote: There were 2 outstanding issues raised: Is this useful enough/proceeding in the right direction to commit now? Should some of the logic be moved out of a subroutine and into the calling code? I can see arguments to be made for both sides. I'm asking that you think it through to the extent you deem necessary and make changes or not. At that point it should be ready to send to a committer. (I'll re-test first, if you make any changes.) Oh my, I have dropped the ball on this one in the worst manner possible. Sorry! I actually still prefer to keep the signatures of PLy_get_spi_sqlerrcode and PLy_get_spi_error_data similar, so I'd opt for keeping the patch as-is :) Thanks, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] PL/Python: Add spidata to all spiexceptions
Though not the original patch author, I did modify and submit it to the CF app, so I'll reply here :) On 10/12/12 19:20, Karl O. Pinc wrote: On 12/09/2012 10:33:59 PM, Karl O. Pinc wrote: I've gone ahead and signed up to review this patch. Thanks! There were 2 outstanding issues raised: Is this useful enough/proceeding in the right direction to commit now? I believe the process would be to mark it as Ready for Committer if you feel like it's ready and then a committer would make the final call. Should some of the logic be moved out of a subroutine and into the calling code? I think I structured PLy_get_spi_sqlerrcode to accept the same kind of arguments as PLy_get_spi_error_data, which means a SPIException object and a pointer to whatever values it can fill in. That said, I haven't really thought about that too much, so I'm perfectly fine with moving that bit of logic to the caller. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] PL/Python: Add spidata to all spiexceptions
On 05/11/12 18:35, Robert Haas wrote: On Wed, Oct 31, 2012 at 5:33 AM, Jan Urbański wulc...@wulczer.org wrote: On 30/10/12 22:06, Oskari Saarenmaa wrote: PL/Python maps Python SPIError exceptions with 'spidata' attribute into SQL errors. Here's an alternative patch that takes advantage of the already present (and documented) sqlstate variable to set the error code when handling SPIError exceptions. I also used your test case and added another one, just in case. You should probably add this to the next CF so we don't forget about it. I will, as soon as I recover my community account. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] PL/Python: Add spidata to all spiexceptions
On 05/11/12 19:07, Jan Urbański wrote: On 05/11/12 18:35, Robert Haas wrote: You should probably add this to the next CF so we don't forget about it. I will, as soon as I recover my community account. Added as https://commitfest.postgresql.org/action/patch_view?id=971 J -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] PL/Python: Add spidata to all spiexceptions
On 30/10/12 22:06, Oskari Saarenmaa wrote: PL/Python maps Python SPIError exceptions with 'spidata' attribute into SQL errors. PL/Python also creates classes in plpy.spiexceptions for all known errors but does not initialize their spidata, so when a PL/Python function raises such an exception it is not recognized properly and is always reported as an internal error. You're right, I never thought of the possibility of user code explicitly throwing SPIError exceptions. The root issue is that PLy_elog will only set errcode if it finds the spidata attribute, but I think passing error details through that attribute is a kludge more than something more code should rely on. Here's an alternative patch that takes advantage of the already present (and documented) sqlstate variable to set the error code when handling SPIError exceptions. I also used your test case and added another one, just in case. Thanks, Jan From 633fe3cf5872f57fcc33c5462048c0cfea81d907 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jan=20Urba=C5=84ski?= wulc...@wulczer.org Date: Wed, 31 Oct 2012 10:27:59 +0100 Subject: [PATCH] Set SQL error code for SPI errors raised from userland Python code. If a PL/Python function was terminated by a SPIError raised from user code, the Postgres-level error code was always set to internal_error. This makes it pay attention to the exception's sqlstate attribute and set the error code accordingly. 
--- src/pl/plpython/expected/plpython_error.out | 26 src/pl/plpython/expected/plpython_error_0.out | 26 src/pl/plpython/plpy_elog.c | 40 + src/pl/plpython/sql/plpython_error.sql| 30 +++ 4 files changed, 117 insertions(+), 5 deletions(-) diff --git a/src/pl/plpython/expected/plpython_error.out b/src/pl/plpython/expected/plpython_error.out index e1ec9c2..be2ec97 100644 --- a/src/pl/plpython/expected/plpython_error.out +++ b/src/pl/plpython/expected/plpython_error.out @@ -400,3 +400,29 @@ CONTEXT: Traceback (most recent call last): PL/Python function manual_subxact_prepared, line 4, in module plpy.execute(save) PL/Python function manual_subxact_prepared +/* raising plpy.spiexception.* from python code should preserve sqlstate + */ +CREATE FUNCTION plpy_raise_spiexception() RETURNS void AS $$ +raise plpy.spiexceptions.DivisionByZero() +$$ LANGUAGE plpythonu; +DO $$ +BEGIN + SELECT plpy_raise_spiexception(); +EXCEPTION WHEN division_by_zero THEN + -- NOOP +END +$$ LANGUAGE plpgsql; +/* setting a custom sqlstate should be handled + */ +CREATE FUNCTION plpy_raise_spiexception_override() RETURNS void AS $$ +exc = plpy.spiexceptions.DivisionByZero() +exc.sqlstate = 'SILLY' +raise exc +$$ LANGUAGE plpythonu; +DO $$ +BEGIN + SELECT plpy_raise_spiexception_override(); +EXCEPTION WHEN SQLSTATE 'SILLY' THEN + -- NOOP +END +$$ LANGUAGE plpgsql; diff --git a/src/pl/plpython/expected/plpython_error_0.out b/src/pl/plpython/expected/plpython_error_0.out index 6f74a50..39c63c5 100644 --- a/src/pl/plpython/expected/plpython_error_0.out +++ b/src/pl/plpython/expected/plpython_error_0.out @@ -400,3 +400,29 @@ CONTEXT: Traceback (most recent call last): PL/Python function manual_subxact_prepared, line 4, in module plpy.execute(save) PL/Python function manual_subxact_prepared +/* raising plpy.spiexception.* from python code should preserve sqlstate + */ +CREATE FUNCTION plpy_raise_spiexception() RETURNS void AS $$ +raise plpy.spiexceptions.DivisionByZero() +$$ LANGUAGE plpythonu; +DO $$ 
+BEGIN + SELECT plpy_raise_spiexception(); +EXCEPTION WHEN division_by_zero THEN + -- NOOP +END +$$ LANGUAGE plpgsql; +/* setting a custom sqlstate should be handled + */ +CREATE FUNCTION plpy_raise_spiexception_override() RETURNS void AS $$ +exc = plpy.spiexceptions.DivisionByZero() +exc.sqlstate = 'SILLY' +raise exc +$$ LANGUAGE plpythonu; +DO $$ +BEGIN + SELECT plpy_raise_spiexception_override(); +EXCEPTION WHEN SQLSTATE 'SILLY' THEN + -- NOOP +END +$$ LANGUAGE plpgsql; diff --git a/src/pl/plpython/plpy_elog.c b/src/pl/plpython/plpy_elog.c index c375ac0..e3044e7 100644 --- a/src/pl/plpython/plpy_elog.c +++ b/src/pl/plpython/plpy_elog.c @@ -336,6 +336,29 @@ PLy_traceback(char **xmsg, char **tbmsg, int *tb_depth) Py_DECREF(e); } + +static void +PLy_get_spi_sqlerrcode(PyObject *exc, int *sqlerrcode) +{ + PyObject *sqlstate; + char *buffer; + + sqlstate = PyObject_GetAttrString(exc, "sqlstate"); + if (sqlstate == NULL) + return; + + buffer = PyString_AsString(sqlstate); + if (strlen(buffer) == 5 && + strspn(buffer, "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ") == 5) + { + *sqlerrcode = MAKE_SQLSTATE(buffer[0], buffer[1], buffer[2], + buffer[3], buffer[4]); + } + + Py_DECREF(sqlstate); +} + + /* * Extract the error data from a SPIError */ @@ -345,13 +368,20 @@ PLy_get_spi_error_data(PyObject *exc, int *sqlerrcode, char **detail, char **hin PyObject *spidata = NULL; spidata =
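The check inside PLy_get_spi_sqlerrcode — accept a sqlstate only if it is exactly five characters, all drawn from [0-9A-Z] — is compact enough to restate in plain Python. This is an illustrative re-implementation of that validation, not code from the patch:

```python
def sqlstate_is_valid(sqlstate):
    """Mirror of the C-side check: exactly five chars, each in [0-9A-Z]."""
    allowed = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    return len(sqlstate) == 5 and all(c in allowed for c in sqlstate)

# 'SILLY' passes, which is why the custom-sqlstate regression test works;
# lowercase or short codes fail and the errcode stays internal_error.
print(sqlstate_is_valid("SILLY"), sqlstate_is_valid("silly"))
```

This also explains why the test case deliberately picks 'SILLY': any five uppercase alphanumerics form a syntactically valid SQLSTATE.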
[HACKERS] trivial typo in src/tools/RELEASE_CHANGES
There's a typo in src/tools/RELEASE_CHANGES It just ticked off my OCD I guess... Cheers, Jan diff --git a/src/tools/RELEASE_CHANGES b/src/tools/RELEASE_CHANGES new file mode 100644 index 5f1277a..aba1630 *** a/src/tools/RELEASE_CHANGES --- b/src/tools/RELEASE_CHANGES *** Then doing it like this: *** 164,170 void print_stuff(int arg1, int arg2) { ! print_stuff(arg1, arg2, 0); } would maintain binary compatibility. Obviously this would add a fair --- 164,170 void print_stuff(int arg1, int arg2) { ! print_stuff2(arg1, arg2, 0); } would maintain binary compatibility. Obviously this would add a fair -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PostgreSQL 9.2beta4 ( git HEAD) server crash on creating extension plpython3u
On 21/08/12 20:13, Josh Berkus wrote: No. I get the same backtrace when I try against the 9.1.5 (REL9_1_STABLE) branch. I have reproduced this on Linux; it seems the fix is to run the postmaster with this env variable exported: PYTHONHOME=/opt/ActivePython-3.2/ (or wherever you installed ActivePython). To give credit, I found the decisive clue here: http://manojadinesh.blogspot.com/2012/06/fatal-python-error-pyinitialize-unable.html Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PostgreSQL 9.2beta4 ( git HEAD) server crash on creating extension plpython3u
On 22/08/12 13:28, Sachin Srivastava wrote: Yes, it worked for me also. So will this be a workaround? Or do we intend to use something like Py_SetPythonHome() before calling Py_Initialize()? I think the best we can do is to document that for some installations you might need to set PYTHONHOME. I don't think we can figure it out at runtime (we could then use setenv to fix it). This is similar to having to set PYTHONPATH if you want to import code installed in virtualenvs or other nonstandard locations. I'm leaning towards just documenting it; mucking around with environment variables from inside of Postgres' shared libraries seems like a recipe for disaster. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
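As a side note on finding the right value to document: asking the target interpreter for its sys.prefix gives the directory PYTHONHOME should point at for an embedded libpython. A small sketch — the subprocess call and variable names are mine, not from the thread:

```python
import subprocess
import sys

# Ask an interpreter where its installation lives; that directory is the
# value an embedded libpython expects to find in PYTHONHOME. Here we just
# ask the interpreter running this script, but any python binary works.
prefix = subprocess.check_output(
    [sys.executable, "-c", "import sys; print(sys.prefix)"],
    text=True,
).strip()
print(prefix)
```

For the ActivePython case above, running its own binary this way would print something under /opt/ActivePython-3.2/.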
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 06/08/12 13:59, Heikki Linnakangas wrote: On 20.07.2012 10:13, Jan Urbański wrote: On 20/07/12 08:59, Jan Urbański wrote: On 18/07/12 17:17, Heikki Linnakangas wrote: On 14.07.2012 17:50, Jan Urbański wrote: If pg_do_encoding_conversion() throws an error, you don't get a chance to call Py_DECREF() to release the string. Is that a problem? If an error occurs in PLy_traceback(), after incrementing recursion_depth, you don't get a chance to decrement it again. I'm not sure if the Py* function calls can fail, but at least seemingly trivial things like initStringInfo() can throw an out-of-memory error. Of course you're right (on both accounts). Here's a version with a bunch of PG_TRies thrown in. Silly me, playing tricks with postincrements before fully waking up. Here's v3, with a correct inequality test for exceeding the traceback recursion test. Committed the convert-via-UTF-8 part of this. I'll take a closer look at the recursion check next. Thanks! Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Passing tabular data around using python functions
On 30/07/12 14:33, Achim Domma wrote: Hi, Hi Achim, this list is meant for discussing the development of PostgreSQL; in the future you might want to ask your question on pgsql-general. However, to answer your question: I call the function like this: select * from vectormatch(array(select (docid,conceptid,rank)::fps from fps where docid = 4205591)) and get the following output: NOTICE:type 'list' CONTEXT: PL/Python function vectormatch NOTICE: ['(4205591,1,1)', '(4205591,1219,1)', ...] CONTEXT: PL/Python function vectormatch I'm quite surprised that there are strings in the list and not tuples!? I tried my best, but I have no idea what I might be doing wrong. The main purpose of my sample/experiment is to pass the results of a query to a function and to process it there. Any hint would be very appreciated. Yes, it's a missing feature of PL/Python, but in your case you could work around it by writing your function like this: create or replace function vectormatch(docid integer[], conceptid integer[], rank float4[]) returns table(docid integer, weight float4) as $$ data = zip(docid, conceptid, rank) plpy.notice(data) ... $$ language plpythonu; and then calling it like this: select vectormatch(array_agg(docid), array_agg(conceptid), array_agg(rank)) from fps where docid = 4205591; Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
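The zip() call in the workaround just stitches the three parallel arrays (as produced by array_agg) back into rows. Outside PL/Python the same reconstruction looks like this — the sample values are made up for illustration:

```python
# Three parallel arrays, as array_agg would deliver them to the function.
docid = [4205591, 4205591]
conceptid = [1, 1219]
rank = [1.0, 1.0]

# zip() rebuilds one (docid, conceptid, rank) tuple per row, which is what
# the function body then iterates over instead of parsing row strings.
data = list(zip(docid, conceptid, rank))
print(data)  # [(4205591, 1, 1.0), (4205591, 1219, 1.0)]
```

This sidesteps the missing feature entirely: the function never sees composite-type values, only plain arrays it can recombine itself.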
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 18/07/12 17:17, Heikki Linnakangas wrote: On 14.07.2012 17:50, Jan Urbański wrote: If pg_do_encoding_conversion() throws an error, you don't get a chance to call Py_DECREF() to release the string. Is that a problem? If an error occurs in PLy_traceback(), after incrementing recursion_depth, you don't get a chance to decrement it again. I'm not sure if the Py* function calls can fail, but at least seemingly trivial things like initStringInfo() can throw an out-of-memory error. Of course you're right (on both accounts). Here's a version with a bunch of PG_TRies thrown in. Cheers, Jan diff --git a/src/pl/plpython/plpy_elog.c b/src/pl/plpython/plpy_elog.c new file mode 100644 index c375ac0..9944f72 *** a/src/pl/plpython/plpy_elog.c --- b/src/pl/plpython/plpy_elog.c *** static char *get_source_line(const char *** 28,33 --- 28,41 /* + * Guard against re-entrant calls to PLy_traceback, which can happen if + * traceback formatting functions raise Python errors. + */ + #define TRACEBACK_RECURSION_LIMIT 2 + static int recursion_depth = 0; + + + /* * Emit a PG error or notice, together with any available info about * the current Python error, previously set by PLy_exception_set(). * This should be used to propagate Python errors into PG. If fmt is *** static char *get_source_line(const char *** 38,46 void PLy_elog(int elevel, const char *fmt,...) { ! char *xmsg; ! char *tbmsg; ! int tb_depth; StringInfoData emsg; PyObject *exc, *val, --- 46,54 void PLy_elog(int elevel, const char *fmt,...) { ! char *xmsg = NULL; ! char *tbmsg = NULL; ! int tb_depth = 0; StringInfoData emsg; PyObject *exc, *val, *** PLy_elog(int elevel, const char *fmt,... *** 62,68 } PyErr_Restore(exc, val, tb); ! PLy_traceback(&xmsg, &tbmsg, &tb_depth); if (fmt) { --- 70,90 } PyErr_Restore(exc, val, tb); ! if (recursion_depth++ <= TRACEBACK_RECURSION_LIMIT) ! { ! PG_TRY(); ! { ! PLy_traceback(&xmsg, &tbmsg, &tb_depth); ! } ! PG_CATCH(); ! { ! recursion_depth--; ! PG_RE_THROW(); ! } ! PG_END_TRY(); ! } ! !
recursion_depth--; if (fmt) { diff --git a/src/pl/plpython/plpy_util.c b/src/pl/plpython/plpy_util.c new file mode 100644 index 4aabafc..94c035e *** a/src/pl/plpython/plpy_util.c --- b/src/pl/plpython/plpy_util.c *** PLy_free(void *ptr) *** 61,126 PyObject * PLyUnicode_Bytes(PyObject *unicode) { ! PyObject *rv; ! const char *serverenc; ! /* ! * Map PostgreSQL encoding to a Python encoding name. ! */ ! switch (GetDatabaseEncoding()) { ! case PG_SQL_ASCII: ! /* ! * Mapping SQL_ASCII to Python's 'ascii' is a bit bogus. Python's ! * 'ascii' means true 7-bit only ASCII, while PostgreSQL's ! * SQL_ASCII means that anything is allowed, and the system doesn't ! * try to interpret the bytes in any way. But not sure what else ! * to do, and we haven't heard any complaints... ! */ ! serverenc = "ascii"; ! break; ! case PG_WIN1250: ! serverenc = "cp1250"; ! break; ! case PG_WIN1251: ! serverenc = "cp1251"; ! break; ! case PG_WIN1252: ! serverenc = "cp1252"; ! break; ! case PG_WIN1253: ! serverenc = "cp1253"; ! break; ! case PG_WIN1254: ! serverenc = "cp1254"; ! break; ! case PG_WIN1255: ! serverenc = "cp1255"; ! break; ! case PG_WIN1256: ! serverenc = "cp1256"; ! break; ! case PG_WIN1257: ! serverenc = "cp1257"; ! break; ! case PG_WIN1258: ! serverenc = "cp1258"; ! break; ! case PG_WIN866: ! serverenc = "cp866"; ! break; ! case PG_WIN874: ! serverenc = "cp874"; ! break; ! default: ! /* Other encodings have the same name in Python. */ ! serverenc = GetDatabaseEncodingName(); ! break; } ! rv = PyUnicode_AsEncodedString(unicode, serverenc, "strict"); ! if (rv == NULL) ! PLy_elog(ERROR, "could not convert Python Unicode object to PostgreSQL server encoding"); return rv; } --- 61,106 PyObject * PLyUnicode_Bytes(PyObject *unicode) { ! PyObject *bytes, *rv; ! char *utf8string, *encoded; ! /* first encode the Python unicode object with UTF-8 */ ! bytes = PyUnicode_AsUTF8String(unicode); ! if (bytes == NULL) ! PLy_elog(ERROR, "could not convert Python Unicode object to bytes"); ! !
utf8string = PyBytes_AsString(bytes); ! if (utf8string == NULL) { ! Py_DECREF(bytes); ! PLy_elog(ERROR, "could not extract bytes from encoded string"); ! } ! ! /* then convert to server encoding */ ! PG_TRY(); { ! encoded = (char *) pg_do_encoding_conversion( ! (unsigned char *) utf8string, ! strlen(utf8string), ! PG_UTF8, ! GetDatabaseEncoding()); ! } ! PG_CATCH(); ! { ! Py_DECREF(bytes); ! PG_RE_THROW(); } + PG_END_TRY(); ! /* finally, build a bytes object in the server encoding */ ! rv = PyBytes_FromStringAndSize(encoded, strlen(encoded
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 20/07/12 08:59, Jan Urbański wrote: On 18/07/12 17:17, Heikki Linnakangas wrote: On 14.07.2012 17:50, Jan Urbański wrote: If pg_do_encoding_conversion() throws an error, you don't get a chance to call Py_DECREF() to release the string. Is that a problem? If an error occurs in PLy_traceback(), after incrementing recursion_depth, you don't get a chance to decrement it again. I'm not sure if the Py* function calls can fail, but at least seemingly trivial things like initStringInfo() can throw an out-of-memory error. Of course you're right (on both accounts). Here's a version with a bunch of PG_TRies thrown in. Silly me, playing tricks with postincrements before fully waking up. Here's v3, with a correct inequality test for exceeding the traceback recursion limit. J diff --git a/src/pl/plpython/plpy_elog.c b/src/pl/plpython/plpy_elog.c new file mode 100644 index c375ac0..0ad8358 *** a/src/pl/plpython/plpy_elog.c --- b/src/pl/plpython/plpy_elog.c *** static char *get_source_line(const char *** 28,33 --- 28,41 /* + * Guard against re-entrant calls to PLy_traceback, which can happen if + * traceback formatting functions raise Python errors. + */ + #define TRACEBACK_RECURSION_LIMIT 2 + static int recursion_depth = 0; + + + /* * Emit a PG error or notice, together with any available info about * the current Python error, previously set by PLy_exception_set(). * This should be used to propagate Python errors into PG. If fmt is *** static char *get_source_line(const char *** 38,46 void PLy_elog(int elevel, const char *fmt,...) { ! char *xmsg; ! char *tbmsg; ! int tb_depth; StringInfoData emsg; PyObject *exc, *val, --- 46,54 void PLy_elog(int elevel, const char *fmt,...) { ! char *xmsg = NULL; ! char *tbmsg = NULL; ! int tb_depth = 0; StringInfoData emsg; PyObject *exc, *val, *** PLy_elog(int elevel, const char *fmt,... *** 62,68 } PyErr_Restore(exc, val, tb); ! PLy_traceback(&xmsg, &tbmsg, &tb_depth); if (fmt) { --- 70,90 } PyErr_Restore(exc, val, tb); !
if (recursion_depth++ < TRACEBACK_RECURSION_LIMIT) ! { ! PG_TRY(); ! { ! PLy_traceback(&xmsg, &tbmsg, &tb_depth); ! } ! PG_CATCH(); ! { ! recursion_depth--; ! PG_RE_THROW(); ! } ! PG_END_TRY(); ! } ! ! recursion_depth--; if (fmt) { diff --git a/src/pl/plpython/plpy_util.c b/src/pl/plpython/plpy_util.c new file mode 100644 index 4aabafc..94c035e *** a/src/pl/plpython/plpy_util.c --- b/src/pl/plpython/plpy_util.c *** PLy_free(void *ptr) *** 61,126 PyObject * PLyUnicode_Bytes(PyObject *unicode) { ! PyObject *rv; ! const char *serverenc; ! /* ! * Map PostgreSQL encoding to a Python encoding name. ! */ ! switch (GetDatabaseEncoding()) { ! case PG_SQL_ASCII: ! /* ! * Mapping SQL_ASCII to Python's 'ascii' is a bit bogus. Python's ! * 'ascii' means true 7-bit only ASCII, while PostgreSQL's ! * SQL_ASCII means that anything is allowed, and the system doesn't ! * try to interpret the bytes in any way. But not sure what else ! * to do, and we haven't heard any complaints... ! */ ! serverenc = "ascii"; ! break; ! case PG_WIN1250: ! serverenc = "cp1250"; ! break; ! case PG_WIN1251: ! serverenc = "cp1251"; ! break; ! case PG_WIN1252: ! serverenc = "cp1252"; ! break; ! case PG_WIN1253: ! serverenc = "cp1253"; ! break; ! case PG_WIN1254: ! serverenc = "cp1254"; ! break; ! case PG_WIN1255: ! serverenc = "cp1255"; ! break; ! case PG_WIN1256: ! serverenc = "cp1256"; ! break; ! case PG_WIN1257: ! serverenc = "cp1257"; ! break; ! case PG_WIN1258: ! serverenc = "cp1258"; ! break; ! case PG_WIN866: ! serverenc = "cp866"; ! break; ! case PG_WIN874: ! serverenc = "cp874"; ! break; ! default: ! /* Other encodings have the same name in Python. */ ! serverenc = GetDatabaseEncodingName(); ! break; } ! rv = PyUnicode_AsEncodedString(unicode, serverenc, "strict"); ! if (rv == NULL) ! PLy_elog(ERROR, "could not convert Python Unicode object to PostgreSQL server encoding"); return rv; } --- 61,106 PyObject * PLyUnicode_Bytes(PyObject *unicode) { ! PyObject *bytes, *rv; ! char *utf8string, *encoded; !
/* first encode the Python unicode object with UTF-8 */ ! bytes = PyUnicode_AsUTF8String(unicode); ! if (bytes == NULL) ! PLy_elog(ERROR, "could not convert Python Unicode object to bytes"); ! ! utf8string = PyBytes_AsString(bytes); ! if (utf8string == NULL) { ! Py_DECREF(bytes); ! PLy_elog(ERROR, "could not extract bytes from encoded string"); ! } ! ! /* then convert to server encoding */ ! PG_TRY(); { ! encoded = (char *) pg_do_encoding_conversion( ! (unsigned char *) utf8string, ! strlen(utf8string), ! PG_UTF8, ! GetDatabaseEncoding()); ! } ! PG_CATCH
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 13/07/12 13:38, Jan Urbański wrote: On 12/07/12 11:08, Heikki Linnakangas wrote: On 07.07.2012 00:12, Jan Urbański wrote: So you're in favour of doing unicode -> bytes by encoding with UTF-8 and then using the server's encoding functions? Sounds reasonable to me. The extra conversion between UTF-8 and UCS-2 should be quite fast, and it would be good to be consistent in the way we do conversions in both directions. I'll implement that then (sorry for not following up on that earlier). Here's a patch that always encodes Python unicode objects using UTF-8 and then uses Postgres's internal functions to produce bytes in the server encoding. Cheers, Jan diff --git a/src/pl/plpython/plpy_elog.c b/src/pl/plpython/plpy_elog.c new file mode 100644 index c375ac0..c2b3cb8 *** a/src/pl/plpython/plpy_elog.c --- b/src/pl/plpython/plpy_elog.c *** static char *get_source_line(const char *** 28,33 --- 28,41 /* + * Guard against re-entrant calls to PLy_traceback, which can happen if + * traceback formatting functions raise Python errors. + */ + #define TRACEBACK_RECURSION_LIMIT 2 + static int recursion_depth = 0; + + + /* * Emit a PG error or notice, together with any available info about * the current Python error, previously set by PLy_exception_set(). * This should be used to propagate Python errors into PG. If fmt is *** PLy_traceback(char **xmsg, char **tbmsg, *** 147,166 StringInfoData xstr; StringInfoData tbstr; /* * get the current exception */ PyErr_Fetch(&e, &v, &tb); /* ! * oops, no exception, return */ ! if (e == NULL) { *xmsg = NULL; *tbmsg = NULL; *tb_depth = 0; return; } --- 155,177 StringInfoData xstr; StringInfoData tbstr; + recursion_depth++; + /* * get the current exception */ PyErr_Fetch(&e, &v, &tb); /* ! * oops, no exception or recursion depth exceeded, return */ !
if (e == NULL || recursion_depth > TRACEBACK_RECURSION_LIMIT) { *xmsg = NULL; *tbmsg = NULL; *tb_depth = 0; + recursion_depth--; return; } *** PLy_traceback(char **xmsg, char **tbmsg, *** 326,331 --- 337,344 (*tb_depth)++; } + recursion_depth--; + /* Return the traceback. */ *tbmsg = tbstr.data; diff --git a/src/pl/plpython/plpy_util.c b/src/pl/plpython/plpy_util.c new file mode 100644 index 4aabafc..fe43312 *** a/src/pl/plpython/plpy_util.c --- b/src/pl/plpython/plpy_util.c *** PLy_free(void *ptr) *** 61,126 PyObject * PLyUnicode_Bytes(PyObject *unicode) { ! PyObject *rv; ! const char *serverenc; ! /* ! * Map PostgreSQL encoding to a Python encoding name. ! */ ! switch (GetDatabaseEncoding()) { ! case PG_SQL_ASCII: ! /* ! * Mapping SQL_ASCII to Python's 'ascii' is a bit bogus. Python's ! * 'ascii' means true 7-bit only ASCII, while PostgreSQL's ! * SQL_ASCII means that anything is allowed, and the system doesn't ! * try to interpret the bytes in any way. But not sure what else ! * to do, and we haven't heard any complaints... ! */ ! serverenc = "ascii"; ! break; ! case PG_WIN1250: ! serverenc = "cp1250"; ! break; ! case PG_WIN1251: ! serverenc = "cp1251"; ! break; ! case PG_WIN1252: ! serverenc = "cp1252"; ! break; ! case PG_WIN1253: ! serverenc = "cp1253"; ! break; ! case PG_WIN1254: ! serverenc = "cp1254"; ! break; ! case PG_WIN1255: ! serverenc = "cp1255"; ! break; ! case PG_WIN1256: ! serverenc = "cp1256"; ! break; ! case PG_WIN1257: ! serverenc = "cp1257"; ! break; ! case PG_WIN1258: ! serverenc = "cp1258"; ! break; ! case PG_WIN866: ! serverenc = "cp866"; ! break; ! case PG_WIN874: ! serverenc = "cp874"; ! break; ! default: ! /* Other encodings have the same name in Python. */ ! serverenc = GetDatabaseEncodingName(); ! break; } ! rv = PyUnicode_AsEncodedString(unicode, serverenc, "strict"); ! if (rv == NULL) ! PLy_elog(ERROR, "could not convert Python Unicode object to PostgreSQL server encoding"); return rv; } --- 61,96 PyObject * PLyUnicode_Bytes(PyObject *unicode) { !
PyObject *bytes, *rv; ! char *utf8string, *encoded; ! /* first encode the Python unicode object with UTF-8 */ ! bytes = PyUnicode_AsUTF8String(unicode); ! if (bytes == NULL) ! PLy_elog(ERROR, "could not convert Python Unicode object to bytes"); ! ! utf8string = PyBytes_AsString(bytes); ! if (utf8string == NULL) { ! Py_DECREF(bytes); ! PLy_elog(ERROR, "could not extract bytes from encoded string"); ! } ! ! /* then convert to server encoding */ ! encoded = (char *) pg_do_encoding_conversion((unsigned char *) utf8string, ! strlen(utf8string), ! PG_UTF8, ! GetDatabaseEncoding()); ! ! /* finally, build a bytes object in the server encoding */ ! rv = PyBytes_FromStringAndSize(encoded, strlen(encoded
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 12/07/12 11:08, Heikki Linnakangas wrote: On 07.07.2012 00:12, Jan Urbański wrote: On 06/07/12 22:47, Peter Eisentraut wrote: On fre, 2012-07-06 at 18:53 +0300, Heikki Linnakangas wrote: What shall we do about those? Ignore them? Document that if you're using one of these encodings then PL/Python with Python 2 will be crippled and with Python 3 just won't work? We could convert to UTF-8, and use the PostgreSQL functions to convert from UTF-8 to the server encoding. Double conversion might be slow, but I think it would be better than failing. Actually, we already do the other direction that way (PLyUnicode_FromStringAndSize), so maybe it would be more consistent to always use this. I would hesitate to use this as a kind of fallback, because then we would sometimes be using PostgreSQL's recoding tables and sometimes Python's recoding tables, which could become confusing. So you're in favour of doing unicode -> bytes by encoding with UTF-8 and then using the server's encoding functions? Sounds reasonable to me. The extra conversion between UTF-8 and UCS-2 should be quite fast, and it would be good to be consistent in the way we do conversions in both directions. I'll implement that then (sorry for not following up on that earlier). J -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
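The agreed scheme — encode to UTF-8 first, then recode into the server encoding — can be mimicked with Python's own codecs. Here cp1250 stands in for the server encoding, and the second hop is done by Python rather than pg_do_encoding_conversion, so this only illustrates the data flow, not the actual patch:

```python
# A string with non-ASCII characters ("żółw", Polish for "turtle").
text = "\u017c\u00f3\u0142w"

# Hop 1: the unicode object is always encoded to UTF-8 first.
utf8_bytes = text.encode("utf-8")

# Hop 2: the UTF-8 byte string is recoded into the target (server)
# encoding; in the patch this step uses PostgreSQL's recoding tables.
server_bytes = utf8_bytes.decode("utf-8").encode("cp1250")
print(len(utf8_bytes), len(server_bytes))  # 8 4
```

The payoff is consistency: both directions go through PostgreSQL's recoding tables, instead of trusting Python's tables for one direction and PostgreSQL's for the other.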
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 06/07/12 10:05, Heikki Linnakangas wrote: On 06.07.2012 00:54, Jan Urbański wrote: On 05/07/12 23:30, Peter Eisentraut wrote: On tor, 2012-07-05 at 22:53 +0200, Jan Urbański wrote: The problem is that PLyUnicode_Bytes is (via an ifdef) used as PyString_ToString on Python3, which means that there are numerous call sites and new ones might appear at any moment. I'm not that keen on invoking the traceback machinery on low-level encoding errors. Why not? Because it can lead to recursion errors, like the one this patch was supposed to fix. The traceback machinery calls into the encoding functions, because it converts Python strings (like function names) into C strings. In the backend elog routines, there is a global variable 'recursion_depth', which is incremented when an error-handling routine is entered, and decremented afterwards. Can we use a similar mechanism in PLy_elog() to detect and stop recursion? I guess we can; I'll try to do some tests in order to see if there's an easy user-triggerable way of causing PLy_elog to recurse, and if not then a guard like this should be enough as a safety measure against as yet unknown conditions (as opposed to something we expect to happen regularly). Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 06/07/12 10:14, Jan Urbański wrote: On 06/07/12 10:05, Heikki Linnakangas wrote: In the backend elog routines, there is a global variable 'recursion_depth', which is incremented when an error-handling routine is entered, and decremented afterwards. Can we use a similar mechanism in PLy_elog() to detect and stop recursion? I guess we can; I'll try to do some tests in order to see if there's an easy user-triggerable way of causing PLy_elog to recurse, and if not then a guard like this should be enough as a safety measure against as yet unknown conditions (as opposed to something we expect to happen regularly). Attached is a patch that stores the recursion level of PLy_traceback and prevents it from running if it's too deep (PLy_traceback is the one doing the heavy lifting, that's why I chose to put the logic to skip running there). I tried a few things and was not able to easily invoke the infinite recursion condition, but I did notice that there are two more encodings that have different names in Postgres and in Python (KOI8-R and KOI8-U) and added them to the switch. There's still trouble with EUC_TW and MULE_INTERNAL which don't have Python equivalents. EUC-TW has been discussed in http://bugs.python.org/issue2066 and rejected (see http://bugs.python.org/issue2066#msg113731). If you use any of these encodings, you *will* get into the recursion trouble described earlier, just as before the patch you'd get into it with CP1252 as your encoding. What shall we do about those? Ignore them? Document that if you're using one of these encodings then PL/Python with Python 2 will be crippled and with Python 3 just won't work?
Cheers, Jan diff --git a/src/pl/plpython/plpy_elog.c b/src/pl/plpython/plpy_elog.c new file mode 100644 index c375ac0..c2b3cb8 *** a/src/pl/plpython/plpy_elog.c --- b/src/pl/plpython/plpy_elog.c *** static char *get_source_line(const char *** 28,33 --- 28,41 /* + * Guard against re-entrant calls to PLy_traceback, which can happen if + * traceback formatting functions raise Python errors. + */ + #define TRACEBACK_RECURSION_LIMIT 2 + static int recursion_depth = 0; + + + /* * Emit a PG error or notice, together with any available info about * the current Python error, previously set by PLy_exception_set(). * This should be used to propagate Python errors into PG. If fmt is *** PLy_traceback(char **xmsg, char **tbmsg, *** 147,166 StringInfoData xstr; StringInfoData tbstr; /* * get the current exception */ PyErr_Fetch(&e, &v, &tb); /* ! * oops, no exception, return */ ! if (e == NULL) { *xmsg = NULL; *tbmsg = NULL; *tb_depth = 0; return; } --- 155,177 StringInfoData xstr; StringInfoData tbstr; + recursion_depth++; + /* * get the current exception */ PyErr_Fetch(&e, &v, &tb); /* ! * oops, no exception or recursion depth exceeded, return */ ! if (e == NULL || recursion_depth > TRACEBACK_RECURSION_LIMIT) { *xmsg = NULL; *tbmsg = NULL; *tb_depth = 0; + recursion_depth--; return; } *** PLy_traceback(char **xmsg, char **tbmsg, *** 326,331 --- 337,344 (*tb_depth)++; } + recursion_depth--; + /* Return the traceback. */ *tbmsg = tbstr.data; diff --git a/src/pl/plpython/plpy_util.c b/src/pl/plpython/plpy_util.c new file mode 100644 index bf29532..ea4ecdf *** a/src/pl/plpython/plpy_util.c --- b/src/pl/plpython/plpy_util.c *** PLyUnicode_Bytes(PyObject *unicode) *** 112,117 --- 112,123 case PG_WIN874: serverenc = "cp874"; break; + case PG_KOI8R: + serverenc = "koi8-r"; + break; + case PG_KOI8U: + serverenc = "koi8-u"; + break; default: /* Other encodings have the same name in Python.
*/ serverenc = GetDatabaseEncodingName(); *** PLyUnicode_Bytes(PyObject *unicode) *** 120,135 rv = PyUnicode_AsEncodedString(unicode, serverenc, "strict"); if (rv == NULL) ! { ! /* ! * Use a plain ereport instead of PLy_elog to avoid recursion, if ! * the traceback formatting functions try to do unicode to bytes ! * conversion again. ! */ ! ereport(ERROR, ! (errcode(ERRCODE_INTERNAL_ERROR), ! errmsg("could not convert Python Unicode object to PostgreSQL server encoding"))); ! } return rv; } --- 126,132 rv = PyUnicode_AsEncodedString(unicode, serverenc, "strict"); if (rv == NULL) ! PLy_elog(ERROR, "could not convert Python Unicode object to PostgreSQL server encoding"); return rv; } -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 06/07/12 22:47, Peter Eisentraut wrote: On fre, 2012-07-06 at 18:53 +0300, Heikki Linnakangas wrote: What shall we do about those? Ignore them? Document that if you're using one of these encodings then PL/Python with Python 2 will be crippled and with Python 3 just won't work? We could convert to UTF-8, and use the PostgreSQL functions to convert from UTF-8 to the server encoding. Double conversion might be slow, but I think it would be better than failing. Actually, we already do the other direction that way (PLyUnicode_FromStringAndSize), so maybe it would be more consistent to always use this. I would hesitate to use this as a kind of fallback, because then we would sometimes be using PostgreSQL's recoding tables and sometimes Python's recoding tables, which could become confusing. So you're in favour of doing unicode-to-bytes conversion by encoding with UTF-8 and then using the server's encoding functions?
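The double-conversion idea discussed above can be sketched in Python. This is only an illustration of the proposal, not the implementation: `server_recode` here is a hypothetical stand-in for PostgreSQL's server-side UTF-8-to-server-encoding conversion (`pg_do_encoding_conversion` on the C side); the point is that the Python half always uses one codec Python is guaranteed to have.

```python
def unicode_to_server(s, server_recode):
    """Encode via UTF-8 first, then hand off to the server's recoding tables.

    s             -- a Python unicode string
    server_recode -- callable standing in for PostgreSQL's
                     UTF-8 -> server-encoding conversion
    """
    utf8_bytes = s.encode("utf-8")   # Python's tables; cannot fail for valid str
    return server_recode(utf8_bytes)  # PostgreSQL's tables take over from here
```

The trade-off named in the thread is visible in the sketch: the first step always uses Python's tables and the second always uses PostgreSQL's, so behavior is uniform, at the cost of converting twice.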
Re: [HACKERS] plpython issue with Win64 (PG 9.2)
On 05/07/12 21:37, Heikki Linnakangas wrote: Committed. This bug was present in versions >= 9.0, so backpatched. Thanks! I used ereport() rather than elog() in the error message. Correct me if that was wrong, but the point was to avoid PLy_elog(), because that might cause recursion, and ereport() should be ok. I believe the message should be translated, as it's quite possible to get that error, at least if you use SQL_ASCII, so ereport() is more appropriate than elog(). Yes, you're absolutely right. Cheers, Jan
[HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 05/07/12 22:37, Heikki Linnakangas wrote: On 05.07.2012 23:31, Tom Lane wrote: Heikki Linnakangas <heikki.linnakan...@iki.fi> writes: Fix mapping of PostgreSQL encodings to Python encodings. The buildfarm doesn't like this --- did you check for side effects on regression test results? Hmm, I ran the regression tests, but not with C encoding. With the patch, you no longer get the errdetail you used to, when an encoding conversion fails: *** *** 41,47 SELECT unicode_plan1(); ERROR: spiexceptions.InternalError: could not convert Python Unicode object to PostgreSQL server encoding - DETAIL: UnicodeEncodeError: 'ascii' codec can't encode character u'\x80' in position 0: ordinal not in range(128) CONTEXT: Traceback (most recent call last): PL/Python function "unicode_plan1", line 3, in <module> rv = plpy.execute(plan, [u"\x80"], 1) --- 39,44 We could just update the expected output; there are two expected outputs for this test case and one of them is now wrong. But it'd actually be quite a shame to lose that extra information, that's quite valuable. Perhaps we should go back to using PLy_elog() here, and find some other way to avoid the recursion. Seems that the problem is that LC_ALL=C makes Postgres use SQL_ASCII as the database encoding and, as the comment states, translating PG's SQL_ASCII to Python's "ascii" is not ideal. The problem is that PLyUnicode_Bytes is (via an ifdef) used as PyString_ToString on Python 3, which means that there are numerous call sites and new ones might appear at any moment. I'm not that keen on invoking the traceback machinery on low-level encoding errors. Hm, since PLyUnicode_Bytes should get a unicode object and return bytes in the server encoding, we might just say that for SQL_ASCII we arbitrarily choose UTF-8 to encode the unicode codepoints, so we'd just set serverenc = "utf-8" in the first switch case. That doesn't solve the problem of the missing error detail, though.
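The failure quoted in that regression diff can be reproduced in plain Python. A minimal sketch (the helper name `encode_for_server` is illustrative): mapping SQL_ASCII to the "ascii" codec means any codepoint above U+007F raises, with "strict" being the same error mode PLyUnicode_Bytes passes to PyUnicode_AsEncodedString; the arbitrary-UTF-8 fallback proposed above always succeeds.

```python
def encode_for_server(s, server_codec):
    """Encode a Python str the way PLyUnicode_Bytes does: strict errors."""
    return s.encode(server_codec, "strict")

# encode_for_server(u"\x80", "ascii")  raises UnicodeEncodeError --
# the exact error shown in the DETAIL line of the old expected output.
# encode_for_server(u"\x80", "utf-8")  succeeds: the proposed fallback
# for SQL_ASCII, where any byte sequence is acceptable anyway.
```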
Jan
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 05/07/12 23:30, Peter Eisentraut wrote: On tor, 2012-07-05 at 22:53 +0200, Jan Urbański wrote: The problem is that PLyUnicode_Bytes is (via an ifdef) used as PyString_ToString on Python3, which means that there are numerous call sites and new ones might appear in any moment. I'm not that keen on invoking the traceback machinery on low-level encoding errors. Why not? Because it can lead to recursion errors, like the one this patch was supposed to fix. The traceback machinery calls into the encoding functions, because it converts Python strings (like function names) into C strings. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython issue with Win64 (PG 9.2)
On 04/07/12 13:58, Asif Naeem wrote: Patch attached. Asif, could you try a few things on a CP1252 database? First verify if your original test case now works and then try this: I have tested the patch on Win64. The postgres server is working fine now for WIN1252. Thanks. create function enctest() returns text as $$ return b'tr\xc3\xb3spido'.decode('utf-8') $$ language plpython3u; select enctest(), encode(convert_to(enctest(), 'utf-8'), 'hex'); create function enctest() returns text as $$ return b'tr\xc3\xb3spido'.decode('utf-8') $$ language plpython3u; select enctest(), encode(convert_to(enctest(), 'utf-8'), 'hex'); enctest | encode --+ tróspido | 7472c3b3737069646f (1 row) Please do let me know if you have any other query. Thanks. Great, this looks correct. Can we apply this to 9.2? Cheers, Jan
Re: [HACKERS] plpython issue with Win64 (PG 9.2)
On 29/06/12 00:36, Jan Urbański wrote: On 27/06/12 13:57, Jan Urbański wrote: On 27/06/12 11:51, Asif Naeem wrote: Hi, On Windows 7 64bit, plpython is causing server crash with the following test case i.e. So: I'd add code to translate WINxxx into CPxxx when choosing the Python to use, change PLy_elog to elog in PLyUnicode_Bytes and leave the SQL_ASCII case alone, as there were no complaints and people using SQL_ASCII are asking for it anyway. Since no one commented, I'll produce a patch to that effect. I believe this should go into 9.2 given that otherwise PL/Python will basically crash any database using the CP12xx encoding. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython issue with Win64 (PG 9.2)
On 03/07/12 17:45, Jan Urbański wrote: On 29/06/12 00:36, Jan Urbański wrote: On 27/06/12 13:57, Jan Urbański wrote: On 27/06/12 11:51, Asif Naeem wrote: Hi, On Windows 7 64bit, plpython is causing server crash with the following test case i.e. So: I'd add code to translate WINxxx into CPxxx when choosing the Python encoding to use, change PLy_elog to elog in PLyUnicode_Bytes and leave the SQL_ASCII case alone, as there were no complaints and people using SQL_ASCII are asking for it anyway. Since no one commented, I'll produce a patch to that effect. I believe this should go into 9.2 given that otherwise PL/Python will basically crash any database using the CP12xx encoding. Patch attached. Asif, could you try a few things on a CP1252 database? First verify if your original test case now works and then try this: create function enctest() returns text as $$ return b'tr\xc3\xb3spido'.decode('utf-8') $$ language plpython3u; select enctest(), encode(convert_to(enctest(), 'utf-8'), 'hex'); Thanks, Jan diff --git a/src/pl/plpython/plpy_util.c b/src/pl/plpython/plpy_util.c new file mode 100644 index 9a4901e..5fafbd1 *** a/src/pl/plpython/plpy_util.c --- b/src/pl/plpython/plpy_util.c *** PLyUnicode_Bytes(PyObject *unicode) *** 66,80 /* * Python understands almost all PostgreSQL encoding names, but it doesn't ! * know SQL_ASCII. */ ! if (GetDatabaseEncoding() == PG_SQL_ASCII) ! serverenc = "ascii"; ! else ! serverenc = GetDatabaseEncodingName(); rv = PyUnicode_AsEncodedString(unicode, serverenc, "strict"); ! if (rv == NULL) ! PLy_elog(ERROR, "could not convert Python Unicode object to PostgreSQL server encoding"); return rv; } --- 66,125 /* * Python understands almost all PostgreSQL encoding names, but it doesn't ! * know SQL_ASCII and calls the Windows encodings differently. */ ! switch (GetDatabaseEncoding()) ! { ! case PG_SQL_ASCII: ! serverenc = "ascii"; ! break; ! case PG_WIN1250: ! serverenc = "cp1250"; ! break; ! case PG_WIN1251: ! serverenc = "cp1251"; ! break; ! case PG_WIN1252: ! serverenc = "cp1252"; ! break; ! case PG_WIN1253: ! serverenc = "cp1253"; ! break; ! case PG_WIN1254: ! serverenc = "cp1254"; ! break; ! case PG_WIN1255: ! serverenc = "cp1255"; ! break; ! case PG_WIN1256: ! serverenc = "cp1256"; ! break; ! case PG_WIN1257: ! serverenc = "cp1257"; ! break; ! case PG_WIN1258: ! serverenc = "cp1258"; ! break; ! case PG_WIN866: ! serverenc = "cp866"; ! break; ! case PG_WIN874: ! serverenc = "cp874"; ! break; ! default: ! serverenc = GetDatabaseEncodingName(); ! break; ! } ! rv = PyUnicode_AsEncodedString(unicode, serverenc, "strict"); ! if (rv == NULL) { ! /* ! * Use a plain elog instead of PLy_elog here to avoid getting into ! * recursion trouble when the traceback formatting functions try doing ! * unicode to bytes conversion. ! */ ! elog(ERROR, "could not convert Python Unicode object to PostgreSQL server encoding"); ! } return rv; }
Re: [HACKERS] plpython issue with Win64 (PG 9.2)
On 27/06/12 13:57, Jan Urbański wrote: On 27/06/12 11:51, Asif Naeem wrote: Hi, On Windows 7 64bit, plpython is causing server crash with the following test case i.e. CREATE PROCEDURAL LANGUAGE 'plpython3u'; CREATE OR REPLACE FUNCTION pymax (a integer, b integer) RETURNS integer AS $$ if a > b: return a return b $$ LANGUAGE plpython3u; SELECT pymax(1, 2); I think the primary reason that triggers this issue is that when the function PLyUnicode_Bytes() calls PyUnicode_AsEncodedString( , "WIN1252" /* server encoding */, ), it fails with null. I built latest pg 9.2 source code with python 3.2.2.3 by using Visual Studio 2010. Thanks. I'll try to reproduce this on Linux, which should be possible given the results of your investigation. Your analysis is correct, I managed to reproduce this by injecting serverenc = "win1252"; into PLyUnicode_Bytes. The comment in that function says that Python understands all PostgreSQL encoding names except for SQL_ASCII, but that's not really true. In your case GetDatabaseEncodingName() returns WIN1252, whereas Python accepts CP1252. I'm wondering how this should be fixed. Just by adding more special cases in PLyUnicode_Bytes? Even if we add a switch statement that would convert PG_WIN1250 into CP1250, Python can still raise an exception when encoding (for various reasons). How about replacing the PLy_elog there with just an elog? This loses traceback support and the Python exception message, which could be helpful for debugging (something like "invalid character foo for encoding cp1250"). OTOH, I'm uneasy about invoking the entire PLy_elog machinery from a function that's as low-level as PLyUnicode_Bytes. Lastly, we map SQL_ASCII to "ascii", which is arguably wrong. The function is supposed to return bytes in the server encoding, and under SQL_ASCII that probably means we can return anything (i.e. use any encoding we deem useful). Using "ascii" as the Python codec name will raise an error on anything that has the high bit set.
So: I'd add code to translate WINxxx into CPxxx when choosing the Python to use, change PLy_elog to elog in PLyUnicode_Bytes and leave the SQL_ASCII case alone, as there were no complaints and people using SQL_ASCII are asking for it anyway. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
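The WINxxx-to-CPxxx translation proposed above is a simple table lookup. A Python sketch of the mapping (the `server_codec` helper and dict name are illustrative; the actual fix is the C switch statement in the patch downthread): PostgreSQL's WINnnn names are unknown to Python's codec registry, which calls the same Windows code pages "cpnnn".

```python
# PostgreSQL server-encoding name -> Python codec name, for the encodings
# where the two disagree. All other names pass through unchanged.
PG_TO_PYTHON = {"WIN%d" % n: "cp%d" % n
                for n in (866, 874, 1250, 1251, 1252, 1253,
                          1254, 1255, 1256, 1257, 1258)}
PG_TO_PYTHON["SQL_ASCII"] = "ascii"

def server_codec(pg_encoding_name):
    """Map a PostgreSQL encoding name to one Python's codecs accept."""
    return PG_TO_PYTHON.get(pg_encoding_name, pg_encoding_name)
```

Handing `"WIN1252"` straight to `str.encode` raises LookupError (the crash trigger in this thread), while the mapped `"cp1252"` works.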
Re: [HACKERS] plpython issue with Win64 (PG 9.2)
On 27/06/12 11:51, Asif Naeem wrote: Hi, On Windows 7 64bit, plpython is causing server crash with the following test case i.e. CREATE PROCEDURAL LANGUAGE 'plpython3u'; CREATE OR REPLACE FUNCTION pymax (a integer, b integer) RETURNS integer AS $$ if a > b: return a return b $$ LANGUAGE plpython3u; SELECT pymax(1, 2); I think the primary reason that triggers this issue is that when the function PLyUnicode_Bytes() calls PyUnicode_AsEncodedString( , "WIN1252" /* server encoding */, ), it fails with null. I built latest pg 9.2 source code with python 3.2.2.3 by using Visual Studio 2010. Thanks. I'll try to reproduce this on Linux, which should be possible given the results of your investigation. Cheers, Jan
Re: [HACKERS] PL/Python result set slicing broken in Python 3
On 10/05/12 19:45, Peter Eisentraut wrote: On lör, 2012-05-05 at 22:45 +0200, Jan Urbański wrote: Apparently once you implement PyMappingMethods.mp_subscript you can drop PySequenceMethods.sq_slice, but I guess there's no harm in keeping it (and I'm not sure it'd work on Python 2.3 with only mp_subscript implemented). Committed this now. From test coverage reports, I now see that PLy_result_ass_item() is no longer called. That's probably OK, if assignments are now handled through the mapping methods. But should we remove the function then? Have you tried on Python 2.3 as well? People on #python said that if you implement the mapping functions, the sequence slicing functions are no longer used, but maybe we should revisit for the next release, rather than risk introducing a regression for the benefit of removing a few dead lines. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
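What `mp_subscript`/`mp_ass_subscript` amount to at the Python level can be sketched as a single `__getitem__`/`__setitem__` pair that receives either an int or a slice object and proxies both to a wrapped list — which is the shape of the committed fix, where the result object forwards to its list of row dictionaries. `ResultProxy` below is an illustrative stand-in, not the actual PLyResult type.

```python
class ResultProxy:
    """Mimics the mapping-protocol approach of the committed patch."""

    def __init__(self, rows):
        self.rows = rows  # list of row dicts, like PLyResultObject->rows

    def __getitem__(self, item):
        # item is either an int index or a slice object; the wrapped
        # list handles both, so no separate sq_slice-style hook is needed.
        return self.rows[item]

    def __setitem__(self, item, value):
        # mirrors mp_ass_subscript: assignment with int or slice keys
        self.rows[item] = value
```

This is why implementing the mapping protocol makes the old sequence-slicing hooks redundant: Python 3 routes `rv[0:1]` through `__getitem__` with a slice object.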
Re: [HACKERS] PL/Python result set slicing broken in Python 3
On 04/05/12 20:00, Jan Urbański wrote: On 03/05/12 11:04, Jan Urbański wrote: On 02/05/12 20:18, Peter Eisentraut wrote: This doesn't work anymore with Python 3: rv = plpy.execute(...) do_something(rv[0:1]) Sounds ugly. I'll take a look. I found some instructions on how to deal with the Python 2/Python 3 slicing mess: http://renesd.blogspot.com/2009/07/python3-c-api-simple-slicing-sqslice.html Thanks to the helpful folk at #python I found out that the fix is much easier. Attached is a patch that fixes the bug and passes regression tests on Pythons 2.3 through 3.2. Apparently once you implement PyMappingMethods.mp_subscript you can drop PySequenceMethods.sq_slice, but I guess there's no harm in keeping it (and I'm not sure it'd work on Python 2.3 with only mp_subscript implemented). Do we want to backpatch this? If so, I'd need to produce a version that applies to the monolithic plpython.c file from the previous releases. Cheers, Jan From 000a1285d66c65c36ae6fa064266f00def5ee9d7 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jan=20Urba=C5=84ski?= wulc...@wulczer.org Date: Sat, 5 May 2012 22:39:26 +0200 Subject: [PATCH] Fix slicing support for result objects for Python 3. The old way of implementing slicing support by implementing PySequenceMethods.sq_slice has been deprecated in Python 3; you should now implement PyMappingMethods.mp_subscript. Do this by simply proxying the call to the wrapped list of result dictionaries. While at it, fix an incorrect comment about PLyResultObject->rows being None if the result set is empty (it actually is an empty list in that case).
--- src/pl/plpython/expected/plpython_spi.out | 55 + src/pl/plpython/plpy_resultobject.c | 27 +- src/pl/plpython/plpy_resultobject.h |2 +- src/pl/plpython/sql/plpython_spi.sql | 36 +++ 4 files changed, 118 insertions(+), 2 deletions(-) diff --git a/src/pl/plpython/expected/plpython_spi.out b/src/pl/plpython/expected/plpython_spi.out index 671c24e..54ef512 100644 --- a/src/pl/plpython/expected/plpython_spi.out +++ b/src/pl/plpython/expected/plpython_spi.out @@ -228,6 +228,61 @@ SELECT result_len_test($$UPDATE foo3 SET b= '' WHERE a = 2$$); 0 (1 row) +CREATE FUNCTION result_subscript_test() RETURNS void +AS $$ +result = plpy.execute("SELECT 1 AS c UNION SELECT 2 " + "UNION SELECT 3 UNION SELECT 4") + +plpy.info(result[1]['c']) +plpy.info(result[-1]['c']) + +plpy.info([item['c'] for item in result[1:3]]) +plpy.info([item['c'] for item in result[::2]]) + +result[-1] = {'c': 1000} +result[:2] = [{'c': 10}, {'c': 100}] +plpy.info([item['c'] for item in result[:]]) + +# raises TypeError, but the message differs on Python 2.6, so silence it +try: +plpy.info(result['foo']) +except TypeError: +pass +else: +assert False, "TypeError not raised" + +$$ LANGUAGE plpythonu; +SELECT result_subscript_test(); +INFO: 2 +CONTEXT: PL/Python function "result_subscript_test" +INFO: 4 +CONTEXT: PL/Python function "result_subscript_test" +INFO: [2, 3] +CONTEXT: PL/Python function "result_subscript_test" +INFO: [1, 3] +CONTEXT: PL/Python function "result_subscript_test" +INFO: [10, 100, 3, 1000] +CONTEXT: PL/Python function "result_subscript_test" + result_subscript_test +--- + +(1 row) + +CREATE FUNCTION result_empty_test() RETURNS void +AS $$ +result = plpy.execute("select 1 where false") + +plpy.info(result[:]) + +$$ LANGUAGE plpythonu; +SELECT result_empty_test(); +INFO: [] +CONTEXT: PL/Python function "result_empty_test" + result_empty_test +--- + +(1 row) + -- cursor objects CREATE FUNCTION simple_cursor_test() RETURNS int AS $$ res = plpy.cursor("select fname, lname from users") diff --git
a/src/pl/plpython/plpy_resultobject.c b/src/pl/plpython/plpy_resultobject.c index fcf8074..06ba2ee 100644 --- a/src/pl/plpython/plpy_resultobject.c +++ b/src/pl/plpython/plpy_resultobject.c @@ -23,6 +23,9 @@ static PyObject *PLy_result_item(PyObject *arg, Py_ssize_t idx); static PyObject *PLy_result_slice(PyObject *arg, Py_ssize_t lidx, Py_ssize_t hidx); static int PLy_result_ass_item(PyObject *arg, Py_ssize_t idx, PyObject *item); static int PLy_result_ass_slice(PyObject *arg, Py_ssize_t lidx, Py_ssize_t hidx, PyObject *slice); +static PyObject *PLy_result_subscript(PyObject *arg, PyObject *item); +static int PLy_result_ass_subscript(PyObject *self, PyObject *item, PyObject *value); static char PLy_result_doc[] = { "Results of a PostgreSQL query" @@ -38,6 +41,12 @@ static PySequenceMethods PLy_result_as_sequence = { PLy_result_ass_slice, /* sq_ass_slice */ }; +static PyMappingMethods PLy_result_as_mapping = { + PLy_result_length, /* mp_length */ + PLy_result_subscript, /* mp_subscript */ + PLy_result_ass_subscript, /* mp_ass_subscript */ +}; + static PyMethodDef PLy_result_methods[] = { {colnames
Re: [HACKERS] PL/Python result set slicing broken in Python 3
On 03/05/12 11:04, Jan Urbański wrote: On 02/05/12 20:18, Peter Eisentraut wrote: This doesn't work anymore with Python 3: rv = plpy.execute(...) do_something(rv[0:1]) Apparently, they changed the C API for doing slicing, or rather made one of the two APIs for it silently do nothing. Details are difficult to find, but this email message seems to contain something: http://mail.python.org/pipermail/python-3000/2007-August/009851.html. I'll try to sort this out sometime, but if someone wants to take a shot at it, go ahead. Sounds ugly. I'll take a look. I found some instructions on how to deal with the Python 2/Python 3 slicing mess: http://renesd.blogspot.com/2009/07/python3-c-api-simple-slicing-sqslice.html Apparently you need that egregious hack in order to avoid code duplication. I'll try to produce a patch over the weekend. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PL/Python result set slicing broken in Python 3
On 02/05/12 20:18, Peter Eisentraut wrote: This doesn't work anymore with Python 3: rv = plpy.execute(...) do_something(rv[0:1]) Apparently, they changed the C API for doing slicing, or rather made one of the two APIs for it silently do nothing. Details are difficult to find, but this email message seems to contain something: http://mail.python.org/pipermail/python-3000/2007-August/009851.html. I'll try to sort this out sometime, but if someone wants to take a shot at it, go ahead. Sounds ugly. I'll take a look. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython triggers are broken for composite-type columns
On 10/04/12 21:47, Jan Urbański wrote: On 10/04/12 21:27, Tom Lane wrote: =?UTF-8?B?SmFuIFVyYmHFhHNraQ==?=wulc...@wulczer.org writes: Yes, that would be ideal, even though not backwards-compatible. Back-patching is out of the question, but do we want to change trigger functions to receive dictionaries in NEW? Hm, I was not thinking of this as being trigger-specific, but more a general principle that composite columns of tuples ought to be handled in a recursive fashion. Sure, that would be the way. If so, should this be 9.2 material, or just a TODO? If it can be done quickly and with not much risk, I'd vote for squeezing it into 9.2, because it seems to me to be a clear bug that the two directions are not handled consistently. If you don't have time for it now or you don't think it would be a small/safe patch, we'd better just put it on TODO. It turned out not to be as straightforward as I thought :( The I/O code in PL/Python is a bit of a mess, and that's something I'd like to address somewhere in the 9.3 development cycle. Right now making the conversion function recursive is not possible without some deep surgery (or kludgery...) so I limited myself to producing regression-fixing patches for 9.2 and 9.1 (attached). Cheers, Jan From 513e8c484f32599a8753035fad638c8712339480 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jan=20Urba=C5=84ski?= wulc...@wulczer.org Date: Mon, 23 Apr 2012 02:07:46 +0200 Subject: [PATCH] Accept strings in PL/Python functions returning composite types. Before 9.1, PL/Python functions returning composite types could return a string and it would be parsed using record_in. The 9.1 changes made PL/Python only expect dictionaries, tuples or objects supporting getattr as output of composite functions, resulting in a regression and a confusing error message, as the strings were interpreted as sequences and the code for transforming lists into Postgres tuples was being used.
The reason why it's important is that trigger functions on tables with composite columns get the composite row passed in as a string (from record_out). This makes it impossible to implement passthrough behaviour for these columns, as PL/Python no longer accepts strings for composite values. A better solution would be to fix the code that transforms composite inputs into Python objects to produce dictionaries that would then be correctly interpreted by the Python-to-Postgres counterpart code. This would be too invasive to backpatch in 9.1 and it is too late in the 9.2 cycle to do it in HEAD. It should get revisited in 9.3, though. Reported as bug #6559 by Kirill Simonov. --- src/pl/plpython/expected/plpython_record.out | 16 src/pl/plpython/expected/plpython_trigger.out | 37 + src/pl/plpython/plpython.c| 94 --- src/pl/plpython/regression.diffs | 99 + src/pl/plpython/regression.out| 20 + src/pl/plpython/sql/plpython_record.sql | 10 +++ src/pl/plpython/sql/plpython_trigger.sql | 36 + 7 files changed, 268 insertions(+), 44 deletions(-) create mode 100644 src/pl/plpython/regression.diffs create mode 100644 src/pl/plpython/regression.out diff --git a/src/pl/plpython/expected/plpython_record.out b/src/pl/plpython/expected/plpython_record.out index 0bcc46c..4583307 100644 --- a/src/pl/plpython/expected/plpython_record.out +++ b/src/pl/plpython/expected/plpython_record.out @@ -38,6 +38,8 @@ elif typ == 'obj': type_record.first = first type_record.second = second return type_record +elif typ == 'str': + return "('%s',%r)" % (first, second) $$ LANGUAGE plpythonu; CREATE FUNCTION test_in_out_params(first in text, second out text) AS $$ return first + '_in_to_out'; @@ -290,6 +292,12 @@ SELECT * FROM test_type_record_as('obj', null, null, true); | (1 row) +SELECT * FROM test_type_record_as('str', 'one', 1, false); + first | second +---+ + 'one' | 1 +(1 row) + SELECT * FROM test_in_out_params('test_in'); second --- @@ -355,3 +363,11 @@ ERROR: attribute "second" does not exist in Python
object HINT: To return null in a column, let the returned object have an attribute named after column with value None. CONTEXT: while creating return value PL/Python function test_type_record_error3 +CREATE FUNCTION test_type_record_error4() RETURNS type_record AS $$ +return 'foo' +$$ LANGUAGE plpythonu; +SELECT * FROM test_type_record_error4(); +ERROR: malformed record literal: foo +DETAIL: Missing left parenthesis. +CONTEXT: while creating return value +PL/Python function test_type_record_error4 diff --git a/src/pl/plpython/expected/plpython_trigger.out b/src/pl/plpython/expected/plpython_trigger.out index 2ef66a8..ea4d500 100644 --- a/src/pl/plpython/expected/plpython_trigger.out +++ b/src/pl/plpython/expected/plpython_trigger.out @@ -567,3 +567,40 @@ SELECT * FROM composite_trigger_test; (3,f
Re: [HACKERS] plpython triggers are broken for composite-type columns
On 10/04/12 07:35, Jan Urbański wrote: On 10/04/12 04:20, Tom Lane wrote: Don't know if anybody noticed bug #6559 http://archives.postgresql.org/pgsql-bugs/2012-03/msg00180.php I've confirmed that the given test case works in 9.0 but fails in 9.1 and HEAD. So, I know what's going on, I still don't know what's the best way to handle it. The function that converts Python objects to PG data checks what type it's supposed to produce and acts accordingly. In 9.0 it checked for bool, bytea and arrays, in 9.1 it also takes composite types into account. This has been done to support functions returning composite types - to do that they need to return a dictionary or a list, for instance {'col1': 1, 'col2': 2}. The problem is that the routine that converts PG data into Python objects does not handle composite type inputs all that well - it just bails and returns the string representation, hence '(3)' appearing in Python land. Now previously, the Python-PG function did not see that the given conversion is supposed to return a composite so it also bailed and used a default text-composite conversion, so '(3)' was converted to ROW(3) and all went well. The new code tries to treat what it gets as a dictionary/list/tuple and fails in a more or less random way. Now that I understand what's been going on, I'll try to think of a non-invasive way of fixing that... Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
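The "more or less random" failure mode described above has a one-line Python explanation: the textual composite value `'(3)'` produced by record_out satisfies the sequence protocol, so conversion code that only checks for a sequence happily iterates over its characters instead of treating it as one record literal. A minimal illustration:

```python
# The composite row arrives from record_out as its text form.
value = "(3)"

# A str is itself a sequence, so sequence-based conversion "succeeds"
# at splitting it into three pseudo-columns rather than one value.
columns = list(value)   # ['(', '3', ')'] -- not the single column ROW(3) has
```

This is exactly why the eventual fix has to distinguish strings from other sequences before applying the list-to-tuple conversion path.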
Re: [HACKERS] plpython triggers are broken for composite-type columns
On 10/04/12 20:47, Tom Lane wrote: I wrote: =?UTF-8?B?SmFuIFVyYmHFhHNraQ==?=wulc...@wulczer.org writes: Now that I understand what's been going on, I'll try to think of a non-invasive way of fixing that... ISTM that conversion of a composite value to Python ought to produce a dict, now that the other direction expects a dict. I can see that this is probably infeasible for compatibility reasons in 9.1, but it's not too late to fix it for 9.2. We might have to leave the bug unfixed in 9.1, since anything we do about it will represent a compatibility break. On reflection, can't we fix this as follows: if the value coming in from Python is a string, just feed it to record_in, the same as we used to. When I traced through the logic before, it seemed like it was failing to distinguish strings from sequences, but I would hope that Python is more strongly typed than that. Yeah, we can fix PLyObject_ToTuple to check for strings too and use the default PG input function. The reason it was complaining about length is that we're checking whether the object passed implements the sequence protocol, which Python strings do (length, iteration, etc). Sticking an if branch that catches the string case above that should be sufficient. I still think the conversion in the other direction ought to yield a dict, but that's clearly not back-patch material. Yes, that would be ideal, even though not backwards-compatible. Back-patching is out of the question, but do we want to change trigger functions to receive dictionaries in NEW? If so, should this be 9.2 material, or just a TODO?
Re: [HACKERS] plpython triggers are broken for composite-type columns
On 10/04/12 21:27, Tom Lane wrote: =?UTF-8?B?SmFuIFVyYmHFhHNraQ==?=wulc...@wulczer.org writes: Yes, that would be ideal, even though not backwards-compatible. Back-patching is out of the question, but do we want to change trigger functions to receive dictionaries in NEW? Hm, I was not thinking of this as being trigger-specific, but more a general principle that composite columns of tuples ought to be handled in a recursive fashion. Sure, that would be the way. If so, should this be 9.2 material, or just a TODO? If it can be done quickly and with not much risk, I'd vote for squeezing it into 9.2, because it seems to me to be a clear bug that the two directions are not handled consistently. If you don't have time for it now or you don't think it would be a small/safe patch, we'd better just put it on TODO. I'll see if making the conversion function recursive is easy and independently whip up a patch to check for strings and route them through InputFunctionCall, for back-patching purposes. Cheers, Jan
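The "check for strings first" fix being discussed is a small reordering of type checks. A Python sketch of the idea (illustrative only — the real fix lives in PLyObject_ToTuple in C, and `parse_record` here is a hypothetical stand-in for routing through record_in / InputFunctionCall):

```python
def to_tuple(obj, parse_record):
    """Convert a PL/Python return value into column values.

    Strings must be intercepted *before* the generic sequence check,
    because str satisfies the sequence protocol (length, iteration)
    and would otherwise be split into characters.
    """
    if isinstance(obj, str):
        return parse_record(obj)   # hand the record literal to the server
    return tuple(obj)              # genuine sequences map to columns
```

With the isinstance check in front, `'(3)'` reaches the record parser intact, while lists and tuples keep their per-column behavior.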
Re: [HACKERS] plpython triggers are broken for composite-type columns
On 10/04/12 04:20, Tom Lane wrote: Don't know if anybody noticed bug #6559 http://archives.postgresql.org/pgsql-bugs/2012-03/msg00180.php I've confirmed that the given test case works in 9.0 but fails in 9.1 and HEAD. I find this code pretty unreadable, though, and know nothing to speak of about the Python side of things anyhow. So somebody else had better pick this up. I'll look into that. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Potential reference miscounts and segfaults in plpython.c
On 21/02/12 18:28, Jan Urbański wrote: On 21/02/12 18:05, Peter Eisentraut wrote: it might be better to use ereport, to expose the message for translation. After giving it some thought, some of these elogs could be changed into PLy_elogs (which is meant to propagate a Python error into Postgres) and the others made into ereports. I'll send updated patches this evening (CET). Inevitably, this turned into the next morning CET. Here are the updated patches, which use PLy_elog instead of plain elog. The difference is that they will get marked for translation and that the original Python exception will show up in the errdetail field. Cheers, Jan diff --git a/src/pl/plpython/plpy_elog.c b/src/pl/plpython/plpy_elog.c index 741980c..332898d 100644 *** a/src/pl/plpython/plpy_elog.c --- b/src/pl/plpython/plpy_elog.c *** get_source_line(const char *src, int lin *** 365,370 --- 365,374 const char *next = src; int current = 0; + /* sanity check */ + if (lineno <= 0) + return NULL; + while (current < lineno) { s = next; diff --git a/src/pl/plpython/plpy_main.c b/src/pl/plpython/plpy_main.c index ae9d87e..f259ecf 100644 *** a/src/pl/plpython/plpy_main.c --- b/src/pl/plpython/plpy_main.c *** PLy_init_interp(void) *** 133,138 --- 133,140 Py_INCREF(mainmod); PLy_interp_globals = PyModule_GetDict(mainmod); PLy_interp_safe_globals = PyDict_New(); + if (PLy_interp_safe_globals == NULL) + PLy_elog(ERROR, "could not create globals"); PyDict_SetItemString(PLy_interp_globals, "GD", PLy_interp_safe_globals); Py_DECREF(mainmod); if (PLy_interp_globals == NULL || PyErr_Occurred()) diff --git a/src/pl/plpython/plpy_plpymodule.c b/src/pl/plpython/plpy_plpymodule.c index d2d0a2a..0c96f03 100644 *** a/src/pl/plpython/plpy_plpymodule.c --- b/src/pl/plpython/plpy_plpymodule.c *** PLy_init_plpy(void) *** 173,181 main_mod = PyImport_AddModule("__main__"); main_dict = PyModule_GetDict(main_mod); plpy_mod = PyImport_AddModule("plpy"); PyDict_SetItemString(main_dict, "plpy", plpy_mod); if (PyErr_Occurred()) !
elog(ERROR, "could not initialize plpy"); } static void --- 173,183 main_mod = PyImport_AddModule("__main__"); main_dict = PyModule_GetDict(main_mod); plpy_mod = PyImport_AddModule("plpy"); + if (plpy_mod == NULL) + PLy_elog(ERROR, "could not initialize plpy"); PyDict_SetItemString(main_dict, "plpy", plpy_mod); if (PyErr_Occurred()) ! PLy_elog(ERROR, "could not initialize plpy"); } static void *** PLy_add_exceptions(PyObject *plpy) *** 208,213 --- 210,220 PLy_exc_fatal = PyErr_NewException("plpy.Fatal", NULL, NULL); PLy_exc_spi_error = PyErr_NewException("plpy.SPIError", NULL, NULL); + if (PLy_exc_error == NULL || + PLy_exc_fatal == NULL || + PLy_exc_spi_error == NULL) + PLy_elog(ERROR, "could not create the base SPI exceptions"); + Py_INCREF(PLy_exc_error); PyModule_AddObject(plpy, "Error", PLy_exc_error); Py_INCREF(PLy_exc_fatal); *** PLy_generate_spi_exceptions(PyObject *mo *** 241,247 --- 248,260 PyObject *sqlstate; PyObject *dict = PyDict_New(); + if (dict == NULL) + PLy_elog(ERROR, "could not generate SPI exceptions"); + sqlstate = PyString_FromString(unpack_sql_state(exception_map[i].sqlstate)); + if (sqlstate == NULL) + PLy_elog(ERROR, "could not generate SPI exceptions"); + PyDict_SetItemString(dict, "sqlstate", sqlstate); Py_DECREF(sqlstate); exc = PyErr_NewException(exception_map[i].name, base, dict); *** PLy_output(volatile int level, PyObject *** 369,376 * decoration. */ PyObject *o; - PyArg_UnpackTuple(args, "plpy.elog", 1, 1, &o); so = PyObject_Str(o); } else --- 382,393 * decoration. */ PyObject *o; + int result; + + result = PyArg_UnpackTuple(args, "plpy.elog", 1, 1, &o); + if (!result) + PLy_elog(ERROR, "could not unpack arguments in plpy.elog"); so = PyObject_Str(o); } else diff --git a/src/pl/plpython/plpy_spi.c b/src/pl/plpython/plpy_spi.c index 0d63c4f..20b93cd 100644 *** a/src/pl/plpython/plpy_spi.c --- b/src/pl/plpython/plpy_spi.c *** PLy_spi_execute_query(char *query, long *** 338,344 int rv; volatile MemoryContext oldcontext; volatile ResourceOwner oldowner; ! 
PyObject *ret; oldcontext = CurrentMemoryContext; oldowner = CurrentResourceOwner; --- 338,344 int rv; volatile MemoryContext oldcontext; volatile ResourceOwner oldowner; ! PyObject *ret = NULL; oldcontext = CurrentMemoryContext; oldowner = CurrentResourceOwner; *** PLy_spi_execute_query(char *query, long *** 362,367 --- 362,368 if (rv < 0) { + Py_XDECREF(ret); PLy_exception_set(PLy_exc_spi_error, "SPI_execute failed: %s", SPI_result_code_string(rv)); diff --git a/src/pl/plpython/plpy_typeio.c b/src/pl
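The Py_XDECREF(ret) hunk above fixes a reference leak: a result object was created, but on the SPI error path the reference held in ret was never released. A hypothetical Python-level sketch of the same bookkeeping (names invented for illustration, not from the patch), using sys.getrefcount to make the leaked reference visible:

```python
import sys

cache = []

def leaky(obj):
    """Simulates the buggy path: a reference is stashed (analogous to
    holding "ret") and never dropped before bailing out on an error."""
    cache.append(obj)  # takes a reference that is never released
    raise RuntimeError("SPI_execute failed")

def fixed(obj):
    """Simulates the patched path: the held reference is dropped
    (analogous to Py_XDECREF(ret)) before the error propagates."""
    holder = [obj]
    try:
        raise RuntimeError("SPI_execute failed")
    finally:
        holder.clear()  # release our reference before the error exit

marker = object()
baseline = sys.getrefcount(marker)

try:
    leaky(marker)
except RuntimeError:
    pass
after_leaky = sys.getrefcount(marker)   # one reference higher than baseline

try:
    fixed(marker)
except RuntimeError:
    pass
after_fixed = sys.getrefcount(marker)   # unchanged: no further leak
```

In CPython the C-level equivalent is exactly this discipline: every owned reference must be released on every exit path, including error exits.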
Re: [HACKERS] Potential reference miscounts and segfaults in plpython.c
On 21/02/12 18:05, Peter Eisentraut wrote: On sön, 2012-02-19 at 22:29 -0500, Tom Lane wrote: My only comment is whether elog(ERROR) is appropriate, ie, do we consider these to be internal errors that users will never see in practice? If there's a significant risk of the error being thrown in the field, it might be better to use ereport, to expose the message for translation. I find the wording of the error messages a bit inappropriate. For example, list = PyList_New(length); + if (list == NULL) + elog(ERROR, "could not transform Python list to array"); The error is not about the transforming, it's about creating a new list. Well, what I tried to convey here was that the process of transforming a Python list to a Postgres array failed. Which, by the way, is wrong since this function transforms a Postgres array to a Python list... After giving it some thought some of these elogs could be changed into PLy_elogs (which is meant to propagate a Python error into Postgres) and the others made into ereports. I'll send updated patches this evening (CET). Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Potential reference miscounts and segfaults in plpython.c
On 20/02/12 04:29, Tom Lane wrote: Jan Urbański <wulc...@wulczer.org> writes: On 18/02/12 21:17, Tom Lane wrote: Dave Malcolm at Red Hat has been working on a static code analysis tool for Python-related C code. He reports here on some preliminary results for plpython.c: https://bugzilla.redhat.com/show_bug.cgi?id=795011 Here's a patch that fixes everything I was sure was an actual bug. The rest of the warnings seem to be caused by the tool not knowing that elog(ERROR) throws a longjmp and things like we never unref this object, so it can't disappear mid-execution. My only comment is whether elog(ERROR) is appropriate, ie, do we consider these to be internal errors that users will never see in practice? AFAICS these errors can only happen on out of memory conditions or other internal errors (like trying to create a list with a negative length). Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Potential reference miscounts and segfaults in plpython.c
On 18/02/12 21:18, Jan Urbański wrote: On 18/02/12 21:17, Tom Lane wrote: Jan Urbański <wulc...@wulczer.org> writes: On 18/02/12 20:30, Tom Lane wrote: Dave Malcolm at Red Hat has been working on a static code analysis tool for Python-related C code. He reports here on some preliminary results for plpython.c: https://bugzilla.redhat.com/show_bug.cgi?id=795011 If you find any live bugs, it'd likely be better to deal with them as a separate patch so that we can back-patch ... Sure, I meant to say I'll look at these as well, but will make them into a separate patch. Here's a patch that fixes everything I was sure was an actual bug. The rest of the warnings seem to be caused by the tool not knowing that elog(ERROR) throws a longjmp and things like we never unref this object, so it can't disappear mid-execution. Attached are patches for HEAD and for 9.1.x (since splitting plpython.c in 9.2 was kind of my idea I felt bad about you having to back-patch this so I tried to do the necessary legwork myself; I hope the attached is what you need). BTW, that tool is quite handy, I'll have to try running it over psycopg2. Cheers, Jan diff --git a/src/pl/plpython/plpython.c b/src/pl/plpython/plpython.c index 530b5f0..2b064c5 100644 *** a/src/pl/plpython/plpython.c --- b/src/pl/plpython/plpython.c *** PLyList_FromArray(PLyDatumToOb *arg, Dat *** 2345,2350 --- 2345,2352 length = ARR_DIMS(array)[0]; lbound = ARR_LBOUND(array)[0]; list = PyList_New(length); + if (list == NULL) + elog(ERROR, "could not transform Python list to array"); for (i = 0; i < length; i++) { *** PLy_spi_execute_query(char *query, long *** 3664,3670 int rv; volatile MemoryContext oldcontext; volatile ResourceOwner oldowner; ! 
PyObject *ret = NULL; oldcontext = CurrentMemoryContext; oldowner = CurrentResourceOwner; *** PLy_spi_execute_query(char *query, long *** 3727,3732 --- 3729,3735 if (rv < 0) { + Py_XDECREF(ret); PLy_exception_set(PLy_exc_spi_error, "SPI_execute failed: %s", SPI_result_code_string(rv)); *** PLy_generate_spi_exceptions(PyObject *mo *** 3967,3973 --- 3970,3982 PyObject *sqlstate; PyObject *dict = PyDict_New(); + if (dict == NULL) + elog(ERROR, "could not generate SPI exceptions"); + sqlstate = PyString_FromString(unpack_sql_state(exception_map[i].sqlstate)); + if (sqlstate == NULL) + elog(ERROR, "could not generate SPI exceptions"); + PyDict_SetItemString(dict, "sqlstate", sqlstate); Py_DECREF(sqlstate); exc = PyErr_NewException(exception_map[i].name, base, dict); *** PLy_add_exceptions(PyObject *plpy) *** 4008,4013 --- 4017,4027 PLy_exc_fatal = PyErr_NewException("plpy.Fatal", NULL, NULL); PLy_exc_spi_error = PyErr_NewException("plpy.SPIError", NULL, NULL); + if (PLy_exc_error == NULL || + PLy_exc_fatal == NULL || + PLy_exc_spi_error == NULL) + elog(ERROR, "could not create the base SPI exceptions"); + Py_INCREF(PLy_exc_error); PyModule_AddObject(plpy, "Error", PLy_exc_error); Py_INCREF(PLy_exc_fatal); *** PLy_init_interp(void) *** 4124,4129 --- 4138,4145 Py_INCREF(mainmod); PLy_interp_globals = PyModule_GetDict(mainmod); PLy_interp_safe_globals = PyDict_New(); + if (PLy_interp_safe_globals == NULL) + PLy_elog(ERROR, "could not create globals"); PyDict_SetItemString(PLy_interp_globals, "GD", PLy_interp_safe_globals); Py_DECREF(mainmod); if (PLy_interp_globals == NULL || PyErr_Occurred()) *** PLy_init_plpy(void) *** 4164,4169 --- 4180,4187 main_mod = PyImport_AddModule("__main__"); main_dict = PyModule_GetDict(main_mod); plpy_mod = PyImport_AddModule("plpy"); + if (plpy_mod == NULL) + elog(ERROR, "could not initialize plpy"); PyDict_SetItemString(main_dict, "plpy", plpy_mod); if (PyErr_Occurred()) elog(ERROR, "could not initialize plpy"); *** PLy_output(volatile int level, PyObject *** 4231,4238 * 
decoration. */ PyObject *o; - PyArg_UnpackTuple(args, "plpy.elog", 1, 1, &o); so = PyObject_Str(o); } else --- 4249,4260 * decoration. */ PyObject *o; + int result; + + result = PyArg_UnpackTuple(args, "plpy.elog", 1, 1, &o); + if (!result) + elog(ERROR, "could not unpack arguments in plpy.elog"); so = PyObject_Str(o); } else *** get_source_line(const char *src, int lin *** 4554,4559 --- 4576,4585 const char *next = src; int current = 0; + /* sanity check */ + if (lineno <= 0) + return NULL; + while (current < lineno) { s = next; diff --git a/src/pl/plpython/plpy_elog.c b/src
Re: [HACKERS] pl/python long-lived allocations in datum-dict transformation
On 14/02/12 01:35, Tom Lane wrote: Jan Urbański <wulc...@wulczer.org> writes: It's not very comfortable, but I think PLyDict_FromTuple can be allowed to be non-reentrant. I think that's pretty short-sighted. Even if it's safe today (which I am not 100% convinced of), there are plenty of foreseeable reasons why it might^Wwill break in the future. OTOH if we want to make it reentrant, some more tinkering would be in order. I think that's in order. Here are the results of the tinkering. I came up with a stack of context structures that gets pushed when a PL/Python function starts being executed and popped when it returns. At first they contained just a scratch memory context used by PLyDict_FromTuple. Then under the premise of confirming the usefulness of introducing such contexts I removed the global PLy_curr_procedure variable and changed all users to get the current procedure from the context. It seems to have worked, so the total count of global variables is unchanged - hooray! While testing I found one more leak, this time caused by allocating a structure for caching array type I/O functions and never freeing it. Attached as separate patch. Cheers, Jan diff --git a/src/pl/plpython/plpy_cursorobject.c b/src/pl/plpython/plpy_cursorobject.c index 4226dc7..46930b0 100644 *** a/src/pl/plpython/plpy_cursorobject.c --- b/src/pl/plpython/plpy_cursorobject.c *** *** 14,19 --- 14,20 #include "plpy_cursorobject.h" #include "plpy_elog.h" + #include "plpy_main.h" #include "plpy_planobject.h" #include "plpy_procedure.h" #include "plpy_resultobject.h" *** PLy_cursor_query(const char *query) *** 121,126 --- 122,128 { SPIPlanPtr plan; Portal portal; + PLyExecutionContext *exec_ctx = PLy_current_execution_context(); pg_verifymbstr(query, strlen(query), false); *** PLy_cursor_query(const char *query) *** 129,136 elog(ERROR, "SPI_prepare failed: %s", SPI_result_code_string(SPI_result)); portal = SPI_cursor_open(NULL, plan, NULL, NULL, ! 
PLy_curr_procedure->fn_readonly); SPI_freeplan(plan); if (portal == NULL) --- 131,140 elog(ERROR, "SPI_prepare failed: %s", SPI_result_code_string(SPI_result)); + Assert(exec_ctx->curr_proc != NULL); + portal = SPI_cursor_open(NULL, plan, NULL, NULL, ! exec_ctx->curr_proc->fn_readonly); SPI_freeplan(plan); if (portal == NULL) *** PLy_cursor_plan(PyObject *ob, PyObject * *** 210,215 --- 214,220 Portal portal; char *volatile nulls; volatile int j; + PLyExecutionContext *exec_ctx = PLy_current_execution_context(); if (nargs > 0) nulls = palloc(nargs * sizeof(char)); *** PLy_cursor_plan(PyObject *ob, PyObject * *** 252,259 } } portal = SPI_cursor_open(NULL, plan->plan, plan->values, nulls, ! PLy_curr_procedure->fn_readonly); if (portal == NULL) elog(ERROR, "SPI_cursor_open() failed: %s", SPI_result_code_string(SPI_result)); --- 257,266 } } + Assert(exec_ctx->curr_proc != NULL); + portal = SPI_cursor_open(NULL, plan->plan, plan->values, nulls, ! exec_ctx->curr_proc->fn_readonly); if (portal == NULL) elog(ERROR, "SPI_cursor_open() failed: %s", SPI_result_code_string(SPI_result)); diff --git a/src/pl/plpython/plpy_elog.c b/src/pl/plpython/plpy_elog.c index 741980c..9909f23 100644 *** a/src/pl/plpython/plpy_elog.c --- b/src/pl/plpython/plpy_elog.c *** *** 12,17 --- 12,18 #include "plpy_elog.h" + #include "plpy_main.h" #include "plpy_procedure.h" *** PLy_traceback(char **xmsg, char **tbmsg, *** 260,265 --- 261,267 char *line; char *plain_filename; long plain_lineno; + PLyExecutionContext *exec_ctx = PLy_current_execution_context(); /* * The second frame points at the internal function, but to mimick *** PLy_traceback(char **xmsg, char **tbmsg, *** 270,276 else fname = PyString_AsString(name); ! proname = PLy_procedure_name(PLy_curr_procedure); plain_filename = PyString_AsString(filename); plain_lineno = PyInt_AsLong(lineno); --- 272,280 else fname = PyString_AsString(name); ! Assert(exec_ctx->curr_proc != NULL); ! ! 
proname = PLy_procedure_name(exec_ctx->curr_proc); plain_filename = PyString_AsString(filename); plain_lineno = PyInt_AsLong(lineno); *** PLy_traceback(char **xmsg, char **tbmsg, *** 287,293 * function code object was compiled with "<string>" as the * filename */ ! if (PLy_curr_procedure && plain_filename != NULL && strcmp(plain_filename, "<string>") == 0) { /* --- 291,297 * function code object was compiled with "<string>" as the * filename */ ! if (exec_ctx->curr_proc && plain_filename != NULL
Re: [HACKERS] Potential reference miscounts and segfaults in plpython.c
On 18/02/12 20:30, Tom Lane wrote: Dave Malcolm at Red Hat has been working on a static code analysis tool for Python-related C code. He reports here on some preliminary results for plpython.c: https://bugzilla.redhat.com/show_bug.cgi?id=795011 I'm not enough of a Python hacker to evaluate the significance of these issues, but somebody who does work on that code ought to take a look and see what we ought to patch. Very cool! Some of them look like legitimate bugs, some seem to stem from the fact that the tool does not know that PLy_elog(ERROR) does not return. I wonder if it could be taught that. I'll try to fix these while reworking the I/O functions memory leak patch I sent earlier. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Potential reference miscounts and segfaults in plpython.c
On 18/02/12 21:17, Tom Lane wrote: Jan Urbański <wulc...@wulczer.org> writes: On 18/02/12 20:30, Tom Lane wrote: Dave Malcolm at Red Hat has been working on a static code analysis tool for Python-related C code. He reports here on some preliminary results for plpython.c: https://bugzilla.redhat.com/show_bug.cgi?id=795011 I'll try to fix these while reworking the I/O functions memory leak patch I sent earlier. If you find any live bugs, it'd likely be better to deal with them as a separate patch so that we can back-patch ... Sure, I meant to say I'll look at these as well, but will make them into a separate patch. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pl/python long-lived allocations in datum-dict transformation
On 12/02/12 00:48, Tom Lane wrote: Jan Urbański <wulc...@wulczer.org> writes: This is annoying for functions that plough through large tables, doing some calculation. Attached is a patch that does the conversion of PostgreSQL Datums into Python dict objects in a scratch memory context that gets reset every time. As best I can tell, this patch proposes creating a new, separate context (chewing up 8KB+) for every plpython procedure that's ever used in a given session. This cure could easily be worse than the disease as far Yeah, that's not ideal. What's more, it's unclear that it won't malfunction altogether if the function is used recursively (ie, what if PLyDict_FromTuple ends up calling the same function again?) I was a bit worried about that, but the only place where PLyDict_FromTuple calls into some other code is when it calls the type's specific I/O function, which AFAICT can't call back into user code (except for user-defined C I/O routines). It's not very comfortable, but I think PLyDict_FromTuple can be allowed to be non-reentrant. Can't you fix it so that the temp context is associated with a particular function execution, rather than being statically allocated per-function? That would be cool, but I failed to easily get a handle on something that's like the execution context of a PL/Python function... Actually, if we assume that PLyDict_FromTuple (which is quite a low-level thing) never calls PL/Python UDFs we could keep a single memory context in top-level PL/Python memory and pay the overhead once in a session, not once per function. OTOH if we want to make it reentrant, some more tinkering would be in order. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pl/python long-lived allocations in datum-dict transformation
Consider this: create table arrays as select array[random(), random(), random(), random(), random(), random()] as a from generate_series(1, 100); create or replace function plpython_outputfunc() returns void as $$ c = plpy.cursor('select a from arrays') for row in c: pass $$ language plpythonu; When running the function, every datum will get transformed into a Python dict, which includes calling the type's output function, resulting in a memory allocation. The memory is allocated in the SPI context, so it accumulates until the function is finished. This is annoying for functions that plough through large tables, doing some calculation. Attached is a patch that does the conversion of PostgreSQL Datums into Python dict objects in a scratch memory context that gets reset every time. Cheers, Jan diff --git a/src/pl/plpython/plpy_main.c b/src/pl/plpython/plpy_main.c index ae9d87e..79d7784 100644 *** a/src/pl/plpython/plpy_main.c --- b/src/pl/plpython/plpy_main.c *** *** 12,17 --- 12,18 #include "executor/spi.h" #include "miscadmin.h" #include "utils/guc.h" + #include "utils/memutils.h" #include "utils/syscache.h" #include "plpython.h" *** plpython_inline_handler(PG_FUNCTION_ARGS *** 268,273 --- 269,279 MemSet(&proc, 0, sizeof(PLyProcedure)); proc.pyname = PLy_strdup("__plpython_inline_block"); + proc.tmp_ctx = AllocSetContextCreate(TopMemoryContext, + "PL/Python temporary ctx", + ALLOCSET_DEFAULT_MINSIZE, + ALLOCSET_DEFAULT_INITSIZE, + ALLOCSET_DEFAULT_MAXSIZE); proc.result.out.d.typoid = VOIDOID; PG_TRY(); diff --git a/src/pl/plpython/plpy_procedure.c b/src/pl/plpython/plpy_procedure.c index 229966a..f539cec 100644 *** a/src/pl/plpython/plpy_procedure.c --- b/src/pl/plpython/plpy_procedure.c *** *** 12,17 --- 12,18 #include "catalog/pg_type.h" #include "utils/builtins.h" #include "utils/hsearch.h" + #include "utils/memutils.h" #include "utils/syscache.h" #include "plpython.h" *** PLy_procedure_create(HeapTuple procTup, *** 169,174 --- 170,180 proc->setof = NULL; proc->src = NULL; proc->argnames = 
NULL; + proc->tmp_ctx = AllocSetContextCreate(TopMemoryContext, + "PL/Python temporary ctx", + ALLOCSET_DEFAULT_MINSIZE, + ALLOCSET_DEFAULT_INITSIZE, + ALLOCSET_DEFAULT_MAXSIZE); PG_TRY(); { *** PLy_procedure_delete(PLyProcedure *proc) *** 411,416 --- 417,424 PLy_free(proc->src); if (proc->argnames) PLy_free(proc->argnames); + if (proc->tmp_ctx) + MemoryContextDelete(proc->tmp_ctx); } /* diff --git a/src/pl/plpython/plpy_procedure.h b/src/pl/plpython/plpy_procedure.h index e986c7e..5ee73fa 100644 *** a/src/pl/plpython/plpy_procedure.h --- b/src/pl/plpython/plpy_procedure.h *** typedef struct PLyProcedure *** 30,35 --- 30,36 PyObject *code; /* compiled procedure code */ PyObject *statics; /* data saved across calls, local scope */ PyObject *globals; /* data saved across calls, global scope */ + MemoryContext tmp_ctx; } PLyProcedure; /* the procedure cache entry */ diff --git a/src/pl/plpython/plpy_typeio.c b/src/pl/plpython/plpy_typeio.c index d5cac9f..6f0eb46 100644 *** a/src/pl/plpython/plpy_typeio.c --- b/src/pl/plpython/plpy_typeio.c *** *** 23,28 --- 23,29 #include "plpy_typeio.h" #include "plpy_elog.h" + #include "plpy_procedure.h" /* I/O function caching */ *** PLy_output_record_funcs(PLyTypeInfo *arg *** 258,268 Assert(arg->is_rowtype == 1); } PyObject * PLyDict_FromTuple(PLyTypeInfo *info, HeapTuple tuple, TupleDesc desc) { ! PyObject *volatile dict; ! int i; if (info->is_rowtype != 1) elog(ERROR, "PLyTypeInfo structure describes a datum"); --- 259,274 Assert(arg->is_rowtype == 1); } + /* + * Transform a tuple into a Python dict object. The transformation can result + * in memory allocation, so it's done in a scratch memory context. + */ PyObject * PLyDict_FromTuple(PLyTypeInfo *info, HeapTuple tuple, TupleDesc desc) { ! PyObject *volatile dict; ! MemoryContext oldcontext; ! 
int i; if (info->is_rowtype != 1) elog(ERROR, "PLyTypeInfo structure describes a datum"); *** PLyDict_FromTuple(PLyTypeInfo *info, Hea *** 271,276 --- 277,284 if (dict == NULL) PLy_elog(ERROR, "could not create new dictionary"); + oldcontext = MemoryContextSwitchTo(PLy_curr_procedure->tmp_ctx); + PG_TRY(); { for (i = 0; i < info->in.r.natts; i++) *** PLyDict_FromTuple(PLyTypeInfo *info, Hea *** 298,308 --- 306,322 } PG_CATCH(); { + MemoryContextSwitchTo(oldcontext); + MemoryContextReset(PLy_curr_procedure->tmp_ctx); + Py_DECREF(dict); PG_RE_THROW(); } PG_END_TRY(); + MemoryContextSwitchTo(oldcontext); +
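The leak described above comes from doing per-row allocations in a context that lives for the whole query; the patch instead allocates in a scratch context that is reset after each tuple. A hypothetical Python sketch of that reset-per-row pattern, with an explicit arena object standing in for the MemoryContext (names invented for illustration):

```python
class ScratchContext:
    """Stand-in for a resettable MemoryContext: allocations are tracked
    and dropped wholesale by reset()."""
    def __init__(self):
        self.allocations = []

    def alloc(self, size):
        buf = bytearray(size)
        self.allocations.append(buf)
        return buf

    def reset(self):
        self.allocations.clear()

def convert_rows(rows, scratch):
    """Convert each row, doing per-row work in the scratch context and
    resetting it afterwards, so memory stays bounded by one row instead
    of accumulating for the whole result set."""
    results = []
    peak = 0
    for row in rows:
        try:
            tmp = scratch.alloc(len(row))        # per-row working storage
            tmp[:] = row
            results.append(bytes(tmp).decode())  # result copied out of scratch
            peak = max(peak, len(scratch.allocations))
        finally:
            scratch.reset()                      # like MemoryContextReset
    return results, peak

ctx = ScratchContext()
rows = [b"alpha", b"beta", b"gamma"]
converted, peak_allocs = convert_rows(rows, ctx)
```

The try/finally mirrors the PG_TRY/PG_CATCH in the patch: the scratch context must be reset even when conversion of a row fails.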
[HACKERS] plpgsql leaking memory when stringifying datums
While chasing a PL/Python memory leak, I did a few tests with PL/pgSQL and I think there are places where memory is not freed sufficiently early. Attached are two functions that when run will make the backend's memory consumption increase until they finish. With both, the cause is convert_value_to_string that calls a datum's output function, which for toasted data results in an allocation. The proposed patch changes convert_value_to_string to call the output function in the per-tuple memory context and then copy the result string back to the original context. The comment in that function says that callers generally pfree its result, but that wasn't the case with exec_stmt_raise, so I added a pfree() there as well. With that I was still left with a leak in the typecast() test function and it turns out that sticking an exec_eval_cleanup into exec_move_row fixed it. The regression tests pass, but I'm not 100% sure if it's actually safe. Since convert_value_to_string needed to access the PL/pgSQL execution state to get its hands on the per-tuple context, I needed to pass it to every caller that didn't have it already, which means exec_cast_value and exec_simple_cast_value. Does anyone have a better idea? The initial diagnosis and proposed solution are by Andres Freund - thanks! Cheers, Jan diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index bf952b6..a691905 100644 *** a/src/pl/plpgsql/src/pl_exec.c --- b/src/pl/plpgsql/src/pl_exec.c *** static void exec_move_row(PLpgSQL_execst *** 188,201 static HeapTuple make_tuple_from_row(PLpgSQL_execstate *estate, PLpgSQL_row *row, TupleDesc tupdesc); ! static char *convert_value_to_string(Datum value, Oid valtype); ! static Datum exec_cast_value(Datum value, Oid valtype, Oid reqtype, FmgrInfo *reqinput, Oid reqtypioparam, int32 reqtypmod, bool isnull); ! 
static Datum exec_simple_cast_value(Datum value, Oid valtype, Oid reqtype, int32 reqtypmod, bool isnull); static void exec_init_tuple_store(PLpgSQL_execstate *estate); --- 188,204 static HeapTuple make_tuple_from_row(PLpgSQL_execstate *estate, PLpgSQL_row *row, TupleDesc tupdesc); ! static char *convert_value_to_string(PLpgSQL_execstate *estate, ! Datum value, Oid valtype); ! static Datum exec_cast_value(PLpgSQL_execstate *estate, ! Datum value, Oid valtype, Oid reqtype, FmgrInfo *reqinput, Oid reqtypioparam, int32 reqtypmod, bool isnull); ! static Datum exec_simple_cast_value(PLpgSQL_execstate *estate, ! Datum value, Oid valtype, Oid reqtype, int32 reqtypmod, bool isnull); static void exec_init_tuple_store(PLpgSQL_execstate *estate); *** plpgsql_exec_function(PLpgSQL_function * *** 430,436 else { /* Cast value to proper type */ ! estate.retval = exec_cast_value(estate.retval, estate.rettype, func->fn_rettype, &(func->fn_retinput), func->fn_rettypioparam, --- 433,440 else { /* Cast value to proper type */ ! estate.retval = exec_cast_value(&estate, estate.retval, ! estate.rettype, func->fn_rettype, &(func->fn_retinput), func->fn_rettypioparam, *** exec_stmt_fori(PLpgSQL_execstate *estate *** 1757,1763 * Get the value of the lower bound */ value = exec_eval_expr(estate, stmt->lower, &isnull, &valtype); ! value = exec_cast_value(value, valtype, var->datatype->typoid, &(var->datatype->typinput), var->datatype->typioparam, var->datatype->atttypmod, isnull); --- 1761,1767 * Get the value of the lower bound */ value = exec_eval_expr(estate, stmt->lower, &isnull, &valtype); ! value = exec_cast_value(estate, value, valtype, var->datatype->typoid, &(var->datatype->typinput), var->datatype->typioparam, var->datatype->atttypmod, isnull); *** exec_stmt_fori(PLpgSQL_execstate *estate *** 1772,1778 * Get the value of the upper bound */ value = exec_eval_expr(estate, stmt->upper, &isnull, &valtype); ! 
value = exec_cast_value(value, valtype, var->datatype->typoid, &(var->datatype->typinput), var->datatype->typioparam, var->datatype->atttypmod, isnull); --- 1776,1782 * Get the value of the upper bound */ value = exec_eval_expr(estate, stmt->upper, &isnull, &valtype); ! value = exec_cast_value(estate, value, valtype, var->datatype->typoid, &(var->datatype->typinput), var->datatype->typioparam, var->datatype->atttypmod, isnull); *** exec_stmt_fori(PLpgSQL_execstate *estate *** 1789,1795 if (stmt->step) { value = exec_eval_expr(estate, stmt->step, &isnull, &valtype); ! value = exec_cast_value(value, valtype, var->datatype->typoid, &(var->datatype->typinput), var->datatype->typioparam, var->datatype->atttypmod,
[HACKERS] unfriendly error when accessing NEW in statement-level trigger
When you try to use the NEW variable in a statement-level trigger, you get ERROR: record "new" is not assigned yet DETAIL: The tuple structure of a not-yet-assigned record is indeterminate. which is not too friendly, given that people sometimes forget to specify FOR EACH sth at all, get the default behaviour of FOR EACH STATEMENT and scratch their heads. A quick search on the error detail reveals a few such confused users already. What's more, the documentation for PL/pgSQL triggers says that "This variable is NULL in statement-level triggers" but it's not really a NULL, for instance you can't execute "x IS NULL" with it (you get the same error). I'm not that familiar with PL/pgSQL code, so I'm not sure how or if this should be fixed. From a quick look it seems that the NEW and OLD variables could be defined as row variables instead of record, since the format of the tuple is known. Would that make sense? Cheers, Jan PS: If changing them to row types would induce too much code churn then maybe we could use some trick to check if the error comes from using the OLD or NEW variable and add a HINT to the error message? J -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PL/Python result metadata
On 11/01/12 22:52, Dimitri Fontaine wrote: Peter Eisentraut pete...@gmx.net writes: .colnames() returns a list of column names (strings) .coltypes() returns a list of type OIDs (integers) I just made that up because there is no guidance in the other standard PLs for this sort of thing, AFAICT. What about having the same or comparable API as in psycopg or DB API http://initd.org/psycopg/docs/cursor.html You could expose a py.description structure? +1 for providing a read-only result.description. Not sure if it's worth it to follow DB-API there, but maybe yes. Perhaps we could have a result.description_ex information that's PG-specific or just not present in PEP 249, like the typmod, collation and so on. J -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
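For reference, the DB-API 2.0 (PEP 249) cursor.description being discussed is a sequence of 7-item tuples per column, of which only the first two members (name, type_code) are required; the rest may be None. Python's stdlib sqlite3 driver shows the shape (it fills in only the name):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.execute("SELECT 1 AS id, 'x' AS label")

# Each column is described by a 7-tuple; PEP 249 mandates name and
# type_code, the remaining five slots may be None (sqlite3 leaves them so).
names = [col[0] for col in cur.description]
tuple_widths = {len(col) for col in cur.description}
conn.close()
```

A PG-specific result.description could follow the same 7-tuple shape for familiarity, with any extra attributes (typmod, collation) carried separately as suggested above.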
Re: [HACKERS] pgsphere
- Original message - On Saturday, January 07, 2012 04:43:43 PM Dave Cramer wrote: Well I've sent Teodor a personal email asking him if he was interested and so far no response, so I interpret that as he no longer has interest in the project. I dimly remember him mentioning traveling/hiking planning to travel around the Himalayas somewhere on -hackers. That sounds more like Oleg :) Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
On 18/12/11 04:21, Robert Haas wrote: On Sat, Dec 17, 2011 at 7:50 PM, David E. Wheeler da...@kineticode.com wrote: Love having the start here. I forwarded this message to Claes Jakobsson, creator of the jansson-using pg-json extension. He’s a bit less supportive. He gave me permission to quote him here: Frankly I see the inclusion of a JSON datatype in core as unnecessary. Stuff should be moved out of core rather than in, as we do in Perl. Also, does this patch mean that the 'json' type is forever claimed and can't be replaced by extensions? There's little reason to reimplement JSON parsing, comparison and other routines when there's a multitude of already good libraries. That's fair enough, but we've had *many* requests for this functionality in core, I don't see what we lose by having at least some basic functionality built in. I think having a JSON data type in core would drastically limit the exposure third-party JSON extensions would get and that's bad. There are tons of interesting features a JSON type could have and tying its development to a one year release cycle might be a disservice both for people who are willing to provide these features earlier, the users which are faced with a choice between a fast-moving third-party addon and a blessed core type and would cause overall confusion. How about we try the tsearch way and let JSON extensions live outside core for some time and perhaps if one emerges dominant and would benefit from inclusion then consider it? If we keep treating extensions as second-class citizens, they'll never get the mindshare and importance we seem to want for them (or otherwise why go through all the trouble to provide an infrastructure for them). Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] splitting plpython into smaller parts
On 18/12/11 20:53, Peter Eisentraut wrote: On tis, 2011-12-06 at 00:58 +0100, Jan Urbański wrote: Rebased against master after the SPI cursor patch has been committed. The first patch removes SPI boilerplate from the cursor functions as well and the second patch creates a plpython_cursor.c file. A side effect of creating a separate file for cursors is that I had to make PLy_spi_transaction_{begin,commit,abort} helper functions external since they're used both by regular SPI execution functions and the cursor functions. They live in plpython_spi.c which is not an ideal place for them, but IMHO it's not bad either. Committed now. Great, thanks! I hope this will make for a more maintainable PL/Python. By the way, the buildfarm is turning red because it's missing the attached patch. Cheers, Jan diff --git a/src/pl/plpython/Makefile b/src/pl/plpython/Makefile index 12ce26e..a31328d 100644 *** a/src/pl/plpython/Makefile --- b/src/pl/plpython/Makefile *** endif # can't build *** 189,195 # distprep and maintainer-clean rules should be run even if we can't build. # Force this dependency to be known even without dependency info built: ! plpython_plpy.o: spiexceptions.h spiexceptions.h: $(top_srcdir)/src/backend/utils/errcodes.txt generate-spiexceptions.pl $(PERL) $(srcdir)/generate-spiexceptions.pl $< $@ --- 189,195 # distprep and maintainer-clean rules should be run even if we can't build. # Force this dependency to be known even without dependency info built: ! plpy_plpymodule.o: spiexceptions.h spiexceptions.h: $(top_srcdir)/src/backend/utils/errcodes.txt generate-spiexceptions.pl $(PERL) $(srcdir)/generate-spiexceptions.pl $< $@ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
- Original message - On Dec 12, 2011, at 4:51 PM, Peter van Hardenberg wrote: Because we haven't heard from him in a while we've been using PL/V8 to validate a JSON datatype simulated by a DOMAIN with a simple acceptance function. (See below.) This is not ideally performant but thanks to V8's JIT the JSON parser is actually reasonably good. Note that Claes Jakobsson has been working on a JSON data type using the Jansson JSON library. http://pgxn.org/dist/pg-json/ We recently needed to store/validate JSON data and be able to do some trivial extraction of values from it and went with pg-json. The great benefit of having JSON as an extension type is being able to use an external library (Jansson is very small, MIT licensed, looks really well written and has been actively maintained for years) and not being tied to a yearly release cycle. Postgres jumps through a lot of hoops to be extensible and I think a JSON type is a kind of thing that fits the bill of an extension perfectly. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: plpython: Add SPI cursor support
On 06/12/11 19:33, Jan Urbański wrote: On 06/12/11 19:23, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: plpython: Add SPI cursor support Buildfarm member narwhal does not like this patch. It looks like PyObject_SelfIter is not a compile-time constant on its version of python (2.5, apparently). I'll try to dig around to see what can be causing that... Seems that PyObject_GenericGetIter has been renamed to PyObject_SelfIter at some point: http://hg.python.org/cpython/rev/fd5ef7003469 I'm trying to muster whatever mercurial-foo I have to check if there's been a 2.5 version tagged without that patch... Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: plpython: Add SPI cursor support
On 07/12/11 11:16, Jan Urbański wrote: On 06/12/11 19:33, Jan Urbański wrote: On 06/12/11 19:23, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: plpython: Add SPI cursor support Buildfarm member narwhal does not like this patch. It looks like PyObject_SelfIter is not a compile-time constant on its version of python (2.5, apparently). I'll try to dig around to see what can be causing that... Seems that PyObject_GenericGetIter has been renamed to PyObject_SelfIter at some point: http://hg.python.org/cpython/rev/fd5ef7003469 I'm trying to muster whatever mercurial-foo I have to check if there's been a 2.5 version tagged without that patch... So no, the renaming happened on Mar 17 2003 and 2.5 was tagged on Sep 18 2006. The source tarball for 2.5 contains PyObject_SelfIter. Both fennec and mastodon build OK with Python 2.5. Is it possible that narwhal's Python installation is in some way broken? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: plpython: Add SPI cursor support
On 06/12/11 19:23, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: plpython: Add SPI cursor support Buildfarm member narwhal does not like this patch. It looks like PyObject_SelfIter is not a compile-time constant on its version of python (2.5, apparently). Hm, I quickly tried with a self-compiled Python 2.5.6 from the upstream tarball and it compiled on my Linux machine. I see that PyObject_SelfIter is defined here: http://hg.python.org/cpython/file/b48e1b48e670/Include/object.h#l407 I'll try to dig around to see what can be causing that... Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython SPI cursors
On 05/12/11 18:58, Peter Eisentraut wrote: On ons, 2011-11-23 at 19:58 +0100, Jan Urbański wrote: On 20/11/11 19:14, Steve Singer wrote: Responding now to all questions and attaching a revised patch based on your comments. Committed Please refresh the other patch. Great, thanks! I'll try to send an updated version of the other patch this evening. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython SPI cursors
On 05/12/11 19:12, Bruce Momjian wrote: Jan Urbański wrote: On 05/12/11 18:58, Peter Eisentraut wrote: On ons, 2011-11-23 at 19:58 +0100, Jan Urbański wrote: On 20/11/11 19:14, Steve Singer wrote: Responding now to all questions and attaching a revised patch based on your comments. Committed Please refresh the other patch. Great, thanks! I'll try to send an updated version of the other patch this evening. I assume this is _not_ related to this TODO item: Add a DB-API compliant interface on top of the SPI interface No, not related. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Java LISTEN/NOTIFY client library work-around
On 30/11/11 13:07, Joel Jacobson wrote: Hi, As you know, LISTEN/NOTIFY is broken in the Java client library. You have to do a SELECT 1 in a while-loop to receive the notifications. http://jdbc.postgresql.org/documentation/head/listennotify.html Is there some other library with a proper implementation where you don't have to spam the database with queries to get the notifications? You mean a Java library? If Java is not a requirement, the psycopg2 Python library supports notifies well. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
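The reason a driver like psycopg2 can deliver notifications without a query loop is that the client can simply block on the connection's socket until the server sends something, instead of polling with `SELECT 1`. A minimal sketch of that event-driven pattern — using a `socketpair` as a stand-in for a real database connection, so no server or driver is involved; the channel name and payload are made up for illustration:

```python
import select
import socket

# The "connection" is a plain socketpair standing in for the file
# descriptor a real driver exposes; the server end sending bytes plays
# the role of a NOTIFY arriving on the wire.
server_end, client_end = socket.socketpair()
server_end.sendall(b"NOTIFY my_channel")

# Block until the descriptor is readable -- no query polling needed.
readable, _, _ = select.select([client_end], [], [], 1.0)
payload = client_end.recv(64) if readable else None
```

The same `select()` call can multiplex many connections, which is what makes this approach cheap compared to a query-per-poll loop.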
Re: [HACKERS] splitting plpython into smaller parts
On 28/11/11 11:00, Greg Smith wrote: On 11/13/2011 09:45 AM, Jan Urbański wrote: The second one is the actual split. plpython.c has been split into 11 separate files and one header. Could you comment a bit more about what the goal of this is? We don't have a reviewer for this patch yet, and I think part of the reason is because it's not really obvious what it's supposed to be doing, and why that's useful. The idea of splitting plpython.c (an almost 5k-line file) into something more modular has been floated around here: http://postgresql.1045698.n5.nabble.com/Large-C-files-tt4766446.html#a4773493 and I think on other occasions, too. The patch introduces no behavioural changes; it's only shuffling code around. The only goal is to improve maintainability. I guess the reviewer could verify that a) I haven't botched the split and it all still compiles and works, and b) the choice of modules is sensible. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PL/Python SQL error code pass-through
On 23/11/11 17:24, Mika Eloranta wrote: Hi all, [PL/Python in 9.1 does not preserve SQLSTATE of errors] Oops, you're right, it's a regression from 9.0 behaviour. The fix looks good to me, I changed one place to indent with tabs instead of spaces and added a regression test. I think this should be backpatched to 9.1, no? Thanks for the report and the patch! Cheers, Jan diff --git a/src/pl/plpython/expected/plpython_error.out b/src/pl/plpython/expected/plpython_error.out index dbf19fd..bab07fb 100644 *** a/src/pl/plpython/expected/plpython_error.out --- b/src/pl/plpython/expected/plpython_error.out *** CONTEXT: PL/Python function specific_e *** 351,356 --- 351,378 (1 row) + /* SPI errors in PL/Python functions should preserve the SQLSTATE value + */ + CREATE FUNCTION python_unique_violation() RETURNS void AS $$ + plpy.execute("insert into specific values (1)") + plpy.execute("insert into specific values (1)") + $$ LANGUAGE plpythonu; + CREATE FUNCTION catch_python_unique_violation() RETURNS text AS $$ + begin + begin + perform python_unique_violation(); + exception when unique_violation then + return 'ok'; + end; + return 'not reached'; + end; + $$ language plpgsql; + SELECT catch_python_unique_violation(); + catch_python_unique_violation + --- + ok + (1 row) + /* manually starting subtransactions - a bad idea */ CREATE FUNCTION manual_subxact() RETURNS void AS $$ diff --git a/src/pl/plpython/expected/plpython_error_0.out b/src/pl/plpython/expected/plpython_error_0.out index b2194ff..6cb2ed0 100644 *** a/src/pl/plpython/expected/plpython_error_0.out --- b/src/pl/plpython/expected/plpython_error_0.out *** CONTEXT: PL/Python function specific_e *** 351,356 --- 351,378 (1 row) + /* SPI errors in PL/Python functions should preserve the SQLSTATE value + */ + CREATE FUNCTION python_unique_violation() RETURNS void AS $$ + plpy.execute("insert into specific values (1)") + plpy.execute("insert into specific values (1)") + $$ LANGUAGE plpythonu; + CREATE FUNCTION
catch_python_unique_violation() RETURNS text AS $$ + begin + begin + perform python_unique_violation(); + exception when unique_violation then + return 'ok'; + end; + return 'not reached'; + end; + $$ language plpgsql; + SELECT catch_python_unique_violation(); + catch_python_unique_violation + --- + ok + (1 row) + /* manually starting subtransactions - a bad idea */ CREATE FUNCTION manual_subxact() RETURNS void AS $$ diff --git a/src/pl/plpython/plpython.c b/src/pl/plpython/plpython.c index 93e8043..afd5dfc 100644 *** a/src/pl/plpython/plpython.c --- b/src/pl/plpython/plpython.c *** static char *PLy_procedure_name(PLyProce *** 383,389 static void PLy_elog(int, const char *,...) __attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 3))); ! static void PLy_get_spi_error_data(PyObject *exc, char **detail, char **hint, char **query, int *position); static void PLy_traceback(char **, char **, int *); static void *PLy_malloc(size_t); --- 383,389 static void PLy_elog(int, const char *,...) __attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 3))); ! static void PLy_get_spi_error_data(PyObject *exc, int *sqlerrcode, char **detail, char **hint, char **query, int *position); static void PLy_traceback(char **, char **, int *); static void *PLy_malloc(size_t); *** PLy_spi_exception_set(PyObject *excclass *** 4441,4447 if (!spierror) goto failure; ! spidata = Py_BuildValue("(zzzi)", edata->detail, edata->hint, edata->internalquery, edata->internalpos); if (!spidata) goto failure; --- 4441,4447 if (!spierror) goto failure; ! spidata = Py_BuildValue("(izzzi)", edata->sqlerrcode, edata->detail, edata->hint, edata->internalquery, edata->internalpos); if (!spidata) goto failure; *** PLy_elog(int elevel, const char *fmt,... *** 4481,4486 --- 4481,4487 *val, *tb; const char *primary = NULL; + int sqlerrcode = 0; char *detail = NULL; char *hint = NULL; char *query = NULL; *** PLy_elog(int elevel, const char *fmt,... *** 4490,4496 if (exc != NULL) { if (PyErr_GivenExceptionMatches(val, PLy_exc_spi_error)) !
PLy_get_spi_error_data(val, &detail, &hint, &query, &position); else if (PyErr_GivenExceptionMatches(val, PLy_exc_fatal)) elevel = FATAL; } --- 4491,4497 if (exc != NULL) { if (PyErr_GivenExceptionMatches(val, PLy_exc_spi_error)) ! PLy_get_spi_error_data(val, &sqlerrcode, &detail, &hint, &query, &position); else if (PyErr_GivenExceptionMatches(val, PLy_exc_fatal)) elevel = FATAL; } *** PLy_elog(int elevel, const char *fmt,... *** 4531,4537 PG_TRY(); { ereport(elevel, ! (errmsg_internal("%s", primary ? primary : "no
Re: [HACKERS] PL/Python SQL error code pass-through
On 24/11/11 16:15, Heikki Linnakangas wrote: On 24.11.2011 10:07, Jan Urbański wrote: On 23/11/11 17:24, Mika Eloranta wrote: Hi all, [PL/Python in 9.1 does not preserve SQLSTATE of errors] Oops, you're right, it's a regression from 9.0 behaviour. The fix looks good to me, I changed one place to indent with tabs instead of spaces and added a regression test. Thank you, both. Is there some other fields that we should propagate from the original error message that we're missing? Like, context and file/line information? Or are those left out on purpose? I wonder if we should have a more wholesale approach, and store the whole ErrorData struct somewhere, and only add some extra context information with errcontext(). In case of SPI errors we're preserving the following from the original ErrorData: * sqlerrcode (as of Mika's patch) * detail * hint * query * internalpos that leaves us with the following which are not preserved: * message * context * detail_log The message is being constructed from the Python exception name and I think that's useful. The context is being taken by the traceback string. I'm not sure if detail_log is ever set in these types of errors, probably not? So I guess we're safe. The problem with storing the entire ErrorData struct is that this information has to be transformed to Python objects, because we attach it to the Python exception that gets raised and in case it bubbles all the way up to the topmost PL/Python function, we recover these Python objects and use them to construct the ereport call. While the exception is inside Python, user code can interact with it, so it'd be hard to have C pointers to non-Python stuff there. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
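The round-trip described above — pack the original error fields into a tuple on the exception, let the exception travel through user code, then unpack the tuple at the topmost PL/Python frame to rebuild the ereport — can be sketched in pure Python. This is an illustration of the mechanism only, not the actual C implementation; the class name, field values, and helper names here are all made up:

```python
class SPIError(Exception):
    """Stand-in for PLy_exc_spi_error; carries the original error fields."""

def set_spi_error(sqlerrcode, detail, hint, query, position):
    # Mirrors the idea of attaching a spidata tuple to the raised exception.
    exc = SPIError("message rebuilt from the Python exception name")
    exc.spidata = (sqlerrcode, detail, hint, query, position)
    return exc

def rebuild_ereport(exc):
    # At the outermost PL/Python frame the tuple is unpacked again, so the
    # original error code, detail and hint survive the trip through Python.
    sqlerrcode, detail, hint, query, position = exc.spidata
    return {"sqlerrcode": sqlerrcode, "detail": detail, "hint": hint,
            "query": query, "position": position}

err = set_spi_error(42, "duplicate key value", None, None, 0)
try:
    raise err  # user code may inspect or re-raise the exception meanwhile
except SPIError as caught:
    recovered = rebuild_ereport(caught)
```

Because everything attached to the exception is a plain Python object, user code can interact with it freely while it propagates — which is exactly why the message argues against storing raw C pointers there.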
Re: [HACKERS] plpython SPI cursors
On 20/11/11 19:14, Steve Singer wrote: On 11-10-15 07:28 PM, Jan Urbański wrote: Hi, attached is a patch implementing the usage of SPI cursors in PL/Python. Currently when trying to process a large table in PL/Python you have to slurp it all into memory (that's what plpy.execute does). I found a few bugs (see my testing section below) that will need fixing + a few questions about the code Responding now to all questions and attaching a revised patch based on your comments. Do we like the name plpy.cursor or would we rather call it something like plpy.execute_cursor(...) or plpy.cursor_open(...) or plpy.create_cursor(...) Since we will be mostly stuck with the API once we release 9.2 this is worth some opinions on. I like cursor() but if anyone disagrees now is the time. We use plpy.subtransaction() to create Subxact objects, so I thought plpy.cursor() would be most appropriate. This patch does not provide a wrapper around SPI_cursor_move. The patch is useful without that and I don't see anything that precludes someone else adding that later if they see a need. My idea is to add keyword arguments to plpy.cursor() that will allow you to decide whether you want a scrollable cursor and after that provide a move() method. The patch includes documentation updates that describe the new feature. The Database Access page doesn't provide an API-style list of database access functions like the plperl http://www.postgresql.org/docs/9.1/interactive/plperl-builtins.html page does. I think the organization of the perl page is clearer than the python one and we should think about doing some documentation refactoring. That should be done as a separate patch and shouldn't be a barrier to committing this one. Yeah, the PL/Python docs are a bit chaotic right now. I haven't yet summoned the energy to overhaul them.
in PLy_cursor_plan line 4080 + PG_TRY(); + { + Portal portal; + char *volatile nulls; + volatile int j; I am probably not seeing a code path or misunderstanding something about the setjmp/longjmp usages but I don't see why nulls and j need to be volatile here. It looked like you could drop volatile there (and in PLy_spi_execute_plan, where this is copied from (did I mention there's quite some code duplication in PL/Python?)) but digging in git I found this commit: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=2789b7278c11785750dd9d2837856510ffc67000 that added the original volatile qualification, so I guess there's a reason. line 444 PLy_cursor(PyObject *self, PyObject *args) + { + char *query; + PyObject *plan; + PyObject *planargs = NULL; + + if (PyArg_ParseTuple(args, "s", &query)) + return PLy_cursor_query(query); + Should query be freed with PyMem_free() No, PyArg_ParseTuple with "s" returns a pointer into the argument object's internal buffer, which must not be freed. I checked that repeatedly creating a cursor with a plan argument does not leak memory and that adding PyMem_Free there promptly leads to a segfault. I tested both python 2.6 and 3 on a Linux system [test cases demonstrating bugs] Turns out it's a really bad idea to store pointers to Portal structures, because they get invalidated by the subtransaction abort hooks. I switched to storing the cursor name and looking it up in the appropriate hash table every time it's used. The examples you sent (which I included as regression tests) now cause a ValueError to be raised with a message stating that the cursor has been created in an aborted subtransaction. Not sure about the wording of the error message, though. Thanks again for the review!
Cheers, Jan diff --git a/doc/src/sgml/plpython.sgml b/doc/src/sgml/plpython.sgml index eda2bbf..d08c3d1 100644 --- a/doc/src/sgml/plpython.sgml +++ b/doc/src/sgml/plpython.sgml @@ -892,6 +892,15 @@ $$ LANGUAGE plpythonu; /para para +Note that calling literalplpy.execute/literal will cause the entire +result set to be read into memory. Only use that function when you are sure +that the result set will be relatively small. If you don't want to risk +excessive memory usage when fetching large results, +use literalplpy.cursor/literal rather +than literalplpy.execute/literal. + /para + + para For example: programlisting rv = plpy.execute(SELECT * FROM my_table, 5) @@ -958,6 +967,77 @@ $$ LANGUAGE plpythonu; /sect2 + sect2 +titleAccessing data with cursors/title + + para +The literalplpy.cursor/literal function accepts the same arguments +as literalplpy.execute/literal (except for literallimit/literal) +and returns a cursor object, which allows you to process large result sets +in smaller chunks. As with literalplpy.execute/literal, either a query +string or a plan object along with a list of arguments can be used. The +cursor object provides a literalfetch/literal method that accepts an +integer paramter and returns a result object. Each time you +call literalfetch/literal, the returned object will contain the next
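The fix described in this message — storing the cursor's *name* and re-resolving it on every use, instead of caching a Portal pointer that a subtransaction abort can invalidate — can be sketched in pure Python. All names here are illustrative; the real implementation is C code looking the portal up in SPI's hash table:

```python
# Stand-in for the backend's hash table of open portals.
portals = {}

class Cursor:
    def __init__(self, name, rows):
        self.name = name                  # remember the name, not the portal
        portals[name] = iter(rows)

    def fetch_one(self):
        portal = portals.get(self.name)   # re-lookup on every single use
        if portal is None:
            raise ValueError(
                "iterating a cursor created in an aborted subtransaction")
        return next(portal)

cur = Cursor("c1", [1, 2, 3])
first = cur.fetch_one()

del portals["c1"]   # simulate a subtransaction abort dropping the portal
try:
    cur.fetch_one()
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

The key design point is that the handle held by user code never dangles: if the underlying resource is gone, the lookup fails cleanly with an error instead of dereferencing freed memory.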
Re: [HACKERS] plpython SPI cursors
On 20/11/11 19:14, Steve Singer wrote: On 11-10-15 07:28 PM, Jan Urbański wrote: Hi, attached is a patch implementing the usage of SPI cursors in PL/Python. I found a few bugs (see my testing section below) that will need fixing + a few questions about the code Hi Steve, thanks a lot for the review, I'll investigate the errors you were getting and post a follow-up. Good catch on trying cursors with explicit subtransactions, I didn't think about how they would interact. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [9.1] unusable for large views
On 24/10/11 10:57, Omar Bettin wrote: [monster query] I see that your problem is already solved, but incidentally I'm working on a join order planning module and I'm looking for real-life examples of humongous queries like that to benchmark against. Any chance you could share the schema, or at least part of it, that goes with this query? Or perhaps you have more of these queries? Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Monster query
On 24/10/11 19:16, Omar Bettin wrote: Hi, Since the data are of third parties, will prepare a database suitable for the purpose. In any case, the compressed backup will be around 20 MB. If you are able to prepare a database dump that doesn't contain private data, it would be awesome. If it's 20 MB please just post a link to the archive. Thank you very much! Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: set GUC variables for single query
On 17/10/11 02:53, Robert Haas wrote: On Sun, Oct 16, 2011 at 4:58 PM, Tom Lane t...@sss.pgh.pa.us wrote: Dimitri Fontaine dimi...@2ndquadrant.fr writes: Now that you mention it, the following might actually already work: WITH settings AS ( SELECT set_config('timezone', 'Europe/Amsterdam', t), set_config('work_mem', '1 GB', t) ), foo AS ( SELECT … ) INSERT INTO bar SELECT * FROM foo; Only for small values of work ... you won't be able to affect planner settings that way, nor can you assume that that WITH item is executed before all else. See recent thread pointing out that setting values mid-query is unsafe. I previously floated the idea of using a new keyword, possibly LET, for this, like this: LET var = value [, ...] IN query LET was something I thought about, although you'd have to use something like parentheses around the GUC assignments because value can contain commas, leading to shift/reduce conflicts (that sucks, unfortunately). But before whipping out the paint bucket I wanted to see if there's enough buy-in to justify rehashing the syntax details. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] proposal: set GUC variables for single query
Hi, this idea has cropped up at last PGCon - the ability to set GUC variables for the duration of a single query. It would work by setting the GUCs for the duration of the query and setting them back to what they were after it has terminated. By setting them back I mean respecting the previously set values, regardless of their source (set at run-time, per-role settings, postgresql.conf settings). An example of where this would be useful: an application maintains a persistent connection to the database and answers requests for data from a bunch of clients. Each connected client has a preferred timezone and would like to get results in that timezone. Currently the application has to either sprinkle each query with AT TIME ZONE or wrap the queries in BEGIN; SET LOCAL TIMEZONE ..; query; COMMIT. It gets more complex when things like pgbouncer come into play. Another example is a one-off query that should use a different statement_timeout than the server has configured, or a REINDEX command that would like to use more maintenance_work_mem. It mostly falls into the realm of syntactic sugar, but as more than one person felt it's a good idea, I thought I'd float it around here. I poked a little bit at the grammar to see where it could fit and didn't have much success doing it without a new reserved keyword. Supposing the idea gets some traction, any suggestions for the syntax? Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: set GUC variables for single query
On 16/10/11 17:49, Tom Lane wrote: Jan Urbański wulc...@wulczer.org writes: this idea has cropped up last PGCon - the ability to set GUC variables for the duration of a single query. It would work by setting the GUCs for the duration of the query and setting them back to what they were after it has terminated. Doesn't SET LOCAL cover this use-case pretty well already? It does to a certain degree. If you have a bunch of statements in a transaction and want to execute one of them with a different timezone setting, you have to do the SET/RESET dance. In theory you should also first grab the current value to set it back afterwards, in case someone else did SET LOCAL before you, but I'll admit that's far-fetched. The main use case would be apps running behind pgbouncer and using statement pooling, and plain convenience. I'd find it useful myself, but for now I'm making do with SET LOCAL and it works fine. I'm bringing it up because it appears in the TODO created at the PL Summit: * Further discussion of per-statement config parameters for things like timezone - Jan Urbanski Tryin' to do my bit and all ;) Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
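The SET/RESET dance described above — grab the current value first, set the new one, and put the old one back afterwards even if the query fails — maps naturally onto a save/restore pattern. A pure-Python sketch of that pattern over an in-memory dict standing in for a session's GUC state (no server involved; the setting names and values are made up):

```python
from contextlib import contextmanager

# In-memory stand-in for the session's GUC state.
session_gucs = {"timezone": "UTC", "work_mem": "4MB"}

@contextmanager
def guc(name, value):
    old = session_gucs[name]        # grab the current value first
    session_gucs[name] = value
    try:
        yield
    finally:
        session_gucs[name] = old    # restored even if the "query" raises

with guc("timezone", "Europe/Amsterdam"):
    during = session_gucs["timezone"]   # the per-query value
restored = session_gucs["timezone"]     # back to whatever it was before
```

Restoring the *previous* value (rather than RESETting to the default) is exactly the subtlety the message points out: someone else may have done SET LOCAL before you.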
[HACKERS] plpython SPI cursors
Hi, attached is a patch implementing the usage of SPI cursors in PL/Python. Currently when trying to process a large table in PL/Python you have to slurp it all into memory (that's what plpy.execute does). This patch allows reading the result set in smaller chunks, using a SPI cursor behind the scenes. Example usage: cursor = plpy.cursor("select a, b from hugetable") for row in cursor: plpy.info("a is %s and b is %s" % (row['a'], row['b'])) The patch itself is simple, but there's a lot of boilerplate dedicated to opening a subtransaction and handling prepared plans. I'd like to do some refactoring of the way PL/Python uses SPI to reduce the amount of boilerplate needed, but that'll come as a separate patch (just before the patch to split plpython.c into smaller chunks). This feature has been sponsored by Nomao. Cheers, Jan PS: I already added it to the November CF. From 9ad14957e7b4ae19667df3bb8cc2aa5ef5bf96c8 Mon Sep 17 00:00:00 2001 From: Jan Urbański wulc...@wulczer.org Date: Tue, 13 Sep 2011 14:42:41 +0200 Subject: [PATCH] Add cursor support to plpythonu. Exposes SPI cursors as plpythonu objects allowing processing large result sets without loading them entirely into memory, as plpy.execute is doing. --- doc/src/sgml/plpython.sgml | 80 src/pl/plpython/expected/plpython_spi.out | 151 +++ src/pl/plpython/expected/plpython_test.out |6 +- src/pl/plpython/plpython.c | 605 src/pl/plpython/sql/plpython_spi.sql | 116 ++ 5 files changed, 955 insertions(+), 3 deletions(-) diff --git a/doc/src/sgml/plpython.sgml b/doc/src/sgml/plpython.sgml index eda2bbf..d08c3d1 100644 *** a/doc/src/sgml/plpython.sgml --- b/doc/src/sgml/plpython.sgml *** $$ LANGUAGE plpythonu; *** 892,897 --- 892,906 /para para + Note that calling literalplpy.execute/literal will cause the entire + result set to be read into memory. Only use that function when you are sure + that the result set will be relatively small.
If you don't want to risk + excessive memory usage when fetching large results, + use literalplpy.cursor/literal rather + than literalplpy.execute/literal. + /para + + para For example: programlisting rv = plpy.execute(SELECT * FROM my_table, 5) *** $$ LANGUAGE plpythonu; *** 958,963 --- 967,1043 /sect2 + sect2 + titleAccessing data with cursors/title + + para + The literalplpy.cursor/literal function accepts the same arguments + as literalplpy.execute/literal (except for literallimit/literal) + and returns a cursor object, which allows you to process large result sets + in smaller chunks. As with literalplpy.execute/literal, either a query + string or a plan object along with a list of arguments can be used. The + cursor object provides a literalfetch/literal method that accepts an + integer paramter and returns a result object. Each time you + call literalfetch/literal, the returned object will contain the next + batch of rows, never larger than the parameter value. Once all rows are + exhausted, literalfetch/literal starts returning an empty result + object. Cursor objects also provide an + ulink url=http://docs.python.org/library/stdtypes.html#iterator-types;iterator + interface/ulink, yielding one row at a time until all rows are exhausted. + Data fetched that way is not returned as result objects, but rather as + dictionaries, each dictionary corresponding to a single result row. + /para + + para + Cursors are automatically disposed of, but if you want to explicitly + release all resources held by a cursor, use the literalclose/literal + method. Once closed, a cursor cannot be fetched from anymore. + /para + +note + para + Do not confuse objects created by literalplpy.cursor/literal with + DBAPI cursors as defined by + the ulink url=http://www.python.org/dev/peps/pep-0249/;Python Database API specification/ulink. + They don't have anything in common except for the name. 
+ /para +/note + + para + An example of two ways of processing data from a large table would be: + programlisting + CREATE FUNCTION count_odd_iterator() RETURNS integer AS $$ + odd = 0 + for row in plpy.cursor("select num from largetable"): + if row['num'] % 2: + odd += 1 + return odd + $$ LANGUAGE plpythonu; + + CREATE FUNCTION count_odd_fetch(batch_size integer) RETURNS integer AS $$ + odd = 0 + cursor = plpy.cursor("select num from largetable") + while True: + rows = cursor.fetch(batch_size) + if not rows: + break + for row in rows: + if row['num'] % 2: + odd += 1 + return odd + $$ LANGUAGE plpythonu; + + CREATE FUNCTION count_odd_prepared() RETURNS integer AS $$ + odd = 0 + plan = plpy.prepare("select num from largetable where num % $1 != 0,
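The two access styles shown in the documentation patch — batched `fetch()` and row-at-a-time iteration — can be modelled in plain Python to see how they relate. This is a sketch only; `plpy.cursor` itself is implemented in C on top of SPI, and `FakeCursor` is a made-up name:

```python
class FakeCursor:
    """Models plpy.cursor's interface: fetch(n) returns batches, and the
    object is iterable one row at a time until rows are exhausted."""
    def __init__(self, rows):
        self._rows = list(rows)
        self._pos = 0

    def fetch(self, n):
        batch = self._rows[self._pos:self._pos + n]
        self._pos += len(batch)
        return batch              # empty once exhausted, like plpy's result

    def __iter__(self):
        while True:
            batch = self.fetch(1)
            if not batch:
                return
            yield batch[0]

# Same shape as count_odd_iterator, against fake in-memory rows 0..9.
cursor = FakeCursor({"num": i} for i in range(10))
odd = sum(1 for row in cursor if row["num"] % 2)

# And the count_odd_fetch style, in batches of 3.
cursor2 = FakeCursor({"num": i} for i in range(10))
odd2 = 0
while True:
    rows = cursor2.fetch(3)
    if not rows:
        break
    odd2 += sum(1 for row in rows if row["num"] % 2)
```

Both styles visit every row exactly once; the difference is only how much of the result set is materialised at a time, which is the whole point of the cursor interface.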
Re: [HACKERS] Error building v9.1.1 (git) with python 3.2.2
On 05/10/11 15:24, Lou Picciano wrote: Hackers, Have been following with some interest the dialog recently about getting pl/python built within the v9 tree; most recently, I've been trying v9.1.1 (git) and python 3.2.2. Configure and make complete without error, but 'make install' does not produce any plpython.so I have since added a symlink to /usr/lib/config-3.2mu, as suggested in that dialog - No Joy. Can anyone point me in the right direction? Could you attach the configure and build logs? It's hard to diagnose with so little info. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Large C files
On 07/09/11 01:13, Peter Geoghegan wrote: On 6 September 2011 08:29, Peter Eisentraut pete...@gmx.net wrote: I was thinking about splitting up plpython.c, but it's not even on that list. ;-) IIRC the obesity of that file is something that Jan Urbański intends to fix, or is at least concerned about. Perhaps you should compare notes. Yeah, plpython.c could easily be split into submodules for builtin functions, SPI support, subtransaction support, traceback support etc. If there is consensus that this should be done, I'll see if I can find time to submit a giant-diff-no-behaviour-changes patch for the next CF. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] any results from PL summit?
On 29/08/11 08:21, Pavel Stehule wrote: Hello Hi Pavel, is there some some result, report from PL summit? The notes taken during the summit are here: http://wiki.postgresql.org/wiki/PgCon_2011_PL_Summit I think Selena also sent a summary and next steps mail to the participants, perhaps it could be reposted on the mailing list. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython crash
On 17/08/11 23:10, Tom Lane wrote: Jan Urbański wulc...@wulczer.org writes: Here are two patches that fix two separate bugs that you found simultaneously. Because they're actually separate issues, it turned out fixing them was a bit more tricky than I expected (fixing one was unmasking the other one etc). Applied with one non-cosmetic change: I got rid of the test on TransactionIdIsValid(arg->typrel_xmin) in PLy_input_tuple_funcs, as well as where you'd copied that logic in PLy_output_tuple_funcs. AFAICS skipping the update on the xmin/tid, if we're coming through there a second time, would be simply wrong. Thanks! The way things are set up now I think you never go through PLy_input_tuple_funcs twice, unless the cache is determined to be invalid and then you recreate the function from scratch. But of course it's better to be safe than sorry and even if I'm right and it was never executed twice, any refactoring effort might have broken it easily. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Caching Python modules
On 17/08/11 14:09, PostgreSQL - Hans-Jürgen Schönig wrote:
> CREATE OR REPLACE FUNCTION textprocess.add_to_corpus(lang text, t text) RETURNS float4 AS $$
>
> from SecondCorpus import SecondCorpus
> from SecondDocument import SecondDocument
>
> i am doing some intense text mining here. the problem is: is it possible to cache those imported modules from function to function call. GD works nicely for variables but can this actually be done with imported modules as well? the import takes around 95% of the total time so it is definitely something which should go away somehow. i have checked the docs but i am not more clever now.

After a module is imported in a backend, it stays in the interpreter's sys.modules dictionary, and importing it again will not cause the module's Python code to be executed again. As long as you are using the same backend, you should be able to call add_to_corpus repeatedly, and the import statements should take a long time only the first time you call them.

This simple test demonstrates it:

$ cat /tmp/slow.py
import time
time.sleep(5)

$ PYTHONPATH=/tmp/ bin/postgres -p 5433 -D data/
LOG:  database system was shut down at 2011-08-17 14:16:18 CEST
LOG:  database system is ready to accept connections

$ bin/psql -p 5433 postgres
Timing is on.
psql (9.2devel)
Type "help" for help.

postgres=# select slow();
 slow
------

(1 row)

Time: 5032.835 ms
postgres=# select slow();
 slow
------

(1 row)

Time: 1.051 ms

Cheers,
Jan
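[Editor's note: the caching described above is plain CPython behaviour, so it can also be reproduced outside the backend. A self-contained sketch; the throwaway module name "slowmod" is made up for the demo and stands in for SecondCorpus/SecondDocument:]

```python
import os
import sys
import tempfile
import time

# Create a throwaway module whose import takes noticeable time, as a
# stand-in for the expensive SecondCorpus/SecondDocument imports.
moddir = tempfile.mkdtemp()
with open(os.path.join(moddir, "slowmod.py"), "w") as f:
    f.write("import time\ntime.sleep(1)\n")
sys.path.insert(0, moddir)

start = time.time()
import slowmod          # first import executes the module body (sleeps)
first = time.time() - start

start = time.time()
import slowmod          # found in sys.modules, body is not re-run
second = time.time() - start

print(first, second)    # the second import is orders of magnitude faster
```

The same mechanism is what makes the second `select slow()` above return in about a millisecond: the interpreter embedded in the backend keeps `sys.modules` alive for the life of the session.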
Re: [HACKERS] Caching Python modules
On 17/08/11 14:19, Jan Urbański wrote:
> [...]
> This simple test demonstrates it:
> [example missing the slow() function code]

Oops, forgot to show the CREATE statement of the test function:

postgres=# create or replace function slow() returns void language plpythonu as $$
import slow
$$;

Jan
Re: [HACKERS] plpython crash
On 17/08/11 11:40, Jan Urbański wrote:
> On 16/08/11 19:12, Jan Urbański wrote:
>> On 16/08/11 19:07, Jean-Baptiste Quenot wrote:
>>> [plpython is buggy]
>>
>> I'll have a patch ready soon.

Here are two patches that fix two separate bugs that you found simultaneously. Because they're actually separate issues, it turned out fixing them was a bit more tricky than I expected (fixing one was unmasking the other one etc).

Thanks for the report!

Jan

From 3c0bf7519cad735160d9d222d6f86f84987b38b5 Mon Sep 17 00:00:00 2001
From: Jan Urbański <wulc...@wulczer.org>
Date: Wed, 17 Aug 2011 16:07:54 +0200
Subject: [PATCH 2/2] Guard against return type changing in PL/Python functions.

Functions cache their I/O routines, and in case their return type is composite, a change of the underlying type can cause the cache to become invalid. PL/Python was already checking for composite type changes in input arguments; now the check is extended to cover the return type as well.

Per bug report from Jean-Baptiste Quenot.
---
 src/pl/plpython/expected/plpython_record.out |   21 ++
 src/pl/plpython/plpython.c                   |   93 ++---
 src/pl/plpython/sql/plpython_record.sql      |   15
 3 files changed, 103 insertions(+), 26 deletions(-)

diff --git a/src/pl/plpython/expected/plpython_record.out b/src/pl/plpython/expected/plpython_record.out
index 7c60089..0bcc46c 100644
--- a/src/pl/plpython/expected/plpython_record.out
+++ b/src/pl/plpython/expected/plpython_record.out
@@ -308,6 +308,27 @@ SELECT * FROM test_inout_params('test_in');
  test_in_inout
 (1 row)
 
+-- try changing the return types and call functions again
+ALTER TABLE table_record DROP COLUMN first;
+ALTER TABLE table_record DROP COLUMN second;
+ALTER TABLE table_record ADD COLUMN first text;
+ALTER TABLE table_record ADD COLUMN second int4;
+SELECT * FROM test_table_record_as('obj', 'one', 1, false);
+ first | second
+-------+--------
+ one   |      1
+(1 row)
+
+ALTER TYPE type_record DROP ATTRIBUTE first;
+ALTER TYPE type_record DROP ATTRIBUTE second;
+ALTER TYPE type_record ADD ATTRIBUTE first text;
+ALTER TYPE type_record ADD ATTRIBUTE second int4;
+SELECT * FROM test_type_record_as('obj', 'one', 1, false);
+ first | second
+-------+--------
+ one   |      1
+(1 row)
+
 -- errors cases
 CREATE FUNCTION test_type_record_error1() RETURNS type_record AS $$
 return { 'first': 'first' }
diff --git a/src/pl/plpython/plpython.c b/src/pl/plpython/plpython.c
index 90d3c47..a254ffa 100644
--- a/src/pl/plpython/plpython.c
+++ b/src/pl/plpython/plpython.c
@@ -1489,6 +1489,42 @@ PLy_function_delete_args(PLyProcedure *proc)
 		PyDict_DelItemString(proc->globals, proc->argnames[i]);
 }
 
+static bool
+PLy_procedure_argument_valid(PLyTypeInfo *arg)
+{
+	Oid			relid;
+	HeapTuple	relTup;
+	bool		valid;
+
+	/* Only check input arguments that are composite */
+	if (arg->is_rowtype != 1) {
+		return true;
+	}
+
+	/* An uninitialised typ_relid means that we got called on an output
+	 * argument of a function returning an unnamed record type */
+	if (!OidIsValid(arg->typ_relid)) {
+		return true;
+	}
+
+	Assert(TransactionIdIsValid(arg->typrel_xmin));
+	Assert(ItemPointerIsValid(&arg->typrel_tid));
+
+	/* Get the pg_class tuple for the argument type */
+	relid = arg->typ_relid;
+	relTup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+	if (!HeapTupleIsValid(relTup))
+		elog(ERROR, "cache lookup failed for relation %u", relid);
+
+	/* If it has changed, the function is not valid */
+	valid = (arg->typrel_xmin == HeapTupleHeaderGetXmin(relTup->t_data) &&
+			 ItemPointerEquals(&arg->typrel_tid, &relTup->t_self));
+
+	ReleaseSysCache(relTup);
+
+	return valid;
+}
+
 /*
  * Decide whether a cached PLyProcedure struct is still valid
  */
@@ -1509,33 +1545,16 @@ PLy_procedure_valid(PLyProcedure *proc, HeapTuple procTup)
 	/* If there are composite input arguments, they might have changed */
 	for (i = 0; i < proc->nargs; i++)
 	{
-		Oid			relid;
-		HeapTuple	relTup;
-
 		/* Short-circuit on first changed argument */
 		if (!valid)
 			break;
 
-		/* Only check input arguments that are composite */
-		if (proc->args[i].is_rowtype != 1)
-			continue;
-
-		Assert(OidIsValid(proc->args[i].typ_relid));
-		Assert(TransactionIdIsValid(proc->args[i].typrel_xmin));
-		Assert(ItemPointerIsValid(&proc->args[i].typrel_tid));
-
-		/* Get the pg_class tuple for the argument type */
-		relid = proc->args[i].typ_relid;
-		relTup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
-		if (!HeapTupleIsValid(relTup))
-			elog(ERROR, "cache lookup failed for relation %u", relid);
-
-		/* If it has changed, the function is not valid */
-		if (!(proc->args[i].typrel_xmin == HeapTupleHeaderGetXmin(relTup->t_data) &&
-			  ItemPointerEquals(&proc->args[i].typrel_tid, &relTup->t_self)))
-			valid = false;
+		valid = PLy_procedure_argument_valid(&proc->args[i]);
+	}
 
-		ReleaseSysCache(relTup);
+	/* if the output argument is composite, it might have changed */
+	if (valid) {
+		valid
Re: [HACKERS] rc1 or beta4?
On 17/08/11 15:00, Dave Page wrote:
> The current plan (or, the last one I recall) is to push another 9.1 release tomorrow, for Monday release. Are we going with beta4 or rc1?

Sorry to butt in, but it would probably be good to include fixes for the two segfault plpython bugs[1] before wrapping up the release.

Cheers,
Jan

[1] http://archives.postgresql.org/message-id/4e4bcd52.90...@wulczer.org
Re: [HACKERS] rc1 or beta4?
On 17/08/11 17:50, Thom Brown wrote:
> On 17 August 2011 16:47, Jan Urbański <wulc...@wulczer.org> wrote:
>> Sorry to butt in, but it would probably be good to include fixes for the two segfault plpython bugs[1] before wrapping up the release.
>
> It's not listed as a beta-blocker yet. I take it that it should?

Oh, in the wiki? I don't know; it is a segfault-causing bug, but all I wanted was to draw some attention in case the people wrapping the release missed that thread.

Jan
Re: [HACKERS] plpython crash
On 16/08/11 16:52, Jean-Baptiste Quenot wrote:
> After backporting plpython.c from HEAD, this is the error message I get:
>
> ERROR:  key "pg.dropped.6" not found in mapping
> HINT:  To return null in a column, add the value None to the mapping with the key named after the column.
> CONTEXT:  while creating return value
> PL/Python function "myfunc"
>
> What does it mean?

Ah, interesting. I think this means that you are returning a table type, and that table has a dropped column. The code should skip over dropped columns, but apparently it does not, and tries to find a value for that column in the mapping you are returning.

I'll try to reproduce it here.

Jan
Re: [HACKERS] plpython crash
On 16/08/11 17:06, Jan Urbański wrote:
> [...]
> I'll try to reproduce it here.

I did a couple of simple tests and can't see how the code could fail to skip dropped columns. It seems like you're missing this commit: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=41282111e6cc73aca4b63dffe950ba7a63e4bd8a

Could you try running this query? (assuming your function is called 'myfunc')

select proname, relname, attname, attisdropped
  from pg_proc
  join pg_class on (prorettype = reltype)
  join pg_attribute on (attrelid = pg_class.oid)
 where proname = 'myfunc';

Cheers,
Jan
Re: [HACKERS] plpython crash
On 16/08/11 19:07, Jean-Baptiste Quenot wrote:
> Dear Jan,
>
> Sorry, I typed the wrong git commands. With the latest plpython from branch master I got the same gdb backtrace as reported before. I managed to wrap up a testcase that fails 100% of the time on my setup: https://gist.github.com/1149512
>
> Hope it crashes on your side too :-)

Awesome, it segfaults for me with HEAD ;) Now it's just a simple matter of programming... I'll take a look at it this evening.

Jan
Re: [HACKERS] plpython crash
On 11/08/11 18:01, Jean-Baptiste Quenot wrote:
> Hi there,
>
> plpython crashes on me on various 64-bit Ubuntu hosts, see the gdb backtrace at: https://gist.github.com/1140005
>
> Do you believe there were recent bugfixes regarding PLyMapping_ToTuple()? This is PG 9.0.4 with HEAD of plpython taken in March 2011 and backported. Please tell me if you need more information.

Hi,

there were no changes to that area of plpython after March 2011. Could you try to see if the error also appears if you run your app with current PostgreSQL HEAD (both the server and plpython)?

Which Python version is that? You can get that info by running:

do $$ import sys; plpy.info(sys.version) $$ language plpythonu;

Could you try to extract a self-contained example of how to reproduce it? If the bug only appears under your application's specific workload, perhaps you could try running it with Postgres compiled with -O0, because compiling with -O2 leaves the gdb backtrace with "optimised out" values and missing inlined functions.

Cheers,
Jan
Re: [HACKERS] plpython crash
On 12/08/11 13:55, Jean-Baptiste Quenot wrote:
> Here is the same with -O0: https://gist.github.com/1140005
>
> sys.version reports this:
>
> INFO:  2.6.6 (r266:84292, Sep 15 2010, 16:41:53) [GCC 4.4.5]

I'm still at a loss. Did you reproduce it with git HEAD? I see that the query being executed is "select * from myfunc();"; would it be possible to share the code of myfunc?

Jan
Re: Initial Review: JSON contrib modul was: Re: [HACKERS] Another swing at JSON
On 23/07/11 01:12, Robert Haas wrote:
> On Fri, Jul 22, 2011 at 6:04 PM, Joey Adams <joeyadams3.14...@gmail.com> wrote:
>> On another matter, should the JSON type guard against duplicate member keys? The JSON RFC says "The names within an object SHOULD be unique", meaning JSON with duplicate members can be considered valid. JavaScript interpreters (the ones I tried), PHP, and Python all have the same behavior: discard the first member in favor of the second. That is, {"key":1,"key":2} becomes {"key":2}.
>>
>> The XML type throws an error if a duplicate attribute is present (e.g. '<a href="b" href="c"/>'::xml).
>
> Hmm. That's tricky. I lean mildly toward throwing an error as being more consistent with the general PG philosophy.

OTOH:

regression=# select 'key=>1,key=>2'::hstore;
   hstore
------------
 "key"=>"1"
(1 row)

Cheers,
Jan
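[Editor's note: the "discard the first member in favor of the second" behaviour claimed above for Python is easy to check with the standard json module:]

```python
import json

# Duplicate members in a JSON object: the second value wins,
# matching the JavaScript/PHP behaviour described in the thread,
# and the opposite of hstore's keep-the-first behaviour shown above.
doc = json.loads('{"key":1,"key":2}')
print(doc)  # {'key': 2}
```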
Re: [HACKERS] PL/Python: No stack trace for an exception
On 21/07/11 15:27, Sushant Sinha wrote:
> I am using plpythonu on postgres 9.0.2. One of my python functions was throwing a TypeError exception. However, I only see the exception in the database and not the stack trace. It becomes difficult to debug if the stack trace is absent in Python.
>
> logdb=# select get_words(forminput) from fi;
> ERROR:  PL/Python: TypeError: an integer is required
> CONTEXT:  PL/Python function get_words
>
> And here is the error if I run that function on the same data in python:
> [traceback]
>
> Is this a known problem or does this need addressing?

Yes, traceback support in PL/Python has already been implemented; it is a new feature that will be available in PostgreSQL 9.1.

Cheers,
Jan
Re: [HACKERS] pgbench--new transaction type
On 13/06/11 06:38, Greg Smith wrote:
> On 06/11/2011 03:21 PM, Jeff Janes wrote:
>> I wouldn't expect IPC chatter to show up in profiling, because it costs wall time, but not CPU time. The time spent might be attributed to the kernel, or to pgbench, or to nothing at all.
>
> Profilers aren't necessarily just accumulating raw CPU time though. If the approach includes sampling "what code is active right now?" periodically, you might be able to separate this out even though it's not using CPU time in the normal fashion. I think you might just need to use a better profiler.

I got surprisingly insightful results in the past using http://poormansprofiler.org/

I never used it with Postgres, but it might be worth a try.

J
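[Editor's note: the poor man's profiler works by attaching gdb to the target process in a loop (roughly `gdb -batch -ex "thread apply all bt" -p PID`), then counting which stacks recur most often across the samples. A sketch of the aggregation step in Python; the `aggregate_stacks` helper and its frame-parsing assumptions are illustrative, not part of the original script:]

```python
from collections import Counter


def aggregate_stacks(gdb_output: str, top: int = 5):
    """Collapse raw 'thread apply all bt' samples into (stack, count)
    pairs, most frequent first.  Frame lines are assumed to look like
    '#0  0xdeadbeef in recv () from libc.so' or
    '#1  secure_read (...) at be-secure.c:304'."""
    stacks = []
    current = []
    for line in gdb_output.splitlines():
        if line.startswith("Thread"):
            # A new backtrace sample begins; flush the previous one.
            if current:
                stacks.append(",".join(current))
            current = []
        elif line.startswith("#"):
            parts = line.split()
            # With an address, the function name follows "in";
            # without one, it comes right after the frame number.
            func = parts[3] if parts[1].startswith("0x") else parts[1]
            current.append(func)
    if current:
        stacks.append(",".join(current))
    return Counter(stacks).most_common(top)
```

Wall-clock waits (like the IPC chatter discussed above) show up in these samples because a blocked process is still sitting in some stack, even though it accumulates no CPU time.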
Re: [HACKERS] increasing collapse_limits?
On 01/05/11 21:16, Joshua Berkus wrote:
> Speaking of which, what happened to replacing GEQO with Simulated Annealing? Where did that project go?

It stayed on GitHub (https://github.com/wulczer/saio) and stagnated a bit after I got my degree. It's at the top of my list of things to pick up after the summer (or maybe even during the summer).

Cheers,
Jan