Re: [HACKERS] pl/python tracebacks
On Mar 1, 2011, at 12:10 PM, Jan Urbański wrote:
> So you end up with a context message saying "PL/Python function %s" and a
> detail message with the saved detail (if it's present) *and* the traceback.
> The problem is that the name of the function is already in the traceback,
> so there's no need for the context *if* there's a traceback present.
>
> The problem I'm having is technical: since the callback is already set when
> the code reaches the traceback-printing stage, you can't really unset it.
> AFAICS the elog code calls *all* callbacks from error_context_stack. So I
> can't prevent the context message from appearing. If I make the traceback
> part of the context as well, it's just going to appear together with the
> message from the callback.

I remember going through a lot of pain getting this done right in pg-python [pl/py].

SELECT it_blows_up();
ERROR:  function's "main" raised a Python exception
CONTEXT:  [exception from Python]
Traceback (most recent call last):
   File "public.it_blows_up()", line 13, in main
    three()
   File "public.it_blows_up()", line 10, in three
    return two()
   File "public.it_blows_up()", line 7, in two
    return one()
   File "public.it_blows_up()", line 4, in one
    raise OverflowError("there's water everywhere")
 OverflowError: there's water everywhere
[public.it_blows_up()]

IIRC, I unconditionally print the "[public.it_blows_up()]" part iff it's not an ERROR. If it is an ERROR, I let the traceback rendering part of the code handle it on the PL's entry point exit. It was really tricky to do this because I was rendering the traceback *after* the error_context_stack had been called.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
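For what it's worth, the CONTEXT body shown above can be produced with the stock traceback module. A minimal sketch of the idea (not the actual pg-python rendering code; render_context is a name I made up):

```python
import traceback

def render_context(exc: BaseException) -> str:
    # Render an exception into a PL-style CONTEXT message.  Because the
    # function name already appears in the traceback frames, no separate
    # "PL/Python function %s" context line is needed when this is used.
    lines = traceback.format_exception(type(exc), exc, exc.__traceback__)
    return "[exception from Python]\n" + "".join(lines).rstrip()

def it_blows_up():
    raise OverflowError("there's water everywhere")

try:
    it_blows_up()
except OverflowError as e:
    ctx = render_context(e)
```

The resulting string carries both the frame list (with function names) and the final exception line, which is exactly why a separate context message becomes redundant.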
Re: [HACKERS] pl/python improvements
On Dec 23, 2010, at 3:38 AM, Jan Urbański wrote:
> Oh, didn't know that. I see that it does some more fancy things, like
> defining an inheritance hierarchy for these exceptions and adding some
> more into the mix.

Right, there were some cases that appeared to benefit from larger buckets than what the existing code classes provided. Also, some of the exceptions in there are strictly for py-postgresql/client-side things.

> The names I used are not really invented, they're just plpgsql condition
> names from
> http://www.postgresql.org/docs/current/static/errcodes-appendix.html
> with underscores changed to camel case. Also, since they're autogenerated
> from utils/errcodes.h they don't have any hierarchy, they just all
> inherit from SPIError.

For the backend setting, I think this is quite appropriate. However, for pg-python, I had mixed feelings about this as I wanted to be able to leverage py-postgresql's hierarchy, but still have the projects independent. I ended up punting on this one by using a single error class, and forcing the user to compare the codes. =(

> Sticking Error to every one of them will result in things like
> SubstringErrorError, so I'm not really sold on that.

There was some creativity applied to the names in postgresql.exceptions to accommodate things like that. (Like no redundant "Error".)

> Basically I think more PL/Python users will be familiar with condition
> names as you use them in pl/pgsql than with the names from py-postgresql.

I think that's a fair assumption. In fact, I think that might make a good TODO for py-postgresql/pg-python: provide a plpgsql-condition-name to exception class mapping.

cheers, jwp
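The "condition names with underscores changed to camel case, all inheriting from SPIError" scheme described above can be sketched in a few lines. The helper names here are mine, not the actual plpython generator:

```python
class SPIError(Exception):
    """Stand-in for plpython's SPIError base class."""

def condition_to_class_name(condition: str) -> str:
    # plpgsql condition name -> CamelCase: division_by_zero -> DivisionByZero
    return "".join(part.capitalize() for part in condition.split("_"))

def build_exceptions(conditions):
    # Flat hierarchy, as described: every class inherits directly
    # from SPIError, with no intermediate buckets.
    return {name: type(name, (SPIError,), {})
            for name in map(condition_to_class_name, conditions)}

excs = build_exceptions(["division_by_zero", "unique_violation", "substring_error"])
```

Note how "substring_error" maps to SubstringError with no extra suffix, which is the SubstringErrorError problem the thread mentions: appending "Error" blindly would double it.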
Re: [HACKERS] hstores in pl/python
On Dec 13, 2010, at 6:16 PM, Tom Lane wrote:
> how do you identify which type OID is really hstore?

How about an identification field on pg_type?

CREATE TYPE hstore ..., IDENTIFIER 'org.postgresql.hstore';
-- Where the identifier is an arbitrary string.

Type information can be looked up by the PL, and the I/O functions can be dynamically resolved using the identifier.
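To illustrate the lookup half of this proposal: with an arbitrary identifier string on pg_type, a PL could keep a registry keyed by identifier and fall back to generic text I/O when it has no special handling for a type. Everything here (names, registry shape) is hypothetical, not an existing API:

```python
# Hypothetical PL-side registry mapping type identifiers to the
# Python-side I/O routines the PL wants to use for them.
io_registry = {
    "org.postgresql.hstore": ("hstore_to_py", "py_to_hstore"),
}

def resolve_io(identifier, default=("text_to_py", "py_to_text")):
    # Unknown identifiers (or types without one) degrade gracefully
    # to plain text I/O instead of failing on an OID comparison.
    return io_registry.get(identifier, default)
```

The point of the identifier over a raw OID check is exactly this: the OID of an extension type varies per database, while the identifier string is stable.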
Re: [HACKERS] [JDBC] Trouble with COPY IN
On Aug 9, 2010, at 11:49 AM, Kris Jurka wrote:
> Oh, duh. It's a server side copy not going through the client at all.
> Here's a hopefully final patch.

Trying it out... Works for me.

I understand the resistance to the patch, but it would be quite nice to see this wart in the rear view. =\
Re: [HACKERS] Python 2.7 deprecated the PyCObject API?
On Aug 14, 2010, at 9:08 AM, Tom Lane wrote:
> Just to clarify, you're recommending something like
>
>     proc->me = PyCObject_FromVoidPtr(proc, NULL);
> +   if (proc->me == NULL)
> +       elog(ERROR, "could not create PyCObject for function");
>     PyDict_SetItemString(PLy_procedure_cache, key, proc->me);
>
> correct? (Hm, and it looks like we'd better move the pfree just above
> that...)

Almost; there's still a Python exception to report and/or clear. I only glanced at this and didn't recall what the plpython mechanisms were for that, thus the ambiguous complain().

> Yeah, and since we'll have to back-patch it, a fairly noninvasive patch
> would be nice. Will you work on that?

I was hoping that Peter would pop in with a patch, but I think a few lines of CPP may suffice.. (warning: untested =)

#ifdef Py_CAPSULE_H
/*
 * Python.h (2.7 and up) includes pycapsule.h, so rely on the header
 * define to detect the API's existence.
 */
#define PyCObject_FromVoidPtr(POINTER, IGNORED) PyCapsule_New(POINTER, NULL, NULL)
#undef PyCObject_Check
#define PyCObject_Check(OBJ) PyCapsule_CheckExact(OBJ)
#define PyCObject_AsVoidPtr(OBJ) PyCapsule_GetPointer(OBJ, NULL)
#endif /* Py_CAPSULE_H */

http://svn.python.org/view/python/branches/release27-maint/Include/pycapsule.h?view=markup

yay? nay?
Re: [HACKERS] Python 2.7 deprecated the PyCObject API?
On Aug 13, 2010, at 5:20 PM, Tom Lane wrote:
> According to a discussion over in Fedora-land, $subject is true:
> http://lists.fedoraproject.org/pipermail/devel/2010-August/140995.html
> I see several calls in plpython.c that seem to refer to PyCObject stuff.
> Anybody have any idea if we need to do something about this?

Well, we should at least be checking for an exception here anyways:

    proc->me = PyCObject_FromVoidPtr(proc, NULL);
+   if (proc->me == NULL)
+       complain();
    PyDict_SetItemString(PLy_procedure_cache, key, proc->me);

That is, with those warnings adjustments, proc->me will be NULL and then explode in PyDict_SetItemString:

[David Malcolm]
> However, if someone overrides the process-wide warnings settings, then
> the API can fail altogether, raising a PendingDeprecationWarning
> exception (which in CPython terms means setting a thread-specific error
> state and returning NULL).

AFA a better fix is concerned, the shortest route would seem to be to use the new capsule stuff iff Python >= 2.7.
Re: [HACKERS] [JDBC] Trouble with COPY IN
On Aug 6, 2010, at 4:31 PM, Kris Jurka wrote:
> binary-copy-end-v2.patch

I think there's a snag in the patch:

postgres=# COPY data FROM '/Users/jwp/DATA.bcopy' WITH BINARY;
ERROR:  row field count is -1, expected 1
CONTEXT:  COPY data, line 4

Probably a quick/small fix away, I imagine.

But, I was able to trigger the new ERROR with py-postgresql:

import postgresql as pg
db = pg.open('localhost/postgres')
q = db.prepare('copy data FROM STDIN WITH BINARY')
from itertools import chain
import sys
db.pq.tracer = sys.stderr.write
q.load_rows(chain(open('/Users/jwp/DATA.bcopy', 'rb'), (b'EXTRA',)))

↑ B(25): b'B\x00\x00\x00\x18\x00py:0x1268b30\x00\x00\x00\x00\x00\x00\x00'
↑ E(10): b'E\x00\x00\x00\t\x00\x00\x00\x00\x01'
↑ S(5): b'S\x00\x00\x00\x04'
↓ b'2'(0): b''
↓ b'G'(5): b'\x01\x00\x01\x00\x01'
↑__(7): b'PGCOPY\n'
↑__(3): b'\xff\r\n'
↑__(41): b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x00\x01\x00\x01\x00\x00\x00\x04\x00\x00\x00\x02\x00\x01\x00\x00\x00\x04\x00\x00\x00\x03\xff\xff'
↑__(5): b'EXTRA'
↑ c(5): b'c\x00\x00\x00\x04'
↑ S(5): b'S\x00\x00\x00\x04'
↓ b'E'(95): b'SERROR\x00C22P04\x00Mreceived copy data after EOF marker\x00WCOPY data, line 4\x00Fcopy.c\x00L2081\x00RCopyFrom\x00\x00'
↓ b'Z'(1): b'I'

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  <snip>
  File "/Library/Frameworks/Python.framework/Versions/3.1/lib/python3.1/site-packages/postgresql/driver/pq3.py", line 462, in raise_server_error
    raise server_error
postgresql.exceptions.BadCopyError: received copy data after EOF marker
  CODE: 22P04
  LOCATION: File 'copy.c', line 2081, in CopyFrom from SERVER
CONTEXT: COPY data, line 4
STATEMENT: [prepared]
  sql_parameter_types: []
  statement_id: py:0x1268b30
  string: copy data FROM STDIN WITH BINARY
CONNECTION: [idle]
  client_address: ::1/128
  client_port: 63922
  version: PostgreSQL 9.1devel on x86_64-apple-darwin10.4.0, compiled by GCC i686-apple-darwin10-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5664), 64-bit
CONNECTOR: [Host]
  pq://jwp:*...@localhost:5432/postgres
  category: None
DRIVER: postgresql.driver.pq3.Driver
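For reference, the layout the trace above is exercising: COPY BINARY data starts with an 11-byte signature, two int32 header fields (flags and extension length), then per-tuple int16 field counts with length-prefixed fields, and ends with a field count of -1 as the EOF marker. That trailer is what shows up in the "row field count is -1" error when the reader keeps going past it. A serializer sketch under those assumptions (binary_copy is a name I made up, not py-postgresql API):

```python
import struct

SIGNATURE = b'PGCOPY\n\xff\r\n\x00'  # 11-byte COPY BINARY signature

def binary_copy(rows):
    """Serialize rows (lists of already-encoded field bytes) in the
    COPY BINARY file layout."""
    out = [SIGNATURE, struct.pack('!ii', 0, 0)]  # flags, extension length
    for row in rows:
        out.append(struct.pack('!h', len(row)))  # per-tuple field count
        for field in row:
            # each field: int32 byte length, then the bytes themselves
            out.append(struct.pack('!i', len(field)) + field)
    out.append(struct.pack('!h', -1))  # trailer: field count of -1 (EOF)
    return b''.join(out)

data = binary_copy([[struct.pack('!i', 1)], [struct.pack('!i', 2)]])
```

Note that the -1 trailer packs to b'\xff\xff', the same two bytes visible at the end of the 41-byte chunk in the trace.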
Re: [HACKERS] [JDBC] Trouble with COPY IN
On Jul 25, 2010, at 8:01 AM, Kris Jurka wrote:
> The JDBC driver reads server messages for multiple reasons. One of them
> is indeed to do early failure detection.

That's high quality. =)

> Another is to pickup NoticeResponse messages to avoid a network buffer
> deadlock.

That's a good catch. I don't think psql/restore would often run into this, as when COPY IN is in play, it's normally restoring a database. However, with JDBC, I imagine COPY would more often be used to do bulk loading into live tables that may very well cause a NOTICE. [Well, I reference psql/libpq because I don't recall it recognizing failure during COPY IN in the past, so I assume it's not receiving any data in that state.]

hrm, I suppose a lazy way around that problem would be to suspend all client messages (client_min_messages) during COPY IN. Tho, I guess one would still have to contend with NotificationResponse and ParameterStatus..

> So this is possible to work around driver side by peeking into the
> network stream and delaying processing of the end of copy until the
> driver agrees that the copy is done,

but I don't think you would have to peek in. If the interface were to always hold onto the last message or last n-bytes submitted to be sent, it would be able to send the possible CopyData(EOF) and CopyDone once the COPY operation (at the interface level) is closed/shutdown/terminated. Granted, this is dependent on CopyData(EOF) not being in the middle of regular CopyData, but I gather that that would end in an ErrorResponse anyways.

> I still maintain that this is a server bug. It is not OK for the server
> to assume that the client is done and move on; the client must tell the
> server what it wants done.

I'm a bit torn here. While it would seem to be either a bug in the spec or a bug in the server, I'm inclined to call it a wart in the server's implementation of the spec. I don't see the fix as being dangerous, but I imagine an implementor would want to have the workaround in place regardless. I certainly would.

I'd be in favor of seeing this fixed in 9.x, and the documentation updated to warn implementors about the wart in the older versions.. That is, I don't see any reason why we can't get rid of this unsightly thing considering the workarounds would still work with a wart-free server.

cheers, jwp
Re: [HACKERS] [JDBC] Trouble with COPY IN
On Jul 28, 2010, at 9:53 AM, Kris Jurka wrote:
> Technically you won't get NotificationResponse until transaction end, so
> you don't need to worry about that mid copy.

Ah, thanks for noting that. It would appear my original reading of the async section didn't get far enough beyond "Frontends must be prepared to deal with these messages at any time, even when not engaged in a query.". I see the note below clarifying NotificationResponse.

> One of the key points of confusion is that CopyData(EOF) does not result
> in an error. It results in ignoring any further data. The problem I have
> is that for text mode it waits for CopyDone, but in binary mode it ends
> the copy sequence immediately.

That is bothersome. :\

> Additionally the interface exposed by the JDBC driver lets the user write
> arbitrary CopyData bytes to the server, so without parsing all of that we
> don't know whether they've issued CopyData(EOF) or not.

Okay, so you can't know with absolute certainty without parsing the data, but the usual case would be handled by holding onto the last N bytes or so. Enough to fit the EOF and perhaps a little more for paranoia's sake.

That's not to say that I'm missing the problem. When (not if, when) the user feeds data past a CopyData(EOF), it's going to get interesting.

[Thinking about the logic necessary to handle such a case and avoid network buffer deadlock...]

I would think the least invasive way to handle it would be to set the CommandComplete and ReadyForQuery messages aside when they are received if CopyDone hasn't been sent, continue the COPY operation as usual until it is shutdown, send CopyDone and, finally, reinstate CommandComplete and RFQ as if they were just received.. I don't think that really accommodates CopyFail, as the status in RFQ will need to be adjusted to match what was actually done..? Well, I'm not sure you would need to worry about NoticeResponse after a premature CommandComplete as INSERTs are no longer happening. ugh.

+1 for a fix.

Not directly regarding your patch, but while the discussion is in the general area: I think it would be wise to throw an error when non-empty CopyData messages are received after CopyData(EOF). Chances are that the user is making a mistake and should be notified of it.

cheers, jwp
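The "hold onto the last N bytes" idea above can be sketched as a small write-through buffer: user data is forwarded except for a held-back tail, and the tail (which may be the binary-copy EOF trailer, b'\xff\xff') is only released when the copy is explicitly closed. This is an illustration of the idea only, not JDBC's or py-postgresql's actual implementation:

```python
EOF_MARKER = b'\xff\xff'  # binary-copy trailer: field count of -1

class CopyHoldback:
    """Withhold the last len(EOF_MARKER) bytes of user CopyData so the
    server never sees a possible EOF trailer before the interface-level
    copy is closed and CopyDone can follow immediately."""
    def __init__(self, send):
        self.send = send      # callable that transmits bytes downstream
        self.held = b''
    def write(self, chunk: bytes):
        buf = self.held + chunk
        self.held = buf[-len(EOF_MARKER):]   # keep the tail back
        if len(buf) > len(EOF_MARKER):
            self.send(buf[:-len(EOF_MARKER)])
    def close(self):
        # copy is done at the interface level: flush the tail,
        # then the caller would send CopyDone.
        self.send(self.held)
        self.held = b''

sent = []
cp = CopyHoldback(sent.append)
cp.write(b'DATA')
cp.write(EOF_MARKER)
cp.close()
```

Before close(), the EOF trailer has not been transmitted, so a binary-mode server cannot prematurely end the copy sequence.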
Re: [HACKERS] [JDBC] Trouble with COPY IN
On Jul 23, 2010, at 7:11 AM, Tom Lane wrote:
> I can't help thinking that the JDBC driver must be being overly cute if
> this breaks it ...

I was wondering the same thing when I first saw Kris' message. However, iff I understand what JDBC is trying to achieve, I don't think I would call it overly.

@Kris: Is this a problem because JDBC is trying to detect failures as early as possible during a COPY IN? Or, is it just JDBC's normal MO to always be reading?

Well, I've wanted to do the former (early error detection) with py-postgresql's COPY support, and I imagine getting a read event marking completion prior to emitting done/fail could be a snag.

cheers, jwp
Re: [HACKERS] Python Interface Hacking
On Jul 7, 2010, at 12:07 AM, Peter Froehlich wrote:
> I joined this list under the mis-impression that it was about hacking the
> Python interfaces to pgsql. Is there possibly another list for that? Or
> is the Python stuff just mixed in with all the rest? Sorry for the
> meta-question...

For BSD/MIT licensed python3 fun, be sure to check out:

http://python.projects.postgresql.org/
http://python.projects.postgresql.org/backend/
http://pgfoundry.org/mailman/listinfo/python-general/

The project currently enjoys a status as an alternative, but with help from people like you that could change. =)
Re: [HACKERS] plpython3
On Feb 1, 2010, at 12:18 PM, James William Pye wrote:
> Right now, I'm trying to trim some of the easy issues[1] and to get a
> project web page up. I expect to be able to make a release soon, and I'll
> follow-up to this thread when I do.

Well, I ended up doing some other things at that point in time, but I have managed to get a release out:

http://python.projects.postgresql.org/backend/
Re: [HACKERS] libpq connectoin redirect
On Apr 20, 2010, at 10:03 PM, feng tian wrote:
> Another way to do this is to send the client a redirect message. When the
> client connects to 127.0.0.10, instead of accepting the connection, it
> can reply to the client telling it to reconnect to one of the servers on
> 127.0.0.11-14.

ISTM that this would be better handled at a higher level. That is, given a server (127.0.0.10) that holds 127.0.0.11-14: connect to that server and query for the correct target host.
Re: [HACKERS] Tightening binary receive functions
On Nov 10, 2009, at 9:54 AM, Bruce Momjian wrote:
> FYI, Heikki has fixed this bug and the fix will appear in Postgres 8.5.
>
> Heikki> Oops, you're right. The check is indeed confusing julian day
> Heikki> numbers, with epoch at 23th of Nov 4714 BC, with
> Heikki> postgres-reckoning day numbers, with epoch at 1th of Jan 2000.
> Heikki> Thanks, will fix.

Need a special case for the infinities as well?

postgres=# CREATE TABLE foo (d date);
CREATE TABLE
postgres=# INSERT INTO foo VALUES ('infinity');
INSERT 0 1
postgres=# COPY foo TO '/Users/jwp/foo.copy' WITH BINARY;
COPY 1
postgres=# COPY foo FROM '/Users/jwp/foo.copy' WITH BINARY;
ERROR:  date out of range
CONTEXT:  COPY foo, line 1, column d
postgres=# DELETE FROM foo;
DELETE 1
postgres=# INSERT INTO foo VALUES ('-infinity');
INSERT 0 1
postgres=# COPY foo TO '/Users/jwp/foo.copy' WITH BINARY;
COPY 1
postgres=# COPY foo FROM '/Users/jwp/foo.copy' WITH BINARY;
ERROR:  date out of range
CONTEXT:  COPY foo, line 1, column d
postgres=# SELECT version();
PostgreSQL 8.5devel on i386-apple-darwin10.2.0, compiled by GCC i686-apple-darwin10-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5646) (dot 1), 64-bit
(1 row)
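If I read the backend source right, the binary wire value for a date is a signed int32 of days since 2000-01-01, with INT32_MAX and INT32_MIN reserved as the infinity sentinels (DATEVAL_NOEND / DATEVAL_NOBEGIN). The receive-side fix would then need to pass those sentinels through before range-checking, roughly like this sketch:

```python
import struct

# Sentinels as defined (to my understanding) in the backend's date code:
DATEVAL_NOEND = 0x7FFFFFFF     # 'infinity'
DATEVAL_NOBEGIN = -0x80000000  # '-infinity'

def recv_date(wire: bytes):
    """Decode a binary-format date field; special-case the infinities
    before any range check, which is what date_recv was missing."""
    (days,) = struct.unpack('!i', wire)
    if days == DATEVAL_NOEND:
        return 'infinity'
    if days == DATEVAL_NOBEGIN:
        return '-infinity'
    return days  # ordinary day count relative to 2000-01-01
```

The sentinel values are an assumption from my reading of the source; the shape of the fix (check sentinels, then range) is the point.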
Re: [HACKERS] Confusion over Python drivers
On Feb 6, 2010, at 5:51 PM, Josh Berkus wrote:
>> Finally, I just don't see the existing (often PG specific) goals that I
>> have in mind for it appealing to the majority of [web
>> framework/abstraction] users.
>
> What are those goals?

I think the most interesting one that has yet to be implemented is fast, multiple destination COPY support. Currently, COPY is supported, but a bytes() object is allocated for each row, so it's currently not efficient for moving data (pg-to-pg ETL sans the T? =). While some C is still needed to make it properly efficient, it's primarily to keep track of the COPY's state and to update stats.

This is pretty useless to a django user... Well, I suppose it might be interesting if COPY OUT could target [or transform into] JSON, but idk...

The general, ongoing goal is to implement and document *convenient* Python interfaces to PostgreSQL features. A, perhaps uninteresting, case being supporting advisory locks. I was thinking a context manager, but it might just be something as trivial as an additional method on the connection (very simple/direct binding).

Some fuzzy goals: twisted support, some asynchronous interfaces, and greater user control over type I/O. The first, twisted, mostly interests me as an exercise. The second, async interfaces, scares me as it took me some time just to feel not unhappy with the blocking APIs. The third will probably happen, but it's going to be a while.

I also have some goals not directly related to a driver. postgresql.unittest is currently only used internally, but I hope to document it some day soon so that people can write Python unittest.TestCase's that auto-magically build out a target cluster (~pg_regress/pgTAP for Python?). Well, it works, but it's not documented and the APIs haven't been given much thought. Generally, basic cluster management tools for Python. (At one point I tried to write a programmer's HBA editor, but I think I hurt myself trying to figure out rule reduction.. That is, it was trying to be smarter than insert/delete rule at position x.)

Well, these are the ones that come to mind, atm, but I don't think there's much beyond them.
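The advisory-lock context manager mentioned above could look something like the following. The shape is purely illustrative (the `execute` callable and the function name are mine, not an existing driver API); only the pg_advisory_lock/pg_advisory_unlock SQL functions are real:

```python
from contextlib import contextmanager

@contextmanager
def advisory_lock(execute, key: int):
    """Hold a session-level advisory lock for the duration of a block.
    `execute` is any callable that runs one SQL statement."""
    execute("SELECT pg_advisory_lock(%d)" % key)
    try:
        yield key
    finally:
        # released even if the block raises
        execute("SELECT pg_advisory_unlock(%d)" % key)

# demo with a recording stand-in for a real statement executor
log = []
with advisory_lock(log.append, 42):
    log.append("critical section")
```

Integer keys are shown for simplicity; a real binding would also want parameter binding rather than string interpolation.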
Re: [HACKERS] Confusion over Python drivers
On Feb 5, 2010, at 1:34 PM, Marko Kreen wrote:
> py-postgresql seems to be more serious, but as it's python3 only which
> makes it irrelevant today.

Furthermore, if it did work on python2, it's *not* something that's going to appeal to mainstream users (Python heavy web frameworks) as it *partially* suffers from the same problem that pg8000 does. It's mostly pure-Python, but it has some C optimizations (notably, the PQ message buffer). I have done some profiling, and *with a few tweaks* it's about 2x-3x *slower than psycopg2* for the retrieval of a single int column. I think it could go faster, but I don't think it's worth the work.

ISTM that the target audience are folk who are married to PG, and are generally unhappy with DB-API, but do not want to buy into a big abstraction layer like SQLAlchemy. Sure, it supports DB-API like other drivers so it *would be* usable with frameworks, but why take the 3x *or greater* hit over a properly implemented libpq version? Finally, I just don't see the existing (often PG specific) goals that I have in mind for it appealing to the majority of [web framework/abstraction] users.

> Psycopg was the leader, especially in web-environments, but it has
> non-obvious license and with dead website it does not seem that
> attractive. Although it is well-maintained still. Best path forward
> would be to talk with Psycopg guys about license clarification/change.

Yep.
Re: [HACKERS] Confusion over Python drivers
On Feb 5, 2010, at 11:34 AM, Josh Berkus wrote:
> For people who use Python a lot, could I have a list of the deficiencies
> in DBAPI? I've got my horse and lance ready. Given that SQLAlchemy isn't
> for everyone, of course ... it couldn't be, or Django would use it, no?

Here are some to start with:

 - paramstyle
 - No direct support of prepared statements [they *tried* to compensate
   for this with cached statements, but it's inadequate]
 - Too many *optional* extensions (Cursor method .__iter__() being rather
   notable)

http://www.python.org/dev/peps/pep-0249/
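On the paramstyle point: PEP 249 allows each driver to pick its own placeholder syntax, so the same logical query has to be spelled differently per driver. A small shim like the following is roughly what portable code ends up carrying (the helper names are mine; the four styles are from the spec):

```python
# One placeholder renderer per DB-API paramstyle.
PLACEHOLDERS = {
    'qmark':    lambda i: '?',
    'numeric':  lambda i: ':%d' % i,
    'format':   lambda i: '%s',
    'pyformat': lambda i: '%%(p%d)s' % i,
}

def render(sql_parts, paramstyle):
    """Join SQL fragments with the placeholder syntax of `paramstyle`.
    sql_parts has one more element than there are parameters."""
    ph = PLACEHOLDERS[paramstyle]
    out = sql_parts[0]
    for i, part in enumerate(sql_parts[1:], start=1):
        out += ph(i) + part
    return out

q = ["SELECT * FROM t WHERE a = ", " AND b = ", ""]
```

Running the same fragments through 'qmark' and 'format' yields two incompatible query strings, which is exactly the portability complaint.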
Re: [HACKERS] Confusion over Python drivers
On Feb 5, 2010, at 8:00 AM, Peter Eisentraut wrote:
> I think another difference is that the Perl DBI interface is very rich,
> whereas the Python DB-API is quite minimal and almost forces people to
> write (incompatible) extensions.

Yep.

> The DB-SIG at Python that ought to drive all this is also quite dead,

From reading messages that come into that list (not-so-lately), I've gotten the impression that they are content with DB-API 2. Aside from the TPC interfaces, the last round of activity that I know of was dbapi3[1] (which was a long while ago).

> possibly because everyone has moved on to SQLAlchemy.

Yeah. AFAICT, that is the general direction encouraged by DB-SIG: write an abstraction on top of DB-API. SQLAlchemy, SQLObject, anysql, and I'm sure there are others.

[1] http://wiki.python.org/moin/Aug2001DbApi3Strawman
    http://wiki.python.org/moin/DbApi3
Re: [HACKERS] plpython3
On Feb 1, 2010, at 11:29 AM, Joshua D. Drake wrote:
> On Mon, 2010-02-01 at 13:20 -0500, Robert Haas wrote:
>> On the basis of all of the foregoing, I don't think we can consider
>> this patch further for this CommitFest and will update
>> commitfest.postgresql.org accordingly. If the user community grows or
>> if one of the committers takes an interest in this down the road, I
>> think we could consider it for a future release.
>
> I spoke with James offline about this as well. My understanding (correct
> me James) is that he is working on an implementation that can be
> installed via PGXS.

yep, mostly done: http://github.com/jwp/pg-python

The tests that can pass are passing on 8.3 and 8.4 now, save optimized Cursor returns in 8.3 (the materialize preferred flag). Also made some other improvements like getting rid of the ugly `__func__.stateful = True` in favor of a decorator, @Stateful. (Thanks to Harald for a push in that direction.)

Right now, I'm trying to trim some of the easy issues[1] and to get a project web page up. I expect to be able to make a release soon, and I'll follow-up to this thread when I do. However, the lack of release files shouldn't stop anyone from trying it out. =)

Snapshot: http://github.com/jwp/pg-python/zipball/master

[1] http://github.com/jwp/pg-python/issues
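I haven't dug into pg-python's actual @Stateful implementation, so the following is only a guess at the shape of the idea: wrap a generator function so that the first call instantiates it and each subsequent call resumes it, replacing the old `__func__.stateful = True` flag with a plain decorator:

```python
def Stateful(genfunc):
    """Hypothetical sketch: turn a generator function into a callable
    that keeps its state between invocations."""
    class wrapper:
        def __init__(self):
            self.gen = None
        def __call__(self, *args):
            if self.gen is None:
                # first call: start the generator and run to first yield
                self.gen = genfunc(*args)
                return next(self.gen)
            # later calls: resume, feeding in the new arguments
            return self.gen.send(args)
    return wrapper()

@Stateful
def counter(start):
    n = start
    while True:
        args = yield n  # new call's arguments arrive here (unused)
        n += 1
```

Each invocation of counter(5) then advances the same generator rather than starting a fresh one.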
Re: [HACKERS] plpython3
On Feb 1, 2010, at 2:13 PM, Bruce Momjian wrote:
> I would love to know why PL/Python can't be incrementally improved like
> the rest of our code.

AFAICT, there are two primary, perhaps identifying, parts to a PL extension: code management (compilation, execution, etc) and type I/O (conversion in most PLs). (Well, aside from the language itself. =)

My proposed extension chooses a different design for both of those parts. It didn't make sense to try and incrementally change PL/Python because I would have been rewriting the whole thing anyways. Not to mention breaking user code in the process for the mentioned parts--thus the Python 3 target.

Hope this clears things up.
Re: [HACKERS] plpython3
On Feb 1, 2010, at 1:23 PM, Nathan Boley wrote:
>> I think it would be great for you to review it... I doubt that will
>> cause it to get committed for 9.0, but my doubt is no reason for you to
>> hold off reviewing it.
>
> I assumed so, but the pretense of a chance will probably help to motivate
> me :-) I'll have something by Thursday, and then 'Returned with Feedback'
> will at least be factual.

I haven't updated the plpython3 branch in a while, so you may want to hit the github repo with the PGXS build: http://github.com/jwp/pg-python

...Should probably get the updated docs published too, but they are available in src/documentation as ReST files. If you have sphinx installed, running `make html` in the root project directory should build them into src/sphinx/html.
Re: [HACKERS] development setup and libdir
On Jan 30, 2010, at 3:36 PM, Ivan Sergio Borgonovo wrote:
>> For development purposes you would be far better off building a private
>> version of postgres (with configure --prefix=/path) and using its pgxs
>> to build, install and test your module.
>
> That's pretty expensive.

eh:

j...@torch[]:org/postgresql/git 0% ls /src/build
pg       pg-nodb-plpy  pg81  pg83  pg85  pg_foreach  py         py31-nodb
pg-nodb  pg80          pg82  pg84  pg90  pgweb       py31

The tricky part is that they tend to multiply. ;)

> Requiring I compile a private version of postgres[1] increase the cost
> of development unreasonably,

That install of PG that you're using will *probably not* have debugging information. Now, granted, you might not need PG with debugging for some problems, but chances are that you'll come across one (or two or fifty or so) where you *will* need it.

I'd suggest using a private install for dev as well. Enable debugging and casserts, and make sure your ulimits allow core dumps. I'm not sure if pgxs can do what you want or not, but using a private prefix works well enough for me.
Re: [HACKERS] C function accepting/returning cstring vs. text
On Jan 27, 2010, at 1:00 PM, Joe Conway wrote:
> Implementing true value_per_call is still something on my TODO list, but
> obviously has not risen to a very high priority for me as it has now
> been an embarrassingly long time since it was put there. But that said,
> materialize mode has proven extremely good at covering the most common
> use cases with acceptable performance.

Hrm. I think this has been noted before, but one of the problems with VPC is that there can be a fairly significant amount of overhead involved with context setup and teardown--especially with PLs. If you're streaming millions of rows, it's no longer a small matter.

I would think some extension to Tuplestore would be preferable, where chunks of rows are placed into the Tuplestore on demand in order to minimize context setup/teardown overhead. That is, if the Tuplestore is empty and the user needs more rows, invoke the procedure again with the expectation that it will dump another chunk of rows into the container.

Not a formal specification by any means, but I'm curious if anyone has considered that direction. Or, along the same lines, how about a valueS-per-call mode? =)
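The values-per-call idea above amortizes per-invocation setup/teardown by handing back a batch of rows on each call, rather than one row (value-per-call) or everything at once (materialize). A sketch of that middle ground in plain Python generators (the names and chunking policy are mine):

```python
def chunked(producer, chunk_size):
    """Yield rows from `producer` in lists of up to chunk_size.
    Each yielded list models one "invocation" filling the Tuplestore;
    the tail batch may be short."""
    batch = []
    for row in producer:
        batch.append(row)
        if len(batch) == chunk_size:
            yield batch
            batch = []
    if batch:
        yield batch

# 10 rows with chunk_size=4 costs 3 "invocations" instead of 10.
calls = list(chunked(iter(range(10)), 4))
```

With per-call overhead paid once per batch instead of once per row, streaming millions of rows gets the context setup/teardown cost divided by the chunk size.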
Re: [HACKERS] plpython3
On Jan 14, 2010, at 7:08 PM, Greg Smith wrote: So more targeted examples like you're considering now would help. Here's the trigger example which should help reveal some of the advantages of native typing. This is a generic trigger that constructs and logs manipulation statements for simple replication purposes. The original plpython version is located here: http://ar.pycon.org/common/2009/talkdata/PyCon2009/020/plpython.txt [You'll need to scroll down to the very bottom of that page.] There are three points in this example that need to be highlighted: 1. There is no need for a mogrify function (see original in the above link). 2. Attributes/columns of the records (new/old) are extracted when referenced. 3. The comparisons in after_update uses the data type's actual inequality operator. The first point is true because native typing gives the user direct access to a given type's typoutput via ``str(ob)``. This makes constructing the PG string representation of a given object *much* easier--quote_nullable, and done. The original plpython example will need to be updated to compensate for any changes in conversion: arrays will now need special handling and MD arrays will not work at all. It also relies heavily on the Python object representation matching PG's; where that fails, special cases need to be implemented(composites, notably). All of that compensation performed in the original version is unnecessary in the plpython3 version. The second point touches on the efficiency that was referenced in an earlier message. No cycles are spent converting the contents of a container object unless the user chooses to. Naturally, there is no advantage performance-wise if you are always converting everything. I'd wager that with triggers, it's rare that everything needs to be converted. The third point reveals that Postgres.Object instances--a component of native typing--use the data type's operator for inequality. 
It's not limited to comparisons: all available Python operators are mapped to corresponding operators in PG. For many or all primitives, there is no added value over conversion. However, this provides a lot of convenience when working with UDTs, datetime types, and geometric types. ...ISTM that the primary advantage of native typing is that we get to define the Python interface to a given Postgres data type.

Three files are attached:

afterlog.py - the trigger returning function
afterlog.sql - the sql exercising the TRF (creates the replica_log table as well)
afterlog.out - the contents of the replica_log table after executing afterlog.sql

To replay:

\i afterlog.py
\i afterlog.sql
SELECT * FROM replica_log;

CREATE OR REPLACE FUNCTION log_manipulation() RETURNS TRIGGER
LANGUAGE plpython3u AS
$python$
from Postgres import quote_nullable, quote_ident, notify

# parameter type extracted from the statement
record_manipulation = prepare("INSERT INTO replica_log (sql) VALUES ($1)")

def log(sql):
    record_manipulation(sql)
    # notify listeners that new data is available
    notify('replicas')

fmt_insert = "INSERT INTO {relname} ({columns}) VALUES ({values});".format
fmt_update = "UPDATE {relname} SET {changes} WHERE {keys};".format
fmt_delete = "DELETE FROM {relname} WHERE {keys};".format

def fmt_eqs(joinwith, keys, record, fmt = "{0} = {1}".format):
    pairs = [fmt(quote_ident(k), quote_nullable(record[k])) for k in keys]
    return joinwith.join(pairs)

##
# entry points

def after_insert(td, new):
    relname = td.args[0]
    sql = fmt_insert(
        relname = relname,
        columns = ', '.join(map(quote_ident, new.keys())),
        values = ', '.join(map(quote_nullable, new.values())),
    )
    log(sql)

def after_update(td, old, new):
    relname, *pkeys = td.args
    modified_columns = {k : v for k,v in new.items() if old[k] != v}
    # only log if there were modified columns
    if modified_columns:
        sql = fmt_update(
            relname = relname,
            changes = fmt_eqs(', ', modified_columns.keys(), modified_columns),
            keys = fmt_eqs(' AND ', pkeys, old)
        )
        log(sql)

def after_delete(td, old):
    relname, *pkeys = td.args
    sql = fmt_delete(
        relname = relname,
        keys = fmt_eqs(' AND ', pkeys, old)
    )
    log(sql)
$python$;

afterlog.sql Description: Binary data
afterlog.out Description: Binary data

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
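The statement-construction part of the trigger above can be exercised outside the backend. In this sketch, quote_ident and quote_nullable are naive stand-ins for the real functions from the Postgres module, so it only illustrates how after_update assembles its UPDATE statement from the modified columns:

```python
def quote_ident(s):
    # naive stand-in for Postgres.quote_ident
    return '"' + s.replace('"', '""') + '"'

def quote_nullable(v):
    # naive stand-in for Postgres.quote_nullable
    if v is None:
        return 'NULL'
    return "'" + str(v).replace("'", "''") + "'"

fmt_update = "UPDATE {relname} SET {changes} WHERE {keys};".format

def fmt_eqs(joinwith, keys, record, fmt="{0} = {1}".format):
    # render "col = value" pairs joined by ', ' or ' AND '
    pairs = [fmt(quote_ident(k), quote_nullable(record[k])) for k in keys]
    return joinwith.join(pairs)

def update_sql(relname, pkeys, old, new):
    # the after_update logic: only emit SQL when columns changed
    modified = {k: v for k, v in new.items() if old[k] != v}
    if not modified:
        return None
    return fmt_update(
        relname=relname,
        changes=fmt_eqs(', ', modified.keys(), modified),
        keys=fmt_eqs(' AND ', pkeys, old),
    )
```

Feeding it an old/new pair where only one column changed yields a single-column SET clause keyed on the primary key, and an unchanged row yields nothing at all.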
Re: [HACKERS] [NOVICE] Python verison for build in config.pl (Win32)
On Jan 20, 2010, at 12:27 PM, Magnus Hagander wrote: Well, it needs the version to match it to the DLL name. For python 2.6, it needs python26.dll. But yes, there should probably be some way to ask python itself about that - that would be the non-naive method. But as long as python is installed per default, we got it for free, which is why it has worked so far.

[on tom's question] IIRC, the reason you can't query Python in the same way that configure/python.m4 does is because the generated Makefile that supports distutils.sysconfig does not exist in standard win32 builds. That is, AFAIK, there is no way to request the exact path of the dll/lib file in win32. However, I'm not particularly familiar with Python on win32, so that may not be the case.

Given the absence of a more precise method, I'd recommend considering something along the lines of:

1. Allow the user to specify (config.pl?) the Python executable to build against, defaulting to the python.exe in %PATH%. (This may already be the case, idk.)

2. Query Python for the version information and installation prefix:

python -c 'import sys; print(str(sys.version_info[0]) + str(sys.version_info[1]))'
python -c 'import sys; print(sys.prefix)'

3. Assume that the prefix has a normal layout, and construct the lib path from the extracted version and prefix.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
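Those three steps can be sketched in Python itself. The %PREFIX%\libs\pythonXY.lib layout is an assumption on my part, based on standard win32 installs; the function just shows how the DLL and import-library names fall out of sys.version_info and sys.prefix:

```python
import ntpath

def python_lib_path(version_info, prefix):
    """Construct the expected DLL / import-lib names from a version
    tuple and install prefix, assuming a standard win32 layout
    (python26.dll, %PREFIX%\\libs\\python26.lib)."""
    ver = "%d%d" % (version_info[0], version_info[1])
    dll = "python%s.dll" % ver
    lib = ntpath.join(prefix, "libs", "python%s.lib" % ver)
    return dll, lib
```

In the build script the tuple and prefix would come from the two `python -c` queries above rather than being passed in directly.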
Re: [HACKERS] plpython3 perf
On Jan 14, 2010, at 2:03 PM, Joshua D. Drake wrote: What I would (as a non hacker) would look for is: (1) Generalized benchmarks between plpython(core) and plpython3u I know a lot of these are subjective, but it is still good to see if there are any curves or points that bring the performance of either to light.

k, it was pretty much as expected. However, the surprise for me was that @pytypes didn't perform as terribly as I expected it to. I imagine its impact may become more noticeable with more parameters, but, nonetheless, I was surprised. I didn't do any SRF tests, but the installations are still set up, so if anyone really wants to see that, it shouldn't take long to do. Apologies ahead of time for the lack of pretty graphs. =)

I used two different builds/installations of PG to test, as the PL names conflict. Both were compiled with the following CFLAGS (pg_config output):

-O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv

Both PLs were built against the same build of Python (recent svn update of release31-maint):

Python 3.1.1+ (release31-maint:77585M, Jan 17 2010, 10:29:13) [GCC 4.2.1 (Apple Inc. build 5646) (dot 1)] on darwin

I ran each of the test files a few times against the target installation, but I only attached one of each to this message. (Primarily, multiple runs to filter out any spurious spikes.) The source SQL and output files are attached. rawtest.sql.out is the output for raw data objects (native typing). pytypestest.sql.out is the output of the @pytypes test (native typing with conversion overhead). plpythontest.sql.out is the output for core's plpython (conversion). A few samples from the output files are included inline below. Each volatile function is called 100,000 times from a COUNT() aggregate, and the duration is measured using psql's \timing. Most of the functions simply return the first parameter given to them.
The functions are run inside a transaction because plpython3 does some cache clears (linecache) and GC at the end of transactions. The parameter type, if any, is indicated by the label:

noparams:
  raw: 125ms
  pytypes: 372ms (base overhead, it would appear)
  plpython: 309ms

oneint2:
  raw: 140ms
  pytypes: 684ms
  plpython: 750ms

oneint8:
  raw: 145ms
  pytypes: 676ms
  plpython: 718ms

text_large:
  raw: 271ms
  pytypes: 2766ms
  plpython: 2310ms

composite:
  raw: 235ms
  pytypes: 795ms (N/A, no conversion done, but takes a bit of a hit anyways)
  plpython: 1654ms

plpythontest.sql.out Description: Binary data
pytypestest.sql.out Description: Binary data
rawtest.sql.out Description: Binary data
pytypestest.sql Description: Binary data
rawtest.sql Description: Binary data
plpythontest.sql Description: Binary data

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
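Since each figure aggregates 100,000 calls, dividing through gives the per-invocation cost, which makes the conversion overhead easier to compare across variants (numbers taken from the samples above):

```python
def per_call_us(total_ms, calls=100000):
    # convert an aggregate \timing figure into microseconds per call
    return total_ms * 1000.0 / calls

# oneint2 samples from above: raw 140ms, pytypes 684ms, plpython 750ms
raw, pytypes, plpython = (per_call_us(t) for t in (140, 684, 750))
```

So for oneint2, raw native typing costs about 1.4µs per call versus roughly 7µs once conversion is involved, on either side.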
Re: [HACKERS] plpython3
On Jan 14, 2010, at 7:08 PM, Greg Smith wrote: So more targeted examples like you're considering now would help.

So far, I have three specific examples in mind:

The first will illustrate the advantages of function modules wrt setup code in the module body. Primarily this is about convenience. (I'm going to send this example when I send this message.)

The second is a generic after trigger that does manipulation logging for some simple replication purposes. This example will illustrate one application of native typing, as it provides direct access to a PG type's typoutput.

The third one is a fairly old plpythonu example written by Elein that exercises SD to keep state for an aggregate. I'm expecting this to be a good candidate for showing off stateful functions.

Other things I plan to cover, but nothing specific in mind yet:

- Direct function calls
- Internal subtransactions, with xact(): (something plpython can't do, save calling plpgsql =)

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython3
On Jan 14, 2010, at 7:08 PM, Greg Smith wrote: So more targeted examples like you're considering now would help.

Here's the first example. This covers an advantage of function modules. It is a conversion of a plpythonu function published to the wiki: http://wiki.postgresql.org/wiki/Google_Translate

In the above link, the code is executed in the body of a Python function. Please see plpython's documentation if you don't understand what I mean by that. The effect of this is that every time the FUNCTION is called from PG, the import statements are run, a new class object, UrlOpener, is created, and a new function object, translate, is created. Granted, a minor amount of overhead in this case, but the point is that in order to avoid it the author would have to use SD:

if 'urlopener' in SD:
    UrlOpener = SD['urlopener']
else:
    class UrlOpener(urllib.UrlOpener):
        ...
    SD['urlopener'] = UrlOpener

While some may consider this a minor inconvenience, the problem is that *setup code is common*, so it's, at least, a rather frequent, minor inconvenience. With function modules, users have a module body in which to run any necessary setup code.

Now, WRT the actual example code, I'm not suggesting that either example is ideal. Only that it should *help* identify one particular advantage of function modules.
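The SD pattern being criticized can be demonstrated with a plain dict standing in for plpython's SD: every "call" has to re-check the cache before any setup work, and the boilerplate exists purely so the setup code runs once rather than on each invocation.

```python
SD = {}  # stand-in for plpython's per-function static dictionary

def call(setup_count=[0]):
    # the boilerplate every plpython function body repeats to avoid
    # re-running setup code on each invocation
    if 'UrlOpener' in SD:
        UrlOpener = SD['UrlOpener']
    else:
        setup_count[0] += 1  # the actual setup code runs here
        UrlOpener = type('UrlOpener', (), {'version': 'demo/1.0'})
        SD['UrlOpener'] = UrlOpener
    return UrlOpener, setup_count[0]

first = call()
second = call()
```

Both calls hand back the same cached class and the setup branch runs only once, which is exactly what a function module's body gives you for free.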
CREATE OR REPLACE FUNCTION public.gtranslate(src text, target text, phrase text)
RETURNS text LANGUAGE plpython3u AS
$function$
from urllib.request import URLopener
from urllib.parse import quote_plus
import json

base_uri = "http://ajax.googleapis.com/ajax/services/language/translate?"

class UrlOpener(URLopener):
    version = "py-gtranslate/1.0"
urlopen = UrlOpener().open

equal_fmt = '{0}={1}'.format

@pytypes
def main(src, to, phrase):
    args = (
        ('v', '1.0'),
        ('langpair', quote_plus(src + '|' + to)),
        ('q', quote_plus(phrase)),
    )
    argstring = '&'.join([equal_fmt(k,v) for (k,v) in args])
    resp = urlopen(base_uri + argstring).read()
    resp = json.loads(resp.decode('utf-8'))
    try:
        return resp['responseData']['translatedText']
    except:
        # should probably warn about failed translation
        return phrase
$function$;

pl_regression=# SELECT gtranslate('en', 'es', 'i like coffee');
    gtranslate
------------------
 Me gusta el café
(1 row)

pl_regression=# SELECT gtranslate('en', 'de', 'i like coffee');
   gtranslate
----------------
 Ich mag Kaffee
(1 row)

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython3
On Jan 14, 2010, at 2:03 PM, Joshua D. Drake wrote: What I would (as a non hacker) would look for is: (1) Generalized benchmarks between plpython(core) and plpython3u I know a lot of these are subjective, but it is still good to see if there are any curves or points that bring the performance of either to light.

I guess I could do some simple function I/O tests to identify invocation overhead (take a single parameter and return it). This should give a somewhat reasonable view of the trade-offs of native typing vs conversion performance-wise. One thing to keep in mind is that *three* tests would need to be done per parameter set:

1. plpython's
2. plpython3's (raw data objects/native typing)
3. plpython3's + @pytypes

The third should show degraded performance in comparison to plpythonu's, whereas the second should show improvement or near equivalence. @pytypes is actually implemented in pure Python, so the impact should be quite visible. http://python.projects.postgresql.org/pldocs/plpython3-postgres-pytypes.html

I'm not sure there's anything else worth measuring. SRFs, maybe?

(2) Example of the traceback facility, I know it is silly but I don't have time to actually download head, apply the patch and test this.

Well, if you ever do find some time, the *easiest* way would probably be to download a branch snapshot from git.pg.org: http://git.postgresql.org/gitweb?p=plpython3.git;a=snapshot;h=refs/heads/plpython3;sf=tgz It requires Python 3.1. (3.0 has been abandoned by python.org.)

This type of thing, showing debugging facilities within the function would be killer.

The test output has a *lot* of tracebacks, so I'll just copy and paste one here. This one shows the traceback output of a chained exception.
-- suffocates a pg error, and attempts to enter a protected area
CREATE OR REPLACE FUNCTION pg_failure_suf_IFTE() RETURNS VOID
LANGUAGE plpython3u AS
$python$
import Postgres

rp = Postgres.Type(Postgres.CONST['REGPROCEDUREOID'])

def main():
    try:
        fun = rp('nosuchfunc(int17,zzz)')
    except:
        # Should be valid, but the protection of
        # PL_DB_IN_ERROR should keep it from getting called.
        rp('pg_x_failure_suf()')
$python$;

SELECT pg_failure_suf_IFTE();
ERROR:  database action attempted while in failed transaction
CONTEXT: [exception from Python]
Traceback (most recent call last):
   File "public.pg_failure_suf_ifte()", line 8, in main
    fun = rp('nosuchfunc(int17,zzz)')
 Postgres.Exception: type "int17" does not exist
CODE: 42704

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
   File "public.pg_failure_suf_ifte()", line 12, in main
    rp('pg_x_failure_suf()')
 Postgres.Exception
[public.pg_failure_suf_ifte()]

(3) A distinct real world comparison where the core plpython falls down (if it does) against the plpython3u implementation

Hrm. Are you looking for something that plpython3 can do that plpython can't? Or are you looking for something where plpython makes the user work a lot harder?

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
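The "During handling of the above exception, another exception occurred" line in that output is standard Python 3 exception chaining: a failure raised inside an except block carries the original exception on __context__, and the traceback machinery renders both. A minimal illustration outside the backend:

```python
def suffocate_and_fail():
    try:
        # stands in for the failed rp('nosuchfunc(int17,zzz)') lookup
        raise LookupError("type int17 does not exist")
    except Exception:
        # a second failure while handling the first; Python 3 links
        # them implicitly via __context__
        raise RuntimeError("database action attempted while in failed transaction")

try:
    suffocate_and_fail()
except RuntimeError as e:
    chained = e.__context__  # the suffocated original exception
```

This implicit chaining is what lets plpython3 report both the suppressed Postgres error and the later protection error in a single traceback.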
Re: [HACKERS] plpython3
On Jan 14, 2010, at 12:17 AM, Greg Smith wrote: Code samples.

Okay.

I don't know, because even with several thousand lines of basic Python code to my credit I cannot understand a single one of the arguments you presented for why your implementation is better--except agreeing that, yes, tracebacks are useful. And even on that one, I'm not going to take your word on the superiority of your implementation.

Sure, that's what review is about. No?

You're writing way over people's heads here.

Okay. I guess I hoped the documentation would help clarify a lot of this, and make the advantages self-evident. On that:

(Doesn't help that your docs link at the bottom of http://wiki.postgresql.org/wiki/WIP:plpython3 is broken either).

Ouch. Thanks, that's fixed now. Please take a look again: http://python.projects.postgresql.org/pldocs/plpython3.html

If one has to be a Python expert to understand your position, you've already lost.

Function modules should be pretty obvious. Native typing is a bit more difficult, as a solid understanding of PG's type system is fairly important for a firm grasp. Python code is easy to read, though.

If you'd said "here's a great example of how Function Modules are an improvement over what you can do with the current pl/python", that would be infinitely more useful than the list of language trivia related to them. You should be aiming to put Peter on the spot to respond to claims you make like "you can't do this easily with the current implementation" after showing an elegant bit of code.

Okay. So, some examples would help. The documentation is back up, so please be sure to look at the numerous examples provided therein. In addition to that, I'll try to get some contrasting examples posted as a follow-up to an earlier message: "In plpython you do X, whereas in plpython3 you do Y."

Thanks.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython3
On Jan 13, 2010, at 11:08 AM, Joshua D. Drake wrote: My argument would be now, what is the benefit of the James Pye version over our version. James can you illustrate succinctly why we should be supporting a new version?

Doing so, succinctly, is unfortunately difficult. It is primarily a matter of comparing features, AFAICT. And, furthermore, some features may not be useful to some users.

It exposes additional functionality that should *not* be incrementally developed in plpython, as it would break applications. This was the point of trying to move forward with it for Python 3.

Function Modules:
- Does away with the need for GD/SD (more natural Python environment).
- Allows tracebacks (tracebacks are useful, right?) to be implemented easily.
- Does *not* expose a bastardized variant of the language by pretending that modules/script files can return and yield.
- Helps to promote the Python tenet of being explicit.

Native Typing:
- Provides PG type introspection not available in any other PL, AFAIK.
- Improves efficiency in some cases (conversion must be _explicitly_ called for).
- MD Array support.
- Composites are a sequence and a mapping.

Other features: http://wiki.postgresql.org/wiki/WIP:plpython3

Aside from function modules and native typing, many of plpython3's features could be implemented incrementally. However, I had a chance to sprint, and they are available now in a new implementation. I did so, rather than improving plpython, because I believe that native typing and function modules are very useful.

I'm not sure this fulfills your request, but, hopefully, it's a start.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython3
On Jan 13, 2010, at 2:27 PM, Peter Eisentraut wrote: The problem I'm having with this discussion is that every time someone asks what the supposed advantages of this new Python PL are, a feature list like the above is dumped,

I agree that this is unfortunate, but how else can we discuss the advantages? It boils down to comparing a couple of feature lists, and *maybe* some implementation details. No?

75% of which is subjective and tends to use semi-buzzwords,

You say semi-buzzwords, I say names. I have to call it something.

such that then someone else who by his own admission isn't completely up to date on things says, sure, that sounds great.

Which is why we need to get some more experienced Python users involved in this. Well, even the mileage of inexperienced users is quite useful for detecting what level of obviousness has been achieved by the features, so I'm not trying to exclude anyone.

The current PL/Python also has, arguably, a more natural Python environment,

No, it doesn't. GD/SD are contrived in order to compensate for the very absence of that. Additionally, modules / script files don't return or yield.

native typing,

Okay, that's arguably subjective, but when I write native typing, it's wrt PG, not Python's built-in types. If that weren't the case, I wouldn't call it anything--save [data] conversion when necessary--as it wouldn't be much of a feature to clamor about.

efficiency,

Yes, as discussed in the thread before, there are trade-offs here wrt how PG data is handled. It depends pretty heavily on how the parameters / query results are used. Although, as stated before, the difference in efficiency can be rather significant in situations where conversion to built-in Python types is *not* desired.

and explicitness.

Hrm? What is explicit about munging the source code of the procedure to make it into a function body? Perhaps you could give some examples where plpython helps promote explicitness? And sure, I understand that other PLs do this.
That may be fine for those languages, but for Python it's not, IMO. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython3
On Jan 13, 2010, at 12:15 PM, Robert Haas wrote: 1. It's not just a rewrite, it's an incompatible rewrite that will present significant user-visible behavioral differences. So replacing the current implementation wholesale would produce massive breakage for anyone actually using PL/python in production.

Right. That was the point of trying to leverage Python 3 to make the distinction. Most people will need to update their functions if they are moving to Python 3. And for the larger chunks of code, the hard stuff, the amount of change required is likely significant already.

snip some Given these two facts, it's hard for me to see how we could decide to REMOVE the current implementation and replace it with the new one. So the most we could do is maintain them side by side, and then you have to ask, why?

My original hope was that plpython would be maintained for several years to come, and when Python 3 started picking up steam, we would deprecate plpython. If people still wanted to use it, they could continue using the older version of PG and/or someone could continue to maintain plpython out of core for legacy support.

snip/maintaining out of core Sure, his code won't get as much exposure that way, =)

Try next to none. The existence of core's implementation makes competing *very* difficult, IMO. Thinking along the lines of: "Why would I use/contribute to your implementation when core has one?" And all I can say in response is, "Check out my features." Subsequently, they will probably weigh the added risk of choosing the loner's implementation and come to the conclusion that using core's would be safer in the long term. I can't fault that line of reasoning, so ISTM that it would be difficult to sell.

but it's been posted to the mailing list several times now over a period of 8 months and nobody has said "oh, wow, this is great."

Yeah. :(

In the past, one person showed interest in function modules (Stuart, see the first WIP message), and two others showed interest in native typing (Nathan and Tino). Mr. Drake has also shown some interest in this thread. But, yes, you are correct. There has been no "wow, this is great" message.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Additional SPI functions
On Dec 20, 2009, at 12:03 AM, Tom Lane wrote: This looks like it's most likely redundant with the stuff I added recently for the plpgsql parser rewrite. Please see if you can use that instead. The parser param hooks will definitely work. As for getting the result TupleDesc prior to execution, I can include spi_priv.h and look at the CPS list directly. Something more crayola would be preferable, but I don't think SPI_prepare_statement is that something; although, it did make for a fine stopgap. (Well, fine, saving that my proposed SPI_prepare_statement appeared to be broken wrt plan revalidation and bound parameters.. ew) So, after looking into the parser hooks, CachedPlanSource, and SPI more, I ended up taking a slightly different route. I expect it to work with a couple prior versions of PG as well, so there is some added value over a new SPI function or exclusively using param hooks. And, now, thinking of compatibility with past versions of PG, I'll find a different route for SPI_execute_statements as well. Thanks. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Additional SPI functions
On Dec 20, 2009, at 12:03 AM, Tom Lane wrote: Why not code a loop around one of the existing SPI execution functions?

*shrug* It seemed nicer to push it on the parser than to force the user to split up the statements/calls. Or to split up the statements myself (well, the parser does it so swimmingly =). Its purpose is to allow the user to put a chunk of SQL into a single big block:

sqlexec("""
  CREATE TEMP TABLE one ...;
  CREATE TEMP TABLE two ...;
  init temp tables with data for use in the procedure
""")

For me, that tends to read better than breaking up the calls. Well, the above may be a bad example for crying about readability, but I'm thinking of cases with a bit more SQL in em'..

[spi_prepare_statement] This looks like it's most likely redundant with the stuff I added recently for the plpgsql parser rewrite.

If that allows me to identify the parameter type Oids of the statement, optionally supply constant parameters after identifying the types (so I can properly create the parameter Datums), and provides access to the resultDesc, then yes, it is redundant. Personally, I'm hoping for redundant. =)

Please see if you can use that instead.

I took a very short peek (wasn't really looking..) earlier today (err, yesterday now) and nothing jumped out at me, but I'll take a closer look now. Thanks =)

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] alpha3 bundled -- please verify
On Dec 20, 2009, at 1:36 AM, Peter Eisentraut wrote: Please check that it is sane.

I'm up, so: Works for me on snow leopard. But it doesn't seem to want to stop configure'ing on my fbsd8/amd64 box:

$ ./configure --prefix=/src/build/pg85a3
$ gmake
# GNU make 3.81 keeps running configure again and again
...
(last few lines before it appears to restart configure)
configure: creating ./config.status
cd . ./config.status src/Makefile.global
config.status: creating src/Makefile.global
./config.status GNUmakefile
config.status: creating GNUmakefile
cd . ./config.status --recheck
running CONFIG_SHELL=/bin/sh /bin/sh ./configure --no-create --no-recursion

However, I can build the REL8_5_ALPHA3_BRANCH from git using that box.. Recently pulled, gmake distclean'd and remade again..

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] alpha3 bundled -- please verify
On Dec 20, 2009, at 9:20 AM, Tom Lane wrote: Usually that means timestamp skew, ie file timestamps are later than your system clock. Yep. It's working now. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Additional SPI functions
In the event that my plpython3 patch does not make it, it seems prudent to try and get a *much* smaller patch in to allow the PL to easily exist out of core. I added a couple of SPI functions in order to support the database access functionality in plpython3u. Also, a getelevel() function for conditionally including context information due to error trapping awkwardness:

extern int SPI_execute_statements(const char *src);

Execute multiple statements. Intended, primarily, for executing one or more DDL or DML statements. In contrast with the other execution functions, the RPT loop plans and executes each statement before planning and executing the next, in order to allow subsequent statements to see the effects of all the formers. The read_only argument is omitted, as it should only be used in read-write cases (you can't read anything out of it).

extern SPIPlanPtr SPI_prepare_statement(const char *src,
                                        int cursorOptions,
                                        SPIParamCallback pcb,
                                        void *pcb_arg,
                                        TupleDesc *resultDesc);

Prepare a *single* statement and call the SPIParamCallback with the parameter information, allowing the caller to store the information and supply constant parameters based on the identified parameter types, if need be. Also, if it returns rows, return the TupleDesc via *resultDesc.

typedef void (*SPIParamCallback)(void *cb_data,
                                 const char *commandTag,
                                 int nargs,
                                 Oid *typoids,
                                 Datum **param_values,
                                 char **param_nulls);

Not at all in love with the callback, but it seemed desirable over using an intermediate structure that would require some additional management.

Certainly, docs and tests will be necessary for this, but I'm sending it out now with the hope of getting some feedback before sweating those tasks. The patch is attached for easy reference. Any help would, of course, be greatly appreciated.

cheers

spi.diff Description: Binary data

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
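The contract SPI_execute_statements describes--plan and execute each statement in order, so later statements see the effects of earlier ones--is the same contract sqlite3's executescript offers in Python's standard library. This is only an analogy for the semantics, not the SPI code:

```python
import sqlite3

def execute_statements(conn, src):
    """Analogy for the proposed SPI_execute_statements: run a block of
    semicolon-separated DDL/DML where each statement sees the effects
    of the ones before it."""
    conn.executescript(src)

conn = sqlite3.connect(":memory:")
execute_statements(conn, """
    CREATE TEMP TABLE one (x INTEGER);
    INSERT INTO one VALUES (1);
    CREATE TEMP TABLE two AS SELECT x + 1 AS y FROM one;
""")
result = conn.execute("SELECT y FROM two").fetchone()[0]
```

The third statement can only work because the first two were planned and executed before it--exactly the ordering the proposed SPI function guarantees, and the reason a plan-everything-up-front API can't serve this use case.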
Re: [HACKERS] Options for protocol level cursors
On Jun 13, 2008, at 9:24 AM, Tom Lane wrote: You'd do it while Binding a FETCH command.

Indeed, that is true. It seems quite unfortunate that drivers have to jump through such hoops to provide a convenient programmer's interface to held and/or scrollable cursors; bearing in mind all that has been discussed about the--well, *my*--desire for equivalent capabilities wrt the usual protocol level Prepare, Describe, Bind, and Execute sequence. Well, perhaps it is better to say that it is *ideal* to be able to merely use the protocol mechanisms to achieve the desired effect, rather than using them to issue the SQL command yielding the same or similar (Binding FETCH for different formats) effect. =\

[Obviously, I was looking to propose... ;]

My thoughts for creating a HOLD and/or SCROLL cursor on Bind would be to add YA GUC stating the cursor options for Bind cursors. Something along the lines of default_bind_options=HOLD,SCROLL. Of course, the actual default would be an empty string so as to preserve the existing functionality by default.

I imagine there's a big fat "No" waiting for me for at least the following reasons [in no particular order]:

1. It's already possible to achieve the desired result, and the proposed feature is, of course, not going to work with past versions. [Just put in the extra work to support past versions of PG.]

2. I'm the only one asking/looking for it. (I'm so lonely ;)

It is, of course, ideal to be able to state these options in the Bind message, but I don't see how that would be a possibility without a new protocol version or doing something dangerous like embedding the options in the cursor's name. ain't happenin'.

And, yeah, despite the first reason, I think I would prefer to use a new GUC. Certainly, with some bitterness. =(

In any case, thanks for the discussion, Tom.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Options for protocol level cursors
On Jun 13, 2008, at 4:40 PM, Kris Jurka wrote: The JDBC driver would also like this ability, but a GUC is a pretty ugly hack. I completely agree that it is an ugly hack. :) Also, since you still have to go to the SQL level to issue the MOVE or FETCH BACKWARD, you're still not all the way there to a full protocol solution. Completely true. However, this is, of course, only pertinent to SCROLL cursors. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Options for protocol level cursors
Is there any way to bind a cursor with SCROLL and WITH HOLD at the protocol level? Or perhaps a way to configure it that way after binding it? I know you can use DECLARE, but I believe that this inhibits the driver from being able to select the transfer format for individual columns; it's all binary or it's all text. Also, I don't think I can DECLARE against a prepared statement, save restating the query, no?

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Options for protocol level cursors
On Jun 12, 2008, at 10:08 AM, Tom Lane wrote: James William Pye [EMAIL PROTECTED] writes: Is there any way to bind a cursor with SCROLL and WITH HOLD at the protocol level? No, and for at least the first of those I don't see the point, since the protocol doesn't offer any behavior other than forward fetch.

That protocol level inability doesn't keep you from using MOVE on the cursor's name. And yes, it yields the appropriate errors when you try to MOVE backwards on a protocol-declared cursor.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Options for protocol level cursors
On Jun 12, 2008, at 3:59 PM, Tom Lane wrote: Sure, but if you're willing to use a SQL-level operation on the portal then you could perfectly well declare the cursor at SQL level too. Indeed, but like I said in my initial e-mail:: I know you can use DECLARE, but I believe that this inhibits the driver from being able to select the transfer format for individual columns; it's all binary or it's all text. Also, I don't think I can DECLARE against a prepared statement, save restating the query, no? Also, the latter has other problems wrt statement parameters. I guess you could prepare(protocol level) the DECLARE, but that seems like a gross workaround as it defeats the purpose of prepared statements by forcing you to create a new statement for each cursor that you plan to open. Of course, unless you can close the existing one before opening the next one. [ I really do hope that I'm missing something, btw :( ] -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Options for protocol level cursors
On Jun 12, 2008, at 4:45 PM, Tom Lane wrote: Huh? I don't see why... you might have such a limitation in a particular driver, but not in the protocol. Oh? I know when you bind a prepared statement you have the ability to state the formats of each column, but I'm not aware of the protocol's capacity to reconfigure the formats of an already existing cursor; ie, a DECLARE'd cursor. I know you can use the Describe message to learn about the cursor's column types and formats. Got a link to the part of the protocol docs describing this feature? Also, the latter has other problems wrt statement parameters. I guess you could prepare(protocol level) the DECLARE, but that seems like a gross workaround as it defeats the purpose of prepared statements by forcing you to create a new statement for each cursor that you plan to open. Well, using a query for a cursor is grounds for replanning anyway, because you might want a fast-start plan in such a case. And it's *definitely* grounds for replanning if you are asking for SCROLL capability --- the plan stored for a regular prepared statement very likely can't support that. Ah, that is good to know. Thanks. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
You guys call this simplification? You're out of your minds. This proposal is ridiculously complicated, and yet it still fails even to consider adjusting non-numeric parameters. And what about things that require more than a trivial arithmetic expression to compute? It's not hard at all to imagine wanting log, sqrt, etc. We do not need to put computational capability into GUC. Any computations needed to determine a parameter setting should be done by a wizard. +1 (save the comment speculating about anyone's location relative to their mind ;) Additionally, obvious as it may be, there's nothing stopping anyone from developing a tool to generate the configuration file from a more interesting source. Whether that's XML or some DSL that supports computations, doesn't matter. I would think if such a tool showed dramatic merit it would provoke another discussion about core integration, but ISTM that leaving it dead simple is best. [mm, hook into the postgres startup script, shouldn't be that hard to administer..] -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
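Nothing stops such a "wizard" from being a trivial external script that computes values and emits plain GUC assignments for the startup script to install. A hypothetical sketch (the GUC names are real, but the formulas are invented purely for illustration, including a sqrt just to show non-trivial math belongs outside the server):

```python
import math

def render_guc_file(mem_mb, max_connections):
    """Compute settings outside the server and emit plain GUC
    assignments.  The formulas here are made up; a real wizard
    would embody actual tuning advice."""
    shared_buffers_mb = mem_mb // 4                 # simple arithmetic
    work_mem_mb = max(1, int(math.sqrt(mem_mb)))    # log/sqrt-style math, fine here
    lines = [
        f"shared_buffers = {shared_buffers_mb}MB",
        f"work_mem = {work_mem_mb}MB",
        f"max_connections = {max_connections}",
    ]
    return "\n".join(lines) + "\n"

out = render_guc_file(1024, 100)
```

The output is an ordinary fragment of postgresql.conf, so GUC itself never needs computational capability.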
Re: [HACKERS] rfc: add pg_dump options to dump output
On Tue, Jun 03, 2008 at 01:17:43AM -0400, Tom Lane wrote: Robert Treat [EMAIL PROTECTED] writes: I would like to add the flags given to pg_dump into the output of the pg_dump file. +1, FWIW Anyone see any issues with this? I'm a bit worried about breaking diff-equality of matching dumps, I don't tend to diff dumps too often, but I'm curious if others make a habit of it for some process. When I have done it, it was normally a one-off case. but mainly I don't see the point. It's informative, no? Sure, if every dump used the same flags, it would be pointless. However, that's certainly not the case: --schema, --exclude-schema, etc. ie, I imagine it would be nice to have the information about how a dump was performed without having to track down someone's script or pester the person who's in the know via e-mail if you wanted/needed to repeat the process. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
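The proposal amounts to something like the following post-processing step, shown here as a hypothetical external sketch rather than pg_dump's actual behavior (the function name and flags are invented):

```python
def annotate_dump(dump_sql, argv):
    """Prepend a SQL comment recording the pg_dump invocation, the
    kind of information the proposal would have pg_dump emit itself.
    Purely illustrative; not pg_dump code."""
    header = "-- Dumped with: pg_dump " + " ".join(argv) + "\n"
    return header + dump_sql

out = annotate_dump("CREATE TABLE t (i int);\n", ["--schema", "public"])
```

Since the annotation is a comment, restoring the dump is unaffected; only byte-for-byte diffs of dumps made with different flags would change.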
Re: [HACKERS] Proposal: new function array_init
On Mon, Jun 02, 2008 at 08:10:19PM +0200, Pavel Stehule wrote: * I have a mild preference for array_fill instead of array_init. maybe, maybe array_set. Any ideas are welcome array_create? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
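For reference, the semantics being named here, building a new array of given dimensions with every element set to one value, can be modeled roughly in Python. This is my reading of the proposal, not the final implementation:

```python
def array_fill(value, dims):
    """Rough model of the proposed function: build a (possibly
    multidimensional) array with every element set to value."""
    if not dims:
        return value
    head, *rest = dims
    return [array_fill(value, rest) for _ in range(head)]

filled = array_fill(0, [2, 3])   # a 2x3 array of zeroes
```

Whatever the name ends up being (array_fill, array_init, array_set, array_create), the behavior under discussion is this.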
Re: [HACKERS] 8.2 beta blockers
On Sun, Sep 17, 2006 at 07:38:38PM -0400, Tom Lane wrote: We have three possible choices for this: do nothing, install a bug-compatible, allegedly-clean-room implementation in contrib: http://archives.postgresql.org/pgsql-patches/2006-09/msg00077.php or put a hopefully-cleaner design into core, eg per my suggestions here: http://archives.postgresql.org/pgsql-hackers/2006-09/msg00467.php I favor the third alternative, mainly because by changing the API we remove all doubt as to whether any intellectual property remains from the original GPL'd code. However, we've got to make up our minds and get on with it. FWIW, I'm +1 on the cleaner design you suggested. While I understand the concerns of adding features/API this late; as a user, I'd rather not wait another year to have these available in core(yes, I know alternative measures would exist if it did not make it into core, but the convenience of having it there would certainly be nice). That is, I really like the waiting variant. It is something that I would use. The lack thereof(IIRC) in the current contrib implementation is something that I have recently lamented about. I understand that want is not a reason to compromise the feature freeze, so I hope the legal concerns Tom mentions will be enough. =) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] pg_proc probin misuse
On Sun, May 28, 2006 at 09:12:34PM -0400, Tom Lane wrote: But we're still avoiding the central issue: does it make sense to dump a probin clause at all for plpython functions? If it's a compiled form of prosrc then it probably doesn't belong in the dump. That's why I initially thought pg_dump or I was the dirty one. Even if CREATE FUNCTION would take it, the probin value would be ignored(well, overwritten). On reflection I'm kind of inclined to think that plpython is abusing the column. If it were really expensive to derive bytecode from source text then maybe it'd make sense to do what you're doing, but surely that's not all that expensive. Everyone else manages to parse prosrc on the fly and cache the result in memory; why isn't plpython doing that? Yeah, I don't think it's expensive. It wasn't a feature that I implemented out of any kind of demand or testing. Rather, I knew I could marshal code objects, and I figured it would likely yield some improvement on initial loads, so I implemented it. If we think that plpython is leading the wave of the future, I'd be kind of inclined to invent a new pg_proc column in which derived text can be stored, rather than trying to use probin for the purpose. Although arguably probin itself was once meant to do that, there's too much baggage now. I think having something like that in pg_proc could be useful. Certainly my case may not really be demanding, but I guess there may be some languages that could enjoy a good benefit from avoiding recompilation. Tho, such a column seems like it would be more of a mere convenience for PL authors. If initial load were truly that expensive, I would think that it would justify creating a table containing compiled code and taking the extra lookup hit on initial load. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] pg_proc probin misuse
On Fri, May 26, 2006 at 11:21:32PM -0400, Tom Lane wrote: James William Pye [EMAIL PROTECTED] writes: So is this fix your broken PL or pg_dump should only be doing that for C language functions? Offhand it seems to me that pg_dump is behaving reasonably: it's storing probin if it sees something there to be stored. The asymmetry is in the backend, specifically functioncmds.c's interpret_AS_clause(): it has a hardwired assumption that probin is only relevant to C functions. Feel free to propose a saner definition. AFAICS the current coding makes probin useless for all except C functions, so I think it could be improved. I guess there are two ways to go about it. Simply remove the assumption that probin is only relevant to C functions; perhaps allowing a hardwired exception for builtin languages where allowing probin to be set would be deemed unsightly (ie, the easy way ;). Or, add a column to pg_language that specifies the language's probin usage so that pg_dump and the backend have an idea of how to handle these things for the given language(the takes a bit more work way). [I imagine the former could gracefully lead into the latter as well.] ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] pg_proc probin misuse
Hi, In PL/Py, I had the bright idea of storing bytecode in the probin field of the function's pg_proc row. However, this idea has lately become rather dim as I have recently rediscovered(thanks Adrian) that this breaks dumps; pg_dump outputs a PL/Py function as CREATE FUNCTION x() RETURNS y LANGUAGE python AS 'bytecode', 'source'. Of course, when loading this, it fails: 'ERROR: only one AS item needed for language python'. So is this fix your broken PL or pg_dump should only be doing that for C language functions? I imagine the former, so if that is the case perhaps the 'probin' column description at [1] should be reworded to ensure others don't get the same bright idea(the language specific part in particular). Ugh, even if it were the latter, I would still be breaking existing versions, so I'm inclined to fix it regardless.. Have a good evening (afternoon, morning, etc :). [1] http://www.postgresql.org/docs/8.1/static/catalog-pg-proc.html [Yeah, I do see the clarification at the bottom of the page. :( ] ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
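The bytecode-in-probin idea is easy to sketch in plain Python: marshal can serialize a compiled code object to bytes and restore it later without recompiling. The function body and names below are invented; this only models what PL/Py stored, it is not its actual code:

```python
import marshal

# Compile a (made-up) function body once...
src = "def main():\n    return 1\n"
code = compile(src, "public.example()", "exec")

# ...and these are the sort of bytes that ended up in probin.
probin = marshal.dumps(code)

# Loading the function later skips compile() entirely:
restored = marshal.loads(probin)
ns = {}
exec(restored, ns)
result = ns["main"]()
```

The catch described above is that pg_dump then emits both the source and these bytes as two AS items, which CREATE FUNCTION refuses for a non-C language.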
Re: [HACKERS] pseudo-type record arguments for PL-functions
On Sun, May 07, 2006 at 12:16:16AM +0200, Thomas Hallgren wrote: Yes, the intarray stuff was very helpful but also somewhat confusing. Why are there two ways of representing some of the array types? I mean, why is there an _int4 when you could just as well write int4[]? I'm probably missing the point altogether. FWICT, Prefixing a '_' is the convention used to make the array type's typname unique. Being able to reference array types as _type is a side effect. (array types being actual rows in pg_type) ---(end of broadcast)--- TIP 6: explain analyze is your friend
[HACKERS] Scanning for insert
Greets, [Ugh, sent one with uncompressed patch. Seems to be swallowed(No failure msg?). Ignore it if it surfaces.] The attached patch extends pg_am by adding two methods, scan for insert and insert from scan. These methods are meant to provide an index user with the ability to do conditional insertions based on the results of a conclusive locking scan--a scan that not only finds a match, but resolves its actual existence(what _bt_check_unique does), and takes any necessary locks to warrant later insertion by a call to insertfromscan. (It should also be noted that insertions are aborted by giving the insertfromscan method an InvalidItemPointer instead of adding another method, ie abortinsertscan.) I'm sending this to hackers instead of patches, because it's not complete(I know it's broken in some of the new places), and, most importantly, I want to know if this is actually the appropriate general direction to travel in. Not to mention that I have probably missed something and I hope that someone will give me my due whackings with The Clue Stick. :P These new index interfaces are meant to provide the necessary functionality for doing conditional index insertions that will likely aid in any implementation of error logging, and other features(merge?). I have failed to find much discussion on this, so if there is a good thread about it, please point me at it. The most recent discussion of something along these lines was Martijn's thread about providing non-btree indexes with uniqueness: http://archives.postgresql.org/pgsql-hackers/2006-01/msg00541.php (However, I doubt that this patch would actually help with deferred constraints or giving GiST/hash uniqueness.) This patch is mostly incremental, and doesn't make many alterations to existing code for the time being; the most significant alteration was restructuring a bit of _bt_check_unique(_bt_evaluate in this patch) to work for both scanforinsert and doinsert. 
(This is probably a good thing as this code is all pretty new to me.) [Sorry about some of the whitespace changes in the patch, remnants from an overzealous version.] Any thoughts and directions would be appreciated. -- Regards, James William Pye ciinsert.patch.gz Description: Binary data ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
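The scan-for-insert / insert-from-scan split described above can be modeled outside the AM. A toy sketch of the idea, with a dict and a mutex standing in for the btree and its page locks (all names are invented; this is not AM code):

```python
import threading

class ToyUniqueIndex:
    """Toy model of the proposed AM split: scan_for_insert resolves
    existence and takes the lock; insert_from_scan completes or, when
    given None (playing the role of InvalidItemPointer), aborts."""
    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}

    def scan_for_insert(self, key):
        # conclusive locking scan: decide existence, hold the lock
        self._lock.acquire()
        return {"key": key, "exists": key in self._entries}

    def insert_from_scan(self, scan, item):
        # item=None aborts the pending insert; either way, release
        try:
            if item is not None and not scan["exists"]:
                self._entries[scan["key"]] = item
        finally:
            self._lock.release()

idx = ToyUniqueIndex()
s = idx.scan_for_insert(10)
if not s["exists"]:
    idx.insert_from_scan(s, "tuple-pointer")
else:
    idx.insert_from_scan(s, None)   # abort, releasing the lock
```

The hazards Tom raises map directly onto this sketch: the lock is held across the caller's decision, and an error between the two calls would leak it unless something cleans up.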
Re: [HACKERS] Scanning for insert
On Mon, Feb 27, 2006 at 05:44:20PM -0500, Tom Lane wrote: Is this really a good idea? The fundamental problem I see with it is that it takes away the index AM's ability to make any guarantees about its locking behavior, ie, how long locks will be held or what other operations might intervene while holding them. Yeah, I saw that as well. :( My only thoughts on that issue so far have been that the user must tread carefully while holding these scans; it becomes the user's responsibility. Weak at best, perhaps, so I can understand if that does not move you or anyone else into thinking positively of these proposed interfaces. ;) It'll also require the AM to save potentially large amounts of state between the two calls (eg, an entire search path might be needed for GiST). Currently any such state can live on the stack in local variables, but that won't work if it has to be remembered until a later AM call. Hrm, certainly, implementing these methods for AMs that use such state keeping techniques may be extra difficult. Massive changes may be necessary. However, I don't imagine that making such a change would be impossible, nor would it necessarily be required at all. Similar to uniqueness, the index need not be forced to implement these new interfaces--surely not upon this patch's introduction into the source(if/when). If a feature were to demand the use of scan insertions on a specific index that does not provide the interfaces, it could easily raise an informative exception about the shortcoming. Hehe, perhaps a Truly Foolish Notion, but could we use siglongjmp for such cases(state living on the stack)? Seems questionable and evil, tho; I know. ;) Lastly, what happens if the caller loses control (due to elog) and never makes the followup AM call? I imagine/hope it would get cleaned up similar to how nbtree gets cleaned up at the end of the transaction that was rolled back due to a unique constraint violation. 
[I seem to recall having to do some special _bt_wrtbuf to get my experimental insert unless patch working, so I guess that at eox some cleanup is done w.r.t. those locks and whatnot. (yeah, technical term, whatnot ;)] Additionally, if the caller is very concerned with potential exceptions, perhaps a PG_TRY() block should be exercised in those areas of worry. I guess in most cases it simply comes back to becoming the scan's user's responsibility to be sure to keep things kosher. :\ If that's what you want, maybe a better answer is to simply allow aminsert to return a uniqueness-conflict indication, rather than raising the ereport for itself. Maybe so. However, I guess I was thinking along the lines that many times multiple insert scans may need to be done before the final decision to actually do the insertion is made(yeah, uh, the insert unless thing is what I've had in mind ;). Also, I think part of this point is to be able to avoid the actual insertion into heap, so as to avoid a superfluous heap_insert/heap_delete, and other unwinding code if a uniqueness-conflict indication were made to a user that needs to respond to such signals for the feature being implemented. (The more unique constraint violations that occur, the more I/O that gets saved with insert scans. I imagine this could be a very good thing for certain applications.) -- Regards, James William Pye ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Pl/Python -- current maintainer?
On Sun, Feb 26, 2006 at 01:08:52PM -0500, Tom Lane wrote: That design is broken on its face, as the system does not guarantee to call the validator. Hrm. Other than language creations that do not specify a validator, at what times will Postgres not call the validator upon function creation? -- Regards, James William Pye ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Pl/Python -- current maintainer?
On Sun, Feb 26, 2006 at 07:36:01PM +0100, Michael Paesold wrote: SET check_function_bodies = off; Hrm, thanks for pointing that out. However, this, thankfully, does not appear to inhibit the validator at all. Rather, the PL's validator is left with the job to respect it if need be: jwp=# select version(); version - PostgreSQL 8.1.3 on i386-unknown-freebsd6.1, compiled by GCC gcc (GCC) 3.4.4 [FreeB SD] 20050518 (1 row) jwp=# set check_function_bodies = off; SET jwp=# SHOW check_function_bodies ; check_function_bodies --- off (1 row) jwp=# CREATE FUNCTION test() RETURNS int LANGUAGE python AS 'return 1'; CREATE FUNCTION jwp=# SELECT prolang, probin FROM pg_proc WHERE proname = 'test'; prolang | probin -+-- 16390 | c\002\000\000\000\002\000\000\000\001\000\000\000C\000\000\000s\004\000\0 00\000d\001\000S(\002\000\000\000Ni\001\000\000\000(\000\000\000\000(\002\000\000\00 0s\004\000\000\000selfs\004\000\000\000args(\000\000\000\000(\000\000\000\000s\005\0 00\000\00069121s\006\000\000\000test()\002\000\000\000s\002\000\000\000\000\001 (1 row) jwp=# SELECT test(); test -- 1 (1 row) ^D This is a good thing, IMO. Python primarily does a syntax check so the cited[1] reasons for the use of check_function_bodies do not seem to apply here. (Not sure if this is different on 8.0 or earlier.) [1] http://www.postgresql.org/docs/8.1/static/runtime-config-client.html -- Regards, James William Pye ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
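The point that Python "primarily does a syntax check" is easy to demonstrate outside the backend: compiling a function catches syntax errors, while unknown names only fail when the function is actually called. A minimal sketch in plain CPython, no PostgreSQL involved:

```python
# Compiling performs only the syntax check; the undefined name passes.
ok = compile("def f():\n    return undefined_name\n", "<body>", "exec")

# A genuine syntax error, by contrast, is caught at compile time.
try:
    compile("def f(:\n", "<body>", "exec")
    caught = False
except SyntaxError:
    caught = True
```

So a PL/Python validator that merely compiles the body has little of the side-effect risk that check_function_bodies exists to suppress in other PLs.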
Re: [HACKERS] Pl/Python -- current maintainer?
On Sat, Feb 25, 2006 at 10:09:52AM +0100, Tino Wildenhain wrote: And with even more love the restricted python from zope could be ported so there could be a pl/python again :-) Ok, just haluzinating ;) Not necessarily. ;) From what I have seen of zope's restricted python, it does, or can, force its restrictions by checking bytecode. I imagine a simple PL sitting on top of the untrusted variant that merely implements a custom validator that checks the bytecode produced by the untrusted PL's validator. The language handler would remain the same: Create untrusted plpy...

CREATE FUNCTION zope_restrict(oid) RETURNS VOID LANGUAGE python AS $$
getcode = Postgres.CachedQuery("SELECT probin FROM pg_proc WHERE oid = $1")
getlangval = Postgres.CachedQuery(
	"SELECT lanvalidator FROM pg_language WHERE oid = (SELECT prolang FROM pg_proc WHERE oid = $1)"
)

langvaloid = getlangval(self.function.procid)
langval = Postgres.Function(langvaloid)

# produce the function's bytecode for checking
langval(args[0])
# get the function's newly created bytecode
code = getcode(args[0]).next()[0]

#
# XXX: do the zope restriction stuff, raise exception if necessary
#
Postgres.NOTICE(repr(code))
$$;

CREATE TRUSTED LANGUAGE plpyr HANDLER python.handler VALIDATOR zope_restrict;

[This almost works in plpy head, but I think I just found a bug ;] -- Regards, James William Pye ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Pl/Python -- current maintainer?
On Sat, Feb 25, 2006 at 06:36:19PM -0300, Alvaro Herrera wrote: I'm not sure it's an issue now that we have pg_pltemplate, but in older versions it's possible to create a language without setting a validator. This would make the validator an unsuitable place for checking the restrictions. Hrm. I think this would only be an issue in PL/Py if the user had the ability to alter probin. The handler will never directly execute code in prosrc; it relies on a validator to fill in probin. Whether a regular user could take advantage of this or not, I'm not sure as I have yet to test it or to give it much thought. -- Regards, James William Pye ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Pl/Python -- current maintainer?
On Sat, Feb 25, 2006 at 01:21:34PM -0700, I wrote: From what I have seen of zope's restricted python, it does, or can, force its restrictions by checking bytecode. I imagine a simple PL sitting on top of the untrusted variant that merely implements a custom validator that checks the bytecode produced by the untrusted PL's validator. The language handler would remain the same: [ugh, Correcting my assumptions...] Zope's RestrictedPython is a custom bytecode generator that compiles Python code specially, as opposed to a bytecode processor that validates against some rule set as I had thought for some (wishful? ;) reason. The bytecode then needs to be executed in a special environment that then imposes some specified restrictions at runtime (I'm not really clear on all the details here as I am having a very difficult time finding documentation). This doesn't mean that it couldn't be used. However, it does mean that some munging of the handler would be required (something that I desired to avoid). -- Regards, James William Pye ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Copy From Insert UNLESS
On Mon, Feb 06, 2006 at 05:08:38PM -0500, Alon Goldshuv wrote: The proposal is neat, however, I am not too excited about handling errors in such high granularity, as far as the user is concerned. I am more on the same line with Tom Lane's statement in Simon's thread (Practical error logging for very large COPY statements): The general problem that needs to be solved is trap any error that occurs during attempted insertion of a COPY row, and instead of aborting the copy, record the data and the error message someplace else. Seen in that light, implementing a special path for uniqueness violations is pretty pointless. I think I would be inclined to actually agree with that, which is why I proposed a special path for constraint violations in general as opposed to just uniqueness. However, I can understand if you remain unmoved. ;) But, I definitely share your struggle to finding a good way to handle those unique/FK constraints... Aye, :-( As far as UNIQUE goes, maybe there is a good way to do a bt scan against the index table right before the simple_heap_insert call? Yes, but I believe some additional locking is required in order to make that safe. Not that that would kill it, but I think there is a better way; I'm cooking up a general proposal for refactoring unique constraints, so I'm hoping something along those lines will aid any patch attempting to solving this problem[copy error/violation management]. -- Regards, James William Pye ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Copy From Insert UNLESS
On Sun, Feb 05, 2006 at 07:14:49PM -0800, Stephan Szabo wrote: On Sun, 5 Feb 2006, James William Pye wrote: However, constraints referenced in an UNLESS clause that are deferred, in any fashion, should probably be immediated within the context of the command. Perhaps a WARNING or NOTICE would be appropriately informative if UNLESS were to actually alter the timing of a given constraint. The problem is that even immediate constraints are supposed to be checked at end of statement, not at row time. I see. Immediated is not the word that I am actually looking for then. :( Perhaps Postgres should specify our current immediate as a new constraint mode. instant, maybe? Sadly, I think it will be difficult to get away from using that or some other synonym if such an idea were to be implemented. [Getting the feeling that this has been discussed before. ;] Our implementation of UNIQUE is particularly bad for this. Yes. Changing how UNIQUE constraints are implemented will likely be the first step in this patch. Any facility that can alter the tuple before it being inserted into the heap should probably be exercised prior to the application of the tuple against UNLESS's behavior. The problem is that you can un-violate a unique constraint by changing some other row that's already in the table. And I think that it might even be legal to do so in an after trigger (and in fact, some other row's after trigger). [join] Basically a violation at the time the row is created is irrelevant if the violation is gone by the end of statement. Okay. I can't help but think such a trigger as being questionable at best. However, per spec, it should be possible. =\ This isn't necessarily a killer to the idea though, it probably just means the semantics are harder to nail down. Aye. I figured there would be some details that might take a while. Once the UNIQUE constraint code is relocated, I think implementing more standards compliant constraint timing might be substantially easier. 
However, I don't think this should affect UNLESS. Rather, I think UNLESS should, more or less, demand that specified constraints be checked at the same time as they are currently. This is meant to be an optimization at multiple levels; reduce code redundancy(rewriting constraint checks for use prior to the actual insertion), computational redundancy(potentially, running the rewritten checks more than once), and reduce unnecessary I/O(avoiding heap_insert()'ing an evil tuple into the target table despite the fact that the statement may later un-violate it). Although, perhaps, it could be configurable with an option; INSERT INTO t UNLESS [DEFERRED] CONSTRAINT VIOLATION. =) -- Regards, James William Pye ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Copy From Insert UNLESS
On Mon, Feb 06, 2006 at 11:03:06AM -0800, Josh Berkus wrote: Are you sure that a new type of constraint is the way to go for this? [Thinking that you are referring to the new constraint mode that I was confusingly referring to...] Well, it really wouldn't be new. It's just labeling what we do now as something other than immediate. After all, immediate constraints are meant to be checked at the end of the SQL statement, yet our implementation of immediate is truly immediate, as Stephan pointed out to me. However, I think our current timing method is better for normal cases, at least for Postgres, than what the spec specifies. [See pages 63-66: The second paragraph in 4.17.2 Checking of constraints] Ultimately, I don't care about this very much. However, I think an implementation of my proposal would aid in implementing spec-compliant immediate timing. [If I misunderstood what you were getting at, sorry. :] It doesn't solve our issues in the data warehousing space. The spec we started with for Error-tolerant COPY is: 1) It must be able to handle parsing errors (i.e. bad char set); My proposal did not handle this, and purposefully so. A constraint violation, while inhibiting insertion into the target table, would still yield a kosher tuple--just not okay for that table, which could then be dropped or redirected using THEN INSERT INTO into another precisely structured table for later analysis. Bad data errors would not even have a tuple to work with in the first place, which is why I wanted to draw a distinction. I think having something to handle bad data is useful, but I think it should be distinct, syntactically and implementation-wise, from constraint violations. That's not to say that it couldn't fit into the model that UNLESS would try to create: COPY ... UNLESS BAD DATA [ON COLUMN (y)] OR CONSTRAINT VIOLATION [ON (z)] ... 2) It must be able to handle constraint violations; Check. 
:) 3) It must output all row errors to a log or errors table which makes it possible to determine which input row failed and why; Check; save data errors for now. 4) It must not slow significantly (like, not more than 15%) the speed of bulk loading. Check. (See below) It seems like your idea, which would involve a second constraint check, would achieve neither #1 nor #4. I'm not proposing that a second constraint check should be made. The difficulty of my implementation comes from the position that I don't think the current implementation of UNIQUE constraints is ideal. It is hidden inside nbtree, which, while convenient, is not likely to be the best place for it. I believe my original letter covered this by proposing a new pg_am column; one that would hold a regproc that would be able to 'scan for insert' and return the state(position, locks, whether an entry exists, anything else necessary for a quick insert) of that scan to the caller for later use in the actual insert or update. All other constraints appear to require trivial modifications to get it to work with UNLESS without any redundancy. -- Regards, James William Pye ---(end of broadcast)--- TIP 6: explain analyze is your friend
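The control flow being argued for, checking the named constraints before the heap insert and diverting violating rows instead of aborting, can be sketched in miniature. The table, constraint, and reject list are all stand-ins; this only models the proposal's semantics, not backend code:

```python
def copy_unless(rows, table, unique_key, rejects):
    """Toy model of COPY ... UNLESS CONSTRAINT VIOLATION THEN INSERT
    INTO: rows violating the unique constraint are diverted to
    `rejects` instead of aborting the whole load."""
    seen = {row[unique_key] for row in table}
    for row in rows:
        if row[unique_key] in seen:
            rejects.append(row)      # diverted; no heap insert attempted
        else:
            seen.add(row[unique_key])
            table.append(row)

table, rejects = [{"i": 1}], []
copy_unless([{"i": 1}, {"i": 2}], table, "i", rejects)
```

The I/O argument above falls out of the `if`: a violating row never touches the target table, so there is no heap_insert to undo.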
Re: [HACKERS] Copy From Insert UNLESS
On Sun, Feb 05, 2006 at 02:08:12PM -0800, Stephan Szabo wrote: Have you considered how this might work with spec-compliant constraint timing? I haven't gone so far as to look into the spec, yet. [Noise of rustling papers] However, constraints referenced in an UNLESS clause that are deferred, in any fashion, should probably be immediated within the context of the command. Perhaps a WARNING or NOTICE would be appropriately informative if UNLESS were to actually alter the timing of a given constraint. I think even in inserting cases, a later trigger before statement end could in some cases un-violate a constraint, so checking before insert won't actually be the same behavior as the normal constraint handling which seems bad for this kind of system. Any facility that can alter the tuple before it being inserted into the heap should probably be exercised prior to the application of the tuple against UNLESS's behavior. The implementation of UNLESS will probably completely change ExecConstraints(), which comes after the firing of BEFORE triggers and before heap_insert(). Beyond that, I am not sure what other considerations should be made with respect to triggers. So, UNLESS should/will be applied after BEFORE triggers, but before non-UNLESS specified constraints. ;) -- Regards, James William Pye ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] Copy From Insert UNLESS
/pgsql-hackers/2005-11/msg01100.php Extant Solutions There are quite a few solutions to this problem as I'm sure many (all?) know: . Temporary table that filters out the evil tuples. . BEFORE TRIGGER handling the tuple if the constraint of interest is violated. . INSERT wrapped in a subtransaction. . (Other variations) Temporary tables are probably the fastest here. However, it still exhibits redundancy, and requires post-load tuple movement(extra step). Savepoints require client side logic in order to detect the appropriate error code to trap or raise. (Also, this seems to be quite slow, regardless.) A before trigger is going to require that interested constraints be tested twice and for code to be effectively duplicated. So, there are quite a few ways to do error controlled bulk loading. Temporary tables appear to be the best current solution. However, I think the implementation described in this proposal will yield improvements by simply reducing redundancy. Simple Numbers [Using the attached, *very* experimental patch]: (PostgreSQL 8.2 [~HEAD], redhat 9) These numbers were rendered from a simple single client trial where psql and the postmaster were running on the same machine. I ran each trial a few times and I would cleanup and VACUUM FULL between runs. The numbers do not include the cleanup or the vacuum. The Insert unless trials are drawn from my attached patch. (For the INSERT UNLESS cases I used the patch, otherwise a clean PG.) CREATE TABLE foo (i int PRIMARY KEY); [Straight SQL INSERTs (via psql and an SQL file)] (10 Violations) Auto-Commit Transactions: 22.213 Savepoints: (ran out of shared memory) Insert into temporary table, Merge: 24.225 * Insert Unless: 14.668 (5 Violations, 5 New) Auto-Commit Transactions: 33.342 Savepoints: (untested) Insert into temporary table, Merge: 24.243 * Insert Unless: 14.260 (10 New) Auto-Commit Transactions: 47.990 Savepoints: 3:05.60 (three minutes!!) 
Temporary table: 26.178 * Insert Unless: 14.283 The numbers here look pretty good, especially for such a hackish patch. [btw, I hope I screwed up somehow on the savepoints.] However: [COPY FROM STDIN (via psql and a file of data)] (10 Violations) Copy Unless: 2.4132 Copy to temporary, Merge: 0.72675 (5 Conflicts, 5 New) Copy Unless: 2.1145 Copy to temporary, Merge: 1.469 (10 New) Copy Unless: 1.6386 Copy to temporary, Merge: 2.4305 The numbers here don't look so good now. :( However, I'm convinced that this is showing the inefficiencies in my current hack, rather than dissolving the likelihood of the discussed implementation being an improvement. (The files that rendered these results are available on request. They are 100K a piece after being bzip'd) -- Regards, James William Pye Index: src/backend/access/nbtree/nbtinsert.c === RCS file: /projects/cvsroot/pgsql/src/backend/access/nbtree/nbtinsert.c,v retrieving revision 1.131 diff -c -r1.131 nbtinsert.c *** src/backend/access/nbtree/nbtinsert.c 17 Jan 2006 00:09:00 - 1.131 --- src/backend/access/nbtree/nbtinsert.c 2 Feb 2006 20:05:47 - *** *** 121,127 { TransactionId xwait; ! xwait = _bt_check_unique(rel, btitem, heapRel, buf, itup_scankey); if (TransactionIdIsValid(xwait)) { --- 121,139 { TransactionId xwait; ! PG_TRY(); ! { ! xwait = _bt_check_unique(rel, btitem, heapRel, buf, itup_scankey); ! } ! PG_CATCH(); ! { ! _bt_freestack(stack); ! _bt_freeskey(itup_scankey); ! HOLD_INTERRUPTS(); ! _bt_wrtbuf(rel, buf); ! PG_RE_THROW(); ! } ! PG_END_TRY(); if (TransactionIdIsValid(xwait)) { *** *** 240,245 --- 252,259 return xwait; } + if (nbuf != InvalidBuffer) + ReleaseBuffer(nbuf); /* * Otherwise we have a definite conflict. */ Index: src/backend/commands/copy.c === RCS file: /projects/cvsroot/pgsql/src/backend/commands/copy.c,v
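For comparison, the "copy to temporary table, then merge" workaround benchmarked above boils down to the following filtering step — a plain-Python sketch of the logic, not the actual SQL or C involved; INSERT ... UNLESS would do the equivalent in a single pass:

```python
def merge_from_staging(table, key, staging):
    """Insert staged rows into `table`, silently skipping any row whose
    key already exists -- the merge step of the temp-table workaround."""
    existing = {row[key] for row in table}
    loaded = []
    for row in staging:
        if row[key] in existing:
            continue                  # drop the would-be violating tuple
        existing.add(row[key])        # also dedupes within the batch itself
        table.append(row)
        loaded.append(row)
    return loaded
```

The redundancy the proposal complains about is visible here: every key is examined once against the target and once again at real insert time, and the staged rows must be moved twice.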
Re: [HACKERS] Adding a --quiet option to initdb
On Wed, 2006-01-25 at 19:23 +0100, Thomas Hallgren wrote: Make it completely silent by default instead and then introduce a --verbose. +1. I imagine initdb is usually run interactively, so I don't think the extra output is a big issue in the normal case, but I think the "if you want it, ask for it" idea that Thomas is proposing is the right way. Why should initdb give processing information to the user if the user didn't request it in the first place? For applications that want to automate the initdb process in a GUI or similar, the output of initdb isn't likely to be a very elegant aspect of the environment the developer is trying to create, but they are, more or less, stuck with getting it if they wish to provide their user with more informative feedback about the ongoing process. For Devrim's case it would be overkill, but what about a libinitdb, or some sort of authoritative source of the processing steps needed to initialize a new database location, that other applications could make easier use of? -- Regards, James William Pye
Re: [HACKERS] Coding standards? Recommendations?
On Sun, 2006-01-15 at 11:14 -0500, korry wrote: (sorry if this has been covered before - if you can point me to a document that covers the information I'm looking for, that would be very helpful). I asked this question about half a year ago and got: http://archives.postgresql.org/pgsql-hackers/2005-05/msg00331.php Hope it helps. -- Regards, James William Pye
Re: [HACKERS] plpython and bytea
On Mon, 2005-11-21 at 15:18 +0200, Hannu Krosing wrote: The project seems quite interesting, will surely take a deeper look It looks quite alpha, seems to mess to deeply with backend, and the cvs checkout of module be does not build, so I will probably not be able to use it in production for quite a while :( [EMAIL PROTECTED] be]$ python setup.py build running build running config Traceback (most recent call last): File setup.py, line 256, in ? setup(defaults) File setup.py, line 249, in setup d = distutils.core.setup(**kw) File /usr/lib/python2.4/distutils/core.py, line 149, in setup dist.run_commands() File /usr/lib/python2.4/distutils/dist.py, line 946, in run_commands self.run_command(cmd) File /usr/lib/python2.4/distutils/dist.py, line 966, in run_command cmd_obj.run() File /home/hannu/work/postgresPy/be/lib/ldistutils.py, line 193, in run self.run_command('config') File /usr/lib/python2.4/distutils/cmd.py, line 333, in run_command self.distribution.run_command(command) File /usr/lib/python2.4/distutils/dist.py, line 966, in run_command cmd_obj.run() File /home/hannu/work/postgresPy/be/lib/ldistutils.py, line 166, in run self.distribution.config = cfg = self.load() File /home/hannu/work/postgresPy/be/lib/ldistutils.py, line 129, in load cfgsrc(self) File setup.py, line 19, in configure import postgresql.utility.config ImportError: No module named postgresql.utility.config The 'layout' package needs to be installed first. See this quick start section: http://python.projects.postgresql.org/quick.html#Fetch+and+Install+the +Backend ('be' depends on 'lo' and 'ex') -- Regards, James William Pye ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] FAQ/HTML standard?
On Sat, 2005-09-10 at 17:12 -0400, Neil Conway wrote: Bruno Wolff III wrote: I ran across an article a few weeks ago that suggested that this wasn't all that great of an idea. Using HTML 4.01 should be just as useful. Is there a reason why the FAQ can't be in DocBook, like the rest of the documentation? That would allow multiple output formats to be produced, and avoid the need to store multiple copies of the FAQ in CVS. Additionally, I would be inclined to think that this would be a good idea considering that DocBook defines elements specifically for structuring FAQ material. See http://docbook.org/tdg5/en/html/qandaset.html -- Regards, James William Pye
Re: [HACKERS] PQ versions request message
On Thu, 2005-09-08 at 09:17 -0400, Tom Lane wrote: You're right, it wouldn't be necessary to tear down the socket --- but it *would* be necessary to have two network round trips. And the point remains that in most scenarios the client and server will be of similar vintages and so wish to speak the same protocol version anyway, so most of the time the extra probe would be useless. I think you're trying to optimize the uncommon case at the expense of the common case. The feature, being optional, does not always require any extra expense. The expense is only incurred when the feature is used; those who use are those who pay. For those users, I imagine this would normally be once per host per process, and only if the user of the client does not explicitly specify the PQ version in the first place. AFA the likelihood of client and servers being of similar vintages, I imagine that you are right here. Although, I would rather not play a guessing game if I didn't have to, and the backend very well has the ability to give me the information that I need to avoid any such thing. The point is to give client authors the ability to authoritatively resolve ambiguity that may exist in multiversion supporting clients and to do so without any version specific code(or at a minimum wrt older servers) or fingerprinting of any sort. -- Regards, James William Pye ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] PQ versions request message
On Thu, 2005-09-08 at 16:27 -0400, Tom Lane wrote: Had we had such a facility from the beginning, it would indeed have that benefit. But unless you are going to start out by dropping client-side support for all extant server versions, you will not get any such benefit; you'll still need retry code. So I still think this isn't really worth the trouble it would take to implement. The benefit will come when extant server versions become antiquated. AFA difficulty with the implementation is concerned, I wouldn't bother with anything except the backend. libpq-fe can wait until said antiquation occurs, and I imagine the backend work being 40 lines or so, no? Also, you keep referring to caching the result on the client side and re-using it across multiple connections --- but you can do that now, so why is that an argument in favor? Not so much favor, but, rather, it was to target your complaint about the two required round-trips involved in connection negotiation with a version query. I was trying to ease that distaste by showing that if such ambiguity existed where resolution were necessary, it would/should only need to be done once(save various exceptions, of course). -- Regards, James William Pye ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] PQ versions request message
I have been writing a PQ client and I have come to think that a supported PQ versions request startup packet would be useful to client authors. That is, sending a StartupPacket(Version(0, 0))(or whatever) would return a message containing a list of supported PQ versions, and maybe the server version as well. Having the said feature would allow client authors to select the connection code based on the versions returned, rather than using a trial and error process. It also gives the client the ability to quickly error out and notify the user that it cannot connect to the server if the client doesn't support any of the server versions. I find the trial and error process unseemly as it could require code to be loaded that simply does not need to be loaded--of course, making assumptions about the client design(this would likely be the case for my client). In addition, the trial and error process could be time consuming depending on how the connection to the server is established, thus further decreasing the appeal of the trial and error process(hrm, this may not be a useful criticism of te if pg keeps the pipe open until the client sends a suitable startup packet?). Although, I do see that the trial and error process would only need to happen once(per process, I figure) if the client author cached the code selection information about a given server. Thoughts? Has this been proposed/shot down before? -- Regards, James William Pye ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] PQ versions request message
On Thu, 2005-09-08 at 03:48 +0200, Peter Eisentraut wrote: This doesn't make sense to me, because a server does not have any version requirements on the client (aside from the protocol versions, which are negotiated automatically). The use case primarily applies to custom clients(non-libpq, atm) that support multiple PQ versions that may be implemented in separate modules/libraries. (Avoid loading client-2.0 code for a 3.0 connection, and/or future versions.) libpq automatically negotiates the version using trial and error, effectively(assume 3.0 by sending 'S', if 'E', fallback to 2.0, and reestablish the connection, apparently). and maybe the server version as well. That is already available automatically. Yes, but, AFAIK, only after the protocol has been negotiated and authentication is complete. Really, I'm not sure if such a feature should include the server version as selecting feature implementations based on it is probably a bad idea(TM). -- Regards, James William Pye ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] PQ versions request message
On Wed, 2005-09-07 at 22:02 -0400, Tom Lane wrote: Given that it'd be guaranteed not to work with any existing server versions, I find the usefulness a bit debatable... Indeed =(. However, older servers could be easily detected then if the returned message type is 'E'. If 'E' is returned, it would likely be on a pre 8.x server running 2.0 and/or 3.0. Thus signalling the client connection code to fall back on old-style version detection, if need be. Alas, this would, of course, require yet another connection expenditure for any so called pre-version list supporting servers. In particular I disagree with the premise that clients should expend an extra connection setup/teardown cycle to find out which protocol versions the server supports. We change protocol versions seldom enough that I think the optimal strategy is try the newest version you know of, then back off one at a time until it works. This is always optimal if the server is newer than the client, and it's only pessimal if the server is much older than the client --- how often does that happen? [Assuming you're not referring to connection attempts to the pre-version list supporting servers] Why does it have to be torn down? I imagine that it could easily be an extra, optional part of connection negotiation. Wait for startup packet, If Version(0,0), Send PQ version list, wait for real startup packet, Else take it as the real startup packet, or ssl neg message, or etc. If I implied that it should be torn down entirely, it was probably due to how I was planning to manage the initial connection that I establish to a server to validate that a connection can actually be made and resolve any/some server specific issues(int timestamps, etc). Although, this is all specific to the implementation that I'm brewing. (not sure about all the specifics yet, as I've just been toying with the idea) To put it more concretely: there are currently only two protocol versions in live use (V1 died a long time ago). 
If you try V3 and then V2, you will expend either one or two connection cycles, with the average converging towards one as time passes and more people update their servers. If you probe for the right answer and then do it, you will always expend two connection cycles. That is a win how exactly? Like I asked above, why does it have to be done in two connection cycles? I'm assume by connection cycle you are referring to reopening the socket, or...? -- Regards, James William Pye ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
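The trade-off under discussion can be made concrete with a small sketch (hypothetical client-side code, not libpq): the back-off strategy costs one connection attempt per version tried, while an explicit version-list probe would always cost exactly one extra exchange.

```python
def negotiate_by_fallback(server_versions, client_versions=(3, 2)):
    """Try the newest protocol version first and back off one at a time;
    returns the agreed version and how many connection cycles it took."""
    attempts = 0
    for version in client_versions:       # newest known version first
        attempts += 1
        if version in server_versions:
            return version, attempts
    raise ConnectionError("no common protocol version")

# A current server agrees on the first try; an older one costs a retry:
assert negotiate_by_fallback({3, 2}) == (3, 1)
assert negotiate_by_fallback({2}) == (2, 2)
```

This is Tom's point in miniature: as servers are upgraded, the average attempt count of fallback converges toward one, which is why he sees little to gain from the probe in the common case.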
Re: [HACKERS] process crash when a plpython function returns
On Mon, 2005-06-27 at 08:12 -0600, Michael Fuhr wrote: also in this context it would be helpful if sys.defaultencoding would be set to the database encoding so strings get encoded to utf-8 when postgres works in unicode mode rather than the default encoding of ascii. This could avoid most of the PyObject_Str() exceptions in the first place. I haven't looked at doing that yet and probably won't before feature freeze. Gerrit van Dyk has expressed an interest in hacking on PL/Python (he recently submitted a SETOF patch) so maybe he'll work on it. I have this fixed, for the most part, in PL/Py. What I have done might be a good starting place for someone who wants to get it fixed in core. http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/python/be/src/encoding.c This file makes using PostgreSQL encodings in Python a more friendly experience by setting up some aliases. (u'óäæ'.encode('UNICODE') would work in 8.0.) Also, to set the default encoding used by Python's Unicode strings: PyUnicode_SetDefaultEncoding(PyEncoding_FromPgEncoding(GetDatabaseEncoding())) PyEncoding_FromPgEncoding is defined in encoding.c. It should also be noted that to get the interpreter to read the function code as a specific encoding, one must use, AFAIK, the "# -*- encoding: utf-8 -*-" magic comment. -- Regards, James William Pye
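The alias idea can be illustrated in pure Python. This is a toy mapping, not the real one: the actual table lives in encoding.c and covers many more encodings, with PyEncoding_FromPgEncoding as the C-level equivalent.

```python
# Toy PostgreSQL-encoding-name -> Python-codec-name table (illustrative).
PG_TO_PYTHON_CODEC = {
    "UNICODE": "utf-8",
    "UTF8": "utf-8",
    "LATIN1": "iso8859-1",
    "SQL_ASCII": "ascii",
}

def py_codec(pg_encoding):
    """Rough Python-level analogue of PyEncoding_FromPgEncoding()."""
    return PG_TO_PYTHON_CODEC[pg_encoding.upper()]

# With the alias resolved, PostgreSQL's name for the encoding can be
# used to encode Python strings even though Python itself does not
# recognize 'UNICODE' as a codec name:
data = "óäæ".encode(py_codec("UNICODE"))
```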
Re: [HACKERS] inclusions WAS: Increased company involvement
On Tue, 2005-05-03 at 18:06 -0700, Josh Berkus wrote: 1) If we start including everything that's useful, where do we stop? There are enough pg add-ins to fill a CD -- 200 projects on GBorg and pgFoundry and others elsewhere. And some of them probably conflict with each other. Any decision to include some projects and not others will alienate people and possibly be a mis-evaluation; the libpq++/libpqxx mistake comes to mind. Mmm, just think of it. If enough projects get into core maybe, just maybe, pg could compete with mozilla for the longest build time. Wouldn't that be nice. ;) [snip some stuff that I agree with] With regards to PLs, there is a good argument for having them in core: the volatility of the backend's APIs make it difficult to externally maintain. I know this is the case, first hand. Although, if dynamically loaded extensions are to be allowed, I think that that volatility should be seen as a problem, and not so much as a reason to keep things in the same tree. While I understand that it's likely to be difficult to give interversion [source] compatibility without sacrificing general improvement, I think it would be a good goal to have. ... I asked on IRC and I'm still curious, does PG have a API styling standard/guide? I see formatting and info about error messages, but nothing about function/global/typedef naming. -- Regards, James William Pye ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Output functions with multiple arguments considered
On Sat, 2005-04-30 at 16:17 -0400, Tom Lane wrote: An example that Elein put up yesterday: http://archives.postgresql.org/pgsql-general/2005-04/msg01384.php caused me to realize that type output functions that depend on additional arguments to determine what they are dealing with are fundamentally security holes. It is trivial to crash 8.0's record_out by lying to it about the rowtype of its first argument. I was bitten by this a little while ago when I was running an OidFunctionCall1 (yes, 1) on typoutput functions. Andrew on IRC pointed out that calls to record_out normally use a FunctionCall3, thus showing the reason for the issue. Sometimes junk data in the heap signalled the function to use it instead of the datum tuple's typoid, normally causing a failed cache lookup. I figured it was somehow my fault, but I just couldn't put my finger on it. (Thanks again, Andrew, if you're on here.) I imagine this change will also help save others from that mistake as well.
[HACKERS] pl/Python
Greets, [apologies if an earlier dup gets unswallowed] Well, I've been solo'ing it for over a year now, and it's definitely time to make my presence known to the community and invite others to participate as I can't keep putting it off as I have been. There has always been something else that I wanted to finish up before making any announcements, so I kept putting it off and putting it off. Time to stop that. =) For those of you who are not familiar with what I've been doing, I have been working on a Python procedural language extension for PostgreSQL and, furthermore, making attempts to formalize a specification of Python interfaces to PostgreSQL elements(making a draft doc [relatively ;] soon). It is a PgFoundry project, and I do have some information up on the project site[1]. There are other subprojects as well as the PL, such as layout and exceptions, which should be of interest to other developers working on PostgresSQL related Python modules. The backend part--the procedural language--of the project offers a substantial number of improvements over the plpython that is currently in core[2]. The design, in general, has a significant contrast to the one in core, as it focuses on providing 'interfaces to' rather than 'conversions for'. While this appears to cause a larger code base, it provides a level of flexibility that makes adaptations and improvements trivial in many or most cases. For instance, I already have the necessary infrastructure to support OUT parameters in a way that is _very_ natural(tho, support for 8.1dev is not there yet; busy with other stuff). It, currently, is *not* my hope to eventually put this into core; rather, I think I would like to give a try at setting a good example for extension development. PgFoundry is a good resource, and I think it should be exercised, as it makes a great home for projects like this(gborg is pretty good as well!).(Yes, I'm +1 for, eventually, pushing things out of core. 
;) While this is not so much of a formal announcement of the initiative/project(a formal one will eventually come to -announce and c.l.py), it is an invitation to those interested parties on -hackers to join the python-general list[4], which, in addition to up and coming development releases, I hope will lead to the project becoming more active, as opposed to me just silently hacking away. In addition, for those who might want it, I'm open to the idea of giving admin/commit access--especially to well known devs and members of core, so please don't hesitate to ask if you think you would like to take a relatively active role. If you are concerned about the licensing and the ambiguity of the primary license, don't be. It is--at least meant to be--a gift license, so I have no problems with others using the works under the BSD or MIT licenses; this is explicitly stated/granted here[3], and a reference to the authorization to use those substitute terms will eventually make its way into every instrument-stamped source file. (I may restamp files with the BSDL, as it is the license used by PG, but this one is so short and sweet. ;) If you want to play with what I've got so far, see the quick start[5]. It gives you some neat code, so you can play around in the interactive console in a single-user backend to get a better feel of things. (It should work without much/any trouble, at least on FreeBSD 5.3. If you do have trouble don't hesitate to contact me by e-mail or on freenode where I go by the nick 'jwp'.) I imagine this being the first step that I need to take to begin gathering interest and increasing project activity. Any suggestions on how I can further achieve this goal would be appreciated. 
[1]http://python.projects.postgresql.org [2]http://python.projects.postgresql.org/project/be.html [3]http://python.projects.postgresql.org/license.html [4]http://lists.pgfoundry.org/mailman/listinfo/python-general [5]http://python.projects.postgresql.org/quick.html -- Regards, James William Pye ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Error handling in plperl and pltcl
While your message was directed at Thomas, I think I share Thomas' position; well, for the most part. On Tue, 2004-11-30 at 11:21 -0500, Tom Lane wrote: But I think it's a bogus design, because (a) it puts extra burden on the function author who's already got enough things to worry about, and Simply put, IMO, a subtransaction is != an exception, and shouldn't be treated as one. If the author wishes to worry about transaction management, that is his worry. I don't feel the extra burden is significant enough to justify hacking around in the Python interpreter (assuming that it's possible in the first place). Personally, I think the decision is fine for plpgsql, but not for Python, or just about any other language; plpgsql is a special case, IMO. (b) since you can't support arbitrary rollback patterns, you have to contort the semantics of Savepoint objects with restrictions that are both hard to design correctly and complicated to enforce. Hrm, isn't this what savepoint levels are supposed to do? Impose those restrictions? I'm guessing Postgres doesn't have savepoint levels yet, per the lack of response to my message inquiring about them (well, about a savepoint scoping facility), and poking around xact.h not revealing anything either. I think I hold more of a hold-my-nose stance here than Thomas. I am not sure I want to implement savepoint/rollback restrictions, as I can't help but feel this is something Postgres should handle; not me or any other PL or C function author. plpy being an untrusted language, I *ultimately* do not have control over this. I can only specify things within my code. I *cannot* stop a user from making an extension module that draws interfaces to those routines that may roll back to a savepoint defined by the caller. (Not a great point, as a user could also try to dereference a NULL pointer from an extension module as well.
;) I feel that if I were to implement such restrictions/regulations it would be analogous to a security guard trying to enforce the law, whereas a real police officer is needed. ;-) -- Regards, James William Pye
Re: [HACKERS] Error handling in plperl and pltcl
On Sat, 2004-11-20 at 16:39 -0500, Tom Lane wrote: You're right. You can *not* expose those as user-callable operations in a PL language. Consider for example what will happen if the user tries to roll back to a savepoint that was established outside your function call, or tries to exit the function while still inside a local savepoint. You have to enforce strict nesting of functions and subtransactions; therefore it's a lot easier to present an API that looks like an exception-block construct (per plpgsql), or that just hides the whole deal in the SPI calling interface (as I'm proposing for plperl/pltcl). Hrm, what about a savepoint scoping facility that would be wrapped around calls to [volatile?] functions to explicitly enforce these regulations? [...Poking around the archives a bit...] [Or do I mean savepoint levels?]: http://archives.postgresql.org/pgsql-hackers/2004-07/msg00505.php http://archives.postgresql.org/pgsql-hackers/2004-09/msg00569.php -- Regards, James William Pye
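In outline, such a scoping facility might behave like this — a pure-Python illustration with invented names; the real enforcement would live in the backend's transaction machinery, not in any PL:

```python
class SavepointLevel:
    """Sketch of savepoint 'levels': a savepoint may only be released or
    rolled back from the same level (function invocation) that created
    it, which is exactly the strict nesting Tom describes."""
    def __init__(self):
        self._stack = []          # (level, name) pairs, innermost last
        self._level = 0
    def enter_function(self):     # wrapped around each PL function call
        self._level += 1
    def exit_function(self):
        if any(lvl == self._level for lvl, _ in self._stack):
            raise RuntimeError("function exited with an open savepoint")
        self._level -= 1
    def savepoint(self, name):
        self._stack.append((self._level, name))
    def rollback_to(self, name):
        for i in range(len(self._stack) - 1, -1, -1):
            lvl, n = self._stack[i]
            if n == name:
                if lvl != self._level:
                    raise RuntimeError("savepoint belongs to an outer level")
                del self._stack[i:]   # discard it and everything inside it
                return
        raise KeyError(name)
```

With this guard wrapped around each call, a function can use its own savepoints freely but can neither roll back past its entry point nor leak an open savepoint to its caller — which is the enforcement the plpy author would rather see in Postgres than in each PL.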
Re: [HACKERS] Preventing some SQL commands
On Sun, 2004-11-21 at 16:55 +0100, Thomas Hallgren wrote: In a PL language it's sometimes desirable to prevent execution of some commands. I would like to prevent the commands begin [work or transaction], commit, and rollback, completely, and I would like to force the user to use explicit methods for the savepoint methods. I implemented my own SPI; not for this purpose, but I could [may] use it for that. This, of course, would also allow mapping specific utility commands to my internal methods instead of inhibiting them (tho, I don't do this [yet, perhaps]). I wonder if there's any way to extract the nature of a command from the execution plan returned by SPI_prepare. If not, would it be very difficult to add? It doesn't feel optimal to add a home-brewed parser that parses the statements prior to prepare just to find out if they should be prevented. [snip] CmdType SPI_get_command_type(void* executionPlan) [I see Tom's reply, but considering I already wrote this, here it is anyway.] Hrm, I can't help but think it would be better just to allow fetching/access to the Node tag (e.g. T_TransactionStmt), as opposed to creating a new command type: NodeTag SPI_get_utility_tag(void *execPlan); Well, something like that. I suppose it would somehow need to handle compound queries. Perhaps a filter operation would be a better idea: passing a function pointer like bool (*SPI_UtilityFilter)(NodeTag aStmt); to a void SPI_FilterUtilities(void *execPlan, SPI_UtilityFilter fp), throwing an error if deemed necessary by the pointed-to function. Although, I'm inclined to think that if you require this sort of flexibility you should probably think about writing your own SPI. While a ways from complete/stable, my Python SPI: http://gborg.postgresql.org/project/postgrespy/cvs/co.php/imp/src/query.c?r=HEAD http://gborg.postgresql.org/project/postgrespy/cvs/co.php/imp/src/portal.c?r=HEAD -- Regards, James William Pye
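A sketch of the proposed filter operation's behavior, as a Python stand-in for the C API: the tag name mirrors the backend's NodeTag enum, but the functions themselves are hypothetical.

```python
FORBIDDEN_TAGS = {"T_TransactionStmt"}   # begin/commit/rollback statements

def filter_utilities(plan_tags, is_forbidden=lambda tag: tag in FORBIDDEN_TAGS):
    """Run a caller-supplied predicate over the node tags of a (possibly
    compound) prepared plan, erroring out on any statement the PL wants
    to prevent -- the SPI_FilterUtilities() idea in miniature."""
    for tag in plan_tags:
        if is_forbidden(tag):
            raise PermissionError("command not allowed here: " + tag)
    return plan_tags
```

Modelling the filter over a list of tags also answers the compound-query concern: every statement in the plan gets inspected, not just the first.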
Re: [HACKERS] Error handling in plperl and pltcl
On Fri, 2004-11-19 at 16:58 -0500, Tom Lane wrote: What I think we ought to do is change both PL languages so that every SPI call is executed as a subtransaction. If the call elogs, we can clean up by aborting the subtransaction, and then we can report the error message as a Perl or Tcl error condition, which the function author can trap if he chooses. If he doesn't choose to, then the language interpreter will return an error condition to plperl.c or pltcl.c, and we can re-throw the error. I do this already in my plpy, save the subtransaction handling feature. In plpy, all Postgres ERRORs are caught and transformed into Python exceptions; then, when the interpreter exits with a Python exception, it is transformed back into a Postgres ERROR and raised. I even created a class of Python exceptions for Postgres ERRORs (e.g. raise Postgres.ERROR('msg', code=someErrCode, hint='foo')). (And more specific classes as well, putting errcodes to good use.) I plan (well, am already working on it) to create Python interfaces to Postgres transaction facilities so that the author can start, rollback, and commit subxacts as needed for use/cleanup. Of course, I feel that this is the best way to go AFA subxacts are concerned: leaving the details to the author. I have been playing with RollbackToSavepoint and ReleaseSavepoint, but per Neil's comments on IRC and the fact that I have to annoyingly construct a List containing the savepoint name, I get the feeling that I am not meant to use them. If they are provided for possible use, shouldn't they take a string instead of a List? (Is a List used here to discourage use?) -- Regards, James William Pye
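The error round-trip described here can be sketched as follows. This is illustrative Python only; the actual plpy class and its fields may differ, and the real PL would re-raise via ereport() rather than return a value.

```python
class PostgresERROR(Exception):
    """Stand-in for plpy's Postgres.ERROR: carries the fields of an
    elog report so a trapped backend error can be re-raised losslessly."""
    def __init__(self, message, code=None, hint=None, detail=None):
        super().__init__(message)
        self.code, self.hint, self.detail = code, hint, detail

def run_pl_function(fn):
    """Round-trip sketch: an untrapped exception escaping the function
    is turned back into a backend ERROR (represented here as a tuple)."""
    try:
        return fn()
    except PostgresERROR as e:
        return ("ERROR", str(e), e.code, e.hint)
```

The author can trap PostgresERROR inside the function like any Python exception; only exceptions that escape become backend errors again.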
[HACKERS] Version defines
Greets, Would it be possible to get something along the lines of the attached patch in 8? (major, minor, patch, state version defines) (I tried making them shell vars and giving it to AC_INIT, but it seemed to want a literal, so...) Yes, I know there are other ways to get and define this information, but [something like] this is considerably more convenient, IMO. -- Regards, James William Pye

Index: configure.in
===
RCS file: /projects/cvsroot/pgsql/configure.in,v
retrieving revision 1.383
diff -c -r1.383 configure.in
*** configure.in	25 Oct 2004 00:11:04 -	1.383
--- configure.in	29 Oct 2004 18:21:13 -
***************
*** 26,31 ****
--- 26,35 ----
  AC_PREFIX_DEFAULT(/usr/local/pgsql)
  AC_SUBST(configure_args, [$ac_configure_args])
+ AC_DEFINE_UNQUOTED(PG_VERSION_MAJOR, 8, [PostgreSQL major version])
+ AC_DEFINE_UNQUOTED(PG_VERSION_MINOR, 0, [PostgreSQL minor version])
+ AC_DEFINE_UNQUOTED(PG_VERSION_PATCH, 0, [PostgreSQL patch level])
+ AC_DEFINE_UNQUOTED(PG_VERSION_STATE, beta4, [PostgreSQL version state])
  AC_DEFINE_UNQUOTED(PG_VERSION, $PACKAGE_VERSION, [PostgreSQL version])
  AC_CANONICAL_HOST
Index: src/include/pg_config.h.in
===
RCS file: /projects/cvsroot/pgsql/src/include/pg_config.h.in,v
retrieving revision 1.80
diff -c -r1.80 pg_config.h.in
*** src/include/pg_config.h.in	6 Oct 2004 09:35:22 -	1.80
--- src/include/pg_config.h.in	29 Oct 2004 18:21:13 -
***************
*** 587,592 ****
--- 587,604 ----
  /* Define to the address where bug reports for this package should be sent. */
  #undef PACKAGE_BUGREPORT
+ /* Define to the major version */
+ #undef PG_VERSION_MAJOR
+ 
+ /* Define to the minor version */
+ #undef PG_VERSION_MINOR
+ 
+ /* Define to the patch level */
+ #undef PG_VERSION_PATCH
+ 
+ /* Define to version descriptor */
+ #undef PG_VERSION_STATE
+ 
  /* Define to the full name of this package. */
  #undef PACKAGE_NAME
Re: [HACKERS] Version defines
On Sun, 2004-10-31 at 08:02, Tom Lane wrote: This has been proposed and rejected before, mainly on the grounds that it would encourage bad programming practices. I admit that I am probably practicing this bad programming in a few places in my source, and shame on me for it. I have hoped to tighten it up a bit later, but it is convenient for the time being.

At compile time, you should be checking the specific feature you care about, Well, for one of my uses, it is not a feature check. My PL loads a Python extension module whose path is dependent on the major and minor version of the PostgreSQL installation that the PL was compiled against. So I construct the module path string based on the major and minor at compile time.

If this is the stance that the group has, that is fine. For now, I will continue my shameful practice of parsing up pg_config --version and defining the components for use in my source. (; -- Regards, James William Pye
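The "shameful practice" of deriving version components from `pg_config --version` can be sketched like this. The parsing assumes output of the era's form, e.g. "PostgreSQL 8.0.0beta4" or "PostgreSQL 7.4.2"; run against a real installation you would feed it the captured command output instead of a literal.

```python
import re

def parse_pg_version(version_output):
    """Split `pg_config --version` output into (major, minor, patch, state)."""
    m = re.search(r'PostgreSQL (\d+)\.(\d+)(?:\.(\d+))?(\w+)?', version_output)
    if m is None:
        raise ValueError("unrecognized version string: %r" % version_output)
    major, minor = int(m.group(1)), int(m.group(2))
    patch = int(m.group(3)) if m.group(3) else 0
    state = m.group(4) or ""
    return major, minor, patch, state

print(parse_pg_version("PostgreSQL 8.0.0beta4"))   # (8, 0, 0, 'beta4')
print(parse_pg_version("PostgreSQL 7.4.2"))        # (7, 4, 2, '')
```

The tuple can then be emitted as -D defines (or a generated header) by the PL's build, which is exactly what the proposed PG_VERSION_MAJOR/MINOR defines would have made unnecessary.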
Re: [HACKERS] Version defines
On Sun, 2004-10-31 at 10:49, Tom Lane wrote: Er ... can't you just keep it in pkglibdir and refer to it via $libdir? Given that 8.0 now supports relocatable installations, I'd think it best to avoid hardwiring any paths at compile time. Hmm.. I think it would be best to keep Python [extension] modules in Python's site-packages. AFA hardwiring is concerned, I will probably make it a GUC variable in 8.0 that will default to how I currently hardwire it. -- Regards, James William Pye
Re: [HACKERS] CVS configure failure
On Tue, 2004-09-21 at 17:13, Neil Conway wrote: Should this be documented in the installation instructions? I think so. I figure just a mention should be enough. (Also add some productname tags around 'Python'.) -- Regards, James William Pye

Index: installation.sgml
===
RCS file: /projects/cvsroot/pgsql-server/doc/src/sgml/installation.sgml,v
retrieving revision 1.203
diff -c -r1.203 installation.sgml
*** installation.sgml	20 Jun 2004 01:32:46 -	1.203
--- installation.sgml	22 Sep 2004 05:03:30 -
***************
*** 170,181 ****
     <listitem>
      <para>
       To build the PL/Python server programming language, you need a
!      Python installation, including the header files. Since
!      PL/Python will be a shared library, the
       <indexterm><primary>libpython</primary></indexterm>
       <filename>libpython</filename> library must be a shared library
       also on most platforms. This is not the case in a default
!      Python installation.
      </para>
--- 170,181 ----
     <listitem>
      <para>
       To build the PL/Python server programming language, you need a
!      <productname>Python</productname> installation with the header files and
!      the distutils module. Since PL/Python will be a shared library, the
       <indexterm><primary>libpython</primary></indexterm>
       <filename>libpython</filename> library must be a shared library
       also on most platforms. This is not the case in a default
!      <productname>Python</productname> installation.
      </para>
Re: [HACKERS] PG_exception_stack
On Wed, 2004-09-15 at 09:04, G u i d o B a r o s i o wrote: [EMAIL PROTECTED] postgres]$ createlang plpgsql tech_mis createlang: language installation failed: ERROR: could not load library /usr/local/pgsql/lib/plpgsql.so: /usr/local/pgsql/lib/plpgsql.so: undefined symbol: PG_exception_stack It looks like your build is a bit fubar'd. I don't know why exactly, but it can't find that symbol, so I'd suggest a gmake distclean, configure, and gmake all install. (hrm, might also want to make sure that your installations aren't crossing paths somewhere here, if you have multiple versions installed) (Note that I just did it with my dev build without trouble) -- Regards, James William Pye
Re: [HACKERS] banner vs version
On Wed, 2004-09-15 at 09:15, G u i d o B a r o s i o wrote: [EMAIL PROTECTED] local]$ psql template1 Welcome to psql 8.0.0beta2, the PostgreSQL interactive terminal.

template1=# select version();
                           version
------------------------------------------------------------
 PostgreSQL 7.4.2 on i686-pc-linux-gnu, compiled by GCC 2.96
(1 row)

Is this ok? Banner version, 8 beta2, version() returns 7.4.2.

I think this explains the problem that you e-mailed in earlier. You probably tried to install an 8.0beta2 plpgsql into a 7.4.2 backend. AFAIK, this is fine; 8.0 psql clients should be able to connect to 7.4 backends (; -- Regards, James William Pye
Re: [HACKERS] x86_64 configure problem
On Sat, 2004-09-11 at 14:05, Joe Conway wrote: Perhaps. The first error message seems clear enough to me, but if you want to send me the necessary distutils test, I can incorporate it. Any other opinions out there?

AC_MSG_CHECKING([for Python distutils module])
if ${PYTHON} 2>&- -c 'import distutils'
then
    AC_MSG_RESULT(yes)
else
    AC_MSG_RESULT(no)
    AC_MSG_ERROR([distutils module not found])
fi

That should work. While I'm of the opinion that installations without distutils are antiquated, I see no reason why we couldn't check for it anyways. (Gaetano requested it, so there would probably be others who would appreciate it.) -- Regards, James William Pye
Re: [HACKERS] x86_64 configure problem
On Thu, 2004-09-09 at 11:17, Joe Conway wrote: On an x86_64 machine I'm finding that I cannot configure --with-python without the attached patch. Undoubtedly there is a better way to fix this -- any suggestions?

python_configdir=`${PYTHON} -c "from distutils.sysconfig import get_python_lib as f; import os; print os.path.join(f(plat_specific=1, standard_lib=1),'config')"`

That should work, regardless of the lib directory that Python is installed to. (at least it works on my system):

[EMAIL PROTECTED]:~ % py -c "from distutils.sysconfig import get_python_lib as f; import os; print os.path.join(f(plat_specific=1,standard_lib=1),'config')"
/usr/local/lib/python2.3/config

-- Regards, James William Pye
Re: [HACKERS] x86_64 configure problem
On Thu, 2004-09-09 at 12:28, James William Pye wrote: That should work, regardless of the lib directory that Python is installed to. Looking at get_python_lib(), I'm not so sure that I'm correct:

if os.name == "posix":
    libpython = os.path.join(prefix, "lib", "python" + get_python_version())

Although, I'm getting a bit confused. I just compiled and installed Python (CVS) configured as: ./configure --prefix=/usr/dev --libdir=/usr/dev/pylib But it didn't install anything in /usr/dev/pylib, as one would expect. It just threw everything in /usr/dev/lib. I even gmake'd distclean to make sure there wasn't some configure option cache of some sort. Same thing. ISTM that Python's libdir wasn't meant to be located anywhere other than EPREFIX/lib. -- Regards, James William Pye
Re: [HACKERS] [PATCHES] [BUGS] casting strings to multidimensional arrays yields
On 08/06/04:31/5, Joe Conway wrote: I just reread your comment and thought about it more. I think most of what I said in the last post holds, but in addition part of the problem lies in the fact that we don't implement multidimensional arrays as nested arrays, but rather as one monolithic structure. If we did the former, then 2x0 would be feasible. Investigating, and possibly implementing, multidimensional arrays as a nested structure is another item on my wish list. I'm still not entirely sure how difficult or desirable it is, but my interpretation of SQL99 is that nested arrays is the requirement. +1 This would probably make MD array support in plpy almost trivial, so it is definitely desirable to me. -- Regards, James William Pye pgpnxxoAV7fns.pgp Description: PGP signature
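Why nesting makes a 2x0 array feasible can be shown with plain Python lists, which are a nested representation by construction:

```python
# Nested representation: dimensions are implicit in the structure, so an
# array of two empty arrays is directly representable.
md = [[], []]
assert len(md) == 2 and all(len(row) == 0 for row in md)

# A monolithic representation stores one flat element buffer plus explicit
# dimension counts. A 2x0 shape has zero elements, so the flat buffer alone
# cannot distinguish it from, say, 0x0; the dims must be carried separately.
flat, dims = [], (2, 0)
flat0, dims0 = [], (0, 0)
assert flat == flat0 and dims != dims0
```

This is the crux of the point above: with nesting, shapes like 2x0 fall out for free, whereas the monolithic structure has to special-case them.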
[HACKERS] SRFs &amp; ExecMakeTableFunctionResult
Greets, While I was finishing up SRF support in PL/Py, I noticed that when value-per-call (VPC) is the selected mode for a table function, ExecMakeTableFunctionResult will set rsinfo.isDone to ExprSingleResult each time it loops to fetch another value (when a direct_function_call). This makes the VPC-SRF author set isDone to ExprMultipleResult on _every_ call while returning values, as it will break out if rsinfo.isDone != ExprMultipleResult. Is this the desired behavior? -- Regards, James William Pye
Re: [HACKERS] SRFs &amp; ExecMakeTableFunctionResult
Seems reasonable to me. A SRF function really ought to explicitly set isDone on every call anyway. Aye, it seems reasonable, but a bit inconsistent with the effect of ExecMakeFunctionResult, which does the same thing but bases the continuity of the result gathering on the isDone pointer: it sets isDone to ExprMultipleResult whenever it is not ExprEndResult, thus making it continue until rsinfo.isDone is explicitly set to ExprEndResult, unlike table functions, which will end on either SingleResult or EndResult. (Around lines #941-984 in execQual.c) Is this inconsistency desired? My confusion came in when I implemented SRF support that worked for non-table SRFs, and then table functions didn't work because I wasn't setting isDone to MultipleResult on every call. -- Regards, James William Pye
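The inconsistency under discussion can be modeled in Python. The driver below mirrors ExecMakeTableFunctionResult's loop, which keeps fetching only while isDone is ExprMultipleResult; an author who leaves isDone at ExprSingleResult gets a single row, which is exactly the confusion reported here. This is an illustration of the control flow, not backend code.

```python
ExprSingleResult, ExprMultipleResult, ExprEndResult = range(3)

def srf_forgets_multiple(state):
    # Author never sets ExprMultipleResult while returning values.
    if state["i"] < 3:
        state["i"] += 1
        return state["i"], ExprSingleResult
    return None, ExprEndResult

def srf_sets_multiple(state):
    # Author dutifully sets ExprMultipleResult on every returning call.
    if state["i"] < 3:
        state["i"] += 1
        return state["i"], ExprMultipleResult
    return None, ExprEndResult

def table_function_driver(srf):
    # Like ExecMakeTableFunctionResult: break out unless ExprMultipleResult.
    state, rows = {"i": 0}, []
    while True:
        value, is_done = srf(state)
        if is_done != ExprMultipleResult:
            if is_done != ExprEndResult:
                rows.append(value)   # a lone ExprSingleResult still counts
            break
        rows.append(value)
    return rows

print(table_function_driver(srf_forgets_multiple))  # [1]  (truncated!)
print(table_function_driver(srf_sets_multiple))     # [1, 2, 3]
```

A driver in the ExecMakeFunctionResult style would instead keep looping until ExprEndResult, so the first function would yield all three rows there, which is precisely the asymmetry the post points out.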
Re: Postgres development model (was Re: [HACKERS] CVS comment)
On 08/09/04:32/1, Peter Eisentraut wrote: BitKeeper is not open source, so it's out of the question for most people. Subversion is shockingly unstable. I'm very open for something that replaces CVS, but I'd rather not use any than one of these two. From my casual usage of svn, I haven't noticed any stability issues. If it were shockingly unstable, I would have expected to have had problems with it. Using the ssh tunnel, served on an fbsd jail (yeah, rented from Marc), and connecting with svn client 1.0.4 on my home fbsd 4.10 box.. -- Regards, James William Pye
Re: [HACKERS] try/catch macros for Postgres backend
On 07/28/04:30/3, Tom Lane wrote: In service of the refactoring of error handling that I was talking about a few days ago, I'm finding that there are several places that really ought to catch longjmps and clean up after themselves, instead of expecting that whatever mess they've made will be cleaned up for them when control gets back to PostgresMain(). If we have functions that can catch errors, control might *not* go right back to PostgresMain(), and so throwing random cleanup actions into the sigsetjmp branch there is No Good.

This is wonderful news. plpy for 7.5 will be very nice. :)

This is no big deal since pltcl and plpython already do much the same thing, but I'm starting to think that instead of directly hacking on Warn_restart, it would be good to have some macros hiding the details. The syntax I'm toying with is ... Does anyone have a problem with this macro syntax? The try/catch names are stolen from Java, so I'm figuring they won't terribly surprise any modern programmer, but I'm open to different names if anyone has a better idea.

Sounds good, but perhaps it would be useful for some developers to have the macro syntax broken up into smaller pieces; plpythonu does/did this in plpython.h (gone now), and I rolled my own based on plpython's in plpy. For example:

#define PG_EXC_DECLARE() sigjmp_buf local_sigjmp_buf

#define PG_EXC_SAVE() \
	sigjmp_buf *save_exception_stack = PG_exception_stack; \
	ErrorContextCallback *save_context_stack = error_context_stack

#define PG_EXC_TRAP() (sigsetjmp(local_sigjmp_buf, 1) == 0)
...

You could then use those to make up PG_TRY, PG_CATCH, PG_END_TRY. Although, I'm not too concerned about this, as if someone wants the smaller pieces they could probably just write their own without much difficulty.

Also, the memcpy technique for saving/restoring Warn_restart is what pltcl and plpython currently use, and it works, but it seems unnecessarily inefficient. A further improvement would be to replace Warn_restart by a pointer defined like extern sigjmp_buf *exception_stack_top; Aye, good idea. -- Regards, James William Pye
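The save/trap/restore dance that these macros package up can be sketched in Python: try/except plays the role of sigsetjmp, and the explicit save/restore of a global "exception stack top" is the part the macros exist to hide. All names here are illustrative stand-ins, not the backend's actual globals.

```python
import contextlib

error_context_stack = []      # stand-in for the backend's error_context_stack
exception_stack_top = None    # stand-in for PG_exception_stack / Warn_restart

@contextlib.contextmanager
def pg_try():
    global exception_stack_top
    save_exc = exception_stack_top           # like PG_EXC_SAVE()
    save_ctx = list(error_context_stack)
    try:
        yield                                 # the PG_TRY body runs here
    except Exception:
        # like PG_CATCH: restore the saved state, then re-throw
        error_context_stack[:] = save_ctx
        raise
    finally:
        exception_stack_top = save_exc        # like PG_END_TRY: always restore

cleaned_up = []
try:
    with pg_try():
        error_context_stack.append("about to elog(ERROR)")
        raise RuntimeError("elog(ERROR) longjmps")
except RuntimeError:
    cleaned_up.append(True)   # the caller sees an ordinary error condition

assert error_context_stack == [] and cleaned_up == [True]
```

The point of the structure, in C as here, is that the globals are guaranteed to be restored on both the normal and the error path, so a caught error cannot leave a dangling jump target behind.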
Re: [HACKERS] try/catch macros for Postgres backend
On 07/31/04:30/6, Tom Lane wrote: Is there any actual functional usefulness to that, or is it just to avoid having to reorder existing code to fit into the try/catch paradigm? Both, I imagine. In the case of the former, it *may be* useful for someone to create their own paradigm, which seems like it would tie back into the latter.

I would actually prefer to force people to use the try/catch macros, in the name of code readability and consistent coding style. Ah, you must be a Python programmer at heart! ;)

I had never felt that I understood the way the plpython error-trapping code was structured, until I had to go in and examine it in detail to rewrite it into the try/catch style. Yeah, it wasn't pretty. When I first started on plpy, I hadn't even heard of sigjmp_buf before. Perhaps you can imagine the clumps of hair I had to pull out to finally get a grasp on it.

I think it's now a lot more legible to the casual reader, and that's an important goal for Postgres-related code. Definitely. It is a vast improvement over plpython's more demanding style.

If you're really intent on doing that, you probably can do it no matter what I say about it ;-). I have yet to decide to adopt the new syntax, as I just saw it yesterday, but it is likely that I will, as I do depend on PG, so if it is convenient, I might as well use the tools that it gives me.

But I find it hardly any improvement over direct use of the setjmp API. Well, I find it more aesthetically appealing, and it can be quite nice to have a macro interface to allow the underlying to change willy-nilly (not that it should, but that it can). I'll bet that's the hardly any improvement that you mentioned. -- Regards, James William Pye
Re: [HACKERS] Clean-up callbacks for non-SR functions
On 05/21/04:20/5, Tom Lane wrote: Hm. I do not think you can use an expression context callback for this anyway, because the callback won't be called in the case that query execution is abandoned due to an error. What you'd need for something like that is a global data structure that is traversed at transaction commit/abort and tells you which PyObjects you are holding refcounts on.

Indeed. I was planning to implement something along those lines for that case at a later point in time; I just wasn't sure when the cleanup should occur, and how it was to be triggered at that time. I still have reservations about the temporary leakage, but I can't help but think that a solution to that is likely to cost more than it's worth. I can't imagine anything other than an explicit end-of-usage callback function stored in FmgrInfo that takes fn_extra as an argument, and this would still require ERROR handling...

You might want to look at plpgsql's plpgsql_eoxact() function for something vaguely similar. Aye! Thanks for pointing me at EOXact! It's quite the perfect point to clean up my fn_extra usage. I think that it is quite reasonable to specify that inter-transaction usage of the feature be forbidden, thus further qualifying it as an appropriate cleanup point.

Disqualifying my own idea: as far as free-hooks are concerned, it is clear to me now that they are BAD for my application, as it assumes FmgrInfo is allocated by Postgres's memory manager, and after a quick grep I see that FmgrInfo is statically declared/allocated in many places.. -- Regards, James William Pye
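Tom's suggestion, a global structure traversed at transaction end, can be sketched in Python. Here `held` stands for the registry of objects whose refcounts the PL is holding via fn_extra; all names are hypothetical illustrations of the pattern, not plpy's real interface.

```python
held = {}   # fn_oid -> cached object (a refcount held on the PL's behalf)

def get_cached(fn_oid, build):
    """Analog of stashing compiled state in fn_extra on first call."""
    if fn_oid not in held:
        held[fn_oid] = build()
    return held[fn_oid]

def at_eoxact(committed):
    """Analog of plpgsql_eoxact: drop everything at commit OR abort, so an
    error-aborted transaction cannot leak the held references."""
    held.clear()

compiled = get_cached(554026, lambda: "compiled function body")
assert 554026 in held
at_eoxact(committed=False)     # transaction aborted due to an ERROR
assert held == {}
```

The key property is that cleanup is keyed to transaction end rather than to an expression-context callback, so it runs even when query execution is abandoned by an error, which is exactly the case the callback approach misses.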
Re: [HACKERS] Clean-up callbacks for non-SR functions
in ExecMakeTableFunctionResult ()
#2  0x80f9d47 in ExecReScanNestLoop ()
#3  0x80f3758 in ExecScan ()
#4  0x80f9e0a in ExecFunctionScan ()
#5  0x80efd51 in ExecProcNode ()
#6  0x80eea48 in ExecEndPlan ()
#7  0x80edff4 in ExecutorRun ()
#8  0x8153d56 in PortalRun ()
#9  0x8153c56 in PortalRun ()
#10 0x8150d87 in pg_plan_queries ()
#11 0x815310f in PostgresMain ()
#12 0x8107096 in main ()
#13 0x806d772 in _start ()

NOTE: If I SELECT Composite(); resultinfo is NULL..

-- And finally, an actual SRF: --
CREATE FUNCTION SRF() RETURNS SETOF someTable LANGUAGE plpy AS 'return [[0L]]';

backend> SELECT * FROM SRF();

Breakpoint 1, pl_handler (fcinfo=0xbfbff1f4) at src/pl.c:468
(gdb) print *fcinfo
$11 = {
  flinfo = 0x842631c,
  context = 0x0,
  resultinfo = 0xbfbff1d4,
  isnull = 0 '\000',
  nargs = 0,
  arg = {0 <repeats 32 times>},
  argnull = '\000' <repeats 31 times>
}
(gdb) print *fcinfo->flinfo
$12 = {
  fn_addr = 0x285dc118 <pl_handler>,
  fn_oid = 554026,
  fn_nargs = 0,
  fn_strict = 0 '\000',
  fn_retset = 1 '\001',
  fn_extra = 0x0,
  fn_mcxt = 0x82ab858,
  fn_expr = 0x8313a60
}
(gdb) print *((ReturnSetInfo *) fcinfo->resultinfo)
$13 = {
  type = T_ReturnSetInfo,
  econtext = 0x84261b0,
  expectedDesc = 0x8426258,
  allowedModes = 3,
  returnMode = SFRM_ValuePerCall,
  isDone = ExprSingleResult,
  setResult = 0x0,
  setDesc = 0x0
}
(gdb) bt
#0  pl_handler (fcinfo=0xbfbff1f4) at src/pl.c:468
#1  0x80f13a3 in ExecMakeTableFunctionResult ()
#2  0x80f9d47 in ExecReScanNestLoop ()
#3  0x80f3758 in ExecScan ()
#4  0x80f9e0a in ExecFunctionScan ()
#5  0x80efd51 in ExecProcNode ()
#6  0x80eea48 in ExecEndPlan ()
#7  0x80edff4 in ExecutorRun ()
#8  0x8153d56 in PortalRun ()
#9  0x8153c56 in PortalRun ()
#10 0x8150d87 in pg_plan_queries ()
#11 0x815310f in PostgresMain ()
#12 0x8107096 in main ()
#13 0x806d772 in _start ()

Probably more info than you need/want, but gdb is fun, so here's lots!
8) Regards, James William Pye

Index: execQual.c
===
RCS file: /projects/cvsroot/pgsql-server/src/backend/executor/execQual.c,v
retrieving revision 1.159
diff -u -r1.159 execQual.c
--- execQual.c	10 May 2004 22:44:43 -	1.159
+++ execQual.c	20 May 2004 11:28:20 -
@@ -894,21 +894,20 @@
 	}
 
 	/*
-	 * If function returns set, prepare a resultinfo node for
-	 * communication
+	 * Prepare a resultinfo node for communication. We always do this
+	 * even if not expecting a set result, so that we can pass
+	 * expectedDesc. In the generic-expression case, the expression
+	 * doesn't actually get to see the resultinfo, but set it up anyway
+	 * because we use some of the fields as our own state variables.
 	 */
-	if (fcache->func.fn_retset)
-	{
-		fcinfo.resultinfo = (Node *) &rsinfo;
-		rsinfo.type = T_ReturnSetInfo;
-		rsinfo.econtext = econtext;
-		rsinfo.expectedDesc = NULL;
-		rsinfo.allowedModes = (int) SFRM_ValuePerCall;
-		rsinfo.returnMode = SFRM_ValuePerCall;
-		/* isDone is filled below */
-		rsinfo.setResult = NULL;
-		rsinfo.setDesc = NULL;
-	}
+	fcinfo.resultinfo = (Node *) &rsinfo;
+	rsinfo.type = T_ReturnSetInfo;
+	rsinfo.econtext = econtext;
+	rsinfo.allowedModes = (int) (SFRM_ValuePerCall | SFRM_Materialize);
+	rsinfo.returnMode = SFRM_ValuePerCall;
+	/* isDone is filled below */
+	rsinfo.setResult = NULL;
+	rsinfo.setDesc = NULL;
 
 	/*
 	 * now return the value gotten by calling the function manager,