Re: [HACKERS] Patch: add --if-exists to pg_recvlogical
On Fri, Sep 29, 2017 at 3:06 PM, Peter Eisentraut wrote: > On 9/22/17 15:31, Peter Eisentraut wrote: >> I suggest you create a patch that focuses on the --create part. >> >> You can create a separate follow-on patch for the --start part if you >> wish, but I think that that patch would be rejected. > > I have moved this entry to the next commit fest, awaiting your updated > patch. I'm looking for simple patches to review so doing some gardening. Per Peter's message, moved this to Waiting on author. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [bug fix] Savepoint-related statements terminates connection
On Thu, Sep 7, 2017 at 8:07 PM, Tom Lane wrote: > I've pushed up an attempt at this: > > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b976499480bdbab6d69a11e47991febe53865adc > > Feel free to suggest improvements. Thank you, this helps a lot. Especially since some of the behavior is a bit surprising, for example stopping on error leading to ROLLBACK not being done and the retroactive upgrade of preceding commands in an implicit block to a transaction block when a BEGIN appears. When reading this I also realized that the backend does send responses for every individual query in a multi-query request, it's only libpq's PQexec that throws away the intermediate results and only provides access to the last one. I always thought the backend did that. The docs hinted that it's the frontend ("psql only prints the last one", "PGresult describes the result of the last command") but to assure myself I looked with tcpdump. It's a pity that the underlying protocol has 2 ways to do batching of queries but the official library hides both. I guess I should go review the "Batch/pipelining support for libpq" patch rather than complaining.
Re: [HACKERS] [bug fix] Savepoint-related statements terminates connection
On Mon, Sep 4, 2017 at 4:15 PM, Tom Lane wrote: > Also, the main thing that we need xact.c's involvement for in the first > place is the fact that implicit transaction blocks, unlike regular ones, > auto-cancel on an error, leaving you outside a block not inside a failed > one. So I don't exactly see how savepoints would fit into that. I think this hits the nail on the head and should have a place in the official docs as I now realize I didn't grasp this distinction before I read this. My mental model was always "sending a bunch of semicolon separated queries without BEGIN/COMMIT/ROLLBACK; in one PQexec is like sending them one by one preceeded by a BEGIN; and followed by a COMMIT; except you only get the response from the last one". Also, explain what happens when there are BEGIN/ROLLBACK/COMMIT inside that multiquery string, that's still not completely clear to me and I don't want to reverse engineer it from your patch. > Now admittedly, the same set of issues pops up if one uses an > explicit transaction block in a multi-query string: > > begin\; insert ...\; savepoint\; insert ...\; release savepoint\; insert > ...\; commit; According to my mental model described above, this would be exactly the same as without the begin; and commit; which is not the case so I think the distinction is worth explaining. I think the lack of a more detailed explanation about the stuff above confuses *a lot* of people, especially newcomers, and the confusion is only increased by what client drivers do on top (like issuing implicit BEGIN if configured in various modes specified by language-specific-DB-independent specs like Python's DBAPI or Java's JDBC) and one's background from other DBs that do it differently. Speaking of the above, psql also doesn't explicitly document how it groups lines of the file it's executing into PQexec calls. See below for a personal example of the confusions all this generates. I also encountered this FATAL a month ago in the context of "we have some (migration schema) queries in some files and want to orchestrate running them for testing". Initially we started with calling psql but then we needed some client side logic for some other stuff and switched to Python and Psycopg2. We did "read the whole file in a Python string" and then call Psycopg2's execute() on that string. Note that Psycopg2 only uses PQexec to issue queries. We had some SAVEPOINT statements in the file which lead to the backend stopping and the next Psycopg2 execute() on that connection saying Connection closed. It was already confusing why Psycopg2 behaves differently than psql (because we were issuing the whole file in one PQexec vs. psql splitting on ; and issuing multiple PQexecs and SAVEPOINTs working there) and the backend stopping only added to that confusion. Add on top of that "Should we put BEGIN; and COMMIT; in the file itself? Or is a single Psycopg2 execute() enough to have this schema migration be applied transactionally? Is there a difference between the two?". I searched the docs for existing explanations of multiquery strings and found these references but all of them are a bit hand wavy: - psql's reference explaining -c - libpq's PQexec explanation - the message flow document in the FE/BE protocol description -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] strange case of kernel performance regression (4.3.x and newer)
On Wed, Oct 5, 2016 at 8:18 AM, Tomas Vondra wrote: > Over the past couple of days I've been doing a bit of benchmarking for the > "group clog" patch [1], and I've ran into what I suspect might be a fairly > serious performance regression on newer kernels (essentially 4.3.0 and > newer). I got to a point where I need help with further investigation, > testing on other systems etc. > > The workload tested in the patch [1] is quite simple - a transaction with 3 > x SELECT FOR UPDATE queries and 2 x SAVEPOINT on unlogged tables. The > results (average tps from 5 x 5 minute runs, for 32 and 64 clients) on > multiple kernels look like this: > > kernel 32 64 >- > 3.19.8 4852459291 > 4.1.33 4719359574 > 4.2.84890159877 > 4.3.03218738970 > 4.3.63188938815 > 4.4.03194637702 > 4.4.23 3149837724 > 4.5.53153137351 > 4.7.63285938490 Just stumbled upon this old thread. Since the regression is very clear and reproduceable and on a vanilla kernel I don't think you need more testing. The kernel devs or at least Linus take regressions (including performance) very seriously, even if it's theoretically not the kernel's fault (so in this case regardless what Postgres is doing). I think you should just report it to lkml. If you have time to do a bisection between 4.2.8 and 4.3.0 and find the offending commit that would be even better, then email the author and CC lkml and Linus, that should be enough to get it fixed. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: Implement failover on libpq connect level.
On Tue, Nov 22, 2016 at 8:38 AM, Tsunakawa, Takayuki wrote: > transaction_read_only=on does not mean the standby. As the manual article on > hot standby says, they are different. > I'm afraid that if the current patch is committed, you will lose interest in > the ideal solution. I agree with Robert that lacking the ideal solution shouldn't block a good enough solution today. But I find the current patch as it stands too confusing to pass the "good enough" bar. > Then if the current patch is out as v10, there would be a concern about > incompatibility when we pursue the ideal solution in a later release. That > is, "should we continue to report > that this server is standby even if it's > actually a primary with transaction_read_only is on, to maintain > compatibility with the older release." Agreed. I am worried that this will end up being a wart: target_server_type=primary doesn't really mean primary it means that by default the session is not read only which by the way is usually the case for primaries but that can be changed. > If you want to connect to a server where the transaction is read-only, then > shouldn't the connection parameter be something like > "target_session_attrs=readonly"? That represents exactly what the code does. FWIW I find this to be a reasonable compromise. To keep the analogy with the current patch it would be more something like "target_session_attrs=read_write|any" and the docs could point out "by the way, if you didn't tweak default_transaction_read_only read_write will only pick a primary so this can be used for failover". The functionality is the same but changing the name removes the ambiguity in the semantics and leaves the door open for a future real primary/replica functionality which includes logical replication with whatever semantics are deemed appropriate then. We do lose pgjdbc compatibility but the current patch doesn't have that 100% anyway: in pgjdbc it's called targetServerType and it accepts master instead of primary and it also accepts slave and preferSlave I notice at https://github.com/pgjdbc/www/blob/master/documentation/head/connect.md that pgjdbc says "The master/slave distinction is currently done by observing if the server allows writes" which, due to the "currently" part seems to acknowledge this is not ideal and leaves the door open for changes in semantics but I don't think changing semantics is going to fly for Postgres itself. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: Implement failover on libpq connect level.
On Wed, Nov 16, 2016 at 2:50 PM, Robert Haas wrote: > Hmm, let's go back to the JDBC method, then. "show > transaction_read_only" will return true on a standby, but presumably > also on any other non-writable node. You could even force it to be > true artificially if you wanted to force traffic off of a node, using > ALTER {SYSTEM|USER ...|DATABASE ..} SET default_transaction_read_only > = on > > I think that would address Alvaro's concern, and it's nicer anyway if > libpq and JDBC are doing the same thing. Not sure I agree that using this is a good idea in the first place. But if we end up using this, I really think the docs should be very explicit about what's implemented and not just say master/any. With the master/any docs in the patch I would be *very* surprised if a master is skipped only because it was configured with default_transaction_read_only = on. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: Implement failover on libpq connect level.
On Tue, Nov 15, 2016 at 3:42 PM, Alvaro Herrera wrote: > FWIW I'm not sure "primary" is the right term here either. I think what > we really want to know is whether the node can accept writes; maybe > "pg_is_writable_node". This made me think of another complication: what about cascading replication where the middle node is both a master and a slave? You'd probably not want to fallback to the middle node even though it is a master for its downstream. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: Implement failover on libpq connect level.
On Tue, Nov 15, 2016 at 5:58 PM, Robert Haas wrote: > On Tue, Nov 15, 2016 at 9:42 AM, Alvaro Herrera > wrote: >> I would rather come up with something that works in both cases that we >> can extend internally later, say pg_is_primary_node() or something like >> that instead; and we implement it initially by returning the inverse of >> pg_is_in_recovery() for replicated non-logical flocks, while we figure >> out what to do in logical replication. Otherwise it will be harder to >> change later if we embed it in libpq, and we may be forced into >> supporting nonsensical situations such as having pg_is_in_recovery() >> return true for logical replication primary nodes. > > I don't think we'll be backed into a corner like that, because we can > always make this contingent on server version. libpq will have that > available. But even with server version checking code, that code will be inside libpq so there will be old libpq versions in the field that won't know the proper query to send to new server versions. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Add TAP tests for pg_dump
On 5/9/16, Stephen Frost wrote: >> And what if the tests are buggy? New test suites should go through a >> CF first I think for proper review. And this is clearly one. > > They still won't result in data loss, corruption, or other direct impact > on our users, even if they are buggy. They also won't destablize the > code base or introduce new bugs into the code. Yes, but they could make things worse long term. For example if introduce intermittent failures or if they're hard to understand or if they test internals too deeply and don't let those internals evolve because the tests break. All of them reasons why it's good that they're reviewed. > There is pretty clearly a lack of consensus on the question about adding > new tests. > What *should* we be doing right now? Reviewing code and testing the > system is what I had thought but perhaps I'm wrong. To the extent that > we're testing the system, isn't it better to write repeatable tests, > particularly ones which add code coverage, than one-off tests that only > check that the current snapshot of the code is operating correctly? I also think that testing *and* keeping those tests for the future is more valuable than just testing. But committers can just push their tests while non committers need to wait for the next CF to get their new tests committed. And committers pushing tests means they bypass the review process which, as far as I know, doesn't happen for regular patches: committers usually only directly commit small fixes not lots of new (test) code. So, crazy idea: what about test only CommitFests in between now and the release? The rules would be: only new tests and existing test improvements are allowed. For the rest, the CF would be the same as usual: have an end date, committers agree to go through each patch and commit or return it with feedback etc. Stephen would have added his tests to the upcoming June test CF, Michael would have reviewed them and maybe add his own and so on. This does create work for committers (go through the submitted patches) but acts as an incentive for test writers. Want your test code committed? Well, there will be a CF very soon where it will get committer attention so better write that test. It also gives a clear date after which test churn also stops in order to not destabilize the buildfarm right before a release. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On 3/1/16, Pavel Stehule wrote: >> I though about it before and I prefer variant with possibility to enter >> message as keyword parameter. That's also ok, but indeed with a check that it's not specified twice which I see you already added. > I merged your patches without @3 (many thanks for it). Instead @3 I > disallow double message specification (+regress test) Great, welcome. Ran the tests for plpython-enhanced-error-06 again on 2.4 - 2.7 and 3.5 versions and they passed. I see the pfree you added isn't allowed on a NULL pointer but as far as I see message is guaranteed not to be NULL as dgettext never returns NULL. I'll mark this Ready for Committer. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On 2/26/16, Pavel Stehule wrote: > Sending updated version I did some changes on top of your last version as that was easier than commenting about them, see attached. 0001 and 0005 are comment changes. 0002 is really needed, without it the tests fail on Python2.4. 0004 removes more code related to the unused position. 0003 is the most controversial. It removes the ability to pass message as keyword argument. My reasoning was that keyword arguments are usually optional and configure extra aspects of the function call while message is required and fundamental so therefore it should be positional. If you allow it as keyword as well, you have to deal with the ambiguity of writing plpy.info('a message', message='a keyword arg message, does this overwrite the first one or what?'). For the code with my patches on top on I ran the PL/Python tests for 2.4, 2.5, 2.6, 2.7 and 3.5. Everything passed. Can you have a look at the patches, fold the ones you agree with on top of yours and send the final version? With that I think this will be Ready for Committer. 0001-Comment-improvements.patch Description: binary/octet-stream 0002-Fix-tests-for-Python-2.4.patch Description: binary/octet-stream 0003-Pass-message-only-as-positional-argument.patch Description: binary/octet-stream 0004-Get-rid-of-setting-unused-position.patch Description: binary/octet-stream 0005-Improve-comments.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On Fri, Feb 19, 2016 at 9:41 PM, Pavel Stehule wrote: > It looks like good idea. Last version are not breaking compatibility - and > I think so it can works. > > I wrote the code, that works on Python2 and Python3 Hi, I've attached a patch on top of yours with some documentation improvements, feel free to fold it in your next version. I think the concept is good. We're down to code level changes. Most of them are cosmetical, misleading comments and so on but there are some bugs in there as far as I see. In object_to_string you don't need to test for Py_None. PyObject_Str will return NULL on failure and whatever str() returns on the underlying object. No need to special case None. In object_to_string, when you're already in a (so != NULL) block, you can use Py_DECREF instead of XDECREF. object_to_string seems buggy to me: it returns the pointer returned by PyString_AsString which points to the internal buffer of the Python object but it also XDECREFs that object. You seem to be returning a pointer to freed space. get_string_attr seems to have the same issue as object_to_string. In PLy_get_error_data query and position will never be set for plpy.Error since you don't offer them to Python and therefore don't set them in PLy_output_kw. So I think you should remove the code related to them, including the char **query, int *position parameters for PLy_get_error_data. Removing position also allows removing get_int_attr and the need to exercise this function in the tests. You're using PLy_get_spi_sqlerrcode in PLy_get_error_data which makes the spi in the name unsuitable. I would rename it to just PLy_get_sqlerrcode. At least the comment at the top of PLy_get_spi_sqlerrcode needs to change since it now also extracts info from Error not just SPIError. /* set value of string pointer on object field */ comments are weird for a function that gets a value. But those functions need to change anyway since they're buggy (see above). The only change in plpy_plpymodule.h is the removal of an empty line unrelated to this patch, probably from previous versions. You should undo it to leave plpy_plpymodule.h untouched. Why rename PLy_output to PLy_output_kw? It only makes the patch bigger without being an improvement. Maybe you also have this from previous versions. In PLy_output_kw you don't need to check message for NULL, just as sv wasn't checked before. In PLy_output_kw you added a FreeErrorData(edata) which didn't exist before. I'm not familiar with that API but I'm wondering if it's needed or not, good to have it or not etc. In PLy_output_kw you didn't update the "Note: If sv came from PyString_AsString(), it points into storage..." comment which still refers to sv which is now deleted. In PLy_output_kw you removed the "return a legal object so the interpreter will continue on its merry way" comment which might not be the most valuable comment in the world but removing it is still unrelated to this patch. In PLy_output_kw you test for kw != NULL && kw != Py_None. The kw != NULL is definitely needed but I'm quite sure Python will never pass Py_None into it so the != Py_None isn't needed. Can't find a reference to prove this at the moment. Some more tests could be added to exercise more parts of the patch: - check that SPIError is enhanced with all the new things: schema_name, table_name etc. - in the plpy.error call use every keyword argument not just detail and hint: it proves you save and restore every field correctly from the Error fields since that's not exercised by the info call above which does use every argument - use a wrong keyword argument to see it gets rejected with you error message - try some other types than str (like detail=42) for error as well since error goes on another code path than info - a test exercising the "invalid sqlstate" code 0001-Doc-improvements.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: make NOTIFY list de-duplication optional
On Sat, Feb 20, 2016 at 2:00 PM, Filip Rembiałkowski wrote: > I was stuck because both syntaxes have their ugliness. NOTIFY allows the > payload to be NULL: > NOTIFY chan01; > > How would this look like in "never" mode? > NOTIFY chan01, NULL, 'never'; -- seems very cryptic. The docs say: "The information passed to the client for a notification event includes the notification channel name, the notifying session's server process PID, and the payload string, which is an empty string if it has not been specified." So a missing payload is not a SQL NULL but an empty string. This means you would have: NOTIFY chan01; NOTIFY chan01, ''; -- same as above NOTIFY chan01, '', 'maybe'; -- same as above NOTIFY chan01, '', 'never'; -- send this all the time Seems ok to me. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: make NOTIFY list de-duplication optional
On 2/9/16, Tom Lane wrote: > FWIW, I think it would be a good thing if the NOTIFY statement syntax were > not remarkably different from the syntax used in the pg_notify() function > call. To do otherwise would certainly be confusing. So on the whole > I'd go with the "NOTIFY channel [ , payload [ , mode ] ]" option. I'm quite interested in getting this addressed in time for 9.6 as I'll be using NOTIFY extensively in a project and I agree with Craig that the deduplication is frustrating both because you sometimes want every event and because it can apparently cause O(n^2) behaviour (which I didn't know before this thread). If another use case for suppressing deduplication is needed, consider publishing events like "inserted tuple", "deleted tuple" from triggers and a transaction that does "insert, delete, insert" which the client then sees as "insert, delete, oops nothing else". Tom's proposal allows for more flexible modes than just the ALL and DISTINCT keywords and accommodates the concern that DISTINCT will lead to bug reports about not really being distinct due to savepoints. Android has a similar thing for push notifications to mobile devices which they call collapse: https://developers.google.com/cloud-messaging/concept-options, search for collapse_key. So I propose NOTIFY channel [ , payload [ , collapse_mode ] ] with collapse mode being: * 'never' ** Filip's proposed behaviour for the ALL option ** if specified, every notification is queued regardless what's in the queue * 'maybe' ** vague word allowing for flexibility in what the server decides to do ** current behaviour ** improves performance for big transactions if a row trigger creates the same payload over and over one after the other due to the current optimization of checking the tail of the list ** has performance problems O(n^2) for big transactions with different payloads *** the performance problems can be addressed by a different patch which uses a hash table, or decides to collapse less aggressively (Tom's check last 100 idea), or whatever else *** in the meantime the 'never' mode acts as a good workaround In the future we might support an 'always' collapse_mode which would really be always, including across savepoints. Or an 'only_inside_savepoints' which guarantees the current behaviour. Filip, do you agree with Tom's proposal? Do you plan to rework the patch on these lines? If you are I'll try to review it, if not I could give it a shot as I'm interested in having this in 9.6. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On 2/18/16, Pavel Stehule wrote: > it doesn't look badly. Is there any possibility how to emulate it with > Python2 ? What do you think about some similar implementation on Python2? The example code I gave works as is in Python2. The Python3 keyword only arguments are only syntactic sugar. See https://www.python.org/dev/peps/pep-3102 for the details. But, as the PEP notes, def f(a, b, *, key1, key2) is similar to doing this which also works in Python2 def f(a, b, *ignore, key1, key2): if ignore: raise TypeError('too many positional arguments') For our case, we want to accept any number of positional arguments due to compatibility so we don't need or want the check for 'too many positional arguments'. Note that in both Python 2 and 3, invoking f(1, 2, key1='k1', key2='k2') is just syntactic sugar for constructing the (1, 2) tuple and {'key1': 'k1', 'key2': 'k2'} dict and passing those to f which then unpacks them into a, b, key1 and key2. You see that reflected in the C API where you get PyObject* args and PyObject* kw and you unpack them explicitly with PyArg_ParseTupleAndKeywords or just use tuple and dict calls to peek inside. What you loose by not having Python3 is that you can't use PyArg_ParseTupleAndKeywords and tell it that detail and so on are keywork only arguments. But because we don't want to unpack args we probably couldn't use that anyway even if we wouldn't support Python2. Therefore you need to look inside the kw dictionary manually as my example shows. What we could also do is check that the kw dictionary *only* contains detail, hint and so on and raise a TypeError if it has more things to avoid silently accepting stuff like: plpy.error('message', some_param='abc'). In Python3 PyArg_ParseTupleAndKeywords would ensure that, but we need to do it manually. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On Wed, Feb 17, 2016 at 3:32 PM, Pavel Stehule wrote: >> Python 3 has keyword only arguments. It occurs to me they're exactly >> for "optional extra stuff" like detail, hint etc. >> Python 2 doesn't have that notion but you can kind of fake it since >> you get an args tuple and a kwargs dictionary. > > > I prefer a possibility to use both ways - positional form is shorter, > keywords can help with some parameters. > > But I cannot to imagine your idea, can you show it in detail? Sure, what I mean is: plpy.error('msg') # as before produces message 'msg' plpy.error(42) # as before produces message '42', including the conversion of the int to str plpy.error('msg', 'arg 2 is still part of msg') # as before, produces message '('msg', 'arg2 is still part of msg')' # and so on for as many positional arguments, nothing changes # I still think allowing more than one positional argument is unfortunate but for compatibility we keep allowing more # to pass detail you MUST use keyword args to disambiguate "I really want detail" vs. "I have argument 2 which is part of the messsage tuple for compatibility" plpy.error('msg', 42, detail='a detail') # produces message '('msg', 42)' and detail 'a detail' plpy.error('msg', detail=77) # produces message 'msg' and detail '77' so detail is also converted to str just like message for consistency # and so on for the others plpy.error('msg', 42, detail='a detail', hint='a hint') plpy.error('msg', 42, schema='sch') Only keyword arguments are treated specially and we know no existing code has keyword arguments since they didn't work before. Implementation wise, it's something like this but in C: def error(*args, **kwargs): if len(args) == 1: message = str(args[0]) else: message = str(args) # fetch value from dictionary or None if the key is missing detail = kwargs.pop('detail', None) hint = kwargs.pop('hint', None) # use message, detail, hint etc. to raise exception for error and fatal/call ereport for the other levels Is it clear now? What do you think? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
After I unzip the patch it doesn't apply. patch says: patch: Only garbage was found in the patch input. It's a combined diff, the git-diff manual says this about it: Chunk header format is modified to prevent people from accidentally feeding it to patch -p1. Combined diff format was created for review of merge commit changes, and was not meant for apply. Thanks for doing the test changes. It definitely shows the change is big. The tests at least were heavily relying on both the str conversion and on passing more than one argument. Sad face. You're going to hate me but seeing this I changed my mind about 5, requiring all those extra str calls is too much change in behaviour. I was going to propose passing everything through str (so message, detail, hint but also schema, table) but thinking about it some more, I think I have something better. Python 3 has keyword only arguments. It occurs to me they're exactly for "optional extra stuff" like detail, hint etc. Python 2 doesn't have that notion but you can kind of fake it since you get an args tuple and a kwargs dictionary. What about keeping the same behaviour for multiple positional arguments (single one -> make it str, multiple -> use str of the args tuple) and requiring users to pass detail, hint only as keyword arguments? Code wise it would only mean adding PyObject* kw to PLy_output and adding some code to extract detail, hint etc. from kw. This would also allow getting rid of the GUC since backward compatibility is fully preserved. Again, sorry for all the back and forth but it still feels like we haven't nailed the design to something satisfactory. All the tests you needed to change are a hint towards that. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On Tue, Feb 16, 2016 at 5:18 PM, Catalin Iacob wrote: > We have > elog_test_legacy to test elog_test_legacy=true, the rest of the tests > should use legacy_custom_exception=false. This should have been: We have elog_test_legacy to test legacy_custom_exception=true, the rest of the tests should use legacy_custom_exception=false. I noticed the patch isn't registered in the March CF, please do that. It would be a pity to miss 9.6 because of not registering it in time. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On Sat, Feb 13, 2016 at 4:26 PM, Pavel Stehule wrote: > I am sending new version. Current version does: Hello, I had a look and I like the big picture. Tests fail when built against Python 3.5 with stuff like this in regression.diffs: -ERROR: plpy.Error: stop on error -DETAIL: some detail -HINT: some hint -CONTEXT: Traceback (most recent call last): - PL/Python function "elog_test", line 18, in -plpy.error('stop on error', 'some detail','some hint') -PL/Python function "elog_test" +ERROR: could not convert Python Unicode object to bytes +DETAIL: TypeError: bad argument type for built-in operation +CONTEXT: PL/Python function "elog_test" This is related to the use of PyString_AsString and the changed semantics of that in Python 3 (due to the fact that strings are now Unicode objects and so on). Didn't have time to dig more deeply into the exact cause. Similarly, there are alternative expected test outputs that you didn't update, for example src/pl/plpython/expected/plpython_types_3.out so tests fail on some Python versions due to those as well. > 1. the plpy utility functions can use all ErrorData fields, > 2. there are no new functions, > 3. via GUC plpythonu.legacy_custom_exception we can return previous behave, > 4. only exception Error is raised with natural structure - no composite > value spidata. > 5. fields: message, detail and hint are implicitly translated to string - it > decrease a necessity of legacy mode I disagree that 5. is a good idea. I think we should just treat message, detail and hint like the other ones (schema, table etc.). Use s in PyArg_ParseTupleAndKeywords and let the user explicitly cast to a string. Explicit is better than implicit. The way you did it you keep part of the weird old interface which used to cast to string for you. We shouldn't keep warts of the old behaviour, especially with the GUC to ask for the old behaviour. By the way, getting rid of object_to_string and its usage in PLy_output_kw removed some "ERROR: could not convert Python Unicode object to bytes" failures in the tests so I'm quite sure that the usage of PyString_AsString is responsible for those. I don't like that you set legacy_custom_exception=true in some existing tests, probably to avoid changing them to the new behaviour. We should trust that the new behaviour is what we want and the tests should reflect that. If it's too much work, remember we're asking users to do the same work to convert their code. We have elog_test_legacy to test elog_test_legacy=true, the rest of the tests should use legacy_custom_exception=false. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On Tue, Feb 2, 2016 at 11:52 PM, Alvaro Herrera wrote: > Robert Haas wrote: >> The eventual committer is likely to be much happier with this patch if >> you guys have achieved consensus among yourselves on the best >> approach. > > Actually I imagine that if there's no agreement between author and first > reviewer, there might not *be* a committer in the first place. Perhaps > try to get someone else to think about it and make a decision. It is > possible that some other committer is able to decide by themselves but I > wouldn't count on it. Pavel and I agree that the backward incompatible behavior is cleaner, but it's backward incompatible. Whether that extra cleanness is worth the incompatibility is never obvious. In this case I think it does. But since my opinion is just my opinion, I was planning to make the committer be that someone else that weighs the issues and makes a decision. I'm new around here and picked this patch to get started due to having Python knowledge and the patch seeming self contained enough. Being new makes me wonder all the time "am I just making life difficult for the patch author or is this idea genuinely good and therefore I should push it forward?". I think more beginners that try to do reviews struggle with this. But, let's try to reach a decision. The existing behaviour dates back to 0bef7ba Add plpython code by Bruce. I've added him to the mail, maybe he can help us with a decision. I'll summarize the patch and explain the incompatibility decision with some illustrative code. The patch is trying to make it possible to call ereport from PL/Python and provide the rich ereport information (detail, hint, sqlcode etc.). There are already functions like plpy.info() but they only expose message and they call elog instead of ereport. See the attached incompat.py. Running it produces: existing behaviour PG elog/ereport with message: 1: hi detail: None hint: None PG elog/ereport with message: ('2: hi', 'another argument') detail: None hint: None PG elog/ereport with message: ('3: hi', 'another argument', 2) detail: None hint: None PG elog/ereport with message: ('4: hi', 'another argument', 2, 'lots', 'of', 'arguments') detail: None hint: None new behaviour PG elog/ereport with message: 1: hi detail: None hint: None PG elog/ereport with message: 2: hi detail: another argument hint: None PG elog/ereport with message: 3: hi detail: another argument hint: 2 Traceback (most recent call last): File "incompat.py", line 43, in info_new('4: hi', 'another argument', 2, 'lots', 'of', 'arguments') TypeError: info_new() takes at most 3 arguments (6 given) I find existing behaviour for 2, 3 and 4 unlike other Python APIs I've seen, surprising and not very useful. If I want to log a tuple I can construct and pass a single tuple, I don't see why plpy.info() needs to construct it for me. And for the documented, single argument call nothing changes. The question to Bruce (and others) is: is it ok to change to the new behaviour illustrated and change meaning for usages like 2, 3 and 4? If we don't want that, the solution Pavel and I see is introducing a parallel API named plpy.raise_info or plpy.rich_info or something like that with the new behaviour and leave the existing functions unchanged. Another option is some compatibility GUC but I don't think it's worth the trouble and confusion. # simulated PG elog def elog(message): ereport(message) # simulated PG ereport def ereport(message, detail=None, hint=None): print 'PG elog/ereport with message: %s detail: %s hint: %s' % (message, detail, hint) # existing code behaves like this: # takes an unlimited number of arguments # doesn't take keyword arguments # makes a string representation of the tuple of all arguments unless that tuple has size 1 in which case it only makes a string representation of the first one def info_existing(*args): if len(args) == 1: # special case added by Peter in 2e3b16 str_to_log = str(args[0]) else: str_to_log = str(args) elog(str_to_log) # and I'm proposing to change it to do this: # take 1 required argument and extra optional arguments for every argument accepted by ereport # accepts keyword arguments # passing too many arguments will get rejected def info_new(message, detail=None, hint=None): ereport(message, detail, hint) print 'existing behaviour'.center(40) info_existing('1: hi') info_existing('2: hi', 'another argument') info_existing('3: hi', 'another argument', 2) info_existing('4: hi', 'another argument', 2, 'lots', 'of', 'arguments') print print 'new behaviour'.center(40) info_new('1: hi') # for the documented single argument case same behaviour as existing info_new('2: hi', 'another argument') info_new('3: hi', 'another argument', 2) info_new('4: hi', 'another argument', 2, 'lots', 'of', 'arguments') -- Sent via pgsql-hackers mailing list (pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On Tue, Feb 2, 2016 at 3:48 PM, Pavel Stehule wrote: > If we decided to break compatibility, then we have to do explicitly. Thid > discussion can continue with commiter, but send path with duplicitly defined > functions has not sense for me. Currently I out of office, so I cannot to > clean it. 4.2 I should to work usually I think I didn't make myself clear so I'll try again. There are 2 options: 1. keep info etc. unchanged and add raise_info etc. 2. change behaviour of info etc. and of course don't add raise_* You already implemented 1. I said I think 2. is better and worth the compatibility break in my opinion. But the committer decides not me. Since 1. is already done, what I propose is: let's finish the other aspects of the patch (incorporate my docs updates and details in Error instead of SPIError) then I'll mark this ready for committer and summarize the discussion. I will say there that option 1. was implemented since it's backward compatible but that if the committer thinks option 2. is better we can change the patch to implement option 2. If you do the work for 2 now, the committer might still say "I want 1" and then you need to do more work again to go back to 1. Easier to just stay with 1 for now until we have committer input. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On Mon, Feb 1, 2016 at 5:37 PM, Pavel Stehule wrote: > Dne 29. 1. 2016 18:09 napsal uživatel "Catalin Iacob" > : >> Looking at the output above, I don't see who would rely on calling >> plpy.error with multiple arguments and getting the tuple so I'm >> actually in favor of just breaking backward compatibility. Note that >> passing multiple arguments isn't even documented. So I would just >> change debug, info, error and friends to do what raise_debug, >> raise_info, raise_error do. With a single argument behavior stays the >> same, with multiple arguments one gets more useful behavior (detail, >> hint) instead of the useless tuple. That's my preference but we can >> leave the patch with raise and leave the decision to the committer. >> > > if breaking compatibility, then raise* functions are useless, and should be > removed. Indeed. I think it's better to change the existing functions and break compatibility instead of adding the raise_ functions. But the committer will decide if that's what should be done. Since you wrote the patch with raise_* I propose you keep it that way for now and let the committer decide. I wrote the doc patch based on raise_* as well. Attached is the doc patch (made on top of your patch). I'll wait for you to combine them and switch to raising Error and then hopefully this is ready for a committer to look at. 0003-Improve-docs.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On Tue, Jan 26, 2016 at 5:42 PM, Pavel Stehule wrote: > I removed from previous patch all OOP related changes. New patch contains > raise_ functions only. This interface is new generation of previous > functions: info, notice, warning, error with keyword parameters interface. I > didn't changed older functions due keeping compatibility. Hi, Even without the OOP changes, the exception classes are still there as the underlying mechanism that error and raise_error use. I looked at the patch and what I don't like about it is that raise_error uses SPIError to transport detail, hint etc while error uses Error. This is inconsistent and confusing. Take this example: CREATE OR REPLACE FUNCTION err() RETURNS int AS $$ plpy.error('only msg from plpy.error', 'arg2 for error is part of tuple', 'arg3 also in tuple') $$ LANGUAGE plpython3u; SELECT err(); CREATE OR REPLACE FUNCTION raise_err() RETURNS int AS $$ plpy.raise_error('only msg from plpy.raise_error', 'arg2 for raise_error is detail', 'arg3 is hint') $$ LANGUAGE plpython3u; SELECT raise_err(); With your patch, this results in: CREATE FUNCTION psql:plpy_error_raise_error_difference:9: ERROR: plpy.Error: ('only msg from plpy.error', 'arg2 for error is part of tuple', 'arg3 also in tuple') CONTEXT: Traceback (most recent call last): PL/Python function "err", line 2, in plpy.error('only msg from plpy.error', 'arg2 for error is part of tuple', 'arg3 also in tuple') PL/Python function "err" CREATE FUNCTION psql:plpy_error_raise_error_difference:17: ERROR: plpy.SPIError: only msg from plpy.raise_error DETAIL: arg2 for raise_error is detail HINT: arg3 is hint CONTEXT: Traceback (most recent call last): PL/Python function "raise_err", line 2, in plpy.raise_error('only msg from plpy.raise_error', 'arg2 for raise_error is detail', 'arg3 is hint') PL/Python function "raise_err" >From the output you can already see that plpy.error and plpy.raise_error (even with a single argument) don't use the same exception: plpy.error uses Error while raise_error uses SPIError. I think with a single argument they should be identical and with multiple arguments raise_error should still use Error but use the arguments as detail, hint etc. In code you had to export a function to plpy_spi.h to get to the details in SPIError while plpy.error worked just fine without that. I've attached two patches on top of yours: first is a comment fix, the next one is a rough proof of concept for using plpy.Error to carry the details. This allows undoing the change to export PLy_spi_exception_set to plpy_spi.h. And then I realized, the changes you did to SPIError are actually a separate enhancement (report more details like schema name, table name and so on for the query executed by SPI). They should be split into a separate patch. With these patches the output of the above test is: CREATE FUNCTION psql:plpy_error_raise_error_difference:9: ERROR: plpy.Error: ('only msg from plpy.error', 'arg2 for error is part of tuple', 'arg3 also in tuple') CONTEXT: Traceback (most recent call last): PL/Python function "err", line 2, in plpy.error('only msg from plpy.error', 'arg2 for error is part of tuple', 'arg3 also in tuple') PL/Python function "err" CREATE FUNCTION psql:plpy_error_raise_error_difference:17: ERROR: plpy.Error: only msg from plpy.raise_error DETAIL: arg2 for raise_error is detail HINT: arg3 is hint CONTEXT: Traceback (most recent call last): PL/Python function "raise_err", line 2, in plpy.raise_error('only msg from plpy.raise_error', 'arg2 for raise_error is detail', 'arg3 is hint') PL/Python function "raise_err" The two approaches are consistent now, both create a plpy.Error. The patch is not complete, it only handles detail and hint not the others but it should illustrate the idea. Looking at the output above, I don't see who would rely on calling plpy.error with multiple arguments and getting the tuple so I'm actually in favor of just breaking backward compatibility. Note that passing multiple arguments isn't even documented. So I would just change debug, info, error and friends to do what raise_debug, raise_info, raise_error do. With a single argument behavior stays the same, with multiple arguments one gets more useful behavior (detail, hint) instead of the useless tuple. That's my preference but we can leave the patch with raise and leave the decision to the committer. What do you think? Jim doesn't like the separate Error being raised. I don't agree, but even if I would, it's not this patch's job to change that and Error is already raised today. This patch should be consistent with the status quo and therefore be similar to plpy.error. If Error is bad, it should be replaced by SPIError everywhere separately. Next week I'll send a patch to improve the docs. 0002-use-plpy.Error-to-hold-the-data-not-plpy.SPIError.patch Description: binary/octet-stream 0001-fix-comment.patch Description: binary/octet-stream -- Sent via pgsql
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On 1/21/16, Pavel Stehule wrote: > 2016-01-14 17:16 GMT+01:00 Catalin Iacob : >> Consider this call chain: SQL code 1 -> PL/Python code 1 -> SPI (via >> plpy.execute) -> SQL code 2 -> PL/Python code 2 >> >> If I'm writing PL/Python code 1 and I want to raise an error toward >> SQL code 1 I use raise plpy.Error. plpy.SPIError is what I get if I >> call into SQL code 2 and that has an error. That error could indeed >> come from a plpy.Error thrown by PL/Python code 2 or it could come >> from a SQL syntax error or whatever. But the symmetry holds: >> plpy.Error is what you raise to stop the query and return errors to >> your SQL caller and plpy.SPIError is what you get back if you use SPI >> to execute some other piece of SQL which has an error. I *think* (I'm >> an internals newbie though so I could be wrong) that SQL code 1 >> doesn't call into PL/Python code 1 via SPI so why would the latter use >> something called SPIError to inform the former about an error? >> > > It is not correct - outside PLPython you got a Error (PostgreSQL error has > not any classes), and isn't important the raising class (Error or > SPIError). Inside PL/Python you will got SPIError or successors (based on > SQLcode). What exactly is not correct? Indeed, *outside* PLPython you don't get a Python exception because Postgres isn't Python, you get Postgres' way of representing an exception (error code, message, hint and so on that might go to the client or not etc.). But that's normal, it's just the impedance mismatch between the fact that you embed Python with its rules into Postgres with its other rules. It's normal that the "host" (Postgres) wins in the end and uses its mechanism to report errors. The "guest" (PLPython) is there only to give mechanisms to activate the "host". But *inside* PLPython what I wrote is true, see this example for what I mean: CREATE FUNCTION test() RETURNS int AS $$ def a_func(): raise plpy.Error('an error') try: a_func() except plpy.Error as exc: plpy.info('have exc {} of type {}'.format(exc, type(exc))) plpy.info('have spidata {}'.format(hasattr(exc, 'spidata'))) $$ LANGUAGE plpython3u; Running this produces: DROP FUNCTION CREATE FUNCTION psql:plpy_demo:16: INFO: have exc an error of type psql:plpy_demo:16: INFO: have spidata False > Currently If I raise plpy.Error, then it is immediately trasformed to > PostgreSQL, and and then to SPIError and successors. Depends how you define immediately. I would say it's really not immediately, it's only at the Postgres boundary. Imagine in the example above that a_func could call lots of other Python code and somewhere deep down raise Error would be used. Within that whole execution the error stays Error and can be caught as such, it will have nothing to do with SPIError. But in my opinion this discussion shouldn't really even be about catching these things, most of the times you won't catch them and instead you'll let them go to Postgres. The discussion should be whether raise plpy.Error(...), plpy.raise_error, plpy.raise_info(,,,) etc. all with keyword argument support are a good PLPython interface to Postgres' ereport. I think they are. On a different note, I've explained what I think but I might be wrong and don't want to stall the patch just because I don't agree with the approach it takes. The waiting for author status is ok since, as far as I can see, Pavel also agrees that at least some of the issues raised in http://www.postgresql.org/message-id/CAHg_5grU=LVRbZDhfCiiYc9PhS=1X5f=gd44-3jcul-qaor...@mail.gmail.com need to be fixed. Would it help maybe to remove myself as reviewer in the CF? Maybe that makes somebody else pick it up and move it further. I could also code the version I'm describing but that design would be contrary to what Jim and Pavel lean towards now. It could help to compare approaches side by side but I don't like the idea of dueling patches. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On Wed, Jan 13, 2016 at 7:40 PM, Jim Nasby wrote: > On 1/12/16 11:25 AM, Catalin Iacob wrote: >> They're similar but not really the same thing. raise Error and >> plpy.error are both ways to call ereport(ERROR, ...) while SPIError is >> raised when coming back after calling into Postgres to execute SQL >> that itself raises an error. Now indeed, that executed SQL raised an >> error itself via another ereport(ERROR, ...) somewhere but that's a >> different thing. > > > Why should they be different? An error is an error. You either want to trap > a specific type of error or you don't. Having two completely different ways > to do the same thing is just confusing. With my (indeed limited) understanding, I don't agree they're the same thing and I tried to explain above why I think they're quite different. You may not agree. If they are different things, using different exception types is natural. Consider this call chain: SQL code 1 -> PL/Python code 1 -> SPI (via plpy.execute) -> SQL code 2 -> PL/Python code 2 If I'm writing PL/Python code 1 and I want to raise an error toward SQL code 1 I use raise plpy.Error. plpy.SPIError is what I get if I call into SQL code 2 and that has an error. That error could indeed come from a plpy.Error thrown by PL/Python code 2 or it could come from a SQL syntax error or whatever. But the symmetry holds: plpy.Error is what you raise to stop the query and return errors to your SQL caller and plpy.SPIError is what you get back if you use SPI to execute some other piece of SQL which has an error. I *think* (I'm an internals newbie though so I could be wrong) that SQL code 1 doesn't call into PL/Python code 1 via SPI so why would the latter use something called SPIError to inform the former about an error? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On Mon, Jan 11, 2016 at 7:33 PM, Pavel Stehule wrote: > I see it. > > it looks like distinguish between Error and SPIError is wrong way. And I > have any other ugly use case. > > If I raise a Error from one PLPythonu function, then in other PLPython > function I'll trap a SPIError exception, because the information about class > was lost inside Postgres. And it should be pretty messy. > I have not information if any exception was User based or SPI based. This isn't that bad. The real Python exception is only lost if the first function calls into Postgres which then ends into another PL/Python function which raises. That's just the way it is. If you use Psycopg2 on the client and a PL/Python function in the server you also don't get in the client the real exception that PL/Python raised even though it's Python at both ends. The analogy isn't perfect since one is in process and the other cross process but you get the point. If a PL/Python function calls another one directly and that one raises something, the caller can catch that just like in regular Python. > The differentiation between Error and SPIError is wrong, because there isn't > any difference in reality. They're similar but not really the same thing. raise Error and plpy.error are both ways to call ereport(ERROR, ...) while SPIError is raised when coming back after calling into Postgres to execute SQL that itself raises an error. Now indeed, that executed SQL raised an error itself via another ereport(ERROR, ...) somewhere but that's a different thing. I'm getting more and more convinced that SPIError should be left unchanged and should not even inherit from BaseError as it's a different thing. And as I said this means BaseError isn't the base of all exceptions that can be raised by the plpy module so then it should probably not be called BaseError. Maybe something like ReportableError (the base class of ereport frontends). Or don't have a base class at all and just allow constructing both Error and Fatal instances with message, detail, hint etc. What you loose is you can't catch both Error and Fatal with a single except ReportableError block but Fatal is maybe not meant to be caught and Error is also not really meant to be caught either, it's meant to be raised in order to call ereport(ERROR, ...). I now like this option (no base class but same attributes in both instances) most. As I see it, what this patch tries to solve is that raise Error and plpy.error (and all the others) are not exposing all features of ereport, they can only pass a message to ereport but can't pass all the other things ereport accepts: detail, hint etc. The enhancement is being able to pass all those arguments (as positional or keyword arguments) when constructing and raising an Error or Fatal instance or when using the plpy.error helpers. Since the existing helpers accept multiple arguments already (which they unfortunately and weirdly concatenate in a tuple whose string representation is pushed into ereport as message) we can't repurpose the multiple arguments as ereport detail, hint etc. so Pavel invented paralel plpy.raise_error and friends which do that. If a bold committer says the string representation of the tuple is too weird to be relied on we could even just change meaning of plpy.error and accept (and document) the compatibility break. So the patch is almost there I think, it just needs to stop messing around with SPIError and the spidata attribute. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On Tue, Jan 12, 2016 at 5:50 PM, Pavel Stehule wrote: > Error and Fatal exception classes are introduced in my patch - it was Peter' > request (if I remember it well), and now I am thinking so it is not good > idea. Now everybody is confused :). Your patch does: -PLy_exc_error = PyErr_NewException("plpy.Error", NULL, NULL); -PLy_exc_fatal = PyErr_NewException("plpy.Fatal", NULL, NULL); -PLy_exc_spi_error = PyErr_NewException("plpy.SPIError", NULL, NULL); [snip] +PLy_exc_error = PyErr_NewException("plpy.Error", PLy_exc_base, NULL); +PLy_exc_fatal = PyErr_NewException("plpy.Fatal", PLy_exc_base, NULL); +PLy_exc_spi_error = PyErr_NewException("plpy.SPIError", PLy_exc_base, NULL); So they are there without the patch, you now make them inherit from the new BaseError previously they just inherited from Exception. More soon in another reply I was just typing when this came in. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On Fri, Jan 8, 2016 at 7:56 AM, Pavel Stehule wrote: >> 3. PLy_error__init__ is used for BaseError but always sets an >> attribute called spidata, I would expect that to only happen for >> SPIError not for BaseError. You'll need to pick some other way of >> attaching the error details to BaseError instances. Similarly, >> PLy_get_spi_error_data became PLy_get_error_data and it's invoked on >> other classes than SPIError but it still always looks at the spidata >> attribute. > > > I afraid of compatibility issues - so I am not changing internal format > simply. Some legacy application can be based on detection of spidata > attribute. So I cannot to change this structure for SPIError. Indeed, I wasn't suggesting changing it for SPIError, that needs to stay spidata. > I can change it for BaseError, but this is question. Internally BaseError > and SPIError share same data. So it can be strange if BaseError and SPIError > uses different internal formats. > > I am interesting your opinion?? Yes, it's not great if BaseError and SPIError have different ways but I think having BaseError have a member (spidata) named after one of its subclasses is even worse. I would say detail, hint etc belong to BaseError and I would make them different attributes of BaseError instances instead of one big BaseError.data tuple similar to what SPIError.spidata is now. SPIError would keep its spidata for compatibility but would get the same information into its detail, hint etc. attributes since it's derived from BaseError. So an equivalent to this Python (pseudo)code: # base class for all exceptions raised by PL/Python class BaseError: def __init__(self, msg, detail=None, hint=None, and so on for every accepted argument): self.msg = msg self.detail = detail # and so on class Error(BaseError): pass class Fatal(BaseError): pass class SPIError(BaseError): def __init__(self, msg): # call BaseError.__init__ so that SPIError also gets .msg, .detail and so on like all other BaseError instances # do what's done today to fill spidata to keep backward compatibility If you don't like that spidata duplicates the other fields, it's probably also ok to not make inherit from BaseError at all. I actually like this better. Then I would call the base class something like UserRaisedError (ugly name but can't think of something better to emphasize that it's about errors that the PL/Python developer is supposed to raise rather SPIError which is Postgres raising them) and it would be: # base class for exceptions raised by users in their PL/Python code class UserRaisedError: def __init__(self, msg, detail=None, hint=None, and so on for every accepted argument): self.msg = msg self.detail = detail # and so on class Error(UserRaisedError): pass class Fatal(UserRaisedError): pass # base class for exceptions raised by Postgres when executing sql code class SPIError: # no change to this class > > a Fatal cannnot be raised - it is a session end. Personally - support of > Fatal level is wrong, and it should not be propagated to user level, but it > is now. And then due consistency I wrote support for fatal level. But hard > to test it. > I see, then never mind. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
shellOn Thu, Dec 31, 2015 at 11:37 AM, Pavel Stehule wrote: > here is new version. And here's a new review, sorry for the delay. > Now I use a common ancestor "plpy.BaseError" for plpy builtin classes. So > plpy.SPIError isn't descendant of plpy.Error and then there are not possible > incompatibility issues. That's good. > Instead modification builtin function plpy.debug, plpy.info, ... and > introduction incompatibility I wrote new set of functions with keyword > parameters (used mainly for elevel < ERROR): > > plpy.raise_debug, plpy.raise_info, plpy.raise_notice, plpy.raise_warning, > plpy.raise_error and plpy.raise_fatal. I'm on the fence whether these are good ideas. On one hand having them is nice and they avoid changing the existing plpy.debug etc. in backward incompatible ways, on the other hand they're similar to those so it can be confusing to know which ones to pick. Any opinions from others on whether it's better to add these or not? The docs need more work, especially if raise_* are kept, as the docs should then clearly explain the differences between the 2 sets and nudge users toward the new raise_* functions. I can help with that though if there are objections to these functions I wouldn't mind hearing it before I document them. As for the code: 1. there are quite some out of date comments referring to SPIError in places that now handle more types (like /* prepare dictionary with __init__ method for SPIError class */ in plpy_plpymodule.c) 2. you moved PLy_spi_exception_set from plpy_spi.c to plpy_elog.c and renamed it to PLy_base_exception_set but it's still spi specific: it still sets an attribute called spidata. You needed this because you call it in PLy_output_kw but can't you instead make PLy_output_kw similar to PLy_output and just call PLy_exception_set in the PG_CATCH block like PLy_output does? With the patch plpy.error(msg) will raise an error object without an spidata attribute but plpy.raise_error(msg) will raise an error with an spidata attribute. 3. PLy_error__init__ is used for BaseError but always sets an attribute called spidata, I would expect that to only happen for SPIError not for BaseError. You'll need to pick some other way of attaching the error details to BaseError instances. Similarly, PLy_get_spi_error_data became PLy_get_error_data and it's invoked on other classes than SPIError but it still always looks at the spidata attribute. 4. it wouldn't hurt to expand the tests a bit to also raise plpy.Fatal with keyword arguments and maybe catch BaseError and inspect it a bit to see it contains reasonable values (per 4. have spidata when raising an SPIError but not when raising an Error or BaseError or Fatal etc.) As seen from 1, 2, and 3 the patch is still too much about SPIError. As I see it, it should only touch SPIError to make it inherit from the new BaseError but for the rest, the patch shouldn't change it and shouldn't propagate spidata attributes and SPIError comments. As it stands, the patch has historical artifacts that show it was initially a patch for SPIError but it's now a patch for error reporting so those should go away. I'll set the patch back to waiting for author. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] No Issue Tracker - Say it Ain't So!
On Wed, Jan 6, 2016 at 3:11 AM, Jim Nasby wrote: > Which doesn't help anyone, because neither of those provide a list of "hey, > here's stuff you could do to contribute". The closest we come to that is the > TODO, which isn't well known and has almost no items for newbies (and the > newbie items that are there don't offer much advice). > > The reason I this matters is because yesterday I posted a task for a new > hacker with simple guidelines and 24 hours later it was completed[1]. That > tells me there's people that would love to contribute but don't know how or > what to contribute. Jim, I want to explicitly thank you for your post about that task, I would like to see more of that. I share the sentiment that there are more people wanting to contribute but finding it rather hard to find a starting point. I was (am?) in that position and so far found two ways out: 1. I looked at the commit fest patches as a list of things to contribute to (by doing a review which is currently in progress) 2. Robert at some point mentioned in an email "someone could improve the docs in this patch" so I did that But I can see that 1. is intimidating to do for new people that might think "how can you review without ever having looked at the code before?". Turns out you can and the wiki mentions it explicitly but it's probably still intimidating for some. And 2. required noticing Robert's sentence in the middle of a long email thread. I also think a list of small things suitable for new contributors would help attracting them. Not all would stick and go on to larger items but hopefully at least some would. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On Thu, Dec 3, 2015 at 6:54 PM, Pavel Stehule wrote: > Don't understand - if Fatal has same behave as Error, then why it cannot be > inherited from Error? > > What can be broken? Existing code that did "except plpy.Error as exc" will now also be called if plpy.Fatal is raised. That wasn't the case so this changes meaning of existing code, therefore it introduces an incompatibility. >> >> 5. all the reporting functions: plpy.debug...plpy.fatal functions >> should also be changed to allow more arguments than the message and >> allow them as keyword arguments > > > this is maybe bigger source of broken compatibility > > lot of people uses plpy.debug(var1, var2, var3) > > rich constructor use $1 for message, $2 for detail, $3 for hint. So it was a > reason, why didn't touch these functions No, if you use PyArg_ParseTupleAndKeywords you'll have Python accept all of these so previous ways are still accepted: plpy.error('a_msg', 'a_detail', 'a_hint') plpy.error'a_msg', 'a_detail', hint='a_hint') plpy.error('a_msg', detail='a_detail', hint='a_hint') plpy.error(msg='a_msg', detail='a_detail', hint='a_hint') plpy.error('a_msg') plpy.error(msg='a_msg') etc. But I now see that even though the docs say plpy.error and friends take a single msg argument, they actually allow any number of arguments. If multiple arguments are passed they are converted to a tuple and the string representation of that tuple goes into ereport/plpy.Error: CREATE FUNCTION test() RETURNS int AS $$ try: plpy.error('an error message', 'detail for error', 'hint for error') except plpy.Error as exc: plpy.info('have exc {}'.format(exc)) plpy.info('have exc.args |{}| of type {}'.format(exc.args, type(exc.args))) $$ LANGUAGE plpython3u; SELECT test(); [catalin@archie pg]$ psql -f plpy test CREATE FUNCTION psql:plpy:13: INFO: have exc ('an error message', 'detail for error', 'hint for error') psql:plpy:13: INFO: have exc.args |("('an error message', 'detail for error', 'hint for error')",)| of type In the last line note that exc.args (the args tuple passed in the constructor of plpy.Error) is a tuple with a single element which is a string which is a representation of the tuple passed into plpy.error. Don't know why this was thought useful, it was certainly surprising to me. Anyway, the separate $2, $3 etc are currently not detail and hint, they're just more stuff that gets appended to this tuple. They don't get passed to clients as hints. And you can pass lots of them not just detail and hint. My proposal would change the meaning of this to actually interpret the second argument as detail, third as hint and to only allow a limited number (the ones with meaning to ereport). The hint would go to ereport so it would be a real hint: it would go to clients as HINT and so on. I think that's way more useful that what is done now. But I now see my proposal is also backward incompatible. > I am not against to plpy.ereport function - it can live together with rich > exceptions class, and users can use what they prefer. I wasn't suggesting introducing ereport, I was saying the existing functions and exceptions are very similar to your proposed ereport. Enhancing them to take keyword arguments would make them a bit more powerful. Adding ereport would be another way of doing the same thing so that's more confusing than useful. All in all it's hard to improve this cleanly. I'm still not sure the latest patch is a good idea but now I'm also not sure what I proposed is better than leaving it as it is. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: multiple psql option -c
On Fri, Dec 4, 2015 at 3:47 PM, Robert Haas wrote: > For the most part, the cleanups in this version are just cosmetic: I > fixed some whitespace damage, and reverted some needless changes to > the psql references page that were whitespace-only adjustments. In a > few places, I tweaked documentation or comment language. Sorry for the docs whitespace-only changes, I did that. I realized before the submission I made the diff bigger than it needed to be, but that's because I used M-q in Emacs to break the lines I did change and that reformatted the whole paragraph including some unchanged lines. Breaking all the lines by hand would be quite a job, and any time you go back and tweak the wording or so you need to do it again. So I just used M-q and sent the result of that. Do you happen to know of a better way to do this? I do load src/tools/editors/emacs.samples in my ~/.emacs but it seems the width my Emacs chooses doesn't match the one already in the file. The doc tweaks are good, they make the text more clear. I'm happy that's all you found to improve: writing good docs is hard and the Postgres docs are already good so it's not easy to change them, I had the feeling I'll only make them worse and spent quite some time trying not to do that. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On Thu, Dec 3, 2015 at 7:58 AM, Pavel Stehule wrote: > I am sorry, I don't understand. Now due inheritance plpy.Fatal and > plpy.SPIError has availability to use keyword parameters. Indeed, I didn't realize this, but I don't think it changes the main argument. What I think should happen: 1. Error should take keyword arguments 2. same for Fatal 3. Fatal should not be changed to inherit from Error - it should stay like it is now, just another exception class You can argue a Fatal error is an Error but these classes already exist and are independent, changing their relationship is backward incompatible. 4. SPIError should not be changed at all It's for errors raised by query execution not user PL/Python code so doing raise SPIError in PL/Python doesn't make sense. And again, changing the inheritance relationship of these existing classes changes meaning of existing code that catches the exceptions. 5. all the reporting functions: plpy.debug...plpy.fatal functions should also be changed to allow more arguments than the message and allow them as keyword arguments They are Python wrappers for ereport and this would make them similar in capabilities to the PL/pgSQL RAISE This will make plpy.error(...) stay equivalent in capabilities with raise plpy.Error(...), same for fatal and Fatal 6. the docs should move to the "Utility Functions" section There, it's already described how to raise errors either via the exceptions or the utility functions. I think the above doesn't break anything, doesn't confuse user exceptions with backend SPIError exceptions, enhances error reporting features for the PL/Python user to bring them up to par with ereport and PL/pgSQL RAISE and solve your initial use case at the top of the thread (but with slightly different spelling to match what already exists in PL/Python): "We cannot to raise PostgreSQL exception with setting all possible fields. I propose new function plpy.ereport(level, [ message [, detail [, hint [, sqlstate, ... )" Is what I mean more clear now? Do you (and others) agree? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On Tue, Dec 1, 2015 at 2:17 AM, Pavel Stehule wrote: > here is complete patch - regress tests for all supported Python branches I had a look at what changed in v10 since my last reviewed version and indeed most of it is straightforward: renames from SPIError to Error. The patch also changes plpy.Fatal and plpy.SPIError to inherit from the existing plpy.Error which is a backward incompatibility: catching a plpy.Error will now also catch SPIError and Fatal. But by doing this I noticed plpy.Error already existed without the patch and raising plpy.Error(msg) is documented as equivalent to calling plpy.error(msg), similar for plpy.Fatal and plpy.fatal(). This patch makes it possible to raise plpy.Error with more arguments including keyword ones but doesn't change plpy.error(msg). And Fatal is not touched so it becomes inconsistent with Error. So I now think the approach this patch took is wrong. We should instead enhance the existing error and fatal functions and Error and Fatal exceptions to accept other arguments that ereport accepts (hint, sqlstate) and be able to pass all those as keyword parameters. SPIError should be left unchanged as that's for errors raised by query execution not by the PL/Python user. This is also close to Pavel's original ereport proposal but by extending existing functionality instead of adding a new function and it covers Peter's observation that in Python the ereport function should be "raise an exception" as that's already an alternative way of doing it. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: multiple psql option -c
On Tue, Dec 1, 2015 at 1:53 PM, Michael Paquier wrote: > Attached is a patch implementing those suggestions. This simplifies > the code without changing its usefulness. If you are fine with those > changes I will switch this patch as ready for committer. I tested the v07 patch (so not Michael's version) a few days ago but didn't send this email earlier. I combined various -c and -f with --echo-all, --single-transaction, \set ON_ERROR_STOP=1, separate -c "VACUUM", "SELECT + VACUUM" in a single and in two -c, inserting -f - somewhere in the middle of the other -c and -f. They all behave as I would expect. One maybe slightly surprising behaviour is that -f - can be specified multiple times and only the first one has an effect since the others act on an exhausted stdin. But I don't think forbidding multiple -f - is better. As for the code (these still apply to Michael's latest patch): 1. the be compiler quiete comment is not good English, /* silence the compiler */ would be better or remove it completely 2. shouldn't atyp in SimpleActionListCell be of type enum _atypes? Otherwise why an enum if it's casted to int when actually used? If it's an enum the repeated ifs on cell->atyp should be a switch, either with a default Assert(0) or no default which makes gcc give a warning if an enum value is ever added without having a corresponding case. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: multiple psql option -c
On Wed, Nov 18, 2015 at 5:49 PM, Catalin Iacob wrote: > On Tue, Nov 17, 2015 at 10:13 PM, Tom Lane wrote: >> 1. -c no longer implies --no-psqlrc. That's a backwards incompatibility, >> but very easy to explain and very easy to work around. >> >> 2. You can have multiple -c and/or -f. Each -c is processed in >> the traditional way, ie, either it's a single backslash command >> or it's sent in a single PQexec. That doesn't seem to me to have >> much impact on the behavior of adjacent -c or -f. >> >> 3. If you combine -1 with -c and/or -f, you get one BEGIN inserted >> at the beginning and one COMMIT at the end. Nothing else changes. > I'll try to write the documentation patch for these semantics sometime > next week. Attached is my attempt at a documentation patch, feedback welcome. I'm assuming Pavel will pick up the implementation, if not I could also try it. Update-docs-for-repeated-c.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: multiple psql option -c
On Tue, Nov 17, 2015 at 10:13 PM, Tom Lane wrote: > 1. -c no longer implies --no-psqlrc. That's a backwards incompatibility, > but very easy to explain and very easy to work around. > > 2. You can have multiple -c and/or -f. Each -c is processed in > the traditional way, ie, either it's a single backslash command > or it's sent in a single PQexec. That doesn't seem to me to have > much impact on the behavior of adjacent -c or -f. > > 3. If you combine -1 with -c and/or -f, you get one BEGIN inserted > at the beginning and one COMMIT at the end. Nothing else changes. And -v AUTOCOMMIT=off should do the same as now for -c: issue a BEGIN before each single PQexec with the content of each -c. I like it, it avoids what I didn't like about -C semantics since the grouping now means something (single PQexec per group) and you even see the effects of the grouping in the result (only last result of group is returned). If you don't like that grouping (probably most people won't) the solution is intuitive: split the group to multiple -c. Another incompatibility is that -1 is now silently ignored by -c so for example psql -1 -c VACUUM now works and it won't work with the new semantics. But this seems like a good thing because it reflects that VACUUM doesn't work in a transaction so I don't think it should stop this proposal from going ahead. I'll try to write the documentation patch for these semantics sometime next week. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: multiple psql option -c
On Sun, Nov 15, 2015 at 3:53 PM, Andrew Dunstan wrote: > I suggest you review the original thread on this subject before a line was > ever written. "multiple" (see subject line on this whole thread) is clearly > what is being asked for. Making people turn that into a single argument is > not what was envisaged. See for example Pavel's original example involving > use of xargs where that's clearly not at all easy. I couldn't see why it would matter to have multiple -C, but xargs having -n which consumes more than 1 stdin item is indeed an use case. When reading the thread I didn't notice it since I didn't know what -n does. But multiple -C is confusing since it suggests the groupings matter. To me at least, it feels weird that -C "SELECT 1; SELECT 2;" -C "SELECT 3;" is the same as -C "SELECT 1; SELECT 2; SELECT 3" and lots of other combinations. It feels like the split in groups must mean something, otherwise why would you support/use multiple groups? Upthread at least somebody thought each -C group would/should be a transaction and I can see this confusion coming up again and again, enough to question whether this patch is an improvement over the current situation. And if a single -C is too small of an improvement, maybe it means the whole idea should be dropped. I think the same of multiple -f as well: to me they're more confusing than helpful for the same reason. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: multiple psql option -c
On Sun, Nov 15, 2015 at 1:27 AM, Andrew Dunstan wrote: > That seems to me to get rid of the main motivation for this change, which is > to allow multiple such arguments, which together would as as if they were > all written to a file which was then invoked like -f file. It seems to me the motivation is not "multiple command line arguments" but sending multiple statements to the backend in one psql invocation without needing bash specific here docs or a temporary file for -f. Most combinations of such multiple statements can already be sent via -c which sends them in one query, the backend executes them in one transaction but the backend rejects some combinations like SELECT + VACUUM. I think the proposal was mislead by the apparent similarity with -c and said "if -c can't do SELECT + VACUUM let's do a sort of repeated -c and call that -C SELECT -C VACUUM". But why not do the same with -C "SELECT 1; VACUUM" which works just like having a file with that content works today but handier for scripts? Doesn't this solve the exact need in this thread? I'm arguing that sending multiple statements and executing each in one transaction (the current proposed semantics of -C) is not like -c and doesn't need to be "repeated -c" it's exactly like reading stdin or file passed to -f and solves the original problem.But maybe I'm missing something. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: multiple psql option -c
So I promised I'd try to document this. I had a look at the proposed semantics of -C and I think in the patch they're too complicated which makes explaining them hard. My assumptions about behaviour without this patch, from reading the docs and some experimenting, correct me if I'm wrong: 1. psql normally splits its input by ; let's call each piece of the split a statement 2. for every statement resulting after 1, if it's a \ command it's interpreted internally, else a query with it is sent to the server, the result is displayed 3. 1. and 2. happen when the input comes from a file (-f) or from stdin 4. autocommit off changes behaviour in that it sends a BEGIN before any of the statements after the split in 1 (except for \ commands, BEGIN or things like VACUUM which don't work within transactions) 5. --single-transaction changes behaviour in that it puts a BEGIN before the whole input (not around each statement) and a COMMIT after 6. all of the above DON'T apply for -c which very different things: it doesn't split and instead it sends everything, in one query to the backend. The backend can execute such a thing (it splits itself by ;) except in some cases like SELECT + VACUUM. Since the single query is effectively a single transaction for the backend -c ignores --single-transaction and autocommit off. Even more, when executing such a multiple statement the backend only returns results for the last statement of the query. >From the above it seems -c is a different thing altogether while other behaviour allows 1 input with multiple commands, multiple results and works the same on stdin and a file. So my proposal is: allow a *single* argument for -C and treat its content *exactly* like the input from stdin or from a file. This answers all the questions about interactions with --single-transaction and autocommit naturally: it behaves exactly like stdin and -f behave today. And having a single parameter is similar to having a single file or single stdin. Having multiple -C is also confusing since it seems the statements in one -C are grouped somehow and the ones in the next -C are another group so this starts feeling like there's maybe a transaction per -C group etc. Am I missing something or is it that simple? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On Mon, Nov 9, 2015 at 6:31 PM, Pavel Stehule wrote: > merged your patch So, I just that tested version 08 is the same as the previous patch + my change and I already tested that on Python 2.4, 2.5, 2.6, 2.7 and 3.5. All previous comments addressed so I'll mark this Ready for Committer. Thanks Pavel -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On Mon, Nov 9, 2015 at 2:58 PM, Pavel Stehule wrote: > I needed to understand the support for Python 3.5. > > The patch with the fix is attached regress tests 3.5 Python I wanted to type a reply this morning to explain and then I noticed there's another file (_0.out) for Python2.4 and older (as explained by src/pl/plpython/expected/README) so tests there fail as well. I built Python 2.4.6 and then Postgres against it and updated the expected output for 2.4 as well. Find the changes for 2.4 attached. You can add those to your patch. While I was at it I tested 2.5 and 2.6 as well, they also pass all tests. So with the 2.4 changes attached I think we're done. BTW, the alternative output files for pg_regress are explained at http://www.postgresql.org/docs/devel/static/regress-variant.html adjust_py24_expected.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: multiple psql option -c
On Thu, Nov 5, 2015 at 5:27 PM, Robert Haas wrote: >> I wrote some text. But needs some work of native speaker. > > It does. It would be nice if some kind reviewer could help volunteer > to clean that up. I'll give it a go sometime next week. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On Wed, Nov 4, 2015 at 10:12 AM, Pavel Stehule wrote: > It helped me lot of, thank you Welcome, I learned quite some from the process as well. >> >> >> There's just the doc part left then. > > > done We're almost there but not quite. There's still a typo in the docs: excation. A plpy.SPIError can be raised should be A plpy.SPIError can be raised right? And most importantly, for Python 3.5 there is a plpython_error_5.out which is needed because of an alternative exception message in that version. You didn't update this file, this makes the tests fail on Python3.5. Since you might not have Python 3.5 easily available I've attached a patch to plpython_error_5.out which makes the tests pass, you can fold this into your patch. adjust_py35_expected.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
Sorry, you're right, I didn't notice the x = plpy.SPIError() test. I did notice that you included the kw != NULL, I was explaining why it really is needed even though it *seems* the code also works without it. There's just the doc part left then. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On Tue, Nov 3, 2015 at 12:49 PM, Pavel Stehule wrote: >> 1. in PLy_spi_error__init__ you need to check kw for NULL before doing >> PyDict_Size(kw) otherwise for plpy.SPIError() you get Bad internal >> call because PyDict_Size expects a real dictionary not NULL > > > PyDict_Size returns -1 when the dictionary is NULL > > http://grokbase.com/t/python/python-dev/042ft6qaqq/pydict-size-error-return Yes, but it also sets the error indicator to BadInternalCall. In 2.7 the code is: Py_ssize_t PyDict_Size(PyObject *mp) { if (mp == NULL || !PyDict_Check(mp)) { PyErr_BadInternalCall(); return -1; } return ((PyDictObject *)mp)->ma_used; } I had a PLy_elog right after the PyDict_Size call for debugging and that one was raising BadInternalCall since it checked the error indicator. So the NULL check is needed. >> 2. a test with just plpy.SPIError() is still missing, this would have >> caught 1. > > > please, can you write some example - I am not able raise described error The example was plpy.SPIError() but I now realize that, in order to see a failure, you need the extra PLy_elog which I had in there. But this basic form of the constructor is still an important thing to test so please add this as well to the regression test. >> 5. there is conceptual code duplication between PLy_spi_exception_set >> in plpy_spi.c, since that code also constructs an SPIError but from C >> and with more info available (edata->internalquery and >> edata->internalpos). But making a tuple and assigning it to spidata is >> in both. Not sure how this can be improved. > > > I see it, but I don't think, so current code should be changed. > PLy_spi_exception_set is controlled directly by fully filled ErrorData > structure, __init__ is based only on keyword parameters with possibility use > inherited data. If I'll build ErrorData in __init__ function and call > PLy_spi_exception_set, then the code will be longer and more complex. Indeed, I don't really see how to improve this but it does bug me a bit. One more thing, +The plpy module provides several possibilities to +to raise a exception: This has "to" 2 times and is weird since it says it offers several possibilities but then shows only one (the SPIError constructor). And SPIError should be plpy.SPIError everywhere to be consistent. Maybe something like (needs markup): A plpy.SPIError can be raised from PL/Python, the constructor accepts keyword parameters: plpy.SPIError([ message [, detail [, hint [, sqlstate [, schema [, table [, column [, datatype [, constraint ]) then the example If you fix the doc and add the plpy.SPIError() test I'm happy. I'll give it one more test on Python2.7 and 3.5 and mark it Ready for Committer. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
Hello, Here's a detailed review: 1. in PLy_spi_error__init__ you need to check kw for NULL before doing PyDict_Size(kw) otherwise for plpy.SPIError() you get Bad internal call because PyDict_Size expects a real dictionary not NULL 2. a test with just plpy.SPIError() is still missing, this would have caught 1. 3. the tests have "possibility to set all accessable fields in custom exception" above a test that doesn't set all fields, it's confusing (and accessible is spelled wrong) 4. in the docs, "The constructor of SPIError exception (class) supports keyword parameters." sounds better as "An SPIError instance can be constructed using keyword parameters." 5. there is conceptual code duplication between PLy_spi_exception_set in plpy_spi.c, since that code also constructs an SPIError but from C and with more info available (edata->internalquery and edata->internalpos). But making a tuple and assigning it to spidata is in both. Not sure how this can be improved. 6. __init__ can use METH_VARARGS | METH_KEYWORDS in both Python2 and 3, no need for the #if 7. "could not create dictionary for SPI exception" would be more precise as "could not create dictionary for SPIError" right? 8. in PLy_add_methods_to_dictionary I would avoid import since it makes one think of importing modules; maybe "could not create function wrapper for \"%s\"", "could not create method wrapper for \"%s\"" 9. also in PLy_add_methods_to_dictionary "could public method \"%s\" in dictionary" is not proper English and is missing not, maybe "could not add method \"%s\" to class dictionary"? 10. in PLy_add_methods_to_dictionary if PyCFunction_New succeeds but PyMethod_New fails, func will leak 11. it would be nice to have a test for the invalid SQLSTATE code part 12. similar to 10, in PLy_spi_error__init__ taking the "invalid SQLSTATE" branch leaks exc_args, in general early returns via PLy_elog will leave the intermediate Python objects leaking Will mark the patch as Waiting for author. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On Tue, Oct 27, 2015 at 9:34 AM, Pavel Stehule wrote: > Hi > > 2015-10-23 7:34 GMT+02:00 Catalin Iacob : >> The current code doesn't build on Python3 because the 3rd argument of >> PyMethod_New, the troubled one you need set to NULL has been removed. >> This has to do with the distinction between bound and unbound methods >> which is gone in Python3. > > > this part of is well isolated, and we can do switch for Python2 and Python3 > without significant problems. I had a quick look at this and at least 2 things are needed for Python3: * using PyInstanceMethod_New instead of PyMethod_New; as http://bugs.python.org/issue1587 and https://docs.python.org/3/c-api/method.html explain that's the new way of binding a PyCFunction to a class, PyMethod_New(func, NULL) fails * in the PLy_spi_error_methods definition, __init__ has to take METH_VARARGS | METH_KEYWORDS not just METH_KEYWORDS; in Python2 METH_KEYWORDS implied METH_VARARGS so it's not needed (it also doesn't hurt) but if I don't add it, in Python3 I get: ERROR: SystemError: Bad call flags in PyCFunction_Call. METH_OLDARGS is no longer supported! >> Still, the above link shows a (more verbose but maybe better) >> alternative: don't use PyErr_NewException and instead define an >> SPIError type with each slot spelled out explicitly. This will remove >> the "is it safe to set the third argument to NULL" question. > > Should be there some problems, if we lost dependency on basic exception > class? Some compatibility issues? Or is possible to create full type with > inheritance support? It's possible to give it a base type, see at https://hg.python.org/cpython/rev/429cadbc5b10/ this line before calling PyType_Ready: PyComError_Type.tp_base = (PyTypeObject*)PyExc_Exception; PyErr_NewException is a shortcut for defining simple Exception deriving types, usually one defines a type with the full PyTypeObject definition. In the meantime, I had a deeper look at the 2.7.10 code and I trust that PyMethod_New with the last argument set to NULL is ok. Setting that to NULL will lead to the PyMethod representing __init__ im_class being NULL. But that PyMethod object is not held onto by C code, it's added to the SPIError class' dict. From there, it is always retrieved from Python via an instance or via the class (so SPIError().__init__ or SPIError.__init__) which will lead to instancemethod_descr_get being called because it's the tp_descr_get slot of PyMethod_Type and that code knows the class you're retrieving the attribute from (SPIError in this case), checks if the PyMethod already has a not NULL im_class (which SPIError.__init__ doesn't) and, if not, calls PyMethod_New again and passes the class (SPIError) as the 3rd argument. Given this, I think it's ok to keep using PyErr_NewException rather than spelling out the type explicitly. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
On Sat, Oct 17, 2015 at 8:18 PM, Pavel Stehule wrote: > here is new patch > > cleaned, all unwanted artefacts removed. I am not sure if used way for > method registration is 100% valid, but I didn't find any related > documentation. Pavel, some notes about the patch, not a full review (yet?). In PLy_add_exceptions PyDict_New is not checked for failure. In PLy_spi_error__init__, kw will be NULL if SPIError is called with no arguments. With the current code NULL will get passed to PyDict_Size which will generate something like SystemError Bad internal function call. This also means a test using just SPIError() is needed. It seems args should never be NULL by the way, if there are no arguments it's an empty tuple so the other side of the check is ok. The current code doesn't build on Python3 because the 3rd argument of PyMethod_New, the troubled one you need set to NULL has been removed. This has to do with the distinction between bound and unbound methods which is gone in Python3. There is http://bugs.python.org/issue1587 which discusses how to replace the "third argument" functionality for PyMethod_New in Python3. One of the messages there links to http://bugs.python.org/issue1505 and https://hg.python.org/cpython/rev/429cadbc5b10/ which has an example very similar to what you're trying to do, rewritten to work in Python3. But this is still confusing: note that the replaced code *didn't really use PyMethod_New at all* as the removed line meth = PyMethod_New(func, NULL, ComError); was commented out and the replaced code used to simply assign the function to the class dictionary without even creating a method. Still, the above link shows a (more verbose but maybe better) alternative: don't use PyErr_NewException and instead define an SPIError type with each slot spelled out explicitly. This will remove the "is it safe to set the third argument to NULL" question. I tried to answer the "is it safe to use NULL for the third argument of PyMethod_New in Python2?" question and don't have a definite answer yet. If you look at the CPython code it's used to set im_class (Objects/classobject.c) of PyMethodObject which is accessible from Python. But this code: init = plpy.SPIError.__init__ plpy.notice("repr {} str {} im_class {}".format(repr(init), str(init), init.im_class)) produces: NOTICE: repr str im_class so the SPIError class name is set in im_class from somewhere. But this is all moot if you use the explicit SPIError type definition. >> Any help is welcome I can work with you on this. I don't have a lot of experience with the C API but not zero either. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Adding since-version tags to the docs?
On Mon, Aug 31, 2015 at 4:48 PM, Tom Lane wrote: > Right now, you might well care about whether a feature arrived in 9.3 vs > 9.4, for instance; but it's highly unlikely that you care whether a > feature arrived in 7.1 or 7.2. The problem with this proposal is that > it will add far more bloat of the latter sort than currently-useful > information; and the ratio will get worse over time. After working with Python, when reading the Postgres docs I did feel this was missing so +1 from a user. An example for Python's annotations is at https://docs.python.org/3/library/zipfile.html. Python only keeps the annotations for a few versions back (3 or 4 I think). Another use case which is maybe worth mentioning explicitly: when I needed to target a certain older version of Python, as long as it was reasonably close to the current one, I would still read the latest Python docs and rely on the version annotations to let me know I can't yet use certain features. For Postgres, that doesn't feel "safe" so you tend to go to the docs of the specific version you're targeting. But by doing that, you miss getting extra education about improvements in newer versions. That extra information could also be a powerful trigger to update in order to gain those improvements. I think both are quite valuable. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb concatenate operator's semantics seem questionable
On Mon, May 18, 2015 at 9:03 PM Andrew Dunstan wrote: > So you're arguing that we shouldn't call the operation in question || > because it's pretty much the same, mutatis mutandis, as the hstore > operation of the same name. You've lost me. > Hopefully this helps. Peter's argument, as I understand it is: In hstore @> means unnested containment, in jsonb it means nested containment. Therefore, when an hstore operator is applied to jsonb it gets "nestedness" as jsonb is nested and adds that nestedness is an important thing that sets it apart from hstore. Therefore, since || is unnested concatenation in hstore, it should become nested concatenation for jsonb. I don't know if the argument is strong enough but it does make some sense.