Re: [HACKERS] Patch: add --if-exists to pg_recvlogical

2017-11-01 Thread Catalin Iacob
On Fri, Sep 29, 2017 at 3:06 PM, Peter Eisentraut
 wrote:
> On 9/22/17 15:31, Peter Eisentraut wrote:
>> I suggest you create a patch that focuses on the --create part.
>>
>> You can create a separate follow-on patch for the --start part if you
>> wish, but I think that that patch would be rejected.
>
> I have moved this entry to the next commit fest, awaiting your updated
> patch.

I'm looking for simple patches to review so doing some gardening. Per
Peter's message, moved this to Waiting on author.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] Savepoint-related statements terminates connection

2017-09-08 Thread Catalin Iacob
On Thu, Sep 7, 2017 at 8:07 PM, Tom Lane  wrote:
> I've pushed up an attempt at this:
>
>
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b976499480bdbab6d69a11e47991febe53865adc
>
> Feel free to suggest improvements.

Thank you, this helps a lot. Especially since some of the behavior is a bit
surprising, for example stopping on error leading to ROLLBACK not being
done and the retroactive upgrade of preceding commands in an implicit block
to a transaction block when a BEGIN appears.

When reading this I also realized that the backend does send responses for
every individual query in a multi-query request, it's only libpq's PQexec
that throws away the intermediate results and only provides access to the
last one. I always thought the backend did that. The docs hinted that it's
the frontend ("psql only prints the last one", "PGresult describes the
result of the last command") but to assure myself I looked with tcpdump.

It's a pity that the underlying protocol has 2 ways to do batching of
queries but the official library hides both. I guess I should go review the
"Batch/pipelining support for libpq" patch rather than complaining.


Re: [HACKERS] [bug fix] Savepoint-related statements terminates connection

2017-09-05 Thread Catalin Iacob
On Mon, Sep 4, 2017 at 4:15 PM, Tom Lane  wrote:
> Also, the main thing that we need xact.c's involvement for in the first
> place is the fact that implicit transaction blocks, unlike regular ones,
> auto-cancel on an error, leaving you outside a block not inside a failed
> one.  So I don't exactly see how savepoints would fit into that.

I think this hits the nail on the head and should have a place in the
official docs as I now realize I didn't grasp this distinction before
I read this. My mental model was always "sending a bunch of semicolon
separated queries without BEGIN/COMMIT/ROLLBACK; in one PQexec is like
sending them one by one preceeded by a BEGIN; and followed by a
COMMIT; except you only get the response from the last one". Also,
explain what happens when there are BEGIN/ROLLBACK/COMMIT inside that
multiquery string, that's still not completely clear to me and I don't
want to reverse engineer it from your patch.

> Now admittedly, the same set of issues pops up if one uses an
> explicit transaction block in a multi-query string:
>
> begin\; insert ...\; savepoint\; insert ...\; release savepoint\; insert 
> ...\; commit;

According to my mental model described above, this would be exactly
the same as without the begin; and commit; which is not the case so I
think the distinction is worth explaining.

I think the lack of a more detailed explanation about the stuff above
confuses *a lot* of people, especially newcomers, and the confusion is
only increased by what client drivers do on top (like issuing implicit
BEGIN if configured in various modes specified by
language-specific-DB-independent specs like Python's DBAPI or Java's
JDBC) and one's background from other DBs that do it differently.

Speaking of the above, psql also doesn't explicitly document how it
groups lines of the file it's executing into PQexec calls. See below
for a personal example of the confusions all this generates.

I also encountered this FATAL a month ago in the context of "we have
some (migration schema) queries in some files and want to orchestrate
running them for testing". Initially we started with calling psql but
then we needed some client side logic for some other stuff and
switched to Python and Psycopg2. We did "read the whole file in a
Python string" and then call Psycopg2's execute() on that string. Note
that Psycopg2 only uses PQexec to issue queries. We had some SAVEPOINT
statements in the file which lead to the backend stopping and the next
Psycopg2 execute() on that connection saying Connection closed.
It was already confusing why Psycopg2 behaves differently than psql
(because we were issuing the whole file in one PQexec vs. psql
splitting on ; and issuing multiple PQexecs and SAVEPOINTs working
there) and the backend stopping only added to that confusion. Add on
top of that "Should we put BEGIN; and COMMIT; in the file itself? Or
is a single Psycopg2 execute() enough to have this schema migration be
applied transactionally? Is there a difference between the two?".

I searched the docs for existing explanations of multiquery strings
and found these references but all of them are a bit hand wavy:
- psql's reference explaining -c
- libpq's PQexec explanation
- the message flow document in the FE/BE protocol description


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strange case of kernel performance regression (4.3.x and newer)

2016-12-05 Thread Catalin Iacob
On Wed, Oct 5, 2016 at 8:18 AM, Tomas Vondra
 wrote:
> Over the past couple of days I've been doing a bit of benchmarking for the
> "group clog" patch [1], and I've ran into what I suspect might be a fairly
> serious performance regression on newer kernels (essentially 4.3.0 and
> newer). I got to a point where I need help with further investigation,
> testing on other systems etc.
>
> The workload tested in the patch [1] is quite simple - a transaction with 3
> x SELECT FOR UPDATE queries and 2 x SAVEPOINT on unlogged tables. The
> results (average tps from 5 x 5 minute runs, for 32 and 64 clients) on
> multiple kernels look like this:
>
> kernel  32   64
>-
> 3.19.8   4852459291
> 4.1.33   4719359574
> 4.2.84890159877
> 4.3.03218738970
> 4.3.63188938815
> 4.4.03194637702
> 4.4.23   3149837724
> 4.5.53153137351
> 4.7.63285938490

Just stumbled upon this old thread.

Since the regression is very clear and reproduceable and on a vanilla
kernel I don't think you need more testing. The kernel devs or at
least Linus take regressions (including performance) very seriously,
even if it's theoretically not the kernel's fault (so in this case
regardless what Postgres is doing).

I think you should just report it to lkml. If you have time to do a
bisection between 4.2.8 and 4.3.0 and find the offending commit that
would be even better, then email the author and CC lkml and Linus,
that should be enough to get it fixed.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-11-23 Thread Catalin Iacob
On Tue, Nov 22, 2016 at 8:38 AM, Tsunakawa, Takayuki
 wrote:
> transaction_read_only=on does not mean the standby.  As the manual article on 
> hot standby says, they are different.

> I'm afraid that if the current patch is committed, you will lose interest in 
> the ideal solution.

I agree with Robert that lacking the ideal solution shouldn't block a
good enough solution today. But I find the current patch as it stands
too confusing to pass the "good enough" bar.

> Then if the current patch is out as v10, there would be a concern about 
> incompatibility when we pursue the ideal solution in a later release.  That 
> is, "should we continue to report > that this server is standby even if it's 
> actually a primary with transaction_read_only is on, to maintain 
> compatibility with the older release."

Agreed. I am worried that this will end up being a wart:
target_server_type=primary doesn't really mean primary it means that
by default the session is not read only which by the way is usually
the case for primaries but that can be changed.

> If you want to connect to a server where the transaction is read-only, then 
> shouldn't the connection parameter be something like 
> "target_session_attrs=readonly"?  That represents exactly what the code does.

FWIW I find this to be a reasonable compromise. To keep the analogy
with the current patch it would be more something like
"target_session_attrs=read_write|any" and the docs could point out "by
the way, if you didn't tweak default_transaction_read_only read_write
will only pick a primary so this can be used for failover".
The functionality is the same but changing the name removes the
ambiguity in the semantics and leaves the door open for a future real
primary/replica functionality which includes logical replication with
whatever semantics are deemed appropriate then.

We do lose pgjdbc compatibility but the current patch doesn't have
that 100% anyway: in pgjdbc it's called targetServerType and it
accepts master instead of primary and it also accepts slave and
preferSlave
I notice at 
https://github.com/pgjdbc/www/blob/master/documentation/head/connect.md
that pgjdbc says "The master/slave distinction is currently done by
observing if the server allows writes" which, due to the "currently"
part seems to acknowledge this is not ideal and leaves the door open
for changes in semantics but I don't think changing semantics is going
to fly for Postgres itself.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-11-16 Thread Catalin Iacob
On Wed, Nov 16, 2016 at 2:50 PM, Robert Haas  wrote:
> Hmm, let's go back to the JDBC method, then.  "show
> transaction_read_only" will return true on a standby, but presumably
> also on any other non-writable node.  You could even force it to be
> true artificially if you wanted to force traffic off of a node, using
> ALTER {SYSTEM|USER ...|DATABASE ..} SET default_transaction_read_only
> = on
>
> I think that would address Alvaro's concern, and it's nicer anyway if
> libpq and JDBC are doing the same thing.

Not sure I agree that using this is a good idea in the first place.

But if we end up using this, I really think the docs should be very
explicit about what's implemented and not just say master/any. With
the master/any docs in the patch I would be *very* surprised if a
master is skipped only because it was configured with
default_transaction_read_only = on.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-11-15 Thread Catalin Iacob
On Tue, Nov 15, 2016 at 3:42 PM, Alvaro Herrera
 wrote:
> FWIW I'm not sure "primary" is the right term here either.  I think what
> we really want to know is whether the node can accept writes; maybe
> "pg_is_writable_node".

This made me think of another complication: what about cascading
replication where the middle node is both a master and a slave? You'd
probably not want to fallback to the middle node even though it is a
master for its downstream.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-11-15 Thread Catalin Iacob
On Tue, Nov 15, 2016 at 5:58 PM, Robert Haas  wrote:
> On Tue, Nov 15, 2016 at 9:42 AM, Alvaro Herrera
>  wrote:
>> I would rather come up with something that works in both cases that we
>> can extend internally later, say pg_is_primary_node() or something like
>> that instead; and we implement it initially by returning the inverse of
>> pg_is_in_recovery() for replicated non-logical flocks, while we figure
>> out what to do in logical replication.  Otherwise it will be harder to
>> change later if we embed it in libpq, and we may be forced into
>> supporting nonsensical situations such as having pg_is_in_recovery()
>> return true for logical replication primary nodes.
>
> I don't think we'll be backed into a corner like that, because we can
> always make this contingent on server version.  libpq will have that
> available.

But even with server version checking code, that code will be inside
libpq so there will be old libpq versions in the field that won't know
the proper query to send to new server versions.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Add TAP tests for pg_dump

2016-05-09 Thread Catalin Iacob
On 5/9/16, Stephen Frost  wrote:
>> And what if the tests are buggy? New test suites should go through a
>> CF first I think for proper review. And this is clearly one.
>
> They still won't result in data loss, corruption, or other direct impact
> on our users, even if they are buggy.  They also won't destablize the
> code base or introduce new bugs into the code.

Yes, but they could make things worse long term. For example if
introduce intermittent failures or if they're hard to understand or if
they test internals too deeply and don't let those internals evolve
because the tests break. All of them reasons why it's good that
they're reviewed.

> There is pretty clearly a lack of consensus on the question about adding
> new tests.

> What *should* we be doing right now?  Reviewing code and testing the
> system is what I had thought but perhaps I'm wrong.  To the extent that
> we're testing the system, isn't it better to write repeatable tests,
> particularly ones which add code coverage, than one-off tests that only
> check that the current snapshot of the code is operating correctly?

I also think that testing *and* keeping those tests for the future is
more valuable than just testing. But committers can just push their
tests while non committers need to wait for the next CF to get their
new tests committed. And committers pushing tests means they bypass
the review process which, as far as I know, doesn't happen for regular
patches: committers usually only directly commit small fixes not lots
of new (test) code.

So, crazy idea: what about test only CommitFests in between now and
the release? The rules would be: only new tests and existing test
improvements are allowed. For the rest, the CF would be the same as
usual: have an end date, committers agree to go through each patch and
commit or return it with feedback etc. Stephen would have added his
tests to the upcoming June test CF, Michael would have reviewed them
and maybe add his own and so on.

This does create work for committers (go through the submitted
patches) but acts as an incentive for test writers. Want your test
code committed? Well, there will be a CF very soon where it will get
committer attention so better write that test. It also gives a clear
date after which test churn also stops in order to not destabilize the
buildfarm right before a release.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-03-01 Thread Catalin Iacob
On 3/1/16, Pavel Stehule  wrote:
>> I though about it before and I prefer variant with possibility to enter
>> message as keyword parameter.

That's also ok, but indeed with a check that it's not specified twice
which I see you already added.

> I merged your patches without @3 (many thanks for it). Instead @3 I
> disallow double message specification (+regress test)

Great, welcome. Ran the tests for plpython-enhanced-error-06 again on
2.4 - 2.7 and 3.5 versions and they passed.

I see the pfree you added isn't allowed on a NULL pointer but as far
as I see message is guaranteed not to be NULL as dgettext never
returns NULL.

I'll mark this Ready for Committer.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-02-29 Thread Catalin Iacob
On 2/26/16, Pavel Stehule  wrote:
> Sending updated version

I did some changes on top of your last version as that was easier than
commenting about them, see attached.

0001 and 0005 are comment changes.

0002 is really needed, without it the tests fail on Python2.4.

0004 removes more code related to the unused position.

0003 is the most controversial. It removes the ability to pass message
as keyword argument. My reasoning was that keyword arguments are
usually optional and configure extra aspects of the function call
while message is required and fundamental so therefore it should be
positional. If you allow it as keyword as well, you have to deal with
the ambiguity of writing plpy.info('a message', message='a keyword arg
message, does this overwrite the first one or what?').

For the code with my patches on top on I ran the PL/Python tests for
2.4, 2.5, 2.6, 2.7 and 3.5. Everything passed.

Can you have a look at the patches, fold the ones you agree with on
top of yours and send the final version? With that I think this will
be Ready for Committer.


0001-Comment-improvements.patch
Description: binary/octet-stream


0002-Fix-tests-for-Python-2.4.patch
Description: binary/octet-stream


0003-Pass-message-only-as-positional-argument.patch
Description: binary/octet-stream


0004-Get-rid-of-setting-unused-position.patch
Description: binary/octet-stream


0005-Improve-comments.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-02-24 Thread Catalin Iacob
On Fri, Feb 19, 2016 at 9:41 PM, Pavel Stehule  wrote:
>  It looks like good idea. Last version are not breaking compatibility - and
> I think so it can works.
>
> I wrote the code, that works on Python2 and Python3

Hi,

I've attached a patch on top of yours with some documentation
improvements, feel free to fold it in your next version.

I think the concept is good. We're down to code level changes. Most of
them are cosmetical, misleading comments and so on but there are some
bugs in there as far as I see.


In object_to_string you don't need to test for Py_None. PyObject_Str
will return NULL on failure and whatever str() returns on the
underlying object. No need to special case None.

In object_to_string, when you're already in a (so != NULL) block, you
can use Py_DECREF instead of XDECREF.

object_to_string seems buggy to me: it returns the pointer returned by
PyString_AsString which points to the internal buffer of the Python
object but it also XDECREFs that object. You seem to be returning a
pointer to freed space.

get_string_attr seems to have the same issue as object_to_string.

In PLy_get_error_data query and position will never be set for
plpy.Error since you don't offer them to Python and therefore don't
set them in PLy_output_kw. So I think you should remove the code
related to them, including the char **query, int *position parameters
for PLy_get_error_data. Removing position also allows removing
get_int_attr and the need to exercise this function in the tests.

You're using PLy_get_spi_sqlerrcode in PLy_get_error_data which makes
the spi in the name unsuitable. I would rename it to just
PLy_get_sqlerrcode. At least the comment at the top of
PLy_get_spi_sqlerrcode needs to change since it now also extracts info
from Error not just SPIError.

/* set value of string pointer on object field */ comments are weird
for a function that gets a value. But those functions need to change
anyway since they're buggy (see above).

The only change in plpy_plpymodule.h is the removal of an empty line
unrelated to this patch, probably from previous versions. You should
undo it to leave plpy_plpymodule.h untouched.

Why rename PLy_output to PLy_output_kw? It only makes the patch bigger
without being an improvement. Maybe you also have this from previous
versions.

In PLy_output_kw you don't need to check message for NULL, just as sv
wasn't checked before.

In PLy_output_kw you added a FreeErrorData(edata) which didn't exist
before. I'm not familiar with that API but I'm wondering if it's
needed or not, good to have it or not etc.

In PLy_output_kw you didn't update the "Note: If sv came from
PyString_AsString(), it points into storage..." comment which still
refers to sv which is now deleted.

In PLy_output_kw you removed the "return a legal object so the
interpreter will continue on its merry way" comment which might not be
the most valuable comment in the world but removing it is still
unrelated to this patch.

In PLy_output_kw you test for kw != NULL && kw != Py_None. The kw !=
NULL is definitely needed but I'm quite sure Python will never pass
Py_None into it so the != Py_None isn't needed. Can't find a reference
to prove this at the moment.


Some more tests could be added to exercise more parts of the patch:
- check that SPIError is enhanced with all the new things:
schema_name, table_name etc.
- in the plpy.error call use every keyword argument not just detail
and hint: it proves you save and restore every field correctly from
the Error fields since that's not exercised by the info call above
which does use every argument
- use a wrong keyword argument to see it gets rejected with you error message
- try some other types than str (like detail=42) for error as well
since error goes on another code path than info
- a test exercising the "invalid sqlstate" code


0001-Doc-improvements.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: make NOTIFY list de-duplication optional

2016-02-21 Thread Catalin Iacob
On Sat, Feb 20, 2016 at 2:00 PM, Filip Rembiałkowski
 wrote:
> I was stuck because both syntaxes have their ugliness. NOTIFY allows the
> payload to be NULL:
> NOTIFY chan01;
>
> How would this look like in "never" mode?
> NOTIFY chan01, NULL, 'never'; -- seems very cryptic.

The docs say:
"The information passed to the client for a notification event
includes the notification channel name, the notifying session's server
process PID, and the payload string, which is an empty string if it
has not been specified."

So a missing payload is not a SQL NULL but an empty string. This means
you would have:
NOTIFY chan01;
NOTIFY chan01, ''; -- same as above
NOTIFY chan01, '', 'maybe'; -- same as above
NOTIFY chan01, '', 'never'; -- send this all the time

Seems ok to me.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: make NOTIFY list de-duplication optional

2016-02-19 Thread Catalin Iacob
On 2/9/16, Tom Lane  wrote:
> FWIW, I think it would be a good thing if the NOTIFY statement syntax were
> not remarkably different from the syntax used in the pg_notify() function
> call.  To do otherwise would certainly be confusing.  So on the whole
> I'd go with the "NOTIFY channel [ , payload [ , mode ] ]" option.

I'm quite interested in getting this addressed in time for 9.6 as I'll
be using NOTIFY extensively in a project and I agree with Craig that
the deduplication is frustrating both because you sometimes want every
event and because it can apparently cause O(n^2) behaviour (which I
didn't know before this thread). If another use case for suppressing
deduplication is needed, consider publishing events like "inserted
tuple", "deleted tuple" from triggers and a transaction that does
"insert, delete, insert" which the client then sees as "insert,
delete, oops nothing else".

Tom's proposal allows for more flexible modes than just the ALL and
DISTINCT keywords and accommodates the concern that DISTINCT will lead
to bug reports about not really being distinct due to savepoints.

Android has a similar thing for push notifications to mobile devices
which they call collapse:
https://developers.google.com/cloud-messaging/concept-options, search
for collapse_key.

So I propose NOTIFY channel [ , payload [ , collapse_mode ] ] with
collapse mode being:

* 'never'
  ** Filip's proposed behaviour for the ALL option
  ** if specified, every notification is queued regardless what's in the queue

* 'maybe'
  ** vague word allowing for flexibility in what the server decides to do
  ** current behaviour
  ** improves performance for big transactions if a row trigger
creates the same payload over and over one after the other due to the
current optimization of checking the tail of the list
  ** has performance problems O(n^2) for big transactions with
different payloads
  *** the performance problems can be addressed by a different
patch which uses a hash table, or decides to collapse less
aggressively (Tom's check last 100 idea), or whatever else
  *** in the meantime the 'never' mode acts as a good workaround

In the future we might support an 'always' collapse_mode which would
really be always, including across savepoints. Or an
'only_inside_savepoints' which guarantees the current behaviour.

Filip, do you agree with Tom's proposal? Do you plan to rework the
patch on these lines? If you are I'll try to review it, if not I could
give it a shot as I'm interested in having this in 9.6.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-02-18 Thread Catalin Iacob
On 2/18/16, Pavel Stehule  wrote:
> it doesn't look badly. Is there any possibility how to emulate it with
> Python2 ? What do you think about some similar implementation on Python2?

The example code I gave works as is in Python2.

The Python3 keyword only arguments are only syntactic sugar. See
https://www.python.org/dev/peps/pep-3102 for the details. But, as the
PEP notes,

def f(a, b, *, key1, key2)

is similar to doing this which also works in Python2

def f(a, b, *ignore, key1, key2):
if ignore:
   raise TypeError('too many positional arguments')

For our case, we want to accept any number of positional arguments due
to compatibility so we don't need or want the check for 'too many
positional arguments'.

Note that in both Python 2 and 3, invoking f(1, 2, key1='k1',
key2='k2') is just syntactic sugar for constructing the (1, 2) tuple
and {'key1': 'k1', 'key2': 'k2'} dict and passing those to f which
then unpacks them into a, b, key1 and key2. You see that reflected in
the C API where you get PyObject* args and PyObject* kw and you unpack
them explicitly with PyArg_ParseTupleAndKeywords or just use tuple and
dict calls to peek inside.

What you loose by not having Python3 is that you can't use
PyArg_ParseTupleAndKeywords and tell it that detail and so on are
keywork only arguments. But because we don't want to unpack args we
probably couldn't use that anyway even if we wouldn't support Python2.

Therefore you need to look inside the kw dictionary manually as my
example shows. What we could also do is check that the kw dictionary
*only* contains detail, hint and so on and raise a TypeError if it has
more things to avoid silently accepting stuff like:
plpy.error('message', some_param='abc'). In Python3
PyArg_ParseTupleAndKeywords would ensure that, but we need to do it
manually.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-02-17 Thread Catalin Iacob
On Wed, Feb 17, 2016 at 3:32 PM, Pavel Stehule  wrote:
>> Python 3 has keyword only arguments. It occurs to me they're exactly
>> for "optional extra stuff" like detail, hint etc.
>> Python 2 doesn't have that notion but you can kind of fake it since
>> you get an args tuple and a kwargs dictionary.
>
>
> I prefer a possibility to use both ways - positional form is shorter,
> keywords can help with some parameters.
>
> But I cannot to imagine your idea, can you show it in detail?

Sure, what I mean is:

plpy.error('msg') # as before produces message 'msg'
plpy.error(42) # as before produces message '42', including the
conversion of the int to str
plpy.error('msg', 'arg 2 is still part of msg') # as before, produces
message '('msg', 'arg2 is still part of msg')'
# and so on for as many positional arguments, nothing changes
# I still think allowing more than one positional argument is
unfortunate but for compatibility we keep allowing more

# to pass detail you MUST use keyword args to disambiguate "I really
want detail" vs. "I have argument 2 which is part of the messsage
tuple for compatibility"
plpy.error('msg', 42, detail='a detail') # produces message '('msg',
42)' and detail 'a detail'
plpy.error('msg', detail=77) # produces message 'msg' and detail '77'
so detail is also converted to str just like message for consistency
# and so on for the others
plpy.error('msg', 42, detail='a detail', hint='a hint')
plpy.error('msg', 42, schema='sch')

Only keyword arguments are treated specially and we know no existing
code has keyword arguments since they didn't work before.

Implementation wise, it's something like this but in C:

def error(*args, **kwargs):
if len(args) == 1:
message = str(args[0])
else:
message = str(args)

# fetch value from dictionary or None if the key is missing
detail = kwargs.pop('detail', None)
hint = kwargs.pop('hint', None)

# use message, detail, hint etc. to raise exception for error and
fatal/call ereport for the other levels

Is it clear now? What do you think?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-02-16 Thread Catalin Iacob
After I unzip the patch it doesn't apply.
patch says:
patch:  Only garbage was found in the patch input.

It's a combined diff, the git-diff manual says this about it:
Chunk header format is modified to prevent people from accidentally
feeding it to patch -p1. Combined diff format was created for review
of merge commit changes, and was not meant for apply.

Thanks for doing the test changes. It definitely shows the change is
big. The tests at least were heavily relying on both the str
conversion and on passing more than one argument. Sad face.

You're going to hate me but seeing this I changed my mind about 5,
requiring all those extra str calls is too much change in behaviour. I
was going to propose passing everything through str (so message,
detail, hint but also schema, table) but thinking about it some more,
I think I have something better.

Python 3 has keyword only arguments. It occurs to me they're exactly
for "optional extra stuff" like detail, hint etc.
Python 2 doesn't have that notion but you can kind of fake it since
you get an args tuple and a kwargs dictionary.

What about keeping the same behaviour for multiple positional
arguments (single one -> make it str, multiple -> use str of the args
tuple) and requiring users to pass detail, hint only as keyword
arguments? Code wise it would only mean adding PyObject* kw to
PLy_output and adding some code to extract detail, hint etc. from kw.
This would also allow getting rid of the GUC since backward
compatibility is fully preserved.

Again, sorry for all the back and forth but it still feels like we
haven't nailed the design to something satisfactory. All the tests you
needed to change are a hint towards that.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-02-16 Thread Catalin Iacob
On Tue, Feb 16, 2016 at 5:18 PM, Catalin Iacob  wrote:
> We have
> elog_test_legacy to test elog_test_legacy=true, the rest of the tests
> should use legacy_custom_exception=false.

This should have been:
We have elog_test_legacy to test legacy_custom_exception=true, the
rest of the tests should use legacy_custom_exception=false.

I noticed the patch isn't registered in the March CF, please do that.
It would be a pity to miss 9.6 because of not registering it in time.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-02-16 Thread Catalin Iacob
On Sat, Feb 13, 2016 at 4:26 PM, Pavel Stehule  wrote:
> I am sending new version. Current version does:

Hello,

I had a look and I like the big picture.

Tests fail when built against Python 3.5 with stuff like this in
regression.diffs:
-ERROR:  plpy.Error: stop on error
-DETAIL:  some detail
-HINT:  some hint
-CONTEXT:  Traceback (most recent call last):
-  PL/Python function "elog_test", line 18, in 
-plpy.error('stop on error', 'some detail','some hint')
-PL/Python function "elog_test"
+ERROR:  could not convert Python Unicode object to bytes
+DETAIL:  TypeError: bad argument type for built-in operation
+CONTEXT:  PL/Python function "elog_test"

This is related to the use of PyString_AsString and the changed
semantics of that in Python 3 (due to the fact that strings are now
Unicode objects and so on). Didn't have time to dig more deeply into
the exact cause.

Similarly, there are alternative expected test outputs that you didn't
update, for example src/pl/plpython/expected/plpython_types_3.out so
tests fail on some Python versions due to those as well.

> 1. the plpy utility functions can use all ErrorData fields,
> 2. there are no new functions,
> 3. via GUC plpythonu.legacy_custom_exception we can return previous behave,
> 4. only exception Error is raised with natural structure - no composite
> value spidata.
> 5. fields: message, detail and hint are implicitly translated to string - it
> decrease a necessity of legacy mode

I disagree that 5. is a good idea. I think we should just treat
message, detail and hint like the other ones (schema, table etc.). Use
s in PyArg_ParseTupleAndKeywords and let the user explicitly cast to a
string. Explicit is better than implicit. The way you did it you keep
part of the weird old interface which used to cast to string for you.
We shouldn't keep warts of the old behaviour, especially with the GUC
to ask for the old behaviour.

By the way, getting rid of object_to_string and its usage in
PLy_output_kw removed some "ERROR: could not convert Python Unicode
object to bytes" failures in the tests so I'm quite sure that the
usage of PyString_AsString is responsible for those.

I don't like that you set legacy_custom_exception=true in some
existing tests, probably to avoid changing them to the new behaviour.
We should trust that the new behaviour is what we want and the tests
should reflect that. If it's too much work, remember we're asking
users to do the same work to convert their code. We have
elog_test_legacy to test elog_test_legacy=true, the rest of the tests
should use legacy_custom_exception=false.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-02-04 Thread Catalin Iacob
On Tue, Feb 2, 2016 at 11:52 PM, Alvaro Herrera
 wrote:
> Robert Haas wrote:
>> The eventual committer is likely to be much happier with this patch if
>> you guys have achieved consensus among yourselves on the best
>> approach.
>
> Actually I imagine that if there's no agreement between author and first
> reviewer, there might not *be* a committer in the first place.  Perhaps
> try to get someone else to think about it and make a decision.  It is
> possible that some other committer is able to decide by themselves but I
> wouldn't count on it.

Pavel and I agree that the backward incompatible behavior is cleaner,
but it's backward incompatible. Whether that extra cleanness is worth
the incompatibility is never obvious. In this case I think it does.
But since my opinion is just my opinion, I was planning to make the
committer be that someone else that weighs the issues and makes a
decision.

I'm new around here and picked this patch to get started due to having
Python knowledge and the patch seeming self contained enough. Being
new makes me wonder all the time "am I just making life difficult for
the patch author or is this idea genuinely good and therefore I should
push it forward?". I think more beginners that try to do reviews
struggle with this.

But, let's try to reach a decision. The existing behaviour dates back
to 0bef7ba Add plpython code by Bruce. I've added him to the mail,
maybe he can help us with a decision. I'll summarize the patch and
explain the incompatibility decision with some illustrative code.

The patch is trying to make it possible to call ereport from PL/Python
and provide the rich ereport information (detail, hint, sqlcode etc.).
There are already functions like plpy.info() but they only expose
message and they call elog instead of ereport.

See the attached incompat.py. Running it produces:

   existing behaviour
PG elog/ereport with message: 1: hi detail: None hint: None
PG elog/ereport with message: ('2: hi', 'another argument') detail:
None hint: None
PG elog/ereport with message: ('3: hi', 'another argument', 2) detail:
None hint: None
PG elog/ereport with message: ('4: hi', 'another argument', 2, 'lots',
'of', 'arguments') detail: None hint: None

 new behaviour
PG elog/ereport with message: 1: hi detail: None hint: None
PG elog/ereport with message: 2: hi detail: another argument hint: None
PG elog/ereport with message: 3: hi detail: another argument hint: 2
Traceback (most recent call last):
  File "incompat.py", line 43, in 
info_new('4: hi', 'another argument', 2, 'lots', 'of', 'arguments')
TypeError: info_new() takes at most 3 arguments (6 given)

I find existing behaviour for 2, 3 and 4 unlike other Python APIs I've
seen, surprising and not very useful. If I want to log a tuple I can
construct and pass a single tuple, I don't see why plpy.info() needs
to construct it for me. And for the documented, single argument call
nothing changes.

The question to Bruce (and others) is: is it ok to change to the new
behaviour illustrated and change meaning for usages like 2, 3 and 4?
If we don't want that, the solution Pavel and I see is introducing a
parallel API named plpy.raise_info or plpy.rich_info or something like
that with the new behaviour and leave the existing functions
unchanged. Another option is some compatibility GUC but I don't think
it's worth the trouble and confusion.

# simulated PG elog
def elog(message):
ereport(message)

# simulated PG ereport
def ereport(message, detail=None, hint=None):
print 'PG elog/ereport with message: %s detail: %s hint: %s' % (message, detail, hint)


# existing code behaves like this:
#   takes an unlimited number of arguments
#   doesn't take keyword arguments
#   makes a string representation of the tuple of all arguments unless that tuple has size 1 in which case it only makes a string representation of the first one
def info_existing(*args):
if len(args) == 1:
# special case added by Peter in 2e3b16
str_to_log = str(args[0])
else:
str_to_log = str(args)

elog(str_to_log)

# and I'm proposing to change it to do this:
#   take 1 required argument and extra optional arguments for every argument accepted by ereport
#   accepts keyword arguments
#   passing too many arguments will get rejected
def info_new(message, detail=None, hint=None):
ereport(message, detail, hint)

print 'existing behaviour'.center(40)
info_existing('1: hi')
info_existing('2: hi', 'another argument')
info_existing('3: hi', 'another argument', 2)
info_existing('4: hi', 'another argument', 2, 'lots', 'of', 'arguments')

print

print 'new behaviour'.center(40)
info_new('1: hi') # for the documented single argument case same behaviour as existing
info_new('2: hi', 'another argument')
info_new('3: hi', 'another argument', 2)
info_new('4: hi', 'another argument', 2, 'lots', 'of', 'arguments')



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers

Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-02-02 Thread Catalin Iacob
On Tue, Feb 2, 2016 at 3:48 PM, Pavel Stehule  wrote:
> If we decided to break compatibility, then we have to do explicitly. Thid
> discussion can continue with commiter, but send path with duplicitly defined
> functions has not sense for me. Currently I out of office, so I cannot to
> clean it. 4.2 I should to work usually

I think I didn't make myself clear so I'll try again.

There are 2 options:

1. keep info etc. unchanged and add raise_info etc.
2. change behaviour of info etc. and of course don't add raise_*

You already implemented 1. I said I think 2. is better and worth the
compatibility break in my opinion. But the committer decides not me.

Since 1. is already done, what I propose is: let's finish the other
aspects of the patch (incorporate my docs updates and details in Error
instead of SPIError) then I'll mark this ready for committer and
summarize the discussion. I will say there that option 1. was
implemented since it's backward compatible but that if the committer
thinks option 2. is better we can change the patch to implement option
2. If you do the work for 2 now, the committer might still say "I want
1" and then you need to do more work again to go back to 1. Easier to
just stay with 1 for now until we have committer input.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-02-01 Thread Catalin Iacob
On Mon, Feb 1, 2016 at 5:37 PM, Pavel Stehule  wrote:
> Dne 29. 1. 2016 18:09 napsal uživatel "Catalin Iacob"
> :
>> Looking at the output above, I don't see who would rely on calling
>> plpy.error with multiple arguments and getting the tuple so I'm
>> actually in favor of just breaking backward compatibility. Note that
>> passing multiple arguments isn't even documented. So I would just
>> change debug, info, error and friends to do what raise_debug,
>> raise_info, raise_error do. With a single argument behavior stays the
>> same, with multiple arguments one gets more useful behavior (detail,
>> hint) instead of the useless tuple. That's my preference but we can
>> leave the patch with raise and leave the decision to the committer.
>>
>
> if breaking compatibility, then raise* functions are useless, and should be
> removed.

Indeed. I think it's better to change the existing functions and break
compatibility instead of adding the raise_ functions. But the
committer will decide if that's what should be done. Since you wrote
the patch with raise_* I propose you keep it that way for now and let
the committer decide. I wrote the doc patch based on raise_* as well.

Attached is the doc patch (made on top of your patch). I'll wait for
you to combine them and switch to raising Error and then hopefully
this is ready for a committer to look at.


0003-Improve-docs.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-01-29 Thread Catalin Iacob
On Tue, Jan 26, 2016 at 5:42 PM, Pavel Stehule  wrote:
> I removed from previous patch all OOP related changes. New patch contains
> raise_ functions only. This interface is new generation of previous
> functions: info, notice, warning, error with keyword parameters interface. I
> didn't changed older functions due keeping compatibility.

Hi,

Even without the OOP changes, the exception classes are still there as
the underlying mechanism that error and raise_error use.

I looked at the patch and what I don't like about it is that
raise_error uses SPIError to transport detail, hint etc while error
uses Error. This is inconsistent and confusing.

Take this example:

CREATE OR REPLACE FUNCTION err()
  RETURNS int
AS $$
  plpy.error('only msg from plpy.error', 'arg2 for error is part of
tuple', 'arg3 also in tuple')
$$ LANGUAGE plpython3u;

SELECT err();

CREATE OR REPLACE FUNCTION raise_err()
  RETURNS int
AS $$
  plpy.raise_error('only msg from plpy.raise_error', 'arg2 for
raise_error is detail', 'arg3 is hint')
$$ LANGUAGE plpython3u;

SELECT raise_err();

With your patch, this results in:

CREATE FUNCTION
psql:plpy_error_raise_error_difference:9: ERROR:  plpy.Error: ('only
msg from plpy.error', 'arg2 for error is part of tuple', 'arg3 also in
tuple')
CONTEXT:  Traceback (most recent call last):
  PL/Python function "err", line 2, in 
plpy.error('only msg from plpy.error', 'arg2 for error is part of
tuple', 'arg3 also in tuple')
PL/Python function "err"
CREATE FUNCTION
psql:plpy_error_raise_error_difference:17: ERROR:  plpy.SPIError: only
msg from plpy.raise_error
DETAIL:  arg2 for raise_error is detail
HINT:  arg3 is hint
CONTEXT:  Traceback (most recent call last):
  PL/Python function "raise_err", line 2, in 
plpy.raise_error('only msg from plpy.raise_error', 'arg2 for
raise_error is detail', 'arg3 is hint')
PL/Python function "raise_err"

>From the output you can already see that plpy.error and
plpy.raise_error (even with a single argument) don't use the same
exception: plpy.error uses Error while raise_error uses SPIError. I
think with a single argument they should be identical and with
multiple arguments raise_error should still use Error but use the
arguments as detail, hint etc. In code you had to export a function to
plpy_spi.h to get to the details in SPIError while plpy.error worked
just fine without that.

I've attached two patches on top of yours: first is a comment fix, the
next one is a rough proof of concept for using plpy.Error to carry the
details. This allows undoing the change to export
PLy_spi_exception_set to plpy_spi.h. And then I realized, the changes
you did to SPIError are actually a separate enhancement (report more
details like schema name, table name and so on for the query executed
by SPI). They should be split into a separate patch. With these
patches the output of the above test is:

CREATE FUNCTION
psql:plpy_error_raise_error_difference:9: ERROR:  plpy.Error: ('only
msg from plpy.error', 'arg2 for error is part of tuple', 'arg3 also in
tuple')
CONTEXT:  Traceback (most recent call last):
  PL/Python function "err", line 2, in 
plpy.error('only msg from plpy.error', 'arg2 for error is part of
tuple', 'arg3 also in tuple')
PL/Python function "err"
CREATE FUNCTION
psql:plpy_error_raise_error_difference:17: ERROR:  plpy.Error: only
msg from plpy.raise_error
DETAIL:  arg2 for raise_error is detail
HINT:  arg3 is hint
CONTEXT:  Traceback (most recent call last):
  PL/Python function "raise_err", line 2, in 
plpy.raise_error('only msg from plpy.raise_error', 'arg2 for
raise_error is detail', 'arg3 is hint')
PL/Python function "raise_err"

The two approaches are consistent now, both create a plpy.Error. The
patch is not complete, it only handles detail and hint not the others
but it should illustrate the idea.

Looking at the output above, I don't see who would rely on calling
plpy.error with multiple arguments and getting the tuple so I'm
actually in favor of just breaking backward compatibility. Note that
passing multiple arguments isn't even documented. So I would just
change debug, info, error and friends to do what raise_debug,
raise_info, raise_error do. With a single argument behavior stays the
same, with multiple arguments one gets more useful behavior (detail,
hint) instead of the useless tuple. That's my preference but we can
leave the patch with raise and leave the decision to the committer.

What do you think? Jim doesn't like the separate Error being raised. I
don't agree, but even if I would, it's not this patch's job to change
that and Error is already raised today. This patch should be
consistent with the status quo and therefore be similar to plpy.error.
If Error is bad, it should be replaced by SPIError everywhere
separately.

Next week I'll send a patch to improve the docs.


0002-use-plpy.Error-to-hold-the-data-not-plpy.SPIError.patch
Description: binary/octet-stream


0001-fix-comment.patch
Description: binary/octet-stream

-- 
Sent via pgsql

Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-01-25 Thread Catalin Iacob
On 1/21/16, Pavel Stehule  wrote:
> 2016-01-14 17:16 GMT+01:00 Catalin Iacob :
>> Consider this call chain: SQL code 1 -> PL/Python code 1 -> SPI (via
>> plpy.execute) -> SQL code 2 -> PL/Python code 2
>>
>> If I'm writing PL/Python code 1 and I want to raise an error toward
>> SQL code 1 I use raise plpy.Error. plpy.SPIError is what I get if I
>> call into SQL code 2 and that has an error. That error could indeed
>> come from a plpy.Error thrown by PL/Python code 2 or it could come
>> from a SQL syntax error or whatever. But the symmetry holds:
>> plpy.Error is what you raise to stop the query and return errors to
>> your SQL caller and plpy.SPIError is what you get back if you use SPI
>> to execute some other piece of SQL which has an error. I *think* (I'm
>> an internals newbie though so I could be wrong) that SQL code 1
>> doesn't call into PL/Python code 1 via SPI so why would the latter use
>> something called SPIError to inform the former about an error?
>>
>
> It is not correct - outside PLPython you got a Error (PostgreSQL error has
> not any classes), and isn't important the raising class (Error or
> SPIError). Inside PL/Python you will got SPIError or successors (based on
> SQLcode).

What exactly is not correct? Indeed, *outside* PLPython you don't get
a Python exception because Postgres isn't Python, you get Postgres'
way of representing an exception (error code, message, hint and so on
that might go to the client or not etc.). But that's normal, it's just
the impedance mismatch between the fact that you embed Python with its
rules into Postgres with its other rules. It's normal that the "host"
(Postgres) wins in the end and uses its mechanism to report errors.
The "guest" (PLPython) is there only to give mechanisms to activate
the "host".

But *inside* PLPython what I wrote is true, see this example for what I mean:

CREATE FUNCTION test()
  RETURNS int
AS $$
  def a_func():
raise plpy.Error('an error')

  try:
a_func()
  except plpy.Error as exc:
plpy.info('have exc {} of type {}'.format(exc, type(exc)))
plpy.info('have spidata {}'.format(hasattr(exc, 'spidata')))
$$ LANGUAGE plpython3u;

Running this produces:

DROP FUNCTION
CREATE FUNCTION
psql:plpy_demo:16: INFO:  have exc an error of type 
psql:plpy_demo:16: INFO:  have spidata False

> Currently If I raise plpy.Error, then it is immediately trasformed to
> PostgreSQL, and and then to SPIError and successors.

Depends how you define immediately. I would say it's really not
immediately, it's only at the Postgres boundary. Imagine in the
example above that a_func could call lots of other Python code and
somewhere deep down raise Error would be used. Within that whole
execution the error stays Error and can be caught as such, it will
have nothing to do with SPIError.

But in my opinion this discussion shouldn't really even be about
catching these things, most of the times you won't catch them and
instead you'll let them go to Postgres. The discussion should be
whether raise plpy.Error(...), plpy.raise_error, plpy.raise_info(,,,)
etc. all with keyword argument support are a good PLPython interface
to Postgres' ereport. I think they are.

On a different note, I've explained what I think but I might be wrong
and don't want to stall the patch just because I don't agree with the
approach it takes.

The waiting for author status is ok since, as far as I can see, Pavel
also agrees that at least some of the issues raised in
http://www.postgresql.org/message-id/CAHg_5grU=LVRbZDhfCiiYc9PhS=1X5f=gd44-3jcul-qaor...@mail.gmail.com
need to be fixed.

Would it help maybe to remove myself as reviewer in the CF? Maybe that
makes somebody else pick it up and move it further.

I could also code the version I'm describing but that design would be
contrary to what Jim and Pavel lean towards now. It could help to
compare approaches side by side but I don't like the idea of dueling
patches.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-01-14 Thread Catalin Iacob
On Wed, Jan 13, 2016 at 7:40 PM, Jim Nasby  wrote:
> On 1/12/16 11:25 AM, Catalin Iacob wrote:
>> They're similar but not really the same thing. raise Error and
>> plpy.error are both ways to call ereport(ERROR, ...) while SPIError is
>> raised when coming back after calling into Postgres to execute SQL
>> that itself raises an error. Now indeed, that executed SQL raised an
>> error itself via another ereport(ERROR, ...) somewhere but that's a
>> different thing.
>
>
> Why should they be different? An error is an error. You either want to trap
> a specific type of error or you don't. Having two completely different ways
> to do the same thing is just confusing.

With my (indeed limited) understanding, I don't agree they're the same
thing and I tried to explain above why I think they're quite
different. You may not agree. If they are different things, using
different exception types is natural.

Consider this call chain: SQL code 1 -> PL/Python code 1 -> SPI (via
plpy.execute) -> SQL code 2 -> PL/Python code 2

If I'm writing PL/Python code 1 and I want to raise an error toward
SQL code 1 I use raise plpy.Error. plpy.SPIError is what I get if I
call into SQL code 2 and that has an error. That error could indeed
come from a plpy.Error thrown by PL/Python code 2 or it could come
from a SQL syntax error or whatever. But the symmetry holds:
plpy.Error is what you raise to stop the query and return errors to
your SQL caller and plpy.SPIError is what you get back if you use SPI
to execute some other piece of SQL which has an error. I *think* (I'm
an internals newbie though so I could be wrong) that SQL code 1
doesn't call into PL/Python code 1 via SPI so why would the latter use
something called SPIError to inform the former about an error?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-01-12 Thread Catalin Iacob
On Mon, Jan 11, 2016 at 7:33 PM, Pavel Stehule  wrote:
> I see it.
>
> it looks like distinguish between Error and SPIError is wrong way. And I
> have any other ugly use case.
>
> If I raise a Error from one PLPythonu function, then in other PLPython
> function I'll trap a SPIError exception, because the information about class
> was lost inside Postgres. And it should be pretty messy.
> I have not information if any exception was User based or SPI based.

This isn't that bad. The real Python exception is only lost if the
first function calls into Postgres which then ends into another
PL/Python function which raises. That's just the way it is. If you use
Psycopg2 on the client and a PL/Python function in the server you also
don't get in the client the real exception that PL/Python raised even
though it's Python at both ends. The analogy isn't perfect since one
is in process and the other cross process but you get the point. If a
PL/Python function calls another one directly and that one raises
something, the caller can catch that just like in regular Python.

> The differentiation between Error and SPIError is wrong, because there isn't
> any difference in reality.

They're similar but not really the same thing. raise Error and
plpy.error are both ways to call ereport(ERROR, ...) while SPIError is
raised when coming back after calling into Postgres to execute SQL
that itself raises an error. Now indeed, that executed SQL raised an
error itself via another ereport(ERROR, ...) somewhere but that's a
different thing.

I'm getting more and more convinced that SPIError should be left
unchanged and should not even inherit from BaseError as it's a
different thing. And as I said this means BaseError isn't the base of
all exceptions that can be raised by the plpy module so then it should
probably not be called BaseError. Maybe something like ReportableError
(the base class of ereport frontends).

Or don't have a base class at all and just allow constructing both
Error and Fatal instances with message, detail, hint etc. What you
loose is you can't catch both Error and Fatal with a single except
ReportableError block but Fatal is maybe not meant to be caught and
Error is also not really meant to be caught either, it's meant to be
raised in order to call ereport(ERROR, ...). I now like this option
(no base class but same attributes in both instances) most.

As I see it, what this patch tries to solve is that raise Error and
plpy.error (and all the others) are not exposing all features of
ereport, they can only pass a message to ereport but can't pass all
the other things ereport accepts: detail, hint etc. The enhancement is
being able to pass all those arguments (as positional or keyword
arguments) when constructing and raising an Error or Fatal instance or
when using the plpy.error helpers. Since the existing helpers accept
multiple arguments already (which they unfortunately and weirdly
concatenate in a tuple whose string representation is pushed into
ereport as message) we can't repurpose the multiple arguments as
ereport detail, hint etc. so Pavel invented paralel plpy.raise_error
and friends which do that. If a bold committer says the string
representation of the tuple is too weird to be relied on we could even
just change meaning of plpy.error and accept (and document) the
compatibility break.

So the patch is almost there I think, it just needs to stop messing
around with SPIError and the spidata attribute.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-01-12 Thread Catalin Iacob
On Tue, Jan 12, 2016 at 5:50 PM, Pavel Stehule  wrote:
> Error and Fatal exception classes are introduced in my patch - it was Peter'
> request (if I remember it well), and now I am thinking so it is not good
> idea.

Now everybody is confused :). Your patch does:

-PLy_exc_error = PyErr_NewException("plpy.Error", NULL, NULL);
-PLy_exc_fatal = PyErr_NewException("plpy.Fatal", NULL, NULL);
-PLy_exc_spi_error = PyErr_NewException("plpy.SPIError", NULL, NULL);

[snip]

+PLy_exc_error = PyErr_NewException("plpy.Error", PLy_exc_base, NULL);
+PLy_exc_fatal = PyErr_NewException("plpy.Fatal", PLy_exc_base, NULL);
+PLy_exc_spi_error = PyErr_NewException("plpy.SPIError",
PLy_exc_base, NULL);

So they are there without the patch, you now make them inherit from
the new BaseError previously they just inherited from Exception.

More soon in another reply I was just typing when this came in.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-01-11 Thread Catalin Iacob
On Fri, Jan 8, 2016 at 7:56 AM, Pavel Stehule  wrote:
>> 3. PLy_error__init__ is used for BaseError but always sets an
>> attribute called spidata, I would expect that to only happen for
>> SPIError not for BaseError. You'll need to pick some other way of
>> attaching the error details to BaseError instances. Similarly,
>> PLy_get_spi_error_data became PLy_get_error_data and it's invoked on
>> other classes than SPIError but it still always looks at the spidata
>> attribute.
>
>
> I afraid of compatibility issues - so I am not changing internal format
> simply. Some legacy application can be based on detection of spidata
> attribute. So I cannot to change this structure for SPIError.

Indeed, I wasn't suggesting changing it for SPIError, that needs to
stay spidata.

> I can change it for BaseError, but this is question. Internally BaseError
> and SPIError share same data. So it can be strange if BaseError and SPIError
> uses different internal formats.
>
> I am interesting your opinion??

Yes, it's not great if BaseError and SPIError have different ways but
I think having BaseError have a member (spidata) named after one of
its subclasses is even worse.

I would say detail, hint etc belong to BaseError and I would make them
different attributes of BaseError instances instead of one big
BaseError.data tuple similar to what SPIError.spidata is now. SPIError
would keep its spidata for compatibility but would get the same
information into its detail, hint etc. attributes since it's derived
from BaseError.

So an equivalent to this Python (pseudo)code:

# base class for all exceptions raised by PL/Python
class BaseError:
def __init__(self, msg, detail=None, hint=None, and so on for
every accepted argument):
self.msg = msg
self.detail = detail
# and so on

class Error(BaseError):
 pass

class Fatal(BaseError):
pass

class SPIError(BaseError):
 def __init__(self, msg):
 # call BaseError.__init__ so that SPIError also gets .msg,
.detail and so on like all other BaseError instances
 # do what's done today to fill spidata to keep backward compatibility

If you don't like that spidata duplicates the other fields, it's
probably also ok to not make inherit from BaseError at all. I actually
like this better. Then I would call the base class something like
UserRaisedError (ugly name but can't think of something better to
emphasize that it's about errors that the PL/Python developer is
supposed to raise rather SPIError which is Postgres raising them) and
it would be:

# base class for exceptions raised by users in their PL/Python code
class UserRaisedError:
def __init__(self, msg, detail=None, hint=None, and so on for
every accepted argument):
self.msg = msg
self.detail = detail
# and so on

class Error(UserRaisedError):
 pass

class Fatal(UserRaisedError):
pass

# base class for exceptions raised by Postgres when executing sql code
class SPIError:
 # no change to this class

>
> a Fatal cannnot be raised - it is a session end. Personally - support of
> Fatal level is wrong, and it should not be propagated to user level, but it
> is now. And then due consistency I wrote support for fatal level. But hard
> to test it.
>

I see, then never mind.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-01-07 Thread Catalin Iacob
shellOn Thu, Dec 31, 2015 at 11:37 AM, Pavel Stehule
 wrote:
> here is new version.

And here's a new review, sorry for the delay.

> Now I use a common ancestor "plpy.BaseError" for plpy builtin classes. So
> plpy.SPIError isn't descendant of plpy.Error and then there are not possible
> incompatibility issues.

That's good.

> Instead modification builtin function plpy.debug, plpy.info, ... and
> introduction incompatibility I wrote new set of functions with keyword
> parameters (used mainly  for elevel < ERROR):
>
> plpy.raise_debug, plpy.raise_info, plpy.raise_notice, plpy.raise_warning,
> plpy.raise_error and plpy.raise_fatal.

I'm on the fence whether these are good ideas. On one hand having them
is nice and they avoid changing the existing plpy.debug etc. in
backward incompatible ways, on the other hand they're similar to those
so it can be confusing to know which ones to pick. Any opinions from
others on whether it's better to add these or not?

The docs need more work, especially if raise_* are kept, as the docs
should then clearly explain the differences between the 2 sets and
nudge users toward the new raise_* functions. I can help with that
though if there are objections to these functions I wouldn't mind
hearing it before I document them.

As for the code:

1. there are quite some out of date comments referring to SPIError in
places that now handle more types (like /* prepare dictionary with
__init__ method for SPIError class */ in plpy_plpymodule.c)

2. you moved PLy_spi_exception_set from plpy_spi.c to plpy_elog.c and
renamed it to PLy_base_exception_set but it's still spi specific: it
still sets an attribute called spidata. You needed this because you
call it in PLy_output_kw but can't you instead make PLy_output_kw
similar to PLy_output and just call PLy_exception_set in the PG_CATCH
block like PLy_output does? With the patch plpy.error(msg) will raise
an error object without an spidata attribute but plpy.raise_error(msg)
will raise an error with an spidata attribute.

3. PLy_error__init__ is used for BaseError but always sets an
attribute called spidata, I would expect that to only happen for
SPIError not for BaseError. You'll need to pick some other way of
attaching the error details to BaseError instances. Similarly,
PLy_get_spi_error_data became PLy_get_error_data and it's invoked on
other classes than SPIError but it still always looks at the spidata
attribute.

4. it wouldn't hurt to expand the tests a bit to also raise plpy.Fatal
with keyword arguments and maybe catch BaseError and inspect it a bit
to see it contains reasonable values (per 4. have spidata when raising
an SPIError but not when raising an Error or BaseError or Fatal etc.)

As seen from 1, 2, and 3 the patch is still too much about SPIError.
As I see it, it should only touch SPIError to make it inherit from the
new BaseError but for the rest, the patch shouldn't change it and
shouldn't propagate spidata attributes and SPIError comments. As it
stands, the patch has historical artifacts that show it was initially
a patch for SPIError but it's now a patch for error reporting so those
should go away.

I'll set the patch back to waiting for author.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2016-01-06 Thread Catalin Iacob
On Wed, Jan 6, 2016 at 3:11 AM, Jim Nasby  wrote:
> Which doesn't help anyone, because neither of those provide a list of "hey,
> here's stuff you could do to contribute". The closest we come to that is the
> TODO, which isn't well known and has almost no items for newbies (and the
> newbie items that are there don't offer much advice).
>
> The reason I this matters is because yesterday I posted a task for a new
> hacker with simple guidelines and 24 hours later it was completed[1]. That
> tells me there's people that would love to contribute but don't know how or
> what to contribute.

Jim,

I want to explicitly thank you for your post about that task, I would
like to see more of that. I share the sentiment that there are more
people wanting to contribute but finding it rather hard to find a
starting point. I was (am?) in that position and so far found two ways
out:

1. I looked at the commit fest patches as a list of things to
contribute to (by doing a review which is currently in progress)
2. Robert at some point mentioned in an email "someone could improve
the docs in this patch" so I did that

But I can see that 1. is intimidating to do for new people that might
think "how can you review without ever having looked at the code
before?". Turns out you can and the wiki mentions it explicitly but
it's probably still intimidating for some. And 2. required noticing
Robert's sentence in the middle of a long email thread.

I also think a list of small things suitable for new contributors
would help attracting them. Not all would stick and go on to larger
items but hopefully at least some would.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2015-12-07 Thread Catalin Iacob
On Thu, Dec 3, 2015 at 6:54 PM, Pavel Stehule  wrote:
> Don't understand - if Fatal has same behave as Error, then why it cannot be
> inherited from Error?
>
> What can be broken?

Existing code that did "except plpy.Error as exc" will now also be
called if plpy.Fatal is raised. That wasn't the case so this changes
meaning of existing code, therefore it introduces an incompatibility.

>>
>> 5. all the reporting functions: plpy.debug...plpy.fatal functions
>> should also be changed to allow more arguments than the message and
>> allow them as keyword arguments
>
>
> this is maybe bigger source of broken compatibility
>
> lot of people uses plpy.debug(var1, var2, var3)
>
> rich constructor use $1 for message, $2 for detail, $3 for hint. So it was a
> reason, why didn't touch these functions

No, if you use PyArg_ParseTupleAndKeywords you'll have Python accept
all of these so previous ways are still accepted:

plpy.error('a_msg', 'a_detail', 'a_hint')
plpy.error'a_msg', 'a_detail', hint='a_hint')
plpy.error('a_msg', detail='a_detail', hint='a_hint')
plpy.error(msg='a_msg', detail='a_detail', hint='a_hint')
plpy.error('a_msg')
plpy.error(msg='a_msg')
etc.

But I now see that even though the docs say plpy.error and friends
take a single msg argument, they actually allow any number of
arguments. If multiple arguments are passed they are converted to a
tuple and the string representation of that tuple goes into
ereport/plpy.Error:

CREATE FUNCTION test()
  RETURNS int
AS $$
  try:
plpy.error('an error message', 'detail for error', 'hint for error')
  except plpy.Error as exc:
plpy.info('have exc {}'.format(exc))
plpy.info('have exc.args |{}| of type {}'.format(exc.args, type(exc.args)))
$$ LANGUAGE plpython3u;
SELECT test();
[catalin@archie pg]$ psql -f plpy test
CREATE FUNCTION
psql:plpy:13: INFO:  have exc ('an error message', 'detail for error',
'hint for error')
psql:plpy:13: INFO:  have exc.args |("('an error message', 'detail for
error', 'hint for error')",)| of type 

In the last line note that exc.args (the args tuple passed in the
constructor of plpy.Error) is a tuple with a single element which is a
string which is a representation of the tuple passed into plpy.error.
Don't know why this was thought useful, it was certainly surprising to
me. Anyway, the separate $2, $3 etc are currently not detail and hint,
they're just more stuff that gets appended to this tuple. They don't
get passed to clients as hints. And you can pass lots of them not just
detail and hint.

My proposal would change the meaning of this to actually interpret the
second argument as detail, third as hint and to only allow a limited
number (the ones with meaning to ereport). The hint would go to
ereport so it would be a real hint: it would go to clients as HINT and
so on. I think that's way more useful that what is done now. But I now
see my proposal is also backward incompatible.

> I am not against to plpy.ereport function - it can live together with rich
> exceptions class, and users can use what they prefer.

I wasn't suggesting introducing ereport, I was saying the existing
functions and exceptions are very similar to your proposed ereport.
Enhancing them to take keyword arguments would make them a bit more
powerful. Adding ereport would be another way of doing the same thing
so that's more confusing than useful.

All in all it's hard to improve this cleanly. I'm still not sure the
latest patch is a good idea but now I'm also not sure what I proposed
is better than leaving it as it is.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: multiple psql option -c

2015-12-04 Thread Catalin Iacob
On Fri, Dec 4, 2015 at 3:47 PM, Robert Haas  wrote:
> For the most part, the cleanups in this version are just cosmetic: I
> fixed some whitespace damage, and reverted some needless changes to
> the psql references page that were whitespace-only adjustments.  In a
> few places, I tweaked documentation or comment language.

Sorry for the docs whitespace-only changes, I did that.

I realized before the submission I made the diff bigger than it needed
to be, but that's because I used M-q in Emacs to break the lines I did
change and that reformatted the whole paragraph including some
unchanged lines. Breaking all the lines by hand would be quite a job,
and any time you go back and tweak the wording or so you need to do it
again. So I just used M-q and sent the result of that.
Do you happen to know of a better way to do this? I do load
src/tools/editors/emacs.samples in my ~/.emacs but it seems the width
my Emacs chooses doesn't match the one already in the file.

The doc tweaks are good, they make the text more clear. I'm happy
that's all you found to improve: writing good docs is hard and the
Postgres docs are already good so it's not easy to change them, I had
the feeling I'll only make them worse and spent quite some time trying
not to do that.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2015-12-03 Thread Catalin Iacob
On Thu, Dec 3, 2015 at 7:58 AM, Pavel Stehule  wrote:
> I am sorry, I don't understand. Now due inheritance plpy.Fatal and
> plpy.SPIError has availability to use keyword parameters.

Indeed, I didn't realize this, but I don't think it changes the main argument.

What I think should happen:

1. Error should take keyword arguments
2. same for Fatal
3. Fatal should not be changed to inherit from Error - it should stay
like it is now, just another exception class
You can argue a Fatal error is an Error but these classes already
exist and are independent, changing their relationship is backward
incompatible.
4. SPIError should not be changed at all
 It's for errors raised by query execution not user PL/Python code
so doing raise SPIError in PL/Python doesn't make sense.
 And again, changing the inheritance relationship of these
existing classes changes meaning of existing code that catches the
exceptions.
5. all the reporting functions: plpy.debug...plpy.fatal functions
should also be changed to allow more arguments than the message and
allow them as keyword arguments
 They are Python wrappers for ereport and this would make them
similar in capabilities to the PL/pgSQL RAISE
 This will make plpy.error(...) stay equivalent in capabilities
with raise plpy.Error(...), same for fatal and Fatal
6. the docs should move to the "Utility Functions" section
 There, it's already described how to raise errors either via the
exceptions or the utility functions.

I think the above doesn't break anything, doesn't confuse user
exceptions with backend SPIError exceptions, enhances error reporting
features for the PL/Python user to bring them up to par with ereport
and PL/pgSQL RAISE and solve your initial use case at the top of the
thread (but with slightly different spelling to match what already
exists in PL/Python):

"We cannot to raise PostgreSQL exception with setting all possible
fields. I propose new function

plpy.ereport(level, [ message [, detail [, hint [, sqlstate, ... )"

Is what I mean more clear now? Do you (and others) agree?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2015-12-02 Thread Catalin Iacob
On Tue, Dec 1, 2015 at 2:17 AM, Pavel Stehule  wrote:
> here is complete patch - regress tests for all supported Python branches

I had a look at what changed in v10 since my last reviewed version and
indeed most of it is straightforward: renames from SPIError to Error.
The patch also changes plpy.Fatal and plpy.SPIError to inherit from
the existing plpy.Error which is a backward incompatibility: catching
a plpy.Error will now also catch SPIError and Fatal.

But by doing this I noticed plpy.Error already existed without the
patch and raising plpy.Error(msg) is documented as equivalent to
calling plpy.error(msg), similar for plpy.Fatal and plpy.fatal(). This
patch makes it possible to raise plpy.Error with more arguments
including keyword ones but doesn't change plpy.error(msg). And Fatal
is not touched so it becomes inconsistent with Error.

So I now think the approach this patch took is wrong. We should
instead enhance the existing error and fatal functions and Error and
Fatal exceptions to accept other arguments that ereport accepts (hint,
sqlstate) and be able to pass all those as keyword parameters.
SPIError should be left unchanged as that's for errors raised by query
execution not by the PL/Python user. This is also close to Pavel's
original ereport proposal but by extending existing functionality
instead of adding a new function and it covers Peter's observation
that in Python the ereport function should be "raise an exception" as
that's already an alternative way of doing it.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: multiple psql option -c

2015-12-01 Thread Catalin Iacob
On Tue, Dec 1, 2015 at 1:53 PM, Michael Paquier
 wrote:
> Attached is a patch implementing those suggestions. This simplifies
> the code without changing its usefulness. If you are fine with those
> changes I will switch this patch as ready for committer.

I tested the v07 patch (so not Michael's version) a few days ago but
didn't send this email earlier.

I combined various -c and -f with --echo-all, --single-transaction,
\set ON_ERROR_STOP=1, separate -c "VACUUM", "SELECT + VACUUM" in a
single and in two -c, inserting -f - somewhere in the middle of the
other -c and -f. They all behave as I would expect.

One maybe slightly surprising behaviour is that -f - can be specified
multiple times and only the first one has an effect since the others
act on an exhausted stdin. But I don't think forbidding multiple -f -
is better.

As for the code (these still apply to Michael's latest patch):

1. the be compiler quiete comment is not good English, /* silence the
compiler */ would be better or remove it completely

2. shouldn't atyp in SimpleActionListCell be of type enum _atypes?
Otherwise why an enum if it's casted to int when actually used? If
it's an enum the repeated ifs on cell->atyp should be a switch, either
with a default Assert(0) or no default which makes gcc give a warning
if an enum value is ever added without having a corresponding case.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: multiple psql option -c

2015-11-25 Thread Catalin Iacob
On Wed, Nov 18, 2015 at 5:49 PM, Catalin Iacob  wrote:
> On Tue, Nov 17, 2015 at 10:13 PM, Tom Lane  wrote:
>> 1. -c no longer implies --no-psqlrc.  That's a backwards incompatibility,
>> but very easy to explain and very easy to work around.
>>
>> 2. You can have multiple -c and/or -f.  Each -c is processed in
>> the traditional way, ie, either it's a single backslash command
>> or it's sent in a single PQexec.  That doesn't seem to me to have
>> much impact on the behavior of adjacent -c or -f.
>>
>> 3. If you combine -1 with -c and/or -f, you get one BEGIN inserted
>> at the beginning and one COMMIT at the end.  Nothing else changes.

> I'll try to write the documentation patch for these semantics sometime
> next week.

Attached is my attempt at a documentation patch, feedback welcome. I'm
assuming Pavel will pick up the implementation, if not I could also
try it.


Update-docs-for-repeated-c.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: multiple psql option -c

2015-11-18 Thread Catalin Iacob
On Tue, Nov 17, 2015 at 10:13 PM, Tom Lane  wrote:
> 1. -c no longer implies --no-psqlrc.  That's a backwards incompatibility,
> but very easy to explain and very easy to work around.
>
> 2. You can have multiple -c and/or -f.  Each -c is processed in
> the traditional way, ie, either it's a single backslash command
> or it's sent in a single PQexec.  That doesn't seem to me to have
> much impact on the behavior of adjacent -c or -f.
>
> 3. If you combine -1 with -c and/or -f, you get one BEGIN inserted
> at the beginning and one COMMIT at the end.  Nothing else changes.

And -v AUTOCOMMIT=off should do the same as now for -c: issue a BEGIN
before each single PQexec with the content of each -c.

I like it, it avoids what I didn't like about -C semantics since the
grouping now means something (single PQexec per group) and you even
see the effects of the grouping in the result (only last result of
group is returned). If you don't like that grouping (probably most
people won't) the solution is intuitive: split the group to multiple
-c.

Another incompatibility is that -1 is now silently ignored by -c so
for example psql -1 -c VACUUM now works and it won't work with the new
semantics. But this seems like a good thing because it reflects that
VACUUM doesn't work in a transaction so I don't think it should stop
this proposal from going ahead.

I'll try to write the documentation patch for these semantics sometime
next week.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: multiple psql option -c

2015-11-16 Thread Catalin Iacob
On Sun, Nov 15, 2015 at 3:53 PM, Andrew Dunstan  wrote:
> I suggest you review the original thread on this subject before a line was
> ever written. "multiple" (see subject line on this whole thread) is clearly
> what is being asked for. Making people turn that into a single argument is
> not what was envisaged. See for example Pavel's original example involving
> use of xargs where that's clearly not at all easy.

I couldn't see why it would matter to have multiple -C, but xargs
having -n which consumes more than 1 stdin item is indeed an use case.
When reading the thread I didn't notice it since I didn't know what -n
does.

But multiple -C is confusing since it suggests the groupings matter.

To me at least, it feels weird that -C "SELECT 1; SELECT 2;" -C
"SELECT 3;" is the same as -C "SELECT 1; SELECT 2; SELECT 3" and lots
of other combinations. It feels like the split in groups must mean
something, otherwise why would you support/use multiple groups?

Upthread at least somebody thought each -C group would/should be a
transaction and I can see this confusion coming up again and again,
enough to question whether this patch is an improvement over the
current situation. And if a single -C is too small of an improvement,
maybe it means the whole idea should be dropped. I think the same of
multiple -f as well: to me they're more confusing than helpful for the
same reason.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: multiple psql option -c

2015-11-15 Thread Catalin Iacob
On Sun, Nov 15, 2015 at 1:27 AM, Andrew Dunstan  wrote:
> That seems to me to get rid of the main motivation for this change, which is
> to allow multiple such arguments, which together would as as if they were
> all written to a file which was then invoked like -f file.

It seems to me the motivation is not "multiple command line arguments"
but sending multiple statements to the backend in one psql invocation
without needing bash specific here docs or a temporary file for -f.
Most combinations of such multiple statements can already be sent via
-c which sends them in one query, the backend executes them in one
transaction but the backend rejects some combinations like SELECT +
VACUUM.

I think the proposal was mislead by the apparent similarity with -c
and said "if -c can't do SELECT + VACUUM let's do a sort of repeated
-c and call that -C SELECT -C VACUUM". But why not do the same with -C
"SELECT 1; VACUUM" which works just like having a file with that
content works today but handier for scripts? Doesn't this solve the
exact need in this thread?

I'm arguing that sending multiple statements and executing each in one
transaction (the current proposed semantics of -C) is not like -c and
doesn't need to be "repeated -c" it's exactly like reading stdin or
file passed to -f and solves the original problem.But maybe I'm
missing something.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: multiple psql option -c

2015-11-13 Thread Catalin Iacob
So I promised I'd try to document this. I had a look at the proposed
semantics of -C and I think in the patch they're too complicated which
makes explaining them hard.

My assumptions about behaviour without this patch, from reading the
docs and some experimenting, correct me if I'm wrong:

1. psql normally splits its input by ; let's call each piece of the
split a statement

2. for every statement resulting after 1, if it's a \ command it's
interpreted internally, else a query with it is sent to the server,
the result is displayed

3. 1. and 2. happen when the input comes from a file (-f) or from stdin

4. autocommit off changes behaviour in that it sends a BEGIN before
any of the statements after the split in 1 (except for \ commands,
BEGIN or things like VACUUM which don't work within transactions)

5. --single-transaction changes behaviour in that it puts a BEGIN
before the whole input (not around each statement) and a COMMIT after

6. all of the above DON'T apply for -c which very different things: it
doesn't split and instead it sends everything, in one query to the
backend. The backend can execute such a thing (it splits itself by ;)
except in some cases like SELECT + VACUUM. Since the single query is
effectively a single transaction for the backend -c ignores
--single-transaction and autocommit off. Even more, when executing
such a multiple statement the backend only returns results for the
last statement of the query.

>From the above it seems -c is a different thing altogether while other
behaviour allows 1 input with multiple commands, multiple results and
works the same on stdin and a file.

So my proposal is: allow a *single* argument for -C and treat its
content *exactly* like the input from stdin or from a file.

This answers all the questions about interactions with
--single-transaction and autocommit naturally: it behaves exactly like
stdin and -f behave today. And having a single parameter is similar to
having a single file or single stdin. Having multiple -C is also
confusing since it seems the statements in one -C are grouped somehow
and the ones in the next -C are another group so this starts feeling
like there's maybe a transaction per -C group etc.

Am I missing something or is it that simple?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2015-11-09 Thread Catalin Iacob
On Mon, Nov 9, 2015 at 6:31 PM, Pavel Stehule  wrote:
> merged your patch

So, I just that tested version 08 is the same as the previous patch +
my change and I already tested that on Python 2.4, 2.5, 2.6, 2.7 and
3.5.

All previous comments addressed so I'll mark this Ready for Committer.

Thanks Pavel


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2015-11-09 Thread Catalin Iacob
On Mon, Nov 9, 2015 at 2:58 PM, Pavel Stehule  wrote:
> I needed to understand the support for Python 3.5.
>
> The patch with the fix is attached regress tests 3.5 Python

I wanted to type a reply this morning to explain and then I noticed
there's another file (_0.out) for Python2.4 and older (as explained by
src/pl/plpython/expected/README) so tests there fail as well. I built
Python 2.4.6 and then Postgres against it and updated the expected
output for 2.4 as well. Find the changes for 2.4 attached. You can add
those to your patch.

While I was at it I tested 2.5 and 2.6 as well, they also pass all
tests. So with the 2.4 changes attached I think we're done.

BTW, the alternative output files for pg_regress are explained at
http://www.postgresql.org/docs/devel/static/regress-variant.html


adjust_py24_expected.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: multiple psql option -c

2015-11-05 Thread Catalin Iacob
On Thu, Nov 5, 2015 at 5:27 PM, Robert Haas  wrote:
>> I wrote some text. But needs some work of native speaker.
>
> It does.  It would be nice if some kind reviewer could help volunteer
> to clean that up.

I'll give it a go sometime next week.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2015-11-04 Thread Catalin Iacob
On Wed, Nov 4, 2015 at 10:12 AM, Pavel Stehule  wrote:
> It helped me lot of, thank you

Welcome, I learned quite some from the process as well.

>>
>>
>> There's just the doc part left then.
>
>
> done

We're almost there but not quite.

There's still a typo in the docs: excation.

A plpy.SPIError can be raised should be A
plpy.SPIError can be raised right?

And most importantly, for Python 3.5 there is a plpython_error_5.out
which is needed because of an alternative exception message in that
version. You didn't update this file, this makes the tests fail on
Python3.5.

Since you might not have Python 3.5 easily available I've attached a
patch to plpython_error_5.out which makes the tests pass, you can fold
this into your patch.


adjust_py35_expected.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2015-11-03 Thread Catalin Iacob
Sorry, you're right, I didn't notice the x = plpy.SPIError() test.

I did notice that you included the kw != NULL, I was explaining why it
really is needed even though it *seems* the code also works without
it.

There's just the doc part left then.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2015-11-03 Thread Catalin Iacob
On Tue, Nov 3, 2015 at 12:49 PM, Pavel Stehule  wrote:
>> 1. in PLy_spi_error__init__ you need to check kw for NULL before doing
>> PyDict_Size(kw) otherwise for plpy.SPIError() you get Bad internal
>> call because PyDict_Size expects a real dictionary not NULL
>
>
> PyDict_Size returns -1 when the dictionary is NULL
>
> http://grokbase.com/t/python/python-dev/042ft6qaqq/pydict-size-error-return

Yes, but it also sets the error indicator to BadInternalCall. In 2.7
the code is:
Py_ssize_t
PyDict_Size(PyObject *mp)
{
if (mp == NULL || !PyDict_Check(mp)) {
PyErr_BadInternalCall();
return -1;
}
return ((PyDictObject *)mp)->ma_used;
}

I had a PLy_elog right after the PyDict_Size call for debugging and
that one was raising BadInternalCall since it checked the error
indicator. So the NULL check is needed.

>> 2. a test with just plpy.SPIError() is still missing, this would have
>> caught 1.
>
>
> please, can you write some example - I am not able raise described error

The example was plpy.SPIError() but I now realize that, in order to
see a failure, you need the extra PLy_elog which I had in there.
But this basic form of the constructor is still an important thing to
test so please add this as well to the regression test.

>> 5. there is conceptual code duplication between PLy_spi_exception_set
>> in plpy_spi.c, since that code also constructs an SPIError but from C
>> and with more info available (edata->internalquery and
>> edata->internalpos). But making a tuple and assigning it to spidata is
>> in both. Not sure how this can be improved.
>
>
> I see it, but I don't think, so current code should be changed.
> PLy_spi_exception_set is controlled directly by fully filled ErrorData
> structure, __init__ is based only on keyword parameters with possibility use
> inherited data. If I'll build ErrorData in __init__ function and call
> PLy_spi_exception_set, then the code will be longer and more complex.

Indeed, I don't really see how to improve this but it does bug me a bit.

One more thing,
+The plpy module provides several possibilities to
+to raise a exception:

This has "to" 2 times and is weird since it says it offers several
possibilities but then shows only one (the SPIError constructor).
And SPIError should be plpy.SPIError everywhere to
be consistent.

Maybe something like (needs markup):
A plpy.SPIError can be raised from PL/Python, the constructor accepts
keyword parameters:
  plpy.SPIError([ message [, detail [, hint [, sqlstate [, schema [,
table [, column [, datatype [, constraint ])
then the example

If you fix the doc and add the plpy.SPIError() test I'm happy. I'll
give it one more test on Python2.7 and 3.5 and mark it Ready for
Committer.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2015-11-02 Thread Catalin Iacob
Hello,

Here's a detailed review:

1. in PLy_spi_error__init__ you need to check kw for NULL before doing
PyDict_Size(kw) otherwise for plpy.SPIError() you get Bad internal
call because PyDict_Size expects a real dictionary not NULL

2. a test with just plpy.SPIError() is still missing, this would have caught 1.

3. the tests have "possibility to set all accessable fields in custom
exception" above a test that doesn't set all fields, it's confusing
(and accessible is spelled wrong)

4. in the docs, "The constructor of SPIError exception (class)
supports keyword parameters." sounds better as "An SPIError instance
can be constructed using keyword parameters."

5. there is conceptual code duplication between PLy_spi_exception_set
in plpy_spi.c, since that code also constructs an SPIError but from C
and with more info available (edata->internalquery and
edata->internalpos). But making a tuple and assigning it to spidata is
in both. Not sure how this can be improved.

6. __init__ can use METH_VARARGS | METH_KEYWORDS in both Python2 and
3, no need for the #if

7. "could not create dictionary for SPI exception" would be more
precise as "could not create dictionary for SPIError" right?

8. in PLy_add_methods_to_dictionary I would avoid import since it
makes one think of importing modules; maybe "could not create function
wrapper for \"%s\"",  "could not create method wrapper for \"%s\""

9. also in PLy_add_methods_to_dictionary "could public method \"%s\"
in dictionary" is not proper English and is missing not, maybe "could
not add method \"%s\" to class dictionary"?

10. in PLy_add_methods_to_dictionary if PyCFunction_New succeeds but
PyMethod_New fails, func will leak

11. it would be nice to have a test for the invalid SQLSTATE code part

12. similar to 10, in PLy_spi_error__init__ taking the "invalid
SQLSTATE" branch leaks exc_args, in general early returns via PLy_elog
will leave the intermediate Python objects leaking


Will mark the patch as Waiting for author.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2015-10-27 Thread Catalin Iacob
On Tue, Oct 27, 2015 at 9:34 AM, Pavel Stehule  wrote:
> Hi
>
> 2015-10-23 7:34 GMT+02:00 Catalin Iacob :
>> The current code doesn't build on Python3 because the 3rd argument of
>> PyMethod_New, the troubled one you need set to NULL has been removed.
>> This has to do with the distinction between bound and unbound methods
>> which is gone in Python3.
>
>
> this part of is well isolated, and we can do switch for Python2 and Python3
> without significant problems.

I had a quick look at this and at least 2 things are needed for Python3:

* using PyInstanceMethod_New instead of PyMethod_New; as
http://bugs.python.org/issue1587 and
https://docs.python.org/3/c-api/method.html explain that's the new way
of binding a PyCFunction to a class, PyMethod_New(func, NULL) fails

* in the PLy_spi_error_methods definition, __init__ has to take
METH_VARARGS | METH_KEYWORDS not just METH_KEYWORDS; in Python2
METH_KEYWORDS implied METH_VARARGS so it's not needed (it also doesn't
hurt) but if I don't add it, in Python3 I get:
ERROR:  SystemError: Bad call flags in PyCFunction_Call. METH_OLDARGS
is no longer supported!

>> Still, the above link shows a (more verbose but maybe better)
>> alternative: don't use PyErr_NewException and instead define an
>> SPIError type with each slot spelled out explicitly. This will remove
>> the "is it safe to set the third argument to NULL" question.
>
> Should be there some problems, if we lost dependency on basic exception
> class? Some compatibility issues? Or is possible to create full type with
> inheritance support?

It's possible to give it a base type, see at
https://hg.python.org/cpython/rev/429cadbc5b10/ this line before
calling PyType_Ready:
PyComError_Type.tp_base = (PyTypeObject*)PyExc_Exception;

PyErr_NewException is a shortcut for defining simple Exception
deriving types, usually one defines a type with the full PyTypeObject
definition.

In the meantime, I had a deeper look at the 2.7.10 code and I trust
that PyMethod_New with the last argument set to NULL is ok. Setting
that to NULL will lead to the PyMethod representing __init__ im_class
being NULL. But that PyMethod object is not held onto by C code, it's
added to the SPIError class' dict. From there, it is always retrieved
from Python via an instance or via the class (so SPIError().__init__
or SPIError.__init__) which will lead to instancemethod_descr_get
being called because it's the tp_descr_get slot of PyMethod_Type and
that code knows the class you're retrieving the attribute from
(SPIError in this case), checks if the PyMethod already has a not NULL
im_class (which SPIError.__init__ doesn't) and, if not, calls
PyMethod_New again and passes the class (SPIError) as the 3rd
argument.

Given this, I think it's ok to keep using PyErr_NewException rather
than spelling out the type explicitly.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2015-10-22 Thread Catalin Iacob
On Sat, Oct 17, 2015 at 8:18 PM, Pavel Stehule  wrote:
> here is new patch
>
> cleaned, all unwanted artefacts removed. I am not sure if used way for
> method registration is 100% valid, but I didn't find any related
> documentation.

Pavel, some notes about the patch, not a full review (yet?).

In PLy_add_exceptions PyDict_New is not checked for failure.

In PLy_spi_error__init__, kw will be NULL if SPIError is called with
no arguments. With the current code NULL will get passed to
PyDict_Size which will generate something like SystemError Bad
internal function call. This also means a test using just SPIError()
is needed.
It seems args should never be NULL by the way, if there are no
arguments it's an empty tuple so the other side of the check is ok.

The current code doesn't build on Python3 because the 3rd argument of
PyMethod_New, the troubled one you need set to NULL has been removed.
This has to do with the distinction between bound and unbound methods
which is gone in Python3.

There is http://bugs.python.org/issue1587 which discusses how to
replace the "third argument" functionality for PyMethod_New in
Python3. One of the messages there links to
http://bugs.python.org/issue1505 and
https://hg.python.org/cpython/rev/429cadbc5b10/ which has an example
very similar to what you're trying to do, rewritten to work in
Python3. But this is still confusing: note that the replaced code
*didn't really use PyMethod_New at all* as the removed line meth =
PyMethod_New(func, NULL, ComError); was commented out and the replaced
code used to simply assign the function to the class dictionary
without even creating a method.
Still, the above link shows a (more verbose but maybe better)
alternative: don't use PyErr_NewException and instead define an
SPIError type with each slot spelled out explicitly. This will remove
the "is it safe to set the third argument to NULL" question.

I tried to answer the "is it safe to use NULL for the third argument
of PyMethod_New in Python2?" question and don't have a definite answer
yet. If you look at the CPython code it's used to set im_class
(Objects/classobject.c) of PyMethodObject which is accessible from
Python.
But this code:
init = plpy.SPIError.__init__
plpy.notice("repr {} str {} im_class {}".format(repr(init), str(init),
init.im_class))
produces:
NOTICE:  repr  str  im_class 
so the SPIError class name is set in im_class from somewhere. But this
is all moot if you use the explicit SPIError type definition.

>> Any help is welcome

I can work with you on this. I don't have a lot of experience with the
C API but not zero either.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding since-version tags to the docs?

2015-09-01 Thread Catalin Iacob
On Mon, Aug 31, 2015 at 4:48 PM, Tom Lane  wrote:
> Right now, you might well care about whether a feature arrived in 9.3 vs
> 9.4, for instance; but it's highly unlikely that you care whether a
> feature arrived in 7.1 or 7.2.  The problem with this proposal is that
> it will add far more bloat of the latter sort than currently-useful
> information; and the ratio will get worse over time.

After working with Python, when reading the Postgres docs I did feel
this was missing so +1 from a user. An example for Python's
annotations is at https://docs.python.org/3/library/zipfile.html.
Python only keeps the annotations for a few versions back (3 or 4 I
think).

Another use case which is maybe worth mentioning explicitly: when I
needed to target a certain older version of Python, as long as it was
reasonably close to the current one, I would still read the latest
Python docs and rely on the version annotations to let me know I can't
yet use certain features. For Postgres, that doesn't feel "safe" so
you tend to go to the docs of the specific version you're targeting.
But by doing that, you miss getting extra education about improvements
in newer versions. That extra information could also be a powerful
trigger to update in order to gain those improvements. I think both
are quite valuable.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb concatenate operator's semantics seem questionable

2015-05-18 Thread Catalin Iacob
On Mon, May 18, 2015 at 9:03 PM Andrew Dunstan  wrote:

> So you're arguing that we shouldn't call the operation in question ||
> because it's pretty much the same, mutatis mutandis, as the hstore
> operation of the same name. You've lost me.
>

Hopefully this helps. Peter's argument, as I understand it is:

In hstore @> means unnested containment, in jsonb it means nested
containment. Therefore, when an hstore operator is applied to jsonb it gets
"nestedness" as jsonb is nested and adds that nestedness is an important
thing that sets it apart from hstore. Therefore, since || is unnested
concatenation in hstore, it should become nested concatenation for jsonb.

I don't know if the argument is strong enough but it does make some sense.