Re: [PATCH] GROUP BY ALL

2024-07-22 Thread Isaac Morland
On Mon, 22 Jul 2024 at 17:34, David G. Johnston 
wrote:

> On Mon, Jul 22, 2024 at 1:55 PM David Christensen 
> wrote:
>
>> I see that there'd been some chatter but not a lot of discussion about
>> a GROUP BY ALL feature/functionality.  There certainly is utility in
>> such a construct IMHO.
>>
>> Still need some docs; just throwing this out there and getting some
>> feedback.
>>
>>
> I strongly dislike adding this feature.  I'd only consider supporting it
> if it was part of the SQL standard.
>
> Code is written once and read many times.  This feature caters to
> the writer, not the reader.  And furthermore usage of this is prone to be
> to the writer's detriment as well.
>

And for when this might be useful, the syntax for it already exists,
although a spurious error message is generated:

odyssey=> select (uw_term).*, count(*) from uw_term group by uw_term;
ERROR:  column "uw_term.term_id" must appear in the GROUP BY clause or be
used in an aggregate function
LINE 1: select (uw_term).*, count(*) from uw_term group by uw_term;
^

I'm not sure exactly what's going on here; it seems the parser is still
treating the table name in the select list as only a table name, not as the
row value corresponding to the whole table. (More generally, I'm not happy
with the system's ability to figure out that a column's value has only one
possibility given the grouping columns.) You can work around it:

odyssey=> with t as (select uw_term, count(*) from uw_term group by
uw_term) select (uw_term).*, count from t;

This query works.


Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-22 Thread Isaac Morland
On Mon, 22 Jul 2024 at 13:51, Jeff Davis  wrote:


> > Are you proposing a switch that would make PostgreSQL error out if
> > somebody wants to use an unassigned code point?  That would be an
> > option.
>
> You can use a CHECK(UNICODE_ASSIGNED(t)) in version 17, and in version
> 18 I have a proposal here to make it a database-level option:
>

And if you define a domain over text with this check, you would effectively
have a type that works exactly like text except you can only store assigned
code points in it. Then use that instead of text everywhere (easy to audit
with a query over the system tables).
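As a minimal sketch of that idea, assuming the unicode_assigned() function
from the proposal above (domain and table names are hypothetical):

```sql
-- Reject any text value containing unassigned Unicode code points.
CREATE DOMAIN assigned_text AS text
    CHECK (unicode_assigned(VALUE));

-- Use the domain wherever plain text would otherwise appear.
CREATE TABLE document (
    title assigned_text NOT NULL,
    body  assigned_text
);
```

Auditing for remaining uses of plain text is then a simple query joining
pg_attribute to pg_type.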


Re: Add pg_get_acl() function get the ACL for a database object

2024-06-20 Thread Isaac Morland
On Thu, 20 Jun 2024 at 23:44, Tom Lane  wrote:

> Michael Paquier  writes:
> > On Thu, Jun 20, 2024 at 08:32:57AM +0200, Joel Jacobson wrote:
> >> I've added overloaded versions for regclass and regproc so far:
> >>
> >> \df pg_get_acl
> >>                                List of functions
> >>    Schema   |    Name    | Result data type |  Argument data types   | Type
> >> ------------+------------+------------------+------------------------+------
> >>  pg_catalog | pg_get_acl | aclitem[]        | classid oid, objid oid | func
> >>  pg_catalog | pg_get_acl | aclitem[]        | objid regclass         | func
> >>  pg_catalog | pg_get_acl | aclitem[]        | objid regproc          | func
> >> (3 rows)
>
> > Interesting idea.
>
> Doesn't that result in "cannot resolve ambiguous function call"
> failures?


If you try to pass an oid directly, as a value of type oid, you should get
"function is not unique". But if you cast a string or numeric value to the
appropriate reg* type for the object you are using, it should work fine.

I have functions which reset object permissions on all objects in a
specified schema back to the default state as if they had been freshly
created which rely on this. They work very well, and allow me to have a
privilege-granting script for each project which always fully resets all
the privileges back to a known state.
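A sketch of how the casts resolve the overloads, assuming the three
signatures shown above (object names hypothetical):

```sql
-- Ambiguous: a bare oid value matches more than one overload.
-- SELECT pg_get_acl(12345::oid);              -- ERROR: function is not unique

-- Unambiguous: the reg* cast selects exactly one overload.
SELECT pg_get_acl('example_table'::regclass);  -- table ACL
SELECT pg_get_acl('lower'::regproc);           -- function ACL
```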


Re: Add pg_get_acl() function get the ACL for a database object

2024-06-20 Thread Isaac Morland
On Thu, 20 Jun 2024 at 02:33, Joel Jacobson  wrote:

> On Wed, Jun 19, 2024, at 16:23, Isaac Morland wrote:
> > I have no idea how often this would be useful, but I wonder if it could
> > work to have overloaded single-parameter versions for each of
> > regprocedure (pg_proc.proacl), regclass (pg_class.relacl), …. To call,
> > just cast the OID to the appropriate reg* type.
> >
> > For example: To get the ACL for table 'example_table', call pg_get_acl
> > ('example_table'::regclass)
>
> +1
>
> New patch attached.
>
> I've added overloaded versions for regclass and regproc so far:
>
> \df pg_get_acl
>                                List of functions
>    Schema   |    Name    | Result data type |  Argument data types   | Type
> ------------+------------+------------------+------------------------+------
>  pg_catalog | pg_get_acl | aclitem[]        | classid oid, objid oid | func
>  pg_catalog | pg_get_acl | aclitem[]        | objid regclass         | func
>  pg_catalog | pg_get_acl | aclitem[]        | objid regproc          | func
> (3 rows)


Those were just examples. I think for completeness there should be 5
overloads:

[input type] → [relation.aclattribute]
regproc/regprocedure → pg_proc.proacl
regtype → pg_type.typacl
regclass → pg_class.relacl
regnamespace → pg_namespace.nspacl

I believe the remaining reg* types don't correspond to objects with ACLs,
and the remaining ACL fields are for objects which don't have a
corresponding reg* type.

In general I believe the reg* types are underutilized. All over the place I
see code that generates SQL statements by taking a schema name and an object
name and formatting them with %I.%I, when all that is needed is a reg* value
formatted with a simple %s. (Of course, one must make sure the SQL will
execute with the same search_path as when it was generated, or generate it
with an empty search_path.)
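As a sketch of the contrast (schema and table names hypothetical):

```sql
-- Fragile: quoting the pieces separately, correct only under
-- matching search_path assumptions at generation and execution time:
SELECT format('GRANT SELECT ON TABLE %I.%I TO reader;',
              'myschema', 'my_table');

-- Simpler: a regclass value renders as a correctly quoted name,
-- schema-qualified automatically whenever qualification is needed:
SELECT format('GRANT SELECT ON TABLE %s TO reader;',
              'myschema.my_table'::regclass);
```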


Re: Add pg_get_acl() function get the ACL for a database object

2024-06-19 Thread Isaac Morland
On Wed, 19 Jun 2024 at 07:35, Joel Jacobson  wrote:

> Hello hackers,
>
> Currently, obtaining the Access Control List (ACL) for a database object
> requires querying specific pg_catalog tables directly, where the user
> needs to know the name of the ACL column for the object.
>

I have no idea how often this would be useful, but I wonder if it could
work to have overloaded single-parameter versions for each of regprocedure
(pg_proc.proacl), regclass (pg_class.relacl), …. To call, just cast the OID
to the appropriate reg* type.

For example: To get the ACL for table 'example_table', call pg_get_acl
('example_table'::regclass)


Re: Wrong security context for deferred triggers?

2024-06-08 Thread Isaac Morland
On Sat, 8 Jun 2024 at 17:37, Joseph Koshakow  wrote:


> However, I do worry that this is too much of a breaking change and
> fundamentally changes how triggers work. Another drawback is that if
> the trigger owner loses the required privileges, then no one can modify
> the table.
>

I worry about breaking changes too. The more I think about it, though, the
more I think the existing behaviour doesn’t make sense.

Speaking as a table owner, when I set a trigger on my table, I expect that
when the specified actions occur my trigger will fire and will do what I
specify, without regard to the execution environment of the caller
(search_path in particular); and my trigger should be able to do anything
that I can do. For the canonical case of a logging table, the trigger has to
be able to do stuff the caller can't do. Conversely, I don't expect my
trigger to be able to do stuff that only the caller can do.

Speaking as someone making an update on a table, I don't expect to have it
fail because my execution environment (search_path in particular) is wrong
for the trigger implementation, and I consider it a security violation if
the table owner is able to do stuff as me as a result, especially if I am
an administrator making an update as superuser.

In effect, I want the action which fires the trigger to be like a system
call, and the trigger, plus the underlying action, to be like what the OS
does in response to the system call.

I'm not sure how to evaluate what problems with existing implementations
would be caused by changing what role executes triggers, but I think it's
pretty clear the existing behaviour is the wrong choice in every other way
than backward compatibility. I welcome examples to the contrary, where the
existing behaviour is not just OK but actually wanted.


Re: Addressing SECURITY DEFINER Function Vulnerabilities in PostgreSQL Extensions

2024-06-06 Thread Isaac Morland
On Thu, 6 Jun 2024 at 12:53, Jeff Davis  wrote:


> > I didn't get you completely here. w.r.t extensions how will this have
> > an impact if we set the search_path for definer functions.
>
> If we only set the search path for SECURITY DEFINER functions, I don't
> think that solves the whole problem.


Indeed. While the ability for a caller to set the search_path for a
security definer function introduces security problems that are different
from those for security invoker functions, it's still weird for the
behaviour of a function to depend on the caller's search_path. It’s even
weirder for the default search path behaviour to be different depending on
whether or not the function is security definer.


Re: SQL:2011 application time

2024-05-21 Thread Isaac Morland
On Tue, 21 May 2024 at 13:57, Robert Haas  wrote:

> What I think is less clear is what that means for temporal primary
> keys. As Paul pointed out upthread, in every other case, a temporal
> primary key is at least as unique as a regular primary key, but in
> this case, it isn't. And someone might reasonably think that a
> temporal primary key should exclude empty ranges just as all primary
> keys exclude nulls. Or they might think the opposite.
>

Fascinating. I think you're absolutely right that it's clear that two empty
intervals don't conflict. If somebody wants to claim two intervals
conflict, they need to point to at least one instant in time that is common
between them.

But a major point of a primary key, it seems to me, is that it uniquely
identifies a row. If items are identified by a time range, non-overlapping
or not, then the empty range can only identify one item (per value of
whatever other columns are in the primary key). I think for a unique key
the non-overlapping restriction has to be considered an additional
restriction on top of the usual uniqueness restriction.

I suspect in many applications there will be a non-empty constraint; for
example, it seems quite reasonable to me for a meeting booking system to
forbid empty meetings. But when they are allowed they should behave in the
mathematically appropriate way.


Re: Is there any chance to get some kind of a result set sifting mechanism in Postgres?

2024-05-13 Thread Isaac Morland
On Mon, 13 May 2024 at 04:40, aa  wrote:

> Hello Everyone!
>
> Is there any chance to get some kind of a result set sifting mechanism in
> Postgres?
>
> What I am looking for is a way to get for example: "nulls last" in a
> result set, without having to call "order by" or having to use UNION ALL,
> and if possible to get this in a single result set pass.
>
> Something on this line: SELECT a, b, c FROM my_table WHERE a nulls last
> OFFSET 0 LIMIT 25
>
> I don't want to use order by or union all because these are time consuming
> operations, especially on large data sets and when comparisons are done
> on dynamic values (eg: geolocation distances in between a mobile and a
> static location)
>

This already exists: ORDER BY a IS NULL

I've found it to be more useful than one might initially expect to order by
a boolean expression.
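Concretely, using the query from the question above:

```sql
-- FALSE sorts before TRUE, so rows with non-NULL "a" come first
-- and NULLs land last, without ordering on the data values themselves.
SELECT a, b, c
FROM my_table
ORDER BY a IS NULL
OFFSET 0 LIMIT 25;
```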


Re: Why don't we support external input/output functions for the composite types

2024-04-26 Thread Isaac Morland
On Fri, 26 Apr 2024 at 14:04, Robert Haas  wrote:

> systems have this problem. I wonder if anyone knows of another system
> that works like PostgreSQL in this regard (without sharing code).
>

In Haskell, the period (".") is used both to form a qualified name
(module.name), very similar to our schema.object, and as a perfectly normal
operator, which the standard prelude defines as function composition (but
you can re-bind it to any object whatsoever). This is disambiguated in a
very simple way, however: module names must begin with an uppercase letter,
while variable names must begin with a lowercase letter.

A related point is that parentheses in Haskell act to group expressions,
but they, and commas, are not involved in calling functions: to call a
function, just write it to the left of its parameter (and it only has one
parameter, officially).

All this might sound weird but it actually works very well in the Haskell
context.


Re: Why don't we support external input/output functions for the composite types

2024-04-25 Thread Isaac Morland
On Thu, 25 Apr 2024 at 17:05, Tom Lane  wrote:


> > I think it's confusing and counterintuitive that putting parentheses
> > around a subexpression completely changes the meaning. I don't know of
> > any other programming language that behaves that way,
>
> I take it that you also don't believe that "2 + 3 * 4" should yield
> different results from "(2 + 3) * 4"?
>

In that expression "2 + 3" is not a subexpression, although "3 * 4"
is, thanks to the operator precedence rules.

I could get behind offering an alternative notation, eg "a.b->c does
> the same thing as (a.b).c", if we could find a reasonable notation
> that doesn't infringe on user operator namespace.  I think that might
> be hard to do though, and I don't think the existing notation is so
> awful than we should break existing operators to have an alternative.
> The business with deprecating => operators a few years ago had the
> excuse that "the standard says so", but we don't have that
> justification here.
>

This is one of those areas where it will be difficult at best to do
something which makes things work the way people expect without breaking
other cases. I certainly would like to be able to use . to extract a field
from a composite value without parenthesizing everything, but given the
potential for having a schema name that matches a table or field name I
would want to be very careful about changing anything.


Re: PSQL Should \sv & \ev work with materialized views?

2024-03-28 Thread Isaac Morland
On Thu, 28 Mar 2024 at 20:38, Erik Wienhold  wrote:


> Of course the problem with using DROP and CREATE is that indexes and
> privileges (anything else?) must also be restored.  I haven't bothered
> with that yet.
>

Not just those — also anything that depends on the matview, such as views
and other matviews.


Re: Possibility to disable `ALTER SYSTEM`

2024-03-27 Thread Isaac Morland
On Wed, 27 Mar 2024 at 13:05, Greg Sabino Mullane 
wrote:

> The purpose of the setting is to prevent accidental
>> modifications via ALTER SYSTEM in environments where
>
>
> The emphasis on 'accidental' seems a bit heavy here, and odd. Surely, just
> "to prevent modifications via ALTER SYSTEM in environments where..." is
> enough?
>

Not necessarily disagreeing, but it's very important nobody ever mistake
this for a security feature. I don't know if the extra word "accidental" is
necessary, but I think that's the motivation.


Re: Catalog domain not-null constraints

2024-03-21 Thread Isaac Morland
On Thu, 21 Mar 2024 at 10:30, Tom Lane  wrote:


> The SQL spec's answer to that conundrum appears to be "NULL is
> a valid value of every domain, and if you don't like it, tough".
>

To be fair, NULL is a valid value of every type. Even VOID has NULL.

In this context, it’s a bit weird to be able to decree up front, when
defining a type, that no table column of that type, anywhere, may ever
contain a NULL. It would be nice if there were a way to reverse the default,
so that if you (almost or) never want NULLs anywhere that’s what you get
without saying "NOT NULL" all over the place, and instead you just specify
"NULLABLE" (or something) where you want it. But that effectively means
optionally changing the behaviour of CREATE TABLE and ALTER TABLE.


Re: Reducing the log spam

2024-03-06 Thread Isaac Morland
On Tue, 5 Mar 2024 at 07:55, Laurenz Albe  wrote:

> Inspired by feedback to [1], I thought about how to reduce log spam.
>
> My experience from the field is that a lot of log spam looks like
>
>   database/table/... "xy" does not exist
>   duplicate key value violates unique constraint "xy"
>
> So what about a parameter "log_suppress_sqlstates" that suppresses
> logging ERROR and FATAL messages with the enumerated SQL states?
>
> My idea for a default setting would be something like
>
>   log_suppress_sqlstates = '23505,3D000,3F000,42601,42704,42883,42P01'
>
> but that's of course bikeshedding territory.
>

I like the basic idea and the way of specifying states seems likely to
cover a lot of typical use cases. Of course in principle the application
should be fixed, but in practice we can't always control that.

I have two questions about this:

First, can it be done per role? If I have a particular application which is
constantly throwing some particular error, I might want to suppress it, but
not suppress the same error occasionally coming from another application. I
see ALTER DATABASE name SET configuration_parameter … as being useful here,
but often multiple applications share a database.

Second, where can this setting be adjusted? Can any session turn off
logging of arbitrary sets of sqlstates resulting from its queries? It feels
to me like that might allow security problems to be hidden. Specifically,
the first thing an SQL injection might do would be to turn off logging of
important error states, then proceed to try various nefarious things.

It seems to me the above questions interact; an answer to the first might
be "ALTER ROLE role_specification SET configuration_parameter", but I think
that would allow roles to change their own settings, contrary to the
concern raised by the second question.


Re: Should we remove -Wdeclaration-after-statement?

2024-01-29 Thread Isaac Morland
On Mon, 29 Jan 2024 at 10:42, Mark Dilger 
wrote:

> I don't think anybody is proposing re-working the existing codebase. I
> understand this to be only about allowing new code to use the newer style.
> Personally, I like, as much as possible, to use initializations to const
> variables and avoid assignment operators, so I much prefer to declare at
> first (and usually only) assignment.
>
> I was responding to Jelte's paragraph upthread:
>
> > On Dec 27, 2023, at 9:59 AM, Jelte Fennema-Nio 
> wrote:
> >
> > But if the only reason not to remove the
> > warning is that then there would be two styles of declaration in the
> > codebase, then I'm happy to create another refactoring script that
> > moves declarations down to their first usage. (Which could then be run
> > on all backbranches to make sure there is no backpatching pain)
>
> I find that kind of gratuitous code churn highly unpleasant.
>

I stand corrected, and agree completely. It’s hard to imagine a change of
such a global nature that would be a big enough improvement that it would
be a good idea to apply to existing code. Personally I’m fine with code of
different vintages using different styles, as long as it’s understood why
the difference exists — in this case because tons of code has already been
written and isn’t going to be re-styled except possibly as part of other
changes.


Re: Should we remove -Wdeclaration-after-statement?

2024-01-29 Thread Isaac Morland
On Mon, 29 Jan 2024 at 10:31, Mark Dilger 
wrote:

>
>
> > On Jan 29, 2024, at 7:03 AM, Jelte Fennema-Nio 
> wrote:
> >
> > So my suggestion is for people to respond with -1, -0.5, +-0, +0.5, or
> > +1 to indicate support against/for the change.
>
> -1 for me.
>
> -Infinity for refactoring the entire codebase and backpatching.
>

I don't think anybody is proposing re-working the existing codebase. I
understand this to be only about allowing new code to use the newer style.
Personally, I like, as much as possible, to use initializations to const
variables and avoid assignment operators, so I much prefer to declare at
first (and usually only) assignment.

But not voting because the amount of code I’ve actually contributed is
minuscule.


Re: Things I don't like about \du's "Attributes" column

2023-12-30 Thread Isaac Morland
On Sat, 30 Dec 2023 at 09:23, Pavel Luzanov 
wrote:


> I think that writing the value "infinity" in places where there is no
> value is
> not a good thing. This hides the real value of the column. In addition,
> there is no reason to set "infinity" when the password is always valid with
> default NULL.
>

Would it make sense to make the column non-nullable and always set it to
infinity when there is no expiry?

In this case, I think NULL simply means infinity, so why not write that? If
the timestamp type didn't have infinity, then NULL would be a natural way
of saying that there is no expiry, but with infinity as a possible value, I
don't see any reason to think of no expiry as being the absence of an
expiry time rather than an infinite expiry time.
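The suggestion above amounts to something like the following display query
(treating a stored NULL as an infinite expiry until the catalog column
itself is made non-nullable):

```sql
-- Show "no expiry" as an infinite expiry time rather than NULL.
SELECT rolname,
       coalesce(rolvaliduntil, 'infinity'::timestamptz) AS password_expiry
FROM pg_roles;
```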


Re: Track in pg_replication_slots the reason why slots conflict?

2023-12-26 Thread Isaac Morland
On Thu, 21 Dec 2023 at 09:26, Amit Kapila  wrote:


> > A conflicting column where NULL indicates no conflict, and other
> > values indicate the reason for the conflict, doesn't seem too bad.
>
>
> This is fine too.
>

I prefer this option. There is precedent for doing it this way, for example
in pg_stat_activity.wait_event_type.

The most common test of this field is likely to be "is there a conflict"
and it's better to write this as "[fieldname] IS NOT NULL" than to
introduce a magic constant. Also, it makes clear to future maintainers that
this field has one purpose: saying what type of conflict there is, if any.
If we find ourselves wanting to record a new non-conflict status (no idea
what that could be: "almost conflict"? "probably conflict soon"?) there
would be less temptation to break existing tests for "is there a conflict".
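A sketch of the intended test, assuming the column ends up as a nullable
reason column (the column name here is hypothetical at the time of this
thread):

```sql
-- "Is there a conflict?" expressed without a magic constant:
SELECT slot_name, conflict_reason
FROM pg_replication_slots
WHERE conflict_reason IS NOT NULL;
```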


Re: Should REINDEX be listed under DDL?

2023-12-04 Thread Isaac Morland
On Mon, 4 Dec 2023 at 02:54, Laurenz Albe  wrote:

> > REINDEX is philosophically a maintenance command and a Postgres
> > extension not in the SQL standard, so it does not really qualify as
> > DDL because it does not act on object definitions, so we could just
> > delete this comment.  Or could it be more useful to consider that as a
> > special case and report it as a DDL, impacting log_statements?
>
> It should be qualified just like CREATE INDEX.
> Both are not covered by the standard, which does not mention indexes,
> since they are an "implementation detail".
>
> I think that it is pretty clear that CREATE INDEX should be considered
> DDL, since it defines (creates) an object.  The same should apply to
> REINDEX.
>

Isn't REINDEX more like REFRESH MATERIALIZED VIEW and CLUSTER (especially
without USING)?

CREATE INDEX (really, CREATE anything) is clearly DDL as it creates a new
object, and DROP and ALTER are the same. But REINDEX just reaches below the
abstraction and maintains the existing object without changing its
definition.

I don't think whether it's in the standard is the controlling fact. It's
not just DDL vs. not; there are naturally at least 3 categories: DDL,
maintenance, and data modification.

Getting back to the question at hand, I think REINDEX should be treated the
same as VACUUM and CLUSTER (without USING). So if and only if they are
considered DDL for this purpose then REINDEX should be too.


Re: Fix search_path for all maintenance commands

2023-11-06 Thread Isaac Morland
On Mon, 6 Nov 2023 at 15:54, Tom Lane  wrote:

> Isaac Morland  writes:
> > I still think the right default is that CREATE FUNCTION stores the
> > search_path in effect when it runs with the function, and that is the
> > search_path used to run the function (and don't "BEGIN ATOMIC" functions
> > partially work this way already?).
>
> I don't see how that would possibly fly.  Yeah, that behavior is
> often what you want, but not always; we would break some peoples'
> applications with that rule.
>

The behaviour I want is just "SET search_path FROM CURRENT".

I agree there is a backward compatibility issue; if somebody has a schema
creation/update script with function definitions with no "SET search_path"
they would suddenly start getting the search_path from definition time
rather than the caller's search_path.

I don't like adding GUCs but a single one specifying whether no search_path
specification means "FROM CURRENT" or the current behaviour (new explicit
syntax "FROM CALLER"?) would I think address the backward compatibility
issue. This would allow a script to specify at the top which convention it
is using; a typical old script could be adapted to a new database by adding
a single line at the top to get the old behaviour.
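For reference, the explicit per-function form of the behaviour being
discussed (function name hypothetical):

```sql
SET search_path = myapp, public;

-- FROM CURRENT captures "myapp, public" at definition time; the
-- caller's search_path no longer affects name resolution inside
-- the function body.
CREATE FUNCTION show_path() RETURNS text
    LANGUAGE sql
    SET search_path FROM CURRENT
AS $$ SELECT current_setting('search_path') $$;
```

psql's \sf then shows the captured list as a literal SET clause, which is
also what pg_dump emits.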

> Also, one place where it's clearly NOT what you want is while
> restoring a pg_dump script.  And we don't have any way that we could
> bootstrap ourselves out of breaking everything for everybody during
> their next upgrade --- even if you insist that people use a newer
> pg_dump, where is it going to find the info in an existing database?
>

A function with a stored search_path will have a "SET search_path" clause
in the pg_dump output, so for these functions pg_dump would be unaffected
by my preferred way of doing things. Already I don't believe pg_dump ever
puts "SET search_path FROM CURRENT" in its output; it puts the actual
search_path. A bigger problem is with existing functions that use the
caller's search_path; these would need to specify "FROM CALLER" explicitly;
but the new GUC could come into this. In effect a pg_dump created by an old
version is an old script which would need the appropriate setting at the
top.

But all this is premature if there is still disagreement on the proper
default behaviour. To me it is absolutely clear that the right default, in
the absence of an installed base with backward compatibility concerns, is
"SET search_path FROM CURRENT". This is how substantially all programming
languages work: it is quite unusual in modern programming languages to have
the meaning of a procedure definition depend on which modules the caller
has imported. The tricky bit is dealing smoothly with the installed base.
But some of the discussion here makes me think that people have a different
attitude about stored procedures.


Re: Fix search_path for all maintenance commands

2023-11-06 Thread Isaac Morland
On Thu, 2 Nov 2023 at 14:22, Jeff Davis  wrote:

> On Tue, 2023-10-31 at 13:16 -0400, Isaac Morland wrote:
>
> > Perhaps the search_path for running a maintenance command should be
> > the search_path set for the table owner (ALTER ROLE … SET search_path
> > …)?
>
> That's an interesting idea; I hadn't considered that, or at least not
> very deeply. I feel like it came up before but I can't remember what
> (if anything) was wrong with it.
>
> If we expanded this idea a bit, I could imagine it applying to SECURITY
> DEFINER functions as well, and that would make writing SECURITY DEFINER
> functions a lot safer.
>

I still think the right default is that CREATE FUNCTION stores, with the
function, the search_path in effect when CREATE FUNCTION runs, and that is
the search_path used to run the function (and don't "BEGIN ATOMIC" functions
partially work this way already?). But I suggest the owner's search_path as
an option which is clearly better than using the caller's search_path in
most cases.

I think the problems are essentially the same for security invoker vs.
security definer. The difference is that the problems are security problems
only for security definers.

>  After that, change search_path on function invocation as usual
> > rather than having special rules for what happens when a function is
> > invoked during a maintenance command.
>
> I don't follow what you mean here.
>

I’m referring to the idea that the search_path during function execution
should be determined at function creation time (or, at least, not at
function execution time). While this is a security requirement for security
definer functions, I think it’s what is wanted about 99.9% of the time
for security invoker functions as well. So when the maintenance command
ends up running a function, the search_path in effect during the function
execution will be the one established at function definition time; or if we
go with this "search_path associated with function owner" idea, then again
the search_path is determined by the usual rule (function owner), rather
than by any special rules associated with maintenance commands.


Re: Wrong security context for deferred triggers?

2023-11-06 Thread Isaac Morland
On Mon, 6 Nov 2023 at 11:58, Laurenz Albe  wrote:

> > Become a superuser again and commit:
> >
> >  RESET ROLE;
> >
> >  COMMIT;
> >  NOTICE:  current_user = postgres
> >
> >
> > So a deferred constraint trigger does not run with the same security
> > context as an immediate trigger.  This is somewhat nasty in combination
> > with SECURITY DEFINER functions: if that function performs an operation,
> > and that operation triggers a deferred trigger, that trigger will run in
> > the wrong security context.
> >
> > This behavior looks buggy to me.  What do you think?
> > I cannot imagine that it is a security problem, though.
>

This looks to me like another reason that triggers should run as the
trigger owner. Which role owns the trigger won’t change as a result of
constraints being deferred or not, or any role setting done during the
transaction, including that relating to security definer functions.

Right now triggers can’t do anything that those who can
INSERT/UPDATE/DELETE (i.e., cause the trigger to fire) can’t do, which in
particular breaks the almost canonical example of using triggers to log
changes — I can’t do it without also allowing users to make spurious log
entries.

Also if I cause a trigger to fire, I’ve just given the trigger owner the
opportunity to run arbitrary code as me.
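The canonical logging example, sketched with hypothetical table names.
Today the trigger function must be SECURITY DEFINER (with the caveats
discussed in this thread) so the trigger can write a log that callers
cannot forge:

```sql
CREATE TABLE audit_log (
    tab     text        NOT NULL,
    op      text        NOT NULL,
    at      timestamptz NOT NULL DEFAULT now(),
    -- session_user records the caller's login role even inside a
    -- SECURITY DEFINER function (current_user would record the owner).
    by_user text        NOT NULL DEFAULT session_user
);
-- Ordinary roles can fire the trigger but get no direct INSERT on the log.
REVOKE ALL ON audit_log FROM PUBLIC;

CREATE FUNCTION log_change() RETURNS trigger
    LANGUAGE plpgsql SECURITY DEFINER
    SET search_path = pg_catalog, public
AS $$
BEGIN
    INSERT INTO audit_log (tab, op) VALUES (TG_TABLE_NAME, TG_OP);
    RETURN NULL;  -- return value ignored for AFTER triggers
END
$$;

CREATE TRIGGER log_change
    AFTER INSERT OR UPDATE OR DELETE ON some_table
    FOR EACH ROW EXECUTE FUNCTION log_change();
```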


> I just realized one problem with running a deferred constraint trigger as
> the triggering role: that role might have been dropped by the time the
> trigger
> executes.  But then we could still error out.
>

This problem is also fixed by running triggers as their owners: there
should be a dependency between an object and its owner. So the
trigger-executing role can’t be dropped without dropping the trigger.


Re: Fix search_path for all maintenance commands

2023-10-31 Thread Isaac Morland
On Fri, 27 Oct 2023 at 19:04, Jeff Davis  wrote:

The approach of locking down search_path during maintenance commands
> would solve the problem, but it means that we are enforcing search_path
> in some contexts and not others. That's not great, but it's similar to
> what we are doing when we ignore SECURITY INVOKER and run the function
> as the table owner during a maintenance command, or (by default) for
> subscriptions.
>

I don't agree that this is ignoring SECURITY INVOKER. Instead, I see it as
running the maintenance command as the owner of the table, which is
therefore the invoker of the function. As far as I can tell we need to do
this for security anyway - otherwise as soon as superuser runs a
maintenance command (which it can already do), the owner of any function
called in the course of the maintenance operation has an opportunity to get
superuser.

For that matter, what would it even mean to ignore SECURITY INVOKER? Run
the function as its owner if it were SECURITY DEFINER?

I understand what ignoring SECURITY DEFINER would mean: obviously, don't
adjust the current user on entry and exit.

The privilege boundary should be at the point where the maintenance command
starts: the role with MAINTAIN privilege gets to kick off maintenance, but
doesn't get to specify anything after that, including the search_path (of
course, function execution search paths should not normally depend on the
caller's search path anyway, but that's a bigger discussion with an
unfortunate backward compatibility problem).

Perhaps the search_path for running a maintenance command should be the
search_path set for the table owner (ALTER ROLE … SET search_path …)? If
none set, the default "$user", public. After that, change search_path on
function invocation as usual rather than having special rules for what
happens when a function is invoked during a maintenance command.
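Sketched in SQL, the proposal reads roughly as follows (role and schema
names hypothetical):

```sql
-- A maintenance command such as VACUUM or REINDEX on tables owned by
-- table_owner would run under this configured search_path, not under
-- whatever search_path the invoking role happens to have.
ALTER ROLE table_owner SET search_path = myapp, public;
```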

My attempts to more generally try to lock down search_path for
> functions attached to tables didn't seem to get much consensus, so if
> we do make an exception to lock down search_path for maintenance
> commands only, it would stay an exception for the foreseeable future.
>


Re: PostgreSQL domains and NOT NULL constraint

2023-10-23 Thread Isaac Morland
On Mon, 23 Oct 2023 at 13:40, Tom Lane  wrote:

> I wrote:
> > Given the exception the spec makes for CAST, I wonder if we shouldn't
> > just say "NULL is a valid value of every domain type, as well as every
> > base type.  If you don't like it, too bad; write a separate NOT NULL
> > constraint for your table column."
>
> After ruminating on this for awhile, here's a straw-man proposal:
>

[]


> 3. Left unsaid here is whether we should treat assignments to,
> e.g., plpgsql variables as acting like assignments to table
> columns.  I'm inclined not to, because
>
> (3A) I'm lazy, and I'm also worried that we'd miss places where
> this arguably should happen.
>
> (3B) I don't think the SQL spec contemplates any such thing
> happening.
>
> (3C) Not doing that means we have a pretty consistent view of
> what the semantics are for "values in flight" within a query.
> Anything that's not stored in a table is "in flight" and so
> can be NULL.
>
> (3D) Again, if you don't like it, there's already ways to attach
> a separate NOT NULL constraint to plpgsql variables.
>
>
> Documenting this in an intelligible fashion might be tricky,
> but explaining the exact spec-mandated behavior wouldn't be
> much fun either.


This sounds pretty good.

I'd be OK with only running the CHECK clause on non-NULL values. This would
imply that "CHECK (VALUE NOT NULL)" would have exactly the same effect as
"CHECK (TRUE)" (i.e., no effect). This might seem insane but it avoids a
special case and in any event if somebody wants the NOT NULL behaviour,
they can get it by specifying NOT NULL in the CREATE DOMAIN command.

Then domain CHECK constraints are checked anytime a non-NULL value is
turned into a domain value, and NOT NULL ones are checked only when storing
to a table. CHECK constraints would be like STRICT functions; if the input
is NULL, the implementation is not run and the result is NULL (which for a
CHECK means accept the input).
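
A sketch of how the two clauses would behave under this proposal (domain names are made up):

```sql
-- CHECK is evaluated only for non-NULL input, like a STRICT function:
CREATE DOMAIN positive_int AS integer
    CHECK (VALUE > 0);               -- NULL passes through unchecked

-- NOT NULL is enforced only when storing to a table column:
CREATE DOMAIN positive_int_nn AS integer NOT NULL
    CHECK (VALUE > 0);

CREATE TABLE t (n positive_int_nn);  -- INSERT ... VALUES (NULL) fails here
```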

Whether I actually think the above is a good idea would require me to read
carefully the relevant section of the SQL spec. If it agrees that CHECK ()
is for testing non-NULL values and NOT NULL is for saying that columns of
actual tables can't be NULL, then I would probably agree with my own idea,
otherwise perhaps not depending on exactly what it said.

Some possible documentation wording to consider for the CREATE DOMAIN page:

Under "NOT NULL": "Table columns whose data type is this domain may not be
NULL, exactly as if NOT NULL had been given in the column specification."

Under "NULL": "This is a noise word indicating the default, which is that
the domain does not restrict NULL from occurring in table columns whose
data type is this domain."

Under "CHECK (expression)", replacing the first sentence: "CHECK clauses
specify integrity constraints or tests which non-NULL values of the domain
must satisfy; NULLs are never checked by domain CHECK clauses. To use a
domain to prevent a NULL from occurring in a table column, use the NOT NULL
clause."

Also, where it says "Expressions evaluating to TRUE or UNKNOWN succeed": Do
we really mean "Expressions evaluating to TRUE or NULL succeed"?

It would be nice if we had universally agreed terminology so that we would
have one word for the non-NULL things of various data types, and another
word for the possibly NULL things that might occur in variable or column.

If we decide we do want "CHECK (VALUE NOT NULL)" to work, then I wonder if
we could pass NULL to the constraint at CREATE DOMAIN time, and if it
returns FALSE, do exactly what we would have done (set pg_type.typnotnull)
if an actual NOT NULL clause had been specified? Then when actually
processing domain constraints during a query, we could use the above
procedure. I'm thinking about more complicated constraints that evaluate to
FALSE for NULL but which are not simply "CHECK (VALUE NOT NULL)".
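
Concretely, the CREATE DOMAIN-time test might look like this (a sketch of the idea, not an actual implementation):

```sql
-- At CREATE DOMAIN time, evaluate the CHECK expression with VALUE = NULL.
-- This constraint is not literally "VALUE NOT NULL" but still rejects NULL:
CREATE DOMAIN d AS integer CHECK (VALUE IS NOT NULL AND VALUE > 0);

-- CHECK(NULL IS NOT NULL AND NULL > 0) evaluates to FALSE, so the domain
-- would be recorded as if NOT NULL had been written (set
-- pg_type.typnotnull), and the CHECK skipped for NULLs thereafter.
```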

Is it an error to specify both NULL and NOT NULL? What about CHECK (VALUE
NOT NULL) and NULL?


Re: Pre-proposal: unicode normalized text

2023-10-17 Thread Isaac Morland
On Tue, 17 Oct 2023 at 11:15, Robert Haas  wrote:


> Are code points assigned from a gapless sequence? That is, is the
> implementation of codepoint_is_assigned(char) just 'codepoint <
> SOME_VALUE' and SOME_VALUE increases over time?
>

Not even close. Code points are organized in blocks, e.g. for mathematical
symbols or Ethiopic script. Sometimes new blocks are added, sometimes new
characters are added to existing blocks. Where they go is a combination of
convenience, history, and planning.


Re: Pre-proposal: unicode normalized text

2023-10-06 Thread Isaac Morland
On Fri, 6 Oct 2023 at 15:07, Jeff Davis  wrote:

> On Fri, 2023-10-06 at 13:33 -0400, Robert Haas wrote:
> > What I think people really want is a whole column in
> > some encoding that isn't the normal one for that database.
>
> Do people really want that? I'd be curious to know why.
>
> A lot of modern projects are simply declaring UTF-8 to be the "one true
> way". I am not suggesting that we do that, but it seems odd to go in
> the opposite direction and have greater flexibility for many encodings.
>

And even if they want it, we can give it to them when we send/accept the
data from the client; just because they want to store ISO-8859-1 doesn't
mean the actual bytes on the disk need to be that. And by "client" maybe I
mean the client end of the network connection, and maybe I mean the program
that is calling in to libpq.

If they try to submit data that cannot possibly be encoded in the stated
encoding because the bytes they submit don't correspond to any string in
that encoding, then that is unambiguously an error, just as trying to put
February 30 in a date column is an error.
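
Both cases are, in this view, the same kind of error (error messages paraphrased from memory):

```sql
-- A date that cannot exist:
SELECT '2024-02-30'::date;
-- ERROR:  date/time field value out of range

-- Bytes that cannot be decoded as the stated encoding
-- (0xFF is never a valid byte in UTF-8):
SELECT convert_from('\xff'::bytea, 'UTF8');
-- ERROR:  invalid byte sequence for encoding "UTF8"
```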

Is there a single other data type where anybody is even discussing letting
the client tell us how to write the data on disk?


Re: Pre-proposal: unicode normalized text

2023-10-05 Thread Isaac Morland
On Thu, 5 Oct 2023 at 07:32, Robert Haas  wrote:


> But I do think that sometimes users are reluctant to perform encoding
> conversions on the data that they have. Sometimes they're not
> completely certain what encoding their data is in, and sometimes
> they're worried that the encoding conversion might fail or produce
> wrong answers. In theory, if your existing data is validly encoded and
> you know what encoding it's in and it's easily mapped onto UTF-8,
> there's no problem. You can just transcode it and be done. But a lot
> of times the reality is a lot messier than that.
>

In the case you describe, the users don’t have text at all; they have
bytes, and a vague belief about what encoding the bytes might be in and
therefore what characters they are intended to represent. The correct way
to store that in the database is using bytea. Text types should be for when
you know what characters you want to store. In this scenario, the
implementation detail of what encoding the database uses internally to
write the data on the disk doesn't matter, any more than it matters to a
casual user how a table is stored on disk.
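
A hedged sketch of that approach (table and column names invented):

```sql
-- Store the uncertain-encoding data as raw bytes:
CREATE TABLE imported_docs (
    id   bigint PRIMARY KEY,
    body bytea             -- bytes whose encoding is only believed, not known
);

-- Once the encoding is actually known, decode explicitly:
SELECT convert_from(body, 'LATIN1') AS body_text FROM imported_docs;
```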

Similarly, I don't believe we have a "YMD" data type which stores year,
month, and day, without being specific as to whether it's Gregorian or
Julian; if you have that situation, make a 3-tuple type or do something
else. "Date" is for when you actually know what day you want to record.


Re: Pre-proposal: unicode normalized text

2023-10-04 Thread Isaac Morland
On Wed, 4 Oct 2023 at 17:37, Jeff Davis  wrote:

> On Wed, 2023-10-04 at 14:14 -0400, Isaac Morland wrote:
> > Always store only UTF-8 in the database
>
> What problem does that solve? I don't see our encoding support as a big
> source of problems, given that database-wide UTF-8 already works fine.
> In fact, some postgres features only work with UTF-8.
>

My idea is in the context of a suggestion that we support specifying the
encoding per column. I don't mean to suggest eliminating the ability to set
a server-wide encoding, although I doubt there is any use case for using
anything other than UTF-8 except for an old database that hasn’t been
converted yet.

I see no reason to write different strings using different encodings in the
data files, depending on what column they belong to. The various text types
are all abstract data types which store sequences of characters (not
bytes); if one wants bytes, then one has to encode them. Of course, if one
wants UTF-8 bytes, then the encoding is, under the covers, the identity
function, but conceptually it is still taking the characters stored in the
database and converting them to bytes according to a specific encoding.

By contrast, although I don’t see it as a top-priority use case, I can
imagine somebody wanting to restrict the characters stored in a particular
column to characters that can be encoded in a particular encoding. That is
what "CHARACTER SET LATIN1" and so on should mean.

> What about characters not in UTF-8?
>
> Honestly I'm not clear on this topic. Are the "private use" areas in
> unicode enough to cover use cases for characters not recognized by
> unicode? Which encodings in postgres can represent characters that
> can't be automatically transcoded (without failure) to unicode?
>

Here I’m just anticipating a hypothetical objection, “what about characters
that can’t be represented in UTF-8?” to my suggestion to always use UTF-8
and I’m saying we shouldn’t care about them. I believe the answers to your
questions in this paragraph are “yes”, and “none”.


Re: Pre-proposal: unicode normalized text

2023-10-04 Thread Isaac Morland
On Wed, 4 Oct 2023 at 14:05, Chapman Flack  wrote:

> On 2023-10-04 13:47, Robert Haas wrote:
>


> The SQL standard would have me able to:
>
> CREATE TABLE foo (
>a CHARACTER VARYING CHARACTER SET UTF8,
>b CHARACTER VARYING CHARACTER SET LATIN1
> )
>
> and so on, and write character literals like
>
> _UTF8'Hello, world!' and _LATIN1'Hello, world!'
>
> and have those columns and data types independently contain what
> they can contain, without constraints imposed by one overall
> database encoding.
>
> Obviously, we're far from being able to do that. But should it
> become desirable to get closer, would it be worthwhile to also
> try to follow how the standard would have it look?
>
> Clearly, part of the job would involve making the wire protocol
> able to transmit binary values and identify their encodings.
>

I would go in the other direction (note: I’m ignoring all backward
compatibility considerations related to the current design of Postgres).

Always store only UTF-8 in the database, and send only UTF-8 on the wire
protocol. If we still want to have a concept of "client encoding", have the
client libpq take care of translating the bytes between the bytes used by
the caller and the bytes sent on the wire.

Note that you could still define columns as you say, but the character set
specification would effectively act simply as a CHECK constraint on the
characters allowed, essentially CHECK (column_name ~ '^[...all characters
in encoding...]$*'). We don't allow different on-disk representations of
dates or other data types; except when we really need to, and then we have
multiple data types (e.g. int vs. float) rather than different ways of
storing the same datatype.
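
One way to make that CHECK sketch concrete today is to trap the conversion error in a helper function. This is only a sketch: `is_latin1` is invented, and it relies on `untranslatable_character` being the condition name for error code 22P05.

```sql
CREATE FUNCTION is_latin1(t text) RETURNS boolean
LANGUAGE plpgsql AS $$
BEGIN
    PERFORM convert_to(t, 'LATIN1');  -- raises on untranslatable characters
    RETURN true;
EXCEPTION WHEN untranslatable_character THEN
    RETURN false;
END;
$$;

CREATE TABLE example (
    -- "CHARACTER SET LATIN1" as a constraint, not an on-disk encoding:
    name text CHECK (is_latin1(name))
);
```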

What about characters not in UTF-8? If a character is important enough for
us to worry about in Postgres, it’s important enough to get a U+ number
from the Unicode Consortium, which automatically puts it in UTF-8. In the
modern context, "plain text" mean "UTF-8 encoded text", as far as I'm
concerned.


Re: Possibility to disable `ALTER SYSTEM`

2023-09-11 Thread Isaac Morland
On Mon, 11 Sept 2023 at 11:11, Magnus Hagander  wrote:

> I'm actually going to put a strong +1 to Gabriele's proposal. It's an
> > undeniable problem (I'm only seeing arguments regarding other ways the
> > system would be insecure), and there might be real use cases for users
> > outside kubernetes for having this feature and using it (meaning
> > disabling the use of ALTER SYSTEM).
>
> If enough people are in favor of it *given the known issues with it*,
> I can drop my objection to a "meh, but I still don't think it's a good
> idea".
>
> But to do that, there would need to be a very in-your-face warning in
> the documentation about it like "note that this only disables the
> ALTER SYSTEM command. It does not prevent a superuser from changing
> the configuration remotely using other means".
>
> For example, in the very simplest, with the POC patch out there now, I
> can still run:
>
[…]

Maybe in addition to making "ALTER SYSTEM" throw an error, the feature that
disables it should also disable reading postgresql.auto.conf? Maybe even
delete it and make it an error if it is present on startup (maybe even warn
if it shows up while the DB is running?).

Interesting corner case: What happens if I do "ALTER SYSTEM SET
alter_system_disabled = true"?

Counterpoint: maybe the idea is to disable ALTER SYSTEM but still use
postgresql.auto.conf, maintained by an external program, to control the
instance's behaviour.


Re: Possibility to disable `ALTER SYSTEM`

2023-09-08 Thread Isaac Morland
On Fri, 8 Sept 2023 at 10:03, Gabriele Bartolini <
gabriele.bartol...@enterprisedb.com> wrote:


> ALTER SYSTEM is already heavily restricted.
>
>
> Could you please help me better understand what you mean here?
>
>
>> I don't think we need random kluges added to the permissions system.
>
>
> If you allow me, why do you think disabling ALTER SYSTEM altogether is a
> random kluge? Again, I'd like to better understand this position. I've
> personally been in many conversations on the security side of things for
> Postgres in Kubernetes environments, and this is a frequent concern by
> users who request that changes to the Postgres system (not a database)
> should only be done declaratively and prevented from within the system.
>

Alternate idea, not sure how good this is: Use existing OS security
features (regular permissions, or more modern features such as the
immutable attribute) to mark the postgresql.auto.conf file as not being
writeable. Then any attempt to ALTER SYSTEM should result in an error.


Re: Logging of matching pg_hba.conf entry during auth skips trust auth, potential security issue

2023-08-21 Thread Isaac Morland
On Mon, 21 Aug 2023 at 19:23, Michael Paquier  wrote:

I am not sure that we need to change this historic term, TBH.  Perhaps
> it would be shorter to just rip off the trust method from the tree
> with a deprecation period but that's not something I'm much in favor
> off either (I use it daily for my own stuff, as one example).
> Another, more conservative approach may be to make it a developer-only
> option and discourage more its use in the docs.
>

I hope we're not really considering removing the "trust" method. For
testing and development purposes it's very handy — just tell the database,
running in a VM, to allow all connections and just believe who they say
they are from a client process running in the same or a different VM, with
no production data anywhere in sight and no connection to the real network.

If people are really getting confused and using it in production, then
change the documentation to make it even more clear that it is a
non-authenticating setting which is there specifically to bypass security
in testing contexts. Ultimately, real tools have the ability to cut your
arm off, and our documentation just needs to make clear which parts of
Postgres are like that.


Re: Faster "SET search_path"

2023-08-01 Thread Isaac Morland
On Wed, 2 Aug 2023 at 01:07, Nathan Bossart 
wrote:

> On Mon, Jul 31, 2023 at 10:28:31PM -0700, Jeff Davis wrote:
> > On Sat, 2023-07-29 at 12:44 -0400, Isaac Morland wrote:
> >> Essentially, "just" observe efficiently (somehow) that no change is
> >> needed, and skip changing it?
> >
> > I gave this a try and it speeds things up some more.
> >
> > There might be a surprise factor with an optimization like that,
> > though. If someone becomes accustomed to the function running fast,
> > then changing the search_path in the caller could slow things down a
> > lot and it would be hard for the user to understand what happened.
>
> I wonder if this is a good enough reason to _not_ proceed with this
> optimization.  At the moment, I'm on the fence about it.
>

Speaking as someone who uses a lot of stored procedures, many of which call
one another, I definitely want this optimization.

I don’t think the fact that an optimization might suddenly not work in a
certain situation is a reason not to optimize. What would our query planner
look like if we took that approach? Many people regularly fix performance
problems by making inscrutable adjustments to queries, sometimes after
having accidentally ruined performance by modifying the query. Instead, we
should try to find ways of making the performance more transparent. We
already have some features for this, but maybe more can be done.


Re: Faster "SET search_path"

2023-07-29 Thread Isaac Morland
On Sat, 29 Jul 2023 at 11:59, Jeff Davis  wrote:

Unfortunately, adding a "SET search_path" clause to functions slows
> them down. The attached patches close the performance gap
> substantially.
>
> Changes:
>
> 0001: Transform the settings in proconfig into a List for faster
> processing. This is simple and speeds up any proconfig setting.
>
> 0002: Introduce CheckIdentifierString(), which is a faster version of
> SplitIdentifierString() that only validates, and can be used in
> check_search_path().
>
> 0003: Cache of previous search_path settings. The key is the raw
> namespace_search_path string and the role OID, and it caches the
> computed OID list. Changes to the search_path setting or the role can
> retrieve the cached OID list as long as nothing else invalidates the
> cache (changes to the temp schema or a syscache invalidation of
> pg_namespace or pg_role).
>

I'm glad to see this work. Something related to consider, not sure if this
is helpful: can the case of the caller's search_path happening to be the
same as the SET search_path setting be optimized? Essentially, "just"
observe efficiently (somehow) that no change is needed, and skip changing
it? I ask because substantially all my functions are written using "SET
search_path FROM CURRENT", and then many of them call each other. As a
result, in my use I would say that the common case is a function being
called by another function, where both have the same search_path setting.
So ideally, the search_path would not be changed at all when entering and
exiting the callee.
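
The calling pattern described might look like this (function names invented):

```sql
-- Both functions capture the same search_path at definition time:
CREATE FUNCTION helper() RETURNS integer
    LANGUAGE sql SET search_path FROM CURRENT
    AS $$ SELECT 1 $$;

CREATE FUNCTION caller() RETURNS integer
    LANGUAGE sql SET search_path FROM CURRENT
    AS $$ SELECT helper() $$;

-- When caller() invokes helper(), both carry identical search_path
-- settings, so ideally no change would be applied on entry or exit.
```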


Re: cataloguing NOT NULL constraints

2023-07-25 Thread Isaac Morland
On Tue, 25 Jul 2023 at 14:59, Robert Haas  wrote:

> On Tue, Jul 25, 2023 at 1:33 PM Isaac Morland 
> wrote:
> > My suggestion is for \d+ to show NOT NULL constraints only if there is
> something weird going on (wrong name, duplicate constraints, …). If there
> is nothing weird about the constraint then explicitly listing it provides
> absolutely no information that is not given by "not null" in the "Nullable"
> column. Easier said than done I suppose. I'm just worried about my \d+
> displays becoming less useful.
>
> I mean, the problem is that if you want to ALTER TABLE .. DROP
> CONSTRAINT, you need to know what the valid arguments to that command
> are, and the names of these constraints will be just as valid as the
> names of any other constraints.
>

Can't I just ALTER TABLE … DROP NOT NULL still?

OK, I suppose ALTER CONSTRAINT to change the deferrable status and validity
(that is why we're doing this, right?) needs the constraint name. But the
constraint name is formulaic by default, and my proposal is to suppress it
only when it matches the formula, so you could just construct the
constraint name using the documented formula if it's not explicitly listed.
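
In other words (constraint name shown per the default naming formula discussed upthread, so it is an assumption here):

```sql
-- Dropping the constraint without knowing its name:
ALTER TABLE t ALTER COLUMN c DROP NOT NULL;

-- Or by constructing the default name from table and column:
ALTER TABLE t DROP CONSTRAINT t_c_not_null;
```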

I really don’t see it as a good use of space to add n lines to the \d+
display just to confirm that the "not null" designations in the "Nullable"
column are implemented by named constraints with the expected names.


Re: cataloguing NOT NULL constraints

2023-07-25 Thread Isaac Morland
On Tue, 25 Jul 2023 at 12:24, Alvaro Herrera 
wrote:

> On 2023-Jul-25, Isaac Morland wrote:
>
> > I agree. I definitely do *not* want a bunch of NOT NULL constraint names
> > cluttering up displays. Can we legislate that all NOT NULL implementing
> > constraints are named by mashing together the table name, column name,
> and
> > something to identify it as a NOT NULL constraint?
>
> All constraints are named like that already, and NOT NULL constraints
> just inherited the same idea.  The names are <table>_<column>_not_null
> for NOT NULL constraints.  pg_dump goes great lengths to avoid printing
> constraint names when they have this pattern.
>

OK, this is helpful. Can \d do the same thing? I use a lot of NOT NULL
constraints and I very seriously do not want \d (including \d+) to have an
extra line for almost every column. It's just noise, and while my screen is
large, it's still not infinite.

I do not want these constraint names cluttering the output either.
> That's why I propose moving them to a new \d++ command, where they will
> only bother you if you absolutely need them.  But so far I have only one
> vote supporting that idea.


My suggestion is for \d+ to show NOT NULL constraints only if there is
something weird going on (wrong name, duplicate constraints, …). If there
is nothing weird about the constraint then explicitly listing it provides
absolutely no information that is not given by "not null" in the "Nullable"
column. Easier said than done I suppose. I'm just worried about my \d+
displays becoming less useful.


Re: cataloguing NOT NULL constraints

2023-07-25 Thread Isaac Morland
On Tue, 25 Jul 2023 at 11:39, Robert Haas  wrote:

>
> I'm not really thrilled with the idea of every not-null constraint
> having a name, to be honest. Of all the kinds of constraints that we
> have in the system, NOT NULL constraints are probably the ones where
> naming them is least likely to be interesting, because they don't
> really have any interesting properties. A CHECK constraint has an
> expression; a foreign key constraint has columns that it applies to on
> each side plus the identity of the table and opclass information, but
> a NOT NULL constraint seems like it can never have any property other
> than which column. So it sort of seems like a waste to name it. But if
> we want it catalogued then we don't really have an option, so I
> suppose we just have to accept a bit of clutter as the price of doing
> business.
>

I agree. I definitely do *not* want a bunch of NOT NULL constraint names
cluttering up displays. Can we legislate that all NOT NULL implementing
constraints are named by mashing together the table name, column name, and
something to identify it as a NOT NULL constraint? Maybe even something
like pg_not_null_[relname]_[attname] (with some escaping), using the pg_
prefix to make the name reserved similar to schemas and tables? And then
don't show such constraints in \d, not even \d+ - just indicate it in
the Nullable column of the column listing as done now. Show a NOT NULL
constraint if there is something odd about it - for example, if it gets
renamed, or not renamed when the table is renamed.

Sorry for the noise if this has already been decided otherwise.


Re: Looking for context around which event triggers are permitted

2023-07-17 Thread Isaac Morland
On Mon, 17 Jul 2023 at 11:26, Aleksander Alekseev 
wrote:

> Hi,
>
> > I was working on a project with event triggers and was wondering if
> there was any context from the developers around why some things make this
> list and others do not. Example: REVOKE/ GRANT are in the event trigger
> matrix [1] but REINDEX is not. Just wondering if there's a mailing list
> thread or a commit message that has more info. I can't seem to find
> anything in the postgres list archives. Thanks!
> >
> > [1] https://www.postgresql.org/docs/15/event-trigger-matrix.html
>
> Good question. My guess would be that no one really needed an event
> trigger for REINDEX so far.
>

My answer is not authoritative, but I notice that ANALYZE and VACUUM are
also not there. Those, together with REINDEX, are maintenance commands,
which normally should not affect which queries you can run or their
results. If we think of the queries we can run and the objects we can run
them against as forming an abstraction with maintenance commands breaking
the abstraction, then we can think of event triggers as operating against
the abstraction layer, not the underlying maintenance layer.

On the other hand, the event triggers include tags related to indexes,
which themselves (except for enforcement of uniqueness) in some sense sit
below the abstraction: presence of an index can affect the query plan and
how efficient it is, but shouldn't change the result of a query or whether
it is a valid query. So this is not a fully satisfactory explanation.


Re: Forgive trailing semicolons inside of config files

2023-07-11 Thread Isaac Morland
On Tue, 11 Jul 2023 at 10:43, Greg Sabino Mullane 
wrote:

> This has been a long-standing annoyance of mine. Who hasn't done something
> like this?:
>
> psql> SET random_page_cost = 2.5;
> (do some stuff, realize that rpc was too high)
>
> Let's put that inside of postgresql.conf:
>
>
> #--
> # CUSTOMIZED OPTIONS
>
> #--
>
> # Add settings for extensions here
>
> random_page_cost = 2.5;
>
>
> Boom! Server will not start. Surely, we can be a little more liberal in
> what we accept? Attached patch allows a single trailing semicolon to be
> silently discarded. As this parsing happens before the logging collector
> starts up, the error about the semicolon is often buried somewhere in a
> separate logfile or journald - so let's just allow postgres to start up
> since there is no ambiguity about what random_page_cost (or any other GUC)
> is meant to be set to.
>

Please, no!

There is no end to accepting sloppy syntax. What next, allow "SET
random_page_cost = 2.5;" (with or without semicolon) in config files?

I'd be more interested in improvements in visibility of errors. For
example, maybe if I try to start the server and there is a config file
problem, I could somehow get a straightforward error message right in the
terminal window complaining about the line of the configuration which is
wrong.

Or maybe there could be a "check configuration" subcommand which checks the
configuration. If it's fine, say so and set a flag saying the server is
clear to be started/restarted. If not, give useful error messages and don't
set the flag. Then make the start/restart commands only do their thing if
the "config OK" flag is set. Make sure that editing the configuration
clears the flag (or have 2 copies of the configuration, copied over by the
"check" subcommand: one for editing, one for running with).

This might properly belong outside of Postgres itself, I don't know. But I
think it would be way more useful than a potentially never-ending series of
patches to liberalize the config parser.


Re: Fix search_path for all maintenance commands

2023-07-06 Thread Isaac Morland
On Thu, 6 Jul 2023 at 21:39, Jeff Davis  wrote:

I apologize in advance if anything I’ve written below is either too obvious
or too crazy or misinformed to belong here. I hope I have something to say
that is on point, but feel unsure what makes sense to say.

* It might break for users who have a functional index where the
> function implicitly depends on a search_path containing a namespace
> other than pg_catalog. My opinion is that such functional indexes are
> conceptually broken and we need to desupport them, and there will be
> some breakage, but I'm open to suggestion about how we minimize that (a
> compatibility GUC or something?).
>

I agree this is OK. If somebody has an index whose meaning depends on the
search_path, then the best that can be said is that their database hasn't
been corrupted yet. At the same time, I can see that somebody would get
upset if they couldn't upgrade their database because of this. Maybe
pg_upgrade could apply "SET search_path TO pg_catalog, pg_temp" to any
function used in a functional index that doesn't have a search_path setting
of its own? (BEGIN ATOMIC functions count, if I understand correctly, as
having a search_path setting, because the lookups happen at definition time)
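
The fix-up pg_upgrade might apply would be something like this (function name hypothetical):

```sql
-- For each function used in an expression index that has no search_path
-- setting of its own:
ALTER FUNCTION my_index_expr(text) SET search_path = pg_catalog, pg_temp;
```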

Now I'm doing more reading and I'm worried about SET TIME ZONE (or more
precisely, its absence) and maybe some other ones.

* The fix might not go far enough or might be in the wrong place. I'm
> open to suggestion here, too. Maybe we can make it part of the general
> function call mechanism, and can be overridden by explicitly setting
> the function search path? Or maybe we need new syntax where the
> function can acquire the search path from the session explicitly, but
> uses a safe search path by default?
>

Change it so by default each function gets handled as if "SET search_path
FROM CURRENT" was applied to it? That's what I do for all my functions
(maybe hurting performance?). Expand on my pg_upgrade idea above by
applying it to all functions?

I feel that this may tie into other behaviour issues where to me it is
obvious that the expected behaviour should be different from the actual
behaviour. If a view calls a function, shouldn't it be called in the
context of the view's definer/owner? It's weird that I can write a view
that filters a table for users of the view, but as soon as the view calls
functions they run in the security context of the user of the view. Are
views security definers or not? Similar comment for triggers. Also as far
as I can tell there is no way for a security definer function to determine
who (which user) invoked it. So I can grant/deny access to run a particular
function using permissions, but I can't have the supposed security definer
define security for different callers.

Is the fundamental problem that we now find ourselves wanting to do things
that require different defaults to work smoothly? On some level I suspect
we want lexical scoping, which is what most of us have in our programming
languages, in the database; but the database has many elements of dynamic
scoping, and changing that is both a compatibility break and requires
significant changes in the way the database is designed.


Re: When IMMUTABLE is not.

2023-06-15 Thread Isaac Morland
On Thu, 15 Jun 2023 at 10:49, Tom Lane  wrote:

In particular, we've never enforced that an immutable function can't
> call non-immutable functions.  While that would seem like a good idea
> in the abstract, we've intentionally not tried to do it.  (I'm pretty
> sure there is more than one round of previous discussions of the point
> in the archives, although locating relevant threads seems hard.)
> One reason not to is that polymorphic functions have to be marked
> with worst-case volatility labels.  There are plenty of examples of
> functions that are stable for some input types and immutable for
> others (array_to_string, for instance); but the marking system can't
> represent that so we have to label them stable.  Enforcing that a
> user-defined immutable function can't use such a function might
> just break things for no gain.
>

More sophisticated type systems (which I am *not* volunteering to graft
onto Postgres) can handle some of this, but even Haskell has
unsafePerformIO. The current policy is both wise and practical.


Disk space not released after schema deletion

2023-06-07 Thread Isaac Morland
The usual question is “why did DELETE not release disk space?”, and I
understand why that is and something about how to get the space back
(VACUUM).

I have a database which hosts multiple applications in various schemas and
I’m trying to make test/sample data files by starting with a restored copy
of production and then dropping all schemas except for the ones I need for
a particular application.

The total size of all relations after the drop operations is just a few MB:

odyssey=# select sum (pg_total_relation_size (oid)) from pg_class;
   sum
----------
 13877248
(1 row)

Yet the database size is still large (although much smaller than in the
original database):

odyssey=# select datname, pg_database_size (oid) from pg_database;
  datname  | pg_database_size
-----------+------------------
 postgres  |          8930083
 _repmgr   |        654934531
 template0 |          8643075
 template1 |          8864547
 odyssey   |      14375453475
(5 rows)

The only change made after starting from a basebackup of production was to
set all the passwords to NULL in pg_authid, and to delete most of the
schemas. In particular, I wouldn’t expect VACUUM to do anything.

Does anybody know what could be holding all that space?
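One hedged way to hunt for such space, assuming superuser access and the default tablespace layout, is to list files in the database directory that no pg_class entry accounts for. This is only a sketch: it ignores non-main forks (_fsm, _vm) and extra segment files, and pg_relation_filenode() returns NULL for mapped catalogs.

```sql
-- Sketch: files in this database's directory with no matching filenode.
SELECT f AS file,
       (pg_stat_file('base/' || d.oid || '/' || f)).size AS bytes
FROM   pg_database d,
       LATERAL pg_ls_dir('base/' || d.oid::text) AS f
WHERE  d.datname = current_database()
  AND  f ~ '^[0-9]+$'          -- main fork, first segment only
  AND  NOT EXISTS (SELECT 1 FROM pg_class c
                   WHERE pg_relation_filenode(c.oid) = f::oid);
```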


Re: PG 16 draft release notes ready

2023-05-19 Thread Isaac Morland
On Fri, 19 May 2023 at 22:59, jian he  wrote:

>
> Sorry for changing the subject line.
>
> these two commits seems not mentioned.
>

On a similar topic, should every committed item from the commitfest be
mentioned, or only ones that are significant enough?

I’m wondering because I had a role in this very small item, yet I don’t see
it listed in the psql section:

https://commitfest.postgresql.org/42/4133/

It’s OK if we don’t mention every single change, I just want to make sure I
understand.


Re: xmlserialize bug - extra empty row at the end

2023-04-23 Thread Isaac Morland
On Sun, 23 Apr 2023 at 12:28, Pavel Stehule  wrote:

>
>
> On Sun, 23 Apr 2023 at 18:03, Isaac Morland 
> wrote:
>
>> On Sun, 23 Apr 2023 at 10:52, Tom Lane  wrote:
>>
>>> Isaac Morland  writes:
>>>
>>
>>
>>> > I might go so
>>> > far as to change the psql display routines to not leave a blank line
>>> after
>>> > the content in the event it ends with a newline.
>>>
>>> psql has *no* business changing what it displays: if what came from the
>>> server has a trailing newline, that had better be made visible.  Even if
>>> we thought it was a good idea, it's about 25 years too late to reconsider
>>> that.  However, xmlserialize()'s new indenting behavior isn't set in
>>> stone yet, so we could contemplate chomping newlines within that.
>>
>>
>> The trailing newline is made visible by the little bent arrow character
>> that appears at the right hand side. So you could still tell whether the
>> value ended with a trailing newline. I agree that simply dropping the
>> trailing newline before displaying the value would be a very bad idea.
>>
>
> What is benefit or usage of this trailing newline?
>

When creating a text file, it is conventional to end it with a newline.
Every single line of the file is ended with a newline, including the last
line of a file. Various tools deal with text files which are missing the
newline on the last line in various ways depending on context. If you "cat"
a file which is missing its trailing newline, the command prompt naturally
ends up on the same line of the display as the characters after the last
newline. Tools like “less” often adjust their display so the presence or
absence of the trailing newline makes no difference.

So it’s not so much about benefit or usage, it’s about what text files
normally contain. Newline characters are used as line terminators, not line
separators.

Of course, it's conventional for a database value not to end with a
newline. If I store a person's name in the database, it would be weird to
append a newline to the end. Here we have serialized XML which we tend to
think of storing in a text file — where one would expect it to end with a
newline — but we're displaying it in a table cell as part of the output of
a database query, where one typically does not expect values to end with a
newline (although they can, and psql displays values differently even if
they differ only in the presence or absence of a newline at the end).

If you were to load a typical text file into a column of a database row and
display it using psql, you would see the same phenomenon.

My inclination would be, if we're just calling to a long-standardized
library routine, to just accept its output as is. If a program is saving
the output to a text file, that would be the expected behaviour. If not,
then we need to document that the output of our function is the output of
the library function, minus the trailing newline.

Personally, when I do some formatting I prefer adding newlines on client
> side before necessity of removing.
>


Re: xmlserialize bug - extra empty row at the end

2023-04-23 Thread Isaac Morland
On Sun, 23 Apr 2023 at 10:52, Tom Lane  wrote:

> Isaac Morland  writes:
>


> > I might go so
> > far as to change the psql display routines to not leave a blank line
> after
> > the content in the event it ends with a newline.
>
> psql has *no* business changing what it displays: if what came from the
> server has a trailing newline, that had better be made visible.  Even if
> we thought it was a good idea, it's about 25 years too late to reconsider
> that.  However, xmlserialize()'s new indenting behavior isn't set in
> stone yet, so we could contemplate chomping newlines within that.


The trailing newline is made visible by the little bent arrow character
that appears at the right hand side. So you could still tell whether the
value ended with a trailing newline. I agree that simply dropping the
trailing newline before displaying the value would be a very bad idea.


Re: xmlserialize bug - extra empty row at the end

2023-04-23 Thread Isaac Morland
On Sun, 23 Apr 2023 at 01:31, Pavel Stehule  wrote:

> Hi
>
> maybe I found a bug in xmlserialize
>
> SELECT xmlserialize(DOCUMENT '42'
> AS varchar INDENT);
>
> (2023-04-23 07:27:53) postgres=# SELECT xmlserialize(DOCUMENT
> '42' AS varchar INDENT);
> ┌─┐
> │  xmlserialize   │
> ╞═╡
> │   ↵│
> │   ↵│
> │ 42↵│
> │  ↵│
> │  ↵│
> │ │
> └─┘
> (1 row)
>
> Looks so there is an extra empty row.
>

I wouldn't necessarily worry about this much. There is not, as such, an
extra blank line at the end; rather it is conventional that a text file
should end with a newline character. That is, conventionally every single
line in a text file ends with a newline character, meaning the only text
file that doesn't end with a newline is the empty file. You can see this in
tools like diff, which explicitly report "no newline at end of file" if the
file ends with a different character.

If you were to save the value to a file you would probably want it the way
it is.

That being said, this is a database column result and I agree it would look
more elegant if the blank line in the display were not there. I might go so
far as to change the psql display routines to not leave a blank line after
the content in the event it ends with a newline.


Re: Mark a transaction uncommittable

2023-04-22 Thread Isaac Morland
On Sat, 22 Apr 2023 at 11:01, Gurjeet Singh  wrote:

> This is a proposal for a new transaction characteristic. I haven't
> written any code, yet, and am interested in hearing if others may find
> this feature useful.
>
> Many times we start a transaction that we never intend to commit;
> for example, for testing, or for EXPLAIN ANALYZE, or after detecting
> unexpected results but still wanting to execute more commands
> without risking a commit, etc.
>
> A user would like to declare their intent to eventually abort the
> transaction as soon as possible, so that the transaction does not
> accidentally get committed.
>

I have an application for this: creating various dev/test versions of data
from production.

Start by restoring a copy of production from backup. Then successively
create several altered versions of the data and save them to a place where
developers can pick them up. For example, you might have one version which
has all data old than 1 year deleted, and another where 99% of the
students/customers/whatever are deleted. Anonymization could also be
applied. This would give you realistic (because it ultimately originates
from production) test data.

This could be done by starting a non-committable transaction, making the
adjustments, then doing a pg_dump in the same transaction (using --snapshot
to allow it to see that transaction). Then rollback, and repeat for the
other versions. This saves repeatedly restoring the (probably very large)
production data each time.

What I’m not sure about is how long it takes to rollback a transaction. I'm
assuming that it’s very quick compared to restoring from backup.

It would be nice if pg_basebackup could also have the --snapshot option.


Re: CAST(... ON DEFAULT) - WIP build on top of Error-Safe User Functions

2023-03-28 Thread Isaac Morland
On Mon, 19 Dec 2022 at 17:57, Corey Huinker  wrote:

>
> Attached is my work in progress to implement the changes to the CAST()
> function as proposed by Vik Fearing.
>
> CAST(expr AS typename NULL ON ERROR)
> will use error-safe functions to do the cast of expr, and will return
> NULL if the cast fails.
>
> CAST(expr AS typename DEFAULT expr2 ON ERROR)
> will use error-safe functions to do the cast of expr, and will return
> expr2 if the cast fails.
>

Is there any difference between NULL and DEFAULT NULL?
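(For what it's worth, the NULL ON ERROR case can already be approximated by hand on version 16 using the error-safe input machinery; a sketch:)

```sql
-- Roughly CAST('123x' AS integer NULL ON ERROR), using PG16's
-- pg_input_is_valid instead of the proposed syntax:
SELECT CASE WHEN pg_input_is_valid('123x', 'integer')
            THEN '123x'::integer
       END AS result;   -- NULL rather than an error
```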


Re: Infinite Interval

2023-03-25 Thread Isaac Morland
On Sat, 25 Mar 2023 at 15:59, Tom Lane  wrote:

> Joseph Koshakow  writes:
> > In terms of adding/subtracting infinities, the IEEE standard is
> > paywalled and I don't have a copy. I tried finding information online but
> > I also wasn't able to find anything useful. I additionally checked to see
> > the results of C++, C, and Java, and they all match which increases my
> > confidence that we're doing the right thing. Does anyone happen to have
> > a copy of the standard and can confirm?
>
> I think you can take it as read that simple C test programs on modern
> platforms will exhibit IEEE-compliant handling of float infinities.
>

Additionally, the Java language specification claims to follow IEEE 754:

https://docs.oracle.com/javase/specs/jls/se11/html/jls-15.html#jls-15.18.2

So either C and Java agree with each other and with the spec, or they
disagree in the same way even while at least one of them explicitly claims
to be following the spec. I think you're on pretty firm ground.


Re: Remove source code display from \df+?

2023-03-02 Thread Isaac Morland
On Thu, 2 Mar 2023 at 17:20, Tom Lane  wrote:

> Isaac Morland  writes:
> > [ 0001-Remove-source-code-display-from-df-v6.patch ]
>
> Pushed after some editorialization on the test case.
>

Thanks!

One thing I noticed while testing is that if you apply \df+ to an
> aggregate function, it will show "Internal name" of "aggregate_dummy".
> While that's an accurate description of what's in prosrc, it seems
> not especially useful and perhaps indeed confusing to novices.
> So I thought about suppressing it.  However, that would require
> a server-version-dependent test and I wasn't quite convinced it'd
> be worth the trouble.  Any thoughts on that?
>

I think it’s OK. Right now \df+ claims that the source code for an
aggregate function is “aggregate_dummy”; that’s probably more untrue than
saying that its internal name is “aggregate_dummy”. There are several
features of aggregate functions that are always defined the same way by the
creation process; who’s to say they don’t all have a shared dummy internal
name?


Unable to create table of view row type

2023-02-21 Thread Isaac Morland
I thought I should be able to do this:

=> create view testv as values (1, 'a'), (2, 'b'), (3, 'c');
CREATE VIEW
=> create table testt of testv;
ERROR:  type testv is not a composite type

But as you can see I can’t. pg_type seems to think the type is composite:

ijmorlan=> select typtype from pg_type where typname = 'testv';
 typtype
─────────
 c
(1 row)

I’m guessing there are good reasons this isn’t supported, so I’m thinking
to provide a documentation patch for CREATE TABLE and ALTER TABLE to
specify that the type given for OF type_name must be a
non-relation-row-type composite type.

Am I missing something?
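(In case it helps anyone who lands here: a workaround sketch is to declare an equivalent free-standing composite type, which CREATE TABLE ... OF does accept. The column names and types below mirror what the VALUES view in the example produces.)

```sql
CREATE TYPE testv_t AS (column1 integer, column2 text);
CREATE TABLE testt OF testv_t;
```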


Re: Set arbitrary GUC options during initdb

2023-01-27 Thread Isaac Morland
On Fri, 27 Jan 2023 at 09:49, Peter Eisentraut <
peter.eisentr...@enterprisedb.com> wrote:

> On 25.01.23 22:25, Tom Lane wrote:
> > So this invents an initdb switch "-c NAME=VALUE" just like the
> > one that the server itself has long had.
>
> This seems useful.
>
> > The specified settings
> > are applied on the command line of the initial probe calls
> > (which happen before we've made any config files), and then they
> > are added to postgresql.auto.conf, which causes them to take
> > effect for the bootstrap backend runs as well as subsequent
> > postmaster starts.
>
> I would have expected them to be edited into postgresql.conf.  What are
> the arguments for one or the other?
>

That would be my expectation also. I believe that is how it works now for
options which can be set by initdb, such as locale and port. I view
postgresql.auto.conf being for temporary changes, or changes related to
different instances within a replication setup, or whatever other uses
people come up with - but not for the permanent configuration established
by initdb.

In particular, I would be surprised if removing a postgresql.auto.conf
completely disabled an instance. Obviously, in my replication setup
example, the replication would be broken, but the basic operation of the
instance would still be possible.

Also, if somebody wants to put a change in postgresql.auto.conf, they can
easily do it using ALTER SYSTEM once the instance is running, or by just
writing out their own postgresql.auto.conf before starting it. Putting a
change in postgresql.conf programmatically is a bit of a pain.
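For comparison, once the instance is up, entries in postgresql.auto.conf are managed like this, which is why the file reads as a home for runtime-adjusted settings rather than for the permanent configuration established by initdb:

```sql
ALTER SYSTEM SET work_mem = '64MB';   -- adds/updates an auto.conf entry
ALTER SYSTEM RESET work_mem;          -- removes the entry again
SELECT pg_reload_conf();              -- apply without a restart
```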


Re: Re: Support plpgsql multi-range in conditional control

2023-01-25 Thread Isaac Morland
On Wed, 25 Jan 2023 at 12:02, Pavel Stehule  wrote:

>
>
> On Wed, 25 Jan 2023 at 17:22, songjinzhou <2903807...@qq.com> wrote:
>
>>
>> As things stand, we have to repeat the FOR statement, as in this SQL:
>>
>> begin
>> for i in 10..20 loop
>> raise notice '%', i; -- Things to do
>> end loop;
>>
>> for i in 100 .. 200 by 10 loop
>> raise notice '%', i; -- Things to do
>> end loop;
>> end;
>>
>> But now we can simplify it as follows:
>>
>> begin
>> for i in 10..20, 100 .. 200 by 10 loop
>> raise notice '%', i; -- Things to do
>> end loop;
>> end;
>>
>> Although we can only use integer iterative control here, this is just a
>> horizontal expansion of the previous logic. Thank you very much for your
>> reply. I am very grateful!
>>
>>
> Unfortunately, this is not a real use case - this is not an example from
> the real world.
>

And anyway, this is already supported using generate_series() and UNION:

odyssey=> do $$ declare i int; begin for i in select generate_series (10,
20) union all select generate_series (100, 200, 10) loop raise notice
'i=%', i; end loop; end;$$;
NOTICE:  i=10
NOTICE:  i=11
NOTICE:  i=12
NOTICE:  i=13
NOTICE:  i=14
NOTICE:  i=15
NOTICE:  i=16
NOTICE:  i=17
NOTICE:  i=18
NOTICE:  i=19
NOTICE:  i=20
NOTICE:  i=100
NOTICE:  i=110
NOTICE:  i=120
NOTICE:  i=130
NOTICE:  i=140
NOTICE:  i=150
NOTICE:  i=160
NOTICE:  i=170
NOTICE:  i=180
NOTICE:  i=190
NOTICE:  i=200
DO
odyssey=>

The existing x..y notation is just syntactic sugar for a presumably common
case (although I’m dubious how often one really loops through a range of
numbers — surely in a database looping through a query result is
overwhelmingly dominant?); I don’t think you’ll find much support around
here for adding more syntax possibilities to the loop construct.


Re: Unicode grapheme clusters

2023-01-24 Thread Isaac Morland
On Tue, 24 Jan 2023 at 11:40, Greg Stark  wrote:

>
> At the end of the day Unicode kind of assumes a variable-width display
> where the rendering is handled by something that has access to the
> actual font metrics. So anything trying to line things up in columns
> in a way that works with any rendering system down the line using any
> font is going to be making a best guess.
>

Really what is needed is another Unicode attribute: how many columns of a
monospaced display each character (or grapheme cluster) should take up. The
standard should include a precisely defined function that can take any
sequence of characters and give back its width in monospaced display
character spaces. Typefaces should only qualify as monospaced if they
respect this standard-defined computation.

Note that this is not actually a new thing: this was included in ASCII
implicitly, with a value of 1 for every character, and a value of n for
every n-character string. It has always been possible to line up values
displayed on monospaced displays by adding spaces, and it is only the
omission of this feature from Unicode which currently makes it impossible.
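A small illustration of the gap, assuming a UTF-8 database: the server can count characters and bytes, but nothing reports display cells, which is exactly the attribute being asked for.

```sql
SELECT s, char_length(s) AS chars, octet_length(s) AS bytes
FROM (VALUES ('abc'),            -- 3 chars, 3 cells
             ('日本語'),          -- 3 chars, typically 6 cells
             ('e' || chr(769))   -- 2 chars (combining accent), 1 cell
     ) AS t(s);
```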


Re: Remove source code display from \df+?

2023-01-22 Thread Isaac Morland
On Sun, 22 Jan 2023 at 21:37, Justin Pryzby  wrote:

> On Sun, Jan 22, 2023 at 08:23:25PM -0500, Isaac Morland wrote:
> > > Were you able to test with your own github account ?
> >
> > I haven’t had a chance to try this. I must confess to being a bit
> confused
> > by the distinction between running the CI tests and doing "make check";
> > ideally I would like to be able to run all the tests on my own machine
> > without any external resources. But at the same time I don’t pretend to
> > understand the full situation so I will try to use this when I get some
> > time.
>
> First: "make check" only runs the sql tests, and not the perl tests
> (including pg_upgrade) or isolation tests.  check-world runs everything.
>

Thanks very much. I should have remembered check-world, and of course the
fact that the CI tests multiple platforms. I’ll go and do some
reading/re-reading; now that I’ve gone through some parts of the process
I’ll probably understand more.

The latest submission appears to have passed:

http://cfbot.cputube.org/isaac-morland.html

However, one of the jobs (Windows - Server 2019, MinGW64 - Meson) is paused
and appears never to have run:

https://cirrus-ci.com/task/6687014536347648

Other than that, I think this is passing the tests.


Re: Remove source code display from \df+?

2023-01-22 Thread Isaac Morland
On Sun, 22 Jan 2023 at 17:27, Justin Pryzby  wrote:

> On Sun, Jan 22, 2023 at 04:28:21PM -0500, Isaac Morland wrote:
> > On Sun, 22 Jan 2023 at 15:04, Tom Lane  wrote:


> But now I'm having a problem I don't understand: the CI are still
> failling,
> > but not in the psql test. Instead, I get this:
> >
> > [20:11:17.624] +++ tap check in src/bin/pg_upgrade +++
>
> You'll find the diff in the "artifacts", but not a separate "diff" file.
>
> https://api.cirrus-ci.com/v1/artifact/task/6146418377752576/testrun/build/testrun/pg_upgrade/002_pg_upgrade/log/regress_log_002_pg_upgrade
>
>  CREATE FUNCTION public.regress_psql_df_sql() RETURNS void
>  LANGUAGE sql
>  BEGIN ATOMIC
> - SELECT NULL::text;
> + SELECT NULL::text AS text;
>  END;
>
> It's failing because after restoring the function, the column is named
> "text" - maybe it's a bug.
>

OK, thanks. I'd say I've uncovered a bug related to pg_upgrade, unless I’m
missing something. However, I've adjusted my patch so that nothing it
creates is kept. This seems tidier even without the test failure.

Tom's earlier point was that neither the function nor its owner needs to
> be preserved (as is done to exercise pg_dump/restore/upgrade - surely
> functions are already tested).  Dropping it when you're done running \df
> will avoid any possible issue with pg_upgrade.
>
> Were you able to test with your own github account ?
>

I haven’t had a chance to try this. I must confess to being a bit confused
by the distinction between running the CI tests and doing "make check";
ideally I would like to be able to run all the tests on my own machine
without any external resources. But at the same time I don’t pretend to
understand the full situation so I will try to use this when I get some
time.


0001-Remove-source-code-display-from-df-v6.patch
Description: Binary data


Re: Remove source code display from \df+?

2023-01-22 Thread Isaac Morland
On Sun, 22 Jan 2023 at 16:56, Justin Pryzby  wrote:

> On Sun, Jan 22, 2023 at 03:04:14PM -0500, Tom Lane wrote:
>

> That's excessive.  The policy Alvaro mentions applies to globally-visible
> > object names (i.e., database, role, and tablespace names), and it's there
> > to try to ensure that doing "make installcheck" against a live
> > installation won't clobber any non-test-created objects.  There's no
> point
> > in having such a policy within a test database --- its most likely effect
> > there would be to increase the risk that different test scripts step on
> > each others' toes.  If you feel a need for a name prefix for non-global
> > objects, use something based on the name of your test script.
>
> But we *are* talking about the role to be created to allow stable output
> of \df+ , so it's necessary to name it "regress_*".  To appease
> ENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS, and to avoid clobbering
> global objects during "installcheck".
>

Tom is talking about my informal policy of prefixing all objects. Only
global objects need to be prefixed with regress_, but I prefixed everything
I created (functions as well as the role). I actually called the
role regress_psql_df and used that entire role name as the prefix of my
function names, so I think it unlikely that I’ll collide with anything else.


Re: Remove source code display from \df+?

2023-01-22 Thread Isaac Morland
On Sun, 22 Jan 2023 at 15:04, Tom Lane  wrote:

> Isaac Morland  writes:
> > On Sun, 22 Jan 2023 at 14:26, Alvaro Herrera 
> > wrote:
> >> This one would fail the sanity check that all roles created by
> >> regression tests need to have names that start with "regress_".
>
> > Thanks for the correction. Now I feel like I've skipped some of the
> > readings!
> > Updated patch attached. Informally, I am adopting the regress_* policy
> for
> > all object types.
>
> That's excessive.  The policy Alvaro mentions applies to globally-visible
> object names (i.e., database, role, and tablespace names), and it's there
> to try to ensure that doing "make installcheck" against a live
> installation won't clobber any non-test-created objects.  There's no point
> in having such a policy within a test database --- its most likely effect
> there would be to increase the risk that different test scripts step on
> each others' toes.  If you feel a need for a name prefix for non-global
> objects, use something based on the name of your test script.
>

I already used a test-specific prefix, then added "regress_" in front.
Point taken, however, on the difference between global and non-global
objects.

But now I'm having a problem I don't understand: the CI are still failling,
but not in the psql test. Instead, I get this:

[20:11:17.624] +++ tap check in src/bin/pg_upgrade +++
[20:11:17.624] [20:09:11] t/001_basic.pl ... ok  106 ms ( 0.00 usr
 0.00 sys +  0.06 cusr  0.02 csys =  0.08 CPU)
[20:11:17.624]
[20:11:17.624] #   Failed test 'old and new dumps match after pg_upgrade'
[20:11:17.624] #   at t/002_pg_upgrade.pl line 362.
[20:11:17.624] #  got: '1'
[20:11:17.624] # expected: '0'
[20:11:17.624] # Looks like you failed 1 test of 13.
[20:11:17.624] [20:11:17] t/002_pg_upgrade.pl ..
[20:11:17.624] Dubious, test returned 1 (wstat 256, 0x100)
[20:11:17.624] Failed 1/13 subtests
[20:11:17.624] [20:11:17]
[20:11:17.624]
[20:11:17.624] Test Summary Report
[20:11:17.624] ---
[20:11:17.624] t/002_pg_upgrade.pl (Wstat: 256 Tests: 13 Failed: 1)
[20:11:17.624]   Failed test:  13
[20:11:17.624]   Non-zero exit status: 1
[20:11:17.624] Files=2, Tests=21, 126 wallclock secs ( 0.01 usr  0.00 sys +
 6.65 cusr  3.95 csys = 10.61 CPU)
[20:11:17.624] Result: FAIL
[20:11:17.624] make[2]: *** [Makefile:55: check] Error 1
[20:11:17.625] make[1]: *** [Makefile:43: check-pg_upgrade-recurse] Error 2

As far as I can tell this is the only failure and doesn’t have anything to
do with my change. Unless the objects I added are messing it up? Unlike
when the psql regression test was failing, I don’t see an indication of
where I can see the diffs.


Re: Remove source code display from \df+?

2023-01-22 Thread Isaac Morland
On Sun, 22 Jan 2023 at 14:26, Alvaro Herrera 
wrote:

> On 2023-Jan-22, Isaac Morland wrote:
>
> > I’ve re-written the tests to create a test-specific role and functions so
> > there is no longer a dependency on the superuser name.
>
> This one would fail the sanity check that all roles created by
> regression tests need to have names that start with "regress_".
>

Thanks for the correction. Now I feel like I've skipped some of the
readings!

Updated patch attached. Informally, I am adopting the regress_* policy for
all object types.

> I pondered the notion of going with the flow and just leaving out the
> > tests but that seemed like giving up too easily.
>
> I think avoiding even more untested code is good, so +1 for keeping at
> it.
>


0001-Remove-source-code-display-from-df-v5.patch
Description: Binary data


Re: Remove source code display from \df+?

2023-01-22 Thread Isaac Morland
On Sun, 22 Jan 2023 at 00:45, Justin Pryzby  wrote:

> On Sun, Jan 22, 2023 at 12:18:34AM -0500, Isaac Morland wrote:
>


> > It turns out that my tests wanted the owner to be “vagrant” rather than
> > “postgres”. This is apparently because I was running as that user (in a
> > Vagrant VM) when running the tests. Then I took that output and just made
> > it the expected output. I’ve re-worked my build environment a bit so
> that I
> > run as “postgres” inside the Vagrant VM.
> >
> > What I don’t understand is why that didn’t break all the other tests.
>
> Probably because the other tests avoid showing the owner, exactly
> because it varies depending on who runs the tests.  Most of the "plus"
> commands aren't tested, at least in the sql regression tests.
>

Thanks for your patience. I didn’t think about it enough but everything you
both said makes sense.

I’ve re-written the tests to create a test-specific role and functions so
there is no longer a dependency on the superuser name. I pondered the
notion of going with the flow and just leaving out the tests but that
seemed like giving up too easily.

We should probably change one of the CI usernames to something other
> than "postgres" to catch the case that someone hardcodes "postgres".
>
> > proper value in the Owner column so let’s see what CI does with it.
>
> Or better: see above about using it from your github account.


Yes, I will try to get this working before I try to make another
contribution.


0001-Remove-source-code-display-from-df-v4.patch
Description: Binary data


Re: Remove source code display from \df+?

2023-01-21 Thread Isaac Morland
On Thu, 19 Jan 2023 at 13:02, Isaac Morland  wrote:

> On Thu, 19 Jan 2023 at 11:30, Justin Pryzby  wrote:
>
>> On Wed, Jan 18, 2023 at 10:27:46AM -0500, Isaac Morland wrote:
>> >
>> > I thought I had: https://commitfest.postgresql.org/42/4133/
>>
>> This is failing tests:
>> http://cfbot.cputube.org/isaac-morland.html
>>
>> It seems like any "make check" would fail .. but did you also try
>> cirrusci from your own github account?
>> ./src/tools/ci/README
>>
>
> I definitely ran "make check" but I did not realize there is also
> cirrusci. I will look at that. I'm having trouble interpreting the cfbot
> page to which you linked but I'll try to run cirrusci myself before
> worrying too much about that.
>
> Running "make check" the first time I was surprised to see no failures -
> so I added tests for \df+, which passed when I did "make check".
>
>>
It turns out that my tests wanted the owner to be “vagrant” rather than
“postgres”. This is apparently because I was running as that user (in a
Vagrant VM) when running the tests. Then I took that output and just made
it the expected output. I’ve re-worked my build environment a bit so that I
run as “postgres” inside the Vagrant VM.

What I don’t understand is why that didn’t break all the other tests. I
would have expected “postgres” to show up all over the expected results and
be changed to “vagrant” by what I did; so I should have seen all sorts of
test failures in the existing tests. Anyway, my new tests now have the
proper value in the Owner column so let’s see what CI does with it.

BTW, I think "ELSE NULL" could be omitted.
>>
>
> Looks like; I'll update. Might as well reduce the visual size of the code.
>

I did this. I’m ambivalent about this because I usually think of CASE and
similar constructs as needing to explicitly cover all possible cases but
the language does provide for the NULL default case so may as well use the
feature where applicable.


0001-Remove-source-code-display-from-df-v3.patch
Description: Binary data


Re: Remove source code display from \df+?

2023-01-19 Thread Isaac Morland
On Thu, 19 Jan 2023 at 11:30, Justin Pryzby  wrote:

> On Wed, Jan 18, 2023 at 10:27:46AM -0500, Isaac Morland wrote:
> >
> > I thought I had: https://commitfest.postgresql.org/42/4133/
>
> This is failing tests:
> http://cfbot.cputube.org/isaac-morland.html
>
> It seems like any "make check" would fail .. but did you also try
> cirrusci from your own github account?
> ./src/tools/ci/README
>

I definitely ran "make check" but I did not realize there is also cirrusci.
I will look at that. I'm having trouble interpreting the cfbot page to
which you linked but I'll try to run cirrusci myself before worrying too
much about that.

Running "make check" the first time I was surprised to see no failures - so
I added tests for \df+, which passed when I did "make check".


> BTW, I think "ELSE NULL" could be omitted.
>

Looks like; I'll update. Might as well reduce the visual size of the code.


Re: Remove source code display from \df+?

2023-01-18 Thread Isaac Morland
On Wed, 18 Jan 2023 at 00:00, Pavel Stehule  wrote:

>
> On Tue, 17 Jan 2023 at 20:29, Isaac Morland 
> wrote:
>
>>
>> I welcome comments and feedback. Now to try to find something manageable
>> to review.
>>
>
> looks well
>
> you miss update psql documentation
>
> https://www.postgresql.org/docs/current/app-psql.html
>
> If the form \df+ is used, additional information about each function is
> shown, including volatility, parallel safety, owner, security
> classification, access privileges, language, source code and description.
>

Thanks, and sorry about that, it just completely slipped my mind. I’ve
attached a revised patch with a slight update of the psql documentation.

you should to assign your patch to commitfest app
>
> https://commitfest.postgresql.org/
>

I thought I had: https://commitfest.postgresql.org/42/4133/

Did I miss something?


0001-Remove-source-code-display-from-df-v2.patch
Description: Binary data


Re: Remove source code display from \df+?

2023-01-17 Thread Isaac Morland
On Thu, 12 Jan 2023 at 12:06, Isaac Morland  wrote:

Thanks everybody. So based on the latest discussion I will:
>
> 1) rename the column from “Source code” to “Internal name”; and
> 2) change the contents to NULL except when the language (identified by
> oid) is INTERNAL or C.
>
> Patch forthcoming, I hope.
>

I've attached a patch for this. It turns out to simplify the existing code
in one way because the recently added call to pg_get_function_sqlbody() is
no longer needed since it applies only to SQL functions, which will now
display as a blank column.

I implemented the change and was surprised to see that no tests failed.
Turns out that while there are several tests for \df, there were none for
\df+. I added a couple, just using \df+ on some functions that appear to me
to be present on plain vanilla Postgres.

I was initially concerned about translation support for the column heading,
but it turns out that \dT+ already has a column with the exact same name so
it appears we don’t need any new translations.

I welcome comments and feedback. Now to try to find something manageable to
review.


0001-Remove-source-code-display-from-df.patch
Description: Binary data


Re: Remove source code display from \df+?

2023-01-12 Thread Isaac Morland
On Thu, 12 Jan 2023 at 10:04, Magnus Hagander  wrote:

We could shorten it to "See \sf" or something like that.  But if we change
>>> the column header to "internal name" or the like, then the column just
>>> obviously doesn't apply for non-internal languages, so leaving it null
>>> should be fine.
>>>
>>
>> +1
>>
>>
> Sure, that works for me as well. I agree the suggested text was way too
> long, I was more thinking of "something in this direction" rather than that
> exact text. But yes, with a change of names, we can leave it NULL as well.
>

Thanks everybody. So based on the latest discussion I will:

1) rename the column from “Source code” to “Internal name”; and
2) change the contents to NULL except when the language (identified by oid)
is INTERNAL or C.

Patch forthcoming, I hope.


Re: Named Operators

2023-01-12 Thread Isaac Morland
On Thu, 12 Jan 2023 at 05:59, Gurjeet Singh  wrote:

> I'll consider using one of the other special characters. Do you have
> any suggestions?
>

What about backticks (`)? They are allowed as operator characters but do
not otherwise appear in the lexical syntax as far as I can tell:
https://www.postgresql.org/docs/current/sql-syntax-lexical.html
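
For illustration, a hypothetical sketch of what that might look like (this
syntax is not currently accepted, and both the operator name and its support
function are invented for the example):

CREATE OPERATOR `approx_eq` (
    LEFTARG = numeric,
    RIGHTARG = numeric,
    FUNCTION = numeric_approx_eq  -- hypothetical support function
);

SELECT 1.0 `approx_eq` 1.0000001;

The backtick pair would delimit an ordinary word as the operator name, the
way double quotes delimit identifiers today.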


Re: Remove source code display from \df+?

2023-01-11 Thread Isaac Morland
On Wed, 11 Jan 2023 at 13:11, Pavel Stehule  wrote:

please, don't send top post replies -
> https://en.wikipedia.org/wiki/Posting_style
>

Sorry about that; I do know to do it properly and usually get it right.
GMail doesn’t seem to have an option (that I can find) to leave no space at
the top and put my cursor at the bottom; it nudges pretty firmly in the
direction of top-posting. Thanks for the reminder.


> I don't think printing a few first rows is a good idea - usually there is
> nothing interesting (same is PL/Perl, PL/Python, ...)
>
> If the proposed feature can be generic, then this information should be
> stored somewhere in pg_language. Or we can redesign usage of prosrc and
> probin columns - but this can be a much more massive change.
>
I’m looking for a quick win. So I think that means either drop the source
column entirely, or show single-line source values only and nothing or a
placeholder for anything that is more than one line, unless somebody comes
up with another suggestion. Originally I was thinking just to remove
entirely, but I’ve seen a couple of comments suggesting that people would
find it unfortunate if the source weren’t shown for internal/C functions,
so now I’m leaning towards showing single-line values only.

I agree that showing the first line or couple of lines isn't likely to be
very useful. The way I format my functions, the first line is always blank
anyway: I write the bodies like this:

$$
BEGIN
…
END;
$$;

Even if somebody uses a different style, the first line is probably just
"BEGIN" or something equally formulaic.


Re: Remove source code display from \df+?

2023-01-11 Thread Isaac Morland
Right, for internal or C functions it just gives a symbol name or something
similar. I've never been annoyed seeing that, just having pages of PL/pgSQL
(I use a lot of that, possibly biased towards the “too much” direction)
take up all the space.

A bit hacky, but what about only showing the first line of the source code?
Then you would see link names for that type of function but the main
benefit of suppressing the full source code would be obtained. Or, show
source if it is a single line, otherwise “…” (as in, literally an ellipsis).

On Wed, 11 Jan 2023 at 12:31, Pavel Stehule  wrote:

>
>
> On Wed, 11 Jan 2023 at 18:25, Magnus Hagander wrote:
>
>>
>>
>> On Wed, Jan 11, 2023 at 6:19 PM Pavel Stehule 
>> wrote:
>>
>>>
>>>
>>> On Wed, 11 Jan 2023 at 17:50, Isaac Morland wrote:
>>>
>>>> I find \df+ much less useful than it should be because it tends to be
>>>> cluttered up with source code. Now that we have \sf, would it be reasonable
>>>> to remove the source code from the \df+ display? This would make it easier
>>>> to see function permissions and comments. If somebody wants to see the full
>>>> definition of a function they can always invoke \sf on it.
>>>>
>>>> If there is consensus on the idea in principle I will write up a patch.
>>>>
>>>
>>> +1
>>>
>>>
>> +1 but maybe with a twist. For any functions in a procedural language
>> like plpgsql, it makes it pretty useless today. But when viewing an
>> internal or C language function, it's short enough and still actually
>> useful. Maybe some combination where it would keep showing those for such
>> language, but would show "use \sf to view source" for procedural languages?
>>
>
> yes, it is almost necessary for C functions or functions in external
> languages. Maybe it can be specified in pg_language if prosrc is really
> source code or some reference.
>
>> --
>>  Magnus Hagander
>>  Me: https://www.hagander.net/
>>  Work: https://www.redpill-linpro.com/
>>
>


Remove source code display from \df+?

2023-01-11 Thread Isaac Morland
I find \df+ much less useful than it should be because it tends to be
cluttered up with source code. Now that we have \sf, would it be reasonable
to remove the source code from the \df+ display? This would make it easier
to see function permissions and comments. If somebody wants to see the full
definition of a function they can always invoke \sf on it.

If there is consensus on the idea in principle I will write up a patch.


Re: MERGE ... RETURNING

2023-01-08 Thread Isaac Morland
On Sun, 8 Jan 2023 at 07:28, Dean Rasheed  wrote:

> So playing around with it (and inspired by the WITH ORDINALITY syntax
> for SRFs), I had the idea of allowing "WITH WHEN CLAUSE" at the end of
> the returning list, which adds an integer column to the list, whose
> value is set to the index of the when clause executed, as in the
> attached very rough patch.
>

Would it be useful to have just the action? Perhaps "WITH ACTION"? My idea
is that this would return an enum of INSERT, UPDATE, DELETE (so is "action"
the right word?). It seems to me in many situations I would be more likely
to care about which of these 3 happened rather than the exact clause that
applied. This isn't necessarily meant to be instead of your suggestion
because I can imagine wanting to know the exact clause, just an alternative
that might suffice in many situations. Using it would also avoid problems
arising from editing the query in a way which changes the numbers of the
clauses.
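
To illustrate, my idea would look something like this (hypothetical syntax,
not implemented; I reuse the table names from your example below and assume
the added column would default to the name "action"):

WITH t AS (
    MERGE INTO sq_target t USING v ON tid = sid
        WHEN MATCHED AND tid > 2 THEN UPDATE SET balance = t.balance + delta
        WHEN NOT MATCHED THEN INSERT (balance, tid) VALUES (balance + delta, sid)
        WHEN MATCHED AND tid < 2 THEN DELETE
        RETURNING t.* WITH ACTION
)
SELECT * FROM t WHERE action IN ('INSERT', 'UPDATE');

Such a query would keep working unchanged even if WHEN clauses were later
added, removed, or reordered.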

> So, quoting an example from the tests, this allows things like:
>
> WITH t AS (
>   MERGE INTO sq_target t USING v ON tid = sid
> WHEN MATCHED AND tid > 2 THEN UPDATE SET balance = t.balance + delta
> WHEN NOT MATCHED THEN INSERT (balance, tid) VALUES (balance + delta,
> sid)
> WHEN MATCHED AND tid < 2 THEN DELETE
> RETURNING t.* WITH WHEN CLAUSE
> )
> SELECT CASE when_clause
>  WHEN 1 THEN 'UPDATE'
>  WHEN 2 THEN 'INSERT'
>  WHEN 3 THEN 'DELETE'
>END, *
> FROM t;
>
>   case  | tid | balance | when_clause
> --------+-----+---------+-------------
>  INSERT |  -1 |     -11 |           2
>  DELETE |   1 |     100 |           3
> (2 rows)
>
> 1 row is returned for each merge action executed (other than DO
> NOTHING actions), and as usual, the values represent old target values
> for DELETE actions, and new target values for INSERT/UPDATE actions.
>

Would it be feasible to allow specifying old.column or new.column? These
would always be NULL for INSERT and DELETE respectively but more useful
with UPDATE. Actually I've been meaning to ask this question about UPDATE …
RETURNING.

Actually, even with DELETE/INSERT, I can imagine wanting, for example, to
get only the new values associated with INSERT or UPDATE and not the values
removed by a DELETE. So I can imagine specifying new.column to get NULLs
for any row that was deleted but still get the new values for other rows.

> It's also possible to return the source values, and a bare "*" in the
> returning list expands to all the source columns, followed by all the
> target columns.
>

Does this lead to a problem in the event there are same-named columns
between source and target?

> The name of the added column, if included, can be changed by
> specifying "WITH WHEN CLAUSE [AS] col_alias". I chose the syntax "WHEN
> CLAUSE" and "when_clause" as the default column name because those
> match the existing terminology used in the docs.
>


Re: Add SHELL_EXIT_CODE to psql

2022-12-31 Thread Isaac Morland
On Sat, 31 Dec 2022 at 16:47, Corey Huinker  wrote:

>
>> I wonder if there is value in setting up a psql on/off var
>> SHELL_ERROR_OUTPUT construct that when set to "off/false"
>> suppresses standard error via appending "2> /dev/null" (or "2> nul" if
>> #ifdef WIN32). At the very least, it would allow for tests like this to be
>> done with standard regression scripts.
>>
>
> Thinking on this some more a few ideas came up:
>
> 1. The SHELL_ERROR_OUTPUT var above. This is good for simple situations,
> but it would fail if the user took it upon themselves to redirect output,
> and suddenly "myprog 2> /dev/null" works fine on its own but will fail when
> we append our own "2> /dev/null" to it.
>

Rather than attempting to append redirection directives to the command,
would it work to redirect stderr before invoking the shell? This seems to
me to be more reliable and it should allow an explicit redirection in the
shell command to still work. The difference between Windows and Unix then
becomes the details of what system calls we use to accomplish the
redirection (or maybe none, if an existing abstraction layer takes care of
that - I'm not really up on Postgres internals enough to know), rather than
what we append to the provided command.


Re: split TOAST support out of postgres.h

2022-12-28 Thread Isaac Morland
On Wed, 28 Dec 2022 at 08:07, Peter Eisentraut <
peter.eisentr...@enterprisedb.com> wrote:

> Most backend code doesn't actually need the variable-length data types
> support (TOAST support) in postgres.h.  So I figured we could try to put
> it into a separate header file.  That makes postgres.h more manageable,
> and it avoids including a bunch of complicated unused stuff everywhere.
> I picked "varatt.h" as the name.  Then we could either
>
[…]

> I went with the last option in my patch.
>
> Thoughts?


This is a bit of a bikeshed suggestion, but I'm wondering if you considered
calling it toast.h? Only because the word is so distinctive within
Postgres; everybody knows exactly to what it refers.

I definitely agree with the principle of organizing and splitting up the
header files. Personally, I don't mind importing a bunch of headers if I'm
using a bunch of subsystems so I would be OK with needing to import this
new header if I need it.


Re: [PATCH] Support using "all" for the db user in pg_ident.conf

2022-12-27 Thread Isaac Morland
On Tue, 27 Dec 2022 at 10:54, Jelte Fennema 
wrote:

> This change makes it much easier to have a certain database
> administrator peer or cert authentication, that allows connecting as
> any user. Without this change you would need to add a line to
> pg_ident.conf for every user that is in the database.
>
> In some small sense this is a breaking change if anyone is using "all"
> as a user currently and has pg_ident.conf rules for it. This seems
> unlikely, since "all" was already handled specially in pg_hb.conf.
> Also it can easily be worked around by quoting the all token in
> pg_ident.conf. As long as this is called out in the release notes
> it seems okay to me. However, if others disagree there would
> be the option of changing the token to "pg_all". Since any
> pg_ prefixed users are reserved by postgres there can be no user.
> For now I used "all" though to stay consistent with pg_hba.conf.


+1 from me. I recently was setting up a Vagrant VM for testing and wanted
to allow the OS user which runs the application to connect to the database
as whatever user it wants and was surprised to find I had to list all the
potential target DB users in the pg_ident.conf (in production it uses
password authentication and each server gets just the passwords it needs
stored in ~/.pgpass). I like the idea that both config files would be
consistent, although the use of keywords such as "replication" in the DB
column has always made me a bit uncomfortable.
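
For concreteness, this is the kind of entry the proposal would enable (the
map and OS user names here are just examples from my Vagrant setup):

# pg_ident.conf
# MAPNAME       SYSTEM-USERNAME         PG-USERNAME
appmap          appuser                 all

# pg_hba.conf
# TYPE  DATABASE  USER  METHOD
local   all       all   peer map=appmap

Today the first file needs one line per target database user instead of the
single "all" line.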

Related question: is there a reason why pg_ident.conf can't/shouldn't be
replaced by a system table? As far as I can tell, it's just a 3-column
table, essentially, with all columns in the primary key. This latest
proposal changes that a little; strictly, it should probably introduce a
second table with just two columns identifying which OS users can connect
as any user, but existing system table style seems to suggest that we would
just use a special value in the DB user column for "all".


Re: GROUP BY ALL

2022-12-19 Thread Isaac Morland
On Sun, 18 Dec 2022 at 23:30, Tom Lane  wrote:

> Andrey Borodin  writes:
> > I saw a thread in a social network[0] about GROUP BY ALL. The idea seems
> useful.
>
> Isn't that just a nonstandard spelling of SELECT DISTINCT?
>

In a pure relational system, yes; but since Postgres allows duplicate rows,
both in actual table data and in intermediate and final result sets, no.
Although I'm pretty sure no aggregates other than count() are useful - any
other aggregate would always just combine count() copies of the duplicated
value in some way.

> What would happen if there are aggregate functions in the tlist?
> I'm not especially on board with "ALL" meaning "ALL (oh, but not
> aggregates)".
>

The requested behaviour can be accomplished by an invocation something like:

select (t).*, count(*) from (select (…field1, field2, …) as t from
…tables…) s group by t;

So we collect all the required fields as a tuple, group by the tuple, and
then unpack it into separate columns in the outer query.
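
A concrete sketch of the same idea using a named composite type (table and
type names invented for the example; a named type is safer here, since
expanding an anonymous ROW(...) value with (t).* can fail):

CREATE TABLE orders (customer_id int, region text, amount numeric);
CREATE TYPE grp AS (customer_id int, region text);

SELECT (t).*, count(*), sum(amount)
FROM (SELECT ROW(customer_id, region)::grp AS t, amount FROM orders) s
GROUP BY t;

The inner query packs the would-be GROUP BY ALL columns into t, and the
outer list unpacks them alongside the aggregates.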


Re: allow granting CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX

2022-12-14 Thread Isaac Morland
On Wed, 14 Dec 2022 at 15:57, Jeff Davis  wrote:

> On Wed, 2022-12-14 at 15:32 -0500, Isaac Morland wrote:
>
> > Is there a firm decision on the issue of changing the cluster index
> > of a table? Re-clustering a table on the same index is clearly
> > something that should be granted by MAINTAIN as I imagine it, but
> > changing the cluster index, strictly speaking, changes the schema and
> > could be considered outside of the scope of what should be allowed.
> > On the other hand, I can see simplicity in having CLUSTER check the
> > same permissions whether or not the cluster index is being updated.
>
> In both the case of CLUSTER and REFRESH, I don't have a strong enough
> opinion to break them out into separate privileges. There's some
> argument that can be made; but at the same time it's hard for me to
> imagine someone really making use of the privileges separately.
>

Thanks, that makes a lot of sense. I wanted to make sure the question was
considered. I'm very pleased this is happening and appreciate all the work
you're doing. I have a few places where I want to be able to grant MAINTAIN
so I'll be using this as soon as it's available on our production database.


Re: allow granting CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX

2022-12-14 Thread Isaac Morland
On Wed, 14 Dec 2022 at 14:47, Jeff Davis  wrote:

> Furthermore, MAINTAIN privileges on the partitioned table do not grant
> the ability to create new partitions. There's a comment in tablecmds.c
> alluding to a possible "UNDER" privilege:
>
>   /*
>* We should have an UNDER permission flag for this, but for now,
>* demand that creator of a child table own the parent.
>*/
>
> Perhaps there's something we want to do there, but it's a different use
> case than the MAINTAIN privilege, so I don't see a reason it should be
> grouped. Also, there's a bit of weirdness to think about in cases where
> another user creates (and owns) a partition of your table (currently
> this is only possible if the other user is a superuser).
>

I strongly agree. MAINTAIN is for actions that leave the schema the same.
Conceptually, running MAINTAIN shouldn't affect the result of pg_dump. That
may not be strictly true, but adding a table is definitely not something
that MAINTAIN should allow.

Is there a firm decision on the issue of changing the cluster index of a
table? Re-clustering a table on the same index is clearly something that
should be granted by MAINTAIN as I imagine it, but changing the cluster
index, strictly speaking, changes the schema and could be considered
outside of the scope of what should be allowed. On the other hand, I can
see simplicity in having CLUSTER check the same permissions whether or not
the cluster index is being updated.


Re: Ordering behavior for aggregates

2022-12-13 Thread Isaac Morland
On Tue, 13 Dec 2022 at 07:50, Vik Fearing  wrote:

> I am proposing something like pg_aggregate.aggordering which would be an
> enum of behaviors such as f=Forbidden, a=Allowed, r=Required.  Currently
> all aggregates would have 'a' but I am thinking that a lot of them could
> be switched to 'f'.  In that case, if a user supplies an ordering, an
> error is raised.
>

Although I find "r" attractive, I have two concerns about it:

1) Do we really want to require ordering? I know it's weird and partially
undefined to call something like string_agg without an ordering, but what
if in the specific application it doesn’t matter in what order the items
appear?

2) There is a backward compatibility issue here; it’s not clear to me we
could apply "r" to any existing aggregate.

Actually upon consideration, I think I have similar concerns about "f". We
don’t usually forbid "dumb" things; e.g., I can write a function which
ignores its inputs. And in some situations, "dumb" things make sense. For
example, if I’m specifying a function to use as a filter, it could be
reasonable in a particular instance to provide a function which ignores one
or more of its inputs.
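
For reference, the per-call ordering syntax under discussion, with invented
table names; a call like the second one is the sort of thing an "f" marking
would start rejecting:

SELECT string_agg(name, ', ' ORDER BY name) FROM tags;    -- ordering matters
SELECT sum(amount ORDER BY amount) FROM payments;         -- ordering is pointless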


Re: add \dpS to psql

2022-12-07 Thread Isaac Morland
On Thu, 8 Dec 2022 at 00:07, Nathan Bossart 
wrote:

> On Wed, Dec 07, 2022 at 11:48:20PM -0500, Isaac Morland wrote:
> > For what it's worth, I wouldn't bother changing the format of the
> > permission bits to expand the pool of available bits.
>
> 7b37823 expanded AclMode to 64 bits, so we now have room for 16 additional
> privileges (after the addition of VACUUM and ANALYZE in b5d6382).
>
> > My previous analysis
> > shows that there is no vast hidden demand for new privilege bits. If we
> > implement MAINTAIN to control access to VACUUM, ANALYZE, REFRESH,
> CLUSTER,
> > and REINDEX, we will cover everything that I can find that has seriously
> > discussed on this list, and still leave 3 unused bits for future
> expansion.
>
> If we added CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX as individual
> privilege bits, we'd still have 13 remaining for future use.
>

I was a bit imprecise. I was comparing to the state before the recent
changes - so 12 bits used out of 16, with MAINTAIN being the 13th bit. I
think in my mind it's still approximately 2019 on some level.


Re: add \dpS to psql

2022-12-07 Thread Isaac Morland
On Wed, 7 Dec 2022 at 23:25, Tom Lane  wrote:

> Nathan Bossart  writes:
> > I haven't formed an opinion on whether VACUUM FULL should get its own
> bit,
> > but FWIW І just finished writing the first draft of a patch set to add
> bits
> > for CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX.  I plan to post that
> > tomorrow.
>
> The fact that we just doubled the number of available bits doesn't
> mean we should immediately strive to use them up.  Perhaps it'd
> be better to subsume these retail privileges under some generic
> "maintenance action" privilege?
>

That was my original suggestion:

https://www.postgresql.org/message-id/CAMsGm5c4DycKBYZCypfV02s-SC8GwF%2BKeTt%3D%3DvbWrFn%2Bdz%3DKeg%40mail.gmail.com

In that message I review the history of permission bit growth. A bit later
in the discussion, I did some investigation into the history of demand for
new permission bits and I proposed calling the new privilege MAINTAIN:

https://www.postgresql.org/message-id/CAMsGm5d%3D2gi4kyKONUJyYFwen%3DbsWm4hz_KxLXkEhMmg5WSWTA%40mail.gmail.com

For what it's worth, I wouldn't bother changing the format of the
permission bits to expand the pool of available bits. My previous analysis
shows that there is no vast hidden demand for new privilege bits. If we
implement MAINTAIN to control access to VACUUM, ANALYZE, REFRESH, CLUSTER,
and REINDEX, we will cover everything that I can find that has seriously
discussed on this list, and still leave 3 unused bits for future expansion.
There is even justification for stopping after this expansion: if it is
done, then schema changes (DDL) will only be able to be done by owner; data
changes (insert, update, delete, as well as triggering of automatic data
maintenance actions) will be able to be done by anybody who is granted
permission.

My guess is that if we ever do expand the privilege bit system, it should
be in a way that removes the limit entirely, replacing a bit map model with
something more like a table with one row for each individual grant, with a
field indicating which grant is involved. But that is a hypothetical future.


Re: ANY_VALUE aggregate

2022-12-05 Thread Isaac Morland
On Mon, 5 Dec 2022 at 22:52, Vik Fearing  wrote:

> On 12/5/22 20:31, Corey Huinker wrote:
> >
> > Adding to the pile of wanted aggregates: in the past I've lobbied for
> > only_value() which is like first_value() but it raises an error on
> > encountering a second value.
>
> I have had use for this in the past, but I can't remember why.  What is
> your use case for it?  I will happily write a patch for it, and also
> submit it to the SQL Committee for inclusion in the standard.  I need to
> justify why it's a good idea, though, and we would need to consider what
> to do with nulls now that there is .
>

I have this in my local library of "stuff that I really wish came with
Postgres", although I call it same_agg and it just goes to NULL if there
are more than one distinct value.

I sometimes use it when normalizing non-normalized data, but more commonly
I use it when the query planner isn't capable of figuring out that a column
I want to use in the output depends only on the grouping columns. For
example, something like:

SELECT group_id, group_name, count(*) from group_group as gg natural join
group_member as gm group by group_id

I think that exact example actually does or is supposed to work now, since
it realizes that I'm grouping on the primary key of group_group so the
group_name field in the same table can't differ between rows of a group,
but most of the time when I expect that feature to allow me to use a field
it actually doesn't.

I have a vague notion that part of the issue may be the distinction between
gg.group_id, gm.group_id, and group_id; maybe the above doesn't work but it
does work if I group by gg.group_id instead of by group_id. But obviously
there should be no difference because in this query those 3 values cannot
differ (outer joins are another story).

For reference, here is my definition:

CREATE OR REPLACE FUNCTION same_sfunc (
    a anyelement,
    b anyelement
) RETURNS anyelement
LANGUAGE SQL IMMUTABLE STRICT
SET search_path FROM CURRENT
AS $$
    SELECT CASE WHEN $1 = $2 THEN $1 ELSE NULL END
$$;
COMMENT ON FUNCTION same_sfunc (anyelement, anyelement) IS 'SFUNC for
same_agg aggregate; returns common value of parameters, or NULL if they
differ';

DROP AGGREGATE IF EXISTS same_agg (anyelement);
CREATE AGGREGATE same_agg (anyelement) (
    SFUNC = same_sfunc,
    STYPE = anyelement
);
COMMENT ON AGGREGATE same_agg (anyelement) IS 'Return the common non-NULL
value of all non-NULL aggregated values, or NULL if some values differ';

You can tell I've had this for a while - there are several newer Postgres
features that could be used to clean this up noticeably.
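
For instance, with SQL-standard function bodies (PostgreSQL 14+) and CREATE
OR REPLACE AGGREGATE (12+) it might look like this (a sketch; behavior is
meant to be unchanged):

CREATE OR REPLACE FUNCTION same_sfunc (a anyelement, b anyelement)
RETURNS anyelement
LANGUAGE SQL IMMUTABLE STRICT PARALLEL SAFE
RETURN CASE WHEN a = b THEN a END;

CREATE OR REPLACE AGGREGATE same_agg (anyelement) (
    SFUNC = same_sfunc,
    STYPE = anyelement
);

With the new-style body the function is parsed at definition time, so the
SET search_path FROM CURRENT workaround is no longer needed.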

I also have a repeat_agg which returns the last value (not so interesting)
but which is sometimes useful as a window function (more interesting:
replace NULLs with the previous non-NULL value in the column).
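
A sketch of that window use (repeat_agg defined analogously; the transition
function must not be STRICT, so a NULL input keeps the previous state; the
RETURN-style body assumes PostgreSQL 14+, and the table and column names are
invented):

CREATE FUNCTION repeat_sfunc (a anyelement, b anyelement)
RETURNS anyelement
LANGUAGE SQL IMMUTABLE
RETURN coalesce(b, a);

CREATE AGGREGATE repeat_agg (anyelement) (
    SFUNC = repeat_sfunc,
    STYPE = anyelement
);

-- Fill NULL readings with the most recent non-NULL value:
SELECT ts, repeat_agg(reading) OVER (ORDER BY ts) AS reading_filled
FROM sensor_data;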


Re: pgsql: Revoke PUBLIC CREATE from public schema, now owned by pg_databas

2022-11-30 Thread Isaac Morland
On Wed, 30 Nov 2022 at 17:35, Tom Lane  wrote:

> BTW, is "create a schema with the same name" sufficient detail?
> You have to either make it owned by that user, or explicitly
> grant CREATE permission on it.  I'm not sure if that detail
> belongs here, but it feels like maybe it does.


It might be worth mentioning AUTHORIZATION. The easiest way to create an
appropriately named schema for a user is "CREATE SCHEMA AUTHORIZATION
username".


Understanding WAL - large amount of activity from removing data

2022-11-20 Thread Isaac Morland
I'm encountering some surprising (to me) behaviour related to WAL, and I'm
wondering if anybody can point me at an article that might help me
understand what is happening, or give a brief explanation.

I'm trying to make a slimmed down version of my database for testing
purposes. As part of this, I'm running a query something like this:

UPDATE table1
SET pdfcolumn = 'redacted'
WHERE pdfcolumn IS NOT NULL;

(literally 'redacted', not redacted here for your benefit)

The idea is to replace the actual contents of the column, which are PDF
documents totalling 70GB, with just a short placeholder value, without
affecting the other columns, which are a more ordinary collection - a few
integers and short strings.

The end result will be a database which is way easier to copy around but
which still has all the records of the original; the only change is that an
attempt to access one of the PDFs will not return the actual PDF but rather
a garbage value. For most testing this will make little to no difference.

What I'm finding is that the UPDATE is taking over an hour for 5000
records, and tons of WAL is being generated, several files per minute.
Selecting the non-PDF columns from the entire table takes a few
milliseconds, and the only thing I'm doing with the records is updating
them to much smaller values. Why so much activity just to remove data? The
new rows are tiny.


Re: cataloguing NOT NULL constraints

2022-09-20 Thread Isaac Morland
On Tue, 20 Sept 2022 at 06:56, Alvaro Herrera 
wrote:

> The NULL checks would still be mostly done by the attnotnull checks
> internally, so there shouldn't be too much of a difference.
>
> .. though I'm now wondering if there's additional overhead from checking
> the constraint twice on each row: first the attnotnull bit, then the
> CHECK itself.  Hmm.  That's probably quite bad.
>

Another reason to treat NOT NULL-implementing constraints differently.

My thinking is that pg_constraint entries for NOT NULL columns are mostly
an implementation detail. I've certainly never cared whether I had an
actual constraint corresponding to my NOT NULL columns. So I think marking
them as such, or a different contype, and excluding them from \d+ display,
probably makes sense. Just need to deal with the issue of trying to create
a constraint and having its name conflict with a NOT NULL constraint. Could
it work to reserve [field name]_notnull for NOT NULL-implementing
constraints? I'd be worried about what happens with field renames; renaming
the constraint automatically seems a bit weird, but maybe…


Re: cataloguing NOT NULL constraints

2022-09-19 Thread Isaac Morland
On Mon, 19 Sept 2022 at 09:32, Robert Haas  wrote:

> On Wed, Aug 17, 2022 at 2:12 PM Alvaro Herrera 
> wrote:
> > If you say CREATE TABLE (a int NOT NULL), you'll get a CHECK constraint
> > printed by psql: (this is a bit more noisy that previously and it
> > changes a lot of regression tests output).
> >
> > 55489 16devel 1776237=# create table tab (a int not null);
> > CREATE TABLE
> > 55489 16devel 1776237=# \d tab
> > Tabla «public.tab»
> >  Columna │  Tipo   │ Ordenamiento │ Nulable  │ Por omisión
> > ─┼─┼──┼──┼─
> >  a   │ integer │  │ not null │
> > Restricciones CHECK:
> > "tab_a_not_null" CHECK (a IS NOT NULL)
>
> In a table with many columns, most of which are NOT NULL, this is
> going to produce a ton of clutter. I don't like that.
>
> I'm not sure what a good alternative would be, though.
>

I thought I saw some discussion about the SQL standard saying that there is
a difference between putting NOT NULL in a column definition, and CHECK
(column_name NOT NULL). So if we're going to take this seriously, I think
that means there needs to be a field in pg_constraint which identifies
whether a constraint is a "real" one created explicitly as a constraint, or
if it is just one created because a field is marked NOT NULL.

If this is correct, the answer is easy: don't show constraints that are
there only because of a NOT NULL in the \d or \d+ listings. I certainly
don't want to see that clutter and I'm having trouble seeing why anybody
else would want to see it either; the information is already there in the
"Nullable" column of the field listing.

The error message for a duplicate constraint name when creating a
constraint needs however to be very clear that the conflict is with a NOT
NULL constraint and which one, since I'm proposing leaving those ones off
the visible listing, and it would be very bad for somebody to get
"duplicate name" and then be unable to see the conflicting entry.


Re: why can't a table be part of the same publication as its schema

2022-09-10 Thread Isaac Morland
On Sat, 10 Sept 2022 at 19:18, Robert Haas  wrote:

> If I encountered this syntax in a vacuum, that's not what I would
> think. I would think that ADD ALL TABLES IN SCHEMA meant add all the
> tables in the schema to the publication one by one as individual
> objects, i.e. add the tables that are currently as of this moment in
> that schema to the publication; and I would think that ADD SCHEMA
> meant remember that this schema is part of the publication and so
> whenever tables are created and dropped in that schema (or moved in
> and out) what is being published is automatically updated.
>
> The analogy here seems to be to GRANT, which actually does support
> both syntaxes. And if I understand correctly, GRANT ON SCHEMA gives
> privileges on the schema; whereas GRANT ON ALL TABLES IN SCHEMA
> modifies each table that is currently in that schema (never mind what
> happens later).
>

Yes, except GRANT ON SCHEMA only grants access to the schema - CREATE or
USAGE. You cannot write GRANT SELECT ON SCHEMA to grant access to all
tables in the schema.


Re: Can we avoid chdir'ing in resolve_symlinks() ?

2022-09-01 Thread Isaac Morland
On Thu, 1 Sept 2022 at 19:39, Tom Lane  wrote:

> This code was mine originally (336969e49), but I sure don't
> remember why I wrote it like that.  I know we didn't have a
> robust version of canonicalize_path() then, and that may have
> been the main issue, but that offhand comment about mount
> points bothers me.  But I can't reconstruct precisely what
> I was worried about there.  The only contemporaneous discussion
> thread I can find is [2], which doesn't go into coding details.
>

Does this happen in a context where we need to worry about the directory
structure changing under us, either accidentally or maliciously?

I'm wondering because I understand cd'ing through the structure can avoid
some of the related problems and might be the reason for doing it that way
originally. My impression is that the modern equivalent would be to use
openat() with O_PATH to step through the hierarchy. But then I'm not clear
on how to get back to the absolute path, given a file descriptor for the
final directory.


Re: identifying the backend that owns a temporary schema

2022-08-23 Thread Isaac Morland
On Tue, 23 Aug 2022 at 05:29, Greg Stark  wrote:

> Having this function would be great (I admit I never responded because
> I never figured out if your suggestion was right or not:). But should
> it also be added to the pg_stat_activity view? Perhaps even just in
> the SQL view using the function.
>
> Alternately should pg_stat_activity show the actual temp schema name
> instead of the id? I don't recall if it's visible outside the backend
> but if it is, could pg_stat_activity show whether the temp schema is
> actually attached or not?
>

Would it work to cast the schema oid to type regnamespace? Then the actual
data (numeric oid) would be present in the view, but it would display as
text.
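
For example (a sketch; 2200 is the oid of the public schema in a standard
install):

```sql
-- The column keeps its numeric oid value but displays as the schema name.
SELECT 2200::oid::regnamespace;   -- displays as "public"

-- The cast works in the other direction too:
SELECT 'public'::regnamespace::oid;
```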


Re: System catalog documentation chapter

2022-07-20 Thread Isaac Morland
On Wed, 20 Jul 2022 at 16:08, Bruce Momjian  wrote:

> On Tue, Jul 19, 2022 at 01:41:44PM -0400, Bruce Momjian wrote:
> > I am going to look at moving system views that make sense into the
> > chapters where their contents are mentioned.  I don't think having a
> > central list of views is really helping us because we expect the views
> > to be used in ways the system catalogs would not be.
>
> I have grouped the views by topic.  What I would like to do next is to
> move these view sections to the end of relevant documentation chapters.
> Is that going to be an improvement?


Will there be a comprehensive list somewhere? Even if it just lists the
views, gives maybe a one-sentence description, and links to the relevant
chapter, I would find that helpful sometimes.

I ask because I occasionally find myself wanting a comprehensive list of
functions, and as far as I can tell it doesn't exist. I'm hoping to avoid
that situation for views.


Re: Bug: Reading from single byte character column type may cause out of bounds memory reads.

2022-07-13 Thread Isaac Morland
On Wed, 13 Jul 2022 at 09:15, Aleksander Alekseev 
wrote:

I can confirm the bug exists in the `master` branch as well and
> doesn't depend on the platform.
>
> Although the bug is easy to fix for this particular case (see the
> patch) I'm not sure if this solution is general enough. E.g. is there
> something that generally prevents pg_mblen() from doing out of bound
> reading in cases similar to this one? Should we prevent such an INSERT
> from happening instead?
>

Not just INSERTs, I would think: the implicit cast is already invalid,
since the "char" type can only hold characters that can be represented in 1
byte. A comparable example in the numeric types might be:

odyssey=> select (2.0 ^ 80)::double precision::integer;
ERROR:  integer out of range

By comparison:

odyssey=> select '🀆'::"char";
 char
──

(1 row)

I think this should give an error, perhaps 'ERROR: "char" out of range'.

Incidentally, if I apply ascii() to the result, I sometimes get 0 and
sometimes 90112, neither of which should be a possible value for ascii()
of a "char" value, and neither of which is 126982, the actual value of that
character.

odyssey=> select ascii ('🀆'::"char");
 ascii
───
 90112
(1 row)

odyssey=> select ascii ('🀆'::"char");
 ascii
───
 0
(1 row)

odyssey=> select ascii ('🀆');
 ascii
────────
 126982
(1 row)


Re: pg_checkpointer is not a verb or verb phrase

2022-06-30 Thread Isaac Morland
On Thu, 30 Jun 2022 at 21:22, Michael Paquier  wrote:

> On Thu, Jun 30, 2022 at 08:57:04AM -0400, Isaac Morland wrote:
> > I was going to point out that pg_database_owner is the same way, but it
> is
> > fundamentally different in that it has no special allowed access and is
> > meant to be the target of permission grants rather than being granted to
> > other roles.
> >
> > +1 to rename it to pg_checkpoint or to some similar name.
>
> We are still in beta, so, FWIW, I am fine to adjust this name even if
> it means an extra catversion bump.
>
> "checkpoint" is not a verb (right?), so would something like
> "pg_perform_checkpoint" rather than "pg_checkpoint" fit better in the
> larger picture?
>

I would argue it’s OK. In the Postgres context, I can imagine someone
saying they’re going to checkpoint the database, and the actual command is
just CHECKPOINT. Changing from checkpointer to checkpoint means that we’re
describing the action rather than what a role member is.

If we are going to put a more standard verb in there, I would use execute
rather than perform, because that is what the documentation says members of
this role can do — “Allow executing the CHECKPOINT command”. Zooming out a
little, I think we normally talk about executing commands rather than
performing them, so this is consistent with those other uses; otherwise we
should reconsider what the documentation itself says to match
other commands that we talk about running.

OK, I think I’ve bikeshedded enough. I’m just happy to have all these new
roles to avoid handing out full superuser access routinely.


Re: pg_checkpointer is not a verb or verb phrase

2022-06-30 Thread Isaac Morland
On Thu, 30 Jun 2022 at 08:48, Robert Haas  wrote:

Almost all of these are verbs or verb phrases: having this role gives
> you the ability to read all data, or write all data, or read all
> settings, or whatever. But you can't say that having this role gives
> you the ability to checkpointer. It gives you the ability to
> checkpoint, or to perform a checkpoint.
>
> So I think the name is inconsistent with our usual convention, and we
> ought to fix it.
>

I was going to point out that pg_database_owner is the same way, but it is
fundamentally different in that it has no special allowed access and is
meant to be the target of permission grants rather than being granted to
other roles.

+1 to rename it to pg_checkpoint or to some similar name.


Re: Making the subquery alias optional in the FROM clause

2022-06-28 Thread Isaac Morland
On Tue, 28 Jun 2022 at 00:32, Julien Rouhaud  wrote:

> > As to forcing SQL-compliant queries, that ship sailed a long time ago:
> > Postgres allows but does not enforce the use of SQL-compliant queries,
> and
> > many of its important features are extensions anyway, so forcing SQL
> > compliant queries is out of the question (although I could see the
> utility
> > of a mode where it warns or errors on non-compliant queries, at least in
> > principle).
>
> Sure, but it doesn't mean that we should support even more non-compliant
> syntax
> without any restraint.  In this case, I don't see much benefit as it's not
> solving performance problem or something like that.
>

It's improving developer performance by eliminating the need to make up
utterly useless names. I don't care if behind the scenes names are
assigned, although it would be even better if the names didn't exist at
all. I just want the computer to do stuff for me that requires absolutely
no human judgement whatsoever.

> As to bad habits, I'm having trouble understanding. Why do you think
> > leaving the alias off a subquery is a bad habit (assuming it were
> allowed)?
>
I think it's a bad habit because as far as I can see it's not supported on
> mysql or sqlserver.
>

So it’s a bad habit to use features of Postgres that aren’t available on
MySQL or SQL Server?

For myself, I don’t care one bit about whether my code will run on those
systems, or Oracle: as far as I’m concerned I write Postgres applications,
not SQL applications. Of course, many people have a need to support other
systems, so I appreciate the care we take to document the differences from
the standard, and I hope we will continue to support standard queries. But
if it’s a bad habit to use Postgres-specific features, why do we create any
of those features?

> If the name is never used, why are we required to supply it?
>
> But similarly, I many times relied on the fact that writable CTE are
> executed
> even if not explicitly referenced.  So by the same argument shouldn't we
> allow
> something like this?
>
> WITH (INSERT INTO t SELECT * FROM pending WHERE ts < now())
> SELECT now() AS last_processing_time;
>

I’m not necessarily opposed to allowing this too. But the part which causes
me annoyance is normal subquery naming.


Re: Separate the attribute physical order from logical order

2022-06-28 Thread Isaac Morland
On Tue, 28 Jun 2022 at 05:32, Julien Rouhaud  wrote:

> I think that supporting at least a way to specify the logical order during
> the
> table creation should be easy to implement (there shouldn't be any
> question on whether it needs to invalidate any cache or what lock level to
> use), and could also be added in the initial submission without much extra
> efforts, which could help with the testing.
>

I think the meaning of “logical order” (well, the meaning it has for me, at
least) implies that the logical order of a table after CREATE TABLE is the
order in which the columns were given in the table creation statement.

If there needs to be a way of specifying the physical order separately,
that is a different matter.

ALTER TABLE ADD … is another matter. Syntax there to be able to say BEFORE
or AFTER an existing column would be nice to have. Presumably it would
physically add the column at the end but set the logical position as
specified.


Re: Making the subquery alias optional in the FROM clause

2022-06-27 Thread Isaac Morland
On Mon, 27 Jun 2022 at 11:12, Julien Rouhaud  wrote:

> More generally, I'm -0.5 on the feature.
> I prefer to force using SQL-compliant queries, and also not take bad
> habits.
>

As to forcing SQL-compliant queries, that ship sailed a long time ago:
Postgres allows but does not enforce the use of SQL-compliant queries, and
many of its important features are extensions anyway, so forcing SQL
compliant queries is out of the question (although I could see the utility
of a mode where it warns or errors on non-compliant queries, at least in
principle).

As to bad habits, I'm having trouble understanding. Why do you think
leaving the alias off a subquery is a bad habit (assuming it were allowed)?
If the name is never used, why are we required to supply it?


Re: Finer grain log timestamps

2022-06-20 Thread Isaac Morland
On Mon, 20 Jun 2022 at 11:01, Tom Lane  wrote:

> Alvaro Herrera  writes:
>


> If I were coding it, I would allow only exactly 1 digit (%.Nt) to simplify
> the parsing side of things and bound the required buffer size.  Without
> having written it, it's not clear to me whether further restricting the
> set of supported values would save much code.  I will point out, though,
> that throwing an error during log_line_prefix processing will lead
> straight to infinite recursion.
>

I would parse the log_line_prefix when it is set. If there is a problem,
you just log it using whatever format is currently in effect and don't
change the setting. Then the worst that happens is that logs show up in a
format log processing isn't prepared to accept.

That being said, I think I fall in the “just start putting more digits in
the log” camp, although it is conceivable the counter arguments might be
convincing.
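
For reference, the existing escapes already go below one second; a
postgresql.conf fragment using them (this is current behavior, not the
proposed %.Nt):

```
# Existing escapes: %t = seconds, %m = milliseconds,
# %n = Unix epoch with milliseconds, %p = process ID
log_line_prefix = '%m [%p] '
```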


Re: Why is EXECUTE granted to PUBLIC for all routines?

2022-04-23 Thread Isaac Morland
On Fri, 22 Apr 2022 at 13:44, Tom Lane  wrote:


> There is zero security concern for non-SECURITY-DEFINER functions,
> since they do nothing callers couldn't do for themselves.  For those,
> you typically do want to grant out permissions.  As for SECURITY DEFINER
> functions, there is no reason to make one unless it is meant to be called
> by someone besides the owner.  Perhaps PUBLIC isn't the scope you want to
> grant it to, but no-privileges wouldn't be a useful default there either.
>

No privileges would be a safe default, not entirely unlike the default "can
only connect from localhost" pg_hba.conf, …


> In any case, changing this decision now would cause lots of problems,
> such as breaking existing dump files.  We're unlikely to revisit it.
>

… but, yeah, this would be rather hard to change without causing more
trouble.


> As noted in the docs, best practice is to adjust the permissions
> as you want them in the same transaction that creates the function.
>

I wrote a function which resets the permissions on all objects in the
specified schemas to default. Then for each project I have a
privileges-granting file which starts by resetting all permissions, then
grants exactly the permissions I want. Most of the resetting is done by
checking the existing privileges and revoking them; then it ASSERTs that
this leaves an empty ACL, and finally does an UPDATE on the relevant system
table to change the ACL from empty to NULL. For SECURITY DEFINER functions,
the reset function then revokes PUBLIC privileges, leaving it to the
specific project to grant the appropriate privileges.
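
Roughly like this (a sketch for a single function, not the actual reset
function; f(int) and its grantees are made up, and the catalog UPDATE
requires superuser):

```sql
-- Revoke the existing grants found in the ACL (the real function
-- iterates over every grantee it finds, not just PUBLIC).
REVOKE ALL ON FUNCTION f(int) FROM PUBLIC;

-- An empty ACL ('{}') means "no privileges at all", while NULL means
-- "built-in default privileges"; the UPDATE restores the latter.
UPDATE pg_proc
   SET proacl = NULL
 WHERE oid = 'f(int)'::regprocedure
   AND cardinality(proacl) = 0;
```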

BTW, the reg* types are amazing for writing this kind of stuff. Makes all
sorts of things so much easier.


Re: Pointer subtraction with a null pointer

2022-03-26 Thread Isaac Morland
On Sat, 26 Mar 2022 at 12:24, Andres Freund  wrote:


> NULL can never be part of the same "array object", or one past the last
> element, as the pointer it is subtracted from. Hence the undefined behaviour.
>

Even more fundamentally, NULL is not 0 in any ordinary mathematical sense,
even though it can be written 0 in source code and is often (but not
always) represented in memory as an all-0s bit pattern. I'm not at all
surprised to learn that arithmetic involving NULL is undefined.


Re: Foreign key joins revisited

2021-12-27 Thread Isaac Morland
On Mon, 27 Dec 2021 at 10:20, Joel Jacobson  wrote:


> Foreign key constraint names have been given the same names as the
> referenced tables.
>

While I agree this could be a simple approach in many real cases for giving
FK constraints easy-to-understand names, I wonder whether, for illustrating
and explaining the feature, it might work better to use names that are
completely unique, so that it's crystal clear that the names are constraint
names, not table names.


Re: Foreign key joins revisited

2021-12-27 Thread Isaac Morland
On Mon, 27 Dec 2021 at 03:22, Joel Jacobson  wrote:


> However, I see one problem with leaving out the key columns:
> First, there is only one FK in permission pointing to role, and we write a
> query leaving out the key columns.
> Then, another different FK in permission pointing to role is later added,
> and our old query is suddenly in trouble.
>

I thought the proposal was to give the FK constraint name. However, if the
idea now is to allow leaving that out also if there is only one FK, then
that's also OK as long as people understand it can break in the same way
NATURAL JOIN can break when columns are added later. For that matter, a
join mentioning column names can break if the columns are changed. But
breakage where the query no longer compiles is better than breakage where it
suddenly means something very different, so overall I wouldn't worry about
this too much.

