Re: Pro et contra of preserving pg_proc oids during pg_upgrade

2023-10-16 Thread Nikita Malakhov
Hi,

Thank you very much, I'll check it out. It looks like the
getObjectIdentity() used in
pg_identify_object() could do.

--
Regards,
Nikita Malakhov
Postgres Professional
The Russian Postgres Company
https://postgrespro.ru/


Re: Pro et contra of preserving pg_proc oids during pg_upgrade

2023-10-16 Thread Alvaro Herrera
On 2023-Oct-13, Nikita Malakhov wrote:

> Textual representation requires a long text field because it could
> contain schema, arguments, it is difficult and not effective to be
> saved as part of the data, and must be parsed to retrieve function
> oid.

It is worse than that: the regproc textual representation depends on
search_path.  If you store the text now, the meaning could change later,
depending on the search_path that applies at read time.

Of course, the storage for OID is much shorter and not subject to this
problem; but it is subject to the problem that it breaks if you drop and
replace the function, which could happen for instance in an extensions
upgrade script.

I think a better way to store a function's identity is to store the
'identity' column from pg_identify_object().  It is fully qualified and
you can cast to regprocedure with no ambiguity (which gives you an OID,
if you need one).  And it should upgrade cleanly.

If you have a regproc column that you want to upgrade, maybe it would
work to do 'ALTER TABLE .. SET TYPE TEXT USING' and turn the value into
pg_identify_object().identity.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: Pro et contra of preserving pg_proc oids during pg_upgrade

2023-10-13 Thread Laurenz Albe
On Thu, 2023-10-12 at 19:56 +0300, Nikita Malakhov wrote:
> Say, we have data processed by some user function and we want to keep 
> reference to this function
> in our data. In this case we have two ways - first - store string output of 
> regprocedure, which is not
> very convenient, and the second - store its OID, which requires slight 
> modification of pg_upgrade
> (pg_dump and func/procedure creation function).

So far, we have lived quite well with the rule "don't store any system OIDs in 
the database
if you want to pg_upgrade" (views on system objects, reg* data types, ...).

What is inconvenient about storing the output of regprocedure?

Yours,
Laurenz Albe




Re: Pro et contra of preserving pg_proc oids during pg_upgrade

2023-10-12 Thread David G. Johnston
On Thu, Oct 12, 2023 at 2:58 PM Nikita Malakhov  wrote:

> Why pg_upgrade cannot be used?
>

We document both a pg_dump/pg_restore migration and a pg_upgrade one (not
to mention that logical backup and restore would cause the oids to
change).  It seems odd to have a feature that requires pg_upgrade to be the
chosen one.  pg_upgrade is an option, not a requirement.  Same goes for
pg_basebackup.

pg_upgrade itself warns that should the on-disk file format change then it
would be unusable - though I suspect that we'd end up with some kind of
hybrid approach in that case.


> OID preservation logic is already implemented
> for several OIDs in catalog tables, like pg_class, type, relfilenode,
> enum...
>
>
We are allowed to preserve oids if we wish but that doesn't mean we must,
nor does doing so constitute a declaration that such oids are part of
the public API.  And I don't see us making OIDs part of the public API
unless we modify pg_dump to include them in its output.


> Actually, I've asked here because there are several references to PG_PROC
> oids
> from other tables in the system catalog
>

Of course there are, e.g., views depending on functions would result is
those.  But pg_upgrade et al. recomputes the views so the changing of oids
isn't a problem.

Long text fields are common in databases; and if there are concerns with
parsing/interpretation we can add functions to make doing that simpler.

David J.


Re: Pro et contra of preserving pg_proc oids during pg_upgrade

2023-10-12 Thread Nikita Malakhov
Hi,

Textual representation requires a long text field because it could contain
schema,
arguments, it is difficult and not effective to be saved as part of the
data, and must
be parsed to retrieve function oid. By using direct oid (actually, a value
of the regprocedure field) we avoid it and function could be retrieved by
pk.

Why pg_upgrade cannot be used? OID preservation logic is already implemented
for several OIDs in catalog tables, like pg_class, type, relfilenode,
enum...
I've mentioned twice that this logic is already implemented and I haven't
encountered
any problems with pg_upgrade.

Actually, I've asked here because there are several references to PG_PROC
oids
from other tables in the system catalog, so I was worried if this logic
could break
something I do not know about.

--
Regards,
Nikita Malakhov
Postgres Professional
The Russian Postgres Company
https://postgrespro.ru/


Re: Pro et contra of preserving pg_proc oids during pg_upgrade

2023-10-12 Thread David G. Johnston
On Thu, Oct 12, 2023 at 1:31 PM Nikita Malakhov  wrote:

> About using surrogate key - this feature is more for data generated by
> the DBMS itself, i.e. data processed by some extension and saved
> and re-processed automatically or by user's request, but without bothering
> user with these internal keys.
>

Then what does it matter whether you spell it:

12345
or
my_ext.do_something(int)
?

Why do you require us to redefine the scope for which pg_proc.oid is useful
in order to implement this behavior?

Your extension breaks if your user uses logical backups or we otherwise
get into a position where pg_upgrade cannot be used to migrate in the
future.  Is avoiding the textual representation so necessary that you need
to add another dependency to the system?  That just seems unwise regardless
of how easy it may be to accomplish.

David J.


Re: Pro et contra of preserving pg_proc oids during pg_upgrade

2023-10-12 Thread Nikita Malakhov
Hi,

I've already implemented preserving PG_PROC oids during pg_upgrade
in a way like relfilenodes, etc, actually, it is quite simple, and on the
first
look there are no any problems.

About using surrogate key - this feature is more for data generated by
the DBMS itself, i.e. data processed by some extension and saved
and re-processed automatically or by user's request, but without bothering
user with these internal keys.

The main question - maybe, are there pitfalls of which I am not aware of?

Thanks for your replies!

-- 
Regards,
Nikita Malakhov
Postgres Professional
The Russian Postgres Company
https://postgrespro.ru/


Re: Pro et contra of preserving pg_proc oids during pg_upgrade

2023-10-12 Thread Robert Haas
On Thu, Oct 12, 2023 at 3:36 PM David G. Johnston
 wrote:
> Every catalog has both a natural and a surrogate key.  Developers get to use 
> the surrogate key while end-users get to use the natural one (i.e., the one 
> they provided).  I see no reason to change that specification.

I agree with this.

> And I do believe there are no compelling reasons for an end-user to need to 
> use the surrogate key instead of the natural one.

But I disagree with this.

> The example provided by the OP isn't one, IMO, the overall goal can be 
> accomplished via the natural key (if it cannot, maybe we need to make 
> retrieving the natural key for a pg_proc record given an OID easier).  The 
> fact that OIDs are not even accessible via SQL further reinforces this 
> belief.  The only reason to need OIDs as a DBA is to perform joins among the 
> catalogs and all such joins are local to the database and even session 
> executing them - the specific values are immaterial.

This just all seems very simplistic to me. In theory it's true, but in
practice it isn't.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Pro et contra of preserving pg_proc oids during pg_upgrade

2023-10-12 Thread David G. Johnston
On Thu, Oct 12, 2023 at 11:43 AM Robert Haas  wrote:

> On Thu, Oct 12, 2023 at 2:38 PM David G. Johnston
>  wrote:
> > It's more like a lot number or surveying tract than an postal address.
> Useful for a single party, the builder or the government, but not something
> you give out to other people so they can find you.
> >
> > Whether or not we copy over oids should be done based upon our internal
> needs, not end users.  Which is why the fee that do get copied exists,
> because we store them in internal files that we want to copy as part of the
> upgrade.  It also isn't like pg_dump/restore is going to retain them and
> the less divergence between that and pg_upgrade arguably the better.
>
> We build the product for the end users. Their desires and needs are
> relevant. And if they're telling us we did it wrong, we need to listen
> to that. We don't have to do everything that everybody wants, but
> treating developer needs as strictly more important than end-user
> needs is self-defeating.
>

Every catalog has both a natural and a surrogate key.  Developers get to
use the surrogate key while end-users get to use the natural one (i.e., the
one they provided).  I see no reason to change that specification.  And I
do believe there are no compelling reasons for an end-user to need to use
the surrogate key instead of the natural one.  The example provided by the
OP isn't one, IMO, the overall goal can be accomplished via the natural key
(if it cannot, maybe we need to make retrieving the natural key for a
pg_proc record given an OID easier).  The fact that OIDs are not even
accessible via SQL further reinforces this belief.  The only reason to
need OIDs as a DBA is to perform joins among the catalogs and all such
joins are local to the database and even session executing them - the
specific values are immaterial.

The behavior of pg_upgrade only preserving OIDs that are necessary due to
the physical copying of data files from the old server to the new one seems
sufficient both in terms of effort and the principle of doing the minimum
amount to solve the problem at hand.

David J.


Re: Pro et contra of preserving pg_proc oids during pg_upgrade

2023-10-12 Thread Robert Haas
On Thu, Oct 12, 2023 at 2:38 PM David G. Johnston
 wrote:
> It's more like a lot number or surveying tract than an postal address.  
> Useful for a single party, the builder or the government, but not something 
> you give out to other people so they can find you.
>
> Whether or not we copy over oids should be done based upon our internal 
> needs, not end users.  Which is why the fee that do get copied exists, 
> because we store them in internal files that we want to copy as part of the 
> upgrade.  It also isn't like pg_dump/restore is going to retain them and the 
> less divergence between that and pg_upgrade arguably the better.

We build the product for the end users. Their desires and needs are
relevant. And if they're telling us we did it wrong, we need to listen
to that. We don't have to do everything that everybody wants, but
treating developer needs as strictly more important than end-user
needs is self-defeating.

I agree that there's a trade-off here. Preserving more OIDs requires
more code and makes pg_dump and other things more complicated, which
is not great. But, at least to me, arguing that there are no downsides
of not preserving these OIDs is simply not a believable argument.

Well, maybe somebody believes it. But I don't.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Pro et contra of preserving pg_proc oids during pg_upgrade

2023-10-12 Thread David G. Johnston
On Thu, Oct 12, 2023, 11:21 Robert Haas  wrote:

>
> The pg_upgrade experience right now is a bit as if you woke up in the
> morning and found that city officials came by during the night and
> renumbered your house, thus changing your address. Then, they sent
> change of address forms to everyone who ever mails you anything, plus
> updated your address with your doctor's office and your children's
> school. In a way, there's no problem: nothing has really changed for
> you in any way that matters. Yet, I think that would feel pretty
> uncomfortable if it actually happened to you, and I think the
> pg_upgrade experience is uncomfortable in the same way.
>

It's more like a lot number or surveying tract than an postal address.
Useful for a single party, the builder or the government, but not something
you give out to other people so they can find you.

Whether or not we copy over oids should be done based upon our internal
needs, not end users.  Which is why the fee that do get copied exists,
because we store them in internal files that we want to copy as part of the
upgrade.  It also isn't like pg_dump/restore is going to retain them and
the less divergence between that and pg_upgrade arguably the better.

David J.

>


Re: Pro et contra of preserving pg_proc oids during pg_upgrade

2023-10-12 Thread Robert Haas
On Thu, Oct 12, 2023 at 10:35 AM Tom Lane  wrote:
> You have the burden of proof backwards.  That would add a great deal
> of new mechanism, and you haven't provided even one reason why it'd
> be worth doing.

"A great deal of new mechanism" seems like a slight exaggeration. We
preserve a bunch of kinds of OIDs already, and it wouldn't be any
harder to preserve this one than the ones we preserve already, or so I
think. So it would be some additional mechanism, but maybe not a great
deal.

As to whether it's a good idea, it isn't necessary for the system to
operate properly, so we didn't, but it's a judgement call whether it's
better for other reasons, like being able to have regprocedure columns
survive an upgrade, or making users being less confused, or allowing
people supporting PostgreSQL having an easier time debugging issues.
Personally, I've never been quite sure we made the right decision
there. I admit that I'm not particularly keen to try to add the amount
of mechanism that would be required to preserve every single OID
everywhere, but I also somehow feel like the fact that we don't is
pretty weird.

The pg_upgrade experience right now is a bit as if you woke up in the
morning and found that city officials came by during the night and
renumbered your house, thus changing your address. Then, they sent
change of address forms to everyone who ever mails you anything, plus
updated your address with your doctor's office and your children's
school. In a way, there's no problem: nothing has really changed for
you in any way that matters. Yet, I think that would feel pretty
uncomfortable if it actually happened to you, and I think the
pg_upgrade experience is uncomfortable in the same way.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Pro et contra of preserving pg_proc oids during pg_upgrade

2023-10-12 Thread David G. Johnston
On Thu, Oct 12, 2023 at 7:36 AM Tom Lane  wrote:

> Nikita Malakhov  writes:
> > Please advise on the idea of preserving pg_proc oids during pg_upgrade,
> in
> > a way like relfilenodes, type id and so on. What are possible downsides
> of
> > such a solution?
>
> You have the burden of proof backwards.  That would add a great deal
> of new mechanism, and you haven't provided even one reason why it'd
> be worth doing.
>
>
I was curious about the comment regarding type oids being copied over and I
found the commentary in pg_upgrade.c that describes which oids are copied
over and why, but the IMPLEMENTATION seems to be out-of-sync with the
actual implementation.

"""
It preserves the relfilenode numbers so TOAST and other references
to relfilenodes in user data is preserved.  (See binary-upgrade usage
in pg_dump). We choose to preserve tablespace and database OIDs as well.
"""

David J.


Re: Pro et contra of preserving pg_proc oids during pg_upgrade

2023-10-12 Thread David G. Johnston
On Thu, Oct 12, 2023 at 9:57 AM Nikita Malakhov  wrote:

> Say, we have data processed by some user function and we want to keep
> reference to this function
> in our data.
>

Then you need to keep the user-visible identifier of said function
(schema+name+input argument types - you'd probably want to incorporate
version into the name) in your user-space code.  Exposing runtime generated
oids to user-space is not something I can imagine the system supporting.
It goes against the very definition of "implementation detail" that
user-space code is not supposed to depend upon.

David J.


Re: Pro et contra of preserving pg_proc oids during pg_upgrade

2023-10-12 Thread Nikita Malakhov
Hi!

Say, we have data processed by some user function and we want to keep
reference to this function
in our data. In this case we have two ways - first - store string output of
regprocedure, which is not
very convenient, and the second - store its OID, which requires slight
modification of pg_upgrade
(pg_dump and func/procedure creation function).

I've read previous threads about using regproc, and agree that this is not
a very good case anyway,
but I haven't found any serious obstacles that forbid modifying pg_upgrade
this way.

--
Regards,
Nikita Malakhov
Postgres Professional
The Russian Postgres Company
https://postgrespro.ru/


Re: Pro et contra of preserving pg_proc oids during pg_upgrade

2023-10-12 Thread Tom Lane
Nikita Malakhov  writes:
> Please advise on the idea of preserving pg_proc oids during pg_upgrade, in
> a way like relfilenodes, type id and so on. What are possible downsides of
> such a solution?

You have the burden of proof backwards.  That would add a great deal
of new mechanism, and you haven't provided even one reason why it'd
be worth doing.

regards, tom lane




Pro et contra of preserving pg_proc oids during pg_upgrade

2023-10-12 Thread Nikita Malakhov
Hi hackers!

Please advise on the idea of preserving pg_proc oids during pg_upgrade, in
a way like relfilenodes, type id and so on. What are possible downsides of
such a solution?

Thanks!

-- 
Regards,
Nikita Malakhov
Postgres Professional
The Russian Postgres Company
https://postgrespro.ru/