Is there a way to automatically scan a table and determine the format of data

2022-02-15 Thread Shaozhong SHI
Is there a way to automatically scan a table and report the format of data
for each column?
Regards,
David
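
If "format" here means each column's declared type, one way (a sketch; 'public' and 'my_table' are placeholders for the schema and table of interest) is to query information_schema.columns:

```
select column_name,
       data_type,
       character_maximum_length,
       numeric_precision,
       is_nullable
from information_schema.columns
where table_schema = 'public'
  and table_name   = 'my_table'
order by ordinal_position;
```

Detecting the format of the *values* (e.g. date patterns stored in text columns) would require scanning the data itself rather than the catalog.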


Re: "grant usage on schema" confers the ability to execute all user-defined functions in that schema, without needing to grant "execute"

2022-02-15 Thread Bryn Llewellyn
> b...@yugabyte.com wrote:
> 
>> ddevie...@gmail.com wrote:
>> 
>>> b...@yugabyte.com wrote:
>>> 
>>> …Then I did this:
>>> 
>>> with c as (
>>>  select
>>>proname::text as name,
>>>pronamespace::regnamespace::text  as schema,
>>>aclexplode(proacl)as "aclexplode(proacl)"
>>>  from pg_catalog.pg_proc)
>>> select "aclexplode(proacl)" from c
>>> where name = 'q' and schema = 's';
>>> 
>>> This is the result:
>>> aclexplode(proacl)
>>> -----------------------------
>>> (1494148,0,EXECUTE,f)
>>> (1494148,1494148,EXECUTE,f)
>>> (1494148,1494150,EXECUTE,f)
>> 
>> `aclexplode` is a table-valued function, so you normally use it in the
>> FROM clause.
>> Here's how I use it on schemas for example:
>> 
>> ```
>> select nspname as name,
>>   nspowner::regrole::text as owner,
>>   grantor::regrole::text,
>>   grantee::regrole::text,
>>   privilege_type, is_grantable
>>  from pg_namespace
>>  left join lateral aclexplode(nspacl) on true
>> where ...
>> order by nspname
>> ```
> 
> Thank you very much for the tip and for the code example, Dominique. Yes, my 
> SQL was poorly written. I wanted just a simple proof of concept that 
> "aclexplode()" lets me access the individual values that the "proacl" column 
> represents as an array of "aclitem" records without needing to parse text 
> strings like "z=X/x". I'd started to picture writing my own function to do 
> what "aclexplode()" does. But Julien Rouhaud told me about the built-in for 
> the purpose I needed before I'd had time to give my own function any thought.
> 
> I should have at least moved my invocation of "aclexplode()" out of the CTE. 
> But, of course, for an approach that finds many "pg_proc" rows, I'll need a 
> proper, robust approach like you showed.

*Listing publicly executable subprograms*

I mentioned earlier in this thread that I thought it would be useful to be
able to list all the user-defined functions and procedures in a database that
"public" is authorized to execute. I think I mentioned "table function" as a
possibly useful encapsulation. Forget this; it was just a metaphor for
"reusable". I wrote this short "language sql" function instead. The idea is
that it would be installed with a dedicated owner in a dedicated schema so
that all users in the database could execute it. Here's the DDL to create it:

create function utils.public_has_execute(proacl in aclitem[])
  returns boolean
  security invoker
  immutable
  language sql
as $body$
  select
    (select proacl is null)  -- a null ACL means default privileges, which include execute for public
    or
    (with c as (select aclexplode(proacl) as r)
    select exists (select * from c where (r).grantee = 0));  -- grantee 0 denotes "public"
$body$;

Have I understood right that, because this is "language sql", its defining
subquery is inlined into the statement that uses it early enough in the
compilation that it ends up the same as if its text had been included directly
in the using statement? In other words, does the encapsulation as a function
bring no performance cost?
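
Whether inlining actually happened can be checked from the plan (a sketch: if the function was inlined, EXPLAIN VERBOSE shows the expanded expression over "proacl" rather than a call to "public_has_execute"):

```
explain (verbose)
select proname
from pg_catalog.pg_proc
where utils.public_has_execute(proacl);
```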

I've copied a self-contained script below that creates and tests it. The few 
tests that I did show that it works as I wanted it to. This is the result that 
it produces. It's what I expect:

  owner   | schema |name| type | language | public_has_execute 
--+++--+--+
 postgres | utils  | public_has_execute | func | sql  | true
 x| s  | f  | func | plpgsql  | true
 x| s  | f  | func | sql  | false
 x| s  | p  | proc | plpgsql  | true
 y| s  | g  | func | plpgsql  | false

I did mention that it might be good if such a function could ship as part of a 
future PG Version. Forget that I said this, too. It's so short that anyone who 
wants it could write it. Moreover, somebody might want to list subprograms 
that, say, "mary" and "john" can execute. It seems that it would be far better 
just to implement this explicitly than to lard up a generic function with an 
elaborate parameterization and implementation.
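
For example, a role-specific check needs no ACL parsing at all, because there is a built-in for it (a sketch; 'mary' is a hypothetical role name):

```
select p.oid::regprocedure as subprogram
from pg_catalog.pg_proc p
where has_function_privilege('mary', p.oid, 'execute')
order by 1;
```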

So, unless anybody has comments, it's "case closed" from me.


-- Setup
\c postgres postgres
set client_min_messages = warning;
drop database if exists db;
create database db owner postgres;

\c db postgres
set client_min_messages = warning;
drop schema if exists public cascade;
-- A more realistic example would have a dedicated user, say "utl"
-- to own utility subprograms.
create schema utils authorization postgres;

create function utils.public_has_execute(proacl in aclitem[])
  returns boolean
  security invoker
  immutable
  language sql
as $body$
  select
(select proacl is null)
or
(with c as (select aclexplode(proacl) as r)
select exists (select * from c where (r).grantee = 0));
$body$;

-- Example use.
create view utils.publicly_executable_su

Re: Can we go beyond the standard to make Postgres radically better?

2022-02-15 Thread Alvaro Herrera
On 2022-Feb-13, Guyren Howe wrote:

> I’m back to just having no earthly idea why anyone who finds relations
> to be a productive tool for building a model would think that SQL
> being the only means to do that is Okay.

There are aspects other than technical reasons alone why some things
live on while "better" things barely survive without thriving, or even
die.  For example, the fact that there are multiple companies
furthering the development of the SQL language means that there's a
group of engineers working to introduce improvements to the language
after extensive discussions and exposure to requests from users.  If
Postgres decided to "fork" and go solitary with its own query language,
it would by itself have to produce all the language improvements, or be
left behind by the other products.  And it takes a lot of effort to
produce those improvements.  Have you looked at how SQL changed from one
version to the next?

Another aspect is inertia.  The amount of software products that rely on
SQL is just too high.  Suppose you came up with a technically-better
language that has all the current capabilities of SQL.  Do you think a
majority of those products would immediately switch to the new language?
My guess is no, they wouldn't, because the safe bet is that SQL will
continue to work in 10, 20 years, while there is no certainty at all
that your new language would.  So by ditching SQL, Postgres would no
longer be a database of choice for those products.

So, while SQL may not be the greatest possible relational language
possible, there are very good reasons for it to continue to be the
language of choice.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: Can we go beyond the standard to make Postgres radically better?

2022-02-15 Thread Bruce Momjian
On Tue, Feb 15, 2022 at 02:18:35PM -0600, Merlin Moncure wrote:
> Exactly. SQL is proven to be more productive and code written in it
> has longer longevity than alternatives.  It's also generally more
> terse in the hands of a good author.  The authors of all the 'SQL
> sucks' rants don't really explore why this is the case.  For example,
> SQL has transactions and pretty much all other major languages don't.
> They may have it in a limited sense but not standardized throughout
> the syntax and the standard libraries.  High quality automatic
> concurrency models are another factor.

What I found with QUEL was that simple things were easier than SQL, but
things like aggregates and subqueries were harder, confusing, or
impossible.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: Can we go beyond the standard to make Postgres radically better?

2022-02-15 Thread Merlin Moncure
On Sun, Feb 13, 2022 at 4:00 AM Pavel Stehule  wrote:
>
>
>
> On Sun, Feb 13, 2022 at 10:45 Guyren Howe wrote:
>>
>>
>> The MySQL autocomplete is designed without context filtering. Maybe we can 
>> have this implementation too (as alternative)
>>
>> so using all column names + all table names + aliases.column names (when we 
>> know defined alias)
>>
>> Another idea about column excluding. Any implementation on the client side 
>> is very complex, because you need to parse sql. But maybe we can enhance SQL 
>> with some syntax.
>>
>> SELECT ... and LAST optional clause can be HIDE, OR EXCLUDE NAME or EXCLUDE 
>> TYPE
>>
>> SELECT * FROM postgistab HIDE TYPE LIKE 'wkb%'
>> SELECT * FROM postgistab ORDER BY ... EXCLUDE COLUMN LIKE 'shape%'
>>
>> WITH x AS (SELECT * FROM xx)
>> SELECT * FROM x EXCLUDE COLUMN x1,x2
>>
>> The column excluding should be a separate *last* clause.
>>
>> More with this syntax is less possibility of collision with ANSI SQL
>>
>> Not against this. Seems somewhere in here might be a nice quality of life 
>> change.
>>
>> Still.
>>
>> I originally suggested that SQL is terrible and we should fearlessly either 
>> replace it or migrate it toward something better. And the thread winds up 
>> with a debate about a minor change to a development tool.
>>
>> I’m back to just having no earthly idea why anyone who finds relations to be 
>> a productive tool for building a model would think that SQL being the only 
>> means to do that is Okay.
>
> I think the rating of data languages is very subjective, and I am happy with 
> SQL more than with special languages like D or Quel, or other. I know SQL has 
> a lot of disadvantages, but it was designed for humans and it works for me.

Exactly. SQL is proven to be more productive and code written in it
has longer longevity than alternatives.  It's also generally more
terse in the hands of a good author.  The authors of all the 'SQL
sucks' rants don't really explore why this is the case.  For example,
SQL has transactions and pretty much all other major languages don't.
They may have it in a limited sense but not standardized throughout
the syntax and the standard libraries.  High quality automatic
concurrency models are another factor.


merlin




Re: Moving the master to a new server

2022-02-15 Thread Glen Eustace



On 16/02/22 1:58 am, Marc Millas wrote:
another way would be to, while everything running, you create a second 
slave on the new machine on rocky8 with a pg_basebackup


Thanks, I did consider this as well.  Last night I did the move using 
the rsync approach and it worked very well.


--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Glen Eustace,
GodZone Internet Services, a division of AGRE Enterprises Ltd.,
P.O. Box 8020, Palmerston North, New Zealand 4446 Ph +64 6 357 8168, Mob +64 27 
542 4015

“Specialising in providing low-cost professional Internet Services since 1997"





Re: increasing effective_cache_size slows down join queries by a factor of 4000x

2022-02-15 Thread Tomas Vondra




On 2/15/22 01:06, A Shaposhnikov wrote:
> Interestingly I have a second PG 14.2 database, with identical table
> definitions, but about 10% smaller row counts, and the exact same query
> works fast there without the 2nd condition:

Are you sure about the 10%? Because in the plans from the first machine
I see this:

> -> Index Scan using team_pkey on team t (cost=0.57..11382381.88
> rows=78693167 width=175) (actual time=0.016..0.695 rows=854 loops=1)

while the second machine does this:

> -> Index Scan using team_pkey on team t (cost=0.57..2366113.83
> rows=2807531 width=160) (actual time=0.031..0.801 rows=888 loops=1)

That's 2.8M vs. 78M, quite far from a "10% difference". Not sure about
the team_aliases table; that's impossible to say from the plans.


This may matter a lot, because effective_cache_size is used to calculate
the cache hit ratio for the query, with relation sizes as an input. So
smaller relations (or a larger effective_cache_size) mean cheaper random
I/O, hence a preference for a nested loop join.


The other thing is data distribution - that may matter too.


IMO it's pointless to investigate this further - we know what's causing
the issue. The optimizer is oblivious to the fact that the merge join will
have to skip a large part of the second input, due to the implicit
condition. Notice that adding the condition changes the cost from:


 Limit (cost=81.33..331.82 rows=1000 width=183) ...

to

 Limit (cost=81.33..720.48 rows=1000 width=183) ...

So it seems *more* expensive than the first plan. Taken to the extreme,
the planner could theoretically have chosen to use the first plan (and
delayed the condition until after the join).
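
The effect can be reproduced directly (a sketch; "t1" and "t2" are hypothetical tables standing in for the thread's "team" join): planning the same query under two settings can flip the join method, because a larger effective_cache_size makes the random I/O of a nested loop look cheaper.

```
set effective_cache_size = '512MB';
explain select * from t1 join t2 on t1.id = t2.t1_id order by t1.id limit 1000;

set effective_cache_size = '64GB';
explain select * from t1 join t2 on t1.id = t2.t1_id order by t1.id limit 1000;
```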


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Moving the master to a new server

2022-02-15 Thread Marc Millas
Hi,

another way would be to, while everything is running, create a second
slave on the new machine on Rocky 8 with pg_basebackup, and start the
new slave.

Then, when activity is low, you just stop the master and promote the
slave => new master up.
Then modify the connection line in the recovery.conf file on the old
slave, and restart it.
It may help to first add recovery_target_timeline = 'latest' to the
recovery.conf file.
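
The steps above can be sketched as commands (hostnames, replication user, and data directory are placeholders; recovery.conf applies to pre-v12 clusters such as the v10 one discussed in this thread):

```
# On the new machine: build a second standby while the master keeps running.
pg_basebackup -h old-master -U replicator -D /var/lib/pgsql/10/data -R -P
pg_ctl -D /var/lib/pgsql/10/data start

# During a quiet period: stop the old master, then promote the new standby.
pg_ctl -D /var/lib/pgsql/10/data promote

# On the old slave: point the connection line in recovery.conf at the new
# master, optionally add recovery_target_timeline = 'latest', and restart.
```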


Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Mon, Feb 14, 2022 at 8:59 PM Glen Eustace 
wrote:

>
> On 15/02/22 8:39 am, Alan Hodgson wrote:
> > pg_dump -> restore will break your streaming replication. You'll need
> > to set it up again.
> That's what I thought might be the case.
> >
> > If the PG version isn't changing and you're still on the same version
> > of Linux, rsync would be easier.
>
> I did an ELevate upgrade on the slave from CentOS 7 to Rocky 8 and then
> just renamed 10/data to data, and that seemed to work just fine.
>
> But upgrading that way takes too long for the master, so I built a new
> server instead. So, if I shut down both postgresql instances, old and new,
> rsync the data directory, and restart on the new one, should I be OK?
>


Re: Operator % and its meaning and use

2022-02-15 Thread Laurenz Albe
On Tue, 2022-02-15 at 11:08 +, Shaozhong SHI wrote:
> Can anyone remind me of the meaning and use of operator %.

https://www.postgresql.org/docs/current/pgtrgm.html#PGTRGM-OP-TABLE

Yours,
Laurenz Albe
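
In pg_trgm, "%" is the trigram-similarity operator: a % b is true when similarity(a, b) exceeds pg_trgm.similarity_threshold (0.3 by default). A minimal sketch:

```
create extension if not exists pg_trgm;

select similarity('High Street', 'high st');  -- a score between 0 and 1
select 'High Street' % 'high street';         -- true: above the threshold
-- So WHERE NOT (street_name % designatedname) keeps rows whose two
-- names are NOT trigram-similar.
```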





Aw: Operator % and its meaning and use

2022-02-15 Thread Karsten Hilbert
Dear David,

> Can anyone remind me of the meaning and use of operator %.
 
I can gladly report that I remember having seen relevant documentation on
that operator while reading up in The Fine Manual on json_to_row, following
the hint Ion kindly provided.

It was amazing! Reading up on that helped my understanding!

Best regards,
Karsten




Operator % and its meaning and use

2022-02-15 Thread Shaozhong SHI
Can anyone remind me of the meaning and use of operator %.

It works in one of my scripts, like: WHERE NOT (street_name % designatedname)

Regards,

David