Re: Lazy View's Column Computing
For all we know, it may already be happening. That looks like a pretty reasonable optimization which may already be in place. If we create a view:

mgogala=# select * from dept;
 deptno |   dname    |   loc
--------+------------+----------
     10 | ACCOUNTING | NEW YORK
     20 | RESEARCH   | DALLAS
     30 | SALES      | CHICAGO
     40 | OPERATIONS | BOSTON
(4 rows)

mgogala=# create view acct_view as select * from emp where deptno=10;
CREATE VIEW

a query against the view would probably be merged with the original query and everything optimized as a single query. Unfortunately, there is no way to tell:

mgogala=# explain select ename,job,sal from acct_view;
                     QUERY PLAN
-----------------------------------------------------
 Seq Scan on emp  (cost=0.00..1.18 rows=3 width=21)
   Filter: (deptno = 10)
(2 rows)

The only tool that you have at your disposal is EXPLAIN. What we would need to confirm that assumption is an optimizer trace file detailing the decisions made by the optimizer, something like event 10053 from another database which shall remain unnamed. Merging the view query into the top-level query would produce something like this:

mgogala=# select ename,job,sal from emp
mgogala-# where deptno=10;
 ename  |    job    | sal
--------+-----------+------
 CLARK  | MANAGER   | 2450
 KING   | PRESIDENT | 5000
 MILLER | CLERK     | 1300
(3 rows)

The table, shown below, has more columns than the 3 used in the query above:

mgogala=# \d emp
                            Table "mgogala.emp"
  Column  |            Type             | Collation | Nullable | Default
----------+-----------------------------+-----------+----------+---------
 empno    | smallint                    |           | not null |
 ename    | character varying(10)       |           |          |
 job      | character varying(9)        |           |          |
 mgr      | smallint                    |           |          |
 hiredate | timestamp without time zone |           |          |
 sal      | double precision            |           |          |
 comm     | double precision            |           |          |
 deptno   | smallint                    |           |          |
Indexes:
    "emp_pkey" PRIMARY KEY, btree (empno)
Foreign-key constraints:
    "fk_deptno" FOREIGN KEY (deptno) REFERENCES dept(deptno)

Merging the top-level query with the view query would be a smart tactic which is probably already deployed. However, it is not possible to tell with the tools at hand. That is what you want: the query touches only the columns you need, nothing else.
That is done by the query optimizer in the "rewrite" phase of the query: https://www.postgresql.org/docs/12/query-path.html

I could bet that the top-level query gets merged with the view query during the rewrite, and that the columns that aren't needed aren't touched. That in particular means that a function computing an untouched column of the view isn't executed at all.

Regards

On 8/2/21 10:12 AM, Avi Weinberg wrote:
> Hi,
> Is there a way to compute a column in a view only if it is referenced in the query? I have a view column whose value is computed by a function. If that column is not used at all in the query, can Postgres "skip" computing it? Thanks!

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
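One way to check this without an optimizer trace is EXPLAIN (VERBOSE), which prints each plan node's output column list. A hypothetical sketch (the function, view, and names are made up for illustration):

```sql
-- A view with a column computed by a (deliberately cheap) function.
CREATE FUNCTION expensive_fn(integer) RETURNS integer
    LANGUAGE sql IMMUTABLE AS $$ SELECT $1 * 2 $$;

CREATE VIEW emp_view AS
    SELECT empno, ename, expensive_fn(empno) AS computed FROM emp;

-- VERBOSE shows the target list. If the "computed" expression does not
-- appear in the Output line, the function was never evaluated:
EXPLAIN (VERBOSE) SELECT ename FROM emp_view;
```

If the view is flattened into the outer query, the Output line should list only emp.ename, confirming that the unreferenced computed column is pruned.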
Re: Series of 10 questions about the use of postgresql, generally.
On 8/6/21 1:11 AM, Pavel Stehule wrote:
>> 2. If I am using the COPY command for input and output around *.csv files, how may I specify internal tuple delimiters of ',', but at the same time also specify an end-of-line delimiter of ';'?
> You cannot specify an end-of-line delimiter in Postgres.

You can specify an end-of-line delimiter in Perl, Python or PHP. It is highly advisable to learn a scripting language or two when working with databases. With all databases, not just Postgres.

>> 3. How may I get postgresql to output the create table statement(s) for one or more tables of one database?
>> 4. I know that I can use COPY to import or export one database table between it and a *.csv file. Can I use it to do this with multiple TABLES and *.csv files specified in one COPY command, or not? How may I use COPY, or another default native postgresql function, to export or import create table commands to and from a *.csv file?
> COPY can be used for just one table at a time. If you want the DDL command (CREATE TABLE), then you should run pg_dump -t tablename -a

>> 5. In the absence of OS command line instructions, is there an internal postgresql way, via COPY or another function for example, to backup an entire database, with all its create table statements and all insert statements, and any other associated object, in one hit? Or is this ill-advised?
> pg_dump

>> 6. How exactly do Intervals work, by themselves and in relation to the other provided native datatypes? What are Intervals used for?
> Interval is a Postgres native type that helps with date operations that cannot be mapped to some basic unit. Internally it is a structure with numeric fields: years, months, days, hours, seconds. For example, I can work with a value of '3 months', but without knowing the starting point I cannot say how many days this interval has. You can create an interval value with the constructor function make_interval, directly from a literal, or by subtracting two timestamp values.
Interval is a SQL-standard data type for adding to and subtracting from dates. It is supported by a gazillion other databases:
https://www.ibm.com/docs/en/informix-servers/12.10?topic=types-sql-datetime-interval-data
https://www.mysqltutorial.org/mysql-interval/
https://www.oracletutorial.com/oracle-basics/oracle-interval/
https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types

The only major database that does not support the interval type is SQL Server. SQL Server has a major problem with SQL standards; it also doesn't support the OFFSET ... FETCH FIRST n ROWS ONLY syntax. Of course, this group is not about SQL Server, so I'll leave it at that.

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
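A few interval operations in Postgres, for illustration:

```sql
-- Constructor function:
SELECT make_interval(months => 3);                       -- 3 mons

-- Directly from a literal:
SELECT interval '1 year 2 months 3 days';

-- Subtracting two timestamps yields an interval:
SELECT timestamp '2021-08-06' - timestamp '2021-01-01';  -- 217 days

-- '3 months' only resolves to days once anchored to a start date:
SELECT date '2021-01-31' + interval '3 months';          -- 2021-04-30 00:00:00
```

Note the last example: adding '3 months' to January 31 lands on April 30, because the number of days in "3 months" depends on the starting point, exactly as described above.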
Re: JWT decoder
You could probably use PL/Python. Python has the base64 module and the urllib module, which can deal with all URL strings I am aware of. PL/Perl would probably do as well.

On 8/8/21 10:16 PM, Masih Tavassoli wrote:
> Hi experts,
> I am trying to find a way to decode a URL request header and extract its JSON fields in postgresql. I can do this in Oracle SQL using
> select utl_raw.cast_to_varchar2(utl_encode.base64_decode(utl_raw.cast_to_raw(regexp_replace(
> 'eyJhbGciOiJSUzI1NiIsImtpZCI6InBhLWFwaS1zaWduIn0.eyJzY29wZSI6W10sInN1YiI6Imp1bGllQHNhcHBoaXJlY29udmV5YW5jaW5nLm5ldC5hdSIsImNsaWVudCI6InB4YV9hY19zZXR0c3BsdXMiLCJ1c2VybmFtZSI6Imp1bGllQHNhcHBoaXJlY29udmV5YW5jaW5nLm5ldC5hdSIsImV4cCI6MTU1NDA4NzU2N30.OZwVETCP+HUDsSkudCfa52eQT2Fyt6BgefFpYr5xpJROjZJeVnXtzsU6MS8aHauqyyOwQSz8iG6OC2YjYgrGiZ8YTwTh1uHptNjicyCFmCkh++d0oGdK78ZOKIAvuslW3GX4V13hPUJADzGLkyO8hy01YF9Id4+7GrvmnRRkwPO7C2eUvKTB8pxDAA94Jpw74s4BiNTmm1Uv+56AXqdI8m2KFkLNwGoUR3OSNvsifl5uxBsGGmP4wbgNe/Dy1q2UsL2eBAW/GuWryHGCRz1/PR2jIV1v21ClQ8YLiFAzM3YxHb8m2ngzRynKof+Nw3nZvuafRQ16MV5THz0ahGFYLA',
> '[[:space:]]', ''))))
> from dual
> But there doesn't seem to be a way of doing it in postgres. Has anyone got any suggestions?
> Thanks, Masih

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: JWT decoder
Hmmm, PL/Python and PL/Perl are languages usable from within Postgres: you can write Python functions in Postgres. I apologize for not making that clear.

On 8/9/21 1:15 AM, Masih Tavassoli wrote:
> There are lots of decoders but I need to do it within postgresql.
> On Monday, August 9, 2021, 01:24:33 PM GMT+10, Mladen Gogala wrote:
>> You could probably use PL/Python. Python has the base64 module and the urllib module, which can deal with all URL strings I am aware of. PL/Perl would probably do as well.

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
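A minimal sketch of how that could look as a PL/Python function. This assumes the plpython3u extension is available; the function name is made up, and the signature of the token is NOT verified, only decoded:

```sql
CREATE EXTENSION IF NOT EXISTS plpython3u;

-- Decode the payload (second dot-separated segment) of a JWT.
CREATE OR REPLACE FUNCTION jwt_payload(token text) RETURNS jsonb
LANGUAGE plpython3u AS $$
import base64, json
part = token.split('.')[1]
# base64url data must be padded to a multiple of 4 before decoding
part += '=' * (-len(part) % 4)
return json.dumps(json.loads(base64.urlsafe_b64decode(part)))
$$;

-- Then individual claims can be extracted with the jsonb operators:
-- SELECT jwt_payload('eyJ...')->>'username';
```

Plain base64_decode, as in the Oracle example, can choke on base64url tokens ('-' and '_' instead of '+' and '/'), which is why urlsafe_b64decode is used here.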
Re: Serious List of PostgreSQL usage questions.
Responses in-line:

On 8/9/21 10:01 PM, A Z wrote:
> 1) Are there free scripts for CREATE TYPE (native type), more advanced sorts of types out there, online, free for commercial use? With function support, too? Can someone reply with a link or a suggestion?

ORAFCE? That extension has Oracle date/time functions.

> 2) How may I get PostgreSQL to output the create table statement(s) for one or more tables inside one database, without issuing instructions via the command line, but only inside a database login, as a query or pl/sql?

Postgres doesn't store DDL in the catalog; DDL has to be created from the catalog data. I would advise pg_tables, pg_attribute and pg_indexes. That shouldn't be too hard of an exercise with PL/pgSQL. If you want, you can play with the ddlx extension: https://pgxn.org/dist/ddlx/

> 3) I know that I can use COPY to import or export one database table between it and a *.csv file. Can I use it to do this with multiple TABLES and *.csv files specified in one COPY command, or not?

You cannot. COPY does only one table at a time. Of course, you can use "screen" and do multiple tables from multiple windows, or something similar.

> 4) In the absence of OS command line instructions, is there an internal PostgreSQL way, via COPY or another function for example, to backup an entire database, with all its create table statements and all insert statements, and any other associated object, in one hit? Or is this ill-advised?

The pg_dump utility will do that. The results should be interesting for a multi-TB database. BTW, any backup utility, even plain and simple pg_basebackup with --format=tar, will be able to restore and recover the database. You can always reconstruct the DDL with pg_dump --schema-only. PgAdmin4 will also provide the DDL on request.

> 5) When setting up communication to remote databases on remote machines, I need to use the OPTIONS() function. It seems to require, as its first parameter, the schema of the table (the second parameter) that it wants to access. Can I supply a null schema and still be able to reference the remote table, or must I also make use of IMPORT FOREIGN SCHEMA?

> 6) I have found that the native trigonometry functions, namely the radians versions, do produce error results around key trigonometry input values. I have discovered that these errors persist even if I cast the input parameter away from DOUBLE PRECISION and into DECIMAL. I would like to know if there are any freely available scripts out there that include arbitrary-precision mathematical function support that works on DECIMAL and not on DOUBLE PRECISION, and does not produce error values around key inputs? Could someone refer me to a website that has such a script?

PL/Python can use numpy: https://numpy.org

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
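As a rough illustration of the catalog-based approach to question 2, here is a sketch that rebuilds a bare column list from information_schema (constraints, defaults and indexes would need more work; the table name 'emp' is just an example):

```sql
SELECT 'CREATE TABLE ' || table_name || ' ('
       || string_agg(column_name || ' ' || data_type, ', '
                     ORDER BY ordinal_position)
       || ');'
FROM information_schema.columns
WHERE table_schema = 'public'
  AND table_name   = 'emp'
GROUP BY table_name;
```

The ddlx extension mentioned above does essentially this, but completely: constraints, indexes, ownership, comments and all.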
log_statement GUC parameter
Hi! Unfortunately, only a superuser can set log_statement='all'; Would it be possible to execute set session log_statement='all'; as an ordinary user? I am trying to execute it from login.sql, a part of login_hook extension which implements on-login triggers in PostgreSQL. I will create a procedure with security definer, owned by the role "postgres", and grant it to public. That should do the trick. However, it would be much nicer if PostgreSQL allowed me to set the parameter as a part of the normal session. The idea is to log all statements by the particular user, not by everybody. The user in question uses Weblogic 12.2.1.4 and creates a connection pool so I need to find out which statements are the longest running ones and make them perform. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
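The procedure I have in mind would look something like this (a sketch; the function name is made up, and it must be owned by a superuser for the trick to work):

```sql
-- SECURITY DEFINER: runs with the privileges of the owner (postgres),
-- so an ordinary user calling it can set the superuser-only GUC.
CREATE OR REPLACE FUNCTION enable_statement_logging() RETURNS void
    LANGUAGE plpgsql SECURITY DEFINER AS $$
BEGIN
    -- is_local = false: the setting persists for the whole session
    PERFORM set_config('log_statement', 'all', false);
END;
$$;

GRANT EXECUTE ON FUNCTION enable_statement_logging() TO PUBLIC;
```

Called from login.sql, this would turn on statement logging for just that user's sessions.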
Re: log_statement GUC parameter
Hi Bruce Yes, I can. I have already done so and it works. I wrote a procedure because of my Oracle background, but it doesn't really matter. This was just a suggestion for the session settable parameters. Regards On 8/12/21 4:23 PM, Bruce Momjian wrote: I think you can write a SECURITY DEFINER function that calls SET, call that function at login. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: log_statement GUC parameter
Thank you Tom! It turns out that the Oracle way of doing things (SECURITY DEFINER) was the wrong way here. Thanks a bunch.

On 8/12/21 4:37 PM, Tom Lane wrote:
> Actually, for that specific requirement, there's an easier way:
> ALTER USER target_user SET log_statement = 'all';
> While the target_user can't do that for himself, a superuser can.

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Cluster fencing tool/software for PostgreSQL cluster with streaming replication
Hi Vikas! I used to see you on the mailing list for another database, one that shall remain unnamed but starts with the letter "O". To answer your question: I have very good experience with Veritas Cluster on both Linux and Solaris (or was it Slowaris? I forget the exact spelling :). It's an industry-standard fail-over cluster and Veritas provides excellent support for it.

Regards

On 8/16/21 6:31 AM, Vikas Sharma wrote:
> Hi, I am planning an enterprise-grade PostgreSQL cluster and am looking for tools/software which will do the cluster management or fencing to avoid split-brain. Please could you let me know if there are tools/software which can do that, so it can be used in a production environment? PostgreSQL 12 with streaming replication + repmgr, and pgpool for connection management and load balancing.
> Regards, Vikas S

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Can't locate Ora2Pg.pm while executing export_schema.sh
You should locate Ora2Pg.pm using "find", like this:

[mgogala@umajor ~]$ find /usr/local -name Ora2Pg.pm
/usr/local/share/perl5/5.32/Ora2Pg.pm

and then make sure that your PERL5LIB contains that location:

export PERL5LIB=/usr/local/share/perl5/5.32:$PERL5LIB

That should do the trick. On most distributions, /usr/local/share/perl5/ is already included in PERL5LIB. Are you working on Cygwin? You can do a simple test:

[mgogala@umajor ~]$ perl -e 'use Ora2Pg; print "$Ora2Pg::VERSION\n";'
21.0

This is not related to PostgreSQL, this is a Perl question: https://gist.github.com/cgoldberg/4332167

On 8/17/21 3:15 AM, Pansara, Jiten wrote:
> Hello Team, I am getting the error below while executing export_schema.sh. Any suggestion/help is really appreciated.
> Can't locate Ora2Pg.pm in @INC (you may need to install the Ora2Pg module) (@INC contains: /usr/lib/perl5/site_perl /usr/share/perl5/site_perl /usr/lib/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib/perl5/core_perl /usr/share/perl5/core_perl) at /c/Strawberry/perl/site/bin/ora2pg line 27.
> BEGIN failed--compilation aborted at /c/Strawberry/perl/site/bin/ora2pg line 27.
> If we don't want to use this script, should we execute ora2pg directly to migrate the database from Oracle to Postgres?
> ora2pg -c ora2pg.conf
> Let me know your views.
> Jiten Pansara, Senior Database Analyst

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Serious Assistance with PostgreSQL True Infinite Arbitrary Precision Maths.
The answer for you is PL/Python3u. It can use numpy, which supports infinite-precision arithmetic and converting results to double and back. It can also do linked lists. Depending on your age and your preferences, you might try PL/Perl. I learned Perl in 1993 and have never stopped loving it. It has several modules for infinite-precision arithmetic; my favorite is Math::BigFloat. As for linked lists, there is the List::DoubleLinked module on CPAN. In other words, all the procedural extensions can be used to fulfill your requirements. BTW, one of my employers told me that Perl is verboten and that the company has standardized on Python, so I had to learn it. It's not too bad if you're into that sort of thing.

Regards

On 8/17/21 10:49 PM, A Z wrote:
> Dear pgsql-general,
> I'm after Infinite Precision Integer (ii) or Infinite Precision Rational (ir) type and function support for PostgreSQL, as an easy installation of two new native types, available under free and open source conditions for public, private and commercial free use. I am aware of the pgmp project. While I have contacted it and am awaiting their response, the present state of pgmp is not enough for my needs. PostgreSQL does not have a private doubly linked list available, and so I can't assemble all this for myself, given that I do not know enough about PostgreSQL extensions and the C language. What I am after is True Arbitrary Precision Arithmetic, not High Precision Arithmetic. What I seek would be contingent only upon available OS or hardware memory. I need arithmetic and comparison support on Integer and Rational (Decimal type) numbers: +, -, *, /, %, ==, !=, <>, >, <, >=, <=. Other important candidates include +=, -=, *=, /=, %=. The trick is that I also need full classical function support as well.
> What I need fairly well is:
>
> cast(ir) returns double precision;  cast(double precision) returns ir;
> cast(ir) returns real;              cast(real) returns ir;
> cast(ir) returns numeric;           cast(numeric) returns ir;
> cast(ir) returns decimal;           cast(decimal) returns ir;
> cast(ii) returns bigserial;         cast(bigserial) returns ii;
> cast(ii) returns serial;            cast(serial) returns ii;
> cast(ii) returns smallserial;       cast(smallserial) returns ii;
> cast(ii) returns bigint;            cast(bigint) returns ii;
> cast(ii) returns integer;           cast(integer) returns ii;
> cast(ii) returns smallint;          cast(smallint) returns ii;
>
> cast(text as ii) returns ii;        cast(text as ir) returns ir;
> cast(ir as text) returns text;      cast(ii as text) returns text;
> cast(ii as ir) returns ir;          cast(ir as ii) returns ii;
>
> sign(ir input) returns ir;
> abs(ir input) returns ir;
> pi(ii places) returns ir;
> e(ii places) returns ir;
> power(ir base, ir exponent) returns ir;
> sqrt(ir input) returns ir;
> nroot(ii theroot, ir input) returns ir;
> log10(ir input) returns ir;
> loge(ir input) returns ir;
> log2(ir input) returns ir;
> factorial(ii input) returns ii;
> degrees(ir input) returns ir;
> radians(ir input) returns ir;
>
> sind(ir input) returns ir;   cosd(ir input) returns ir;   tand(ir input) returns ir;
> asind(ir input) returns ir;  acosd(ir input) returns ir;  atand(ir input) returns ir;
>
> sinr(ir input) returns ir;   cosr(ir input) returns ir;   tanr(ir input) returns ir;
> asinr(ir input) returns ir;  acosr(ir input) returns ir;  atanr(ir input) returns ir;
>
> The last two sets of functions are forward and inverse trigonometry functions, in degrees and radians respectively. I also need the module to install easily: a Windows *.exe or *.msi, a Linux *.deb, *.rpm or *.bin. Is there someone or something out there that can get there?

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
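As an illustration of the PL/Python route, an arbitrary-precision square root via Python's standard decimal module might be sketched like this (the function name, the text return type, and the numeric interchange type are my assumptions, not an existing API):

```sql
CREATE EXTENSION IF NOT EXISTS plpython3u;

-- Square root computed to a caller-chosen number of significant digits.
CREATE OR REPLACE FUNCTION ap_sqrt(x numeric, digits int) RETURNS text
LANGUAGE plpython3u AS $$
from decimal import Decimal, getcontext
getcontext().prec = digits            # precision limited only by memory
return str(Decimal(str(x)).sqrt())
$$;

-- SELECT ap_sqrt(2, 50);
```

The same pattern works for power, logarithms and the rest of the wish list above; the precision is a parameter rather than being fixed by the storage type.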
Re: Licensing
For a six-pack of Sam Adams Boston Lager or Molson Canadian Lager per website, I hereby grant thee the right to use PostgreSQL with any customer, to your heart's content. Please note that I don't accept Budweiser or Miller products.

On 8/18/21 2:39 PM, Bryan Boone wrote:
> Hello Everyone. I have a question about licensing with PostgreSQL. I work for a small company and we design websites for our customers. I read the license that is listed here: https://www.postgresql.org/about/licence/ However, I am still confused. The small company I work for develops websites and applications on AWS EC2 instances for our customers, and we charge the customers a fee for hosting and maintaining their websites. I am trying to figure out which database software I can use and whether I need to buy licenses or not. We do not modify the PostgreSQL code and we do not redistribute the binaries. So far we are just running a single instance for each of our customers, and we are not utilizing anything like a cluster or high-availability CDN, other than what AWS EC2 provides. We always use Wordpress and Drupal for the site front-end. Can someone tell me if I am able to use PostgreSQL for the small company I work for? Is there a way to buy a commercial license of PostgreSQL if we are required to?
> Thanks

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Postgres Migration error - Need help
Why are you connecting to Postgres in the first place? Ora2Pg should connect to Oracle and produce a SQL script to re-create a PostgreSQL version of the Oracle schema. As for the error below, you're probably using an old version of DBD::Pg. Check it using CPAN and upgrade if necessary.

Regards

On 8/18/21 5:50 AM, Pansara, Jiten wrote:
> Hello Team, I am executing the command below to copy data but it is throwing an error. Can anyone help please?
> Jiten Pansara, Senior Database Analyst

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Make bloom extension trusted, but can not drop with normal user
Why do you think that a normal user should be allowed to drop extensions? Extensions are additions to the RDBMS functionality. Do you really want normal users to tinker with the database functionality and reduce the number of index types you can create? I am not sure that I am on board with that idea.

Regards

On 8/20/21 7:01 AM, Li EF Zhang wrote:
> So, I want to know if extensions in the contrib dir can be set "trusted" by ourselves? If we can, I think drop extension should be allowed for a normal user.

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re:
On 8/27/21 8:59 AM, Ravi Krishna wrote:
>> how to take incremental backup in postgresql windows machine.
> AFAIK PG has no concept of incremental backup. pgbackrest has, but I am not sure whether it runs on Windows.

pg_rman can do incremental backups. I think it does the same thing as Oracle without block change tracking enabled: it backs up all the files changed since the last full backup. However, that doesn't help you much because of vacuum, which can touch a file well after it was last referenced in a transaction. I like pg_rman because it can automatically delete old backups and keep only the 4 latest ones.

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Insert statement doesn't complete
On 8/28/21 11:40 PM, Trang Le wrote:
> Hi all, I am facing an issue when I run a script:
> INSERT INTO mdm.etl_exception_info SELECT * FROM mdm.etl_exception_info_vw2;

Is there any error message thrown? If there is, can you tell us? Is autocommit on? What does the log say?

> After running the script I check it in pg_lock, everything is ok, the job is done, data is inserted. However, the session containing the script above doesn't complete, though data is inserted and the job is done, as verified in pg_lock.

What does the phrase "the script above doesn't complete" mean? How do you know that the script doesn't complete? Is the script hanging? Furthermore, what is the software version? What platform are you using?

> Could you help me on this issue?

Insufficient data for a meaningful answer.

> Thanks all. Regards, Trang

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: database design with temporary tables
On 8/29/21 12:24 PM, Adrian Klaver wrote:
> Presumably not. Temporary tables only live at most for the length of a session. It would be a really bad idea to hold sessions open for 24 hours. That is assuming nothing else causes the session to drop and the data to be lost.

Well, that's precisely how application servers work. They allocate a connection pool and keep the connections open for a very long time.

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: database design with temporary tables
On 8/29/21 2:26 PM, Adrian Klaver wrote:
> The pool is maintained, the individual connections (sessions) come and go. Otherwise there would be no point to having a pool. Every time the connection (session) is closed, the temporary table disappears.

Yes, you're right. And that is very convenient. I also prefer temporary tables with ON COMMIT DELETE ROWS, so that I don't have leftovers between transactions. There is also the pgtt extension by Gilles Darold, which emulates Oracle global temporary tables: https://github.com/darold/pgtt#use-of-the-extension

I have never particularly liked Oracle's implementation of temporary tables, local or global, so I have no problem with questions like this. BTW, I have compared Oracle's implementation with other implementations here: https://dbwhisperer.wordpress.com/2019/03/18/private-temporary-tables-oracle-18c/ I could have used Postgres instead of SQL Server; the comparison would read the same, except for the "#" character which denotes a temporary table in SQL Server.

Regards

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
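The ON COMMIT behavior mentioned above, for reference (table and column names are just an example):

```sql
-- Rows vanish at the end of each transaction; the table definition
-- itself persists for the life of the session.
CREATE TEMPORARY TABLE scratch (
    id   integer,
    note text
) ON COMMIT DELETE ROWS;

BEGIN;
INSERT INTO scratch VALUES (1, 'visible only inside this transaction');
SELECT count(*) FROM scratch;   -- 1
COMMIT;

SELECT count(*) FROM scratch;   -- 0: the commit emptied the table
```

With pooled connections this matters: a session handed back to the pool carries no leftover rows into the next transaction.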
Re: How to observe plan_cache_mode transition from custom to generic plan?
> ... bind values and table statistics.

People switching from Oracle, me being one of them, frequently make the mistake of using bind variables in Postgres. Sometimes there is no choice, for instance if your app uses an application server and an ORM. Postgres doesn't have the memory structure called the "shared pool", and cursors and compiled procedures are not cached. That means that by using Postgres you are deprived of all the fun of managing cursor caching, the shared pool, database statistics and latches (don't ask, please). There is a small consolation: you get to keep a ton of money in your pocket. You also get a stable plan; there are no adaptive plans or cursor sharing. Franck Pachot, an Oracle ACE, has an excellent article on the subject: https://franckpachot.medium.com/postgresql-bind-variable-peeking-fb4be4942252

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: How to observe plan_cache_mode transition from custom to generic plan?
On 9/6/21 2:26 AM, Laurenz Albe wrote:
> "Bind variables" just being an Oraclism for parameters, it is *not* a mistake to use them in PostgreSQL.

Actually, it is a mistake, because they don't give you any performance benefit and can potentially worsen performance. There is no cursor sharing, and generic plans can be much worse than "custom" plans generated with the actual values. The only reason for using bind variables/parameters is to protect yourself from SQL injection. Creating SQL dynamically from input is the recipe for the "little Bobby Tables" situation: https://xkcd.com/327/

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
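The custom-versus-generic distinction is easy to observe with a prepared statement (a sketch; table and plan shapes depend on your data, and plan_cache_mode requires PostgreSQL 12 or later):

```sql
PREPARE by_dept(int) AS
    SELECT ename FROM emp WHERE deptno = $1;

-- With the default plan_cache_mode = auto, the first few executions
-- are planned with the actual parameter value ("custom" plan):
EXPLAIN EXECUTE by_dept(10);    -- expect: Filter: (deptno = 10)

-- Force a generic plan to see what you get without the value:
SET plan_cache_mode = force_generic_plan;
EXPLAIN EXECUTE by_dept(10);    -- expect: Filter: (deptno = $1)

RESET plan_cache_mode;
```

Comparing the two EXPLAIN outputs on a skewed column shows exactly when a generic plan is worse than one planned with the real value.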
Re: The tragedy of SQL
Replies in-line.

On 9/14/21 01:51, Guyren Howe wrote:
> They are making a decent decision. SQL is a *fucking terrible* language, which I don't blame them for not wanting to learn.

Based on what criteria?

> The whole industry, programming languages, infrastructure, everything would have developed differently if relations were a natural, pleasurable thing to use in any programming language. Like an Array, or a Hash.

There is nothing natural about either relations or arrays and hashes/dictionaries. Relations are a pretty literal implementation of basic set theory, and a decent understanding of basic set theory is a precondition for understanding SQL. Now, we can discuss whether a language implementing a mathematical theory is "good" or "bad", whatever "good" or "bad" mean in the given context. Historically, SQL is a good fit for the banking business and accounting, and that is why it is still around.

Another misconception about SQL is treating it as a general-purpose programming language. SQL is a data description language, nothing more, nothing less. It doesn't need loops, arrays, hashes or subroutines; its principal purpose is to describe a subset of data. Without SQL you would have to read all the data and filter out the unnecessary stuff yourself.

Furthermore, part of SQL are the so-called "ACID requirements": a transaction can only see data that was committed before the transaction began. Implementing ACID takes a lot of code; that's what MVCC is all about. However, that too has its origins in accounting. You cannot pay with money you don't have.

And the last thing about SQL is transaction management. Without relational databases and SQL, you would need a proprietary transaction manager, just like MongoDB. And MongoDB has a complex proprietary transaction manager and is losing market share.

So, to recapitulate:

* Declarative subset definition
* ACID consistency
* Transaction management
* Ideal fit for accounting

That is why SQL is still around.
And that is why we all live in a yellow subroutine (this reference is not for the millennials or younger). -- I'll speak the key, the whole key and nothing but the key, so help me Codd. Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: The tragedy of SQL
On 9/14/21 02:18, Rob Sargent wrote: All languages are fucking terrible. I like English. It's not very complex and it allows me to express myself very well. You should see my native tongue, Croatian language, from the group of Slavic languages. It's fucking terrible. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: The tragedy of SQL
On 9/14/21 16:07, Raymond Brinzer wrote:
> By analogy: Arabic and Roman numerals both describe the natural numbers. Hence, they have the same mathematical properties. Spending a little time doing algebra with Roman numerals should convince anyone, however, that how you express a concept matters a lot.

Analogy is not an accepted method of logical proof. I do agree that the style of expression matters; that is why we have literature. Saying "hey there" and "Friends, Romans, countrymen, lend me your ears" can have the same meaning in a given context, but the latter expression is much more poetic. As software engineers, we are very much opposed to poetry, especially those among us using Perl. However, the stated purpose of the SQL language is to describe sets and subsets; it is supposed to be verbose. Storage extensions are database-specific and exist for performance reasons.

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: The tragedy of SQL
On 9/14/21 15:57, Guyren Howe wrote: Verbosity. Redundancy. Lack of orthogonality. Resemblance to English. Verbosity is a feature, as is the resemblance to English. The language is meant to be understood by accountants. Once upon a time people were using something called "COmmon Business Oriented Language", which was also very verbose, for the same reason: it had to be understandable to business people. SQL was designed by people with a background in mathematics and is thus readily understandable to people with the same background. I have no problem with SQL. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: The tragedy of SQL
On 9/16/21 19:29, Guyren Howe wrote: Missing my original point here. The set theory is the _point_. SQL is a gargantuan distraction from using it efficiently. Imagine if COBOL was the only widely-available programming language with functions. You might use it, because functions are really great abstraction for programming. That wouldn’t mean that COBOL wasn’t an utterly awful language. SQL is like that, only for relations, sets and logic. I am probably a bit biased because I have only been using SQL for a few weeks now. I was the first person certified for teaching Oracle 6 internals in the EMEA region and I was a certified Oracle 5.1.22 DBA on VAX/VMS. So, by using SQL for several weeks now, I have grown attached to it and I like it. My second problem is my mathematical background. I have a BSc in mathematics and was quite well acquainted not only with basic set theory but also with things like Zermelo's axiom of choice, Zorn's lemma and the well-ordering theorem. I am still quite familiar with relational algebra: unions, intersections, Cartesian products, equivalence relations and the like. I think that SQL represents relational algebra quite well. It must be doing something right, because it has lasted so long. After all, it was created by a mathematician. There is a famous quote from the movie "Chinatown" saying that politicians, ugly buildings, and whores all get respectable if they last long enough. The same applies to computer languages. I love the smell of SQL in the morning. -- I'll speak the key, the whole key and nothing but the key, so help me Codd. Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Possibilities for optimizing inserts across oracle_fdw foreign data wrapper
On 9/19/21 06:28, Niels Jespersen wrote: Hello all We are often using the oracle_fdw to transfer data between Postgres (version 11+) and Oracle (version 18+). It works great. However I have a task at hand that requires inserting a few billion rows in an Oracle table from a Postgres query. insert into t_ora (a,b,c) select a,b,c from t_pg; This is driven from a plpgsql stored procedure, if that matters. I want to optimize the running time of this. But I am unsure of which, if any, possibilities there actually are. Reducing the number of network roundtrips is usually a good way to increase throughput. But, how do I do that? If I could make the Oracle insert direct load, that would usually also increase throughput. But, is that possible here? There are no constraints defined on the destination tables. Regards Niels Jespersen The problem with oracle_fdw is that the SQL is parsed on the Postgres side, not on the Oracle side. If it were parsed on the Oracle side, you could use the /*+ APPEND */ hint, which is essentially a direct-path insert. You will have to write a script in one of the scripting languages that utilizes array inserts, available with the instant client. Even the Oracle ODBC driver utilizes array inserts, as visible from the following article: https://dbwhisperer.wordpress.com/2020/11/21/pyodbc-fast_executemany-and-oracle-rdbms/ Unfortunately, the Postgres side of the equation does not do particularly well with array operations when trying to cut down on the number of network round trips: https://github.com/mkleehammer/pyodbc/wiki/Driver-support-for-fast_executemany I would read the data with a script on the Postgres side and then use the superior array options provided by SQL*Net on the Oracle side. You will need some fancy programming to prevent waiting on each operation. I would actually write 2 scripts: one reading data from Postgres and converting it to CSV, piping into a second one that inserts the data into Oracle. 
That would make the scripts work in parallel, at least partially. Situations like this are the reason why a DBA needs to know how to script. So, this is where you start: https://python.swaroopch.com/ Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
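The round-trip arithmetic behind the advice above can be sketched in Python. This is a minimal illustration of the batching idea, not the original poster's code; the database calls shown in comments (psycopg2 on the Postgres side, python-oracledb's `executemany` on the Oracle side) are assumptions, and only the generic batching logic is shown runnable.

```python
# Sketch: ship rows in fixed-size batches instead of one network
# round trip per row. The batching helper is generic; real database
# connections are only indicated in comments.

from itertools import islice

def batches(rows, size):
    """Yield lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

# With real connections it would look roughly like this
# (hypothetical cursor names, not from the original post):
#
#   pg_cur.execute("SELECT a, b, c FROM t_pg")
#   while True:
#       chunk = pg_cur.fetchmany(10_000)          # one round trip per chunk
#       if not chunk:
#           break
#       ora_cur.executemany(
#           "INSERT INTO t_ora (a, b, c) VALUES (:1, :2, :3)", chunk)

if __name__ == "__main__":
    rows = [(i, i * 2, i * 3) for i in range(25)]
    print([len(c) for c in batches(rows, 10)])   # batch sizes: [10, 10, 5]
```

With a batch size of 10,000, a billion rows become roughly 100,000 round trips instead of a billion, which is where most of the speedup comes from.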
Re: Faster distinct query?
On 9/22/21 16:20, David G. Johnston wrote: I'd probably turn that index into a foreign key that just ensures that every (station,channel) that appears in the data table also appears on the lookup table. Grouping and array-ifying the lookup table would be trivial. Either modify the application code or add a trigger to populate the lookup table as needed. I fully agree with this. Adding a trigger to populate a lookup table is a standard design in situations like this. Using "DISTINCT" almost always spells trouble for performance. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
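A trigger of that kind could look roughly like the following sketch. The table and column names (`data_table`, `station`, `channel`) are hypothetical, following the thread's example; this assumes PostgreSQL 11+ for the `EXECUTE FUNCTION` syntax.

```sql
-- Lookup table of distinct (station, channel) pairs, kept in sync
-- by a trigger so that DISTINCT over the big table is never needed.
CREATE TABLE station_channel (
    station text NOT NULL,
    channel text NOT NULL,
    PRIMARY KEY (station, channel)
);

CREATE FUNCTION maintain_station_channel() RETURNS trigger AS $$
BEGIN
    INSERT INTO station_channel (station, channel)
    VALUES (NEW.station, NEW.channel)
    ON CONFLICT DO NOTHING;   -- pair already known: no-op
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_station_channel
    BEFORE INSERT ON data_table
    FOR EACH ROW EXECUTE FUNCTION maintain_station_channel();
```

The "distinct query" then becomes a trivial scan of the small lookup table instead of an aggregate over the full data table.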
Re: Force re-compression with lz4
On 10/18/21 01:07, Michael Paquier wrote: CPU-speaking, LZ4 is *much* faster than pglz when it comes to compression or decompression with its default options. The compression ratio is comparable between both, still LZ4 compresses on average less than PGLZ. -- Michael LZ4 works much better with deduplication tools like Data Domain or Data Domain Boost (client-side deduplication). With zip or gzip compression, deduplication ratios are much lower than with LZ4. Most modern backup tools (DD, Veeam, Rubrik, Commvault) support deduplication. The LZ4 algorithm uses less CPU than zip, gzip or bzip2 and works much better with the deduplication algorithms employed by backup tools. This is actually a very big and positive change. Disclosure: I used to work for Commvault as a senior PS engineer. Commvault was the first tool on the market to combine LZ4 and deduplication. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Force re-compression with lz4
On 10/18/21 11:01, Adrian Klaver wrote: Not sure how much this applies to the Postgres usage of lz4. As I understand it, this is only used internally for table compression. When using pg_dump compression, gzip is used. Unless you pipe plain text output through some other program. This applies when using a 3rd-party commercial backup tool with deduplication. You'd be surprised how many people do that. One tool to backup them all, one tool to find them and on the LTO cartridge backup them. I apologize for this cheesy paraphrase of Tolkien, but I couldn't resist. Long story short, for the 3rd-party backup tools LZ4 compression will yield better deduplication ratios than other forms of compression, thereby saving you space. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: WAL streaming and dropping a large table
On 10/21/21 16:23, Rory Falloon wrote: Hi, My postgres server setup is thus: a production database, which is using WAL streaming (hot standby) to four other hosts of various latencies. I have a table that is ~200GB which has been backed up, and now I want to remove it. The table is not in use; it is a child table to a parent table that _is_ in use, but I foresee no issues here. The total DB size is roughly 300GB. The main reason for needing to remove it is to recover the space back on the partition which is humming away at 88% usage, and the table I wish to drop is better off in an archive somewhere. I just removed around 10% of it with 'delete from', which of course cascaded to the replication hosts. This increased the size of my pg_xlog folder (from ~3GB to ~6.5GB) which of course increased my partition usage. Obviously this is expected, but I wonder what would happen if I had issued the 'drop table'? I expect the nature of enabling max_replication_slots would mean the database would retain the wal segments until all have caught up; it could take quite a long time to 'catch up' after the 'drop table' command? It took about 10 minutes before the pg_xlog folder size had 'settled down' to what I normally see as default which is ~3GB. > wal_keep_segments is defined as 128, > max_wal_size is not defined, > max_replication_slots is enabled I'd prefer to use 'drop table' as it would recover the space immediately, but not if it will impact the production database in a way that the partition will become full, which defeats the purpose. Is it advisable to move the pg_xlog folder to another volume on the system with more space (which I have) and symlink - obviously a stop & start of the db required - and then let the WAL archives fill up as need be? Or am I missing something obvious (likely) Thanks Well, the difference between "drop table" and "delete" is that "drop table" is a transaction on the data dictionary. 
PostgreSQL has transactional DDL, which means that "drop table" can be rolled back, so the transaction has to be protected by WAL, but it is just the log of the dictionary tables. There is a third option, "truncate", which is also DDL: it creates a new empty table with the same description and discards the old files. Neither "drop" nor "truncate" will generate much WAL. On the other hand, "delete" is a regular DML transaction which will generate the logs needed to recover the 200GB table. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
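The three options can be put side by side in SQL. This is an illustrative sketch with a hypothetical `big_table`, not the poster's schema:

```sql
-- DML: row-level change, fully WAL-logged, one record per affected row.
DELETE FROM big_table WHERE created < now() - interval '1 year';

-- DDL: swaps in new, empty files; WAL-logs only the catalog change.
TRUNCATE big_table;

-- DDL: removes the catalog entry and the files. It is transactional,
-- so it can still be rolled back before COMMIT:
BEGIN;
DROP TABLE big_table;
ROLLBACK;   -- the table is back
```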
Re: ZFS filesystem - supported ?
On 10/23/21 07:29, Laura Smith wrote: Hi, Given an upcoming server upgrade, I'm contemplating moving away from XFS to ZFS (specifically the ZoL flavour via Debian 11). BTRFS seems to be falling away (e.g. with Redhat deprecating it etc.), hence my preference for ZFS. However, somewhere in the back of my mind I seem to have a recollection of reading about what could be described as a "strong encouragement" to stick with more traditional options such as ext4 or xfs. A brief search of the docs for "xfs" didn't come up with anything, hence the question here. Thanks ! Laura Hi Laura, May I ask why you would like to change file systems? Probably because of the snapshot capability? However, ZFS performance leaves much to be desired. Please see the following article: https://www.phoronix.com/scan.php?page=article&item=ubuntu1910-ext4-zfs&num=1 This is relatively recent, from 2019. On page 3 there are tests with SQLite, Cassandra and RocksDB; Ext4 is much faster in all of them. Finally, there is another article about relational databases and ZFS: https://blog.docbert.org/oracle-on-zfs/ In other words, I would test very thoroughly because your performance is likely to suffer. As for the supported part, that's not a problem: Postgres supports all modern file systems, using POSIX system calls to manipulate, read and write files. Furthermore, if you need snapshots, disk arrays like NetApp, Hitachi or EMC can always provide them. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: ZFS filesystem - supported ?
On 10/23/21 09:37, Laura Smith wrote: Hi Mladen, Yes indeed, snapshots is the primary reason, closely followed by zfs send/receive. I'm no stranger to using LVM snapshots with ext4/xfs but it requires a custom shell script to manage the whole process around backups. I feel the whole thing could well be a lot cleaner with zfs. Thank you for the links, I will take a look. Laura Yes, ZFS is extremely convenient. It's a volume manager and a file system, all rolled into one, with some additional convenient tools. However, performance is a major concern. If your application is OLTP, ZFS might be a tad too slow for your performance requirements. On the other hand, snapshots can save you a lot of time with backups, especially if you have a commercial backup tool capable of multiple readers. The only way to find out is to test. The ideal tool for testing is pgio: https://kevinclosson.net/2019/09/21/announcing-pgio-the-slob-method-for-postgresql-is-released-under-apache-2-0-and-available-at-github/ For those who do not know, Kevin Closson was the technical architect who built both Exadata and EMC XtremIO. He is now the principal engineer of Amazon RDS. This part is intended only for those who would tell him that "Oracle has it" is not good enough, if he ever decided to post here. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: ZFS filesystem - supported ?
On 10/23/21 23:12, Lucas wrote: I don't know where you have your database deployed, but in my case it is in AWS EC2 instances. The way I handle backups is at the block storage level, performing EBS snapshots. Yes, Amazon uses SAN equipment that supports snapshots. This has proven to work very well for me. I had to restore a few backups already and it always worked. The bad part is that I need to stop the database before performing the Snapshot, for data integrity, so that means that I have a hot-standby server only for these snapshots. Lucas Actually, you don't need to stop the database. You need to execute pg_start_backup() before taking a snapshot and then pg_stop_backup() when the snapshot is done. You will need to recover the database when you finish the restore, but you will not lose any data. I know that pg_start_backup() and pg_stop_backup() are deprecated, but since PostgreSQL doesn't have any API for storage or file system snapshots, that's the only thing that can help you use storage snapshots as backups. To my knowledge, the only database that does have an API for storage snapshots is DB2. The API is called "Advanced Copy Services" or ACS. It's documented here: https://www.ibm.com/docs/en/db2/11.1?topic=recovery-db2-advanced-copy-services-acs For Postgres, the old start/stop backup functions should be sufficient. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
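In SQL terms, the procedure looks roughly like this. The label is hypothetical, and this is the pre-PostgreSQL-15 syntax (the functions were later renamed pg_backup_start()/pg_backup_stop() and the exclusive mode removed); as noted further down the thread, the WAL generated between the two calls must also be archived for the snapshot to be restorable.

```sql
-- Put the cluster into backup mode (second argument requests an
-- immediate checkpoint):
SELECT pg_start_backup('ebs-snapshot', true);

-- ... take the storage/EBS snapshot of the data volume here ...

-- Leave backup mode:
SELECT pg_stop_backup();
```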
Re: ZFS filesystem - supported ?
On 10/25/21 13:13, Stephen Frost wrote: No, it's not - you must also be sure to archive any WAL that's generated between the pg_start_backup and pg_stop_backup and then to be sure and add into the snapshot the appropriate signal files or recovery.conf, depending on PG version, to indicate that you're restoring from a backup and make sure that the WAL is made available via restore_command. Just doing start/stop backup is *not* enough and you run the risk of having an invalid backup or corruption when you restore. If the entire system is on a single volume then you could possibly just take a snapshot of it (without any start/stop backup stuff) but it's very risky to do that and then try to do PITR with it because we don't know where consistency is reached in such a case (we *must* play all the way through to the end of the WAL which existed at the time of the snapshot in order to reach consistency). In the end though, really, it's much, much, much better to use a proper backup and archiving tool that's written specifically for PG than to try and roll your own, using snapshots or not. Thanks, Stephen Stephen, thank you for correcting me. You, of course, are right. I had erroneously thought that backup of the WAL logs was implied, because I always back them up. And yes, that needs to be made clear. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: ZFS filesystem - supported ?
On 10/25/21 15:43, E-BLOKOS wrote: what about BTRFS since it's the successor of ZFS? BTRFS is NOT the successor to ZFS. It never was. It was a completely new file system developed by Oracle Corp. For some reason, Oracle seems to have lost interest in it. Red Hat has deprecated it and, in all likelihood, BTRFS will go the way of Solaris and SPARC chips: riding into the glorious history of computer science. BTRFS has never been widely used, not even among Fedora users like me; it suffered from corruption and performance problems. This is probably not the place to discuss the inner workings of snapshots, but it is worth knowing that snapshots drastically increase the IO rate on the file system - for every snapshot. That's where the slowness comes from. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: ZFS filesystem - supported ?
On 10/26/21 05:35, Laura Smith wrote: Curious, when it comes to "traditional" filesystems, why ext4 and not xfs ? AFAIK the legacy issues associated with xfs are long gone ? XFS is no longer being very actively developed. Ext4 is, and it has some features to help with SSD space allocation. Phoronix has some very useful benchmarks: https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.14-File-Systems Ext4 is much better than XFS in the SQLite tests and almost equal in the MariaDB test. PostgreSQL is a relational database (let's forget the object part for now) and its IO patterns will be similar to those of SQLite and MariaDB. That benchmark is brand new, done on kernel 5.14. Of course, the only guarantee is doing your own benchmark, with your own application. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: ZFS filesystem - supported ?
On 10/26/21 20:12, E-BLOKOS wrote: RedHat and Oracle are mostly maintaining XFS updates, and I didn't see anything saying it's not maintained actively, especially when they're offering many solutions with XFS as default Oh, they are maintaining it, all right, but they're not developing it. XFS is still the file system for rotational disks with platters, reading heads, tracks and sectors, the type we were taught about in school. Allocation policy for SSD devices is completely different, as are the physical characteristics. Ext4 is being adjusted to the ever more popular NVMe devices. XFS is not. In the long run, my money is on Ext4 or its successors. Here is another useful benchmark: https://www.percona.com/blog/2012/03/15/ext4-vs-xfs-on-ssd/ This one is a bit old, but it shows a clear advantage for Ext4 in async mode. I may be wrong, but neither of the two file systems has gained many fundamentally new features since 2012. The future may lie in F2FS ("Flash Friendly File System"), which is very new but has a ton of optimizations for SSD devices. Personally, I usually use XFS for my databases, but I am testing Ext4 with Oracle 21c on Fedora. So far, the difference is imperceptible. I am primarily an Oracle DBA and I am testing with Oracle; that doesn't necessarily have to be pertinent for Postgres. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: ZFS filesystem - supported ?
On 10/26/21 20:50, Imre Samu wrote: > Phoronix has some very useful benchmarks: > https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.14-File-Systems > Ext4 is much better than XFS with SQLite tests and almost equal with > MariaDB test. PostgreSQL is a relational database (let's forget the > object part for now) and the IO patterns will be similar to SQLite and > MariaDB. There is a link from the Phoronix page to the full OpenBenchmarking.org result file, and multiple PostgreSQL 13 pgbench results are included: https://openbenchmarking.org/result/2108260-PTS-SSDS978300&sor&ppt=D&oss=postgres ( XFS, F2FS, EXT4, BTRFS ) Regards, Imre Wow! That is really interesting. Here is the gist of it: XFS is the clear winner. It also answers the question about BTRFS. Thanks Imre! -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Incremental backup
On 10/28/21 18:07, Andreas Joseph Krogh wrote: I think everybody agrees that incremental backup /per database/, and not cluster-wide, is nice, and it would be nice if PG supported it. But, given the way PG is architectured, having cluster-wide WALs, that's not an easy task to implement. Repeating "other databases have it" doesn't change that. -- Andreas Joseph Krogh I am not advocating for database-level incremental backups, but all databases that have them also have cluster-wide/instance-wide WAL logs. Cluster-wide WAL logs do not make database-level incremental backups hard. Both Oracle and DB2 have database-level incremental backups and both have cluster-wide WAL (redo logs). So does SQL Server, which is a bit specific because its log is used both for recovery, as WAL, and for undo, so it's very different from Postgres. Also, SQL Server is, to my knowledge, the only one of those databases which doesn't use MVCC. The technology that all of those databases employ is a bitmap with one bit per database block. Full backups set all bits to 0, and whenever a block is modified, the corresponding bit is set to 1. The backup tool in incremental mode then copies only the blocks whose bit is 1. I am not too thrilled by that implementation. In particular, with Oracle there were bugs with database restore, the "duplicate database" operation, global cache locks and instance latches. I've had quite a few headaches with RMAN cumulative incremental backups. My preferred method of backup is a storage snapshot. Snapshots can then be backed up to other arrays (NetApp SnapVault, Hitachi HUR, EMC SRDF) or to deduplicated offline storage like AWS Glacier or EMC Data Domain using simple file-level utilities. Once a snapshot is taken, it is a read-only file system and the files are no longer open in the database processes. The classic file-level backup tools like pg_basebackup or rman are pretty much unusable once your database hits 50 TB or so. 
With a 50TB database, your RTO (Recovery Time Objective) will be in days. You can only push around 3TB/hour down a 10Gbit Ethernet pipe. On the other hand, companies like Facebook, Amazon, Walmart or Target are losing hundreds of thousands of USD per hour of downtime. Downtime of 15 hours or longer is completely unacceptable. Backup is only the last line of defense. It should only be used if everything else fails. And if you have to use a file-level tool like pg_basebackup and tar to restore your VLDB, you should also get your resume ready. Incremental backups are awkward and ungainly. You have to restore the full backup and all incrementals taken since the last full; after that, you must apply the remaining WAL files. So restoring an incremental backup is, by definition, slower than restoring a full backup. If you have to take backups, get yourself a deduplicating backup utility. Deduplication will only save the backup blocks that are different from the previous backup. In other words, you can do a daily full backup with the space expenditure of an incremental backup. Also, if you need backups for regulatory reasons (HIPAA, SOX), incremental backups will not do. Long story short, there are much more important things to do on Postgres than incremental database-level backups. BTW, Chuck Norris can take an incremental database-level backup of a Postgres database and recover the database without restoring the preceding full backup. I am not sure whether Chuck Norris is a member of the Postgres steering committee. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Incremental backup
On 10/29/21 03:30, Peter J. Holzer wrote: I was not aware that Oracle even has something equivalent to a Postgres cluster. hp It's called "Oracle instance". Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Incremental backup
On 10/29/21 08:49, Peter J. Holzer wrote: I don't think that's equivalent. An Oracle instance is a runtime concept (the collection of server processes on a single machine serving a single database (the collection of files on the disk)) whereas a PostgreSQL cluster is both a data and a runtime concept (config + data files for several databases with some shared data + the processes serving them). Also instance:database is n:1 while cluster to database is 1:n. Very different. Peter, an Oracle instance manages a collection of databases and ensures recoverability using redo logs, which are completely analogous to WAL logs, if managed a bit differently. Let's not nitpick here. An Oracle instance is completely analogous to a Postgres cluster. If you ask me, the word "cluster" was picked to avoid the word "instance". -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Incremental backup
On 10/29/21 09:43, Peter J. Holzer wrote: On 2021-10-29 09:03:04 -0400, Mladen Gogala wrote: On 10/29/21 08:49, Peter J. Holzer wrote: I don't think that's equivalent. An Oracle instance is a runtime concept (the collection of server processes on a single machine serving a single database (the collection of files on the disk)) whereas a PostgreSQL cluster is both a data and a runtime concept (config + data files for several databases with some shared data + the processes serving them). Also instance:database is n:1 while cluster to database is 1:n. Very different. Peter, Oracle instance manages collection of the databases and is ensuring recoverabilty using redo logs, which are completely analogous to WAL logs, if managed a bit differently. Let's not be nitpicking here. Oracle instance is completely analogous to Postgres cluster. If you ask me, the word cluster was picked to avoid the word "instance" I'm quoting Tom Kyte here: | In fact, it is true to say that an instance will mount and open at | most a single database in its entire lifetime! While that article is originally from 2009, it was last changed in 2021, and I'd trust Tom to change something as fundamental if it wasn't true anymore. hp A few years ago, in July 2013, Oracle introduced version 12c with the multi-tenant option. http://appstech.com/2013/08/oracle-announces-general-availability-of-oracle-database-12c-the-first-database-designed-for-the-cloud/ Each Oracle instance can now manage up to 255 pluggable databases which are, for all intents and purposes, equivalent to Postgres databases, if somewhat more awkward. The "unplug" and "plug in" operations leave a lot of room for improvement. That makes an Oracle instance functionally equivalent to the Postgres cluster. The current Oracle version, 21c, no longer supports the flat architecture without pluggable databases. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: ZFS filesystem - supported ?
On 11/1/21 17:58, Stephen Frost wrote: Well, at least one alternative to performing these snapshots would be to use a tool like pg_basebackup or pgbackrest to perform the backups instead. File system based backups are much slower than snapshots. The feasibility of a file-based utility like pg_basebackup depends on the size of the database and the quality of the infrastructure. However, if opting for snapshot-based solutions, I would advise something like Pure or NetApp which use special hardware to accelerate the process and have tools to back up snapshots, like SnapVault (NetApp). Also, when using file-level utilities, I would go full commercial. Commercial backup utilities are usually optimized for speed, support deduplication and maintain a backup catalog, which can come in handy if there are regulatory requirements about preserving your backups (HIPAA, SOX). -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Postgres Equivalent of Oracle Package
On 11/16/21 12:23, DAVID ROTH wrote: One of the nice things about Oracle packages is that the code is loaded and global values are set and stored only once per session. This is very useful for values that are used repeatedly. What is the best way of emulating this behavior in Postgresql? You can create a temporary table with "ON COMMIT PRESERVE ROWS", which is visible in a session and cannot cross the session boundary. Values in the table can be queried throughout the session. That is a pretty good approximation of the session variables in Oracle. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
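The temporary-table approach could look roughly like this. The table and variable names are hypothetical; `ON COMMIT PRESERVE ROWS` is actually the default for temporary tables, but spelling it out documents the intent:

```sql
-- Per-session key/value store standing in for Oracle package variables.
CREATE TEMPORARY TABLE session_vars (
    name  text PRIMARY KEY,
    value text
) ON COMMIT PRESERVE ROWS;

-- Set (or overwrite) a "session variable":
INSERT INTO session_vars VALUES ('batch_id', '42')
ON CONFLICT (name) DO UPDATE SET value = EXCLUDED.value;

-- Any later statement in the same session can read it back;
-- other sessions never see it:
SELECT value FROM session_vars WHERE name = 'batch_id';
```

The rows survive commits but disappear when the session ends, which mirrors the lifetime of Oracle package state.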
Re: Database Scalability
On 11/30/21 22:08, Saurav Sarkar wrote: Hi Ben, Thanks a lot for your reply. So are all the schemas on one DB or are distributed/sharded across multiple DBs ? Best Regards, Saurav To my knowledge PostgreSQL doesn't support sharding, which is well and good because sharding is mostly useless, at least in my opinion. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT
On 12/6/21 22:16, Laurenz Albe wrote: An ugly alternative is to use "ORDER BY id + 0", which prevents PostgreSQL from using the index. That was actually the earliest form of Oracle hints. I remember doing exactly that in Oracle 5.1.22 on VAX/VMS. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
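The trick works because the planner can only use an index for `ORDER BY` when the sort key is the bare indexed column. A sketch with a hypothetical table `t` and indexed primary key `id`:

```sql
-- The planner can satisfy this ORDER BY from the index on id, which
-- with a small LIMIT may tempt it into walking the index instead of
-- using a better filtering index:
SELECT * FROM t ORDER BY id LIMIT 10;

-- Wrapping the column in a no-op expression hides the index from the
-- sort, steering the planner toward a different plan:
SELECT * FROM t ORDER BY id + 0 LIMIT 10;
```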
Re: Reindex "locked" standby database
On 12/14/21 22:37, Michael Paquier wrote: You are referring to the startup process that replays WAL, right? Without having an idea about the type of workload your primary and/or standbys are facing, as well as an idea of the configuration you are using on both (hot_standby_feedback for one), I have no direct idea, but that could be a conflict caused by a concurrent vacuum. Hi Michael, I am preparing for a standby deployment. I don't have a standby yet and, therefore, I don't have any standby problems. Would it be advisable to turn vacuum off on the standby? Applying WAL will also, in theory, populate the statistics, which are also held in database blocks. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Packages, inner subprograms, and parameterizable anonymous blocks for PL/pgSQL
On 12/16/21 16:48, Pavel Stehule wrote: I don't think Postgres needs packages - this is a redundant concept in Postgres, when Postgres has schemas (different from Oracle's schemas) and extensions. Yes, ORAFCE uses schemas as the package names. However, one very practical thing is missing: session variables. Yes, you can emulate those with ON COMMIT PRESERVE ROWS temporary tables, but that's a rather ugly hack. On the other hand, packages can easily be emulated by using Python. Having packages would make PL/pgSQL programming much prettier. It would be much prettier to group related routines into a package than to have them lying around without anything indicating that the routines are related. On the plus side, packages would make it much easier to migrate from Oracle to Postgres. And you do want that, don't you? I am very well aware of the animosity toward the Oracle community and the reasons for that animosity. Oracle wiped the floor with the predecessor of Postgres, the database called "Ingres". Sandra Kurtzig, the Ingres CEO at the time, and Michael Stonebraker were both involved in a very public spat with Larry Ellison. Stonebraker is still very much a factor in the Postgres community and I doubt that his feelings toward Larry and his company have grown any warmer over the years. Postgres was created with the idea of fighting the Dark Lord in the land of Belmont, CA. However, if it is the goal of the community to have users convert from Oracle to Postgres, then making that conversion easy would be conducive to that goal. A long time ago a former pastry baker wrote the following article: https://www.toolbox.com/tech/data-management/blogs/why-postgresql-doesnt-have-query-hints-020411/ Disclosure: I am the Oracle DBA that Josh Berkus is talking about in that article. Recent events with "advice" to a very well known Oracle personality testify to the fact that the emotions haven't changed much since that article was written. 
Of course, Oracle is still the most popular database in the world. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: List all tables from a specific database
There is also INFORMATION_SCHEMA which is SQL standard and extremely easy to use. On Fri, Jan 14, 2022, 1:08 PM Thomas Boussekey wrote: > > > Le ven. 14 janv. 2022 à 12:04, Ray O'Donnell a écrit : > >> On 14/01/2022 10:39, Flaviu2 wrote: >> > I work on a project that use Postgre SQL, and I have 0 experience in >> > this regard. I hope to get help here. I need two SQL scripts for >> Postgre: >> > >> > 1. Get all databases, which I aquired already: >> > >> > *SELECT datname FROM pg_database WHERE datistemplate = false* >> > >> > This one is functional, it's ok. >> > >> > 2. Now, I need to find all tables *under a specific* database. This one >> > I don't know how to achieve it. Can you help me here ? It is possible ? >> > >> If it's any help, running psql with the -E switch will show you the SQL >> which psql generates... so then (within psql) issuing the \dt command to >> list tables will show you the SQL used. >> >> Ray. >> >> Hello, > You can use this SQL command to view all the standard table (known as > relations): > > SELECT relname FROM pg_class WHERE relkind = 'r'; > > And this one for partitioned tables (if you use them): > > SELECT relname FROM pg_class WHERE relkind = 'p'; > > HTH, > Thomas > > >> -- >> Raymond O'Donnell // Galway // Ireland >> r...@rodonnell.ie >> >> >>
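For example, a standard-conforming query for the ordinary tables of the current database might look like this (filtering out the system schemas):

```sql
SELECT table_schema, table_name
FROM information_schema.tables
WHERE table_type = 'BASE TABLE'
  AND table_schema NOT IN ('pg_catalog', 'information_schema')
ORDER BY 1, 2;
```

Because INFORMATION_SCHEMA is in the SQL standard, nearly the same query works on other databases as well.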
Re: Assistance with an out of shared memory error
On 1/13/22 18:35, Tom Lane wrote: .. or else reduce the number of partitions you're using. (People frequently seem to think that more partitions are free. That is extremely not true. I generally think that if you're using more than a few dozen partitions per table, you are making a mistake.) Interesting. What resources do partitions consume? Is it documented anywhere? Why do you think that more than a few dozen partitions is a mistake? Thanks in advance for your explanation. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: WAL Archiving and base backup
On 1/14/22 16:00, David G. Johnston wrote: I still don't really understand what is so great about it. About its only redeeming feature is a declaration that "it is in core" and that newcomers can just default to it without thinking. I'd rather just play favorites and write "use pgbackrest" in our documentation. Or some hybrid approach where we don't just pick one but instead guide people to the community solutions that are out there. I don't think I really want the people responsible for core to spend time on writing end-user backup tooling. Their time is much more valuably spent working on the core product. David J. Well, the "without thinking" part of your post can be rephrased as "ease of use". Do database administrators really need to think about which backup software to use? What kind of knowledge will such an evaluation provide? All commercial databases have some form of backup software included with the core database. After all, backup and restore are extremely important functions which IMHO should be provided along with the database software. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: WAL Archiving and base backup
On 1/16/22 13:12, Issa Gorissen wrote: Just to avoid any misunderstanding. I am perfectly happy using the backup/restore with pg_dump and we use it for at least a decade with success for our need of backups. I am using pgbackrest. I used to use pg_rman, being well acquainted with the rman backup tool belonging to another database system, but pg_rman was removing WAL archives before they were delivered to the replica. I have replication slots from the OLTP database to the DW database with partitions, hash and bloom indexes and a massive number of processors for parallel query. I need the replication to work. In addition to that, pgbackrest can employ parallelism while pg_rman cannot. I find pgbackrest to be by far the best Postgres backup system. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
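For illustration, the parallelism mentioned above is controlled by pgbackrest's process-max option; a minimal configuration sketch (stanza name and paths are made up):

```ini
# /etc/pgbackrest/pgbackrest.conf
[global]
repo1-path=/var/lib/pgbackrest
process-max=4                      # number of parallel backup/restore processes

[demo]
pg1-path=/var/lib/pgsql/13/data    # data directory of the cluster to back up
```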
Could not serialize access due to concurrent update
In this post, I am not asking a question, I am sharing an experience. The application is running on Linux, PostgreSQL 13.5, using the Websphere 9 application server. When using the "SKIP LOCKED" option, I suddenly started seeing errors like "Could not serialize access due to concurrent update". After some reading, the problem was pinpointed to the transaction isolation level: https://pganalyze.com/docs/log-insights/app-errors/U138

OK, the app is not setting the transaction isolation level to repeatable read, so what's going on? The documentation for Websphere reveals the truth: https://www.ibm.com/support/pages/transaction-isolation-levels-and-websphere-application-server

*If you do not specify the isolation level* The product does not require you to set the isolation level on a data source resource reference for a non-CMP application module. If you do not specify isolation level on the resource reference, or if you specify TRANSACTION_NONE, the WebSphere Application Server run time uses a default isolation level for the data source. Application Server uses a default setting based on the JDBC driver. *For most drivers, WebSphere Application Server uses an isolation level default of TRANSACTION_REPEATABLE_READ.* (Bold font is my addition)

Fortunately, the same document explains how to set the transaction isolation level to READ COMMITTED for the Websphere data source. No wonder that IBM stands for "It's Better Manually". Be vewy, vewy cawefull when using Websphere and PostgreSQL. Here is how to deal with the problem:

Possible values | JDBC isolation level         | DB2 isolation level
8               | TRANSACTION_SERIALIZABLE     | Repeatable Read (RR)
4 (default)     | TRANSACTION_REPEATABLE_READ  | Read Stability (RS)
2               | TRANSACTION_READ_COMMITTED   | Cursor Stability (CS)
1               | TRANSACTION_READ_UNCOMMITTED | Uncommitted Read (UR)
0               | TRANSACTION_NONE             | No Commit (NC)

*Note:* If TRANSACTION_NONE is used, the DB file does not have to be journaled.

To define this custom property for a data source, you should do the following:
1. Click *Resources* > *JDBC provider* > */JDBC_provider/*.
2. Click *Data sources* in the Additional Properties section.
3. Click the name of the data source.
4. Click *Custom properties*.
5. Create the webSphereDefaultIsolationLevel custom property:
   a. Click *New*.
   b. Enter *webSphereDefaultIsolationLevel* for the name field.
   c. Enter one of the "possible values" from the table above in the value field, i.e. 0, 1, 2, 4, or 8.

The value that needs to be entered is 2.

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
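A quick way to verify which isolation level the application connections actually end up with is to check from a database session itself:

```sql
SHOW transaction_isolation;                       -- e.g. "read committed"
-- or, equivalently:
SELECT current_setting('transaction_isolation');
```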
Re: PostgreSQL Management and monitoring tool
On 1/26/22 22:24, Lucas wrote: Thanks… But I was just hoping that someone would share their solution in a more depth way… For example: Having Zabbix to monitor the database is nice, but it does not allow you to manage Replication, backups and more from the web, like awide.io <http://awide.io> does. Maybe there isn’t even a solution like that open source…. You can try Oracle Enterprise Manager. The basic installation is free of charge and so is the Postgres plugin. Of course, you'll need a fairly large box housing Oracle 19c EE and the Weblogic application server, both parts of the OEM. Also, you don't know what an adventure is until you try upgrading OEM. It's things like that that make life interesting. As for the backups, my advice is to purchase Commvault: it has a nice GUI, can do storage snapshot backups and has great support. Disclosure: I worked for Commvault Systems for 7 years, until May 2019. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
pg_try_advisory_lock is waiting?
I am trying to replace SELECT ... FROM ... WHERE ... FOR UPDATE with pg_try_advisory_lock. The documentation says the following: https://www.postgresql.org/docs/13/functions-admin.html

pg_try_advisory_lock ( key bigint ) → boolean
pg_try_advisory_lock ( key1 integer, key2 integer ) → boolean
Obtains an exclusive session-level advisory lock if available. This will either obtain the lock immediately and return true, or return false *without waiting* if the lock cannot be acquired immediately.

I tried the following:

1st Session:
mgogala=# begin transaction;
BEGIN
mgogala=*# update emp set sal=sal*1 where empno=7934;
UPDATE 1
mgogala=*#

2nd Session:
mgogala=# begin transaction;
BEGIN
mgogala=*# select pg_try_advisory_lock(0) from (select ename from emp where empno=7934 for update) as tbl;

To my infinite surprise, pg_try_advisory_lock is waiting. I am aware of SELECT FOR UPDATE NOWAIT, but that produces an error and kills the transaction block. I would like to use something that would not kill the transaction block. I am obviously doing something wrong because the select in parentheses will not return, so the query cannot be executed. On the other hand, without the "FOR UPDATE" clause, I am getting TRUE, which is wrong:

mgogala=# begin transaction;
BEGIN
mgogala=*# select pg_try_advisory_xact_lock(0) from (select ename from emp where empno=7934) as tbl;
 pg_try_advisory_xact_lock
---------------------------
 t
(1 row)

mgogala=*# rollback;
ROLLBACK
mgogala=# select pg_try_advisory_xact_lock(1) from (select ename from emp where empno=7934) as tbl;
 pg_try_advisory_xact_lock
---------------------------
 t
(1 row)

The row is still locked by the UPDATE statement, so pg_try_advisory_lock should return "f", not "t". The database is 13.5 on Oracle Linux 8, x86_64. Transactions are written in Java, so an exception will terminate the transaction block. SQL statements are generated by a home-grown ORM. The application is ported from Oracle, which will not hang the transaction block on the 1st error.
Is there a way to get PostgreSQL to use something like NOWAIT without aborting the transaction block? -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: pg_try_advisory_lock is waiting?
On 1/28/22 19:08, Tom Lane wrote: I doubt it. I think the FOR UPDATE in the sub-select is blocked because the other session has an uncommitted update on the row it wants to lock. This command won't reach the pg_try_advisory_lock call until that row lock comes free. Yes, I figured it out, but pg_try_advisory_lock returned TRUE even without "FOR UPDATE" clause in the subquery. Shouldn't it return false because it can't lock the row until the uncommitted update finishes? Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
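For the record, advisory locks are keyed only by their integer arguments and know nothing about row locks, which is why the call returned TRUE; a minimal two-session sketch:

```sql
-- Session 1:
SELECT pg_try_advisory_lock(0);   -- t: lock on key 0 acquired
-- Session 2:
SELECT pg_try_advisory_lock(0);   -- f, returned immediately: key 0 is held by session 1
SELECT pg_try_advisory_lock(1);   -- t: key 1 is an entirely independent lock
```

Row locks taken by UPDATE or FOR UPDATE never influence the result; only another advisory lock on the same key does.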
Re: pg_try_advisory_lock is waiting?
On 1/28/22 20:54, Michael Lewis wrote: On Fri, Jan 28, 2022 at 5:34 PM Mladen Gogala wrote: pg_try_advisory_lock returned TRUE even without "FOR UPDATE" clause in the subquery. Shouldn't it return false because it can't lock the row until the uncommitted update finishes? The rows being updated or not is unrelated to advisory locks. Your example shows you locking on key 0 and then on key 1. That should be allowed. If you tried key 0 twice, then that is when you would get back "false" from the function call. You could establish a pattern of using the table OID as the first lock key and the primary key value (if a single column) as the second advisory lock key with the two parameter version of the function. But it is totally up to your code to honor that advisory lock, or not. Again, why use advisory locks and not select for update? Perhaps just because you don't want to deal with the failed transaction? What should happen when some other process cannot get a lock on that row? Do you want to wait and retry? Have you looked into the "skip locked" option? If you use "returning id" with that, you'd be able to check if you got the lock or not.

I managed to resolve the problem:

savepoint pt1;
select from table where <...> for update nowait;

If it fails, do "rollback to pt1" and continue. The transaction block will survive. OK, now life is good again. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
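Spelled out against the emp table from earlier in the thread, that pattern looks roughly like this (error code shown for reference):

```sql
BEGIN;
SAVEPOINT pt1;
-- Fails immediately with SQLSTATE 55P03 (lock_not_available) if the row is locked:
SELECT ename FROM emp WHERE empno = 7934 FOR UPDATE NOWAIT;
-- On that error, the application issues:
ROLLBACK TO SAVEPOINT pt1;    -- the transaction block survives
-- ... and carries on with the rest of the transaction ...
COMMIT;
```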
Re: what is the solution like oracle DB's datafile
On 1/29/22 17:15, Yudianto Prasetyo wrote: Hello, I want to ask why postgresql doesn't create a datafile like it has oracle? I'm confused when I have 2 HDD. HDD 1 is used to install the OS and postgresql database. when HDD 1 is full. how to increase the capacity of postgresql database with HDD 2 (without RAID system)? is there any other way like oracle DB's "add datafile" which can be used to add capacity to another HDD? I'm sorry if it says comparing with Oracle DB, but in essence I'm just looking for a solution to the problem above. Thank You Yours faithfully Yudianto Number one, Postgres tables are files. Tablespaces are directories which reside in file systems. In Oracle, tables are sets of blocks within the data files that the tablespace is comprised of. That is a profound difference. In Oracle, we use direct IO to bypass the OS cache and only use the SGA ("System Global Area") for caching data blocks. In Postgres, we cache blocks from the files in the OS cache, which is essentially the free memory. When translated to Postgres, your question reads: how do I add space to a file system? That depends on the file system and volume manager. If you're using btrfs or zfs (hopefully not), then your file systems are also volume managers. If you're using LVM with xfs (my combination), then you can add space to your volume and extend the xfs file system. I have also tried using Oracle ASM as the volume manager with ACFS as the file system. The result was very nice, roughly the same as with LVM. However, the installation of ASM is rather complex and, since PostgreSQL cannot be clustered, there is no justification for doing that. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
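As a sketch of the LVM + xfs route (all device names, volume names and mount points here are illustrative), adding HDD 2 would look something like:

```
pvcreate /dev/sdb                        # initialize the second disk for LVM
vgextend vg0 /dev/sdb                    # add it to the existing volume group
lvextend -l +100%FREE /dev/vg0/pgdata    # grow the logical volume onto the new disk
xfs_growfs /var/lib/pgsql                # grow the xfs file system online, by mount point
```

xfs can be grown while mounted, so the database does not need to be stopped.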
How to determine whether I'm running on a standby?
Hi! I am running PostgreSQL 13.5, one primary cluster and two standby clusters, managed by PgBouncer which will automatically promote one of the standbys to the primary in case of failure. I have a job that is cleaning "old data", with "old data" being whatever the business side of the company deems as "old data". The cleanup is a PL/pgSQL procedure which uses a bunch of DELETE commands. The procedure works fine on the primary but reports a bunch of errors on the read-only standbys. In another database system, there is a view called V$DATABASE with a column named DATABASE_ROLE, which can be either primary or some kind of standby (physical or logical). Is there anything like that in the world of Postgres? I would like to do something like this:

DECLARE
    DB_ROLE VARCHAR2(64);
BEGIN
    SELECT DATABASE_ROLE INTO DB_ROLE FROM V$DATABASE;
    IF DB_ROLE = 'PRIMARY' THEN
        -- do the cleanup
    ELSE
        exit;
    END IF;
END;

What would a similar idiom in the PostgreSQL world look like? I have the job in crontab on the primary and I don't really know which of the 3 clusters will be my primary at any given moment. So far, I am using manual labor to do that, which is silly. Knowing Postgres idioms, there will probably be a function for this but I was unable to figure out which one. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
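For reference, Postgres does have a function for this: pg_is_in_recovery() returns true on a standby. A guard in PL/pgSQL might look like this sketch:

```sql
DO $$
BEGIN
    IF pg_is_in_recovery() THEN
        RAISE NOTICE 'this is a standby, skipping the cleanup';
        RETURN;
    END IF;
    -- on the primary: run the DELETE-based cleanup here
END
$$;
```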
Re: Can we go beyond the standard to make Postgres radically better?
Please, don't top-post. On 2/10/22 17:51, Guyren Howe wrote: When you dig into it, the powerful idea here is the relational algebra, and its equivalence to a first-orderish logic. I put up with SQL so I can use relations, and I love Postgres because it has the least bad SQL (by a mile!) As you've said, the basic motivation behind SQL is the relational algebra from the branch of mathematics called "naive set theory". SQL does that really well. The late Dr. Ted Codd was a mathematician, so he was also aware of non-naive set theory, and the "selection language" of Zermelo's axiom of choice was also included. That is why we are selecting elements of subsets and not defining subsets. Codd was also working for IBM, once a great company, whose name stands for "It's Better Manually". IBM was doing lots of business with the banks (figures), so the transaction handling part was modeled after the banking business. When I had my first course in SQL, back in 1988, I was given an analogy with writing checks: if A writes a $100 check to B, one of the following two things can happen: either A has enough money to cover the check, in which case B will have $100 more in his account and A will have $100 less, or, if the transaction doesn't go through, nothing will happen. Any mixed outcome in which B is $100 richer but A doesn't have $100 less money in his account, or A has $100 less money but B is not $100 richer, is strictly prohibited. The transaction must happen atomically, as a whole, or not happen at all. So, that's SQL for you: relational algebra plus the banking business. The relational algebra implementation is rather good, although a tad too verbose. The transactional part is excellent and still reflects the needs of the business community today. But SQL is a terrible, no good, very bad language. I cannot accept such a religious persecution of SQL without a detailed explanation.
I don’t really understand why folks who love the relational model aren’t perpetually up in arms about SQL being their only option. Much better query languages are known and well studied. https://www.scattered-thoughts.net/writing/against-sql I’d love to see Postgres offer say Datalog. But until that Halcyon day, we could at least aggressively extend SQL to make it less awful. It will take much more than a blog post to eradicate decades of investment in SQL by both computer scientists and the business community. I've seen the same crusade against SQL on Slashdot, and I assume you were behind that as well. So, if I were you, I would establish a company to implement the Datalog language in a database and try to sell that to the people. That happened in 1979: Larry Ellison, Bob Miner and Ed Oates established a company and offered a SQL product, competing with hierarchical databases like IMS and CICS/DL1, to the world. The rest is history. I am sure that if you do the same, and if you are right about Datalog, a brilliant future is ahead of you. That is the way of the Force. Anything can happen. A young student can come from Africa, convinced that he can sell electric cars, and be very successful while doing that. So successful that he became the richest man in the world. I am sure that your endeavor with Datalog will have similar success. The market decides what is the standard. May the Force be with you. In the meantime, I will still be using SQL. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Can we go beyond the standard to make Postgres radically better?
On 2/10/22 23:56, Guyren Howe wrote: On Feb 10, 2022, at 17:06 , Mladen Gogala wrote: But SQL is a terrible, no good, very bad language. I cannot accept such a religious persecution of SQL without a detailed explanation. I feel like anyone who is defending SQL here isn’t aware of how much better the alternatives are, and how bad SQL really is. I mean: it’s hard to write, hard to read. It’s hard to generate. But that’s just the starting point. OK. If there are better alternatives, I am sure you will be able to sell them to the world. Establish a company and have a go at it. One of the worst things about it that I don’t see much discussed is that it imposes assumptions about the storage model that aren’t part of the relational model. Like heavyweight, hard to change tables with transactional guarantees and such. Don’t get me wrong, those things are great to have available, but I don’t need them all the time. Storage model and implementation are not part of SQL for good reason. Database vendors have different implementations. MySQL and Postgres are different. MySQL storage engines differ among themselves. Both of them are different from SQL Server which is in turn different from Oracle and DB2. Storage model determines the implementation. When there is only a single relational database vendor left in the market, then they can burn their storage implementation into the language standard. Until then, the more, the merrier. The whole NoSQL movement has been such a tragedy. Having diagnosed a problem with SQL databases, they threw out the relational model and very often reimplemented a form of SQL when they should have done the opposite. There is no reason you can’t have a relational database with an on-demand schema, with eventual consistency, with all those fun things that various NoSQL databases provide. Those storage models have their place, but the SQL standard says you can’t use them. But the biggest issue is the verbose, terrible, very bad query language. 
In the web development community where I spend my time, it is almost holy writ to treat the database as a dumb data bucket, and do everything in the application layer (even things like validations, even when that is a provably incorrect approach). Why? I think it’s because they’re used to working in a pleasant language like Ruby or Python, and they want to do everything there. And who can blame them? As a database architect who has successfully bridged two very different database systems, I can tell you that application programmers put the business logic into the application because they frequently don't know what the options are. They know Java or Python and that's what they do, period. That has nothing to do with SQL. But this is bad. Proper relational design can take over much (most!) of the design of a typical business app, with significant efficiency gains the result. But no *community* is going to choose that when most of the members of the community don’t want to learn SQL and who can blame them? The business community which hires them to make efficient applications can blame them. And frequently does so. Another issue: everyone thinks “relational” is the same thing as “SQL”. If we could get folks to break that association, then relations should be a standard feature of high-level programming languages, just as arrays and hashes are. Heck, give me a functional programming language with a good relational model, and I can implement OOP in that relational language without breaking a sweat. Software *should* be designed around a logical/relational layer with minimal occasional forays into Turing completeness where necessary. But no-one is even thinking about software like that because relational is SQL and SQL is awful. There is such a thing as a "free market". If you offer people a better alternative, they will come. You may be the next Larry Ellison. And then again, you may not be.
There is only one way to tell, and that's not proselytizing on the Postgres mailing list. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Can we go beyond the standard to make Postgres radically better?
On 2/11/22 09:48, Benedict Holland wrote: So to summarize, people are bad programmers who refuse to learn SQL So SQL is the problem? Common. You cannot bring that to a postgres list serve. Look. It's not perfect. It's a pain. It is hard to generate queries (oh my God why are you doing this?) and it's hard to work with. You are describing c++ to Python programmers and arguing why no one should ever use c++. And look, there are other languages that are way better like Python. So why would anyone ever chose c++? There is probably a misunderstanding here. I haven't talked about people, I leave that to politicians. Second, I was defending SQL. You've got me confused with somebody else. Last but not least, I didn't bring anything to this list, I was just responding to the posts. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Microsoft Report Builder
On 2/11/22 22:22, Kim Foltz wrote: I am trying to configure Microsoft Report Builder to run against a mainframe PostgreSQL database. The ODBC connection tests as good and manually entered SQL Select statements properly return data. The problem is the Query Designer in the software doesn't support the normal graphical display of available tables and fields with PostgreSQL. All I find in Microsoft's documentation is the statement some data sources don't support graphical tools in Report Builder. More than likely there are better report generators than Microsoft's tool but our agency is very fond of Microsoft solutions. Anyone know of a way to get the graphical tools working in Report Builder? In the alternative, is there an alternative Windows report generator to try if Microsoft's software isn't repairable? Well, there is a fish to achieve just that: https://babelfishpg.org/ The problem is that you will have to build PostgreSQL from source with some specific hooks for Babelfish. On the other hand, this extension is supported by AWS: https://aws.amazon.com/blogs/database/migrate-sql-server-to-amazon-aurora-postgresql-using-best-practices-and-lessons-learned-from-the-field/ That means that it will continue to be developed and that you will not waste all that effort. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Babelfish for PostgreSQL
Hi! There is a method of simulating SQL Server by using PostgreSQL. The extension is called "Babelfish": https://babelfishpg.org/ However, this extension requires modifications to the PostgreSQL source and building the PostgreSQL with Babelfish modifications from source. While it is a rather seamless process and goes on without much trouble, it is still much more complicated than installing the RPM or DEB packages, depending on your Linux distro of choice. Is there any chance that the Babelfish modifications will be adopted in the PostgreSQL source proper? From my DBA point of view, I didn't find any differences in the functionality of the PostgreSQL database itself. Amazon RDS incorporates that modification and allows creating Aurora PostgreSQL database with the Babelfish extension from the menu. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Can we go beyond the standard to make Postgres radically better?
On 2/12/22 19:11, Andreas 'ads' Scherbaum wrote: The complaint is not about complex queries, or CTEs, or Joins. This is about simple queries where a user wants to discover - surf - the database and look into specific tables, but exclude certain columns. More specifically, this is when the user types in interactive queries. There is already something very similar to what you are describing: https://www.psycopg.org/docs/cursor.html Each cursor has its description, which consists of the column descriptions. Basically, it's like doing \d on a cursor. Unfortunately, it's not interactive; one has to do some Python programming in order to do that. Also, it is not possible to just "describe the cursor": the description only becomes available after the "execute" call. Hopefully, I understood you correctly. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
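A small psycopg2 sketch of that (connection string illustrative; note that the description is empty until execute() has run):

```python
import psycopg2

conn = psycopg2.connect("dbname=mgogala")      # illustrative DSN
cur = conn.cursor()
print(cur.description)                         # None - nothing executed yet
cur.execute("SELECT empno, ename, sal FROM emp LIMIT 0")
for col in cur.description:                    # now populated, one entry per column
    print(col.name, col.type_code)
cur.close()
conn.close()
```

The LIMIT 0 trick fetches no rows but still populates the column metadata, which is about as close to "describing a cursor" as the protocol allows.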
Re: Can we go beyond the standard to make Postgres radically better?
On 2/13/22 05:00, Pavel Stehule wrote: But there can be a valid second question - it can be nice to use extensions with availability to define their own communication protocol. Postgres has a special protocol for replication or for backup. With this possibility you can do what you need without the necessity of an external application server. Regards And here we are back on the Babelfish question. Babelfish defines its own protocol (TDS, Tabular Data Stream) but it requires building PostgreSQL from source, with some hooks for the additional protocol. A built-in mechanism to do that without rebuilding would be nice. Babelfish Postgres is version 13.5, currently the most prevalent version in the data centers. A library to implement foreign communication protocol primitives would be a very nice thing indeed. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
2 phase commit with FDW
Hi! I am getting the following error when trying to PREPARE a transaction which updates both a local and a foreign table:

[2/17/22 12:48:00:657 EST] 0128 RegisteredRes E WTRN0046E: An attempt by the transaction manager to call prepare on a transactional resource has resulted in an error. The error code was XAER_RMFAIL. The exception stack trace follows: org.postgresql.xa.PGXAException: Error preparing transaction. prepare xid={XidImpl: formatId(57415344), gtrid_length(36), bqual_length(54), data(017f08cc3aea0001232c3afce8abf609478378ca6ac9d4bddbfea3e4efa55cf3017f08cc3aea0001232c3afce8abf609478378ca6ac9d4bddbfea3e4efa55cf300010002)} at org.postgresql.xa.PGXAConnection.prepare(PGXAConnection.java:365)

There are no errors if all the tables are local. It seems that postgres_fdw is incompatible with the XA protocol. Is there a way around it? If not, we will need yet another code change and another data source. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
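For context, the limitation can be reproduced in plain SQL, without XA at all (remote_emp here stands for a hypothetical postgres_fdw foreign table, and max_prepared_transactions must be non-zero):

```sql
BEGIN;
UPDATE emp SET sal = sal WHERE empno = 7934;          -- local table: fine
UPDATE remote_emp SET sal = sal WHERE empno = 7934;   -- foreign table via postgres_fdw
PREPARE TRANSACTION 'xa_test';
-- fails: postgres_fdw cannot participate in two-phase commit,
-- so the server rejects the PREPARE once a foreign table was modified
```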
Re: 2 phase commit with FDW
On 2/17/22 13:10, Mladen Gogala wrote: Hi! I am getting the following error when trying to PREPARE a transaction which updates both a local and a foreign table: [2/17/22 12:48:00:657 EST] 0128 RegisteredRes E WTRN0046E: An attempt by the transaction manager to call prepare on a transactional resource has resulted in an error. The error code was XAER_RMFAIL. The exception stack trace follows: org.postgresql.xa.PGXAException: Error preparing transaction. prepare xid={XidImpl: formatId(57415344), gtrid_length(36), bqual_length(54), data(017f08cc3aea0001232c3afce8abf609478378ca6ac9d4bddbfea3e4efa55cf3017f08cc3aea0001232c3afce8abf609478378ca6ac9d4bddbfea3e4efa55cf300010002)} at org.postgresql.xa.PGXAConnection.prepare(PGXAConnection.java:365) There are no errors if all the tables are local. It seems that postgres_fdw is incompatible with the XA protocol. Is there a way around it? If not, we will need yet another code change and another data source. Regards Is there anybody on this list who can tell me whether foreign tables can participate in a 2-phase commit or not? My version is 14.2 on Oracle Linux 8.5. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: 20220226-Clarification regarding delay time of PostgeSQL starting up
On 2/26/22 10:17, Laurenz Albe wrote: *If you talking about a Windows CIFS (Network Share), you may be out of luck. That is known to be an unreliable file system that cannot be used with a database.* A crash has likely corrupted your database. (But we cannot tell for certain, because you didn't show any interesting messages.) Yours, Laurenz Albe Thanks for letting me know, I didn't know that. Are there any such limitations with NFS and NAS devices like NetApp? Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: 20220226-Clarification regarding delay time of PostgeSQL starting up
On 2/26/22 13:34, Techsupport wrote: Thanks for your reply. Is it a good practice to upgrade / update database on production on every year ? No, it is not a good practice. A database upgrade must be treated as a project which will result in non-negligible downtime. You should upgrade to the new version if the new version brings features that you need. Here is the list of new features introduced in PostgreSQL 14: https://www.postgresql.org/docs/14/release-14.html#id-1.11.6.7.5 If you need any of the listed features, you should upgrade. If not, you can safely skip the release. An upgrade is something that has business repercussions, inconveniences the database users and should be subjected to the same cost-benefit analysis as every other major IT move. Upgrade policy is usually set by the management, not by the DBA, much to the chagrin of the DBA personnel. I used to be a DBA for a very long time, and I know. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Interesting fail when migrating Pg from Ubuntu Bionic to Focal
On 3/4/22 17:03, Tom Lane wrote: Mmm ... it might have just been that the planner chose not to use JIT when it thought there were fewer rows involved. Did you check with EXPLAIN that these cut-down cases still used JIT? This is an interesting and informative answer. How do I check whether JIT is used in the explain plan? Can you give me an example? TIA -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
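For reference, a hedged sketch of what to look for: when JIT is used, EXPLAIN (ANALYZE) appends a "JIT:" section after the plan tree. The query and all costs/timings below are purely illustrative:

```sql
-- With jit = on and a query whose estimated cost exceeds jit_above_cost,
-- EXPLAIN (ANALYZE) reports a JIT section at the bottom of the output:
EXPLAIN (ANALYZE) SELECT sum(sal) FROM emp;
--  Aggregate  (cost=... rows=1 width=8) (actual time=...)
--    ->  Seq Scan on emp  (...)
--  JIT:
--    Functions: 3
--    Options: Inlining false, Optimization false, Expressions true, Deforming true
--    Timing: Generation 1.2 ms, Inlining 0.0 ms, Optimization 0.5 ms, Emission 8.7 ms
-- If no "JIT:" section appears, JIT was not used for that query.
```

Plain EXPLAIN (without ANALYZE) also shows the JIT section, just without the timing line.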
Re: Select .... where id not in (....) returns 0 incorrectly
On 4/4/22 09:21, J. Roeleveld wrote: This was actually the case, I forgot there is 1 NULL-value in that list. Personally, I think NULL should be treated as a separate value and not lead to strange behaviour. NULL is strange. Relational databases use ternary, not binary logic. In the woke vernacular, one could say that Postgres is non-binary. NULL literally means "no value". It is a part of the standard, so we have to deal with it, Codd help us. However, based on my lifelong experience with Oracle, NULL values are bad and are best avoided. Postgres is more forgiving than Oracle because in Postgres, the condition "is not null" can be resolved by an index. In Oracle, it cannot. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
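A minimal illustration of why one NULL in the list makes NOT IN return zero rows (the literals are made up):

```sql
-- NOT IN expands to a chain of <> comparisons ANDed together.
-- A comparison against NULL yields UNKNOWN, and TRUE AND UNKNOWN
-- is UNKNOWN, so the WHERE clause is never true:
SELECT 1 WHERE 1 NOT IN (2, 3);        -- one row
SELECT 1 WHERE 1 NOT IN (2, 3, NULL);  -- zero rows
-- Workarounds: filter the NULLs out of the list, or use NOT EXISTS,
-- which treats non-matches as TRUE rather than UNKNOWN:
SELECT 1
WHERE NOT EXISTS (SELECT 1
                  FROM (VALUES (2), (3), (NULL)) v(x)
                  WHERE v.x = 1);      -- one row
```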
Re: Huge archive log generate in Postgresql-13
On 4/18/22 07:34, Ram Pratap Maurya wrote: Hi Support, We have upgraded the PostgreSQL DB from version 11 to 13. After the upgrade to 13, a huge amount of archive log is generated: before the upgrade 120 GB to 150 GB of log per day, after the upgrade approx. 250 to 300 GB per day. Can you please suggest why so much archive log is generated after the upgrade, whether there is any configuration setting for it, or whether this is PostgreSQL 13 behaviour. The PostgreSQL conf file is attached for your reference. Regards, Ram Pratap. Have you checked what's in those WAL archives? You can do that with pg_waldump. My guess would be that your vacuum is probably more active than in version 11. However, that's just a guess. You can also turn on WAL compression. Be aware that compression will reduce disk consumption at the expense of CPU consumption. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
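A sketch of both suggestions. The WAL segment path is a placeholder, and ALTER SYSTEM needs superuser rights:

```sql
-- 1. Summarize what is filling the WAL (run pg_waldump from the shell):
--      pg_waldump --stats=record /path/to/wal/segment
--    A large share of Heap2 / VACUUM records would support the
--    "vacuum is more active" guess.
-- 2. Turn on WAL compression (trades CPU for WAL/archive volume);
--    it can be changed without a restart:
ALTER SYSTEM SET wal_compression = on;
SELECT pg_reload_conf();
```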
Re: Are stored procedures/triggers common in your industry
On 4/20/22 15:18, Guyren Howe wrote: I’ve really only ever worked in web development. 90+% of web developers regard doing anything at all clever in the database with suspicion. I’m considering working on a book about implementing business logic in Postgres, and I’m curious about how common that actually is. Well, there are 2 schools of thought: 1. Put the business logic into the application 2. Put the business logic into the database Putting the business logic into the application can give you more flexibility around enforcing it. On the other hand, it also increases the chances of inconsistency. There will likely be more than one application using reference tables like ADDRESS, ZIP_CODE, STATE, COUNTRY, QUARTER, ACCOUNT, CUSTOMER and similar. If there is a rule that a country must exist before you add an address in that country into the table, that can be enforced by a foreign key. Enforcing it within the application does 2 things: 1. It moves the rule code to the application server, which is traditionally weaker than a database server. In other words, you are more likely to run out of CPU juice and memory on an application server than you are to run out of resources on the DB server. 2. It opens a possibility of inconsistency. Different applications can use different business rules for the same set of tables. That means that data entered by one application may make the table internally inconsistent for another application. I am a big proponent of using foreign keys, check constraints and triggers to enforce business rules. I am also a big proponent of avoiding NULL values wherever possible. Database design is an art. CAD software used to be popular once upon a time, in a galaxy far, far away. Properly enforcing the business rules in the database itself makes the application clearer and easier to write. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
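A minimal sketch of the country-before-address rule enforced declaratively; the table and column names are made up for illustration:

```sql
-- The database itself guarantees the rule, no matter which of the
-- applications performs the insert.
CREATE TABLE country (
    country_code char(2) PRIMARY KEY,
    name         text NOT NULL
);
CREATE TABLE address (
    address_id   bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    street       text NOT NULL,
    country_code char(2) NOT NULL REFERENCES country (country_code)
);
-- This fails with a foreign-key violation until 'FR' exists in country:
-- INSERT INTO address (street, country_code) VALUES ('1 Rue X', 'FR');
```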
Re: Whole Database or Table AES encryption
On 4/28/22 18:55, Aaron Gray wrote: On Wed, 27 Apr 2022 at 18:01, Bruce Momjian wrote: On Wed, Apr 27, 2022 at 05:21:41PM +0100, Aaron Gray wrote: Hi, Is there any way to (virtually) encrypt whole databases or tables with AES or other types of encryption ? You can use storage encryption via the operating system. Cybertec has a patch for PG 12 and the community is working on an implementation, perhaps for PG 16. I would be very interested in this, are there any further pointers as to where to look at this ? Is there a repo branch in place working on this or any documentation ? Cheers, Aaron There is a commercial product which supports database encryption, including PostgreSQL; it's called Voltage: https://www.techtarget.com/searchsecurity/feature/HP-Security-Voltages-SecureData-Enterprise-Product-overview The company that I work for uses it to encrypt data in both Oracle and PostgreSQL databases. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: External psql editor
On 4/29/22 13:35, Jan Wieck wrote: Not that I know of. \e starts the external editor and you have to save and exit that editor to get back to psql in order to execute it. IMHO the whole construct has very limited usability. Regards, Jan Is there a way to define the name of the temporary file created by the \e command? I'd like to name it "afiedt.buf", not for sentimental reasons. I already have a cron job that cleans afiedt.buf from my home directory every hour, and having psql name its temporary file like that would simplify the cleaning process. The name comes from another database with the same editor construct as \e. I am actually quite used to that. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: External psql editor
On 5/2/22 17:27, Tom Lane wrote: Rich Shepard writes: On Mon, 2 May 2022, Reid Thompson wrote: I believe that psql also uses readline, so my thought was that maybe these instructions could enable you to map the 'move' keystrokes that you're familiar with to be used while on the psql command line. A very quick test seems to indicate that you can. Nope. No difference. Perhaps your psql is built against libedit rather than readline. regards, tom lane That is easy enough to determine: [mgogala@umajor ~]$ ldd `which psql` linux-vdso.so.1 (0x7ffded951000) libpq.so.private13-5 => /lib64/libpq.so.private13-5 (0x7ff145284000) *libreadline.so.8 => /lib64/libreadline.so.8 (0x7ff14522c000)* libm.so.6 => /lib64/libm.so.6 (0x7ff14515) libc.so.6 => /lib64/libc.so.6 (0x7ff144f49000) libssl.so.1.1 => /lib64/libssl.so.1.1 (0x7ff144eac000) libcrypto.so.1.1 => /lib64/libcrypto.so.1.1 (0x7ff144bbe000) libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x7ff144b65000) libldap_r-2.4.so.2 => /lib64/libldap_r-2.4.so.2 (0x7ff144b09000) libtinfo.so.6 => /lib64/libtinfo.so.6 (0x7ff144ada000) /lib64/ld-linux-x86-64.so.2 (0x7ff14539a000) libz.so.1 => /lib64/libz.so.1 (0x7ff144ac) libkrb5.so.3 => /lib64/libkrb5.so.3 (0x7ff1449e2000) libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x7ff1449ca000) libcom_err.so.2 => /lib64/libcom_err.so.2 (0x7ff1449c1000) libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x7ff1449b) libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x7ff1449a9000) libresolv.so.2 => /lib64/libresolv.so.2 (0x7ff144995000) liblber-2.4.so.2 => /lib64/liblber-2.4.so.2 (0x7ff144982000) libsasl2.so.3 => /lib64/libsasl2.so.3 (0x7ff144962000) libselinux.so.1 => /lib64/libselinux.so.1 (0x7ff144935000) libcrypt.so.2 => /lib64/libcrypt.so.2 (0x7ff1448fb000) libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x7ff144864000) [mgogala@umajor ~]$ Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Question about attention to pgsql-hack...@lists.postgresql.org
On 6/26/22 09:15, Julien Rouhaud wrote: On top of that OP has already received a lot of answers in the past, it's just that since no one wants to take care of a whole non trivial project for free the answer is never what OP wants. On top of that, most of the functionality can be obtained by using plpython3u and importing "numpy". -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: How can I set up Postgres to use given amount of RAM?
On 6/26/22 14:40, W.P. wrote: Question in topic: "How can I set up Postgres to use a given amount of RAM?" I now have a laptop with 8GB of RAM, and I can see Linux uses no more than 2-3GB. So my question is how to FORCE PostgreSQL to use, let's say, 2-4GB of RAM for caching the tables I run queries on? I can see disk activity while running queries. W.P. I would try Docker with the --memory option. An alternative is to create a VM with the given amount of memory. If your laptop is memory starved, then creating a VM is probably not an option. Docker is much cheaper. Dockerfile is relatively simple to master and you can use Docker to run almost anything. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
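Aside from containers and VMs, the Postgres-level knob for its own page cache is shared_buffers. A sketch, where the 2GB/4GB figures are just the values from the question:

```sql
-- shared_buffers sets how much RAM Postgres itself dedicates to caching
-- table and index pages; changing it requires a server restart.
SHOW shared_buffers;
ALTER SYSTEM SET shared_buffers = '2GB';
-- effective_cache_size does not allocate anything; it only tells the
-- planner how much OS cache it can assume is available:
ALTER SYSTEM SET effective_cache_size = '4GB';
-- then restart the server, e.g. with pg_ctl restart
```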
Re: Get the table creation DDL
On 7/10/22 13:57, Michael Nolan wrote: I do not know those other tools, but there should be documentation for them, as there is in the man page for how to process just one table using pg_dump. You can pipe the output of pg_dump directly to psql, but I find that's seldom useful. -- Mike Nolan There is an extension which does precisely what you need: https://github.com/MichaelDBA/pg_get_tabledef In my opinion, that is a little inconsistency on the maintainers' part, because there are functions pg_get_functiondef, pg_get_indexdef and pg_get_viewdef but nothing for tables. https://www.postgresql.org/docs/14/functions-info.html -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Get the table creation DDL
On 7/10/22 12:47, Igor Korot wrote: So there is no "query" per se? Also how do I pass the table name? Thank you. You can create one from the catalog tables. Personally, I would use INFORMATION_SCHEMA to avoid pg_class and pg_attribute. However, there is an extension which does that for you. Somebody else has already done the hard work. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
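A hedged sketch of the INFORMATION_SCHEMA route mentioned above; the schema and table names are parameters you would substitute. Note that this reconstructs the column list only, not indexes, constraints or the full DDL:

```sql
-- Columns of a given table, in a portable way, ordered as declared:
SELECT column_name,
       data_type,
       character_maximum_length,
       is_nullable,
       column_default
FROM information_schema.columns
WHERE table_schema = 'public'
  AND table_name   = 'emp'      -- the table you are interested in
ORDER BY ordinal_position;
```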
Re: Get the table creation DDL
List of functions
 Schema | Name            | Result data type | Argument data types | Type
--------+-----------------+------------------+---------------------+------
 public | pg_get_tabledef | text             | in_schema character varying, in_table character varying, in_fktype tabledef_fkeys DEFAULT 'FKEYS_INTERNAL'::tabledef_fkeys, in_trigger tabledef_trigs DEFAULT 'NO_TRIGGERS'::tabledef_trigs | func
(1 row)

As expected, the function returns the "text" data type.

Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Get the table creation DDL
On 7/10/22 20:02, Mladen Gogala wrote: This "extension" is nothing PL/PGSQL source code This "extension" is nothing *but* PL/PGSQL source code -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Get the table creation DDL
On 7/10/22 20:15, Igor Korot wrote: I understand. The question here is: is this "extension" a part of the standard PostgreSQL install? And if not, can I copy and paste that code into my program? Thank you. Of course this function is not a part of the standard install. If it were a part of the standard install, it wouldn't need a separate Git repo on GitHub. And of course you can use it in your database, just send me $1000 and you will have no problems with the license. I believe there is a license file in the repo and it says something like that. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Get the table creation DDL
On 7/10/22 21:13, Igor Korot wrote: How should I do that? Thank you. Oh boy! I give up. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Get the table creation DDL
On 7/10/22 22:52, Rob Sargent wrote: Are you alone or on a team? What are your pronouns? -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Proposal to introduce a shuffle function to intarray extension
On 7/15/22 04:36, Martin Kalcher wrote: Dear list, I am dealing with an application that processes fairly large arrays of integers. It makes heavy use of the intarray extension, which works great in most cases. However, there are two requirements that cannot be addressed by the extension and are rather slow with plain SQL. Both can be met with shuffling: - Taking n random members from an integer array - Splitting an array into n chunks, where each member is assigned to a random chunk Shuffling is currently implemented by unnesting the array, ordering the members by random() and aggregating them again. Martin, have you considered PL/Python and the NumPy module? -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
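The plain-SQL shuffle Martin describes can be sketched like this (the literal array stands in for his large integer arrays):

```sql
-- Shuffle by unnesting, ordering by random(), and re-aggregating.
-- Correct, but O(n log n) per array, which is the motivation for
-- proposing a native shuffle function.
SELECT array_agg(x ORDER BY random())
FROM unnest(ARRAY[1,2,3,4,5,6,7,8,9,10]) AS t(x);
-- Taking n random members is the same idea with a LIMIT:
SELECT array_agg(x)
FROM (SELECT x
      FROM unnest(ARRAY[1,2,3,4,5,6,7,8,9,10]) AS t(x)
      ORDER BY random()
      LIMIT 3) s;
```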
Re: Proposal to introduce a shuffle function to intarray extension
On 7/16/22 16:21, Martin Kalcher wrote: Hey Mladen, thank you for your advice. Unfortunately the performance of shuffling with NumPy is about the same as with SQL. create function numpy_shuffle(arr int[]) returns int[] as $$ import numpy numpy.random.shuffle(arr) return arr $$ language 'plpython3u'; select arr[1:3]::text || ' ... ' || arr[398:400]::text from ( select numpy_shuffle(arr) arr from numbers ) shuffled; --- {674026,3306457,1727170} ... {343875,3825484,1235246} Time: 2315.431 ms (00:02.315) Am I doing something wrong? Martin Hi Martin, No, you're doing everything right. I have no solution for you. You may need to do some C programming or throw stronger hardware at the problem. The speed of your processors may be the limit. Good luck! -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: More than one Cluster on single server (single instance)
On 7/18/22 03:55, Daulat wrote: Please share your thoughts on this approach. How can we create multiple clusters under a single Postgres version? You can use Docker images. How can we identify/allocate resources like shared_buffers, effective_cache_size, work_mem etc. in the case of multiple clusters? Thanks Each container can be adjusted to have its own memory and parameter file. Personally, I don't like that approach. I think that separating data into separate databases is enough separation even for the most fervent separatists (the pun on the name of political movements is purely accidental). -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: postgis
On 7/18/22 13:08, Marc Millas wrote: Hi, I would like to install PostGIS 3.04 on Debian 11. Digging into various web sites, I didn't find the name of that package. Can someone help? Thanks Marc MILLAS Senior Architect +33607850334 www.mokadb.com <http://www.mokadb.com> Install alien and convert the .rpm package into the .deb package. When dealing with alien be careful, so that it doesn't jump and stick to your face. Bad things can happen then. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Was my question inappropriate for postgres?
On 7/24/22 19:56, Taka Taka wrote: Hello. I posted the question below. And? No takers? I wonder why. Maybe the question is unclear? This is really surprising :) Hello. I would like to know if psqlodbc_13_02 is compatible with M365. Also, could you please tell me which of the psqlodbc_13_02 builds would be suitable for Windows 10 64-bit? What is M365? Is it a part of Microsoft Office 365, aka "O365"? Is it related to M-16? What do you mean by "driver suitable for Windows 10"? If the driver can be installed and configured by the MS ODBC driver administrator, then I guess it's suitable. Make sure the driver you use is mauve, they take the least RAM Was my question inappropriate? I wanted to know if the odbc driver above would be compatible with Microsoft's M365. What part of Office 365 do you want the compatibility with? Excel? Also, I find variety in the odbc driver. I don't. ODBC drivers implement the same protocol when communicating with the database and they all work the same. Here is what I have: [mgogala@umajor ~]$ rpm -qa *odbc* oracle-instantclient-odbc-21.5.0.0.0-1.x86_64 postgresql-odbc-13.01.-2.fc36.x86_64 msodbcsql17-17.10.1.1-1.x86_64 [mgogala@umajor ~]$ The string "fc36" in the PostgreSQL ODBC package name denotes a very special version of Windows, more advanced than even Windows 11. I was able to connect all 3 of the drivers above to LibreOffice and use them to query the underlying database. I was able to do the same on my Windows 8.1, with all 3 of those drivers. Is it possible to know which of the drivers is suitable for Windows 10 64-bit? Suitable how? What exactly do you want to do? What does the word "suitable" even mean in the context you're using it? Or could you please tell me where to inquire? There is this site named "https://www.google.com" which can be used to find a lot of useful information and some paid ads, too. As of lately, I prefer DDG, but that's just a personal preference. 
If there were no takers for the first question, that may say something about the question. At this point, you gotta ask yourself one question only: do I feel lucky? -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Was my question inappropriate for postgres?
On 7/24/22 20:15, Gavin Flower wrote: I suspect that most people do not know what 'M365' is, would be good to explain. Am curious as to what 'M365' is! M365 is the latest member in the ArmaLite M-16 family of products. It can be connected to a database to track down the DBAs who don't check their backups. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Was my question inappropriate for postgres?
Answers in-line: On 7/24/22 20:43, Adrian Klaver wrote: Other parts of Office also use database connectivity, such as Access or Word. Also, I find variety in the odbc driver. I don't. ODBC drivers implement the same protocol when communicating with the database and they all work the same. Here is what I have: Hmm, I wonder why they have versions then? Good question. One part of the story is package versions, bug fixes and such. However, I think that your question was about the versions of the ODBC protocol. This may answer it: https://en.wikipedia.org/wiki/Open_Database_Connectivity Version history: * 1.0: released in September 1992 * 2.0: c. 1994 * 2.5 * 3.0: c. 1995, John Goodson of Intersolv and Frank Pellow and Paul Cotton of IBM provided significant input to ODBC 3.0 * 3.5: c. 1997 * 3.8: c. 2009, with Windows 7 * 4.0: development announced June 2016, with the first implementation released with SQL Server 2017 in Sep 2017 and additional desktop drivers in late 2018; final spec on GitHub <https://github.com/Microsoft/ODBC-Specification> By the way, version 4 of the protocol supports "fast_executemany": https://dbwhisperer.wordpress.com/2020/11/21/pyodbc-fast_executemany-and-oracle-rdbms/ MS SQL ODBC 17 and Oracle Instant Client 19.9 ODBC drivers and newer do support "fast_executemany". PostgreSQL ODBC 13 does not. Here is the table: https://github.com/mkleehammer/pyodbc/wiki/fast_executemany-support-for-various-ODBC-drivers As to below, really? 
If you don't want to answer the question then don't. Boys just wanna have fun. This is a shameless plug of the famous Cyndi Lauper song, but you get the picture, don't you? To my credit, I didn't answer the 1st one. Tonight, I've been socializing with Sam Adams, so I replied. When someone who doesn't know how to properly write "O365" asks about "how suitable an ODBC driver is", then I feel entitled to a little fun. Now, since you don't appear to be a fan of my particular variety of humor, can you possibly translate the question and explain to me what a "suitable driver" is and what M365 is? If you manage to prove to me that the question makes sense, I will apologize to the OP and offer you a pint of Sam Adams as an apology for my bad behavior. On the other hand, if the question doesn't make sense, I will gladly accept the same from you. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Was my question inappropriate for postgres?
On 7/25/22 00:36, Adrian Klaver wrote: Suitable driver is just that a driver that will suit the needs of working on Windows 10 with MS Office/Office/Microsoft 365. Literally every ODBC driver on Windows fits this definition. That is the purpose of ODBC: having a uniform set of calls for various databases. That is why people still use it, even on Linux. And yes, like you, I am a Linux user too. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: [Beginner Querstion]Where can I find the news of database?
On 9/3/22 04:37, BeginnerC wrote: Hello everyone, I am a student from the University of Emergency Management and I am learning Postgres. As you can see, I want to get some of the latest news and opinions about the database, but I don't know where I can get them. (I have already subscribed to the RSS of Planet PostgreSQL, but I want to seek more.) Thanks in advance! Yours, BeginnerC. I assume that by "news" you mean "release notes", as there is no PostgreSQL equivalent of 60 Minutes. Release notes can be found here: https://www.postgresql.org/docs/release/ There is also a very nice blog with an RSS feed: https://planet.postgresql.org/ Any RSS reader like Liferea will be good for following the blog. As for the news, hurricane 15.1 is about to hit the Postgres world. For now, it is only a Cat 1 hurricane, but it's likely to strengthen..... Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Postgres SQL unable to handle Null values for Text datatype
On 9/6/22 02:10, Karthik K L V wrote: We are migrating from Oracle 12C to Aurora Postgres 13 and running into query failures when the bind value of a Text datatype resolves to null. Oracle is actually in the wrong here. Nothing should be equal to null, ever. There is also different behavior with unique indexes: [mgogala@umajor ~]$ docker start psql14-5 psql14-5 [mgogala@umajor ~]$ psql -U scott Password for user scott: psql (14.5) Type "help" for help. scott=> create table test1 (key1 integer,key2 integer, data varchar(10)); CREATE TABLE scott=> alter table test1 add constraint test1_uq unique(key1,key2); ALTER TABLE scott=> insert into test1 values(1,null,'a'); INSERT 0 1 scott=> insert into test1 values(1,null,'b'); INSERT 0 1 scott=> select * from test1; key1 | key2 | data --+--+--- 1 | | a 1 | | b (2 rows) The same thing would not work with Oracle. However, please note that, according to the SQL standard, NULL is not equal to anything, so those 2 rows are actually not a unique constraint violation. To enforce the uniqueness the same way as with Oracle, you actually need 2 indexes. You need to use the "coalesce" function. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
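A hedged sketch of the coalesce approach hinted at above, using the test1 table from the example (on a fresh copy, before the duplicate rows are inserted). The -1 sentinel is an assumption that it never occurs in real data:

```sql
-- Treat NULL in key2 as a concrete value for uniqueness purposes by
-- indexing coalesce(key2, -1) instead of key2:
CREATE UNIQUE INDEX test1_uq_nulls
    ON test1 (key1, coalesce(key2, -1));
-- With this index, the second insert from the example fails:
-- INSERT INTO test1 VALUES (1, NULL, 'b');  -- duplicate key error
```

For what it's worth, PostgreSQL 15 adds UNIQUE NULLS NOT DISTINCT, which enforces exactly this without the expression index.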
Re: WARNING: could not open statistics file "pg_stat_tmp/global.stat": Operation not permitted
On 9/8/22 23:05, Perry Smith wrote: This is an issue when PostgreSQL is running inside a container. In my quest to find an answer, I’ve discovered three instances where it has come up, and various people have talked about fixes, but no one seemed to notice what I found. I opened an issue here[1]. From within the container, files which I assume are created by PostgreSQL are ending up being owned by root rather than Postgres. Thus, to me, it appears to NOT be an issue of mapping internal UIDs and GIDs to external IDs, since there should not be anything outside the PostgreSQL container creating files inside Postgres’ data directory. The reason I’m sending this note to the general list is to ask how bad this error is. Some “solutions” are to make the pg_stat_tmp directory internal to the image, and that somehow resolves the issue, but I don’t think anyone really understands why, and things like that bother me. But I’m also curious whether that appears to be a viable solution. The result will be that when Postgres is stopped and the container exited, the next time Postgres starts back up, the pg_stat_tmp directory will be gone. Is that ok? Does Postgres store anything there that needs to survive a restart? Thank you for your help, Perry [1] https://github.com/docker-library/docs/issues/2188#issue-1367170047 Hi Perry, You probably need to fix your Dockerfile because your postgres seems to be running as root. 
Here is my Dockerfile:

FROM oraclelinux:8
LABEL Description="This install PostgreSQL 14 on top of Oracle Linux 8"
LABEL maintainer="Mladen Gogala"
RUN dnf -y update
COPY RPMS/pgdg-redhat-repo-latest.noarch.rpm /tmp
COPY RPMS/pg_hint_plan14-1.4-1.el8.x86_64.rpm /tmp/
COPY RPMS/pg_hint_plan14-llvmjit-1.4-1.el8.x86_64.rpm /tmp/
# Install PostgreSQL software
RUN dnf localinstall -y /tmp/pgdg-redhat-repo-latest.noarch.rpm
RUN dnf -qy module disable postgresql
RUN dnf -y install postgresql14
RUN dnf -y install postgresql14-libs
RUN dnf -y install postgresql14-server
RUN dnf -y install postgresql14-llvmjit
RUN dnf -y install postgresql14-contrib
# Install pg_hint_plan 1.4
RUN cd /tmp;dnf -y localinstall `ls *.rpm`
# Cleanup
RUN rm -f /tmp/*.rpm
RUN dnf clean all
# Copy postgresql.auto.conf (modified parameters) and run initdb
USER postgres
ARG PGPASSWD="qwerty"
ENV PGDATA=/var/lib/pgsql/14/data
ENV PATH=/usr/pgsql-14/bin:/usr/bin:/usr/local/bin
RUN echo "$PGPASSWD">/var/lib/pgsql/14/pgcluster.pwd
RUN initdb -A password --pwfile=/var/lib/pgsql/14/pgcluster.pwd
RUN echo "host all all 0.0.0.0/0 md5">>$PGDATA/pg_hba.conf
COPY --chown=postgres:postgres RPMS/postgresql.auto.conf /var/lib/pgsql/14/data
# Finish
EXPOSE 5432/tcp
ENTRYPOINT ["postgres"]

Please note that I switch users in the file and that owners are switched. I don't have any issues with the container built this way. I have another container which is built by untaring a backup of my sample database. I use the ADD command to untar it to $PGDATA. Also, my trick with putting the modified parameters into postgresql.auto.conf goes contrary to the Postgres recommendations. You may want to avoid doing that. I haven't pushed the image to the Docker hub because I'm still working on it. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: ***SPAM*** Re: WARNING: could not open statistics file "pg_stat_tmp/global.stat": Operation not permitted
On 9/9/22 00:08, Perry Smith wrote: The data directory is outside so it is persistent. The pg_stat_tmp is inside the data directory. Ah, that's the reason. The Docker daemon runs as root, so if you do a bind mount, files will be owned by root. You may want to use a normal Docker volume and not an external directory. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Is it possible to stop sessions killing each other when they all authorize as the same role?
On 9/12/22 18:51, Bryn Llewellyn wrote: I'll use "kill" here as shorthand for using the "pg_terminate_backend()" built-in function. I read about it in the "Server Signaling Functions" section of the enclosing "System Administration Functions" section of the current doc: www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADMIN-SIGNAL And I tried a few tests. All of the outcomes were just as the doc promised. I'm troubled by the notion that (as it seems) one session that authorizes as the role "r1" can easily list all other concurrent sessions that are also authorized as "r1"—and kill them all without restriction. (The doc does say "Use of these functions is usually restricted to superusers, with noted exceptions." So I s'pose that I'm talking about one of these noted exceptions.) It's common to design a three-tier app so that the middle tier always authorizes as just a single role—say, "client"—and where the operations that "client" can perform are limited as the overall design specifies. The maximal example of this paradigm defines the API to the database functionality by granting "execute" to just the designed set of subprograms. Here, the subprograms and the tables that they access all have owners other than "client". (The authorization of external principals, and how their identity is mapped to a unique key for use within that database, is outside the scope of what I write about here.) It seems far-fetched to think that the requirements spec for every such design would deliberately specify: — Must be possible for any "client" session to kill all other concurrent "client" sessions. Yet the paradigm is that the database API expresses exactly and only what the design says that it should. Ergo, the paradigm is, in general, unimplementable. I appreciate that (while the privileges that "client" has are unchanged) a just-killed session can easily reconnect by trying what they had just tried again. 
But not before suffering the fatal "57P01: terminating connection due to administrator command" error. The implication is that every client program must follow every database call with defensive code to detect error "57P01" and programmatically re-try. (Maybe some drivers can do this automatically. But I haven't found out if whatever psql uses can do this. Nor have I found out how to write re-try code in psql.) Does anybody else find all this as troubling as I do? And, if so, might a remedy be possible? Maybe something like this: — Define a new privilege as a cousin to "pg_signal_backend". I'll call it "pg_signal_backend_for_self_role" here. This would govern the possibility that a session can kill another session that authorized as the same role as itself. — Document the fact that "pg_signal_backend_for_self_role" is implicitly granted to a newly-created role (just as it's documented that "execute… to public" is implicitly granted to a newly created subprogram). — Allow "revoke pg_signal_backend_for_self_role from…"—by all means with extra rules like only a superuser can do this. Bryn, you can revoke execute on pg_terminate_backend from public and that will, by extension, revoke it from all users who do not have DBA privilege or have not been explicitly granted the "execute" privilege on pg_terminate_backend. This doesn't look like a big problem because applications usually don't contain code for killing other user's sessions. I am not sure that GTA is running on top of Postgres database. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
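A sketch of the revocation suggested above, run as a superuser. The grantee name in the commented GRANT is hypothetical:

```sql
-- Withdraw the default PUBLIC execute right on the function; the
-- argument list can be omitted because the name is unambiguous:
REVOKE EXECUTE ON FUNCTION pg_terminate_backend FROM PUBLIC;
-- Then grant it back selectively where the design calls for it:
-- GRANT EXECUTE ON FUNCTION pg_terminate_backend TO admin_role;
```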
Re: Is it possible to stop sessions killing each other when they all authorize as the same role?
On 9/13/22 00:49, Tom Lane wrote: Bryn Llewellyn writes: My non-superuser normal role with direct login, "u1", is *still* able to invoke pg_terminate_backend() and kill other "u1" sessions—even after this (as a super-user): Really? I did this in 14.5: regression=# revoke execute on function pg_terminate_backend from public; REVOKE regression=# select proacl from pg_proc where proname = 'pg_terminate_backend'; proacl --- {postgres=X/postgres} (1 row) (as expected, the superuser's own execute permission is all that remains) regression=# create user joe; CREATE ROLE regression=# \c - joe You are now connected to database "regression" as user "joe". regression=> select pg_terminate_backend(42); ERROR: permission denied for function pg_terminate_backend It very much looks as if what I have described was deemed to be a bug (after that behavior had survived from at least version 11) and that it's now been fixed! No, it very much looks like pilot error. But you've not shown us exactly what your test consisted of, so it's hard to say just where it went off the rails. regards, tom lane Tom, I did the same thing on 14.5, and it behaves as Bryn alleges: postgres=# select proacl from pg_proc where proname = 'pg_terminate_backend'; proacl --- {postgres=X/postgres} (1 row) So, the only user who should be able to execute pg_terminate_backend is "postgres". Let's try with user "scott". [mgogala@umajor ~]$ psql -U scott Password for user scott: psql (14.5) Type "help" for help. scott=> select pid from pg_stat_activity where usename='scott'; pid - 66 79 (2 rows) scott=> select pg_terminate_backend(66); pg_terminate_backend -- t (1 row) User scott has no special privileges: postgres=# select usesuper,usecreatedb,usebypassrls from pg_user where usename='scott'; usesuper | usecreatedb | usebypassrls --+-+-- f | f | f (1 row) Yet, it is still able to execute the function in question. 
My version is the following: scott=> select version(); version --------- PostgreSQL 14.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (R ed Hat 8.5.0-10), 64-bit (1 row) Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com