Re: Lazy View's Column Computing
For all we know, it may already be happening. That looks like a pretty reasonable optimization which may already be in place. If we create a view:

mgogala=# select * from dept;
 deptno |   dname    |   loc
--------+------------+----------
     10 | ACCOUNTING | NEW YORK
     20 | RESEARCH   | DALLAS
     30 | SALES      | CHICAGO
     40 | OPERATIONS | BOSTON
(4 rows)

mgogala=# create view acct_view as select * from emp where deptno=10;
CREATE VIEW

a query against the view would probably be merged with the original query and everything optimized as a single query. Unfortunately, there is no way to tell:

mgogala=# explain select ename,job,sal from acct_view;
                     QUERY PLAN
-----------------------------------------------------
 Seq Scan on emp  (cost=0.00..1.18 rows=3 width=21)
   Filter: (deptno = 10)
(2 rows)

The only tool that you have at your disposal is EXPLAIN. What we would need to confirm that assumption is an optimizer trace file detailing the decisions made by the optimizer, something like event 10053 from another database which shall remain unnamed. Merging the view query into the top-level query would produce something like this:

mgogala=# select ename,job,sal from emp
mgogala-# where deptno=10;
 ename  |    job    | sal
--------+-----------+------
 CLARK  | MANAGER   | 2450
 KING   | PRESIDENT | 5000
 MILLER | CLERK     | 1300
(3 rows)

The table, shown below, has more columns than the 3 used in the query above:

mgogala=# \d emp
                            Table "mgogala.emp"
  Column  |            Type             | Collation | Nullable | Default
----------+-----------------------------+-----------+----------+---------
 empno    | smallint                    |           | not null |
 ename    | character varying(10)       |           |          |
 job      | character varying(9)        |           |          |
 mgr      | smallint                    |           |          |
 hiredate | timestamp without time zone |           |          |
 sal      | double precision            |           |          |
 comm     | double precision            |           |          |
 deptno   | smallint                    |           |          |
Indexes:
    "emp_pkey" PRIMARY KEY, btree (empno)
Foreign-key constraints:
    "fk_deptno" FOREIGN KEY (deptno) REFERENCES dept(deptno)

Merging the top-level query with the view query would be a smart tactic which is probably already deployed. However, it is not possible to tell with the tools at hand. That is what you want: the query touches only the columns you need, nothing else.
That is done by the query optimizer in the "rewrite" phase of the query: https://www.postgresql.org/docs/12/query-path.html

I could bet that the top-level query gets merged with the view query during the rewrite, and that the columns that aren't needed aren't touched. That in particular means that a function computing an untouched column of the view isn't executed at all.

Regards

On 8/2/21 10:12 AM, Avi Weinberg wrote:
> Hi,
> Is there a way to compute a column in a view only if it is referenced in the query? I have a view column whose value is computed by a function. If that column is not used at all in the query, can Postgres "skip" computing it? Thanks!

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
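One way to check this without an optimizer trace is EXPLAIN (VERBOSE), which prints each plan node's output column list. A hypothetical sketch (the function, view, and names are made up for illustration):

```sql
-- A view with a column computed by a (deliberately cheap) function.
CREATE FUNCTION expensive_fn(integer) RETURNS integer
    LANGUAGE sql IMMUTABLE AS $$ SELECT $1 * 2 $$;

CREATE VIEW emp_view AS
    SELECT empno, ename, expensive_fn(empno) AS computed FROM emp;

-- VERBOSE shows the target list. If the "computed" expression does not
-- appear in the Output line, the function was never evaluated:
EXPLAIN (VERBOSE) SELECT ename FROM emp_view;
```

If the view is flattened into the outer query, the Output line should list only emp.ename, confirming that the unreferenced computed column is pruned.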
Re: Series of 10 questions about the use of postgresql, generally.
On 8/6/21 1:11 AM, Pavel Stehule wrote:
>> 2. If I am using the COPY command for input and output around *.csv files, how may I specify internal tuple delimiters of ',', but at the same time also specify an end-of-line delimiter of ';'?
> You cannot specify an end-of-line delimiter in Postgres.

You can specify an end-of-line delimiter in Perl, Python or PHP. It is highly advisable to learn a scripting language or two when working with databases. With all databases, not just Postgres.

>> 3. How may I get postgresql to output the create table statement(s) for one or more tables of one database?
>> 4. I know that I can use COPY to import or export one database table between it and a *.csv file. Can I use it to do this with multiple TABLES and *.csv files specified in one COPY command, or not? How may I use COPY, or another default native postgresql function, to export or import create table commands to and from a *.csv file?
> COPY can be used for just one table at a time. If you want the DDL command (CREATE TABLE), then you should run pg_dump -t tablename -a

>> 5. In the absence of OS command line instructions, is there an internal postgresql way, via COPY or another function for example, to backup an entire database, with all its create table statements and all insert statements, and any other associated object, in one hit? Or is this ill-advised?
> pg_dump

>> 6. How exactly do Intervals work, by themselves and in relation to the other provided native datatypes? What are Intervals used for?
> Interval is a Postgres native type that helps with date operations that cannot be mapped to some basic unit. Internally it is a structure with numeric fields: years, months, days, hours, seconds. For example, I can work with a value of '3 months', but without knowing the starting point I cannot say how many days this interval has. You can create an interval value with the constructor function make_interval, directly from a literal, or by subtracting two timestamp values.
Interval is a SQL-standard data type for adding to and subtracting from dates. It is supported by a gazillion other databases:
https://www.ibm.com/docs/en/informix-servers/12.10?topic=types-sql-datetime-interval-data
https://www.mysqltutorial.org/mysql-interval/
https://www.oracletutorial.com/oracle-basics/oracle-interval/
https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types

The only major database that does not support the interval type is SQL Server. SQL Server has a major problem with SQL standards; it also doesn't support the OFFSET ... FETCH FIRST n ROWS ONLY syntax. Of course, this group is not about SQL Server, so I'll leave it at that.

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
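A few interval operations in Postgres, for illustration:

```sql
-- Constructor function:
SELECT make_interval(months => 3);                       -- 3 mons

-- Directly from a literal:
SELECT interval '1 year 2 months 3 days';

-- Subtracting two timestamps yields an interval:
SELECT timestamp '2021-08-06' - timestamp '2021-01-01';  -- 217 days

-- '3 months' only resolves to days once anchored to a start date:
SELECT date '2021-01-31' + interval '3 months';          -- 2021-04-30 00:00:00
```

Note the last example: adding '3 months' to January 31 lands on April 30, because the number of days in "3 months" depends on the starting point, exactly as described above.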
Re: JWT decoder
You could probably use PL/Python. Python has the base64 module and the urllib module, which can deal with all URL strings I am aware of. PL/Perl would probably do as well.

On 8/8/21 10:16 PM, Masih Tavassoli wrote:
> Hi experts,
> I am trying to find a way to decode a URL request header and extract its JSON fields in postgresql. I can do this in Oracle SQL using
> select utl_raw.cast_to_varchar2(utl_encode.base64_decode(utl_raw.cast_to_raw(regexp_replace(
> 'eyJhbGciOiJSUzI1NiIsImtpZCI6InBhLWFwaS1zaWduIn0.eyJzY29wZSI6W10sInN1YiI6Imp1bGllQHNhcHBoaXJlY29udmV5YW5jaW5nLm5ldC5hdSIsImNsaWVudCI6InB4YV9hY19zZXR0c3BsdXMiLCJ1c2VybmFtZSI6Imp1bGllQHNhcHBoaXJlY29udmV5YW5jaW5nLm5ldC5hdSIsImV4cCI6MTU1NDA4NzU2N30.OZwVETCP+HUDsSkudCfa52eQT2Fyt6BgefFpYr5xpJROjZJeVnXtzsU6MS8aHauqyyOwQSz8iG6OC2YjYgrGiZ8YTwTh1uHptNjicyCFmCkh++d0oGdK78ZOKIAvuslW3GX4V13hPUJADzGLkyO8hy01YF9Id4+7GrvmnRRkwPO7C2eUvKTB8pxDAA94Jpw74s4BiNTmm1Uv+56AXqdI8m2KFkLNwGoUR3OSNvsifl5uxBsGGmP4wbgNe/Dy1q2UsL2eBAW/GuWryHGCRz1/PR2jIV1v21ClQ8YLiFAzM3YxHb8m2ngzRynKof+Nw3nZvuafRQ16MV5THz0ahGFYLA',
> '[[:space:]]', ''))))
> from dual
> But there doesn't seem to be a way of doing it in postgres. Has anyone got any suggestions?
> Thanks, Masih

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: JWT decoder
Hmmm, PL/Python and PL/Perl are languages usable from within Postgres: you can write Python functions in Postgres. I apologize for not making that clear.

On 8/9/21 1:15 AM, Masih Tavassoli wrote:
> There are lots of decoders but I need to do it within postgresql.
> On Monday, August 9, 2021, 01:24:33 PM GMT+10, Mladen Gogala wrote:
>> You could probably use PL/Python. Python has the base64 module and the urllib module, which can deal with all URL strings I am aware of. PL/Perl would probably do as well.

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
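A minimal sketch of how that could look as a PL/Python function. This assumes the plpython3u extension is available; the function name is made up, and the signature of the token is NOT verified, only decoded:

```sql
CREATE EXTENSION IF NOT EXISTS plpython3u;

-- Decode the payload (second dot-separated segment) of a JWT.
CREATE OR REPLACE FUNCTION jwt_payload(token text) RETURNS jsonb
LANGUAGE plpython3u AS $$
import base64, json
part = token.split('.')[1]
# base64url data must be padded to a multiple of 4 before decoding
part += '=' * (-len(part) % 4)
return json.dumps(json.loads(base64.urlsafe_b64decode(part)))
$$;

-- Then individual claims can be extracted with the jsonb operators:
-- SELECT jwt_payload('eyJ...')->>'username';
```

Plain base64_decode, as in the Oracle example, can choke on base64url tokens ('-' and '_' instead of '+' and '/'), which is why urlsafe_b64decode is used here.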
Re: Serious List of PostgreSQL usage questions.
Responses in-line:

On 8/9/21 10:01 PM, A Z wrote:
> 1) Are there free scripts for CREATE TYPE (native type), more advanced sorts of types out there, online, free for commercial use? With function support, too? Can someone reply with a link or a suggestion?

ORAFCE? That extension has Oracle date/time functions.

> 2) How may I get PostgreSQL to output the create table statement(s) for one or more tables inside one database, without issuing instructions via the command line, but only inside a database login, as a query or pl/sql?

Postgres doesn't store DDL in the catalog; DDL has to be created from the catalog data. I would advise pg_tables, pg_attribute and pg_indexes. That shouldn't be too hard of an exercise with PL/pgSQL. If you want, you can play with the ddlx extension: https://pgxn.org/dist/ddlx/

> 3) I know that I can use COPY to import or export one database table between it and a *.csv file. Can I use it to do this with multiple TABLES and *.csv files specified in one COPY command, or not?

You cannot. COPY does only one table at a time. Of course, you can use "screen" and do multiple tables from multiple windows, or something similar.

> 4) In the absence of OS command line instructions, is there an internal PostgreSQL way, via COPY or another function for example, to backup an entire database, with all its create table statements and all insert statements, and any other associated object, in one hit? Or is this ill-advised?

The pg_dump utility will do that. The results should be interesting for a multi-TB database. BTW, any backup utility, even plain and simple pg_basebackup with --format=tar, will be able to restore and recover the database. You can always reconstruct the DDL with pg_dump --schema-only. PgAdmin4 will also provide the DDL on request.

> 5) When setting up communication to remote databases on remote machines, I need to use the OPTIONS() function. It seems to require, as its first parameter, the schema of the table (the second parameter) that it wants to access. Can I supply a null schema and still be able to reference the remote table, or must I also make use of IMPORT FOREIGN SCHEMA?

> 6) I have found that the native trigonometry functions, namely the radians versions, do produce error results around key trigonometry input values. I have discovered that these errors persist even if I cast the input parameter away from DOUBLE PRECISION and into DECIMAL. I would like to know if there are any freely available scripts out there that include arbitrary-precision mathematical function support that works on DECIMAL and not on DOUBLE PRECISION, and does not produce error values around key inputs? Could someone refer me to a website that has such a script?

PL/Python can use numpy: https://numpy.org

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
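As a rough illustration of the catalog-based approach to question 2, here is a sketch that rebuilds a bare column list from information_schema (constraints, defaults and indexes would need more work; the table name 'emp' is just an example):

```sql
SELECT 'CREATE TABLE ' || table_name || ' ('
       || string_agg(column_name || ' ' || data_type, ', '
                     ORDER BY ordinal_position)
       || ');'
FROM information_schema.columns
WHERE table_schema = 'public'
  AND table_name   = 'emp'
GROUP BY table_name;
```

The ddlx extension mentioned above does essentially this, but completely: constraints, indexes, ownership, comments and all.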
log_statement GUC parameter
Hi! Unfortunately, only a superuser can set log_statement='all'; Would it be possible to execute set session log_statement='all'; as an ordinary user? I am trying to execute it from login.sql, a part of login_hook extension which implements on-login triggers in PostgreSQL. I will create a procedure with security definer, owned by the role "postgres", and grant it to public. That should do the trick. However, it would be much nicer if PostgreSQL allowed me to set the parameter as a part of the normal session. The idea is to log all statements by the particular user, not by everybody. The user in question uses Weblogic 12.2.1.4 and creates a connection pool so I need to find out which statements are the longest running ones and make them perform. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
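The procedure I have in mind would look something like this (a sketch; the function name is made up, and it must be owned by a superuser for the trick to work):

```sql
-- SECURITY DEFINER: runs with the privileges of the owner (postgres),
-- so an ordinary user calling it can set the superuser-only GUC.
CREATE OR REPLACE FUNCTION enable_statement_logging() RETURNS void
    LANGUAGE plpgsql SECURITY DEFINER AS $$
BEGIN
    -- is_local = false: the setting persists for the whole session
    PERFORM set_config('log_statement', 'all', false);
END;
$$;

GRANT EXECUTE ON FUNCTION enable_statement_logging() TO PUBLIC;
```

Called from login.sql, this would turn on statement logging for just that user's sessions.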
Re: log_statement GUC parameter
Hi Bruce Yes, I can. I have already done so and it works. I wrote a procedure because of my Oracle background, but it doesn't really matter. This was just a suggestion for the session settable parameters. Regards On 8/12/21 4:23 PM, Bruce Momjian wrote: I think you can write a SECURITY DEFINER function that calls SET, call that function at login. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: log_statement GUC parameter
Thank you Tom! It turns out that the Oracle way of doing things (SECURITY DEFINER) was the wrong way here. Thanks a bunch.

On 8/12/21 4:37 PM, Tom Lane wrote:
> Actually, for that specific requirement, there's an easier way:
> ALTER USER target_user SET log_statement = 'all';
> While the target_user can't do that for himself, a superuser can.

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Cluster fencing tool/software for PostgreSQL cluster with streaming replication
Hi Vikas! I used to see you on the mailing list for another database, one that shall remain unnamed but starts with the letter "O". To answer your question: I have very good experience with Veritas Cluster on both Linux and Solaris (or was it Slowaris? I forget the exact spelling :). It's an industry-standard fail-over cluster and Veritas provides excellent support for it.

Regards

On 8/16/21 6:31 AM, Vikas Sharma wrote:
> Hi, I am planning an enterprise-grade PostgreSQL cluster and am looking for tools/software which will do the cluster management or fencing to avoid split-brain. Please could you let me know if there are tools/software which can do that, so it can be used in a production environment? PostgreSQL 12 with streaming replication + repmgr, and pgpool for connection management and load balancing.
> Regards, Vikas S

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Can't locate Ora2Pg.pm while executing export_schema.sh
You should locate Ora2Pg.pm using "find", like this:

[mgogala@umajor ~]$ find /usr/local -name Ora2Pg.pm
/usr/local/share/perl5/5.32/Ora2Pg.pm

and then make sure that your PERL5LIB contains that location:

export PERL5LIB=/usr/local/share/perl5/5.32:$PERL5LIB

That should do the trick. On most distributions, /usr/local/share/perl5/ is already included in PERL5LIB. Are you working on Cygwin? You can do a simple test:

[mgogala@umajor ~]$ perl -e 'use Ora2Pg; print "$Ora2Pg::VERSION\n";'
21.0

This is not related to PostgreSQL, this is a Perl question: https://gist.github.com/cgoldberg/4332167

On 8/17/21 3:15 AM, Pansara, Jiten wrote:
> Hello Team, I am getting the error below while executing export_schema.sh. Any suggestion/help is really appreciated.
> Can't locate Ora2Pg.pm in @INC (you may need to install the Ora2Pg module) (@INC contains: /usr/lib/perl5/site_perl /usr/share/perl5/site_perl /usr/lib/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib/perl5/core_perl /usr/share/perl5/core_perl) at /c/Strawberry/perl/site/bin/ora2pg line 27.
> BEGIN failed--compilation aborted at /c/Strawberry/perl/site/bin/ora2pg line 27.
> If we don't want to use this script, should we execute ora2pg directly to migrate the database from Oracle to Postgres?
> ora2pg -c ora2pg.conf
> Let me know your views.
> Jiten Pansara, Senior Database Analyst

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Serious Assistance with PostgreSQL True Infinite Arbitrary Precision Maths.
The answer for you is PL/Python3u. It can use numpy, which supports infinite-precision arithmetic and converting results to double and back. It can also do linked lists. Depending on your age and your preferences, you might try PL/Perl. I learned Perl in 1993 and have never stopped loving it. It has several modules for infinite-precision arithmetic; my favorite is Math::BigFloat. As for linked lists, there is the List::DoubleLinked module on CPAN. In other words, all the procedural extensions can be used to fulfill your requirements. BTW, one of my employers told me that Perl is verboten and that the company has standardized on Python, so I had to learn it. It's not too bad if you're into that sort of thing.

Regards

On 8/17/21 10:49 PM, A Z wrote:
> Dear pgsql-general,
> I'm after Infinite Precision Integer (ii) or Infinite Precision Rational (ir) type and function support for PostgreSQL, as an easy installation of two new native types, available under free and open source conditions for public, private and commercial free use. I am aware of the pgmp project. While I have contacted it and am awaiting their response, the present state of pgmp is not enough for my needs. PostgreSQL does not have a private doubly linked list available, and so I can't assemble all this for myself, given that I do not know enough about PostgreSQL extensions and the C language. What I am after is True Arbitrary Precision Arithmetic, not High Precision Arithmetic. What I seek would be contingent only upon available OS or hardware memory. I need arithmetic and comparison support on Integer and Rational (Decimal type) numbers: +, -, *, /, %, ==, !=, <>, >, <, >=, <=. Other important candidates include +=, -=, *=, /=, %=. The trick is that I also need full classical function support as well.
> What I need fairly well is:
>
> cast(ir) returns double precision;  cast(double precision) returns ir;
> cast(ir) returns real;              cast(real) returns ir;
> cast(ir) returns numeric;           cast(numeric) returns ir;
> cast(ir) returns decimal;           cast(decimal) returns ir;
> cast(ii) returns bigserial;         cast(bigserial) returns ii;
> cast(ii) returns serial;            cast(serial) returns ii;
> cast(ii) returns smallserial;       cast(smallserial) returns ii;
> cast(ii) returns bigint;            cast(bigint) returns ii;
> cast(ii) returns integer;           cast(integer) returns ii;
> cast(ii) returns smallint;          cast(smallint) returns ii;
>
> cast(text as ii) returns ii;        cast(text as ir) returns ir;
> cast(ir as text) returns text;      cast(ii as text) returns text;
> cast(ii as ir) returns ir;          cast(ir as ii) returns ii;
>
> sign(ir input) returns ir;
> abs(ir input) returns ir;
> pi(ii places) returns ir;
> e(ii places) returns ir;
> power(ir base, ir exponent) returns ir;
> sqrt(ir input) returns ir;
> nroot(ii theroot, ir input) returns ir;
> log10(ir input) returns ir;
> loge(ir input) returns ir;
> log2(ir input) returns ir;
> factorial(ii input) returns ii;
> degrees(ir input) returns ir;
> radians(ir input) returns ir;
>
> sind(ir input) returns ir;   cosd(ir input) returns ir;   tand(ir input) returns ir;
> asind(ir input) returns ir;  acosd(ir input) returns ir;  atand(ir input) returns ir;
>
> sinr(ir input) returns ir;   cosr(ir input) returns ir;   tanr(ir input) returns ir;
> asinr(ir input) returns ir;  acosr(ir input) returns ir;  atanr(ir input) returns ir;
>
> The last two sets of functions are forward and inverse trigonometry functions, in degrees and radians respectively. I also need the module to install easily: a Windows *.exe or *.msi, a Linux *.deb, *.rpm or *.bin. Is there someone or something out there that can get there?

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
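As an illustration of the PL/Python route, an arbitrary-precision square root via Python's standard decimal module might be sketched like this (the function name, the text return type, and the numeric interchange type are my assumptions, not an existing API):

```sql
CREATE EXTENSION IF NOT EXISTS plpython3u;

-- Square root computed to a caller-chosen number of significant digits.
CREATE OR REPLACE FUNCTION ap_sqrt(x numeric, digits int) RETURNS text
LANGUAGE plpython3u AS $$
from decimal import Decimal, getcontext
getcontext().prec = digits            # precision limited only by memory
return str(Decimal(str(x)).sqrt())
$$;

-- SELECT ap_sqrt(2, 50);
```

The same pattern works for power, logarithms and the rest of the wish list above; the precision is a parameter rather than being fixed by the storage type.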
Re: Licensing
For a six-pack of Sam Adams Boston Lager or Molson Canadian Lager per website, I hereby grant thee the right to use PostgreSQL with any customer, to your heart's content. Please note that I don't accept Budweiser or Miller products.

On 8/18/21 2:39 PM, Bryan Boone wrote:
> Hello Everyone. I have a question about licensing with PostgreSQL. I work for a small company and we design websites for our customers. I read the license that is listed here: https://www.postgresql.org/about/licence/ However, I am still confused. The small company I work for develops websites and applications on AWS EC2 instances for our customers, and we charge the customers a fee for hosting and maintaining their websites. I am trying to figure out which database software I can use and whether I need to buy licenses or not. We do not modify the PostgreSQL code and we do not redistribute the binaries. So far we are just running a single instance for each of our customers, and we are not utilizing anything like a cluster or high-availability CDN, other than what AWS EC2 provides. We always use Wordpress and Drupal for the site front-end. Can someone tell me if I am able to use PostgreSQL for the small company I work for? Is there a way to buy a commercial license of PostgreSQL if we are required to?
> Thanks

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Postgres Migration error - Need help
Why are you connecting to Postgres in the first place? Ora2Pg should connect to Oracle and produce a SQL script to re-create a PostgreSQL version of the Oracle schema. As for the error below, you're probably using an old version of DBD::Pg. Check it using CPAN and upgrade if necessary.

Regards

On 8/18/21 5:50 AM, Pansara, Jiten wrote:
> Hello Team, I am executing the command below to copy data but it is throwing an error. Can anyone help please?
> Jiten Pansara, Senior Database Analyst

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Make bloom extension trusted, but can not drop with normal user
Why do you think that a normal user should be allowed to drop extensions? Extensions are additions to the RDBMS functionality. Do you really want normal users to tinker with the database functionality and reduce the number of index types you can create? I am not sure that I am on board with that idea.

Regards

On 8/20/21 7:01 AM, Li EF Zhang wrote:
> So, I want to know if extensions in the contrib dir can be set "trusted" by ourselves? If we can, I think drop extension should be allowed for a normal user.

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re:
On 8/27/21 8:59 AM, Ravi Krishna wrote:
>> how to take incremental backup in postgresql windows machine.
> AFAIK PG has no concept of incremental backup. pgbackrest has, but I am not sure whether it runs on Windows.

pg_rman can do incremental backups. I think it does the same thing as Oracle without block change tracking enabled: it backs up all the files changed since the last full backup. However, that doesn't help you much because of vacuum, which can touch a file well after it was last referenced in a transaction. I like pg_rman because it can automatically delete old backups and keep only the 4 latest ones.

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Insert statement doesn't complete
On 8/28/21 11:40 PM, Trang Le wrote:
> Hi all, I am facing an issue when I run a script:
> INSERT INTO mdm.etl_exception_info SELECT * FROM mdm.etl_exception_info_vw2;

Is there any error message thrown? If there is, can you tell us? Is autocommit on? What does the log say?

> After running the script I check it in pg_lock, everything is ok, the job is done, data is inserted. However, the session containing the script above doesn't complete, though data is inserted and the job is done, as verified in pg_lock.

What does the phrase "the script above doesn't complete" mean? How do you know that the script doesn't complete? Is the script hanging? Furthermore, what is the software version? What platform are you using?

> Could you help me on this issue?

Insufficient data for a meaningful answer.

> Thanks all. Regards, Trang

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: database design with temporary tables
On 8/29/21 12:24 PM, Adrian Klaver wrote:
> Presumably not. Temporary tables only live at most for the length of a session. It would be a really bad idea to hold sessions open for 24 hours. That is assuming nothing else causes the session to drop and the data to be lost.

Well, that's precisely how application servers work. They allocate a connection pool and keep the connections open for a very long time.

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: database design with temporary tables
On 8/29/21 2:26 PM, Adrian Klaver wrote:
> The pool is maintained, the individual connections (sessions) come and go. Otherwise there would be no point to having a pool. Every time the connection (session) is closed, the temporary table disappears.

Yes, you're right. And that is very convenient. I also prefer temporary tables with ON COMMIT DELETE ROWS, so that I don't have leftovers between transactions. There is also the pgtt extension by Gilles Darold, which emulates Oracle global temporary tables: https://github.com/darold/pgtt#use-of-the-extension

I have never particularly liked Oracle's implementation of temporary tables, local or global, so I have no problem with questions like this. BTW, I have compared Oracle's implementation with other implementations here: https://dbwhisperer.wordpress.com/2019/03/18/private-temporary-tables-oracle-18c/ I could have used Postgres instead of SQL Server; the comparison would read the same, except for the "#" character which denotes a temporary table in SQL Server.

Regards

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
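The ON COMMIT behavior mentioned above, for reference (table and column names are just an example):

```sql
-- Rows vanish at the end of each transaction; the table definition
-- itself persists for the life of the session.
CREATE TEMPORARY TABLE scratch (
    id   integer,
    note text
) ON COMMIT DELETE ROWS;

BEGIN;
INSERT INTO scratch VALUES (1, 'visible only inside this transaction');
SELECT count(*) FROM scratch;   -- 1
COMMIT;

SELECT count(*) FROM scratch;   -- 0: the commit emptied the table
```

With pooled connections this matters: a session handed back to the pool carries no leftover rows into the next transaction.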
Re: How to observe plan_cache_mode transition from custom to generic plan?
> ... bind values and table statistics.

People switching from Oracle, me being one of them, frequently make the mistake of using bind variables in Postgres. Sometimes there is no choice, for instance if your app uses an application server and an ORM. Postgres doesn't have the memory structure called the "shared pool", and cursors and compiled procedures are not cached. That means that by using Postgres you are deprived of all the fun of managing cursor caching, the shared pool, database statistics and latches (don't ask, please). There is a small consolation: you get to keep a ton of money in your pocket. You also get a stable plan; there are no adaptive plans or cursor sharing. Franck Pachot, an Oracle ACE, has an excellent article on the subject: https://franckpachot.medium.com/postgresql-bind-variable-peeking-fb4be4942252

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: How to observe plan_cache_mode transition from custom to generic plan?
On 9/6/21 2:26 AM, Laurenz Albe wrote:
> "Bind variables" just being an Oraclism for parameters, it is *not* a mistake to use them in PostgreSQL.

Actually, it is a mistake, because they don't give you any performance benefit and can potentially worsen performance. There is no cursor sharing, and generic plans can be much worse than "custom" plans generated with the actual values. The only reason for using bind variables/parameters is to protect yourself from SQL injection. Creating SQL dynamically from input is the recipe for the "little Bobby Tables" situation: https://xkcd.com/327/

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
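The custom-versus-generic distinction is easy to observe with a prepared statement (a sketch; table and plan shapes depend on your data, and plan_cache_mode requires PostgreSQL 12 or later):

```sql
PREPARE by_dept(int) AS
    SELECT ename FROM emp WHERE deptno = $1;

-- With the default plan_cache_mode = auto, the first few executions
-- are planned with the actual parameter value ("custom" plan):
EXPLAIN EXECUTE by_dept(10);    -- expect: Filter: (deptno = 10)

-- Force a generic plan to see what you get without the value:
SET plan_cache_mode = force_generic_plan;
EXPLAIN EXECUTE by_dept(10);    -- expect: Filter: (deptno = $1)

RESET plan_cache_mode;
```

Comparing the two EXPLAIN outputs on a skewed column shows exactly when a generic plan is worse than one planned with the real value.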
Re: The tragedy of SQL
Replies in-line.

On 9/14/21 01:51, Guyren Howe wrote:
> They are making a decent decision. SQL is a *fucking terrible* language, which I don't blame them for not wanting to learn.

Based on what criteria?

> The whole industry, programming languages, infrastructure, everything would have developed differently if relations were a natural, pleasurable thing to use in any programming language. Like an Array, or a Hash.

There is nothing natural about either relations or arrays and hashes/dictionaries. Relations are a pretty literal implementation of basic set theory, and a decent understanding of basic set theory is a precondition for understanding SQL. Now, we can discuss whether a language implementing a mathematical theory is "good" or "bad", whatever "good" or "bad" mean in the given context. Historically, SQL is a good fit for the banking business and accounting, and that is why it is still around.

Another misconception about SQL is treating it as a general-purpose programming language. SQL is a data description language, nothing more, nothing less. It doesn't need loops, arrays, hashes or subroutines; its principal purpose is to describe a subset of data. Without SQL you would have to read all the data and filter out the unnecessary stuff yourself.

Furthermore, part of SQL are the so-called "ACID requirements": a transaction can only see data that was committed before the transaction began. Implementing ACID takes a lot of code; that's what MVCC is all about. However, that too has its origins in accounting. You cannot pay with money you don't have.

And the last thing about SQL is transaction management. Without relational databases and SQL, you would need a proprietary transaction manager, just like MongoDB. And MongoDB has a complex proprietary transaction manager and is losing market share.

So, to recapitulate:

* Declarative subset definition
* ACID consistency
* Transaction management
* Ideal fit for accounting

That is why SQL is still around.
And that is why we all live in a yellow subroutine (this reference is not for the millennials or younger). -- I'll speak the key, the whole key and nothing but the key, so help me Codd. Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: The tragedy of SQL
On 9/14/21 02:18, Rob Sargent wrote: All languages are fucking terrible. I like English. It's not very complex and it allows me to express myself very well. You should see my native tongue, Croatian language, from the group of Slavic languages. It's fucking terrible. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: The tragedy of SQL
On 9/14/21 16:07, Raymond Brinzer wrote:
> By analogy: Arabic and Roman numerals both describe the natural numbers. Hence, they have the same mathematical properties. Spending a little time doing algebra with Roman numerals should convince anyone, however, that how you express a concept matters a lot.

Analogy is not an accepted method of logical proof. I do agree that the style of expression matters; that is why we have literature. Saying "hey there" and "Friends, Romans, countrymen, lend me your ears" can have the same meaning in a given context, but the latter expression is much more poetic. As software engineers, we are very much opposed to poetry, especially those among us using Perl. However, the stated purpose of the SQL language is to describe sets and subsets; it is supposed to be verbose. Storage extensions are database-specific and exist for performance reasons.

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: The tragedy of SQL
On 9/14/21 15:57, Guyren Howe wrote: Verbosity. Redundancy. Lack of orthogonality. Resemblance to English. Verbosity is a feature, as is the resemblance to English. The language is meant to be understood by accountants. Once upon a time people were using something called "COmmon Business Oriented Language", which was also very verbose, for the same reason: it had to be understandable to business people. SQL was designed by people with a background in mathematics and is thus readily understandable to people with the same background. I have no problem with SQL. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: The tragedy of SQL
On 9/16/21 19:29, Guyren Howe wrote: Missing my original point here. The set theory is the _point_. SQL is a gargantuan distraction from using it efficiently. Imagine if COBOL was the only widely-available programming language with functions. You might use it, because functions are really great abstraction for programming. That wouldn’t mean that COBOL wasn’t an utterly awful language. SQL is like that, only for relations, sets and logic. I am probably a bit biased because I have only been using SQL for a few weeks now. I was the first person certified for teaching Oracle 6 internals in the EMEA region and I was a certified Oracle 5.1.22 DBA on VAX/VMS. So, by using SQL for several weeks now, I have grown attached to it and I like it. My second problem is my mathematical background. I have a BSc in mathematics and was quite well acquainted not only with basic set theory but also with things like Zermelo's axiom of choice, Zorn's lemma and the well-ordering theorem. I am still quite familiar with relational algebra: unions, intersections, Cartesian products, equivalence relations and the like. I think that SQL represents relational algebra quite well. It must be doing something right, because it has lasted so long. After all, it was created by a mathematician. There is a famous quote from the movie "Chinatown" saying that politicians, ugly buildings, and whores all get respectable if they last long enough. The same applies to computer languages. I love the smell of SQL in the morning. -- I'll speak the key, the whole key and nothing but the key, so help me Codd. Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Possibilities for optimizing inserts across oracle_fdw foreign data wrapper
On 9/19/21 06:28, Niels Jespersen wrote: Hello all We are often using the oracle_fdw to transfer data between Postgres (version 11+) and Oracle (version 18+). It works great. However I have a task at hand that requires inserting a few billion rows in an Oracle table from a Postgres query. insert into t_ora (a,b,c) select a,b,c from t_pg; This is driven from a plpgsql stored procedure, if that matters. I want to optimize the running time of this. But I am unsure of which, if any, possibilities there actually are. Reducing the number of network roundtrips is usually a good way to increase throughput. But, how do I do that? If I could make the Oracle insert direct load, that would usually also increase throughput. But, is that possible here? There are no constraints defined on the destination tables. Regards Niels Jespersen The problem with oracle_fdw is that the SQL is parsed on the Postgres side, not on the Oracle side. If it were parsed on the Oracle side, you could use the /*+ APPEND */ hint, which is essentially a direct-path insert. You will have to write a script in one of the scripting languages that utilizes array inserts, available with the instant client. Even the Oracle ODBC driver utilizes array inserts, as visible from the following article: https://dbwhisperer.wordpress.com/2020/11/21/pyodbc-fast_executemany-and-oracle-rdbms/ Unfortunately, the Postgres side of the equation does not do particularly well with array operations when trying to cut down on the number of network round trips: https://github.com/mkleehammer/pyodbc/wiki/Driver-support-for-fast_executemany I would read the data with a script on the Postgres side and then use the superior array options provided by SQL*Net on the Oracle side. You will need some fancy programming to prevent waiting on each operation. I would actually write 2 scripts: one reading data from Postgres and converting it to CSV, piping into a second one that inserts the data into Oracle. 
That would make the scripts work in parallel, at least partially. Situations like this are the reason why a DBA needs to know how to script. So, this is where you start: https://python.swaroopch.com/ Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
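The round-trip arithmetic behind the advice above can be sketched in Python. This is a minimal illustration of the batching idea, not the original poster's code; the database calls shown in comments (psycopg2 on the Postgres side, python-oracledb's `executemany` on the Oracle side) are assumptions, and only the generic batching logic is shown runnable.

```python
# Sketch: ship rows in fixed-size batches instead of one network
# round trip per row. The batching helper is generic; real database
# connections are only indicated in comments.

from itertools import islice

def batches(rows, size):
    """Yield lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

# With real connections it would look roughly like this
# (hypothetical cursor names, not from the original post):
#
#   pg_cur.execute("SELECT a, b, c FROM t_pg")
#   while True:
#       chunk = pg_cur.fetchmany(10_000)          # one round trip per chunk
#       if not chunk:
#           break
#       ora_cur.executemany(
#           "INSERT INTO t_ora (a, b, c) VALUES (:1, :2, :3)", chunk)

if __name__ == "__main__":
    rows = [(i, i * 2, i * 3) for i in range(25)]
    print([len(c) for c in batches(rows, 10)])   # batch sizes: [10, 10, 5]
```

With a batch size of 10,000, a billion rows become roughly 100,000 round trips instead of a billion, which is where most of the speedup comes from.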
Re: Faster distinct query?
On 9/22/21 16:20, David G. Johnston wrote: I'd probably turn that index into a foreign key that just ensures that every (station,channel) that appears in the data table also appears on the lookup table. Grouping and array-ifying the lookup table would be trivial. Either modify the application code or add a trigger to populate the lookup table as needed. I fully agree with this. Adding a trigger to populate a lookup table is a standard design in situations like this. Using "DISTINCT" almost always spells trouble for performance. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
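A trigger of that kind could look roughly like the following sketch. The table and column names (`data_table`, `station`, `channel`) are hypothetical, following the thread's example; this assumes PostgreSQL 11+ for the `EXECUTE FUNCTION` syntax.

```sql
-- Lookup table of distinct (station, channel) pairs, kept in sync
-- by a trigger so that DISTINCT over the big table is never needed.
CREATE TABLE station_channel (
    station text NOT NULL,
    channel text NOT NULL,
    PRIMARY KEY (station, channel)
);

CREATE FUNCTION maintain_station_channel() RETURNS trigger AS $$
BEGIN
    INSERT INTO station_channel (station, channel)
    VALUES (NEW.station, NEW.channel)
    ON CONFLICT DO NOTHING;   -- pair already known: no-op
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_station_channel
    BEFORE INSERT ON data_table
    FOR EACH ROW EXECUTE FUNCTION maintain_station_channel();
```

The "distinct query" then becomes a trivial scan of the small lookup table instead of an aggregate over the full data table.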
Re: Force re-compression with lz4
On 10/18/21 01:07, Michael Paquier wrote: CPU-speaking, LZ4 is *much* faster than pglz when it comes to compression or decompression with its default options. The compression ratio is comparable between both, still LZ4 compresses on average less than PGLZ. -- Michael LZ4 works much better with deduplication tools like Data Domain or Data Domain Boost (client-side deduplication). With zip or gzip compression, deduplication ratios are much lower than with LZ4. Most modern backup tools (DD, Veeam, Rubrik, Commvault) support deduplication. The LZ4 algorithm uses less CPU than zip, gzip or bzip2 and works much better with the deduplication algorithms employed by backup tools. This is actually a very big and positive change. Disclosure: I used to work for Commvault as a senior PS engineer. Commvault was the first tool on the market to combine LZ4 and deduplication. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Force re-compression with lz4
On 10/18/21 11:01, Adrian Klaver wrote: Not sure how much this applies to the Postgres usage of lz4. As I understand it, this is only used internally for table compression. When using pg_dump compression, gzip is used. Unless you pipe plain text output through some other program. This applies when using a 3rd-party commercial backup tool with deduplication. You'd be surprised how many people do that. One tool to backup them all, one tool to find them and on the LTO cartridge backup them. I apologize for this cheesy paraphrase of Tolkien, but I couldn't resist. Long story short, for the 3rd-party backup tools LZ4 compression will yield better deduplication ratios than other forms of compression, thereby saving you space. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: WAL streaming and dropping a large table
On 10/21/21 16:23, Rory Falloon wrote: Hi, My postgres server setup is thus: a production database, which is using WAL streaming (hot standby) to four other hosts of various latencies. I have a table that is ~200GB which has been backed up, and now I want to remove it. The table is not in use; it is a child table to a parent table that _is_ in use, but I foresee no issues here. The total DB size is roughly 300GB. The main reason for needing to remove it is to recover the space back on the partition which is humming away at 88% usage, and the table I wish to drop is better off in an archive somewhere. I just removed around 10% of it with 'delete from', which of course cascaded to the replication hosts. This increased the size of my pg_xlog folder (from ~3GB to ~6.5GB) which of course increased my partition usage. Obviously this is expected, but I wonder what would happen if I had issued the 'drop table'? I expect the nature of enabling max_replication_slots would mean the database would retain the wal segments until all have caught up; it could take quite a long time to 'catch up' after the 'drop table' command? It took about 10 minutes before the pg_xlog folder size had 'settled down' to what I normally see as default which is ~3GB. > wal_keep_segments is defined as 128, > max_wal_size is not defined, > max_replication_slots is enabled I'd prefer to use 'drop table' as it would recover the space immediately, but not if it will impact the production database in a way that the partition will become full, which defeats the purpose. Is it advisable to move the pg_xlog folder to another volume on the system with more space (which I have) and symlink - obviously a stop & start of the db required - and then let the WAL archives fill up as need be? Or am I missing something obvious (likely) Thanks Well, the difference between "drop table" and "delete" is that "drop table" is a transaction on the data dictionary. 
PostgreSQL has transactional DDL, which means that "drop table" can be rolled back, so the transaction has to be protected by WAL, but it is just the log of the dictionary tables. There is a third option, "truncate", which is also DDL: it creates a new empty table with the same description and discards the old files. Neither "drop" nor "truncate" will generate much WAL. On the other hand, "delete" is a regular DML transaction which will generate the logs needed to recover the 200GB table. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
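The three options can be put side by side in SQL. This is an illustrative sketch with a hypothetical `big_table`, not the poster's schema:

```sql
-- DML: row-level change, fully WAL-logged, one record per affected row.
DELETE FROM big_table WHERE created < now() - interval '1 year';

-- DDL: swaps in new, empty files; WAL-logs only the catalog change.
TRUNCATE big_table;

-- DDL: removes the catalog entry and the files. It is transactional,
-- so it can still be rolled back before COMMIT:
BEGIN;
DROP TABLE big_table;
ROLLBACK;   -- the table is back
```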
Re: ZFS filesystem - supported ?
On 10/23/21 07:29, Laura Smith wrote: Hi, Given an upcoming server upgrade, I'm contemplating moving away from XFS to ZFS (specifically the ZoL flavour via Debian 11). BTRFS seems to be falling away (e.g. with Redhat deprecating it etc.), hence my preference for ZFS. However, somewhere in the back of my mind I seem to have a recollection of reading about what could be described as a "strong encouragement" to stick with more traditional options such as ext4 or xfs. A brief search of the docs for "xfs" didn't come up with anything, hence the question here. Thanks ! Laura Hi Laura, May I ask why you would like to change file systems? Probably because of the snapshot capability? However, ZFS performance leaves much to be desired. Please see the following article: https://www.phoronix.com/scan.php?page=article&item=ubuntu1910-ext4-zfs&num=1 This is relatively recent, from 2019. On page 3 there are tests with SQLite, Cassandra and RocksDB; Ext4 is much faster in all of them. Finally, there is another article about relational databases and ZFS: https://blog.docbert.org/oracle-on-zfs/ In other words, I would test very thoroughly because your performance is likely to suffer. As for the supported part, that's not a problem: Postgres supports all modern file systems, using POSIX system calls to manipulate, read and write files. Furthermore, if you need snapshots, disk arrays like NetApp, Hitachi or EMC can always provide them. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: ZFS filesystem - supported ?
On 10/23/21 09:37, Laura Smith wrote: Hi Mladen, Yes indeed, snapshots is the primary reason, closely followed by zfs send/receive. I'm no stranger to using LVM snapshots with ext4/xfs but it requires a custom shell script to manage the whole process around backups. I feel the whole thing could well be a lot cleaner with zfs. Thank you for the links, I will take a look. Laura Yes, ZFS is extremely convenient. It's a volume manager and a file system, all rolled into one, with some additional convenient tools. However, performance is a major concern. If your application is OLTP, ZFS might be a tad too slow for your performance requirements. On the other hand, snapshots can save you a lot of time with backups, especially if you have a commercial backup tool capable of multiple readers. The only way to find out is to test. The ideal tool for testing is pgio: https://kevinclosson.net/2019/09/21/announcing-pgio-the-slob-method-for-postgresql-is-released-under-apache-2-0-and-available-at-github/ For those who do not know, Kevin Closson was the technical architect who built both Exadata and EMC XtremIO. He is now the principal engineer of Amazon RDS. This part is intended only for those who would tell him that "Oracle has it" is not good enough, if he ever decided to post here. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: ZFS filesystem - supported ?
On 10/23/21 23:12, Lucas wrote: I don't know where you have your database deployed, but in my case it is in AWS EC2 instances. The way I handle backups is at the block storage level, performing EBS snapshots. Yes, Amazon uses SAN equipment that supports snapshots. This has proven to work very well for me. I had to restore a few backups already and it always worked. The bad part is that I need to stop the database before performing the Snapshot, for data integrity, so that means that I have a hot-standby server only for these snapshots. Lucas Actually, you don't need to stop the database. You need to execute pg_start_backup() before taking a snapshot and then pg_stop_backup() when the snapshot is done. You will need to recover the database when you finish the restore, but you will not lose any data. I know that pg_start_backup() and pg_stop_backup() are deprecated, but since PostgreSQL doesn't have any API for storage or file system snapshots, that's the only thing that can help you use storage snapshots as backups. To my knowledge, the only database that does have an API for storage snapshots is DB2. The API is called "Advanced Copy Services" or ACS. It's documented here: https://www.ibm.com/docs/en/db2/11.1?topic=recovery-db2-advanced-copy-services-acs For Postgres, the old start/stop backup functions should be sufficient. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
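In SQL terms, the procedure looks roughly like this. The label is hypothetical, and this is the pre-PostgreSQL-15 syntax (the functions were later renamed pg_backup_start()/pg_backup_stop() and the exclusive mode removed); as noted further down the thread, the WAL generated between the two calls must also be archived for the snapshot to be restorable.

```sql
-- Put the cluster into backup mode (second argument requests an
-- immediate checkpoint):
SELECT pg_start_backup('ebs-snapshot', true);

-- ... take the storage/EBS snapshot of the data volume here ...

-- Leave backup mode:
SELECT pg_stop_backup();
```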
Re: ZFS filesystem - supported ?
On 10/25/21 13:13, Stephen Frost wrote: No, it's not - you must also be sure to archive any WAL that's generated between the pg_start_backup and pg_stop_backup and then to be sure and add into the snapshot the appropriate signal files or recovery.conf, depending on PG version, to indicate that you're restoring from a backup and make sure that the WAL is made available via restore_command. Just doing start/stop backup is *not* enough and you run the risk of having an invalid backup or corruption when you restore. If the entire system is on a single volume then you could possibly just take a snapshot of it (without any start/stop backup stuff) but it's very risky to do that and then try to do PITR with it because we don't know where consistency is reached in such a case (we *must* play all the way through to the end of the WAL which existed at the time of the snapshot in order to reach consistency). In the end though, really, it's much, much, much better to use a proper backup and archiving tool that's written specifically for PG than to try and roll your own, using snapshots or not. Thanks, Stephen Stephen, thank you for correcting me. You, of course, are right. I had erroneously thought that backup of the WAL logs was implied, because I always back them up. And yes, that needs to be made clear. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: ZFS filesystem - supported ?
On 10/25/21 15:43, E-BLOKOS wrote: what about BTRFS since it's the successor of ZFS? BTRFS is NOT the successor to ZFS. It never was. It was a completely new file system developed by Oracle Corp. For some reason, Oracle seems to have lost interest in it. Red Hat has deprecated it and, in all likelihood, BTRFS will go the way of Solaris and SPARC chips: riding into the glorious history of computer science. BTRFS has never been widely used, not even among Fedora users like me; it suffered from corruption and performance problems. This is probably not the place to discuss the inner workings of snapshots, but it is worth knowing that snapshots drastically increase the IO rate on the file system - for every snapshot. That's where the slowness comes from. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: ZFS filesystem - supported ?
On 10/26/21 05:35, Laura Smith wrote: Curious, when it comes to "traditional" filesystems, why ext4 and not xfs ? AFAIK the legacy issues associated with xfs are long gone ? XFS is no longer being very actively developed. Ext4 is, and it has some features to help with SSD space allocation. Phoronix has some very useful benchmarks: https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.14-File-Systems Ext4 is much better than XFS in the SQLite tests and almost equal in the MariaDB test. PostgreSQL is a relational database (let's forget the object part for now) and its IO patterns will be similar to those of SQLite and MariaDB. That benchmark is brand new, done on kernel 5.14. Of course, the only guarantee is doing your own benchmark, with your own application. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: ZFS filesystem - supported ?
On 10/26/21 20:12, E-BLOKOS wrote: RedHat and Oracle are mostly maintaining XFS updates, and I didn't see anything saying it's not maintained actively, especially when they're offering many solutions with XFS as default Oh, they are maintaining it, all right, but they're not developing it. XFS is still the file system for rotational disks with platters, reading heads, tracks and sectors, the type we were taught about in school. Allocation policy for SSD devices is completely different, as are the physical characteristics. Ext4 is being adjusted to the ever more popular NVMe devices. XFS is not. In the long run, my money is on Ext4 or its successors. Here is another useful benchmark: https://www.percona.com/blog/2012/03/15/ext4-vs-xfs-on-ssd/ This one is a bit old, but it shows a clear advantage for Ext4 in async mode. I may be wrong, but neither of the two file systems has gained many fundamentally new features since 2012. The future may lie in F2FS ("Flash Friendly File System"), which is very new but has a ton of optimizations for SSD devices. Personally, I usually use XFS for my databases, but I am testing Ext4 with Oracle 21c on Fedora. So far, the difference is imperceptible. I am primarily an Oracle DBA and I am testing with Oracle; that doesn't necessarily have to be pertinent for Postgres. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: ZFS filesystem - supported ?
On 10/26/21 20:50, Imre Samu wrote: > Phoronix has some very useful benchmarks: > https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.14-File-Systems > Ext4 is much better than XFS with SQLite tests and almost equal with > MariaDB test. PostgreSQL is a relational database (let's forget the > object part for now) and the IO patterns will be similar to SQLite and > MariaDB. There is a link from the Phoronix page to the full OpenBenchmarking.org result file, and multiple PostgreSQL 13 pgbench results are included: https://openbenchmarking.org/result/2108260-PTS-SSDS978300&sor&ppt=D&oss=postgres ( XFS, F2FS, EXT4, BTRFS ) Regards, Imre Wow! That is really interesting. Here is the gist of it: XFS is the clear winner. It also answers the question about BTRFS. Thanks Imre! -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Incremental backup
On 10/28/21 18:07, Andreas Joseph Krogh wrote: I think everybody agrees that incremental backup /per database/, and not cluster-wide, is nice, and it would be nice if PG supported it. But, given the way PG is architectured, having cluster-wide WALs, that's not an easy task to implement. Repeating "other databases have it" doesn't change that. -- Andreas Joseph Krogh I am not advocating for database-level incremental backups, but all databases that have them also have cluster-wide/instance-wide WAL logs. Cluster-wide WAL logs do not make database-level incremental backups hard. Both Oracle and DB2 have database-level incremental backups and both have cluster-wide WAL (redo logs). So does SQL Server, which is a bit specific because its log is used both for recovery, as WAL, and for undo, so it's very different from Postgres. Also, SQL Server is, to my knowledge, the only one of those databases which doesn't use MVCC. The technology that all of those databases employ is a bitmap with one bit per database block. Full backups set all bits to 0, and whenever a block is modified, the corresponding bit is set to 1. The backup tool in incremental mode then copies only the blocks whose bit is 1. I am not too thrilled by that implementation. In particular, with Oracle there were bugs with database restore, the "duplicate database" operation, global cache locks and instance latches. I've had quite a few headaches with RMAN cumulative incremental backups. My preferred method of backup is a storage snapshot. Snapshots can then be backed up to other arrays (NetApp SnapVault, Hitachi HUR, EMC SRDF) or to deduplicated offline storage like AWS Glacier or EMC Data Domain using simple file-level utilities. Once a snapshot is taken, it is a read-only file system and the files are no longer open in the database processes. The classic file-level backup tools like pg_basebackup or rman are pretty much unusable once your database hits 50 TB or so. 
With a 50TB database, your RTO (Recovery Time Objective) will be in days. You can only push around 3TB/hour down a 10Gbit Ethernet pipe. On the other hand, companies like Facebook, Amazon, Walmart or Target are losing hundreds of thousands of USD per hour of downtime. Downtime of 15 hours or longer is completely unacceptable. Backup is only the last line of defense. It should only be used if everything else fails. And if you have to use a file-level tool like pg_basebackup and tar to restore your VLDB, you should also get your resume ready. Incremental backups are awkward and ungainly. You have to restore the full backup and all incrementals taken since the last full; after that, you must apply the remaining WAL files. So restoring an incremental backup is, by definition, slower than restoring a full backup. If you have to take backups, get yourself a deduplicating backup utility. Deduplication will only save the backup blocks that are different from the previous backup. In other words, you can do a daily full backup with the space expenditure of an incremental backup. Also, if you need backups for regulatory reasons (HIPAA, SOX), incremental backups will not do. Long story short, there are much more important things to do on Postgres than incremental database-level backups. BTW, Chuck Norris can take an incremental database-level backup of a Postgres database and recover the database without restoring the preceding full backup. I am not sure whether Chuck Norris is a member of the Postgres steering committee. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Incremental backup
On 10/29/21 03:30, Peter J. Holzer wrote: I was not aware that Oracle even has something equivalent to a Postgres cluster. hp It's called "Oracle instance". Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Incremental backup
On 10/29/21 08:49, Peter J. Holzer wrote: I don't think that's equivalent. An Oracle instance is a runtime concept (the collection of server processes on a single machine serving a single database (the collection of files on the disk)) whereas a PostgreSQL cluster is both a data and a runtime concept (config + data files for several databases with some shared data + the processes serving them). Also instance:database is n:1 while cluster to database is 1:n. Very different. Peter, an Oracle instance manages a collection of databases and ensures recoverability using redo logs, which are completely analogous to WAL logs, if managed a bit differently. Let's not nitpick here. An Oracle instance is completely analogous to a Postgres cluster. If you ask me, the word "cluster" was picked to avoid the word "instance". -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Incremental backup
On 10/29/21 09:43, Peter J. Holzer wrote: On 2021-10-29 09:03:04 -0400, Mladen Gogala wrote: On 10/29/21 08:49, Peter J. Holzer wrote: I don't think that's equivalent. An Oracle instance is a runtime concept (the collection of server processes on a single machine serving a single database (the collection of files on the disk)) whereas a PostgreSQL cluster is both a data and a runtime concept (config + data files for several databases with some shared data + the processes serving them). Also instance:database is n:1 while cluster to database is 1:n. Very different. Peter, Oracle instance manages collection of the databases and is ensuring recoverabilty using redo logs, which are completely analogous to WAL logs, if managed a bit differently. Let's not be nitpicking here. Oracle instance is completely analogous to Postgres cluster. If you ask me, the word cluster was picked to avoid the word "instance" I'm quoting Tom Kyte here: | In fact, it is true to say that an instance will mount and open at | most a single database in its entire lifetime! While that article is originally from 2009, it was last changed in 2021, and I'd trust Tom to change something as fundamental if it wasn't true anymore. hp A few years ago, in July 2013, Oracle introduced version 12c with the multi-tenant option. http://appstech.com/2013/08/oracle-announces-general-availability-of-oracle-database-12c-the-first-database-designed-for-the-cloud/ Each Oracle instance can now manage up to 255 pluggable databases which are, for all intents and purposes, equivalent to Postgres databases, if somewhat more awkward. The "unplug" and "plug in" operations leave a lot of room for improvement. That makes an Oracle instance functionally equivalent to the Postgres cluster. The current Oracle version, 21c, no longer supports the flat architecture without pluggable databases. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: ZFS filesystem - supported ?
On 11/1/21 17:58, Stephen Frost wrote: Well, at least one alternative to performing these snapshots would be to use a tool like pg_basebackup or pgbackrest to perform the backups instead. File system based backups are much slower than snapshots. The feasibility of a file-based utility like pg_basebackup depends on the size of the database and the quality of the infrastructure. However, if opting for snapshot-based solutions, I would advise something like Pure or NetApp which use special hardware to accelerate the process and have tools to back up snapshots, like SnapVault (NetApp). Also, when using file-level utilities, I would go full commercial. Commercial backup utilities are usually optimized for speed, support deduplication and maintain a backup catalog, which can come in handy if there are regulatory requirements about preserving your backups (HIPAA, SOX). -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Postgres Equivalent of Oracle Package
On 11/16/21 12:23, DAVID ROTH wrote: One of the nice things about Oracle packages is that the code is loaded and global values are set and stored only once per session. This is very useful for values that are used repeatedly. What is the best way of emulating this behavior in Postgresql? You can create a temporary table with "ON COMMIT PRESERVE ROWS", which is visible in a session and cannot cross the session boundary. Values in the table can be queried throughout the session. That is a pretty good approximation of the session variables in Oracle. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
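The temporary-table approach could look roughly like this. The table and variable names are hypothetical; `ON COMMIT PRESERVE ROWS` is actually the default for temporary tables, but spelling it out documents the intent:

```sql
-- Per-session key/value store standing in for Oracle package variables.
CREATE TEMPORARY TABLE session_vars (
    name  text PRIMARY KEY,
    value text
) ON COMMIT PRESERVE ROWS;

-- Set (or overwrite) a "session variable":
INSERT INTO session_vars VALUES ('batch_id', '42')
ON CONFLICT (name) DO UPDATE SET value = EXCLUDED.value;

-- Any later statement in the same session can read it back;
-- other sessions never see it:
SELECT value FROM session_vars WHERE name = 'batch_id';
```

The rows survive commits but disappear when the session ends, which mirrors the lifetime of Oracle package state.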
Re: Database Scalability
On 11/30/21 22:08, Saurav Sarkar wrote: Hi Ben, Thanks a lot for your reply. So are all the schemas on one DB or are distributed/sharded across multiple DBs ? Best Regards, Saurav To my knowledge PostgreSQL doesn't support sharding, which is well and good because sharding is mostly useless, at least in my opinion. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT
On 12/6/21 22:16, Laurenz Albe wrote: An ugly alternative is to use "ORDER BY id + 0", which prevents PostgreSQL from using the index. That was actually the earliest form of Oracle hints. I remember doing exactly that in Oracle 5.1.22 on VAX/VMS. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
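The trick works because the planner can only use an index for `ORDER BY` when the sort key is the bare indexed column. A sketch with a hypothetical table `t` and indexed primary key `id`:

```sql
-- The planner can satisfy this ORDER BY from the index on id, which
-- with a small LIMIT may tempt it into walking the index instead of
-- using a better filtering index:
SELECT * FROM t ORDER BY id LIMIT 10;

-- Wrapping the column in a no-op expression hides the index from the
-- sort, steering the planner toward a different plan:
SELECT * FROM t ORDER BY id + 0 LIMIT 10;
```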
Re: Reindex "locked" standby database
On 12/14/21 22:37, Michael Paquier wrote: You are referring to the startup process that replays WAL, right? Without having an idea about the type of workload your primary and/or standbys are facing, as well as an idea of the configuration you are using on both (hot_standby_feedback for one), I have no direct idea, but that could be a conflict caused by a concurrent vacuum. Hi Michael, I am preparing for a standby deployment. I don't have a standby yet and, therefore, I don't have any standby problems. Would it be advisable to turn vacuum off on the standby? Applying WAL will also, in theory, populate the statistics, which are also held in database blocks. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Packages, inner subprograms, and parameterizable anonymous blocks for PL/pgSQL
On 12/16/21 16:48, Pavel Stehule wrote: I don't think Postgres needs packages - this is a redundant concept in Postgres, when Postgres has schemas (different from Oracle's schemas) and extensions. Yes, ORAFCE uses schemas as the package names. However, one very practical thing is missing: session variables. Yes, you can emulate those with ON COMMIT PRESERVE ROWS temporary tables, but that's a rather ugly hack. On the other hand, packages can easily be emulated by using Python. Having packages would make PL/pgSQL programming much prettier. It would be much prettier to group related routines into a package than to have them lying around without anything indicating that the routines are related. On the plus side, packages would make it much easier to migrate from Oracle to Postgres. And you do want that, don't you? I am very well aware of the animosity toward the Oracle community and the reasons for that animosity. Oracle wiped the floor with the predecessor of Postgres, the database called "Ingres". Sandra Kurtzig, the Ingres CEO at the time, and Michael Stonebraker were both involved in a very public spat with Larry Ellison. Stonebraker is still very much a factor in the Postgres community and I doubt that his feelings toward Larry and his company have grown any warmer over the years. Postgres was created with the idea of fighting the Dark Lord in the land of Belmont, CA. However, if it is the goal of the community to have users convert from Oracle to Postgres, then making that conversion easy would be conducive to that goal. A long time ago a former pastry baker wrote the following article: https://www.toolbox.com/tech/data-management/blogs/why-postgresql-doesnt-have-query-hints-020411/ Disclosure: I am the Oracle DBA that Josh Berkus is talking about in that article. Recent events with "advice" to a very well known Oracle personality testify to the fact that the emotions haven't changed much since that article was written. 
Of course, Oracle is still the most popular database in the world. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: List all tables from a specific database
There is also INFORMATION_SCHEMA which is SQL standard and extremely easy to use. On Fri, Jan 14, 2022, 1:08 PM Thomas Boussekey wrote: > > > Le ven. 14 janv. 2022 à 12:04, Ray O'Donnell a écrit : > >> On 14/01/2022 10:39, Flaviu2 wrote: >> > I work on a project that use Postgre SQL, and I have 0 experience in >> > this regard. I hope to get help here. I need two SQL scripts for >> Postgre: >> > >> > 1. Get all databases, which I aquired already: >> > >> > *SELECT datname FROM pg_database WHERE datistemplate = false* >> > >> > This one is functional, it's ok. >> > >> > 2. Now, I need to find all tables *under a specific* database. This one >> > I don't know how to achieve it. Can you help me here ? It is possible ? >> > >> If it's any help, running psql with the -E switch will show you the SQL >> which psql generates... so then (within psql) issuing the \dt command to >> list tables will show you the SQL used. >> >> Ray. >> >> Hello, > You can use this SQL command to view all the standard table (known as > relations): > > SELECT relname FROM pg_class WHERE relkind = 'r'; > > And this one for partitioned tables (if you use them): > > SELECT relname FROM pg_class WHERE relkind = 'p'; > > HTH, > Thomas > > >> -- >> Raymond O'Donnell // Galway // Ireland >> r...@rodonnell.ie >> >> >>
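For example, a standard-conforming query for the ordinary tables of the current database might look like this (filtering out the system schemas):

```sql
SELECT table_schema, table_name
FROM information_schema.tables
WHERE table_type = 'BASE TABLE'
  AND table_schema NOT IN ('pg_catalog', 'information_schema')
ORDER BY 1, 2;
```

Because INFORMATION_SCHEMA is in the SQL standard, nearly the same query works on other databases as well.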
Re: Assistance with an out of shared memory error
On 1/13/22 18:35, Tom Lane wrote: .. or else reduce the number of partitions you're using. (People frequently seem to think that more partitions are free. That is extremely not true. I generally think that if you're using more than a few dozen partitions per table, you are making a mistake.) Interesting. What resources do partitions consume? Is it documented anywhere? Why do you think that more than a few dozen partitions is a mistake? Thanks in advance for your explanation. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: WAL Archiving and base backup
On 1/14/22 16:00, David G. Johnston wrote: I still don't really understand what is so great about it. About its only redeeming feature is a declaration that "it is in core" and that newcomers can just default to it without thinking. I'd rather just play favorites and write "use pgbackrest" in our documentation. Or some hybrid approach where we don't just pick one but instead guide people to the community solutions that are out there. I don't think I really want the people responsible for core to spend time on writing end-user backup tooling. Their time is much more valuably spent working on the core product. David J. Well, the "without thinking" part of your post can be rephrased as "ease of use". Do database administrators really need to think about which backup software to use? What kind of knowledge will such an evaluation provide? All commercial databases have some form of backup software included with the core database. After all, backup and restore are extremely important functions which IMHO should be provided along with the database software. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: WAL Archiving and base backup
On 1/16/22 13:12, Issa Gorissen wrote: Just to avoid any misunderstanding. I am perfectly happy using the backup/restore with pg_dump and we use it for at least a decade with success for our need of backups. I am using pgbackrest. I used to use pg_rman, being well acquainted with the rman backup tool belonging to another database system, but pg_rman was removing WAL archives before they were delivered to the replica. I have replication slots from the OLTP database to the DW database with partitions, hash and bloom indexes and a massive number of processors for parallel query. I need the replication to work. In addition to that, pgbackrest can employ parallelism while pg_rman cannot. I find pgbackrest to be by far the best Postgres backup system. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
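For illustration, the parallelism mentioned above is controlled by pgbackrest's process-max option; a minimal configuration sketch (stanza name and paths are made up):

```ini
# /etc/pgbackrest/pgbackrest.conf
[global]
repo1-path=/var/lib/pgbackrest
process-max=4                      # number of parallel backup/restore processes

[demo]
pg1-path=/var/lib/pgsql/13/data    # data directory of the cluster to back up
```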
Could not serialize access due to concurrent update
In this post, I am not asking a question, I am sharing an experience. The application is running on Linux, PostgreSQL 13.5, using the Websphere 9 application server. When using the "SKIP LOCKED" option, I suddenly started seeing errors like "Could not serialize access due to concurrent update". After some reading, the problem was pinpointed to the transaction isolation level: https://pganalyze.com/docs/log-insights/app-errors/U138

OK, the app is not setting the transaction isolation level to repeatable read, so what's going on? The documentation for Websphere reveals the truth: https://www.ibm.com/support/pages/transaction-isolation-levels-and-websphere-application-server

*If you do not specify the isolation level* The product does not require you to set the isolation level on a data source resource reference for a non-CMP application module. If you do not specify isolation level on the resource reference, or if you specify TRANSACTION_NONE, the WebSphere Application Server run time uses a default isolation level for the data source. Application Server uses a default setting based on the JDBC driver. *For most drivers, WebSphere Application Server uses an isolation level default of TRANSACTION_REPEATABLE_READ.* (Bold font is my addition)

Fortunately, the same document explains how to set the transaction isolation level to READ COMMITTED for the Websphere data source. No wonder that IBM stands for "It's Better Manually". Be vewy, vewy cawefull when using Websphere and PostgreSQL. Here is how to deal with the problem:

Possible values | JDBC isolation level         | DB2 isolation level
8               | TRANSACTION_SERIALIZABLE     | Repeatable Read (RR)
4 (default)     | TRANSACTION_REPEATABLE_READ  | Read Stability (RS)
2               | TRANSACTION_READ_COMMITTED   | Cursor Stability (CS)
1               | TRANSACTION_READ_UNCOMMITTED | Uncommitted Read (UR)
0               | TRANSACTION_NONE             | No Commit (NC)

*Note:* If TRANSACTION_NONE is used, the DB file does not have to be journaled.

To define this custom property for a data source, you should do the following:
1. Click *Resources* > *JDBC provider* > */JDBC_provider/*.
2. Click *Data sources* in the Additional Properties section.
3. Click the name of the data source.
4. Click *Custom properties*.
5. Create the webSphereDefaultIsolationLevel custom property:
   a. Click *New*.
   b. Enter *webSphereDefaultIsolationLevel* for the name field.
   c. Enter one of the "possible values" from the table above in the value field, i.e. 0, 1, 2, 4, or 8.

The value that needs to be entered is 2.

-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
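A quick way to verify which isolation level the application connections actually end up with is to check from a database session itself:

```sql
SHOW transaction_isolation;                       -- e.g. "read committed"
-- or, equivalently:
SELECT current_setting('transaction_isolation');
```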
Re: PostgreSQL Management and monitoring tool
On 1/26/22 22:24, Lucas wrote: Thanks… But I was just hoping that someone would share their solution in a more depth way… For example: Having Zabbix to monitor the database is nice, but it does not allow you to manage Replication, backups and more from the web, like awide.io <http://awide.io> does. Maybe there isn’t even a solution like that open source…. You can try Oracle Enterprise Manager. The basic installation is free of charge and so is the Postgres plugin. Of course, you'll need a fairly large box housing Oracle 19c EE and the Weblogic application server, both parts of the OEM. Also, you don't know what an adventure is until you try upgrading OEM. It's things like that that make life interesting. As for the backups, my advice is to purchase Commvault: it has a nice GUI, can do storage snapshot backups and has great support. Disclosure: I worked for Commvault Systems for 7 years, until May 2019. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
pg_try_advisory_lock is waiting?
I am trying to replace SELECT ... FROM ... WHERE ... FOR UPDATE with pg_try_advisory_lock. The documentation says the following: https://www.postgresql.org/docs/13/functions-admin.html

pg_try_advisory_lock ( key bigint ) → boolean
pg_try_advisory_lock ( key1 integer, key2 integer ) → boolean
Obtains an exclusive session-level advisory lock if available. This will either obtain the lock immediately and return true, or return false *without waiting* if the lock cannot be acquired immediately.

I tried the following:

1st Session:
mgogala=# begin transaction;
BEGIN
mgogala=*# update emp set sal=sal*1 where empno=7934;
UPDATE 1
mgogala=*#

2nd Session:
mgogala=# begin transaction;
BEGIN
mgogala=*# select pg_try_advisory_lock(0) from (select ename from emp where empno=7934 for update) as tbl;

To my infinite surprise, pg_try_advisory_lock is waiting. I am aware of SELECT FOR UPDATE NOWAIT, but that produces an error and kills the transaction block. I would like to use something that would not kill the transaction block. I am obviously doing something wrong because the select in parentheses will not return, so the query cannot be executed. On the other hand, without the "FOR UPDATE" clause, I am getting TRUE, which is wrong:

mgogala=# begin transaction;
BEGIN
mgogala=*# select pg_try_advisory_xact_lock(0) from (select ename from emp where empno=7934) as tbl;
 pg_try_advisory_xact_lock
---------------------------
 t
(1 row)

mgogala=*# rollback;
ROLLBACK
mgogala=# select pg_try_advisory_xact_lock(1) from (select ename from emp where empno=7934) as tbl;
 pg_try_advisory_xact_lock
---------------------------
 t
(1 row)

The row is still locked by the UPDATE statement, so pg_try_advisory_lock should return "f", not "t". The database is 13.5 on Oracle Linux 8, x86_64. Transactions are written in Java, so an exception will terminate the transaction block. SQL statements are generated by a home-grown ORM. The application is ported from Oracle, which will not hang the transaction block on the 1st error.
Is there a way to get PostgreSQL to use something like NOWAIT without aborting the transaction block? -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: pg_try_advisory_lock is waiting?
On 1/28/22 19:08, Tom Lane wrote: I doubt it. I think the FOR UPDATE in the sub-select is blocked because the other session has an uncommitted update on the row it wants to lock. This command won't reach the pg_try_advisory_lock call until that row lock comes free. Yes, I figured it out, but pg_try_advisory_lock returned TRUE even without "FOR UPDATE" clause in the subquery. Shouldn't it return false because it can't lock the row until the uncommitted update finishes? Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
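For the record, advisory locks are keyed only by their integer arguments and know nothing about row locks, which is why the call returned TRUE; a minimal two-session sketch:

```sql
-- Session 1:
SELECT pg_try_advisory_lock(0);   -- t: lock on key 0 acquired
-- Session 2:
SELECT pg_try_advisory_lock(0);   -- f, returned immediately: key 0 is held by session 1
SELECT pg_try_advisory_lock(1);   -- t: key 1 is an entirely independent lock
```

Row locks taken by UPDATE or FOR UPDATE never influence the result; only another advisory lock on the same key does.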
Re: pg_try_advisory_lock is waiting?
On 1/28/22 20:54, Michael Lewis wrote: On Fri, Jan 28, 2022 at 5:34 PM Mladen Gogala wrote: pg_try_advisory_lock returned TRUE even without "FOR UPDATE" clause in the subquery. Shouldn't it return false because it can't lock the row until the uncommitted update finishes? The rows being updated or not is unrelated to advisory locks. Your example shows you locking on key 0 and then on key 1. That should be allowed. If you tried key 0 twice, then that is when you would get back "false" from the function call. You could establish a pattern of using the table OID as the first lock key and the primary key value (if a single column) as the second advisory lock key with the two parameter version of the function. But it is totally up to your code to honor that advisory lock, or not. Again, why use advisory locks and not select for update? Perhaps just because you don't want to deal with the failed transaction? What should happen when some other process cannot get a lock on that row? Do you want to wait and retry? Have you looked into the "skip locked" option? If you use "returning id" with that, you'd be able to check if you got the lock or not.

I managed to resolve the problem:

savepoint pt1;
select from table where <...> for update nowait;

If it fails, do "rollback to pt1" and continue. The transaction block will survive. OK, now life is good again. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
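Spelled out against the emp table from earlier in the thread, that pattern looks roughly like this (error code shown for reference):

```sql
BEGIN;
SAVEPOINT pt1;
-- Fails immediately with SQLSTATE 55P03 (lock_not_available) if the row is locked:
SELECT ename FROM emp WHERE empno = 7934 FOR UPDATE NOWAIT;
-- On that error, the application issues:
ROLLBACK TO SAVEPOINT pt1;    -- the transaction block survives
-- ... and carries on with the rest of the transaction ...
COMMIT;
```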
Re: what is the solution like oracle DB's datafile
On 1/29/22 17:15, Yudianto Prasetyo wrote: Hello, I want to ask why postgresql doesn't create a datafile like it has oracle? I'm confused when I have 2 HDD. HDD 1 is used to install the OS and postgresql database. when HDD 1 is full. how to increase the capacity of postgresql database with HDD 2 (without RAID system)? is there any other way like oracle DB's "add datafile" which can be used to add capacity to another HDD? I'm sorry if it says comparing with Oracle DB, but in essence I'm just looking for a solution to the problem above. Thank You Yours faithfully Yudianto Number one, Postgres tables are files. Tablespaces are directories which reside in file systems. In Oracle, tables are sets of blocks within the data files that the tablespace is comprised of. That is a profound difference. In Oracle, we use direct IO to bypass the OS cache and only use the SGA ("System Global Area") for caching data blocks. In Postgres, we cache blocks from the files in the OS cache, which is essentially the free memory. When translated to Postgres, your question reads: how do I add space to a file system? That depends on the file system and volume manager. If you're using btrfs or zfs (hopefully not), then your file systems are also volume managers. If you're using LVM with xfs (my combination), then you can add space to your volume and extend the xfs file system. I have also tried using Oracle ASM as the volume manager with ACFS as the file system. The result was very nice, roughly the same as with LVM. However, the installation of ASM is rather complex and, since PostgreSQL cannot be clustered, there is no justification for doing that. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
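As a sketch of the LVM + xfs route (all device names, volume names and mount points here are illustrative), adding HDD 2 would look something like:

```
pvcreate /dev/sdb                        # initialize the second disk for LVM
vgextend vg0 /dev/sdb                    # add it to the existing volume group
lvextend -l +100%FREE /dev/vg0/pgdata    # grow the logical volume onto the new disk
xfs_growfs /var/lib/pgsql                # grow the xfs file system online, by mount point
```

xfs can be grown while mounted, so the database does not need to be stopped.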
How to determine whether I'm running on a standby?
Hi! I am running PostgreSQL 13.5, one primary cluster and two standby clusters, managed by PgBouncer which will automatically promote one of the standbys to the primary in case of failure. I have a job that is cleaning "old data", with "old data" being whatever the business side of the company deems as "old data". The cleanup is a PL/pgSQL procedure which uses a bunch of DELETE commands. The procedure works fine on the primary but reports a bunch of errors on the read-only standbys. In another database system, there is a view called V$DATABASE with a column named DATABASE_ROLE, which can be either primary or some kind of standby (physical or logical). Is there anything like that in the world of Postgres? I would like to do something like this:

DECLARE
    DB_ROLE VARCHAR2(64);
BEGIN
    SELECT DATABASE_ROLE INTO DB_ROLE FROM V$DATABASE;
    IF DB_ROLE = 'PRIMARY' THEN
        -- do the cleanup
    ELSE
        exit;
    END IF;
END;

What would a similar idiom in the PostgreSQL world look like? I have the job in crontab on the primary and I don't really know which of the 3 clusters will be my primary at any given moment. So far, I am using manual labor to do that, which is silly. Knowing Postgres idioms, there will probably be a function for this but I was unable to figure out which one. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
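For reference, Postgres does have a function for this: pg_is_in_recovery() returns true on a standby. A guard in PL/pgSQL might look like this sketch:

```sql
DO $$
BEGIN
    IF pg_is_in_recovery() THEN
        RAISE NOTICE 'this is a standby, skipping the cleanup';
        RETURN;
    END IF;
    -- on the primary: run the DELETE-based cleanup here
END
$$;
```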
Re: Can we go beyond the standard to make Postgres radically better?
Please, don't top-post. On 2/10/22 17:51, Guyren Howe wrote: When you dig into it, the powerful idea here is the relational algebra, and its equivalence to a first-orderish logic. I put up with SQL so I can use relations, and I love Postgres because it has the least bad SQL (by a mile!) As you've said, the basic motivation behind SQL is the relational algebra from the branch of mathematics called "naive set theory". SQL does that really well. The late Dr. Ted Codd was a mathematician, so he was also aware of non-naive set theory, and the "selection language" of Zermelo's axiom of choice was also included. That is why we are selecting elements of subsets and not defining subsets. Codd was also working for IBM, once a great company, whose name stands for "It's Better Manually". IBM was doing lots of business with the banks (figures), so the transaction handling part was modeled after the banking business. When I had my first course in SQL, back in 1988, I was given an analogy with writing checks: if A writes a $100 check to B, one of the following two things can happen: either A has enough money to cover the check, in which case B will have $100 more in his account and A will have $100 less, or, if the transaction doesn't go through, nothing will happen. Any mixed outcome in which B is $100 richer but A doesn't have $100 less money in his account, or A has $100 less money but B is not $100 richer, is strictly prohibited. The transaction must happen atomically, as a whole, or not happen at all. So, that's SQL for you: relational algebra plus the banking business. The relational algebra implementation is rather good, although a tad too verbose. The transactional part is excellent and still reflects the needs of the business community today. But SQL is a terrible, no good, very bad language. I cannot accept such a religious persecution of SQL without a detailed explanation.
I don’t really understand why folks who love the relational model aren’t perpetually up in arms about SQL being their only option. Much better query languages are known and well studied. https://www.scattered-thoughts.net/writing/against-sql I’d love to see Postgres offer say Datalog. But until that Halcyon day, we could at least aggressively extend SQL to make it less awful. It will take much more than a blog post to eradicate decades of investment in SQL by both computer scientists and the business community. I've seen the same crusade against SQL on Slashdot, and I assume you were behind that as well. So, if I were you, I would establish a company to implement the Datalog language in a database and try to sell that to the people. That happened in 1979: Larry Ellison, Bob Miner and Ed Oates established a company and offered a SQL product, competing with hierarchical databases like IMS and CICS/DL1, to the world. The rest is history. I am sure that if you do the same, and if you are right about Datalog, a brilliant future is ahead of you. That is the way of the Force. Anything can happen. A young student can come from Africa, convinced that he can sell electric cars, and be very successful while doing that. So successful that he became the richest man in the world. I am sure that your endeavor with Datalog will have similar success. The market decides what is the standard. May the Force be with you. In the meantime, I will still be using SQL. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Can we go beyond the standard to make Postgres radically better?
On 2/10/22 23:56, Guyren Howe wrote: On Feb 10, 2022, at 17:06 , Mladen Gogala wrote: But SQL is a terrible, no good, very bad language. I cannot accept such a religious persecution of SQL without a detailed explanation. I feel like anyone who is defending SQL here isn’t aware of how much better the alternatives are, and how bad SQL really is. I mean: it’s hard to write, hard to read. It’s hard to generate. But that’s just the starting point. OK. If there are better alternatives, I am sure you will be able to sell them to the world. Establish a company and have a go at it. One of the worst things about it that I don’t see much discussed is that it imposes assumptions about the storage model that aren’t part of the relational model. Like heavyweight, hard to change tables with transactional guarantees and such. Don’t get me wrong, those things are great to have available, but I don’t need them all the time. Storage model and implementation are not part of SQL for good reason. Database vendors have different implementations. MySQL and Postgres are different. MySQL storage engines differ among themselves. Both of them are different from SQL Server which is in turn different from Oracle and DB2. Storage model determines the implementation. When there is only a single relational database vendor left in the market, then they can burn their storage implementation into the language standard. Until then, the more, the merrier. The whole NoSQL movement has been such a tragedy. Having diagnosed a problem with SQL databases, they threw out the relational model and very often reimplemented a form of SQL when they should have done the opposite. There is no reason you can’t have a relational database with an on-demand schema, with eventual consistency, with all those fun things that various NoSQL databases provide. Those storage models have their place, but the SQL standard says you can’t use them. But the biggest issue is the verbose, terrible, very bad query language. 
In the web development community where I spend my time, it is almost holy writ to treat the database as a dumb data bucket, and do everything in the application layer (even things like validations, even when that is a provably incorrect approach). Why? I think it’s because they’re used to working in a pleasant language like Ruby or Python, and they want to do everything there. And who can blame them? As a database architect who has successfully bridged two very different database systems, I can tell you that application programmers put the business logic into the application because they frequently don't know what the options are. They know Java or Python and that's what they do, period. That has nothing to do with SQL. But this is bad. Proper relational design can take over much (most!) of the design of a typical business app, with significant efficiency gains the result. But no *community* is going to choose that when most of the members of the community don’t want to learn SQL and who can blame them? The business community which hires them to make efficient applications can blame them. And frequently does so. Another issue: everyone thinks “relational” is the same thing as “SQL”. If we could get folks to break that association, then relations should be a standard feature of high-level programming languages, just as arrays and hashes are. Heck, give me a functional programming language with a good relational model, and I can implement OOP in that relational language without breaking a sweat. Software *should* be designed around a logical/relational layer with minimal occasional forays into Turing completeness where necessary. But no-one is even thinking about software like that because relational is SQL and SQL is awful. There is such a thing as a "free market". If you offer people a better alternative, they will come. You may be the next Larry Ellison. And then again, you may not be.
There is only one way to tell, and that's not proselytizing on the Postgres mailing list. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Can we go beyond the standard to make Postgres radically better?
On 2/11/22 09:48, Benedict Holland wrote: So to summarize, people are bad programmers who refuse to learn SQL So SQL is the problem? Common. You cannot bring that to a postgres list serve. Look. It's not perfect. It's a pain. It is hard to generate queries (oh my God why are you doing this?) and it's hard to work with. You are describing c++ to Python programmers and arguing why no one should ever use c++. And look, there are other languages that are way better like Python. So why would anyone ever chose c++? There is probably a misunderstanding here. I haven't talked about people, I leave that to politicians. Second, I was defending SQL. You've got me confused with somebody else. Last but not least, I didn't bring anything to this list, I was just responding to the posts. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Microsoft Report Builder
On 2/11/22 22:22, Kim Foltz wrote: I am trying to configure Microsoft Report Builder to run against a mainframe PostgreSQL database. The ODBC connection tests as good and manually entered SQL Select statements properly return data. The problem is the Query Designer in the software doesn't support the normal graphical display of available tables and fields with PostgreSQL. All I find in Microsoft's documentation is the statement some data sources don't support graphical tools in Report Builder. More than likely there are better report generators than Microsoft's tool but our agency is very fond of Microsoft solutions. Anyone know of a way to get the graphical tools working in Report Builder? In the alternative, is there an alternative Windows report generator to try if Microsoft's software isn't repairable? Well, there is a fish to achieve just that: https://babelfishpg.org/ The problem is that you will have to build PostgreSQL from source with some specific hooks for Babelfish. On the other hand, this extension is supported by AWS: https://aws.amazon.com/blogs/database/migrate-sql-server-to-amazon-aurora-postgresql-using-best-practices-and-lessons-learned-from-the-field/ That means that it will continue to be developed and that you will not waste all that effort. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Babelfish for PostgreSQL
Hi! There is a method of simulating SQL Server by using PostgreSQL. The extension is called "Babelfish": https://babelfishpg.org/ However, this extension requires modifications to the PostgreSQL source and building the PostgreSQL with Babelfish modifications from source. While it is a rather seamless process and goes on without much trouble, it is still much more complicated than installing the RPM or DEB packages, depending on your Linux distro of choice. Is there any chance that the Babelfish modifications will be adopted in the PostgreSQL source proper? From my DBA point of view, I didn't find any differences in the functionality of the PostgreSQL database itself. Amazon RDS incorporates that modification and allows creating Aurora PostgreSQL database with the Babelfish extension from the menu. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Can we go beyond the standard to make Postgres radically better?
On 2/12/22 19:11, Andreas 'ads' Scherbaum wrote: The complaint is not about complex queries, or CTEs, or Joins. This is about simple queries where a user wants to discover - surf - the database and look into specific tables, but exclude certain columns. More specifically, this is when the user types in interactive queries. There is already something very similar to what you are describing: https://www.psycopg.org/docs/cursor.html Each cursor has its description, which consists of the column descriptions. Basically, it's like doing \d on a cursor. Unfortunately, it's not interactive; one has to do some Python programming in order to do that. Also, it is not possible to just "describe the cursor": the description only becomes available after the "execute" call. Hopefully, I understood you correctly. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
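A small psycopg2 sketch of that (connection string illustrative; note that the description is empty until execute() has run):

```python
import psycopg2

conn = psycopg2.connect("dbname=mgogala")      # illustrative DSN
cur = conn.cursor()
print(cur.description)                         # None - nothing executed yet
cur.execute("SELECT empno, ename, sal FROM emp LIMIT 0")
for col in cur.description:                    # now populated, one entry per column
    print(col.name, col.type_code)
cur.close()
conn.close()
```

The LIMIT 0 trick fetches no rows but still populates the column metadata, which is about as close to "describing a cursor" as the protocol allows.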
Re: Can we go beyond the standard to make Postgres radically better?
On 2/13/22 05:00, Pavel Stehule wrote: But there can be a valid second question - it can be nice to use extensions with availability to define their own communication protocol. Postgres has a special protocol for replication or for backup. With this possibility you can do what you need without the necessity of an external application server. Regards And here we are back on the Babelfish question. Babelfish defines its own protocol (TDS, Tabular Data Stream) but it requires building PostgreSQL from source, with some hooks for the additional protocol. A built-in mechanism to do that without rebuilding would be nice. Babelfish Postgres is version 13.5, currently the most prevalent version in the data centers. A library to implement foreign communication protocol primitives would be a very nice thing indeed. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
2 phase commit with FDW
Hi! I am getting the following error when trying to PREPARE a transaction which updates both a local and a foreign table:

[2/17/22 12:48:00:657 EST] 0128 RegisteredRes E WTRN0046E: An attempt by the transaction manager to call prepare on a transactional resource has resulted in an error. The error code was XAER_RMFAIL. The exception stack trace follows: org.postgresql.xa.PGXAException: Error preparing transaction. prepare xid={XidImpl: formatId(57415344), gtrid_length(36), bqual_length(54), data(017f08cc3aea0001232c3afce8abf609478378ca6ac9d4bddbfea3e4efa55cf3017f08cc3aea0001232c3afce8abf609478378ca6ac9d4bddbfea3e4efa55cf300010002)} at org.postgresql.xa.PGXAConnection.prepare(PGXAConnection.java:365)

There are no errors if all the tables are local. It seems that postgres_fdw is incompatible with the XA protocol. Is there a way around it? If not, we will need yet another code change and another data source. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
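For context, the limitation can be reproduced in plain SQL, without XA at all (remote_emp here stands for a hypothetical postgres_fdw foreign table, and max_prepared_transactions must be non-zero):

```sql
BEGIN;
UPDATE emp SET sal = sal WHERE empno = 7934;          -- local table: fine
UPDATE remote_emp SET sal = sal WHERE empno = 7934;   -- foreign table via postgres_fdw
PREPARE TRANSACTION 'xa_test';
-- fails: postgres_fdw cannot participate in two-phase commit,
-- so the server rejects the PREPARE once a foreign table was modified
```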
Re: 2 phase commit with FDW
On 2/17/22 13:10, Mladen Gogala wrote: Hi! I am getting the following error when trying to PREPARE a transaction which updates both a local and a foreign table: [2/17/22 12:48:00:657 EST] 0128 RegisteredRes E WTRN0046E: An attempt by the transaction manager to call prepare on a transactional resource has resulted in an error. The error code was XAER_RMFAIL. The exception stack trace follows: org.postgresql.xa.PGXAException: Error preparing transaction. prepare xid={XidImpl: formatId(57415344), gtrid_length(36), bqual_length(54), data(017f08cc3aea0001232c3afce8abf609478378ca6ac9d4bddbfea3e4efa55cf3017f08cc3aea0001232c3afce8abf609478378ca6ac9d4bddbfea3e4efa55cf300010002)} at org.postgresql.xa.PGXAConnection.prepare(PGXAConnection.java:365) There are no errors if all the tables are local. It seems that postgres_fdw is incompatible with the XA protocol. Is there a way around it? If not, we will need yet another code change and another data source. Regards Is there anybody on this list who can tell me whether foreign tables can participate in a 2-phase commit or not? My version is 14.2 on Oracle Linux 8.5. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: 20220226-Clarification regarding delay time of PostgeSQL starting up
On 2/26/22 10:17, Laurenz Albe wrote: *If you talking about a Windows CIFS (Network Share), you may be out of luck. That is known to be an unreliable file system that cannot be used with a database.* A crash has likely corrupted your database. (But we cannot tell for certain, because you didn't show any interesting messages.) Yours, Laurenz Albe Thanks for letting me know, I didn't know that. Are there any such limitations with NFS and NAS devices like NetApp? Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: 20220226-Clarification regarding delay time of PostgeSQL starting up
On 2/26/22 13:34, Techsupport wrote: Thanks for your reply. Is it a good practice to upgrade / update database on production on every year ? No, it is not a good practice. A database upgrade must be treated as a project which will result in non-negligible downtime. You should upgrade to the new version if the new version brings features that you need. Here is the list of new features introduced in PostgreSQL 14: https://www.postgresql.org/docs/14/release-14.html#id-1.11.6.7.5 If you need any of the listed features, you should upgrade. If not, you can safely skip the release. An upgrade is something that has business repercussions, inconveniences the database users and should be subjected to the same cost-benefit analysis as every other major IT move. Upgrade policy is usually set by the management, not by the DBA, much to the chagrin of the DBA personnel. I used to be a DBA for a very long time, and I know. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Interesting fail when migrating Pg from Ubuntu Bionic to Focal
On 3/4/22 17:03, Tom Lane wrote: Mmm ... it might have just been that the planner chose not to use JIT when it thought there were fewer rows involved. Did you check with EXPLAIN that these cut-down cases still used JIT? This is an interesting and informative answer. How do I check whether JIT is used in the explain plan? Can you give me an example? TIA -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
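For reference, a hedged sketch of what to look for: when JIT is used, EXPLAIN (ANALYZE) appends a "JIT:" section after the plan tree. The query and all costs/timings below are purely illustrative:

```sql
-- With jit = on and a query whose estimated cost exceeds jit_above_cost,
-- EXPLAIN (ANALYZE) reports a JIT section at the bottom of the output:
EXPLAIN (ANALYZE) SELECT sum(sal) FROM emp;
--  Aggregate  (cost=... rows=1 width=8) (actual time=...)
--    ->  Seq Scan on emp  (...)
--  JIT:
--    Functions: 3
--    Options: Inlining false, Optimization false, Expressions true, Deforming true
--    Timing: Generation 1.2 ms, Inlining 0.0 ms, Optimization 0.5 ms, Emission 8.7 ms
-- If no "JIT:" section appears, JIT was not used for that query.
```

Plain EXPLAIN (without ANALYZE) also shows the JIT section, just without the timing line.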
Re: Select .... where id not in (....) returns 0 incorrectly
On 4/4/22 09:21, J. Roeleveld wrote: This was actually the case, I forgot there is 1 NULL-value in that list. Personally, I think NULL should be treated as a separate value and not lead to strange behaviour. NULL is strange. Relational databases use ternary, not binary logic. In the woke vernacular, one could say that Postgres is non-binary. NULL literally means "no value". It is a part of the standard, so we have to deal with it, Codd help us. However, based on my lifelong experience with Oracle, NULL values are bad and are best avoided. Postgres is more forgiving than Oracle because in Postgres, the condition "is not null" can be resolved by an index. In Oracle, it cannot. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
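A minimal illustration of why one NULL in the list makes NOT IN return zero rows (the literals are made up):

```sql
-- NOT IN expands to a chain of <> comparisons ANDed together.
-- A comparison against NULL yields UNKNOWN, and TRUE AND UNKNOWN
-- is UNKNOWN, so the WHERE clause is never true:
SELECT 1 WHERE 1 NOT IN (2, 3);        -- one row
SELECT 1 WHERE 1 NOT IN (2, 3, NULL);  -- zero rows
-- Workarounds: filter the NULLs out of the list, or use NOT EXISTS,
-- which treats non-matches as TRUE rather than UNKNOWN:
SELECT 1
WHERE NOT EXISTS (SELECT 1
                  FROM (VALUES (2), (3), (NULL)) v(x)
                  WHERE v.x = 1);      -- one row
```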
Re: Huge archive log generate in Postgresql-13
On 4/18/22 07:34, Ram Pratap Maurya wrote: Hi Support, We have upgraded the PostgreSQL DB from version 11 to 13. After the upgrade to 13, a huge amount of archive log is generated: before the upgrade 120 GB to 150 GB of log per day, after the upgrade approx. 250 to 300 GB per day. Can you please suggest why so much archive log is generated after the upgrade, whether there is any configuration setting for it, or whether this is PostgreSQL 13 behaviour. The PostgreSQL conf file is attached for your reference. Regards, Ram Pratap. Have you checked what's in those WAL archives? You can do that with pg_waldump. My guess would be that your vacuum is probably more active than in version 11. However, that's just a guess. You can also turn on WAL compression. Be aware that compression will reduce disk consumption at the expense of CPU consumption. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
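A sketch of both suggestions. The WAL segment path is a placeholder, and ALTER SYSTEM needs superuser rights:

```sql
-- 1. Summarize what is filling the WAL (run pg_waldump from the shell):
--      pg_waldump --stats=record /path/to/wal/segment
--    A large share of Heap2 / VACUUM records would support the
--    "vacuum is more active" guess.
-- 2. Turn on WAL compression (trades CPU for WAL/archive volume);
--    it can be changed without a restart:
ALTER SYSTEM SET wal_compression = on;
SELECT pg_reload_conf();
```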
Re: Are stored procedures/triggers common in your industry
On 4/20/22 15:18, Guyren Howe wrote: I’ve really only ever worked in web development. 90+% of web developers regard doing anything at all clever in the database with suspicion. I’m considering working on a book about implementing business logic in Postgres, and I’m curious about how common that actually is. Well, there are 2 schools of thought: 1. Put the business logic into the application 2. Put the business logic into the database Putting the business logic into the application can give you more flexibility around enforcing it. On the other hand, it also increases the chances of inconsistency. There will likely be more than one application using reference tables like ADDRESS, ZIP_CODE, STATE, COUNTRY, QUARTER, ACCOUNT, CUSTOMER and similar. If there is a rule that a country must exist before you add an address in that country into the table, that can be enforced by a foreign key. Enforcing it within the application does 2 things: 1. It moves the rule code to the application server, which is traditionally weaker than a database server. In other words, you are more likely to run out of CPU juice and memory on an application server than you are to run out of resources on the DB server. 2. It opens a possibility of inconsistency. Different applications can use different business rules for the same set of tables. That means that data entered by one application may make the table internally inconsistent for another application. I am a big proponent of using foreign keys, check constraints and triggers to enforce business rules. I am also a big proponent of avoiding NULL values wherever possible. Database design is an art. CAD software used to be popular once upon a time, in a galaxy far, far away. Properly enforcing the business rules in the database itself makes the application clearer and easier to write. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
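A minimal sketch of the country-before-address rule enforced declaratively; the table and column names are made up for illustration:

```sql
-- The database itself guarantees the rule, no matter which of the
-- applications performs the insert.
CREATE TABLE country (
    country_code char(2) PRIMARY KEY,
    name         text NOT NULL
);
CREATE TABLE address (
    address_id   bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    street       text NOT NULL,
    country_code char(2) NOT NULL REFERENCES country (country_code)
);
-- This fails with a foreign-key violation until 'FR' exists in country:
-- INSERT INTO address (street, country_code) VALUES ('1 Rue X', 'FR');
```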
Re: Whole Database or Table AES encryption
On 4/28/22 18:55, Aaron Gray wrote: On Wed, 27 Apr 2022 at 18:01, Bruce Momjian wrote: On Wed, Apr 27, 2022 at 05:21:41PM +0100, Aaron Gray wrote: Hi, Is there any way to (virtually) encrypt whole databases or tables with AES or other types of encryption ? You can use storage encryption via the operating system. Cybertec has a patch for PG 12 and the community is working on an implementation, perhaps for PG 16. I would be very interested in this, are there any further pointers as to where to look at this ? Is there a repo branch in place working on this or any documentation ? Cheers, Aaron There is a commercial product which supports database encryption, including PostgreSQL; it's called Voltage: https://www.techtarget.com/searchsecurity/feature/HP-Security-Voltages-SecureData-Enterprise-Product-overview The company that I work for uses it to encrypt data in both Oracle and PostgreSQL databases. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: External psql editor
On 4/29/22 13:35, Jan Wieck wrote: Not that I know of. \e starts the external editor and you have to save and exit that editor to get back to psql in order to execute it. IMHO the whole construct has very limited usability. Regards, Jan Is there a way to define the name of the temporary file created by the \e command? I'd like to name it "afiedt.buf", not for sentimental reasons. I already have a cron job that cleans afiedt.buf from my home directory every hour, and having psql name its temporary file like that would simplify the cleaning process. The name comes from another database with the same editor construct as \e. I am actually quite used to that. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: External psql editor
On 5/2/22 17:27, Tom Lane wrote: Rich Shepard writes: On Mon, 2 May 2022, Reid Thompson wrote: I believe that psql also uses readline, so my thought was that maybe these instructions could enable you to map the 'move' keystrokes that you're familiar with to be used while on the psql command line. A very quick test seems to indicate that you can. Nope. No difference. Perhaps your psql is built against libedit rather than readline. regards, tom lane That is easy enough to determine: [mgogala@umajor ~]$ ldd `which psql` linux-vdso.so.1 (0x7ffded951000) libpq.so.private13-5 => /lib64/libpq.so.private13-5 (0x7ff145284000) *libreadline.so.8 => /lib64/libreadline.so.8 (0x7ff14522c000)* libm.so.6 => /lib64/libm.so.6 (0x7ff14515) libc.so.6 => /lib64/libc.so.6 (0x7ff144f49000) libssl.so.1.1 => /lib64/libssl.so.1.1 (0x7ff144eac000) libcrypto.so.1.1 => /lib64/libcrypto.so.1.1 (0x7ff144bbe000) libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x7ff144b65000) libldap_r-2.4.so.2 => /lib64/libldap_r-2.4.so.2 (0x7ff144b09000) libtinfo.so.6 => /lib64/libtinfo.so.6 (0x7ff144ada000) /lib64/ld-linux-x86-64.so.2 (0x7ff14539a000) libz.so.1 => /lib64/libz.so.1 (0x7ff144ac) libkrb5.so.3 => /lib64/libkrb5.so.3 (0x7ff1449e2000) libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x7ff1449ca000) libcom_err.so.2 => /lib64/libcom_err.so.2 (0x7ff1449c1000) libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x7ff1449b) libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x7ff1449a9000) libresolv.so.2 => /lib64/libresolv.so.2 (0x7ff144995000) liblber-2.4.so.2 => /lib64/liblber-2.4.so.2 (0x7ff144982000) libsasl2.so.3 => /lib64/libsasl2.so.3 (0x7ff144962000) libselinux.so.1 => /lib64/libselinux.so.1 (0x7ff144935000) libcrypt.so.2 => /lib64/libcrypt.so.2 (0x7ff1448fb000) libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x7ff144864000) [mgogala@umajor ~]$ Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Question about attention to pgsql-hack...@lists.postgresql.org
On 6/26/22 09:15, Julien Rouhaud wrote: On top of that OP has already received a lot of answers in the past, it's just that since no one wants to take care of a whole non trivial project for free the answer is never what OP wants. On top of that, most of the functionality can be obtained by using plpython3u and importing "numpy". -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: How can I set up Postgres to use given amount of RAM?
On 6/26/22 14:40, W.P. wrote: Question in topic: "How can I set up Postgres to use a given amount of RAM?" I now have a laptop with 8GB of RAM, and I can see Linux uses no more than 2-3GB. So my question is how to FORCE PostgreSQL to use, let's say, 2-4GB of RAM for caching the tables I run queries on? I can see disk activity while running queries. W.P. I would try Docker with the --memory option. An alternative is to create a VM with the given amount of memory. If your laptop is memory starved, then creating a VM is probably not an option. Docker is much cheaper. Dockerfile is relatively simple to master and you can use Docker to run almost anything. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
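Aside from containers and VMs, the Postgres-level knob for its own page cache is shared_buffers. A sketch, where the 2GB/4GB figures are just the values from the question:

```sql
-- shared_buffers sets how much RAM Postgres itself dedicates to caching
-- table and index pages; changing it requires a server restart.
SHOW shared_buffers;
ALTER SYSTEM SET shared_buffers = '2GB';
-- effective_cache_size does not allocate anything; it only tells the
-- planner how much OS cache it can assume is available:
ALTER SYSTEM SET effective_cache_size = '4GB';
-- then restart the server, e.g. with pg_ctl restart
```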
Re: Get the table creation DDL
On 7/10/22 13:57, Michael Nolan wrote: I do not know those other tools, but there should be documentation for them, as there is in the man page for how to process just one table using pg_dump. You can pipe the output of pg_dump directly to psql, but I find that's seldom useful. -- Mike Nolan There is an extension which does precisely what you need: https://github.com/MichaelDBA/pg_get_tabledef In my opinion, that is a little inconsistency on the maintainers' part, because there are functions pg_get_functiondef, pg_get_indexdef and pg_get_viewdef but nothing for tables. https://www.postgresql.org/docs/14/functions-info.html -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Get the table creation DDL
On 7/10/22 12:47, Igor Korot wrote: So there is no "query" per se? Also how do I pass the table name? Thank you. You can create one from the catalog tables. Personally, I would use INFORMATION_SCHEMA to avoid pg_class and pg_attribute. However, there is an extension which does that for you. Somebody else has already done the hard work. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
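A hedged sketch of the INFORMATION_SCHEMA route mentioned above; the schema and table names are parameters you would substitute. Note that this reconstructs the column list only, not indexes, constraints or the full DDL:

```sql
-- Columns of a given table, in a portable way, ordered as declared:
SELECT column_name,
       data_type,
       character_maximum_length,
       is_nullable,
       column_default
FROM information_schema.columns
WHERE table_schema = 'public'
  AND table_name   = 'emp'      -- the table you are interested in
ORDER BY ordinal_position;
```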
Re: Get the table creation DDL
List of functions
 Schema | Name            | Result data type | Argument data types | Type
--------+-----------------+------------------+---------------------+------
 public | pg_get_tabledef | text             | in_schema character varying, in_table character varying, in_fktype tabledef_fkeys DEFAULT 'FKEYS_INTERNAL'::tabledef_fkeys, in_trigger tabledef_trigs DEFAULT 'NO_TRIGGERS'::tabledef_trigs | func
(1 row)

As expected, the function returns the "text" data type.

Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Get the table creation DDL
On 7/10/22 20:02, Mladen Gogala wrote: This "extension" is nothing PL/PGSQL source code This "extension" is nothing *but* PL/PGSQL source code -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Get the table creation DDL
On 7/10/22 20:15, Igor Korot wrote: I understand. The question here is: is this "extension" a part of the standard PostgreSQL install? And if not, can I copy and paste that code into my program? Thank you. Of course this function is not a part of the standard install. If it were a part of the standard install, it wouldn't need a separate Git repo on GitHub. And of course you can use it in your database, just send me $1000 and you will have no problems with the license. I believe there is a license file in the repo and it says something like that. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Get the table creation DDL
On 7/10/22 21:13, Igor Korot wrote: How should I do that? Thank you. Oh boy! I give up. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Get the table creation DDL
On 7/10/22 22:52, Rob Sargent wrote: Are you alone or on a team? What are your pronouns? -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Proposal to introduce a shuffle function to intarray extension
On 7/15/22 04:36, Martin Kalcher wrote: Dear list, I am dealing with an application that processes fairly large arrays of integers. It makes heavy use of the intarray extension, which works great in most cases. However, there are two requirements that cannot be addressed by the extension and are rather slow with plain SQL. Both can be met with shuffling: - Taking n random members from an integer array - Splitting an array into n chunks, where each member is assigned to a random chunk Shuffling is currently implemented by unnesting the array, ordering the members by random() and aggregating them again. Martin, have you considered PL/Python and the NumPy module? -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
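The plain-SQL shuffle Martin describes can be sketched like this (the literal array stands in for his large integer arrays):

```sql
-- Shuffle by unnesting, ordering by random(), and re-aggregating.
-- Correct, but O(n log n) per array, which is the motivation for
-- proposing a native shuffle function.
SELECT array_agg(x ORDER BY random())
FROM unnest(ARRAY[1,2,3,4,5,6,7,8,9,10]) AS t(x);
-- Taking n random members is the same idea with a LIMIT:
SELECT array_agg(x)
FROM (SELECT x
      FROM unnest(ARRAY[1,2,3,4,5,6,7,8,9,10]) AS t(x)
      ORDER BY random()
      LIMIT 3) s;
```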
Re: Proposal to introduce a shuffle function to intarray extension
On 7/16/22 16:21, Martin Kalcher wrote: Hey Mladen, thank you for your advice. Unfortunately the performance of shuffling with NumPy is about the same as with SQL. create function numpy_shuffle(arr int[]) returns int[] as $$ import numpy numpy.random.shuffle(arr) return arr $$ language 'plpython3u'; select arr[1:3]::text || ' ... ' || arr[398:400]::text from ( select numpy_shuffle(arr) arr from numbers ) shuffled; --- {674026,3306457,1727170} ... {343875,3825484,1235246} Time: 2315.431 ms (00:02.315) Am I doing something wrong? Martin Hi Martin, No, you're doing everything right. I have no solution for you. You may need to do some C programming or throw stronger hardware at the problem. The speed of your processors may be the limit. Good luck! -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: More than one Cluster on single server (single instance)
On 7/18/22 03:55, Daulat wrote: Please share your thoughts on this approach. How can we create multiple clusters under a single Postgres version? You can use Docker images. How can we identify/allocate resources like shared_buffers, effective_cache_size, work_mem etc. in the case of multiple clusters? Thanks Each container can be adjusted to have its own memory and parameter file. Personally, I don't like that approach. I think that separating data into separate databases is enough separation even for the most fervent separatists (the pun on the name of political movements is purely accidental). -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: postgis
On 7/18/22 13:08, Marc Millas wrote: Hi, I would like to install PostGIS 3.04 on Debian 11. Digging into various web sites, I didn't find the name of that package. Can someone help? Thanks Marc MILLAS Senior Architect +33607850334 www.mokadb.com <http://www.mokadb.com> Install alien and convert the .rpm package into the .deb package. When dealing with alien be careful, so that it doesn't jump and stick to your face. Bad things can happen then. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Was my question inappropriate for postgres?
On 7/24/22 19:56, Taka Taka wrote: Hello. I posted the question below. And? No takers? I wonder why. Maybe the question is unclear? This is really surprising :) Hello. I would like to know if psqlodbc_13_02 is compatible with M365. Also, could you please tell me which of the psqlodbc_13_02 builds would be suitable for Windows 10 64-bit? What is M365? Is it a part of Microsoft Office 365, aka "O365"? Is it related to M-16? What do you mean by "driver suitable for Windows 10"? If the driver can be installed and configured by the MS ODBC driver administrator, then I guess it's suitable. Make sure the driver you use is mauve, they take the least RAM Was my question inappropriate? I wanted to know if the odbc driver above would be compatible with Microsoft's M365. What part of Office 365 do you want the compatibility with? Excel? Also, I find variety in the odbc driver. I don't. ODBC drivers implement the same protocol when communicating with the database and they all work the same. Here is what I have: [mgogala@umajor ~]$ rpm -qa *odbc* oracle-instantclient-odbc-21.5.0.0.0-1.x86_64 postgresql-odbc-13.01.-2.fc36.x86_64 msodbcsql17-17.10.1.1-1.x86_64 [mgogala@umajor ~]$ The string "fc36" in the PostgreSQL ODBC package name denotes a very special version of Windows, more advanced than even Windows 11. I was able to connect all 3 of the drivers above to LibreOffice and use them to query the underlying database. I was able to do the same on my Windows 8.1, with all 3 of those drivers. Is it possible to know which of the drivers is suitable for Windows 10 64-bit? Suitable how? What exactly do you want to do? What does the word "suitable" even mean in the context you're using it? Or could you please tell me where to inquire? There is this site named "https://www.google.com" which can be used to find a lot of useful information and some paid ads, too. As of lately, I prefer DDG, but that's just a personal preference. 
If there were no takers for the first question, that may say something about the question. At this point, you gotta ask yourself one question only: do I feel lucky? -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Was my question inappropriate for postgres?
On 7/24/22 20:15, Gavin Flower wrote: I suspect that most people do not know what 'M365' is, would be good to explain. Am curious as to what 'M365' is! M365 is the latest member in the ArmaLite M-16 family of products. It can be connected to a database to track down the DBAs who don't check their backups. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Was my question inappropriate for postgres?
Answers in-line: On 7/24/22 20:43, Adrian Klaver wrote: Other parts of Office also use database connectivity, such as Access or Word. Also, I find variety in the odbc driver. I don't. ODBC drivers implement the same protocol when communicating with the database and they all work the same. Here is what I have: Hmm, I wonder why they have versions then? Good question. One part of the story is package versions, bug fixes and such. However, I think that your question was about the versions of the ODBC protocol. This may answer it: https://en.wikipedia.org/wiki/Open_Database_Connectivity Version history: * 1.0: released in September 1992 * 2.0: c. 1994 * 2.5 * 3.0: c. 1995, John Goodson of Intersolv and Frank Pellow and Paul Cotton of IBM provided significant input to ODBC 3.0 * 3.5: c. 1997 * 3.8: c. 2009, with Windows 7 * 4.0: development announced June 2016, with the first implementation released with SQL Server 2017 in Sep 2017 and additional desktop drivers in late 2018; final spec on GitHub <https://github.com/Microsoft/ODBC-Specification> By the way, version 4 of the protocol supports "fast_executemany": https://dbwhisperer.wordpress.com/2020/11/21/pyodbc-fast_executemany-and-oracle-rdbms/ MS SQL ODBC 17 and Oracle Instant Client 19.9 ODBC drivers and newer do support "fast_executemany". PostgreSQL ODBC 13 does not. Here is the table: https://github.com/mkleehammer/pyodbc/wiki/fast_executemany-support-for-various-ODBC-drivers As to below, really? 
If you don't want to answer the question then don't. Boys just wanna have fun. This is a shameless plug of the famous Cyndi Lauper song, but you get the picture, don't you? To my credit, I didn't answer the 1st one. Tonight, I've been socializing with Sam Adams, so I replied. When someone who doesn't know how to properly write "O365" asks about "how suitable an ODBC driver is", then I feel entitled to a little fun. Now, since you don't appear to be a fan of my particular variety of humor, can you possibly translate the question and explain to me what a "suitable driver" is and what M365 is? If you manage to prove to me that the question makes sense, I will apologize to the OP and offer you a pint of Sam Adams as an apology for my bad behavior. On the other hand, if the question doesn't make sense, I will gladly accept the same from you. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Was my question inappropriate for postgres?
On 7/25/22 00:36, Adrian Klaver wrote: Suitable driver is just that a driver that will suit the needs of working on Windows 10 with MS Office/Office/Microsoft 365. Literally every ODBC driver on Windows fits this definition. That is the purpose of ODBC: having a uniform set of calls for various databases. That is why people still use it, even on Linux. And yes, like you, I am a Linux user too. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: [Beginner Querstion]Where can I find the news of database?
On 9/3/22 04:37, BeginnerC wrote: Hello everyone, I am a student from the University of Emergency Management and I am learning Postgres. As you can see, I want to get some of the latest news and opinions about the database, but I don't know where I can get them. (I have already subscribed to the RSS of Planet PostgreSQL, but I want to seek more.) Thanks in advance! Yours, BeginnerC. I assume that by "news" you mean "release notes", as there is no PostgreSQL equivalent of 60 Minutes. Release notes can be found here: https://www.postgresql.org/docs/release/ There is also a very nice blog with an RSS feed: https://planet.postgresql.org/ Any RSS reader like Liferea will be good for following the blog. As for the news, hurricane 15.1 is about to hit the Postgres world. For now, it is only a Cat 1 hurricane, but it's likely to strengthen..... Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Postgres SQL unable to handle Null values for Text datatype
On 9/6/22 02:10, Karthik K L V wrote: We are migrating from Oracle 12C to Aurora Postgres 13 and running into query failures when the bind value of a Text datatype resolves to null. Oracle is actually in the wrong here. Nothing should be equal to null, ever. There is also different behavior with unique indexes: [mgogala@umajor ~]$ docker start psql14-5 psql14-5 [mgogala@umajor ~]$ psql -U scott Password for user scott: psql (14.5) Type "help" for help. scott=> create table test1 (key1 integer,key2 integer, data varchar(10)); CREATE TABLE scott=> alter table test1 add constraint test1_uq unique(key1,key2); ALTER TABLE scott=> insert into test1 values(1,null,'a'); INSERT 0 1 scott=> insert into test1 values(1,null,'b'); INSERT 0 1 scott=> select * from test1; key1 | key2 | data --+--+--- 1 | | a 1 | | b (2 rows) The same thing would not work with Oracle. However, please note that, according to the SQL standard, NULL is not equal to anything, so those 2 rows are actually not a unique constraint violation. To enforce the uniqueness the same way as with Oracle, you actually need 2 indexes. You need to use the "coalesce" function. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
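A hedged sketch of the coalesce approach hinted at above, using the test1 table from the example (on a fresh copy, before the duplicate rows are inserted). The -1 sentinel is an assumption that it never occurs in real data:

```sql
-- Treat NULL in key2 as a concrete value for uniqueness purposes by
-- indexing coalesce(key2, -1) instead of key2:
CREATE UNIQUE INDEX test1_uq_nulls
    ON test1 (key1, coalesce(key2, -1));
-- With this index, the second insert from the example fails:
-- INSERT INTO test1 VALUES (1, NULL, 'b');  -- duplicate key error
```

For what it's worth, PostgreSQL 15 adds UNIQUE NULLS NOT DISTINCT, which enforces exactly this without the expression index.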
Re: WARNING: could not open statistics file "pg_stat_tmp/global.stat": Operation not permitted
On 9/8/22 23:05, Perry Smith wrote: This is an issue when PostgreSQL is running inside a container. In my quest to find an answer, I’ve discovered three instances where it has come up, and various people have talked about fixes, but no one seemed to notice what I found. I opened an issue here[1]. From within the container, files which I assume are created by PostgreSQL are ending up being owned by root rather than Postgres. Thus, to me, it appears to NOT be an issue of mapping internal UIDs and GIDs to external IDs, since there should not be anything outside the PostgreSQL container creating files inside Postgres’ data directory. The reason I’m sending this note to the general list is to ask how bad this error is. Some “solutions” are to make the pg_stat_tmp directory internal to the image, and that somehow resolves the issue, but I don’t think anyone really understands why, and things like that bother me. But I’m also curious whether that appears to be a viable solution. The result will be that when Postgres is stopped and the container exited, the next time Postgres starts back up, the pg_stat_tmp directory will be gone. Is that ok? Does Postgres store anything there that needs to survive a restart? Thank you for your help, Perry [1] https://github.com/docker-library/docs/issues/2188#issue-1367170047 Hi Perry, You probably need to fix your Dockerfile because your postgres seems to be running as root. 
Here is my Dockerfile:

FROM oraclelinux:8
LABEL Description="This install PostgreSQL 14 on top of Oracle Linux 8"
LABEL maintainer="Mladen Gogala"
RUN dnf -y update
COPY RPMS/pgdg-redhat-repo-latest.noarch.rpm /tmp
COPY RPMS/pg_hint_plan14-1.4-1.el8.x86_64.rpm /tmp/
COPY RPMS/pg_hint_plan14-llvmjit-1.4-1.el8.x86_64.rpm /tmp/
# Install PostgreSQL software
RUN dnf localinstall -y /tmp/pgdg-redhat-repo-latest.noarch.rpm
RUN dnf -qy module disable postgresql
RUN dnf -y install postgresql14
RUN dnf -y install postgresql14-libs
RUN dnf -y install postgresql14-server
RUN dnf -y install postgresql14-llvmjit
RUN dnf -y install postgresql14-contrib
# Install pg_hint_plan 1.4
RUN cd /tmp;dnf -y localinstall `ls *.rpm`
# Cleanup
RUN rm -f /tmp/*.rpm
RUN dnf clean all
# Copy postgresql.auto.conf (modified parameters) and run initdb
USER postgres
ARG PGPASSWD="qwerty"
ENV PGDATA=/var/lib/pgsql/14/data
ENV PATH=/usr/pgsql-14/bin:/usr/bin:/usr/local/bin
RUN echo "$PGPASSWD">/var/lib/pgsql/14/pgcluster.pwd
RUN initdb -A password --pwfile=/var/lib/pgsql/14/pgcluster.pwd
RUN echo "host all all 0.0.0.0/0 md5">>$PGDATA/pg_hba.conf
COPY --chown=postgres:postgres RPMS/postgresql.auto.conf /var/lib/pgsql/14/data
# Finish
EXPOSE 5432/tcp
ENTRYPOINT ["postgres"]

Please note that I switch users in the file and that owners are switched. I don't have any issues with the container built this way. I have another container which is built by untaring a backup of my sample database. I use the ADD command to untar it to $PGDATA. Also, my trick with putting the modified parameters into postgresql.auto.conf goes contrary to the Postgres recommendations. You may want to avoid doing that. I haven't pushed the image to the Docker hub because I'm still working on it. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: ***SPAM*** Re: WARNING: could not open statistics file "pg_stat_tmp/global.stat": Operation not permitted
On 9/9/22 00:08, Perry Smith wrote: The data directory is outside so it is persistent. The pg_stat_tmp is inside the data directory. Ah, that's the reason. The Docker daemon runs as root, so if you do a bind mount, files will be owned by root. You may want to use a normal Docker volume and not an external directory. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Is it possible to stop sessions killing each other when they all authorize as the same role?
On 9/12/22 18:51, Bryn Llewellyn wrote: I'll use "kill" here as shorthand for using the "pg_terminate_backend()" built-in function. I read about it in the "Server Signaling Functions" section of the enclosing "System Administration Functions" section of the current doc: www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADMIN-SIGNAL And I tried a few tests. All of the outcomes were just as the doc promised. I'm troubled by the notion that (as it seems) one session that authorizes as the role "r1" can easily list all other concurrent sessions that are also authorized as "r1"—and kill them all without restriction. (The doc does say "Use of these functions is usually restricted to superusers, with noted exceptions." So I s'pose that I'm talking about one of these noted exceptions.) It's common to design a three-tier app so that the middle tier always authorizes as just a single role—say, "client"—and where the operations that "client" can perform are limited as the overall design specifies. The maximal example of this paradigm defines the API to the database functionality by granting "execute" to just the designed set of subprograms. Here, the subprograms and the tables that they access all have owners other than "client". (The authorization of external principals, and how their identity is mapped to a unique key for use within that database, is outside the scope of what I write about here.) It seems far-fetched to think that the requirements spec for every such design would deliberately specify: — Must be possible for any "client" session to kill all other concurrent "client" sessions. Yet the paradigm is that the database API expresses exactly and only what the design says that it should. Ergo, the paradigm is, in general, unimplementable. I appreciate that (while the privileges that "client" has are unchanged) a just-killed session can easily reconnect by trying what they had just tried again. 
But not before suffering the fatal "57P01: terminating connection due to administrator command" error. The implication is that every client program must follow every database call with defensive code to detect error "57P01" and programmatically re-try. (Maybe some drivers can do this automatically. But I haven't found out if whatever psql uses can do this. Nor have I found out how to write re-try code in psql.) Does anybody else find all this as troubling as I do? And, if so, might a remedy be possible? Maybe something like this: — Define a new privilege as a cousin to "pg_signal_backend". I'll call it "pg_signal_backend_for_self_role" here. This would govern the possibility that a session can kill another session that authorized as the same role as itself. — Document the fact that "pg_signal_backend_for_self_role" is implicitly granted to a newly-created role (just as it's documented that "execute… to public" is implicitly granted to a newly created subprogram). — Allow "revoke pg_signal_backend_for_self_role from…"—by all means with extra rules like only a superuser can do this. Bryn, you can revoke execute on pg_terminate_backend from public and that will, by extension, revoke it from all users who do not have DBA privilege or have not been explicitly granted the "execute" privilege on pg_terminate_backend. This doesn't look like a big problem because applications usually don't contain code for killing other user's sessions. I am not sure that GTA is running on top of Postgres database. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
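A sketch of the revocation suggested above, run as a superuser. The grantee name in the commented GRANT is hypothetical:

```sql
-- Withdraw the default PUBLIC execute right on the function; the
-- argument list can be omitted because the name is unambiguous:
REVOKE EXECUTE ON FUNCTION pg_terminate_backend FROM PUBLIC;
-- Then grant it back selectively where the design calls for it:
-- GRANT EXECUTE ON FUNCTION pg_terminate_backend TO admin_role;
```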
Re: Is it possible to stop sessions killing each other when they all authorize as the same role?
On 9/13/22 00:49, Tom Lane wrote: Bryn Llewellyn writes: My non-superuser normal role with direct login, "u1", is *still* able to invoke pg_terminate_backend() and kill other "u1" sessions—even after this (as a super-user): Really? I did this in 14.5: regression=# revoke execute on function pg_terminate_backend from public; REVOKE regression=# select proacl from pg_proc where proname = 'pg_terminate_backend'; proacl --- {postgres=X/postgres} (1 row) (as expected, the superuser's own execute permission is all that remains) regression=# create user joe; CREATE ROLE regression=# \c - joe You are now connected to database "regression" as user "joe". regression=> select pg_terminate_backend(42); ERROR: permission denied for function pg_terminate_backend It very much looks as if what I have described was deemed to be a bug (after that behavior had survived from at least version 11) and that it's now been fixed! No, it very much looks like pilot error. But you've not shown us exactly what your test consisted of, so it's hard to say just where it went off the rails. regards, tom lane Tom, I did the same thing on 14.5, and it behaves as Bryn alleges: postgres=# select proacl from pg_proc where proname = 'pg_terminate_backend'; proacl --- {postgres=X/postgres} (1 row) So, the only user who should be able to execute pg_terminate_backend is "postgres". Let's try with user "scott". [mgogala@umajor ~]$ psql -U scott Password for user scott: psql (14.5) Type "help" for help. scott=> select pid from pg_stat_activity where usename='scott'; pid - 66 79 (2 rows) scott=> select pg_terminate_backend(66); pg_terminate_backend -- t (1 row) User scott has no special privileges: postgres=# select usesuper,usecreatedb,usebypassrls from pg_user where usename='scott'; usesuper | usecreatedb | usebypassrls --+-+-- f | f | f (1 row) Yet, it is still able to execute the function in question. 
My version is the following: scott=> select version(); version --------- PostgreSQL 14.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (R ed Hat 8.5.0-10), 64-bit (1 row) Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com