Re: Tools for moving normalized data around
On 19 Jan 2023, at 6:47, Peter wrote:
> Now I want to grab some part of the data, on a certain condition
> (let's say all records belonging to user 'Bob', if there is a "user"
> table somewhere at the tree-bottom), and move it to another database
> with the very same layout - which is already populated with data
> and runs a different instance of the same application.
>
> Grabbing the intended records is just some dead-simple Selects. But
> then inserting them into the other database is not fun, because some
> primary keys will likely collide.

A very small sliver in this problem: the key collision problem could be avoided if the unique and arbitrary keys were UUIDs. Many of the other keys should be related to their respective “table of truth”, so the migration is dependent on these being the same across the locations. The rest TBA down thread :)

Regards
Gavan Schneider
——
Gavan Schneider, Sodwalls, NSW, Australia
Explanations exist; they have existed for all time; there is always a well-known solution to every human problem — neat, plausible, and wrong.
— H. L. Mencken, 1920
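[Editor's sketch of the UUID suggestion above; table and column names are hypothetical, not from the thread. With UUID keys, rows generated on two independent instances of the same application can be copied across without renumbering.]

```sql
-- Each instance generates its own keys, yet collisions are (practically) impossible,
-- so rows can be moved between databases as-is.
-- gen_random_uuid() is built in from PostgreSQL 13 (earlier: pgcrypto).
CREATE TABLE users (
    user_id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
    name    text NOT NULL
);

CREATE TABLE records (
    record_id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id   uuid NOT NULL REFERENCES users (user_id),
    payload   text
);
```

The shared "table of truth" keys mentioned above (lookup codes and the like) still have to match on both sides; UUIDs only solve the arbitrary-surrogate-key half of the problem.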
Re: pg_upgrade 13.6 to 15.1?
On 16 Jan 2023, at 8:59, p...@pfortin.com wrote:
> encodings for database "template1" do not match: old "UTF8", new
> "SQL_ASCII" Failure, exiting

Suggest the old dB using UTF8 is the better practice, and the new dB should do likewise.

> "template1" is not a DB I've ever messed with; so this will require that
> I fire up the old version and change the encoding somehow?

This is created at initdb and mostly you don't need/want to mess with it.

> Is this likely to repeat for my actual databases?

AFAICT the least-work option is to redo the initdb for the new v15.1 database. There is a lot of pain (and potential data corruption) to be had trying to reconfigure the old one before it can be moved. Personally, UTF8 is the way to go. It will handle everything in the old database and whatever the future brings to the new one. I can see no advantage in pure ASCII when there is the potential for the real world to be contributing text. And there could well be non-ASCII characters lurking in the old dB, especially since someone set it up to receive them.

Regards
Gavan Schneider
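[Editor's note: before running pg_upgrade, the encodings of both clusters can be compared with a catalogue query; run this against the old and the new cluster and the results must agree.]

```sql
-- Shows the encoding of the template databases; pg_upgrade requires the
-- old and new clusters to match on these.
SELECT datname,
       pg_encoding_to_char(encoding) AS encoding
FROM pg_database
WHERE datname IN ('template0', 'template1', 'postgres');
```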
Re: Stored procedure code no longer stored in v14 and v15, changed behaviour
On 2 Dec 2022, at 6:51, Tom Lane wrote:
> Dominique Devienne writes:
>> On Thu, Dec 1, 2022 at 8:09 PM Christophe Pettus wrote:
>>> Since this is a custom-built system, there is nothing keeping you from
>>> creating your own table in the database that stores the original text of
>>> the function.
>>
>> That's not the point. If a DBA updates one of our triggers or procs or
>> whatever else, the recorded info in a custom table won't be affected.
>> We are diff'ing the server-side schema against the expected in-memory
>> model of the physical model.
>>
>> Thus the dictionaries are the only source of truth we can trust for
>> the current state of the schema. And besides minor syntactic differences,
>> and some more troublesome object-name rewrites, this is exactly what we
>> want. The system itself needs to preserve the original DDL IMHO. --DD
>
> Do you really fail to see the contradictions in this? You want the
> database to preserve the original DDL, but you also want it to update
> in response to subsequent alterations. You can't have both those
> things.

At the risk of stating the (over) obvious…

NEW: the Pg system has a parsed/tokenised version of functions and the like, with the excellent feature that these will be kept synchronised with any name change of objects referenced.

OLD: an easily available text version of functions, etc. with comments and layout (but not easily updatable in the event referenced objects get renamed) — sysadmin usage.

The contradiction is obvious but both needs can be justified…

NEXT(?): a human-readable version of function, etc. definitions could be generated from the parsed version, with the addition of tokens within this parsed version that allow the programmer's comments to be reinstated. Leave the layout style conventions to the system and promise to keep this style consistent and maintained as part of the feature.
Putting this generated (and updated as needed) text into pg_proc.prosrc would seem to be the least change needed to allow existing usage to move with the new Pg versions (and maybe help pgAdmin as well).

Regards
Gavan Schneider
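[Editor's note: the system can already regenerate a definition from the catalogue; what it cannot do, and what the NEXT(?) item above asks for, is preserve the original comments and layout. A sketch, with a hypothetical function name:]

```sql
-- Reconstructs a CREATE OR REPLACE FUNCTION statement from the parsed
-- catalogue entry; renames of referenced objects are reflected, but the
-- programmer's comments and layout are not.
SELECT pg_get_functiondef('my_schema.my_func(integer)'::regprocedure);
```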
Re: Seeking practice recommendation: is there ever a use case to have two or more superusers?
On 22 Nov 2022, at 10:05, Bryn Llewellyn wrote:
> Because PG allows a cluster to have as many superusers as you please, and
> because any one of these can create or drop another, any convention in this
> space needs some extra mechanisms to enforce it…
>
> … effectively tamper-proof implementation of the scheme …

A somewhat interesting thread so far, but it seems to be asking more than one question —

Q1. Is there ever a use case to have two or more superusers?
Answer: Yes, but entirely depending on the use case.

Q2. [IMPLIED] How to make the database tamper-proof, since at least one superuser is unavoidable?
Answer: Not possible, ever — see below.

It is best to consider a database security system's design objectives to be tamper-proof from the outside (i.e., the general client access perspective), and tamper-evident from within the database. As far as the server is concerned, one person's superuser tampering is another person's maintenance. There is no way to configure login credentials to prevent malicious or mistaken changes when you need to have the occasional superuser role that can repair a serious fault or process a system upgrade. If an upgrade or repair can be anticipated it should already be done; the superuser is needed for the things that were not expected or too complex to pre-automate.

AFAICT minimal tamper-evident criteria will include logs being kept of the changes made, and these kept on a system the database superuser cannot change. At worst the logs will still have recorded when they were turned off. And the logs should have recorded the credentials used to assume the superuser role. After that it is basic business management — was the person acting badly, were the credentials stolen, what damage was done, and are the backups useful? The full security analysis is complex, and searches around "threat landscape" will help widen the perspective.
The question becomes one of identifying what is possible to prevent, what can only be mitigated, and what can only be repaired after the fact. Database security is a lot more complicated than just trying to restrict the superuser role.

Gavan Schneider
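[Editor's sketch of a starting point for the tamper-evident logging described above: stock postgresql.conf settings, with syslog forwarded to a collector the database superuser cannot reach. A sketch, not a complete hardening recipe.]

```
# Send logs off the box; a superuser can change these settings, but the
# change itself will have been logged before it takes effect.
log_destination    = 'syslog'
log_connections    = on       # records the credentials used at login
log_disconnections = on
log_statement      = 'ddl'    # records role, permission and schema changes
```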
Re: ('dog$house' = quote_ident('dog$house')) is surprisingly FALSE
On 6 Oct 2022, at 16:04, Bryn Llewellyn wrote:
> Does this imply a risk that a future PG version will go against the SQL
> standard and reject any non-latin name that is free of all punctuation
> characters, when used in the role of a SQL identifier, unless it's
> double quoted?

From my perspective this thread seems to miss the essential purposes behind quote_ident(). It is part of processing external/user input —

1. Protection from PostgreSQL, which always maps everything (unquoted) to lower case before anything gets to the parser.
2. Protection against SQL injection when processing input from outside the trusted perimeter.

Expecting an arbitrary string to be equal to itself after it has been through string-processing code is risky unless that processing is part of the design, and quote_ident() was never designed to be part of any such arrangement. Expanding —

1. It is a complex question what happens to non-ASCII characters when they are mapped to lower case… sometimes this is a meaningful concept, e.g., Π -> π; sometimes it is not, e.g., pick any Chinese/Korean/Japanese character. If the designer decides to use non-ASCII characters in an identifier they can… just double-quote those identifiers. If the designer wants to use camelCase ASCII they can, but the identifier will be camelcase inside the machine unless it was double quoted. AFAIK we never really use quote_ident() except to process external input. As noted above, this function is not designed to be part of an equality test when attempting system introspection; rather —

2. The simple quote_ident() function can also be used to wrap untrusted input so it will not mess with the parser. It is used with quote_literal() when building dynamic SQL statements from user (i.e., untrusted) input.

From my perspective any use of these functions outside their scope is just that… outside their scope, with no promise this usage will work or comply with any current or future standard, or imply anything useful about pretty much anything.
Maybe I'm oversimplifying, but I believe the current functions work and do their specific jobs, and have nothing to do with anything else. So there is no surprise for me in the subject line. There is mild surprise the question was asked.

BTW this ignores whether or not PG mapping everything that's not quoted to lower case is standards compliant. This whole topic would be simpler if the case was left alone, but that road was travelled long ago and I believe most of the bridges have been burnt :)

Regards
Gavan Schneider
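[Editor's sketch of the intended scope described above, i.e., building dynamic SQL from untrusted input. Names and values are hypothetical; format() with %I and %L is the modern spelling of the quote_ident()/quote_literal() pair.]

```sql
-- %I applies identifier quoting (as quote_ident would),
-- %L applies literal quoting (as quote_literal would),
-- so neither input can mess with the parser.
DO $$
DECLARE
    tbl text := 'dog$house';   -- untrusted identifier from outside
    val text := 'O''Brien';    -- untrusted value from outside
BEGIN
    EXECUTE format('INSERT INTO %I (owner) VALUES (%L)', tbl, val);
END;
$$;
```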
Re: Creation of FK without enforcing constraint for existing data
On 2 Aug 2022, at 17:14, sivapostg...@yahoo.com wrote:
> Hello, I need to create a Foreign Key for a table without enforcing the
> constraint for existing data. A few orphans exist in the existing data,
> which we plan to resolve later. We use the following query to create the
> FK [which of course checks for the presence of the record in the
> referenced table].

Try something like (see https://www.postgresql.org/docs/current/sql-altertable.html):

    ALTER TABLE public.table1
        ADD CONSTRAINT "constraint1_FK" FOREIGN KEY (field1)
        REFERENCES public.table2 (field2) MATCH SIMPLE
        ON UPDATE RESTRICT
        ON DELETE RESTRICT
        DEFERRABLE INITIALLY DEFERRED;

> The facility NOT to check for the presence of the parent record is
> available in SQL Server. Searching for an equivalent option in
> PostgreSQL. Is it available? Else any other way to achieve this.

No knowledge of SQL Server and how this is specified there, but the …DEFER… syntax is according to the SQL standard.

Regards
Gavan Schneider
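[Editor's note: PostgreSQL also has a more direct equivalent of the SQL Server facility being asked about: the NOT VALID clause skips checking existing rows altogether, and the backlog can be validated once the orphans are repaired. A sketch reusing the same hypothetical names:]

```sql
-- NOT VALID: existing rows are not checked; only new and updated rows
-- are required to satisfy the constraint.
ALTER TABLE public.table1
    ADD CONSTRAINT "constraint1_FK" FOREIGN KEY (field1)
    REFERENCES public.table2 (field2)
    NOT VALID;

-- Later, after the orphans are resolved, check the existing rows in one pass:
ALTER TABLE public.table1 VALIDATE CONSTRAINT "constraint1_FK";
```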
Re: Concurrent INSERT statements with RETURNING clause resetting SERIAL sequence
On 20 Jul 2022, at 4:08, Francisco Olarte wrote:
> As a remark, in Spain bill numbers need to be gapless increasing. I have
> done it with a sequence (with cache 1, so I doubt it is much more
> performant than a single-row table, as it has a backing element which
> needs to be saved), and just used another process which swept the values
> and filled the gaps a posteriori (there is a way to fill a gap, it has
> legal uses, you can use it for that as long as you do not have too many:
> basically insert cancelled bills in the gaps). I probably would do it
> with the one-row table now for simplicity; I doubt I gained enough
> performance to justify the 20-something perl lines plus the crontab
> entry to use sequences. As the beard turns grey I appreciate more the
> simpler ways to do things.

One answer to this problem has been around for a while, and my version is shown below. No extra table is needed, just a windowed look at the table where the reference is used. My usage is at human speed so performance is not an issue, but it should be pretty good if there are minimal holes. What is not addressed is that a reference can be reissued up to the time the calling process commits an entry in the table and takes that reference out of circulation. There are different approaches to handling/preventing such collisions.

    CREATE OR REPLACE FUNCTION accounts.next_active_reference()
        RETURNS integer
        LANGUAGE 'sql'
        VOLATILE LEAKPROOF STRICT
        PARALLEL UNSAFE
        COST 3000 -- pure guesstimate
    AS $BODY$
        SELECT L.transaction_ref + 1 AS start
        FROM accounts.transaction_refs AS L
        LEFT OUTER JOIN accounts.transaction_refs AS R
            ON L.transaction_ref + 1 = R.transaction_ref
        WHERE R.transaction_ref IS NULL
          AND L.transaction_ref > 700 -- historic reasons only, added to existing system
        ORDER BY L.transaction_ref
        LIMIT 1;
    $BODY$;

    COMMENT ON FUNCTION accounts.next_active_reference() IS
    $COMMENT$
    Determines the next available reference number, making sure to fill any holes.
    The holes in the sequence prior to 700 are ignored (pure history), and
    this code will make sure any out-of-sequence data blocks will not cause
    conflicts.

    Credits:
    Ref: <http://www.xaprb.com/blog/2005/12/06/find-missing-numbers-in-a-sequence-with-sql/>
    2022 update: this link is now dead, only reporting "There is nothing here".
    $COMMENT$;

Regards
Gavan Schneider
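[Editor's sketch of typical usage; the amount column is hypothetical. Taking the reference and committing the row in the same transaction is what closes the reuse window mentioned above.]

```sql
BEGIN;
INSERT INTO accounts.transaction_refs (transaction_ref, amount)
VALUES (accounts.next_active_reference(), 42.00);
COMMIT;  -- the reference leaves circulation here
```

Between the SELECT inside the function and this COMMIT, a concurrent caller can be handed the same number; a unique constraint on transaction_ref plus a retry on conflict is one simple way to handle that collision.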
Re: newbie db design question
On 11 Jun 2022, at 17:05, Rino Mardo wrote:
> … experimental database design:
>
> create table products (
>     product_id serial primary key,
>     description text,
>     supplier_id) ;
>
> create table supplier (
>     supplier_id serial primary key,
>     description text) ;
>
> … the products table should be linked to the supplier table via "supplier_id"

This is best done via a foreign key constraint — see below.

> … what would be the data type of supplier_id?

This is a matter of style: either an integer (even serial), or text (which will help a human like yourself when checking the data).

> how then to proceed?

You could adopt a style convention advocated by Joe Celko, e.g., in "SQL for Smarties", where the table name is the plural form of the content item (primary key). This is not the only way to do business. Many on this list use integers as an id. The rationale behind the short text identifiers is that you can see what is going on when inspecting the data. So in this spirit you might also consider a "natural" key, i.e., text, as the primary key for the products. Opinions vary in this area and often devolve to performance. Remember the basic programmer's trade-off: your time versus hardware time. Mostly we are better off using conventions that make our time more effective, but some have truly serious data problems and may benefit from less human-friendly practices. As always YMMV.
So combining all these suggestions:

    pendari=# create table products (
    pendari(#     product serial primary key,
    pendari(#     description text,
    pendari(#     supplier text)
    pendari-# ;
    CREATE TABLE
    pendari=# create table suppliers (
    pendari(#     supplier text primary key,
    pendari(#     description text)
    pendari-# ;
    CREATE TABLE

I used pgAdmin 4 to build this constraint, and I suggest you will find it helpful as well. This is the definition of the constraint holding the supplier key between the tables —

    pendari=# ALTER TABLE IF EXISTS public.products
    pendari-#     ADD CONSTRAINT fkey_check__supplier_must_exist FOREIGN KEY (supplier)
    pendari-#     REFERENCES public.suppliers (supplier) MATCH SIMPLE
    pendari-#     ON UPDATE CASCADE
    pendari-#     ON DELETE RESTRICT
    pendari-#     DEFERRABLE
    pendari-#     NOT VALID;
    ALTER TABLE

You now have a basic skeleton which ties the supplier keys together. The supplier has to be entered first, then all the products from that supplier can be entered. This means any attempt to enter a product where the supplier is not known (or has been misspelt) will raise an error. IMNSHO constraints are good: they help preserve data integrity, allow keys to be changed (and the change cascades to all the places which matter), and only cause extra db work when new data is entered (i.e., no ongoing overhead).

Regards
Gavan Schneider
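[Editor's illustration of the constraint at work, using the table names from the skeleton above with made-up sample data:]

```sql
INSERT INTO suppliers (supplier, description) VALUES ('acme', 'Acme Tile Co');

INSERT INTO products (description, supplier)
VALUES ('blue widget', 'acme');    -- accepted: the supplier exists

INSERT INTO products (description, supplier)
VALUES ('red widget', 'acmee');    -- rejected: foreign key violation (typo)
```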
Re: Strange results when casting string to double
On 17 Feb 2022, at 8:02, Carsten Klein wrote:
> We have a trigger that monitors such updates and it reports a changed
> value, which is not correct. The problem is that the assignment
>
>     SET xyz = 1.56
>
> actually means
>
>     SET xyz = 1.55999
>
> since getting a double value from the string 1.56 yields 1.55999.
>
> I just wanted to ask whether someone knows something about this or has
> ever heard about such a behavior. You say the snowflake machine has
> issues... I don't believe in hardware issues, since it runs in VMware
> and likely on many different CPUs. Isn't it more than unlikely that such
> a constantly occurring error is caused by one faulty CPU (among that
> many CPUs an ESX server typically has)?

Harking back to my long-distant formative years, I was taught never to rely on equality tests when it comes to floating-point values. Nothing has changed in this regard. If binary-exact is part of the requirement then the answer is INTEGER or NUMERIC, never FLOAT, REAL, DOUBLE or anything similar.

So, assuming the trigger function is the only source of this spurious grief, one could apply the lesson taught in the 60s, from the dawn of computing:

    EQUALITY = absolute_value(op1 - op2) < epsilon
    -- where op1 & op2 are DOUBLE, and epsilon is smaller than you care about

Given the intrinsic (standards-compliant) uncertainty when converting from an absolute (e.g., string) representation to floating point, there will never be value in resolving why there are differences. I suggest using the comparison that is appropriate to the representation of those values, or fixing the design by using the proper representation.

Gavan Schneider
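[Editor's note: both lessons above in executable form; the epsilon of 1e-9 is arbitrary, chosen only for illustration.]

```sql
-- Classic floating-point surprise: 0.1 + 0.2 is not exactly 0.3 in a double.
SELECT (0.1::double precision + 0.2::double precision)
       = 0.3::double precision;                              -- false

-- Compare doubles with a tolerance instead of equality.
SELECT abs((0.1::double precision + 0.2::double precision)
           - 0.3::double precision) < 1e-9;                  -- true

-- Or use numeric, where decimal values are exact and = is safe.
SELECT 0.1::numeric + 0.2::numeric = 0.3::numeric;           -- true
```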
Re: Best Strategy for Large Number of Images
On 16 Dec 2021, at 7:12, Estevan Rech wrote:
> I have an application that takes pictures of items and uses them as
> evidence in a report.
>
> These images are saved and then used to generate a report.

Once you start talking reports and evidence I think audit trail and verification, a.k.a. "chain of evidence". With the images in the database they can be part of your information ecosystem, getting a datestamp along with a checksum, and all this included in the audit trail. All of this can be done with the images in the file system and the database holding the pointers only… it just may not be as easy to convince the opposition's lawyers that it is tamper-proof, except the best you can really get is tamper-evident.

> Each item averages 10 photos and I have about 2 million photos currently,
> with an average growth of 1 million photos over the next year.
>
> I think about using it within the database because of the ease of
> selecting records with the content of the images (when necessary).

That will be the same however you go. The selection will be on the context data you enter with the images. There has been no mention of searching within the image itself, and this is not something PG can do "out of the box" anyway.

> I think my biggest concern is to continue to use this strategy for the
> next 2 years and encounter a PostgreSQL limitation or some failure and
> have to migrate the database again, or have significant loss of images.
> And on the contrary too: if I use it on disk, in 2 years I may find some
> failure or file limit in the operating system (currently it's Windows
> Server 2016)...

The total image data will end up somewhere less than 4 TB if your estimates continue to hold. That's not going to stress PG. It's not such a big deal on a modern home computer either, but I can't make any informed comment on Windows anything. Realistically, once the data is in the database the whole can be moved to a more capable machine and/or OS if problems start to appear.
PG will work wherever you deploy it.

> Related to this is the backup service. It seems to me that in a disaster,
> I can restore the database relatively quickly if everything is in the
> database. On the disk, I believe that the restoration is much slower and
> I don't think it's reliable that all the images are copied/restored.

If you have a reliable, tested backup system for the database and don't want to add to the workload, then get the images into the database. If there are reasons why you end up with the images outside the database then, of course, the backup process has to be able to keep everything together and guarantee a full restore. Doable as well, but it does add some complexity.

If I am reading this correctly the use case is one of receiving data in the form of photos and field reports, entering the report items associated with the photos into the information system, generating a report (with embedded photos), and keeping records so disputes can be handled. This is not high-volume access to the images, so there is no real need for optimum filesystem speed to serve the images… keeping them in the database as bytea is perfectly workable and will work for data volumes well beyond the current estimates.

Gavan Schneider
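[Editor's sketch of the in-database approach with the checksum from the chain-of-evidence discussion baked in. Names are hypothetical; sha256() is built in from PostgreSQL 11, generated columns from PostgreSQL 12.]

```sql
CREATE TABLE photos (
    photo_id  bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    item_id   bigint NOT NULL,
    taken_at  timestamptz NOT NULL DEFAULT now(),  -- datestamp for the audit trail
    image     bytea NOT NULL,
    -- Integrity checksum computed by the server on insert/update;
    -- comparing it later makes tampering with the bytes evident.
    image_sha bytea GENERATED ALWAYS AS (sha256(image)) STORED
);
```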
Re: Alter and move corresponding: was The tragedy of SQL
On 16 Sep 2021, at 9:31, Gavin Flower wrote:
> would assign the value of 42 to x.

Which brings up another topic, e.g., https://news.mit.edu/2019/answer-life-universe-and-everything-sum-three-cubes-mathematics-0910

> Never tried it, I now wish I had!

You could see if it's accurately emulated, e.g., http://ibm1130.org/emu/

What is truly amazing about old-style FORTRAN is that it has a theological aspect. What other computer language can give truth to a maxim such as:

    IN FORTRAN GOD IS REAL

Gavan Schneider
Re: Sort question - Fractions, Metric etc
On 15 Aug 2021, at 7:47, Bret Stern wrote:
> I will say this business has been behind in the attributes game. Plus
> there are many "artsy" vendors who can hardly speak in these terms, and
> don't publish to us, so we do the best we can.
>
> Getting vendors to supply the basic values is a struggle.

I suspect you have already found all vendors reliably supply two values: how many (a dimensionless parameter), and how much (mostly in units of dollars). After that there may be length, width and thickness values specified with (hopefully) the same units, ± weight (per unit or box?), ± allowance for grout/joining (in case your application is going to be used as an aid in estimating quantities needed). The truly artistic supplier will refrain from making anything the same, and your customer will be expected to buy the item first and plan their project around it.

One possible data-wrangling scheme would be to give each item a catalogue number (yours) — just noticed Adrian saying something similar, so apologies for the overlap. The vendor_spec table would capture the vendor's identifier, description and supplied dimensions (with a units column). The dimensions_view (suggest a materialised view) would carry the dimension information in converted form, e.g., mm: numeric(8,3). Once the conversions are set up, the customer can be supplied with dimensions in the system of their choice and you have a sensible common reference point for any rankings.

When I first saw this thread I thought you were getting into metal dimensions and started to wonder if your next phase was to "organise" bolts according to diameter, length and thread… you have picked the easier course; the other is a nightmare, e.g., https://www.americanmachinetools.com/machinist_tables.htm

Gavan Schneider
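[Editor's sketch of the vendor_spec table and conversion view described above; all names, columns and the unit list are illustrative.]

```sql
CREATE TABLE vendor_spec (
    catalogue_no text PRIMARY KEY,   -- your number, not the vendor's
    vendor_id    text NOT NULL,
    vendor_item  text NOT NULL,      -- the vendor's own identifier
    length_val   numeric,
    width_val    numeric,
    units        text NOT NULL CHECK (units IN ('mm', 'cm', 'in'))
);

-- Everything normalised to millimetres; refresh after bulk loads.
CREATE MATERIALIZED VIEW dimensions_view AS
SELECT catalogue_no,
       (length_val * factor)::numeric(8,3) AS length_mm,
       (width_val  * factor)::numeric(8,3) AS width_mm
FROM vendor_spec
CROSS JOIN LATERAL (
    SELECT CASE units WHEN 'mm' THEN 1
                      WHEN 'cm' THEN 10
                      WHEN 'in' THEN 25.4 END AS factor
) AS f;
```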
Re: A simple question about text fields
On 18 Jun 2021, at 9:34, David G. Johnston wrote:
> On Thursday, June 17, 2021, Gavan Schneider wrote:
>
>> My approach is to define such fields as 'text' and set a constraint
>> using char_length(). This allows PG to do the business with the text in
>> native form, and only imposes the cost of any length check when the
>> field is updated… best of both worlds.
>
> Those are basically the same world… your alternative probably is strictly
> worse than varchar(n) because of its novel way of implementing the same
> functionality.

Not sure if this is strictly true. Novelty per se is not always worse. :) The design advantage is in all text fields being defined the same — no built-in length. When it becomes apparent a length constraint is needed it can be added for the relevant field(s), e.g., when the system does not impose proper restraint at the input stage.

> For most text fields any good constraint is going to be done in the form
> of a regular expression, one that at minimum prevents non-printable
> characters (linefeed and carriage return being obvious examples).

Agree. If the business rules need some additional restrictions they can go here as well. It is not so obvious that newlines, etc. are unwelcome: consider the need for human-readable text in comment fields (or other annotation where readability is enhanced by such layout). There is always the consideration of SQL injection (but it's mostly too late if we're leaving this to a constraint) and other toxic string sequences. But this is all business logic. The database just needs to handle the 'text', and we need to design the constraints around the content.

Gavan Schneider
Re: A simple question about text fields
On 17 Jun 2021, at 1:08, Tom Lane wrote:
> Martin Mueller writes:
>
>> Are there performance issues with the choice of 'text' vs. varchar and
>> some character limit? For instance, if I have a table with ten million
>> records and text fields that may range in length from 15 to 150, can I
>> expect a measurable improvement in response time for using varchar(150)
>> or will text do just or nearly as well.
>
> There is no situation where varchar outperforms text in Postgres.
> If you need to apply a length constraint for application semantic
> reasons, do so ... otherwise, text is the native type. It's
> useful to think of varchar as being a domain over text, though
> for various reasons it's not implemented quite that way.

This reminds me of my days converting from MySQL to PostgreSQL. MySQL, along with other databases, seemed to have a strong preference for setting a length on character strings. And all this from before the advent of UTF encoding, which has made the concept of string 'length' very messy.

Database guru and SQL author Joe Celko asserts in his 'SQL for Smarties' that if he finds a text field without a length limit he will input the Heart Sutra (presumably in ASCII :) to demonstrate the design error. (Of course he is ignoring the potential for this input to help the database achieve inner consistency. :) But taking Joe's central point, there do seem to be grounds for restricting user-input text fields to a reasonable length according to the business need… if only to limit the damage of a cat sitting on the keyboard.

My approach is to define such fields as 'text' and set a constraint using char_length(). This allows PG to do the business with the text in native form, and only imposes the cost of any length check when the field is updated… best of both worlds.
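[Editor's note: the approach described above in one definition; the table name and the 150-character limit are merely examples of a business rule.]

```sql
CREATE TABLE notes (
    note_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    -- Native text storage; the length rule is checked only on write,
    -- and can be dropped or altered later without a table rewrite.
    body    text CONSTRAINT body_reasonable_length
                 CHECK (char_length(body) <= 150)
);
```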
Gavan Schneider
Re: database sorting algorithms.
On 1 May 2021, at 17:06, Jian He wrote:
> Been self-studying databases; from databases I deep-dived into sorting
> algorithms. Databases can do an in-memory QuickSort. They also have an
> on-disk MergeSort. For MergeSort I follow this tutorial
> https://youtu.be/6pV2IF0fgKY?t=1108 (around 1 minute only). Also check
> https://en.wikipedia.org/wiki/Merge_sort
>
> But I am still not fully understanding n log n. I understand how many
> passes it will take, that is log n. Yes, each pass will sort N elements.
> But I still don't get what the N stands for in n log n.

So, answering the question… The 'n' refers to the need to do something to each element at least once, so the sort time grows in simple proportion to the size of the list that needs to be sorted. Unfortunately that is not enough work to get the list sorted, so extra steps are needed. The log(n) indicates how many extra steps (passes) are needed, and each of those passes touches all N elements once. So the overall performance is proportional to the number of elements (N) multiplied by the log of the number of elements, viz., N * log(N). For example, with N = 1,000,000 elements, log2(N) is about 20, so a merge sort does on the order of 20 million element operations, where a naive N-squared method would need on the order of a trillion.

Regards
Gavan Schneider
Re: archive_commnad parameter question
On 20 Apr 2021, at 7:09, Allie Crawford wrote:
> archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f' # Unix
>
> Does anybody know what "test" represents in this parameter configuration?

my_unix_prompt> man test

gives:

    Tests the expression given and sets the exit status to 0 if true, and
    1 if false. An expression is made up of one or more operators and
    their arguments.

In other words, "test" here is a Unix command which evaluates the expression supplied via the arguments. It is mostly used in its alternate form of "[ … ]" in shell scripts.

Gavan Schneider
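[Editor's illustration of why archive_command uses it: `test ! -f` succeeds only when the archive file does not already exist, so the `&& cp` never overwrites a previously archived WAL segment. A runnable sketch with throwaway paths:]

```shell
tmp=$(mktemp -d)
f="$tmp/000000010000000000000001"   # stand-in for an archived WAL file name

test ! -f "$f" && echo "would archive"   # file absent: test exits 0, cp would run
touch "$f"
test ! -f "$f" || echo "would skip"      # file present: test exits 1, cp is skipped

rm -r "$tmp"
```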
Re: unexpected character used as group separator by to_char
On 10 Mar 2021, at 16:24, Alvaro Herrera wrote:
> That space (0xe280af) is U+202F, which appears to be used for French and
> Mongolian languages (exclusively?). It is quite possible that in the
> future some other language will end up using some different whitespace
> character, possibly breaking any code you write today -- the use of
> U+202F appears to be quite recent.

Drifting off topic a little. That's a proper code point for things that will benefit from the whitespace but should still stay together. Also, it's not that new; it was added in 1999: https://codepoints.net/U+202F

The other use case is between a number and its unit symbol, e.g., 20 °C, and the non-breaking form is important here.

As for who uses a thin space in money… it is normal in European notation (https://www.languageediting.com/format-numbers-eu-vs-us/), and the thin space is part of the international standard for breaking up large numbers (from 1948); specifically, no dots or commas should be used in this role. The dot or comma is only to be used for the decimal point!

All of which just highlights problems in localisation, and raises the question of whether there is an ISO locale setting — it should work everywhere ;)

Gavan Schneider
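[Editor's note: the behaviour can be poked at directly; the 'G' template pattern takes whatever group separator the session's lc_numeric supplies, so the output varies by platform and installed locales. The French locale name below is an assumption about what is installed.]

```sql
SET lc_numeric = 'fr_FR.UTF-8';  -- if installed; the separator may be U+202F or U+00A0
SELECT to_char(1234567.89, '9G999G999D99');

SET lc_numeric = 'C';            -- no locale separator; PostgreSQL falls back to its defaults
SELECT to_char(1234567.89, '9G999G999D99');
```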
Re: vacuumdb not letting me connect to db
On 6 Feb 2021, at 23:06, Atul Kumar wrote:
> Thanks for providing the details, I need more clarification on this as how should I analyze that what should be ideal no. of connections should we set to avoid IO overhead based on the available hardware resources.

How to do this analysis properly is outside my expertise and likely off topic for this list. But you can get an overall idea doing what I suggested: measure the performance (i.e., total time to completion) when using different numbers of threads. In an ideal world the time will get less as you invoke more threads. But there will be a point where there is a bottleneck (or some resource hits its limit) and the time will be more or less the same no matter how many threads.

> How to do this calculation ?

That’s likely too hard. Measurement will tell you what you need to know. Once you know how many threads it takes to saturate you will know you need to use a lesser number. This may, or may not, get you to revisit the max 700 connections setting, but once you know there is a performance limit, and you decide it’s not good enough, then you can ask the hardware people for help.

> Note: even during 300 threads, my RAM utilisation is totally normal.

That’s not unexpected. The CPU activity should be using the data held in RAM, not spending all those cycles allocating RAM.

Gavan Schneider
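The measurement loop suggested above can be sketched like this. The `vacuumdb --jobs` flag is the documented CLI option; the sweep values and the assumption that each run starts from the same database state are mine:

```python
import subprocess
import time

def timed_run(cmd: list[str]) -> float:
    """Wall-clock seconds for one run of an external command."""
    t0 = time.monotonic()
    subprocess.run(cmd, check=True)
    return time.monotonic() - t0

# Sweep the thread count: total time should fall as jobs increase,
# then flatten (or rise) once some resource saturates.
# Uncomment on a test system with a restorable starting state:
# for jobs in range(50, 301, 50):
#     secs = timed_run(["vacuumdb", "--all", f"--jobs={jobs}"])
#     print(f"{jobs:>3} jobs: {secs:.1f}s")
```

The knee of the resulting curve is the saturation point; running with fewer threads than that wastes nothing, running with more only adds context-switching overhead.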
Re: vacuumdb not letting me connect to db
On 6 Feb 2021, at 3:37, Ron wrote:
> On 2/5/21 10:22 AM, Rob Sargent wrote:
>> On 2/5/21 9:11 AM, Ron wrote:
>>> Obviously... don't use 300 threads.
>> No, no Ron. Clearly the answer is more CPUs
> I hope you're being sarcastic.

A reasonable conjecture… though there is the consideration that 300 CPU intensive tasks spread across a given number of CPUs is going to waste some resources with context switching, i.e., need more CPUs :)

Basically, if there is plenty of wait time for I/O completion then CPU task switching can get more total work done. So far so obvious. In this thread I can see where it is disappointing to have a system considered capable of 700 connections getting saturated by a “mere” 300 threads. But this is only a “problem” if connections are equated to threads. PG max connection count is about external users having access to the resources needed to get a task done. Like all resource allocations this relies on estimated average usage, i.e., each connection only asks for a lot of CPU in brief bursts, and then the result is transmitted with a time lag before the connection makes another CPU demand. The system designer should use estimations about usage and load to budget and configure the system, and monitor it all against actual performance in the real world. Of course estimates are a standing request for outliers, and the system will show stress under an unexpected load.

So far I have not seen an analysis of where the bottleneck has occurred: CPU, RAM, HD and/or the data bus connecting these. One of these hardware resources has maxed out to the extent the system would not immediately pick up an additional work unit. As I see it, the OP started 300 CPU intensive tasks on hardware intended for 700 connections. If the connection count was designed with, say, 50% CPU intensive time per connection, you would expect this hardware to be fully saturated with 300 CPU intensive tasks.

More than that, doing the task with 300 threads would probably take longer than (say) 200 threads, as the increased CPU context switching time is just wasted effort. The OP now has a choice: decrease threads or (seriously) upgrade the hardware. We in the gallery would love to see a plot of total time to completion as a function of threads invoked (50-300 in increments of 50), assuming the starting conditions are the same :)

Gavan Schneider
Re: How to convert escaped text column - force E prefix
On 6 Jan 2021, at 19:43, Pavel Stehule wrote:
> Currently there are not any functions that you need. You need to write your own.
>
> CREATE OR REPLACE FUNCTION public.unistr(text)
>  RETURNS text
>  LANGUAGE plpgsql
>  IMMUTABLE STRICT
> AS $function$
> declare r text;
> begin
>   execute 'select e''' || $1 || '''' into r;
>   return r;
> end;
> $function$;
>
> Attention: This is ugly and possible sql injection vulnerable!!! But there is not another way. The fix is in queue

https://www.postgresql.org/docs/current/functions-string.html
> quote_literal ( text ) → text
> Returns the given string suitably quoted to be used as a string literal in an SQL statement string. Embedded single-quotes and backslashes are properly doubled. Note that quote_literal returns null on null input; if the argument might be null, quote_nullable is often more suitable. See also Example 42.1.
> quote_literal(E'O\'Reilly') → 'O''Reilly'

It is even more ugly, but would it at least help with the SQL injection risk?

Gavan Schneider
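The doubling behaviour quoted from the docs is the core of the defence: an attacker's embedded quote can no longer terminate the literal. A simplified Python model of quote_literal, for illustration only (the server's real implementation also handles the E'' prefix for backslash strings):

```python
def quote_literal(s: str) -> str:
    """Simplified model of PostgreSQL's quote_literal(text):
    double embedded backslashes and single quotes, then wrap
    the result in single quotes."""
    return "'" + s.replace("\\", "\\\\").replace("'", "''") + "'"
```

With this, `quote_literal("O'Reilly")` gives `'O''Reilly'` as in the docs example, and a payload such as `'; drop table x; --` stays inert inside a single string literal instead of ending it early.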
Re: DOMAIN/composite TYPE vs. base TYPE
On 29 Sep 2020, at 7:31, Joe Abbate wrote:
> Hello Rob,
>
> On 28/9/20 17:17, Rob Sargent wrote:
>> just record all three fields (day, month, year) with nulls and do the to-date as needed.
>
> That is not sufficient. An earlier implementation had something like a CHAR(8) to record YYYYMMDD, but how can you indicate, for example, an issue date of a bimonthly magazine, say July-Aug 2020? We can store 2020-07-01 in the DATE attribute, but we need another attribute to indicate it's really two months. Also, by storing three separate columns, you lose the beauty of the PG DATE abstraction.

This is only a partial “fix” and goes nowhere near solving the full wrapper/abstraction problem… Consider expressing all the component fields as a range. This allows you to be as precise as you need and still have the benefits of well defined comparison functions.

Regards
Gavan Schneider
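The range idea can be sketched in Python for illustration (in PostgreSQL itself a daterange would do the same job): the July-Aug 2020 issue becomes the half-open range [2020-07-01, 2020-09-01), which compares and tests for containment cleanly without an extra "precision" attribute. Function names here are hypothetical:

```python
from datetime import date

def month_span(year: int, first_month: int, last_month: int):
    """Half-open [start, end) date range covering whole months,
    so an imprecise date keeps well-defined comparison semantics."""
    start = date(year, first_month, 1)
    # first day of the month after last_month, rolling over December
    end = date(year + last_month // 12, last_month % 12 + 1, 1)
    return start, end

def contains(span, d: date) -> bool:
    start, end = span
    return start <= d < end
```

A day-precise date is just the one-day range [d, d+1), so exact and fuzzy dates live in the same representation and the usual range overlap/ordering operations apply to both.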
Re: PG12.2 Configure cannot enable SSL
On 15 May 2020, at 22:27, Peter Eisentraut wrote:
> Then the next step is to check in config.log for the details of the test failure. (Search the file for "openssl/ssl.h" to find the right place.)

Thank you, hopefully I’ll find my typos quicker this way.

configure:13101: checking openssl/ssl.h usability
configure:13101: /usr/bin/gcc -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument -O2 -I/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include -I/usr/local/Cellar/openssl@1.1/1.1.1g/include -I/usr/local/include -I/Library/Developer/CommandLineTools/usr/include conftest.c >&5
configure:13101: $? = 0
configure:13101: result: yes

Regards
Gavan Schneider
Re: PG12.2 Configure cannot enable SSL
On 15 May 2020, at 23:24, Tom Lane wrote:
> "Gavan Schneider" writes:
>> HDRS=${HDRS}:"/usr/local/Cellar/openssl@1.1/1.1.1g/include"
>> ...
>> --with-includes=${HRDS}
>
> If that's an accurate copy of your script, spelling HDRS correctly would help.

D’oh! More proof I haven’t been having a good week. But it’s getting better. Thank you for spotting this.

checking openssl/ssl.h usability... yes
checking openssl/ssl.h presence... yes
checking for openssl/ssl.h... yes

Now to get some real work done :)

Gavan Schneider
Re: PG12.2 Configure cannot enable SSL
On 14 May 2020, at 23:26, Tom Lane wrote:
> "Gavan Schneider" writes:
>> -bash-3.2$ ./configure --with-openssl \
>> > --with-includes=/usr/local/opt/openssl/include/openssl \
>> > --with-libraries=/usr/local/opt/openssl/lib
>> ...
>> checking openssl/ssl.h usability... no
>> checking openssl/ssl.h presence... no
>> checking for openssl/ssl.h... no
>> configure: error: header file is required for OpenSSL
>
> Offhand, I'll guess that you need to shorten the --with-includes path to
>     --with-includes=/usr/local/opt/openssl/include
> What you are showing here would only work if the header file's full path is
> /usr/local/opt/openssl/include/openssl/openssl/ssl.h
> which doesn't seem likely.

My bad. I made an error in cobbling together a reduced example which achieved the same error for the wrong reasons. I’ll share the larger version, but the problem remains. It does not seem to be as simple as a bad path.

-bash-3.2$ ./_Build_PostgeSQL.sh > build.log 2>&1
-bash-3.2$ ls -las /usr/local/Cellar/openssl@1.1/1.1.1g/include/openssl/ssl.h
224 -rw-r--r--  1 pendari  staff  111253 21 Apr 22:22 /usr/local/Cellar/openssl@1.1/1.1.1g/include/openssl/ssl.h
-bash-3.2$

Where _Build_PostgeSQL.sh looks like:

echo ===
echo = `date "+%Y-%m-%d %H:%M"`
echo ===

PORT=65432

# Working Base
BASE="/usr/local/pgsql/v12"

# Define basic tools == Homebrew versions (updated on a regular basis)
MAKE="/usr/local/bin/gmake"
CC="/usr/bin/gcc" # alternate? "/usr/local/bin/gcc-9"

LIBS=""
HDRS=""

# OPENSSL
LIBS=${LIBS}:"/usr/local/Cellar/openssl@1.1/1.1.1g/lib"
HDRS=${HDRS}:"/usr/local/Cellar/openssl@1.1/1.1.1g/include"
#LIBS=${LIBS}:"/usr/local/Cellar/libressl/3.1.1/lib"
#HDRS=${HDRS}:"/usr/local/Cellar/libressl/3.1.1/include"

# OPENLDAP
LIBS=${LIBS}:"/usr/local/opt/openldap/lib"
HDRS=${HDRS}:"/usr/local/opt/openldap/include"

# GENERIC
LIBS=${LIBS}:"/usr/local/lib:/Library/Developer/CommandLineTools/usr/lib"
HDRS=${HDRS}:"/usr/local/include:/Library/Developer/CommandLineTools/usr/include"

cd postgresql-12.2

./configure \
    CC=${CC} \
    --with-includes=${HRDS} \
    --with-libraries=${LIBS} \
    --prefix=${BASE} \
    --with-pgport=${PORT} \
    --with-python \
    --with-libxml --with-libxslt \
    --with-wal-blocksize=8 \
    --with-system-tzdata=/usr/share/zoneinfo \
    --with-pam \
    --with-ldap \
    --with-uuid=e2fs \
    --with-bonjour \
    --with-openssl \

echo ===
echo ===
echo --with-includes =${HDRS}
echo --with-libraries=${LIBS}
echo CC is: $CC
echo MAKE is: $MAKE
echo ===
echo ===
exit

DEBUG and generated:

===
= 2020-05-15 21:43
===
checking build system type... x86_64-apple-darwin19.4.0
checking host system type... x86_64-apple-darwin19.4.0
.
.
.
checking zlib.h usability... yes
checking zlib.h presence... yes
checking for zlib.h... yes
checking openssl/ssl.h usability... no
checking openssl/ssl.h presence... no
checking for openssl/ssl.h... no
configure: error: header file is required for OpenSSL
===
===
--with-includes =:/usr/local/Cellar/openssl
PG12.2 Configure cannot enable SSL
Greetings:

This problem may be related to several threads I have found, e.g.,
https://www.postgresql.org/message-id/29069.1575298784%40sss.pgh.pa.us

Context:
Building PG12.2 and need to enable SSL. Previous version builds have been uneventful, but I haven’t attempted to enable SSL before.

Sources:
PG — standard source distribution 12.2
OpenSSL — Homebrew
macOS Catalina v 10.15.4

Checks:
configure finishes normally if I do not attempt to enable SSL
the header files are at the end of the —with-includes
they are readable
searches for this problem seem to relate to older versions and are no longer relevant in that I seem to have the include files properly identified

Problem:
I really don’t know where to go with this except to hope someone can see my =obvious= error. All help appreciated.

Relevant terminal:
-bash-3.2$ cd postgresql-12.2
-bash-3.2$ ls -las
total 1488
   0 drwxrwxrwx@ 18 pendari   staff     576 11 Feb 09:29 .
   0 drwxr-xr-x+  8 postgres  wheel     256 14 May 16:17 ..
   8 -rw-r--r--@  1 pendari   staff     730 11 Feb 09:14 .dir-locals.el
   8 -rw-r--r--@  1 pendari   staff    1547 11 Feb 09:14 .gitattributes
   8 -rw-r--r--@  1 pendari   staff     504 11 Feb 09:14 .gitignore
   8 -rw-r--r--@  1 pendari   staff    1192 11 Feb 09:14 COPYRIGHT
   8 -rw-r--r--@  1 pendari   staff    3909 11 Feb 09:14 GNUmakefile.in
   8 -rw-r--r--@  1 pendari   staff     284 11 Feb 09:14 HISTORY
 128 -rw-r--r--@  1 pendari   staff   61606 11 Feb 09:29 INSTALL
   8 -rw-r--r--@  1 pendari   staff    1665 11 Feb 09:14 Makefile
   8 -rw-r--r--@  1 pendari   staff    1212 11 Feb 09:14 README
   8 -rw-r--r--@  1 pendari   staff     522 11 Feb 09:14 aclocal.m4
   0 drwxrwxrwx@ 23 pendari   staff     736 11 Feb 09:28 config
1120 -rwxr-xr-x@  1 pendari   staff  572134 11 Feb 09:14 configure
 168 -rw-r--r--@  1 pendari   staff   83465 11 Feb 09:14 configure.in
   0 drwxrwxrwx@ 59 pendari   staff    1888 11 Feb 09:28 contrib
   0 drwxrwxrwx@  7 pendari   staff     224 11 Feb 09:28 doc
   0 drwxrwxrwx@ 22 pendari   staff     704 11 Feb 09:29 src
-bash-3.2$
-bash-3.2$ ./configure --with-openssl \
> --with-includes=/usr/local/opt/openssl/include/openssl \
> --with-libraries=/usr/local/opt/openssl/lib
checking build system type... x86_64-apple-darwin19.4.0
checking host system type... x86_64-apple-darwin19.4.0
...
checking zlib.h usability... yes
checking zlib.h presence... yes
checking for zlib.h... yes
checking openssl/ssl.h usability... no
checking openssl/ssl.h presence... no
checking for openssl/ssl.h... no
configure: error: header file is required for OpenSSL
-bash-3.2$
-bash-3.2$ /usr/local/Cellar/openssl@1.1/1.1.1g/bin/openssl version
OpenSSL 1.1.1g  21 Apr 2020
-bash-3.2$

Gavan Schneider
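Since configure compiles a probe containing `#include <openssl/ssl.h>`, each `--with-includes` directory must be the *parent* of the `openssl/` folder, not that folder itself. A small Python check of which candidate directories satisfy that layout (the function name is hypothetical; the paths in the commented call are the ones from this thread and will differ per machine):

```python
import os

def dirs_providing(header: str, include_dirs: list[str]) -> list[str]:
    """Return the include dirs from which `#include <header>` would
    resolve, i.e. those containing the header at that relative path."""
    return [d for d in include_dirs
            if os.path.isfile(os.path.join(d, header))]

# dirs_providing("openssl/ssl.h",
#                ["/usr/local/opt/openssl/include",           # resolves
#                 "/usr/local/opt/openssl/include/openssl"])  # does not
```

Running a check like this against every `-I` directory in the configure invocation quickly shows whether the failure is a path problem or something else.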