[GENERAL] Re: Dynamic use of RAISE with USING to generate and catch non-hardcoded custom exceptions
>This works for me:
>
>DO $$
>DECLARE
>  v_msg TEXT := 'SOMETHING IS WRONG';
>  v_sqlstate TEXT := 'E0001';
>BEGIN
>  RAISE EXCEPTION USING message = v_msg, errcode = v_sqlstate;
>EXCEPTION
>  WHEN SQLSTATE 'E0001' THEN
>    RAISE NOTICE '%', 'Error E0001 raised - going to do something about it';
>  WHEN OTHERS THEN
>    RAISE NOTICE 'OTHER ERRORS: %,%', sqlstate, sqlerrm;
>END$$;
>NOTICE: Error E0001 raised - going to do something about it
>
>Or you could do
>  RAISE EXCEPTION SQLSTATE v_sqlstate USING message = v_msg;

The first form does indeed work! The second suggestion,

  RAISE EXCEPTION SQLSTATE v_sqlstate USING message = v_msg;

however, does not. I think that was the format I had also tried, and why I went down the dynamic route. So it seems variables can be used in the USING subclause but not outside it. The manual does seem to hint at this: "after level if any, you can write a format (which must be a simple string literal, not an expression)". Anyway, RAISE EXCEPTION USING message = v_msg, errcode = v_sqlstate; works a treat! Many thanks Tom & Pavel. Mike
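For anyone landing here later, a minimal sketch of the table-driven pattern the original post was after. The lookup table app_error_codes (with columns code and message) is hypothetical, invented purely for illustration:

DO $$
DECLARE
    v_msg      TEXT;
    v_sqlstate TEXT;
BEGIN
    -- app_error_codes is a hypothetical lookup table (code TEXT, message TEXT)
    SELECT message, code
    INTO v_msg, v_sqlstate
    FROM app_error_codes
    WHERE code = 'E0001';

    RAISE EXCEPTION USING message = v_msg, errcode = v_sqlstate;
EXCEPTION
    WHEN SQLSTATE 'E0001' THEN
        RAISE NOTICE 'Error E0001 raised - going to do something about it';
    WHEN OTHERS THEN
        RAISE NOTICE 'OTHER ERRORS: %, %', sqlstate, sqlerrm;
END $$;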
[GENERAL] Dynamic use of RAISE with USING to generate and catch non-hardcoded custom exceptions
I'm trying to get a dynamic version of the RAISE command working so that I can use a table of custom application error messages and codes for use by all developed plpgsql functions. This way the custom error codes and messages are not hard-coded and are defined consistently in one place in the db. However, I cannot get a dynamic/parameterised version of the RAISE command working with the USING syntax - I want to do this so that I can catch the raised error in an EXCEPTION block. The following shows an example of a (working) hard-coded version:

DO $$
DECLARE
BEGIN
  RAISE EXCEPTION 'Something is wrong' USING errcode = 'E0001';
EXCEPTION
  WHEN SQLSTATE 'E0001' THEN
    RAISE NOTICE '%', 'Error E0001 raised - going to do something about it';
  WHEN OTHERS THEN
    RAISE NOTICE 'OTHER ERRORS: %', sqlstate;
END $$

which raises and catches the custom error E0001 and returns (as expected): NOTICE: Error E0001 raised - going to do something about it. Now what I am trying to achieve is the same as above, but with the message text and errcode retrieved from a table before issuing the RAISE EXCEPTION statement. i.e. assume v_msg and v_sqlstate have been retrieved and contain:

v_msg = 'Something is wrong'
v_sqlstate = 'E0001'

Then what I want to raise dynamically is:

RAISE EXCEPTION v_msg USING errcode = v_sqlstate;

and be able to use the same exception block as in the hard-coded example above. I searched and found a couple of similar examples where RAISE EXCEPTION '%', i_msg is used and works, but this does not allow a custom SQLSTATE to be raised and trapped. i.e. the following runs ok:

DO $$
DECLARE
  v1 TEXT;
BEGIN
  v1 := 'SOMETHING IS WRONG';
  RAISE NOTICE '%', v1;
  RAISE EXCEPTION '%', v1;
EXCEPTION
  WHEN SQLSTATE 'E0001' THEN
    RAISE NOTICE '%', 'Error E0001 raised - going to do something about it';
  WHEN OTHERS THEN
    RAISE NOTICE 'OTHER ERRORS: %', sqlstate;
END $$

and returns:

NOTICE: SOMETHING IS WRONG
NOTICE: OTHER ERRORS: P0001

but obviously the WHEN OTHERS has caught the exception and the SQLSTATE is the default P0001. So what I really want is similar to the above, but with the USING keyword of RAISE being dynamic/parameterised. So I tried the following:

DO $$
DECLARE
  v_msg TEXT := '''SOMETHING IS WRONG''';
  v_sqlstate TEXT := '''E0001''';
  v1 TEXT;
BEGIN
  v1 := v_msg || ' USING errcode = ' || v_sqlstate;
  RAISE NOTICE '%', v1;
  RAISE EXCEPTION '%', v1;
EXCEPTION
  WHEN SQLSTATE 'E0001' THEN
    RAISE NOTICE '%', 'Error E0001 raised - going to do something about it';
  WHEN OTHERS THEN
    RAISE NOTICE 'OTHER ERRORS: %,%', sqlstate, sqlerrm;
END $$

which returns:

NOTICE: 'SOMETHING IS WRONG' USING errcode = 'E0001'
NOTICE: OTHER ERRORS: P0001,'SOMETHING IS WRONG' USING errcode = 'E0001'

So clearly the whole of v1 (whilst syntactically correct) is treated as the message, and the default sqlstate of P0001 is still raised and caught by WHEN OTHERS. I have tried a few other things but cannot find a way to get a custom errcode/sqlstate to be raised except by hard-coding it! I'm sure there must be a way to do this! Any help or advice on how to achieve this is very much appreciated!
Re: [GENERAL] Surprising results from array concatenation
On Tue, Apr 25, 2017 at 12:53 PM, Tom Lane wrote: > Yeah. The core problem here is that the parser has to disambiguate the > || operator: is it "anyarray || anyelement" or "anyarray || anyarray"? > <...> > Peeking at the contents of the literal would make the behavior very > unpredictable/data-dependent, so we don't. Fair enough. Would a note in that section of the docs pointing out this behavior be worthwhile?
[GENERAL] Surprising results from array concatenation
The docs (section 9.18 for PG 9.6) show as an example for array concatenation ARRAY[4,5,6] || 7, which works fine. However, trying the same with an array of text doesn't work:

# select array['a','b','c'] || 'd';
ERROR:  malformed array literal: "d"
LINE 1: select array['a','b','c'] || 'd';
DETAIL:  Array value must start with "{" or dimension information.

Casting the second value to TEXT works:

# select array['a','b','c'] || 'd'::TEXT;
 ?column?
-----------
 {a,b,c,d}
(1 row)

The assumption that the second argument is an array constant seems surprising.

Mike Blackwell | Technical Analyst, Distribution Services/Rollout Management | RRD
1750 Wallace Ave | St Charles, IL 60174-3401
Office: 630.313.7818
mike.blackw...@rrd.com
http://www.rrdonnelley.com
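If the cast feels clunky, array_append (and its sibling array_prepend) sidesteps the anyarray || anyarray ambiguity, since it has only one signature to resolve against. A quick sketch:

-- unambiguous: array_append(anyarray, anyelement)
SELECT array_append(ARRAY['a','b','c'], 'd');    -- {a,b,c,d}

-- the operator form works once the literal's type is pinned down
SELECT ARRAY['a','b','c'] || 'd'::text;          -- {a,b,c,d}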
Re: [GENERAL] Column Tetris Automatisation
|From: Nicolas Paris |Hello, |In postgresl order of columns does have an non negligeable impact on table |size[1]. |Table are in many cases dynamic, and new fields can appear in the database life. |I suspect re-ordering columns based on types would be an automatisable task | |Moreover, most client code should not be dependent on column order. Then a |TETRIS option in VACUUM FULL would be usefull for many users. |Does that makes sense ? For me it doesn't. As a database architect/engineer, I always create a data model first (using some software like Xcase, Embarcadero, Erwin), then use that software to generate the DDL to create the physical database objects, then start creating stored functions against those objects. Since my code follows my model, and since I don't use "select *", and since I prefer to have all of my identifiers at the "top" of a table, order does matter...it all ties together nicely, making it easier for other developers to follow an identical pattern across all of the database objects. All of that said, the notion of embedding Tetris functionality into a codebase makes me smile, for some reason... Mike Sofen -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
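For anyone wanting to gauge how much "Tetris" potential a given table has, a rough sketch of a catalog query that lists each column's type, alignment, and storage length; 'my_table' is a placeholder name. Ordering columns with wide fixed-length types first and variable-length (typlen = -1) types last approximates the hand-packed layout:

SELECT a.attname, t.typname, t.typalign, t.typlen
FROM pg_attribute a
JOIN pg_type t ON t.oid = a.atttypid
WHERE a.attrelid = 'my_table'::regclass   -- hypothetical table name
  AND a.attnum > 0
  AND NOT a.attisdropped
ORDER BY t.typlen DESC;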
Re: [GENERAL] LYDB: What advice about stored procedures and other server side code?
|From: Christoph Moench-Tegeder |Initially, running code in your database can make life easier for the developers |(ise pgTap for testing, pl/profiler and pl/debugger, etc.). But once you have to |change your schema, the hurt begins: |you'll need downtime for that, or you'll have to deal with the possibility of |changing the API of your "database side" code, and matching code and tables on |the database. I look at this from the opposite direction: with a stable database API (via stored procs), I can change the schema and logic within the procs without causing any app code breakage…the app tier is completely insulated from those changes – that’s worth a lot. Yes, for deploying the schema change there must be an outage, but that’s true regardless of where the data is being manipulated – the data is still in a db…and there are ways to mitigate/eliminate the duration of the outage. |The next pain point is scalability: running code on the database server puts your |code on the most expensive and hardest to scale CPUs. You can (almost) always |add another appserver to your setup (just spin up a VM with a tomcat or |whatever-you-use). But if the bottleneck is your database CPUs, you'd have to |move to a larger server Our dedicated db servers have not yet shown any real CPU consumption during app use - memory, for us, is the only real limiting factor. The only time CPU consumption spikes is during admin activities - reindexing, vacuuming, bulk data loads...that sort of thing. Even the boxplot calculations barely cause a ripple. To me that speaks to the efficiency of language and engine working together. You are right of course on the scaling - if we do run out of CPU horsepower and cannot scale up any more, we'd have to scale out, and there are ways to do that too. IOWs, the model doesn't have to change, just the solution to solve the scaling (solve the problem, not the symptom). |TL;DR: database side code can be a great thing in a small application, but once |the application and traffic grows, "code in the database" |requires specialist attention and may become a burden. |Unfortunately, most large applications started small... In my opinion, having a database specialist work on database stuff is a GOOD thing. Tables get designed properly, correct indexes are built, efficient query plans are created, etc. ORMs are a shortcut to getting an app talking to data, but aren't a substitute for a proper, scalable data tier. IMO...being a data specialist... :-) Mike Sofen (Synthetic Genomics) -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Performance PLV8 vs PLPGSQL
From: Tim Uckun I have seen various links on the internet which indicate that PLV8 is significantly faster than PL-PGSQL sometimes an order of magnitude faster. Is there any benefit to choosing PL-PGSQL? I can’t speak to PLV8. However, I can speak to plpgsql, and specifically stored functions (procs). I use it exclusively to create a database API for real-time web applications to hit. My API calls (procs) are hitting large tables, sometimes doing complex logic within the sproc. It allows me to provide a simple, standardized interface to the web devs, allowing them to focus on the app code work. Performance is superb and continues to surprise me (I came from the SQL Server world). As others have mentioned, the natural lashup of plpgsql to postgres (I liked Alban’s term, “impedance”), is a key aspect. Also: -stored procs provide another security layer against sql injection attacks. -Caching SEEMS to be more efficient/effective with stored procs (that could be wishful thinking too). -Stored procs allow skilled sql practitioners to provide far more sophisticated sql solutions than the typical python developer is capable of…my experience is that most web devs don’t really understand databases (or even care about them – they are a necessary evil), so providing a pure encapsulated sql solution (via stored procs) removes that mental impedance mismatch. -Performance? Simple “get” procs that return data for a specific indexed query against larger tables (50m+ rows) in a few milliseconds…I can live with that kind of performance. -I’m also doing some heavy lifting in the sql, calculating histograms and boxplots for data visualizations. This is an unusual scenario, but the other option is sending a massive chunk of data to another server for processing – just the transit time would kill the deal. I am mindful that at a certain point, there won’t be enough memory and i/o to go around, but the web app is a low user count/high user task complexity app, so I’ve tailored the model to match. Mike Sofen (Synthetic Genomics)
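As an illustration of the kind of thin "get" proc being described, a minimal sketch; the api schema, samples table, and column names are all invented for the example:

CREATE OR REPLACE FUNCTION api.get_sample(p_sample_id bigint)
RETURNS TABLE (sample_id bigint, sample_name text, created_at timestamptz)
LANGUAGE plpgsql STABLE SECURITY DEFINER AS $$
BEGIN
    RETURN QUERY
    SELECT s.sample_id, s.sample_name, s.created_at
    FROM samples s
    WHERE s.sample_id = p_sample_id;   -- hits the primary key index
END $$;

-- the web tier only ever calls: SELECT * FROM api.get_sample(42);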
Re: [GENERAL] Syntax error needs fresh eyeballs
From: Rich Shepard
My schema includes three tables and psql throws an error I'm not seeing when I try to read the schema into the database. I'd appreciate fresh eyes looking at the table and learning what error I'm not seeing. The tables:

CREATE TABLE Weather (
  site_id INTEGER PRIMARY KEY,
  site_name TEXT,
  site_location TEXT
);

CREATE TABLE Weather_Params (
  site_id INTEGER REFERENCES Weather(site_id),
  param TEXT,
  param_unit TEXT,
  freq INTEGER,
  freq_unit TEXT,
  equip TEXT,
  PRIMARY KEY (site_id, param)
);

CREATE TABLE Weather_Data (
  site_id INTEGER REFERENCES Weather(site_id),
  monit_date DATE,
  monit_time TIME,
  read_by TEXT, -- name of employee collecting data
  param TEXT REFERENCES Weather_Params(param),
  param_value REAL,
  PRIMARY KEY (site_id, monit_date, monit_time, param)
);

The error:

ERROR: there is no unique constraint matching given keys for referenced table "weather_params".

---

Just a guess... You've camel-cased the table names but aren't using double quotes. Try either lower-casing all table names and references to them, or double-quoting all identifiers. Mike
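For the record, the error text itself points at the reference REFERENCES Weather_Params(param): the referenced column list must match a unique or primary key constraint, and the primary key of Weather_Params is the pair (site_id, param). A hedged sketch of Weather_Data rewritten with a composite foreign key that satisfies that requirement:

CREATE TABLE Weather_Data (
  site_id INTEGER REFERENCES Weather(site_id),
  monit_date DATE,
  monit_time TIME,
  read_by TEXT, -- name of employee collecting data
  param TEXT,
  param_value REAL,
  PRIMARY KEY (site_id, monit_date, monit_time, param),
  FOREIGN KEY (site_id, param) REFERENCES Weather_Params (site_id, param)
);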
Re: [GENERAL] Storing files: 2.3TBytes, 17M file count
From: Thomas Güttler Sent: Monday, November 28, 2016 6:28 AM ...I have 2.3TBytes of files. File count is 17M Since we already store our structured data in postgres, I think about storing the files in PostgreSQL, too. Is it feasible to store file in PostgreSQL? --- I am doing something similar, but in reverse. The legacy mysql databases I’m converting into a modern Postgres data model, have very large genomic strings stored in 3 separate columns. Out of the 25 TB of legacy data storage (in 800 dbs across 4 servers, about 22b rows), those 3 columns consume 90% of the total space, and they are just used for reference, never used in searches or calculations. They range from 1k to several MB. Since I am collapsing all 800 dbs into a single PG db, being very smart about storage was critical. Since we’re also migrating everything to AWS, we’re placing those 3 strings (per row) into a single json document and storing the document in S3 bins, with the pointer to the file being the globally unique PK for the row…super simple. The app tier knows to fetch the data from the db and large string json from the S3 bins. The retrieval time is surprisingly fast, this is all real time web app stuff. This is a model that could work for anyone dealing with large objects (text or binary). The nice part is, the original 25TB of data storage drops to 5TB – a much more manageable number, allowing for significant growth, which is on the horizon. Mike Sofen (Synthetic Genomics USA)
Re: [GENERAL] initdb createuser commands
From: Samuel Williams Sent: Sunday, October 30, 2016 3:42 PM As a community I'd think that having feedback from a new user would be valuable since as you say, sometimes when you get ingrained into the "way of doing things" that you don't see how they could be improved or different. Samuel I’d take a different tack. I spent 20 years with SQL Server and easily (almost gleefully) hopped over to Postgres and especially pgplsql and PgAdmin III, from using SqlServer Management Studio (SSMS – their admin/coding app). Sure, I had to learn the PG way of doing things, but really, it was a no-brainer. I had to spend a few extra cycles learning the PG best practices and particular way of doing things but it was trivial…google and done. The vast community has created massive amounts of examples for nearly everything imaginable – and some things I would never have imagined anyone would try to do – such that I don’t have to Lewis and Clark it but just dive right in and write code. IMO, nothing major needs changing in the language or command syntax – it’s logical and easy for anyone skilled in sql. If someone isn’t skilled in sql, the requests you’ve made won’t assist them at all. Mike Sofen (Synthetic Genomics)
Re: [GENERAL] Immutable datastore library?
From: Guyren Howe Sent: Monday, October 17, 2016 1:40 PM I would like to use Postgres as an immutable data store. A subject table would have a timestamp column, and that would be added to what would otherwise be the primary key. Trap updates and turn them into inserts. Have an associated _deleted table. Trap deletes and turn them into inserts of the primary key into that table. Create a view that only shows the non-deleted records with the most recent timestamps. Stored procedure to do all that to a table. Event trigger to drop and re-create the view on changes to the table columns. -- A couple years ago at another firm, I designed and built a real time ODS (operational data store) for a large healthcare company, on SQL Server 2012 that supported your exact requirements, since the ODS was the primary data feed for their data warehouse. My solution leveraged the Merge tsql function ("Upsert" in PG) to detect inserts/updates/deletes. We don't allow physical row deletes in medical data, so these were logical deletes aka an update to an "InactivatedDatetime" column making it not null. I used a checksum function in the Update branch to detect if the inbound data had changed at all, to avoid creating dry updates (no change in the data but a new Update row would be written otherwise). Ok that's the internals for the write to the ODS. I wrapped the entire Merge statement inside of another insert statement using the equivalent of the PG "Into" function, which took every column from the ODS write and wrote the same data to a "History" table of the same name - those writes were always inserts, creating a persistent, complete picture of every write to the ODS. Each row going into the History tables was marked with a "D" (delete), "I" (insert) or "U" (update). The History data was used for both auditing and for nightly batches feeding the data warehouse, where row type (D, I, or U) drove their Type 2 processing. As you can imagine, the table design was crucial to the success of this model. This was ultra-efficient on the real time data flowing in from the hospital and clinical EMRs (different systems) - in one transaction/one query, I was able to double-write the data and ensure both writes completed or fail both and error. The "batches" were small - up to 100k rows or less, and processed in under 100ms. But even when there was a network outage and we had to do a catch up load with millions of rows, it ran very quickly. IOWs, the double write overhead was very modest, especially with modern disk performance. Mike Sofen (Synthetic Genomics) -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
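A rough Postgres translation of that double-write idea, assuming 9.5+ for INSERT ... ON CONFLICT. The patient/patient_history tables, columns, and values are invented for illustration, and the D/I/U row-type flag from the original design is omitted for brevity; the point is that both writes happen in one statement, so they succeed or fail together:

-- minimal shapes for the sketch
CREATE TABLE patient (
    patient_id bigint PRIMARY KEY,
    payload    jsonb
);
CREATE TABLE patient_history (
    patient_id  bigint,
    payload     jsonb,
    recorded_at timestamptz
);

WITH upsert AS (
    INSERT INTO patient AS p (patient_id, payload)
    VALUES (42, '{"name": "example"}'::jsonb)
    ON CONFLICT (patient_id) DO UPDATE
        SET payload = EXCLUDED.payload
        WHERE p.payload IS DISTINCT FROM EXCLUDED.payload  -- skip "dry" updates: no change, no history row
    RETURNING patient_id, payload
)
INSERT INTO patient_history (patient_id, payload, recorded_at)
SELECT patient_id, payload, now()
FROM upsert;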
Re: [GENERAL] Graphical entity relation model
From: jotpe Sent: Wednesday, September 28, 2016 1:17 PM
Does anybody know of software for generating graphical entity relation models from existing PostgreSQL databases? Best regards Johannes
---
I don't know any useful free ones, but my favorite less expensive one (and the one I've used for the last few years) is xcase (http://www.xcase.com/). It can reverse engineer against most current popular databases including Postgres. The current version has 2 Postgres quirks: 1) you cannot (for some bizarre reason) declare a json/jsonb column, so I declare it as text and simply adjust the generated DDL as needed. 2) you cannot declare an index on a text column (as if they are thinking it is akin to a sql server text column). Other than those issues, which can be easily worked around, it is by far the fastest modeling tool I've ever used (I've used the more famous ones), and fully comprehensive, with a very modern looking UI. In contrast, there are the over-priced dinosaurs with old ugly UIs. A while back I reviewed some of the modeling tools, and none did it for me, so I went ahead and got another license for xcase. Mike Sofen (Synthetic Genomics)
Re: [GENERAL] how to monitor the progress of really large bulk operations?
From: Pavel Stehule Sent: Tuesday, September 27, 2016 9:18 PM
2016-09-28 6:13 GMT+02:00 Pavel Stehule <pavel.steh...@gmail.com>:
Hi
2016-09-27 23:03 GMT+02:00 Mike Sofen <mso...@runbox.com>:
Hi gang, how to view the state of a transaction in flight, seeing how many rows have been read or inserted (possible for a transaction in flight?), memory allocations across the various PG processes, etc.

some years ago I used a trick http://okbob.blogspot.cz/2014/09/nice-unix-filter-pv.html#links
pltoolbox has a counter function https://github.com/okbob/pltoolbox/blob/master/utils.c

pavel=# insert into omega2 select (x.xx).* from (select pst.counter(omega,20, true) xx from omega ) x;
NOTICE: processed 20 rows, current value is '(5,8)'
NOTICE: processed 20 rows, current value is '(5,8)'

Regards Pavel

Pavel - That's a very interesting function and thanks for sharing your toolbox. The big question, of course, is what is the impact on performance, scalability and stability? Would it work inside of a stored function that would allow me to write out the progress to a tracking table? Mike
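On the tracking-table question: rows written inside the still-running transaction won't be visible to other sessions until commit, so a plain tracking table can't show mid-flight progress. Sequences, however, are not transactional, so one rough workaround (a sketch only; the sequence name and batch size are arbitrary) is to tick a sequence every N rows inside the loop and watch it from a second session:

CREATE SEQUENCE etl_progress;   -- hypothetical progress counter

-- inside the long-running plpgsql loop, roughly:
--   IF row_counter % 100000 = 0 THEN
--       PERFORM nextval('etl_progress');
--   END IF;

-- from another session, while the load is still running:
SELECT last_value * 100000 AS rows_processed FROM etl_progress;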
[GENERAL] how to monitor the progress of really large bulk operations?
Hi gang, On PG 9.5.1, linux, I'm running some large ETL operations, migrate data from a legacy mysql system into PG, upwards of 250m rows in a transaction (it's on a big box). It's always a 2 step operation - extract raw mysql data and pull it to the target big box into staging tables that match the source, the second step being read the landed dataset and transform it into the final formats, linking to newly generated ids, compressing big subsets into jsonb documents, etc. While I could break it into smaller chunks, it hasn't been necessary, and it doesn't eliminate my need: how to view the state of a transaction in flight, seeing how many rows have been read or inserted (possible for a transaction in flight?), memory allocations across the various PG processes, etc. Possible or a hallucination? Mike Sofen (Synthetic Genomics)
Re: [GENERAL] IDE for function/stored proc development.
From: Tim Uckun Sent: Saturday, September 03, 2016 2:37 AM Does anybody use an IDE for doing heavy duty stored proc development? PGadmin is decent but I am looking for something better. I have been using the Datagrip app (from Jetbrains), from its beta release up through now v 2016.2 and love it. Full autocomplete, it has my object browser on the left, source code file browser on the right (I have it tied into our git), massively customizable to look and behave any way you want it. It is not effective yet for admin tasks. I really love the modern interface. I have zero connection to the company, paid for a license ($200). Worth every penny. I’m a hardcore stored proc/func dev, building database api’s for the large systems I design/build…I write a LOT of code and datagrip is where I do it. I also use the crash-prone PgAdminIII for admin stuff like table mods, quick scripting of table def or insert columns, or backups, etc…the normal admin stuff. MikeS
Re: [GENERAL] Rackspace to RDS using DMS (Postgres 9.2)
From: Joshua D. Drake Sent: Thursday, September 01, 2016 9:29 AM On 08/31/2016 03:41 PM, Patrick B wrote: > Hi guys, I posted this question on the ADMIN list but will post here > as well so more people can comment... > <https://www.postgresql.org/message-id/CAJNY3it_AfxJhmwMHtpiAbHG47GS5rJ> > https://www.postgresql.org/message-id/CAJNY3it_AfxJhmwMHtpiAbHG47GS5rJ > OAUgfHw%2BGm5OXCbUm7w%40mail.gmail.com > > I've got a 2.3TB Database running at Rackspace... We'll be migrating > it to RDS PostgreSQL 9.5 very soon... > > We already have an EC2 Instance at Amazon running PostgreSQL 9.2 as > streaming replication from Rackspace. > > I'll have to upgrade the version of Postgres on that instance before > start using DMS service. > > *Question:* > Has anybody ever used that service? I'm just trying to find out how > much time it will take to perform the migration... It is 2.3TB, it is going to take a long time no matter what service you are running. No, I have not used DMS. Frankly, with all respect to AWS/RDS the idea of running a 2.3TB instance that will get any level of performance sounds ridiculously expensive. Sincerely, JD I currently have an EC2 instance in AWS – an m4.xlarge (4 cores, 16gb, 3tb SSDs) and it’s pretty cheap, about $620/mo ($210/mo for the compute, $410 for the storage). The performance of this setup rivals in-house Cisco UCS server that we are demoing that costs ~$100k, as long as our batch sizes don’t exceed available memory – that’s where the larger Cisco pulls ahead. The $620/mo is the on-demand price, btw…the reserved price is much lower. $100k/ $620 = 161 months of operation before cost parity. Mike S
Re: [GENERAL] UPDATE OR REPLACE?
On Thu, Sep 1, 2016 at 12:10 PM, dandl <da...@andl.org> wrote: > Sqlite has options to handle an update that causes a duplicate key. Is > there anything similar in Postgres? > This is not an UPSERT. The scenario is an UPDATE that changes some key > field so that there is now a duplicate key. In Sqlite this handled as: > UPDATE OR IGNORE table SET > UPDATE OR REPLACE table SET > > And so on > > See https://www.sqlite.org/lang_update.html. > > Can Postgres do this? I would propose that this effectively violates referential integrity and shouldn't be a valid design pattern. In my mind primary keys are supposed to be static, stable, non-volatile...aka predictable. It feels like an alien invading my schema, to contemplate such an activity. I hope PG never supports that. Postgres allows developers incredible freedom to do really crazy things. That doesn't mean that they should. Mike Sofen (USA) -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] UUIDs & Clustered Indexes
From: George Neuner Sent: Tuesday, August 30, 2016 5:54 PM
>Mike Sofen wrote: So in this scenario, I'm using BOTH bigserials as the PK and uuids as AKs in the core tables. I reference the bigints for all joins and (have to) use the uuids for the filters. It's been working ok so far, lookup performance on a table with a few million rows, using the uuid (indexed) is instantaneous. I'll soon have 100 million+ rows loaded into a single table and know a bit more.
>
>The uuids are also design insurance for me in case I need to shard, since I'll need/want that uniqueness across servers.

FYI: articles about sharding using bigint keys.
http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram
http://rob.conery.io/2014/05/29/a-better-id-generator-for-postgresql/
George

I remember reading these articles a long time ago, forgot about them... and appreciate the reminder! I really liked the enhanced Instagram function from Rob Conery in the second link, but so far haven't needed to deal with it. However, an upcoming project may require huge data storage - approaching hundreds of billions of rows, and I'm sticking with Postgres - so this will be a great way to test drive the function. And I may try my hand at a further enhancement, time permitting. Thanks for the links! Mike
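For the archives, the generator described in those links has roughly the following shape. The epoch value, shard number, and sequence name here are placeholders, not a drop-in implementation:

CREATE SEQUENCE shard_1_id_seq;

CREATE OR REPLACE FUNCTION next_shard_id(OUT result bigint)
LANGUAGE plpgsql AS $$
DECLARE
    our_epoch  bigint := 1314220021721;   -- arbitrary custom epoch, in milliseconds
    seq_id     bigint;
    now_millis bigint;
    shard_id   int    := 1;               -- placeholder logical shard number
BEGIN
    SELECT nextval('shard_1_id_seq') % 1024 INTO seq_id;
    SELECT FLOOR(EXTRACT(EPOCH FROM clock_timestamp()) * 1000) INTO now_millis;
    result := (now_millis - our_epoch) << 23;   -- 41 bits of time
    result := result | (shard_id << 10);        -- 13 bits of shard
    result := result | (seq_id);                -- 10 bits of per-shard sequence
END $$;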
Re: [GENERAL] UUIDs & Clustered Indexes
From: Tom Lane Sent: Tuesday, August 30, 2016 7:16 AM
Luke Gordon <gord...@gmail.com> writes:
> However, according to a message on this mailing list, Postgres doesn't have clustered indexes: "But Postgres doesn't _have_ clustered indexes, so that article doesn't apply at all. The other authors appear to have missed this important point." https://www.postgresql.org/message-id/56798352.7060902%40uchicago.edu
> But, doing a quick check, it appears Postgres does indeed have a mechanism for a clustered index: https://www.postgresql.org/docs/9.5/static/sql-cluster.html

CLUSTER just does a one-time sort to put the table into index order. There is no mechanism that would cause subsequent insertions of new keys to respect that ordering, so it's pretty much irrelevant to the argument about whether new UUID keys need to be generated in some ordered fashion. Do you actually *need* UUID keys, and if so why? A plain old bigint column is smaller, cheaper to index, and the natural mechanism for generating it (ie a sequence) will tend to preserve ordering for free.

regards, tom lane

I agree with Tom for a "normal" application - I would always use bigints (bigserial) as a PK column. The app I am currently working on is a high security web app for which the app coders require guids for all identifiers flowing around the system. So in this scenario, I'm using BOTH bigserials as the PK and uuids as AKs in the core tables. I reference the bigints for all joins and (have to) use the uuids for the filters. It's been working ok so far, lookup performance on a table with a few million rows, using the uuid (indexed) is instantaneous. I'll soon have 100 million+ rows loaded into a single table and know a bit more. The uuids are also design insurance for me in case I need to shard, since I'll need/want that uniqueness across servers. Mike Sofen
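In DDL terms, the both-keys arrangement being described looks something like this sketch (table and column names invented):

CREATE TABLE sample (
    sample_id  bigserial PRIMARY KEY,   -- internal key, used for all joins
    sample_uid uuid NOT NULL DEFAULT gen_random_uuid() UNIQUE,  -- exposed to the app, used in filters
    payload    jsonb
);
-- gen_random_uuid() comes from the pgcrypto extension on 9.x;
-- uuid_generate_v4() from uuid-ossp works just as well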
Re: [GENERAL] Re: Clustered index to preserve data locality in a multitenant application?
From: Thomas Kellerer Sent: Tuesday, August 30, 2016 4:39 AM
Nicolas Grilly wrote on 30.08.2016 at 13:12:
> We rely on clustered indexes to preserve data locality for each tenant. Primary keys start with the tenant ID. This way, rows belonging to the same tenant are stored next to each other. Because all requests hit only one tenant, this is a great performance improvement.

What about partitioning by tenant? With a local index on each partition. Partitioning is currently a bit limited in Postgres (e.g. you can't have incoming foreign keys) but this would fit your requirements pretty much as I understand them.

For Nicolas's situation, that would require 10,000 partitions – not very useful, and each partition would be very small. The premise of clustered indexes is that the row data is "in the index", so no row lookups are needed – that's very effective/performant and clearly works well for many scenarios. In Postgres, as you mentioned, clustering is a "one time" operation, but only in the sense that after you add more rows, you'll need to re-cluster the table. Depending on the activity model for that table, that may be feasible/ok. For example, if you load it via regular batch scripts, then the clustering could be done after those loads. If you add rows only rarely but then do lots of updates, then the clustering would work great. If this is an active real time data table, then clustering would not be viable.

But you may still be fine without them. Here's why: Postgres is a very fast database engine. I am constantly surprised at how performant it is. I came from the SQL Server world where I always leveraged the automatic clustered indexes, but in Postgres I am not using them and am running some big data with no table partitioning (yet) and performance is still very good. I carefully optimize my data models, so that is part of it. I also carefully optimize my stored functions/indexes. I am migrating data from 500+ mysql databases/~8,000 tables/~20 billion rows into a single Postgres db. As my data grows, I may shard it. The new parallel query support in PG v9.6 may also support your table model very nicely, depending on how you query into it.

So I'd say, just build a prototype PG db, build that one table, load your existing data into it (very easy via the mysql_fdw foreign data wrapper – it's exactly what I am doing…ping me off list if you need some ideas), put a regular index on it and run some queries. If you have a decent size dev server to work on, you should see adequate performance. And, what you'll be incredibly pleased with is the remarkably rich and flexible plpgsql coding environment. It's heaven for sql devs.

BTW, I am putting together a rather detailed examination of Postgres ETL/bulk loading performance, on 2 different platforms: 4 cores/16gb ram/3tb SSD on AWS (Amazon cloud), and 48 cores/256gb ram/10tb SSD on a monster loaner Cisco UCS server. Should have that posted to the Perform list later this week. Mike Sofen (USA)
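If it helps to see the non-partitioned fallback spelled out, a sketch of the tenant-leading index plus the one-time CLUSTER mentioned above (table and column names invented):

CREATE TABLE contacts (tenant_id int, contact_id bigint, payload jsonb);

CREATE INDEX contacts_tenant_idx ON contacts (tenant_id, contact_id);

-- one-time physical reorder; must be repeated after enough new rows arrive
CLUSTER contacts USING contacts_tenant_idx;
ANALYZE contacts;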
Re: [GENERAL] Stored procedure version control
-Original Message- >From: Mark Morgan Lloyd Sent: Thursday, June 30, 2016 2:41 AM >Neil Anderson wrote: >> On 2016-06-29 12:37 PM, Mark Morgan Lloyd wrote: >>> Elsewhere, somebody was asking how people implemented version control >>> for stored procedures on (MS) SQL Server. >>> >>> The consensus was that this is probably best managed by using scripts >>> or command files to generate stored procedures etc., but does anybody >>> have any comment on that from the POV of PostgreSQL? >>> > [etc.] Thanks everybody, summary passed on. >Mark Morgan Lloyd >markMLl .AT. telemetry.co .DOT. uk A bit late to the thread, but here's some specific details on how I've implemented version control in PG 9.5, in a small team environment deploying to single database servers in each tier (dev, qa, stage, prod). It's working well so far, and allows my stored proc versions to be aligned with the middle and upper tier code releases. I'm the lead database architect-engineer for a brand new genomics application (lots of data). Details: - we're using git for version control, with a base name for each repo that holds a single micro-service (like "JobManager") and a suffix for the data tier code ("JobManagerDBMS") making it simple for devops to find the related code for a micro-service deployment by repo. - within a DBMS repo, I've got subfolders like "scripts", "sprocs", "documentation", where scripts holds ad hoc scripts that need to be run during a deployment (might be adjusting DDL or seeding or cleaning up data), sprocs for stored function files that must be compiled into PG, and documentation holds notes, data models, etc. We have a simple python script that compiles/recompiles all stored proc files within a named folder - deployment done with one call. - I only code using source code files, by cloning an existing suitable base stored proc (akin to a template) to a new file name (like a "get" or "set" stored proc) and then revising to match the requirement. In a detailed comment block within each stored proc, I list a version number (just for reference, not used programmatically at this point), change history, author, comments, and one or more sample calls that form the basis of my unit tests. - after I've finished the requested work and the stored procs are working as expected, I update a Version file in the folder, push it into git and merge it into the development branch. - let's say a new enhancement request comes in. I create a new branch (like "RequestForNewThingy"), version that, do the work and merge it back in just like the above. So we've got isolation and persistence of changes. - I happen to be using the new DataGrip code editor, which supports this beautifully, since my git tree appears on the right side of editor window, allowing me directly edit/clone without leaving the editor. My coding efficiency using this model is quite high...the overhead of using git is trivial. For rollbacks, we can simply point to the prior stored proc version and recompile those. For DDL rollbacks, I have to code those scripts and supply them...this is the one place I have to spend a bit more time creating a more automated solution. I'd love to hear how other folks have solved programmatic rollbacks. Mike Sofen (San Diego, CA USA) -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] OT hardware recommend
From: Adam Brusselback Sent: Saturday, June 18, 2016 1:26 PM
Subject: Re: [GENERAL] OT hardware recommend
It's really amazing how much solid state drives transferred the database bottleneck away from disk.

Adam – so very true. We used to spend ungodly amounts of time/money/effort to mitigate disk performance limitations. It is almost MAGIC what SSDs do now. Real numbers don't lie: 5400 rpm disks can muster no more than 65 IOPS (7200s get ~90-100, 10k get 140-150). So: 15 x 65 = 975 IOPS (aka boohoo).

Using the AS SSD Benchmark, the Samsung 480gb m2 850 EVO in my core i7 laptop measures (IOPS first, then MB/s):

Random 4k blocks: 7,235 iops read, 14,012 iops write
Random 4K-64Threads: 97,743 iops read, 68,864 iops write
Random 512B: 14,380 iops read, 19,858 iops write (db comparison here)

MB/s:
Sequential: 500 MB/s read, 449 MB/s write
Random 4K: 28.26 MB/s read, 54.74 MB/s write
4K-64Threads: 381.81 MB/s read, 269.00 MB/s write (this is closer to what db access looks like)
Access times: 0.070 ms read, 0.050 ms write

Thus, 1 x SSD = 14.75 times faster than a 15-drive array on reads, and 20 times faster on writes. Like everyone else has said, just buy a 1 TB Samsung EVO 850 for $300 (USD) and call it a day. :) Mike
Re: [GENERAL] PostgresSQL and HIPAA compliance
-Original Message- From: Alex John Sent: Friday, June 17, 2016 3:04 AM To: pgsql-general@postgresql.org Subject: [GENERAL] PostgresSQL and HIPAA compliance Hello, I have a few questions regarding the use of PostgreSQL and HIPAA compliance. I work for a company that plans on storing protected health information (PHI) on our servers. We have looked at various solutions for doing so, and RDS is a prime candidate except for the fact that they have explicitly stated that the Postgres engine is *not* HIPAA compliant. Users on the IRC channel generally say that the guidelines are more catered towards building better firewalls and a sane access policy, but I would like to know if there is anything within the implementation of Postgres itself that violates said compliance. If anyone works at a similar company and utilizes postgresql to store PHI, please let me know. Thank you, Alex - HIPAA compliance does not specify (ever) the technical solution to meet the requirements, so ANY datastore that can be properly managed within the context of HIPAA compliance is legal and allowed. Ignore IRCs and search on securing PHI on relational databases, you'll find lots of details around data access roles, documentation, processes, data obfuscation, etc. Mike -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] regarding schema only migration from sqlserver to postgres with runmtk.sh
From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of David G. Johnston Sent: Wednesday, June 15, 2016 1:31 PM To: Durgamahesh Manne Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] regarding schema only migration from sqlserver to postgres with runmtk.sh

On Wed, Jun 15, 2016 at 11:27 AM, Durgamahesh Manne wrote: may I know the recommended approach to migrate the tables from sql server to postgresql instead of using 3rd party tools

I suppose SQL Server has some means to export in TSV/CSV or other row-oriented structured output. The \copy psql meta-command or COPY SQL command could load that data. David J.

There are MANY tools for sql server, many are built in (aka free), that will generate DDL, or extract data, or both, plus you can directly query the system catalogs, just like PG. Utterly trivial stuff to use, and Postgres can ingest all of it since it's normally either flat file or ansi sql. MikeS
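As a concrete example of that load path, assuming the SQL Server side has produced a CSV export; the file and table names below are invented:

-- client-side, from psql (file is read on the client machine):
\copy customers FROM 'customers.csv' CSV HEADER

-- or server-side, with the file readable by the postgres server process:
COPY customers FROM '/tmp/customers.csv' WITH (FORMAT csv, HEADER true);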
Re: [GENERAL] Alternate or Optimization for with hold cursor
>From: Sangeetha Sent: Thursday, May 12, 2016 1:58 AM
Currently, I am using a "With hold" cursor. In our case, the With hold cursor is used to fetch the next record of the given primary key. The performance is very slow for large data sets. Can you provide me some alternative ways, like having my own copy of the table, or an optimization for the With hold cursor? Thanks and Regards, S.Sangeetha
<==
Cursors are the last tool I would ever grab out of my sql toolbox (aka, I never use one) - it converts the enormous power of a relational database engine into "RBAR" (row by agonizing row). For a large dataset in particular, you are killing the server since the entire resultset must be retained in working memory for the duration of the query as it peels off one row at a time from that resultset, OR, if it's larger than your ram, you'll be paging to disk constantly. And since you're working on a single row at a time, it will take forever. Convert the cursor into a normal query and you should see BIG (10-1000x) gains in speed. A cursor can always be converted to normal sql... always... it's not always easy but it's always worth the effort. Mike Sofen
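To make "fetch the next record of the given primary key" concrete without a cursor, a couple of set-based sketches (table and column names invented):

-- next row after a known key, one indexed probe:
SELECT *
FROM orders
WHERE order_id > 12345
ORDER BY order_id
LIMIT 1;

-- or compute every row's successor in a single pass with a window function:
SELECT order_id,
       lead(order_id) OVER (ORDER BY order_id) AS next_order_id
FROM orders;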
Re: [GENERAL] Thoughts on "Love Your Database"
From: dandlSent: Wednesday, May 04, 2016 5:05 PM To: 'Pierre Chevalier Géologue' <pierrechevalierg...@free.fr> > From: Pierre Chevalier Géologue [ <mailto:pierrechevalierg...@free.fr> > mailto:pierrechevalierg...@free.fr] > ... > > Then I think you've seriously misunderstood. Most people can indeed > >learn to write basic SQL queries, but those are > >(obviously) not what I'm talking about. > > > > To write the business logic of a significant application entirely in > >SQL requires PLSQL (or in other dialects, whatever passes for SQL/PSM). > >It means writing an entire data access layer as a set of stored > >procedures, with a substantial set of special functions, types, > >triggers and so on. No beginner and few experts have the skills > >required to do that in SQL, and then debug that code on the server. > > All right, I understand better now. I think I also totally missed > your point, sorry... > I'll give a look at andl. I hope you do. Please feel free to contact me with any comments, suggestions, etc. I have not completed the Postgres implementation -- probably another couple of weeks – but in-memory and Sqlite are there. Bonne chance! Regards David M Bennett FACS === I disagree. I’ve worked as database architect/engineer at a number of large and small firms in various verticals (healthcare, financials, insurance, aerospace, telecom, etc), and created complete database api’s via stored procs/stored functions, some of which were quite complex. I’ve found that a mid-level database developer, with modest coaching and good comments in the code, can pick up the code, support it and even enhance it. So the notion that experts can only write and maintain quality code isn’t valid in my experience. There is definitely a difference in capability/velocity/solution solving between junior, mid-level and senior developers, but that isn’t a deal killer, it’s just something that needs to be managed and accounted for. One reason for a database api is that ORMs have proved themselves incapable of proper scaling and ACID compliance, where stored procs/functions are capable of leveraging the massive set-based relational power of the underlying engine, and leverage efficient functionalities like windowing functions. So I guess you’d say I’m in the entirely opposite camp, since it’s proven to be such an effective solution architecture for many applications that leverage relational database engines. Mike Sofen (San Diego, CA USA)
Re: [GENERAL] Thoughts on "Love Your Database"
>From: Vincent Veyron Sent: Wednesday, May 04, 2016 3:35 PM >I write management applications for businesses, and give _a lot_ of care to the database structure. >I find that the number of lines of code that need to be written is strictly inversely correlated to the >appropriateness of the database design (meaning that the better the database structure, the lesser >code is needed). >Knowing about the many fine functions Postgresql offers also helps, of course. > Bien à vous, Vincent Veyron An inverse corollary is also true, imo: encountering demands for exotic, wild functions, cursors and code constructs (like triggers calling triggers or frankly, triggers in general), is nearly always an indication of poor database design. I'm not talking about windowing functions or json or CTEs, btw. Postgres and mysql have piles and piles of functions that I will never use and can't even imagine scenarios in which to use them. So I agree 100% - it's all about the database (design). BTW, I'm currently designing/building OLTP databases for use in genomics research (using Postgres)...that's big data...where there is zero tolerance for slack db design that could cause scalability or performance issues. My stored functions are...relatively simple. Mike Sofen (San Diego, CA USA) -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Function PostgreSQL 9.2
From: David G. Johnston Sent: Tuesday, May 03, 2016 2:46 PM To: drum.lu...@gmail.com …The only other reasonable option is change your model and requirements to something less complex. Seriously, get yourself the books I have recommended and study them BEFORE you continue attempting to design your database. You need a firm understanding of logical design & flow, otherwise you will be wasting your time. That's what I'm doing.. Studying.. asking for some help to get a better understand isn't this the purpose of this mail list? The purpose of this list is whatever people are willing to make of it - within reason. I share the sentiment that your particular method of education is becoming burdensome to the people who volunteer their time on these lists to answer questions. As the alternative is to simply stop replying to your emails be grateful that someone was at least willing to tell you to try other avenues of education. David J. +1 +1 +1 +1: Lucas – David really nails it: you’ve stopped relying on yourself and are relying on the mercy and generosity of strangers to solve your problems. It takes a lot of time to read, dissect, understand and then comment on your (or anyone’s) emails, and you’ve proven adept at consuming much of the available oxygen on the list. Please a bit more considerate - you’ve received an amazing amount of solid guidance and advice. I can tell you that when I see your name as the author, I now normally delete the email – that’s NOT the response you want, right? I’ve worked with a bunch of junior developers over the years, some of whom hit a wall and just sit there, waiting for someone to fix them. Those junior developers “never” become senior developers because their peers soon lose interest in collaborating with them, if you catch my drift… Mike Sofen
Re: [GENERAL] (VERY) Slow Query - PostgreSQL 9.2
From: drum.lu...@gmail.com <mailto:drum.lu...@gmail.com> Sent: Tuesday, May 03, 2016 2:55 AM
I'm trying to get better performance from the query below... but I just don't know what else I can do... Please have a look and let me know if you can help somehow.. also.. if you need some extra data just ask me please. * Note that the gorfs.inode_segments table is 1.7TB in size. I have the following query:

explain analyze
SELECT split_part(full_path, '/', 4)::INT AS account_id,
       split_part(full_path, '/', 6)::INT AS note_id,
       split_part(full_path, '/', 9)::TEXT AS variation,
       st_size,
       segment_index,
       reverse(split_part(reverse(full_path), '/', 1)) as file_name,
       i.st_ino,
       full_path,
       (i.st_size / 100::FLOAT)::NUMERIC(5,2) || 'MB' AS size_mb
FROM gorfs.inodes i
JOIN gorfs.inode_segments s ON i.st_ino = s.st_ino_target
WHERE i.checksum_md5 IS NOT NULL
  AND s.full_path ~ '^/userfiles/account/[0-9]+/[a-z]+/[0-9]+'
  AND i.st_size > 0
  AND split_part(s.full_path, '/', 4)::INT IN (
        SELECT account.id
        FROM public.ja_clients AS account
        WHERE NOT (
              ((account.last_sub_pay > EXTRACT('epoch' FROM (transaction_timestamp() - CAST('4 Months' AS INTERVAL)))) AND (account.price_model > 0))
           OR (account.regdate > EXTRACT('epoch' FROM (transaction_timestamp() - CAST('3 Month' AS INTERVAL))))
           OR (((account.price_model = 0) AND (account.jobcredits > 0)) AND (account.last_login > EXTRACT('epoch' FROM (transaction_timestamp() - CAST('4 Month' AS INTERVAL)))))
        )
        LIMIT 100
      );

There is one obvious solution: restructure your data, since it is not in a "standard" form but you're trying to query it as if it were... you are turning your long full_path string into columns... if performance is a concern, that overhead has to be eliminated. Your two choices would be to either restructure this table directly (requiring a change in the app code that was filling it), or use it to fill a proper table that already has everything decomposed from the long full_path string, via post-processing after the insert. A third consideration would be to archive off older/unneeded rows to a history table to reduce row counts. This is about proper structure. Mike Sofen
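A sketch of that restructuring idea: decompose full_path once, at load time, into real columns and then index those. The parsed table name and its column set are invented for illustration:

CREATE TABLE gorfs.inode_segments_parsed (
    st_ino_target bigint,
    account_id    int,
    note_id       int,
    variation     text,
    file_name     text,
    full_path     text
);

INSERT INTO gorfs.inode_segments_parsed
SELECT s.st_ino_target,
       split_part(s.full_path, '/', 4)::int,
       split_part(s.full_path, '/', 6)::int,
       split_part(s.full_path, '/', 9),
       reverse(split_part(reverse(s.full_path), '/', 1)),
       s.full_path
FROM gorfs.inode_segments s
WHERE s.full_path ~ '^/userfiles/account/[0-9]+/[a-z]+/[0-9]+';

CREATE INDEX ON gorfs.inode_segments_parsed (account_id);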
Re: [GENERAL] truncate table getting blocked
From: Jayadevan M Sent: Tuesday, April 26, 2016 6:32 AM Hello, I have a python script. It opens a cursor… Thanks, Jayadevan
Re: [GENERAL] Function PostgreSQL 9.2
This is such a poorly designed, hacked together “thing” – it isn’t a database, it’s someone’s idea of how to store data when they don’t know how to store data, like they moved it from Access or Excel. Just start over and design a proper relational schema with best practices and you’ll save, oh, perhaps 10 years of wasted effort and 12 million emails. This is as close to bandaids on bandaids on steroids that it comes. Really – rethink your solution model. Mike From: drum.lu...@gmail.com <mailto:drum.lu...@gmail.com>Sent: Tuesday, April 19, 2016 7:40 PM Just forgot to say: Does increment_client_code relate to users or some other table, say clients? nope.. there is no link between them If the users.code is empty/null, then the trigger has to get the last number from client_code_increment and put on the users.code column
Re: [GENERAL] Re: Why is the comparison between timestamp and date so much slower then between two dates
|-Original Message- |From: Thomas Kellerer Sent: Wednesday, April 13, 2016 11:37 PM | |Alban Hertroys schrieb am 13.04.2016 um 16:39: |>>> So my question is: why is comparing a timestamp to a date so much slower? |> |> The reason that the other way around is so much more expensive is that |> the database needs to do that conversion twice for every row in the |> table. When down-converting now(), the DB only needs to do that once |> for all rows. | |Why does it do that for each row? The value of now() won't change while the |statement is running, so this conversion could be done once at the start of the |statement. The general rule in the SQL Server world is that using a function in a Where clause or join will eliminate usage of an index that would have been leveraged if the function didn't exist. The reason is that functions are non-deterministic, so the optimizer can't possibly tell in advance what the outcome will be and thus takes the safest route to completion. I'm betting that the same logic holds in PG (I just haven't tested it enough to be absolutely sure). In the case of now() in the Where clause, to avoid the conversion/loss of index usage, I always place (what should be a static value anyway) the output of now() into a local variable and then use that in the Where clause...and get my index back. This is just a style of coding (no functions in where clauses/joins), but one that doesn't seem prevalent in PG...instead I see people using functions within functions within functions, the cascading impact of which becomes very hard to unravel. Mike Sofen -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
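In plpgsql, the now()-into-a-variable style described above looks like this sketch; the events table, its columns, and the 30-day cutoff are invented purely for illustration:

CREATE TABLE events (event_id bigserial, created_at timestamptz);

CREATE OR REPLACE FUNCTION recent_events()
RETURNS SETOF events
LANGUAGE plpgsql STABLE AS $$
DECLARE
    v_cutoff timestamptz := now() - interval '30 days';  -- evaluated once
BEGIN
    RETURN QUERY
    SELECT *
    FROM events
    WHERE created_at >= v_cutoff;   -- plain column compared to a constant: index-friendly
END $$;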
Re: [GENERAL] I/O - Increase RAM
|From: John R Pierce Sent: Wednesday, April 13, 2016 1:53 PM | |On 4/13/2016 1:43 PM, drum.lu...@gmail.com wrote: |> At the moment I'm having 100% I/O during the day. My server has SATA |> HDs, and it can't be changed now. |> So, to solve the problem (or at least try) I was thinking about double |> the RAM, and by doing that, increasing the cache. | |depends on if its read or write IO. many of our database servers are |nearly 100% write IO, only thing that will speed that up is faster disks and/or |more disks in raid10. |-- |john r pierce, recycling bits in santa cruz Agree with John and also add that if your READ queries or data model are not optimized then you could throw a TB of ram at it and see no difference. You need to analyze your queries and find out WHY there's so much i/o. It sounds like thrash to me...so you need to do some homework and get the behaviors sorted out, then it should become obvious what needs fixing. Mike -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] vacuum - reclaiming disk space.
I have a large table with numerous indexes which has approximately doubled in size after adding a column - every row was rewritten and 50% of the tuples are dead. I'd like to reclaim this space, but VACUUM FULL cannot seem to finish within the scheduled downtime. Any suggestions for reclaiming the space without excessive downtime?
Re: [GENERAL] How to delete few elements from array beginning?
>>Alexander Farber wrote on Wednesday, March 09, 2016 4:11 AM
Hello fellow PostgreSQL users, what is please the most efficient way to delete a slice from the start of a longer array (after I have copied it to another array)? Do I really have to copy a large slice of the array to itself, like in the last line here:

pile_array := pile_array || swap_array;
/* here I copy away swap_len elements */
new_hand := pile_array[1:swap_len];
/* here I don't know how to efficiently remove already copied elements */
pile_array := pile_array[(swap_len + 1):ARRAY_LENGTH(pile_array, 1)];

or is there a better way? Thank you Alex
<<

Have you considered a normal (relational), non-array-based data model for this app (2 or 3 tables in a 1:M/M:M) instead of the single table model you've shown? That would then allow you to use normal sql set-based operations that are readable, understandable, maintainable and very fast/scalable. When I see row by row operations (looping or cursors) in what should be a real time query…that's my alarm bell that perhaps the code has wandered off a valid solution path. Mike
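For comparison, a sketch of the kind of relational shape being suggested (all names invented), where "deal the first N tiles" becomes one set-based UPDATE instead of array slicing:

CREATE TABLE pile (
    game_id  int,
    position int,                   -- order within the pile
    letter   text,
    holder   text DEFAULT 'pile',   -- 'pile', 'hand', ...
    PRIMARY KEY (game_id, position)
);

-- move the first 7 remaining tiles from the pile into the player's hand:
UPDATE pile
SET holder = 'hand'
WHERE (game_id, position) IN (
    SELECT game_id, position
    FROM pile
    WHERE game_id = 1 AND holder = 'pile'
    ORDER BY position
    LIMIT 7
);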
[GENERAL] regexp_replace to remove sql comments
Hi, I am trying to clean up the query field returned by the pg_stat_statements extension and remove all comments. Some of the queries in the query field contain comments like '-- some comment' and also '/* c style comments */'. I have managed to strip off the '--' comments and also white space, but after trying numerous regexes for this via google I am stuck.

WITH to_clean AS (
    SELECT regexp_replace(
               regexp_replace(trim(query), '--[^\r\n]*', '')  --clear up comments like this one <-- this is ok
             , '\s+', ' ', 'g') as q                          --clear up white space <-- this is ok
    FROM public.pg_stat_statements
    WHERE dbid IN (SELECT oid FROM pg_database WHERE datname = current_database())
)
SELECT regexp_replace(q,'/\*.*\*/','') as q  /* strip off comments like this */ <-- cannot get a regex to do this
FROM to_clean
ORDER BY q

I'm now thinking it may be better to do this in a plpgsql function, as I think if the comments are inside the queries then they need to be ignored. Has anyone done anything like this? Thanks, Mike.

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] regexp_replace to remove sql comments
Thanks - with a bit of moving stuff about I think that's sorted it. In case anyone ever needs it:

SELECT query,
       trim(regexp_replace(
                regexp_replace(
                    regexp_replace(query,'\/\*.+\*\/','','g'),
                    '--[^\r\n]*', ' ', 'g'),
                '\s+', ' ', 'g')) as q
FROM public.pg_stat_statements
WHERE dbid IN (SELECT oid FROM pg_database WHERE datname = current_database())
ORDER BY query

Thanks again, Mike.

On 28/10/2015 22:43, Marc Mamin wrote: ', '\/\*.+\*\/','','g'):

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Invalid memory alloc request size
Over the weekend, Amazon did some maintenance that resulted in one of our instances being restarted. Apparently this left the database in a bad state. This particular instance functions as a slony replication target, and when I went to start up slony I get the following error message. Here's some relevant info. The last bit is from the slony log file and it shows the invalid memory alloc request size error. Is there a way to recover this, or do I go ahead and restore from backup?

ubuntu@ip-10-90-23-122:~$ psql -U postgres -c "SELECT version();"
PostgreSQL 9.3.5 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-52), 64-bit
(1 row)

lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 12.04.4 LTS
Release: 12.04
Codename: precise

2015-08-24 06:50:03 UTC INFO remoteWorkerThread_1: syncing set 1 with 102 table(s) from provider 1
2015-08-24 06:50:33 UTC ERROR remoteWorkerThread_1_1: error at end of COPY IN: ERROR: invalid memory alloc request size 1970234207 CONTEXT: COPY sl_log_1, line 97033:

Mike James | Manager of Infrastructure
267.419.6400, ext 204 | mike.james@clutch.com
201 S Maple St. | Suite 250 | Ambler, PA 19002
[GENERAL] inputs into query planner costing
Hi there, I'm having an issue with query performance between 2 different pgsql environments. The first is our current production postgres server, which is running 9.3.5 on Centos 5 x64. The second system is Amazon's RDS postgres as a service. On our local DB server we have a query that executes in a reasonable amount of time (600 msec). On RDS the query will run for more than 10 minutes on a similarly CPU-specced system. I've been working through this with Amazon support and I'm looking for more suggestions on where to look (both for me and to direct Amazon). The RDS system does use a network filesystem while our production server is a local RAID10 array; I can see that affecting the actual performance of the query but not the query planner costing (unless there's an input to query planner costing that I can't find). The query plan costs generated by the 2 systems are vastly different, while the plans themselves are basically identical other than the materialization that RDS is doing (if I disable the materialization then they are almost the same other than a seq scan/heap scan on one small 2000 row table). All the tables in the query have been analyzed on each server without any impact. Current Production Explain: http://explain.depesz.com/s/Tkyc Explain Analyze http://explain.depesz.com/s/UnQt RDS: (with enable_material=off) http://explain.depesz.com/s/vDiV (with enable_material=on) http://explain.depesz.com/s/HUjx I have validated that all the query planning configuration variables on this page http://www.postgresql.org/docs/9.3/static/runtime-config-query.html are the same between the 2 environments. If I modify the local production system values for things like random_page_cost and seq_page_cost to absurd values like 6 I can get it to generate a similar planner cost. Similarly, if I lower the RDS values to absurdly low values like .0001 I can get it to generate a similarly costed plan (while still performing horridly). I've reached the end of things I can think about (I'm also working on rewriting the query, but it's a generated query out of an infrastructure component so it's not a simple change). Just looking for any ideas on additional things to look into. The query is available here: https://www.dropbox.com/s/m31ct6k0mod0576/simplifiedquery.sql?dl=0 -- Data's inconvenient when people have opinions.
Re: [GENERAL] inputs into query planner costing
Thanks for the responses. For anyone searching in the future I'll answer Tom's questions and list the boneheaded fix that it ended up actually being (really painful as I've been fighting this for a week). 1) According to Amazon they run stock postgres as far as the query planner is concerned. 2) Yes, sorry, I forgot to note that on our prod system the random_page_cost was 2 vs 4 on the RDS system. 3) I had run vacuum on all the tables in the query and the pg_relation_size on the tables isn't way out of whack. 4) Yep, default_statistics_target was the default of 100 on both. I was concentrating completely on the wrong direction here. What it turned out to be was the RDS configuration of postgres, which we had modified somewhat: I had missed configuring work_mem to something greater than their default of 1 MB. Once I brought work_mem up to the same value as our production server, lo and behold, the query runs fast. Sorry for wasting everyone's time. Hopefully this will help someone else down the line.
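(For anyone comparing environments the same way, a quick sketch - the 64MB value below is only illustrative, not a recommendation:)

SHOW work_mem;                 -- compare this on both systems

SET work_mem = '64MB';         -- raise it for the current session only
-- then re-run EXPLAIN ANALYZE on the problem query and compare the plans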
Re: [GENERAL] 9.4+ partial log-shipping possible?
No idea if this is an option for you or not, but if you make tables unlogged they won't generate WAL and therefore won't be replicated (a short example is below): http://rhaas.blogspot.ca/2010/05/global-temporary-and-unlogged-tables.html Of course unlogged tables have several drawbacks to carefully consider. On 03/26/2015 08:32 AM, Sven Geggus wrote: Hello, I have a question regarding log-shipping replication, but let me first explain what I currently do. Usually Openstreetmap Tile Servers are currently set up using a PostgreSQL/Postgis Database which stores the data needed for rendering raster map-tiles. After the initial import of a complete OSM dataset it is possible to replicate changes to this database using cyclic calls to OpenStreetMap's own tool called osm2pgsql. Unfortunately there is one major drawback with this approach: We need to somehow save the state of the OSM database to be able to apply future changes. This is currently done using a few tables in the target database. However these tables are not needed for map rendering and are consuming by far the most disk space (still somewhat expensive on SSD)! So here is my question: Would it be possible to have a setup where one master data database will act in the above manner (still running osm2pgsql) but will also provide publicly available data for log-shipping standby servers? We would need to be able to explicitly specify the tables to be replicated, or the other way round, explicitly exclude a couple of them. If this helps, it would be possible to separate them by tablespaces. Regards Sven -- Mike
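(A minimal example of the suggestion above - the table name is hypothetical, and note that unlogged tables are truncated after a crash, so only use this for data you can rebuild:)

CREATE UNLOGGED TABLE planet_osm_replication_state (
    id    bigint PRIMARY KEY,
    state text
);

-- On 9.5 and later an existing table can also be switched in place:
-- ALTER TABLE planet_osm_replication_state SET UNLOGGED;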
[GENERAL] ltree gist index errors and fill factor questions
Hello, We have an ltree column (tree_path) that has a gist index (index_nodes_on_tree_path). This is in a 9.3.5 database. Recently errors started occurring in the postgres log on some updates to this table: fixing incomplete split in index index_nodes_on_tree_path, block 2358 STATEMENT: UPDATE nodes SET parent_id = $1, contents_id = $2, contents_type = $3, trough_code = $4, live_flag = $5, ordering = $6, id = $7, tree_path = $8 WHERE nodes.id = 127695 ERROR: failed to add item to index page in index_nodes_on_tree_path Reindexing index_nodes_on_tree_path fixes the problem temporarily, but it returns. I'm wondering if this is a bug (googling the incomplete split error returns results about a similar bug with b-tree indexes) or a problem with an inadequate fill factor setting. It doesn't look like there is a specified fill factor for this index and I'm not sure what the gist default is. CREATE INDEX index_nodes_on_tree_path ON nodes USING gist (tree_path) The table in question has about 94k rows; an example of the widest tree_path tuple is 69066.69090.69091.69094 Any advice is appreciated, happy new year! Mike
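(If it does turn out to be fill factor rather than a bug, a rebuild sketch - 90 is the GiST default, and 70 here is only an example value:)

DROP INDEX index_nodes_on_tree_path;
CREATE INDEX index_nodes_on_tree_path
    ON nodes USING gist (tree_path)
    WITH (fillfactor = 70);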
Re: [GENERAL] ltree gist index errors and fill factor questions
I will do my best to provide a reproducible test case. Is there any more information I can supply in the meantime that would help?
Re: [GENERAL] ltree gist index errors and fill factor questions
The database is not crashing thankfully. We are waiting for the errors to come back to turn up logging in the hopes of creating the reproducible set.
[GENERAL] Hostnames, IDNs, Punycode and Unicode Case Folding
I'd like to store hostnames in a postgres database and I want to fully support IDNs (Internationalised Domain Names) I want to be able to recover the original representation of the hostname, so I can't just encode it with punycode and then store the ascii result. For example, these two are the same hostnames thanks to unicode case folding [1]: tesst.ëxämplé.com teßt.ëxämplé.com They both encode in punycode to the same thing: xn--tesst.xmpl.com-cib7f2a Don't believe me, then try visiting any domain with two s's in, whilst replacing the s's with ß's. E.g: ericßon.com nißan.com americanexpreß.com So if I pull out xn--tesst.xmpl.com-cib7f2a from the database, I've no idea which of those two hostnames was the original representation. The trouble is, if I store the unicode representation of a hostname instead, then when I run queries with conditions like: WHERE hostname='nißan.com' that won't pull out rows where hostname='nissan.com'. Also, if I create a unique index on the hostname field, or even on lower(hostname), that won't stop the same hostname being inserted more than once, with slightly different representations. So the system I've settled with is storing both the originally supplied representation, *and* the lower cased punycode encoded version in a separate column for indexing/search. This seems really hackish to me though. It seems to me that Postgres would benefit from a native hostname type and/or a pair of punycode encode/decode functions. And perhaps even a simple unicode case folding function. With the end result that these return TRUE: unicode_case_fold('ß') = 'ss' 'xn--tesst.xmpl.com-cib7f2a' = punycode_encode('teßt.ëxämplé.com') punycode_decode('xn--tesst.xmpl.com-cib7f2a') = 'tesst.ëxämplé.com' A native type would also be able to apply suitable constraints, e.g a maximum length of 253 octets on a punycode-encoded trailing-dot-excluded hostname, a limit of 1-63 octets on a punycode encoded label, no leading or trailing hyphens on a label, etc. I initially created a pair of punycode functions using plperl and the IDNA::Punycode Perl module, but the performance wasn't good enough, so I settled for duplicating the data, trading disk space and extra complexity for speed. I'm new to Postgres, and to this list, so if there is a better way for me to submit this suggestion or solve my problem, please point me in the right direction. [1] http://www.unicode.org/Public/UNIDATA/CaseFolding.txt Regards, -- Mike Cardwell https://grepular.com https://emailprivacytester.com OpenPGP Key 35BC AF1D 3AA2 1F84 3DC3 B0CF 70A5 F512 0018 461F XMPP OTR Key 8924 B06A 7917 AAF3 DBB1 BF1B 295C 3C78 3EF1 46B4
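(For reference, a rough sketch of the kind of plperl wrapper described above. Net::IDN::Encode is an assumption here - any module that encodes label by label would do - and plperlu, not plperl, is needed to load a CPAN module. Marking it IMMUTABLE matters if it is going to be used in an index expression.)

CREATE OR REPLACE FUNCTION punycode_encode(hostname text)
RETURNS text AS $$
    use Net::IDN::Encode ();
    # splits on dots and encodes each non-ASCII label as xn--...
    return Net::IDN::Encode::domain_to_ascii($_[0]);
$$ LANGUAGE plperlu IMMUTABLE STRICT;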
Re: [GENERAL] Hostnames, IDNs, Punycode and Unicode Case Folding
* on the Mon, Dec 29, 2014 at 04:23:22PM -0700, David G Johnston wrote: WHERE hostname='nißan.com' _IF_ Postgres had a punycode function, then you could use: WHERE punycode(hostname) = punycode('nißan.com') If the OP wraps what he is doing up into a function that is what you end up getting: a memoized punycode function. http://en.wikipedia.org/wiki/Memoization It has to be defined as volatile, but basically write the function to check for the provided input on the indexed table, and if it doesn't exist the function will calculate the punycode value and store it onto the table before returning the punycode value to the caller. I'm not sure all that is necessary. It could be quite a simple function, like the lower() function. So what I would do is this: CREATE UNIQUE INDEX hostnames_hostname_key ON hostnames (lower(punycode_encode(hostname))); That would prevent adding more than one representation for the same hostname to the column. And if I wanted to do a fast, indexed search where I could supply any representation of the hostname as input, I would just do: WHERE lower(punycode_encode(hostname)) = lower(punycode_encode('any-representation')) There doesn't need to be any extra table storage for the punycode encoded version. -- Mike Cardwell https://grepular.com https://emailprivacytester.com OpenPGP Key 35BC AF1D 3AA2 1F84 3DC3 B0CF 70A5 F512 0018 461F XMPP OTR Key 8924 B06A 7917 AAF3 DBB1 BF1B 295C 3C78 3EF1 46B4
Re: [GENERAL] Hostnames, IDNs, Punycode and Unicode Case Folding
* on the Mon, Dec 29, 2014 at 07:00:05PM -0500, Andrew Sullivan wrote: CREATE UNIQUE INDEX hostnames_hostname_key ON hostnames (lower(punycode_encode(hostname))); This wouldn't work to get the original back if you have any IDNA2003 data, because punycode-encoding the UTF-8 under IDNA2003 and then punycode-decoding it doesn't always result in the same label. See my other message. The original is the thing that is stored in the database. I wouldn't need to do any conversion to get the original back. In my example I am storing the original and creating an index on the punycode version. This is exactly the same method that we commonly use for performing case insensitive text searches using lower() indexes. -- Mike Cardwell https://grepular.com https://emailprivacytester.com OpenPGP Key 35BC AF1D 3AA2 1F84 3DC3 B0CF 70A5 F512 0018 461F XMPP OTR Key 8924 B06A 7917 AAF3 DBB1 BF1B 295C 3C78 3EF1 46B4
Re: [GENERAL] Hostnames, IDNs, Punycode and Unicode Case Folding
* on the Mon, Dec 29, 2014 at 05:01:44PM -0700, David G Johnston wrote: CREATE UNIQUE INDEX hostnames_hostname_key ON hostnames (lower(punycode_encode(hostname))); That would prevent adding more than one representation for the same hostname to the column. Except two different hostnames can resolve to the same punycode_encode(hostname) value That's exactly what I'm taking advantage of... so the unique index won't work. I think you misunderstand what I'm suggesting. I'm suggesting storing the unicode version in the database, and then creating a unique index on the result of the punycode function to prevent more than one representation of the same hostname being inserted. It was also mentioned that using the Perl encoding function was non-performant; which is why caching the data into a memoization table has value. It was non-performant because it involved unnecessarily starting a Perl interpreter. Not because the algorithm itself is slow. WHERE lower(punycode_encode(hostname)) = lower(punycode_encode('any-representation')) I'm not for knowing the rules of punycode but I'm not seeing what value lower() provides here... Case insensitive matching. So that EXAMPLE.COM = example.com -- Mike Cardwell https://grepular.com https://emailprivacytester.com OpenPGP Key 35BC AF1D 3AA2 1F84 3DC3 B0CF 70A5 F512 0018 461F XMPP OTR Key 8924 B06A 7917 AAF3 DBB1 BF1B 295C 3C78 3EF1 46B4
Re: [GENERAL] Hostnames, IDNs, Punycode and Unicode Case Folding
* on the Mon, Dec 29, 2014 at 07:22:21PM -0500, Andrew Sullivan wrote: can't just encode it with punycode and then store the ascii result. For example, these two are the same hostnames thanks to unicode case folding [1]: tesst.ëxämplé.com teßt.ëxämplé.com Well, in IDNA2003 they're the same. In IDNA2008 (RFC 5890 and suite), they're not the same. In UTS46, they're kind of the same, because pre-lookup processing maps one of them to the other (it depends which mode you're in which way the mapping goes, which is just fantastic because you can't tell at the server which mode the client is in. IDNA is an unholy mess); but the lookup is still done using the IDNA2008 rules, approximately. Heh. And I just thought I was finally starting to get to grips with this stuff. They both encode in punycode to the same thing: xn--tesst.xmpl.com-cib7f2a Under no circumstances should they encode to that. Eurgh, you're right. The library I'm using does actually do it right, I just forgot to split on the dot and encode each label separately when writing the examples for this email. Sorry for confusing matters. [snip lots of useful and interesting information] You seem to want a bunch of label constraints, not all of which are related to IDNA. I think it would be better to break these up into a small number of functions. As it happens, I have a colleague at Dyn who I think has some need of some of this too, and so it might be worth spinning up a small project to try to get generic functions: to_idna2003, to_idna2008, check_ldh, split_labels, and so on. If this seems possibly interesting for collaboration, let me know I'll try to put together the relevant people. Those functions would be very useful to me. I know a bit of C, but probably not enough to produce an acceptable patch. If there are people who would also find these functions useful, and people motivated to implement them, that would be great... -- Mike Cardwell https://grepular.com https://emailprivacytester.com OpenPGP Key35BC AF1D 3AA2 1F84 3DC3 B0CF 70A5 F512 0018 461F XMPP OTR Key 8924 B06A 7917 AAF3 DBB1 BF1B 295C 3C78 3EF1 46B4 signature.asc Description: Digital signature
Re: [GENERAL] Hostnames, IDNs, Punycode and Unicode Case Folding
* on the Mon, Dec 29, 2014 at 07:25:59PM -0500, Andrew Sullivan wrote: This is exactly the same method that we commonly use for performing case insensitive text searches using lower() indexes. Hmm. How did you get the original, then? The original, in my case, is the hostname which the end user supplied. Essentially, when I display it back to them, I want to make sure it is displayed the same way that it was when they originally submitted it. If you have the original Unicode version, why don't you switch to IDNA2008 publication rules, which are way more reliable? In that case, you do have a 1:1 lookup and you shouldn't have a problem. I was unaware of the different versions of IDNA. I basically started using the Perl module IDNA::Punycode in my project and assumed that this was the only type. Seems like I need to do some more reading. -- Mike Cardwell https://grepular.com https://emailprivacytester.com OpenPGP Key 35BC AF1D 3AA2 1F84 3DC3 B0CF 70A5 F512 0018 461F XMPP OTR Key 8924 B06A 7917 AAF3 DBB1 BF1B 295C 3C78 3EF1 46B4
[GENERAL] Problem with constraint exclusion on partitions
Hello, I'm having an issue getting the query planner to skip child tables based on my check constraints. I have constraint_exclusion = partition in my config file, and I think I have everything set up correctly. The parent table shows the child tables: cloud_test2=# \d+ engine_sessions ... Child tables: data.engine_sessions_2008, data.engine_sessions_2009, data.engine_sessions_2010, ... data.engine_sessions_201411, data.engine_sessions_201412 And the check constraints look right on the child tables: cloud_test2=# \d data.engine_sessions_2008 ... Check constraints: engine_sessions_2008_check CHECK (date_created >= '2008-01-01 00:00:00-05'::timestamp with time zone AND date_created < '2009-01-01 00:00:00-05'::timestamp with time zone) (date_created is a timestamp with time zone not null default now()) Just looking at the explain output - when I select where date_created > now() - interval '24 hours', the query planner does a sequential scan on all the child tables: cloud_test2=# explain analyze select * from engine_sessions where date_created > now() - interval '24 hours'; Append (cost=0.00..59268.32 rows=354 width=97) (actual time=250.421..255.227 rows=42 loops=1) -> Seq Scan on engine_sessions (cost=0.00..0.00 rows=1 width=96) (actual time=0.000..0.000 rows=0 loops=1) Filter: (date_created > (now() - '24:00:00'::interval)) -> Seq Scan on engine_sessions_2008 (cost=0.00..3384.94 rows=11 width=96) (actual time=12.086..12.086 rows=0 loops=1) Filter: (date_created > (now() - '24:00:00'::interval)) Rows Removed by Filter: 106568 ... -> Seq Scan on engine_sessions_201411 (cost=0.00..1607.85 rows=5 width=97) (actual time=5.586..5.586 rows=0 loops=1) Filter: (date_created > (now() - '24:00:00'::interval)) Rows Removed by Filter: 46620 -> Seq Scan on engine_sessions_201412 (cost=0.00..1378.07 rows=180 width=97) (actual time=0.006..4.810 rows=42 loops=1) Filter: (date_created > (now() - '24:00:00'::interval)) Rows Removed by Filter: 39915 Total runtime: 255.322 ms (58 rows) But when I take the output of now() - interval '24 hours': cloud_test2=# select now() - interval '24 hours'; ?column? --- 2014-12-18 21:28:47.926603-05 (1 row) And use that directly, it works fine: cloud_test2=# explain analyze select * from engine_sessions where date_created > '2014-12-18 21:28:47.926603-05'; QUERY PLAN - Append (cost=0.00..1178.34 rows=181 width=97) (actual time=0.004..3.135 rows=42 loops=1) -> Seq Scan on engine_sessions (cost=0.00..0.00 rows=1 width=96) (actual time=0.000..0.000 rows=0 loops=1) Filter: (date_created > '2014-12-18 21:28:47.926603-05'::timestamp with time zone) -> Seq Scan on engine_sessions_201412 (cost=0.00..1178.34 rows=180 width=97) (actual time=0.003..3.130 rows=42 loops=1) Filter: (date_created > '2014-12-18 21:28:47.926603-05'::timestamp with time zone) Rows Removed by Filter: 39915 Total runtime: 3.151 ms (7 rows) The types match: cloud_test2=# select pg_typeof(now() - interval '24 hours'); pg_typeof -- timestamp with time zone Is there something I'm missing? Thanks! Mike
Re: [GENERAL] Problem with constraint exclusion on partitions
Hey David, What version are you using? Sorry, I'm on 9.3.4. Now() is a volatile function so the planner cannot omit partitions. Replace that with a constant and now it can. I'm not sure what you mean- now() is a stable function: cloud_test2=# \df+ now() Schema | Name | Result data type | Argument data types | Type | Security | Volatility | Owner | Language | Source code | Description +--+--+-+--- -+--++---+--+-+- - pg_catalog | now | timestamp with time zone | | normal | invoker | stable | pgres | internal | now | current transaction time it's the timestamp at the start of the transaction- so the planner should have a set value for all rows. Am I missing something else? Mike -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
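(For what it's worth, a sketch of the usual workaround on 9.3: plan-time constraint exclusion only sees constants, and a stable expression such as now() - interval '24 hours' is not folded to a constant when the plan is built, so the planner cannot prune partitions. Computing the boundary first and planning the query with a literal, for example via EXECUTE in plpgsql, lets exclusion kick in. Table and column names below are from the thread; the DO block itself is just an illustration.)

DO $$
DECLARE
    cutoff timestamptz := now() - interval '24 hours';
    n bigint;
BEGIN
    -- the literal produced by %L is a constant at plan time, so partitions
    -- outside the range are excluded
    EXECUTE format(
        'SELECT count(*) FROM engine_sessions WHERE date_created > %L', cutoff)
    INTO n;
    RAISE NOTICE 'sessions in the last 24 hours: %', n;
END $$;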
[GENERAL] pl/pgsql trigger function - compare *most* columns in NEW vs. OLD
Is there a simple notation for comparing most columns in the new and old records in a pl/pgsql trigger function? Something like (new.b, new.c, new.d) = (old.b, old.c, old.d) works to compare all the columns except 'a', but is fragile in that it needs to be updated any time a column is added to the table, and is rather messy looking when the actual table has many columns. Is there a better approach to detecting a change in any column except a few specific ones? * mike.blackw...@rrd.com*
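(One approach, sketched here rather than taken from a reply: copy NEW and OLD into variables of the table's composite type, blank out the columns you want to ignore, then compare whole rows. Unlike listing every column, this keeps working when new columns are added. "my_table" and column "a" are placeholders.)

CREATE OR REPLACE FUNCTION notify_if_meaningful_change() RETURNS trigger AS $$
DECLARE
    new_cmp my_table;
    old_cmp my_table;
BEGIN
    new_cmp := NEW;
    old_cmp := OLD;

    -- columns to exclude from the comparison
    new_cmp.a := NULL;
    old_cmp.a := NULL;

    IF new_cmp IS DISTINCT FROM old_cmp THEN
        RAISE NOTICE 'something other than column a changed';
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- CREATE TRIGGER my_table_change AFTER UPDATE ON my_table
--     FOR EACH ROW EXECUTE PROCEDURE notify_if_meaningful_change();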
[GENERAL] Vacuum freeze
check_postgres.pl (--action=autovac_freeze) recently complained that we needed to run VACUUM FREEZE. Doing so generated a boatload of WAL files - perhaps on the order of the size of the database itself. Is VACUUM FREEZE something that is normally handled by autovac? If so, how would we approach finding what caused it not to happen automatically? Or if it's not, what's the normal approach to scheduling it manually to prevent this flood of WAL?
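(For context, a sketch of the catalog queries that show how close each database and table is to the freeze threshold the warning is based on:)

-- How close each database is to the autovacuum anti-wraparound threshold:
SELECT datname,
       age(datfrozenxid) AS xid_age,
       current_setting('autovacuum_freeze_max_age')::int AS freeze_max_age
FROM pg_database
ORDER BY age(datfrozenxid) DESC;

-- The tables holding back relfrozenxid in the current database:
SELECT relname, age(relfrozenxid)
FROM pg_class
WHERE relkind = 'r'
ORDER BY age(relfrozenxid) DESC
LIMIT 10;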
[GENERAL] Estimating WAL usage during pg_basebackup
I need to get an idea of how much WAL space will be required during a long (many hours) pg_basebackup over a relatively slow network connection. This is for a server that's not yet running PITR / streaming. Any thoughts? * mike.blackw...@rrd.com*
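(One rough way to estimate, sketched with the 9.2/9.3-era function names - on 9.6+ they are pg_current_wal_lsn and pg_wal_lsn_diff: sample the WAL insert position twice, some interval apart, diff the two values to get bytes generated per interval, then multiply by the expected duration of the backup.)

SELECT pg_current_xlog_location();    -- note the value, wait e.g. an hour...

SELECT pg_size_pretty(
           pg_xlog_location_diff(pg_current_xlog_location(), '0/3A000000'));
           -- '0/3A000000' is a placeholder for the location noted earlier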
[GENERAL] Any Postgres experts not afraid of the camera?
http://meta.stackoverflow.com/questions/270574/an-experiment-stack-overflow-tv?cb=1
[GENERAL] Regular expression question with Postgres
I'm curious why this query returns 0: SELECT 'AAA' ~ '^A{,4}$' Yet, this query returns 1: SELECT 'AAA' ~ '^A{0,4}$' Is this a bug with the regular expression engine?
Re: [GENERAL] Regular expression question with Postgres
Yea seems right. I was testing the expression on Rubular (Which uses the Ruby parser) and it worked. I guess Ruby allows this non-standard expression with the missing lower bounds. Every reference I could find, though, agrees only the upper bound is optional. On Thu, Jul 24, 2014 at 1:42 PM, David G Johnston david.g.johns...@gmail.com wrote: Mike Christensen-2 wrote I'm curious why this query returns 0: SELECT 'AAA' ~ '^A{,4}$' Yet, this query returns 1: SELECT 'AAA' ~ '^A{0,4}$' Is this a bug with the regular expression engine? Apparently since {,#} is not a valid regexp expression the engine simply interprets it as a literal and says 'AAA' != 'A{,4}' http://www.postgresql.org/docs/9.3/interactive/functions-matching.html#FUNCTIONS-POSIX-REGEXP Table 9-13. Regular Expression Quantifiers Note the all of the { } expressions have a lower bound (whether explicit or implied). David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Regular-expression-question-with-Postgres-tp5812777p5812778.html Sent from the PostgreSQL - general mailing list archive at Nabble.com. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Regular expression question with Postgres
Yeah, looks like Postgres has it right, well.. per POSIX standard anyway. JavaScript also has it right, as does Python and .NET. Ruby is just weird. On Thu, Jul 24, 2014 at 1:57 PM, Tom Lane t...@sss.pgh.pa.us wrote: Mike Christensen m...@kitchenpc.com writes: I'm curious why this query returns 0: SELECT 'AAA' ~ '^A{,4}$' Yet, this query returns 1: SELECT 'AAA' ~ '^A{0,4}$' Is this a bug with the regular expression engine? Our regex documentation lists the following variants of bounds syntax: {m} {m,} {m,n} Nothing about {,n}. I rather imagine that the engine is deciding that that's just literal text and not a bounds constraint ... regression=# SELECT 'A{,4}' ~ '^A{,4}$'; ?column? -- t (1 row) ... yup, apparently so. A look at the POSIX standard says that it has the same idea of what is a valid bounds constraint: When an ERE matching a single character or an ERE enclosed in parentheses is followed by an interval expression of the format {m}, {m,}, or {m,n}, together with that interval expression it shall match what repeated consecutive occurrences of the ERE would match. The values of m and n are decimal integers in the range 0 <= m <= n <= {RE_DUP_MAX}, where m specifies the exact or minimum number of occurrences and n specifies the maximum number of occurrences. The expression {m} matches exactly m occurrences of the preceding ERE, {m,} matches at least m occurrences, and {m,n} matches any number of occurrences between m and n, inclusive. regards, tom lane
Re: [GENERAL] WARNING: database must be vacuumed within 8439472 transactions
Sounds like you just have to wait until it finishes.. On Mon, Jul 7, 2014 at 12:56 PM, Prabhjot Sheena prabhjot.she...@rivalwatch.com wrote: Hello We are using postgresql 8.3 database for last 5 yrs for this production database and its running fine. This is our critical database which runs 24*7. This weekend we started getting these messages HINT: To avoid a database shutdown, execute a full-database VACUUM. WARNING: database must be vacuumed within 8439472 transactions i am currently running this command vacuumdb --analyze db while this command is running i m still getting these messages WARNING: database must be vacuumed within 2645303 transactions. The value of number of transactions is going down every minute Can anyone tell me what is the best way to sort up this issue. Thanks Avi
[GENERAL] EBCDIC conversion
We have a need to check certain text fields to be sure they'll convert properly to EBCDIC. A check constraint with a convert() was the initial thought, but there doesn't seem to be a default conversion from UTF8 to EBCDIC. Does anyone have an implementation they'd care to share, or suggestions on the best way to implement the conversion function? Our PostgreSQL servers currently only have the default languages installed. http://www.rrdonnelley.com/ * mike.blackw...@rrd.com*
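(One possible direction, offered here as a suggestion rather than something from the list: since there is no built-in UTF8-to-EBCDIC conversion, a plperlu helper using Perl's Encode module, which ships EBCDIC code pages such as cp1047, could back a CHECK constraint. Table and column names below are placeholders, and cp1047 may need to be swapped for whichever code page the mainframe side expects.)

CREATE OR REPLACE FUNCTION is_ebcdic_safe(t text) RETURNS boolean AS $$
    use Encode ();
    # dies inside the eval if any character has no mapping in the code page
    my $ok = eval { Encode::encode('cp1047', $_[0], Encode::FB_CROAK); 1 };
    return $ok ? 1 : 0;
$$ LANGUAGE plperlu IMMUTABLE STRICT;

ALTER TABLE print_jobs
    ADD CONSTRAINT print_jobs_ebcdic_chk CHECK (is_ebcdic_safe(description));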
Re: [GENERAL] PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
I've had the same problem as well with NHibernate (On .NET) with Postgres ENUM types. Luckily, NHibernate is incredibly powerful and you *can* get everything working flawlessly, however it takes some serious digging into the source code and reading the docs to figure it out. The main issue is that NHibernate, out of the box, wants to map an ENUM as a number. For example: INSERT INTO FOO SomeEnumColumn VALUES (1); This will cause an error, because PG is looking for a string value (Even though ENUMs are stored as numeric values under the covers). It's pretty easy to configure NHibernate to convert ENUMs to strings (there's tons of blog posts on that).. However, this causes NHibernate to write: INSERT INTO FOO SomeEnumColumn VALUES ('EnumValue'::text); Which will also cause an error. I've found the only way around it is to configure NHibernate to treat ENUMs as Objects which will simply generate: INSERT INTO FOO SomeEnumColumn VALUES ('EnumValue'); -- No casting here, yay! This works. However, to agree with the original poster's point, if Postgres could be a little more forgiving about values that could be interpreted as correct (like an implicit cast between numeric and enum and string and enum) then we wouldn't have these issues.. Mike On Tue, Jan 28, 2014 at 1:37 PM, John R Pierce pie...@hogranch.com wrote: On 1/28/2014 1:20 PM, Tom Lane wrote: I think you can fix it by explicitly casting your placeholders, eg ?::macaddr. that might work for a wrapper that lets you roll your own SQL, but I thought he said one of these autogenerated SQL, taking it out of his control. -- john r pierce 37N 122W somewhere on the middle of the left coast -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
Interesting! I'd be curious as to what types of bugs were caused by these implicit casts.. Note 8.3 was in the days back before ORMs became popular, so just write better SQL was a perfectly decent solution to the problem back then. Now days, this requirement might make Postgres incompatible with certain ORMs out there, which is a bummer. I'm wondering if these ambiguities you speak of could be solved in other ways. Such as implicitly cast iff the intention is not ambiguous, otherwise raise some sort of ambiguous error or default to some behavior. Mike On Tue, Jan 28, 2014 at 2:46 PM, John R Pierce pie...@hogranch.com wrote: On 1/28/2014 2:35 PM, Mike Christensen wrote: This works. However, to agree with the original poster's point, if Postgres could be a little more forgiving about values that could be interpreted as correct (like an implicit cast between numeric and enum and string and enum) then we wouldn't have these issues.. it had more implicit casts prior to (I think) 8.3, but there were many ambiguities where things could be interpreted to mean radically different sorts of operations, so they tightened things up in 8.3+ (or was it 8.4+ ?) -- john r pierce 37N 122W somewhere on the middle of the left coast -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
How do you create casts in Postgres? On Tue, Jan 28, 2014 at 3:24 PM, Andrew Sullivan a...@crankycanuck.cawrote: On Tue, Jan 28, 2014 at 02:55:03PM -0800, Mike Christensen wrote: I'd be curious as to what types of bugs were caused by these implicit casts.. Typically, they were cases when there was an ambiguity that the programmer didn't understand, causing applications to blow up in surprising and wonderful ways. There are things you can do if you're really prepared for the gun aimed at your feet. Since you can create casts in Postgres, you can actually add back many of the implicit casts yourself. Such as implicitly cast iff the intention is not ambiguous I think if the developers could write code that read minds, they'd be working on more profitable enterprises ;-) Best, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
Oh. The CREATE CAST command. Wow, I was totally unaware of this entire feature! On Tue, Jan 28, 2014 at 3:36 PM, Mike Christensen m...@kitchenpc.comwrote: How do you create casts in Postgres? On Tue, Jan 28, 2014 at 3:24 PM, Andrew Sullivan a...@crankycanuck.cawrote: On Tue, Jan 28, 2014 at 02:55:03PM -0800, Mike Christensen wrote: I'd be curious as to what types of bugs were caused by these implicit casts.. Typically, they were cases when there was an ambiguity that the programmer didn't understand, causing applications to blow up in surprising and wonderful ways. There are things you can do if you're really prepared for the gun aimed at your feet. Since you can create casts in Postgres, you can actually add back many of the implicit casts yourself. Such as implicitly cast iff the intention is not ambiguous I think if the developers could write code that read minds, they'd be working on more profitable enterprises ;-) Best, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
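(For anyone finding this later, a small sketch of the feature being discussed - "mood" is a hypothetical enum type, and WITH INOUT ... AS IMPLICIT needs 8.4 or later. Implicit casts are exactly what was tightened up in 8.3, so add them back sparingly.)

CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');

-- Let varchar parameters sent by a driver be accepted for the enum column:
CREATE CAST (varchar AS mood) WITH INOUT AS IMPLICIT;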
Re: [GENERAL] question about checksum in 9.3
Thanks for the responses, that clarifies the checksum feature for me. FWIW, my pgbench tests between a 9.2 server and a 9.3 server with checksums showed very similar performance characteristics and system resource utilization. I'm going to set up another load test with our actual application to see if that reveals any noticeable performance difference. Thanks again Mike On Mon, Jan 13, 2014 at 7:11 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Tue, Jan 14, 2014 at 1:50 AM, Mike Broers mbro...@gmail.com wrote: Hello, I am in the process of planning a 9.3 migration of postgres and I am curious about the checksum features available. In my test 9.3 instance it seemed like this feature provides a log entry of the exact database/oid of the corrupt object when it is accessed, but not much else. I can't find much documentation on anything else this feature provides. Few things: - The only way to know if a server is using data checksums is to use pg_controldata. - Be aware as well of the potential performance impact on your CPU: checksums are checked each time a page is read, and recalculated each time a page is updated. - ignore_checksum_failure can be used to ignore failures. Don't use that on a production system. Is there a built-in method of scanning the server to check for corruption or will I have to wait for a corrupt object to be accessed to see the log entry? You can also access tables manually, with a sequential scan for example, to check if blocks are broken or not. Is there a relation that stores the last checksum status or any way of reporting on what objects are identified by postgres as corrupt or not corrupt? No, though you could build one with a background worker that scans relation pages and registers the failing blocks. Are there any other features of the checksum I am missing besides the log entry? 9.4 has a new GUC parameter called data_checksums that allows you to check with a psql client if checksums are used on a server. Regards, -- Michael
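(For reference, the check that was mentioned, sketched as shell commands - data directory paths are placeholders, and checksums can only be chosen at initdb time:)

initdb --data-checksums -D /var/lib/pgsql/9.3/data
pg_controldata /var/lib/pgsql/9.3/data | grep -i checksum
# "Data page checksum version: 1" means checksums are enabled; 0 means they are not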
[GENERAL] question about checksum in 9.3
Hello, I am in the process of planning a 9.3 migration of postgres and I am curious about the checksum features available. In my test 9.3 instance it seemed like this feature provides a log entry of the exact database/oid of the corrupt object when it is accessed, but not much else. I can't find much documentation on anything else this feature provides. Is there a built-in method of scanning the server to check for corruption or will I have to wait for a corrupt object to be accessed to see the log entry? Is there a relation that stores the last checksum status or any way of reporting on what objects are identified by postgres as corrupt or not corrupt? Are there any other features of the checksum I am missing besides the log entry? Thanks Mike
Re: [GENERAL] Npgsql - Where can I find Npgsql.NpgsqlServices
Thanks! Got it working after messing around for a while.. I decided to check out EF for a new project I'm working on. So far, I think I like NHibernate better. A lot more control, and works flawlessly with Postgres and Npgsql. Mike On Thu, Dec 12, 2013 at 8:34 AM, Francisco Figueiredo Jr. franci...@npgsql.org wrote: Hi, Mike. We are working to get EF6 properly supported and our master branch already has support for it. Unfortunately pgfoundry is down at the moment. Please, if you can compile Npgsql from source code, please get a current snapshot from github and give it a try: https://github.com/npgsql/Npgsql and directly https://github.com/npgsql/Npgsql/archive/master.zip Just open the project on visual studio and select the correct configuration (net-45) and you will get Npgsql.dll and Npgsql.EntityFramework.dll in the EntityFramework/bin folder. I hope it helps. Let me know if you have any question. On Thu, Dec 12, 2013 at 1:24 AM, Mike Christensen m...@kitchenpc.comwrote: It seems I need NpgsqlServices to use Npgsql with EF6, however I can't figure out where you get this thing! I've tried installing it through NuGet: PM Install-Package Npgsql -pre Installing 'Npgsql 2.0.14.1'. Successfully installed 'Npgsql 2.0.14.1'. Adding 'Npgsql 2.0.14.1' to EFTest. Successfully added 'Npgsql 2.0.14.1' to EFTest. However, this doesn't have it. I've also tried installing the beta from: http://pgfoundry.org/frs/download.php/3494/Npgsql2.0.13.91-bin-ms.net4.5Ef6.zip No luck there. Any ideas? Mike -- Regards, Francisco Figueiredo Jr. Npgsql Lead Developer http://www.npgsql.org http://gplus.to/franciscojunior http://fxjr.blogspot.com http://twitter.com/franciscojunior
[GENERAL] Npgsql - Where can I find Npgsql.NpgsqlServices
It seems I need NpgsqlServices to use Npgsql with EF6, however I can't figure out where you get this thing! I've tried installing it through NuGet: PM Install-Package Npgsql -pre Installing 'Npgsql 2.0.14.1'. Successfully installed 'Npgsql 2.0.14.1'. Adding 'Npgsql 2.0.14.1' to EFTest. Successfully added 'Npgsql 2.0.14.1' to EFTest. However, this doesn't have it. I've also tried installing the beta from: http://pgfoundry.org/frs/download.php/3494/Npgsql2.0.13.91-bin-ms.net4.5Ef6.zip No luck there. Any ideas? Mike
Re: [GENERAL] Re: corruption issue after server crash - ERROR: unexpected chunk number 0
The restore of a post-crash production backup worked as hoped and the 2nd replication slave is back into its happy hot standby state. So if this problem replicated to our standby servers does that indicate that the potential problematic fsync occurred during a pg_xlog write? Would breaking replication at the time of the crash have prevented this from cascading or was it already too late at that point? Thanks again for the input, its been very helpful! Mike On Mon, Nov 25, 2013 at 12:20 PM, Mike Broers mbro...@gmail.com wrote: Thanks Shaun, Im planning to schedule a time to do the vacuum freeze suggested previously. So far the extent of the problem seems limited to the one session table and the one session row that was being used by a heavy bot scan at the time of the crash. Currently Im testing a recovery of a production backup from today to rebase one of the replication targets that I was using to test fixes last week. Hopefully that validates the current backups and I can proceed inquiring with our managed services provider about the false notification of the disk write and ways to prevent that going forward. I'll update the list if I uncover anything interesting in the process and/or need more advice, thanks again for your input - its much appreciated as always. Nothing like a little crash corruption to get the blood flowing! Mike On Mon, Nov 25, 2013 at 10:29 AM, Shaun Thomas stho...@optionshouse.comwrote: Update - I have two hot replication slaves of this db, both have the problem. I took one out of recovery and ran REINDEX table session_session and it fixed the errors about this row. Now Im going to run vacuum and see if there are other tables that complain, but Im guessing if so I will need to see if there is a way to force vacuum to continue on error, worst case I might have to script a table by table vacuum script I guess.. If anyone has a better suggestion for determining the extent of the damage Id appreciate it. Oh man. I'm sorry, Mike. One of the cardinal rules I have is to disconnect any replication following a database crash. It's just too easy for damaged replicated rows to be propagated unless you're on 9.3 and have checksums enabled. If you want to perform a table-by-table check, don't vacuum the database, but the individual tables. I'd go with a DO loop and have it raise notices into the log so you can investigate further: COPY ( SELECT 'VACUUM ' || oid::regclass::text || ';' FROM pg_class WHERE relkind = 'r' ) to '/tmp/vac_all.sql'; Run the /tmp/vac_all.sql through psql and pipe the contents into a log file. Any table that doesn't vacuum successfully will need to be repaired manually. One way you can do this if there are dupes, is by checking the ctid value after disabling index scans: SET enable_indexscan TO False; SELECT ctid, * FROM [broken_table] WHERE ...; Just construct the WHERE clause based on the error output, and you should get all rows if there are dupes. You'll need to figure out which row to keep, then delete the bad row based on the ctid. Do this as many times as it takes, then reindex to make sure the proper row versions are indexed. It's also a good idea to dump any table that came back with an error, just in case. After you've done all of that, you should re-base your replicas once you've determined your production system is usable. In the meantime, I highly recommend you set up a VIP you can assign to one of your replicas if your production system dies again, and remove any autostart code. 
If your production system crashes, switch the VIP immediately to a replica, and invalidate your old production system. Data corruption is insidious when streaming replication is involved. Look into tools like repmgr to handle managing your replicas as a cluster to make forced invalidation and re-basing easier. Good luck! -- Shaun Thomas OptionsHouse | 141 W. Jackson Blvd | Suite 500 | Chicago IL, 60604 312-676-8870 stho...@optionshouse.com __ See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email
Re: [GENERAL] Re: corruption issue after server crash - ERROR: unexpected chunk number 0
Thanks Shaun, Im planning to schedule a time to do the vacuum freeze suggested previously. So far the extent of the problem seems limited to the one session table and the one session row that was being used by a heavy bot scan at the time of the crash. Currently Im testing a recovery of a production backup from today to rebase one of the replication targets that I was using to test fixes last week. Hopefully that validates the current backups and I can proceed inquiring with our managed services provider about the false notification of the disk write and ways to prevent that going forward. I'll update the list if I uncover anything interesting in the process and/or need more advice, thanks again for your input - its much appreciated as always. Nothing like a little crash corruption to get the blood flowing! Mike On Mon, Nov 25, 2013 at 10:29 AM, Shaun Thomas stho...@optionshouse.comwrote: Update - I have two hot replication slaves of this db, both have the problem. I took one out of recovery and ran REINDEX table session_session and it fixed the errors about this row. Now Im going to run vacuum and see if there are other tables that complain, but Im guessing if so I will need to see if there is a way to force vacuum to continue on error, worst case I might have to script a table by table vacuum script I guess.. If anyone has a better suggestion for determining the extent of the damage Id appreciate it. Oh man. I'm sorry, Mike. One of the cardinal rules I have is to disconnect any replication following a database crash. It's just too easy for damaged replicated rows to be propagated unless you're on 9.3 and have checksums enabled. If you want to perform a table-by-table check, don't vacuum the database, but the individual tables. I'd go with a DO loop and have it raise notices into the log so you can investigate further: COPY ( SELECT 'VACUUM ' || oid::regclass::text || ';' FROM pg_class WHERE relkind = 'r' ) to '/tmp/vac_all.sql'; Run the /tmp/vac_all.sql through psql and pipe the contents into a log file. Any table that doesn't vacuum successfully will need to be repaired manually. One way you can do this if there are dupes, is by checking the ctid value after disabling index scans: SET enable_indexscan TO False; SELECT ctid, * FROM [broken_table] WHERE ...; Just construct the WHERE clause based on the error output, and you should get all rows if there are dupes. You'll need to figure out which row to keep, then delete the bad row based on the ctid. Do this as many times as it takes, then reindex to make sure the proper row versions are indexed. It's also a good idea to dump any table that came back with an error, just in case. After you've done all of that, you should re-base your replicas once you've determined your production system is usable. In the meantime, I highly recommend you set up a VIP you can assign to one of your replicas if your production system dies again, and remove any autostart code. If your production system crashes, switch the VIP immediately to a replica, and invalidate your old production system. Data corruption is insidious when streaming replication is involved. Look into tools like repmgr to handle managing your replicas as a cluster to make forced invalidation and re-basing easier. Good luck! -- Shaun Thomas OptionsHouse | 141 W. Jackson Blvd | Suite 500 | Chicago IL, 60604 312-676-8870 stho...@optionshouse.com __ See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email
[GENERAL] Browsing postgres database using the Eclipse QuantumDB plugin
Has anyone successfully connected and browsed a postgres database using the Eclipse QuantumDB plugin? I can connect and execute sql, but the existing table list is always empty as if no meta information is ever provided to the browser plugin. At first, I thought it might be a permission problem with the database user I provided, so I also tried it with the postgres user, but had the same results. I've always had QuantumDB work with Oracle, hsqldb, h2database, and other platforms. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Browsing postgres database using the Eclipse QuantumDB plugin
I wanted to make sure that it wasn't a permission configuration problem in postgres first, since all of the other databases have worked without a similar issue. On Fri, Nov 22, 2013 at 9:54 AM, Adrian Klaver adrian.kla...@gmail.com wrote: On 11/22/2013 05:46 AM, Mike Kienenberger wrote: Has anyone successfully connected and browsed a postgres database using the Eclipse QuantumDB plugin? You might get a answer sooner here: http://sourceforge.net/p/quantum/discussion/24178/ I can connect and execute sql, but the existing table list is always empty as if no meta information is ever provided to the browser plugin. At first, I thought it might be a permission problem with the database user I provided, so I also tried it with the postgres user, but had the same results. I've always had QuantumDB work with Oracle, hsqldb, h2database, and other platforms. -- Adrian Klaver adrian.kla...@gmail.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Browsing postgres database using the Eclipse QuantumDB plugin
Adrian, Thanks! Searching through those forum posts (I had tried doing general google searches first before posting here, but didn't turn up that forum) I found this: http://sourceforge.net/p/quantum/discussion/24178/thread/17724b28/?limit=25#1166 which seems to indicate that at least back in 2007, there was an issue with the quantum support for postgres. Perhaps that's still the case with the version I am using. On Fri, Nov 22, 2013 at 10:28 AM, Mike Kienenberger mkien...@gmail.com wrote: I wanted to make sure that it wasn't a permission configuration problem in postgres first, since all of the other databases have worked without a similar issue. On Fri, Nov 22, 2013 at 9:54 AM, Adrian Klaver adrian.kla...@gmail.com wrote: On 11/22/2013 05:46 AM, Mike Kienenberger wrote: Has anyone successfully connected and browsed a postgres database using the Eclipse QuantumDB plugin? You might get a answer sooner here: http://sourceforge.net/p/quantum/discussion/24178/ I can connect and execute sql, but the existing table list is always empty as if no meta information is ever provided to the browser plugin. At first, I thought it might be a permission problem with the database user I provided, so I also tried it with the postgres user, but had the same results. I've always had QuantumDB work with Oracle, hsqldb, h2database, and other platforms. -- Adrian Klaver adrian.kla...@gmail.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] corruption issue after server crash - ERROR: unexpected chunk number 0
vacuumdb -avz, pg_dumpall, and vacuum freeze analyze on the former standby database that received the corruption via replication all came back without errors. Is the vacuum freeze intended to potentially fix problems or just reveal if other tables may have corruption? I'm trying to decide if this needs to be run in production. On Thu, Nov 21, 2013 at 5:09 PM, Kevin Grittner kgri...@ymail.com wrote: Mike Broers mbro...@gmail.com wrote: Is there anything I should look out for with vacuum freeze? Just check the logs and the vacuum output for errors and warnings. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [GENERAL] corruption issue after server crash - ERROR: unexpected chunk number 0
Thanks for the response. fsync and full_page_writes are both on. Our database runs on a managed hosting provider's vmhost server/san, I can possibly request for them to provide some hardware test results - do you have any specifics diagnostics in mind? The crash was apparently due to our vmhost suddenly losing power, the only row that it has complained with the chunk error also migrated into both standby servers, and as previously stated was fixed with a reindex of the parent table in one of the standby servers after taking it out of recovery. The vacuumdb -avz on this test copy didnt have any errors or warnings, im going to also run a pg_dumpall on this host to see if any other rows are problematic. Is there something else I can run to confirm we are more or less ok at the database level after the pg_dumpall or is there no way to be sure and a fresh initdb is required. I am planning on running the reindex in actual production tonight during our maintenance window, but was hoping if that worked we would be out of the woods. On Thu, Nov 21, 2013 at 3:56 PM, Kevin Grittner kgri...@ymail.com wrote: Mike Broers mbro...@gmail.com wrote: Hello we are running postgres 9.2.5 on RHEL6, our production server crashed hard and when it came back up our logs were flooded with: ERROR: unexpected chunk number 0 (expected 1) for toast value 117927127 in pg_toast_19122 Your database is corrupted. Unless you were running with fsync = off or full_page_writes = off, that should not happen. It is likely to be caused by a hardware problem (bad RAM, a bad disk drive, or network problems if your storage is across a network). If it were me, I would stop the database service and copy the full data directory tree. http://wiki.postgresql.org/wiki/Corruption If fsync or full_page_writes were off, your best bet is probably to go to your backup. If you don't go to a backup, you should try to get to a point where you can run pg_dump, and dump and load to a freshly initdb'd cluster. If fsync and full_page_writes were both on, you should run hardware diagnostics at your earliest opportunity. When hardware starts to fail, the first episode is rarely the last or the most severe. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [GENERAL] corruption issue after server crash - ERROR: unexpected chunk number 0
Thanks, after this pg_dumpall I am going to see what kind of impact I can expect from running VACUUM FREEZE ANALYZE (normally I just run vacuumdb -avz nightly via a cron job) and schedule time to run this in production against all the tables in the database. Is there anything I should look out for with vacuum freeze? Much appreciated, Mike On Thu, Nov 21, 2013 at 4:51 PM, Kevin Grittner kgri...@ymail.com wrote: Mike Broers mbro...@gmail.com wrote: Thanks for the response. fsync and full_page_writes are both on. [ corruption appeared following power loss on the machine hosing the VM running PostgreSQL ] That leaves three possibilities: (1) fsync doesn't actually guarantee persistence in your stack. (2) There is a hardware problem which has not been recognized. (3) There is a so-far unrecognized bug in PostgreSQL. Based on my personal experience, those are listed in descending order of probability. I seem to recall reports of some VM for which an fsync did not force data all the way to persistent storage, but I don't recall which one. You might want to talk to your service provider about what guarantees they make in this regard. Is there something else I can run to confirm we are more or less ok at the database level after the pg_dumpall or is there no way to be sure and a fresh initdb is required. Given that you had persistence options in their default state of on, and the corruption appeared after a power failure in a VM environment, I would guess that the damage is probably limited. That said, damage from this sort of event can remain hidden and cause data loss later. Unfortunately we do not yet have a consistency checker that can root out such problems. If you can arrange a maintenance window to dump and load to a fresh initdb, that would eliminate the possibility that some hidden corruption is lurking. If that is not possible, running VACUUM FREEZE ANALYZE will reduce the number of things that can go wrong, without requiring down time. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[GENERAL] Re: corruption issue after server crash - ERROR: unexpected chunk number 0
Update - I have two hot replication slaves of this db, both have the problem. I took one out of recovery and ran REINDEX table session_session and it fixed the errors about this row. Now Im going to run vacuum and see if there are other tables that complain, but Im guessing if so I will need to see if there is a way to force vacuum to continue on error, worst case I might have to script a table by table vacuum script I guess.. If anyone has a better suggestion for determining the extent of the damage Id appreciate it. On Thu, Nov 21, 2013 at 2:10 PM, Mike Broers mbro...@gmail.com wrote: Hello we are running postgres 9.2.5 on RHEL6, our production server crashed hard and when it came back up our logs were flooded with: STATEMENT: SELECT session_session.session_key, session_session.session_data, session_session.expire_date, session_session.nonce FROM session_session WHERE (session_session.session_key = 'gk9aap5d7btp6tzquh0kf73gpfmik5w3' AND session_session.expire_date '2013-11-21 13:27:33.107913' ) ERROR: unexpected chunk number 0 (expected 1) for toast value 117927127 in pg_toast_19122 We restarted the application and whatever session was constantly hitting that row stopped, but Im concerned about remediation. When I attempt to read from that row the error occurs. select * from session_session where session_key = 'gk9aap5d7btp6tzquh0kf73gpfmik5w3'; ERROR: unexpected chunk number 0 (expected 1) for toast value 117927127 in pg_toast_19122 When I attempt to delete this row I get this error: delete from session_session where session_key = 'gk9aap5d7btp6tzquh0kf73gpfmik5w3'; ERROR: tuple concurrently updated We happen to have a maintenance window tonight so I will have some time when the app is down to run some database fixes. I saw other threads suggesting a reindex of the toast table, but this is a 14GB table and I'm not sure how long that will take or if it will even be successful. We also have a full db vacuum/analyze scheduled nightly for 2am so I am expecting to learn if there are other impacted tables, but its troubling if I dont know what the remediation. This particular table could be truncated if necessary if that is an option but Im not sure about other tables. Any suggestions for how to handle the tuple concurrently updated error? Or if a reindex is likely to help with the unexpected chunk error? Thanks Mike
[GENERAL] corruption issue after server crash - ERROR: unexpected chunk number 0
Hello we are running postgres 9.2.5 on RHEL6, our production server crashed hard and when it came back up our logs were flooded with: STATEMENT: SELECT session_session.session_key, session_session.session_data, session_session.expire_date, session_session.nonce FROM session_session WHERE (session_session.session_key = 'gk9aap5d7btp6tzquh0kf73gpfmik5w3' AND session_session.expire_date '2013-11-21 13:27:33.107913' ) ERROR: unexpected chunk number 0 (expected 1) for toast value 117927127 in pg_toast_19122 We restarted the application and whatever session was constantly hitting that row stopped, but Im concerned about remediation. When I attempt to read from that row the error occurs. select * from session_session where session_key = 'gk9aap5d7btp6tzquh0kf73gpfmik5w3'; ERROR: unexpected chunk number 0 (expected 1) for toast value 117927127 in pg_toast_19122 When I attempt to delete this row I get this error: delete from session_session where session_key = 'gk9aap5d7btp6tzquh0kf73gpfmik5w3'; ERROR: tuple concurrently updated We happen to have a maintenance window tonight so I will have some time when the app is down to run some database fixes. I saw other threads suggesting a reindex of the toast table, but this is a 14GB table and I'm not sure how long that will take or if it will even be successful. We also have a full db vacuum/analyze scheduled nightly for 2am so I am expecting to learn if there are other impacted tables, but its troubling if I dont know what the remediation. This particular table could be truncated if necessary if that is an option but Im not sure about other tables. Any suggestions for how to handle the tuple concurrently updated error? Or if a reindex is likely to help with the unexpected chunk error? Thanks Mike
Re: [GENERAL] Call for design: PostgreSQL mugs
Oooh, can we make the handle an elephant trunk? (Ok, now I'm sure I'm adding all sorts of expense - but hey, you'll save so much money using Postgres you can afford an expensive coffee mug!) On Thu, Sep 12, 2013 at 5:30 AM, Andreas 'ads' Scherbaum adsm...@wars-nicht.de wrote: On 09/10/2013 10:15 PM, Kevin Grittner wrote: patrick keshishian pkesh...@gmail.com wrote: One more cute idea that came to me last night. Here is a very poor attempt at it by yours truly; keep in mind I'm not a graphics artist. This image is for illustration purposes only! http://sidster.com/gallery/2013/09/10/elephant_paw.sml.jpg Can you picture a bunch of these on a meeting table? If that image does not scream Stampede!, I don't know what does. =) Again, a great conversation starter. I think I like this idea best so far! Some supporting documentation for a final version: http://www.asknature.org/strategy/29c12a353dab52ad8d4eb5d4337cefb9 I agree, that's a good idea. Instead of the original PG logo near the top, the mug can show a real elephant face along with the feet on both sides. And then have PostgreSQL written somewhere - maybe turned 90° near the handle, on both sides? How does this sound? And next question: who can design this? ;-) -- Andreas 'ads' Scherbaum German PostgreSQL User Group European PostgreSQL User Group - Board of Directors Volunteer Regional Contact, Germany - PostgreSQL Project
[GENERAL] EF / npgsql and VIEWs
I understand from one of our developers that there may be issues using VIEWs with Entity Framework and npgsql. Can anyone with experience using PostgreSQL in a .NET environment comment? Mike Blackwell | Technical Analyst, Distribution Services/Rollout Management | RR Donnelley | 1750 Wallace Ave | St Charles, IL 60174-3401 | Office: 630.313.7818 | mike.blackw...@rrd.com | http://www.rrdonnelley.com/
Re: [GENERAL] function with unknown params
You passed in: 22/1/2013, which is 22 divided by 1, divided by 2013 - and that evaluates to an integer, not a date. On Tue, Jul 9, 2013 at 10:17 AM, giozh gi...@yahoo.it wrote: ok, it works. But why did the error message report those two unknown data types? If it was an error on the date type, why didn't it signal that?
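A quick illustration of the difference; my_function here is a hypothetical stand-in for the function from the original post:

-- Unquoted, 22/1/2013 is integer arithmetic and truncates to 0:
SELECT 22/1/2013;                          -- returns 0, type integer

-- Quote the value (and optionally cast it) so it is read as a date:
SELECT DATE '2013-01-22';
SELECT my_function('2013-01-22'::date);    -- hypothetical call with a proper date argument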
[GENERAL] PERFORM statement
I was reading about Postgres stored procs in the FAQ: https://wiki.postgresql.org/wiki/FAQ#Does_PostgreSQL_have_stored_procedures.3F

It claims that an alternative syntax to:

SELECT theNameOfTheFunction(arg1, arg2);

Is:

PERFORM theNameOfTheFunction(arg1, arg2);

However, when I try the following:

CREATE TABLE app_for_leave (
  sno integer NOT NULL,
  eid integer,
  ename varchar(20),
  sd date,
  ed date,
  sid integer,
  status boolean DEFAULT false,
  CONSTRAINT pk_snoa PRIMARY KEY (sno)
);

CREATE FUNCTION MyInsert(_sno integer, _eid integer, _sd date, _ed date, _sid integer, _status boolean)
RETURNS void AS
$BODY$
BEGIN
  INSERT INTO app_for_leave(sno, eid, sd, ed, sid, status)
  VALUES(_sno, _eid, _sd, _ed, _sid, _status);
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE COST 100;

PERFORM MyInsert(1,101,'2013-04-04','2013-04-04',2,'f' );

I get the error:

ERROR: syntax error at or near PERFORM
SQL state: 42601
Character: 1

Is the FAQ out of date or was this feature removed? I'm using 9.2.1. Thanks! Mike
Re: [GENERAL] PERFORM statement
Ah ok, that makes sense. The FAQ wasn't exactly clear about that. On Mon, Jul 8, 2013 at 9:38 PM, Tony Theodore tony.theod...@gmail.com wrote: On 09/07/2013, at 2:20 PM, Mike Christensen m...@kitchenpc.com wrote: PERFORM MyInsert(1,101,'2013-04-04','2013-04-04',2,'f' ); I get the error: ERROR: syntax error at or near PERFORM SQL state: 42601 Character: 1 Is the FAQ out of date or was this feature removed? I'm using 9.2.1. Thanks! I believe PERFORM is a PL/pgSQL construct, not an SQL one. You'd need to execute it inside a function. Cheers, Tony
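A minimal sketch of the two contexts, reusing MyInsert from the question: from plain SQL you call the function with SELECT, while PERFORM only exists inside PL/pgSQL (for example in a DO block or in the body of another function):

-- From psql or application SQL:
SELECT MyInsert(1, 101, '2013-04-04', '2013-04-04', 2, 'f');

-- From PL/pgSQL, where PERFORM runs the expression and discards the result:
DO $$
BEGIN
  PERFORM MyInsert(1, 101, '2013-04-04', '2013-04-04', 2, 'f');
END
$$;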
Re: [GENERAL] View's plan not taking advantage of WHERE?
It appears that the culprit is a cached query plan: the tables in the UNION have changed and no longer match, yet the View does not throw an "each UNION query must have the same number of columns" error. Is there a way to force the View's query plan to be updated on each access?
Re: [GENERAL] View's plan not taking advantage of WHERE?
From what I'm reading, the View is frozen when it's created, including its plan, and the usual solution is to use a set-returning function... is this not true? I've double-checked all schemas and the view is only defined once. Thanks.
Re: [GENERAL] View's plan not taking advantage of WHERE?
Thanks Scott, interesting. Other than the tests in the original post, do you have any suggestions? Thanks for your time.
Re: [GENERAL] View's plan not taking advantage of WHERE?
Thanks Tom, I've found other discussion of this - that aggregates foul up the planner with views. GROUP BY and DISTINCT don't work: we're trying to grab a subset of records and backfill any nulls to present a complete, single record... we're stuck with a view since this is used by a Rails app. We'll see what else we can come up with. Thanks again. On Wed, Jun 5, 2013 at 9:16 AM, Tom Lane t...@sss.pgh.pa.us wrote: Mike Summers msummer...@gmail.com writes: Other than the tests in the original post do you have any suggestions? If you're speaking of http://www.postgresql.org/message-id/CAJGeMG89QbDxMab7-aPD_yXVsGx7Q=auxym9ufvaq06crz4...@mail.gmail.com that has nothing to do with cached plans, obsolete or otherwise. You seem to be wishing that the planner would deduce x = constant from aggregate_function(x) = constant, which is wrong on its face. AFAICS it's not even correct for the special case that the aggregate function is first_not_null(), since you have multiple occurrences of that in the view and there's no certainty that they all choose to return values from the same row. Even if the optimization is valid given some additional assumptions that you've not told us, it's going to be a sufficiently narrow case that I doubt we'd ever be willing to expend planner cycles on checking for it. If you want WHERE clauses to be pushed down into this query you need to think of some other way to define the query. Perhaps something involving GROUP BY or DISTINCT instead of first_not_null() would be more amenable to optimization. regards, tom lane
[GENERAL] View's plan not taking advantage of WHERE?
I have a VIEW that does not appear to take advantage of the WHERE when given the opportunity:

db=# explain select * from best_for_sale_layouts;
                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Aggregate  (cost=1977.21..1977.22 rows=1 width=118)
   ->  Sort  (cost=1938.18..1940.05 rows=748 width=122)
         Sort Key: sources.for_sale_layout_rank
         ->  Hash Join  (cost=1.04..1902.48 rows=748 width=122)
               Hash Cond: (for_sale_layouts.source_id = sources.id)
               ->  Append  (cost=0.00..1613.60 rows=74760 width=118)
                     ->  Seq Scan on for_sale_layouts  (cost=0.00..806.74 rows=37374 width=118)
                     ->  Seq Scan on assessor_records  (cost=0.00..806.86 rows=37386 width=118)
               ->  Hash  (cost=1.02..1.02 rows=2 width=8)
                     ->  Seq Scan on sources  (cost=0.00..1.02 rows=2 width=8)
(10 rows)

db=# explain analyze select * from best_for_sale_layouts where address_id = 2871034;
                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Aggregate  (cost=1979.33..1979.34 rows=1 width=118) (actual time=93569.509..93569.510 rows=1 loops=1)
   Filter: (first_not_null(for_sale_layouts.address_id) = 2871034)
   ->  Sort  (cost=1938.18..1940.05 rows=748 width=122) (actual time=320.652..464.523 rows=74748 loops=1)
         Sort Key: sources.for_sale_layout_rank
         Sort Method: external sort  Disk: 5840kB
         ->  Hash Join  (cost=1.04..1902.48 rows=748 width=122) (actual time=0.057..198.500 rows=74748 loops=1)
               Hash Cond: (for_sale_layouts.source_id = sources.id)
               ->  Append  (cost=0.00..1613.60 rows=74760 width=118) (actual time=0.022..94.871 rows=74748 loops=1)
                     ->  Seq Scan on for_sale_layouts  (cost=0.00..806.74 rows=37374 width=118) (actual time=0.021..22.361 rows=37374 loops=1)
                     ->  Seq Scan on assessor_records  (cost=0.00..806.86 rows=37386 width=118) (actual time=0.011..23.383 rows=37374 loops=1)
               ->  Hash  (cost=1.02..1.02 rows=2 width=8) (actual time=0.015..0.015 rows=2 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 1kB
                     ->  Seq Scan on sources  (cost=0.00..1.02 rows=2 width=8) (actual time=0.007..0.010 rows=2 loops=1)
 Total runtime: 93573.390 ms
(14 rows)

If I run the View's select with the WHERE in psql I get what I expect (first_not_null is an aggregate function):

db=# explain analyze SELECT first_not_null(a.id) as id, first_not_null(a.address_id) as address_id, first_not_null(a.created_at) as created_at, first_not_null(a.updated_at) as updated_at FROM (SELECT b.*, for_sale_layout_rank FROM ((SELECT * FROM for_sale_layouts UNION ALL SELECT * FROM assessor_records) AS b INNER JOIN sources ON b.source_id = sources.id) ORDER BY for_sale_layout_rank) AS a where address_id = 2871034;
                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Aggregate  (cost=21.23..21.24 rows=1 width=118) (actual time=0.571..0.571 rows=1 loops=1)
   ->  Sort  (cost=17.64..17.64 rows=2 width=122) (actual time=0.272..0.274 rows=2 loops=1)
         Sort Key: sources.for_sale_layout_rank
         Sort Method: quicksort  Memory: 25kB
         ->  Nested Loop  (cost=0.00..17.63 rows=2 width=122) (actual time=0.199..0.253 rows=2 loops=1)
               Join Filter: (for_sale_layouts.source_id = sources.id)
               Rows Removed by Join Filter: 2
               ->  Append  (cost=0.00..16.54 rows=2 width=118) (actual time=0.140..0.185 rows=2 loops=1)
                     ->  Index Scan using index_for_sale_layouts_on_address_id on for_sale_layouts  (cost=0.00..8.27 rows=1 width=118) (actual time=0.139..0.142 rows=1 loops=1)
                           Index Cond: (address_id = 2871034)
                     ->  Index Scan using index_assessor_layouts_on_address_id on assessor_records  (cost=0.00..8.27 rows=1 width=118) (actual time=0.038..0.039 rows=1 loops=1)
                           Index Cond: (address_id = 2871034)
               ->  Materialize  (cost=0.00..1.03 rows=2 width=8) (actual time=0.022..0.025 rows=2 loops=2)
                     ->  Seq Scan on sources  (cost=0.00..1.02 rows=2 width=8) (actual time=0.020..0.023 rows=2 loops=1)
 Total runtime: 0.802 ms
(15 rows)

Is there anything I can do to get the View to update its plan? Improvements are welcome, although for other reasons (Rails' ActiveRecord) the View is a must. Thanks in advance.
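For completeness, a minimal sketch of the GROUP BY/DISTINCT-style rewrite Tom suggests upthread, assuming one row per address_id taken from the highest-ranked source is acceptable. It does not reproduce first_not_null()'s column-by-column backfill across sources, so treat it as a starting point only; column names are taken from the EXPLAIN output above:

CREATE OR REPLACE VIEW best_for_sale_layouts AS
SELECT DISTINCT ON (b.address_id)
       b.id, b.address_id, b.created_at, b.updated_at
FROM (SELECT * FROM for_sale_layouts
      UNION ALL
      SELECT * FROM assessor_records) AS b
JOIN sources ON b.source_id = sources.id
ORDER BY b.address_id, sources.for_sale_layout_rank;

-- Because address_id is the DISTINCT ON column, a WHERE address_id = ... applied to this
-- view should be pushed down to the index scans on for_sale_layouts and assessor_records.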
Re: [GENERAL] Success stories of PostgreSQL implementations in different companies
On Thu, May 23, 2013 at 2:51 PM, Steve Crawford scrawf...@pinpointresearch.com wrote: On 05/23/2013 02:36 PM, Oscar Calderon wrote: Hi, this question isn't technical, but it is very important for me to know. Currently, here in El Salvador our company provides PostgreSQL support, but Oracle and SQL Server are more popular here. Even so, some clients are being encouraged to switch to PostgreSQL to lower their technology costs, but very often they ask if there are success stories of PostgreSQL implementations in companies in our region or around the world - if possible with some information like the number of concurrent users, hardware specs, or storage size. I think in my country it is more common to hear success stories like that about other databases like Oracle because they are more widespread here, but I would like to know if there's a place where I can find, or if you can share with me, real experiences or success stories of successful PostgreSQL implementations in companies, to talk about with people when they ask that kind of thing. Start with the web-site - especially: http://www.postgresql.org/about/users/ http://www.postgresql.org/about/quotesarchive/ I don't know about name-recognition in El Salvador but Etsy, Wisconsin Courts, Skype, Affilias, FlightAware, NTT are quite recognizable here. And don't forget about everyone's favorite recipe search engine, www.kitchenpc.com - powered by Postgres 9.1. /ShamelessPlug
Re: [GENERAL] dblink does not resolve DNS, but works with IP
Ah, gotcha! I guess whatever sample I was originally copying from used hostaddr for some reason. Thanks for the clarification, Tom! On Wed, May 15, 2013 at 6:08 AM, Tom Lane t...@sss.pgh.pa.us wrote: Mike Christensen m...@kitchenpc.com writes: Though I'm a bit curious why there's a host and hostaddr. Why can't it just resolve whatever you give it? Well, it will ... if you use the host parameter. The whole point of hostaddr is that for that parameter, it will not try a DNS lookup. You'd only use that if you had issues with the speed or reliability of your DNS service. regards, tom lane
[GENERAL] dblink does not resolve DNS, but works with IP
If I have this: CREATE OR REPLACE VIEW Link.Foo AS select * from dblink( 'hostaddr=123.123.123.123 dbname=KitchenPC user=Website password=secret', 'select * from Foo') as ... Then it works. However, if I do: CREATE OR REPLACE VIEW Link.Foo AS select * from dblink( 'hostaddr=db.domain.com dbname=KitchenPC user=Website password=secret', 'select * from Foo') as ... Then I get: ERROR: could not establish connection DETAIL: could not translate host name db.domain.com to address: Unknown host However, from a command prompt I can ping db.domain.com and get 123.123.123.123. Does dblink just not support DNS resolution? I really don't want to hard code IP addresses in my scripts. Thanks!
Re: [GENERAL] dblink does not resolve DNS, but works with IP
Excellent! Thanks so much. On Tue, May 14, 2013 at 9:25 PM, Adrian Klaver adrian.kla...@gmail.com wrote: On 05/14/2013 09:17 PM, Mike Christensen wrote: If I have this: CREATE OR REPLACE VIEW Link.Foo AS select * from dblink( 'hostaddr=123.123.123.123 dbname=KitchenPC user=Website password=secret', 'select * from Foo') as ... Then it works. However, if I do: CREATE OR REPLACE VIEW Link.Foo AS select * from dblink( 'hostaddr=db.domain.com dbname=KitchenPC user=Website password=secret', 'select * from Foo') as ... Then I get: ERROR: could not establish connection DETAIL: could not translate host name db.domain.com to address: Unknown host However, from a command prompt I can ping db.domain.com and get 123.123.123.123. Does dblink just not support DNS resolution? I really don't want to hard code IP addresses in my scripts. Thanks! See below for explanation of hostaddr and host. Short version, you are looking for host: http://www.postgresql.org/docs/9.2/interactive/libpq-connect.html#LIBPQ-PARAMKEYWORDS -- Adrian Klaver adrian.kla...@gmail.com
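A minimal sketch of the fix pointed to above: use host= so libpq does a normal name lookup, rather than hostaddr=, which expects a numeric IP. The column definition list here (id integer) is a hypothetical placeholder and should match Foo's real columns, as in the original view:

CREATE OR REPLACE VIEW Link.Foo AS
SELECT *
FROM dblink(
    'host=db.domain.com dbname=KitchenPC user=Website password=secret',
    'select * from Foo'
) AS t(id integer);   -- hypothetical column list; replace with Foo's actual columns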
Re: [GENERAL] dblink does not resolve DNS, but works with IP
Though I'm a bit curious why there's both a host and a hostaddr. Why can't it just resolve whatever you give it? On Tue, May 14, 2013 at 9:31 PM, Mike Christensen m...@kitchenpc.com wrote: Excellent! Thanks so much.
[GENERAL] Where in the source code does postgres write to disk?
According to http://www.postgresql.org/docs/9.2/static/storage-file-layout.html: When a table or index exceeds 1 GB, it is divided into gigabyte-sized segments. The first segment's file name is the same as the filenode; subsequent segments are named filenode.1, filenode.2, etc. I was wondering where in the source code this is dealt with. I have been searching for the last few hours but have had no luck. Any help guiding me to the location in the source code where postgres writes the buffer to disk would be greatly appreciated. Thanks
[GENERAL] Fetching Server configured port from C Module
Hi there, I'm having a bit of an issue finding a C function to fetch the configured server port from a C module. We have written a C module that allows remote clients to call a function to run pg_dump/pg_restore remotely but create the files locally on the db server. Currently it works fine if the client connects through a network socket, as we're using inet_server_port to get the port to pass on to pg_dump/pg_restore. But if the client is connected through a unix socket (actually a remote client connecting to pgbouncer, which connects to postgres through the unix socket), inet_server_port is null. I've looked for a function that we can use to get the configured server port but haven't had any luck. I could hard-code the port in the module when we build it, but it would be nice to be able to change the configured postgres port and not have to rebuild the module. Anyone have any suggestions? I've posted the code for our backup module here: http://pastebin.com/wQ6VidWn
Re: [GENERAL] Fetching Server configured port from C Module
SHOW PORT;

test=# SELECT setting FROM pg_settings WHERE name = 'port';
 setting
---------
 5432

Both of these are from a query context. This is in a C module; I suppose I could run a query, but there has to be a direct C function to get this data.
Re: [GENERAL] Fetching Server configured port from C Module
Perfect, thanks Bruce, that worked. I just extern'd PostPortNumber in my module and everything seems to be working. --Mike
Re: [GENERAL] how to create materialized view in postgresql 8.3
This is the number one requested feature on Uservoice: http://postgresql.uservoice.com/forums/21853-general/suggestions/247548-materialized-views On Mon, Apr 8, 2013 at 9:27 AM, John R Pierce pie...@hogranch.com wrote: On 4/7/2013 11:58 PM, Zahid Quadri wrote: is it possible to create a materialized view in postgresql 8.3? if yes, please provide some sample. in older versions, the best you could do was to create a table and populate it with your 'view', then drop it when you're done (or truncate and repopulate it to update the 'view'). -- john r pierce 37N 122W somewhere on the middle of the left coast
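A minimal sketch of the create-and-repopulate approach John describes, which works on 8.3; the table and query here (order_totals, orders, customer_id, amount) are hypothetical placeholders:

-- Build the "materialized view" once:
CREATE TABLE order_totals AS
SELECT customer_id, sum(amount) AS total
FROM orders
GROUP BY customer_id;

CREATE INDEX order_totals_customer_id_idx ON order_totals (customer_id);

-- Refresh it whenever the source data changes:
BEGIN;
TRUNCATE order_totals;
INSERT INTO order_totals
SELECT customer_id, sum(amount) AS total
FROM orders
GROUP BY customer_id;
COMMIT;

Wrapping the refresh in a transaction keeps readers from ever seeing an empty table, at the cost of an exclusive lock on order_totals while the INSERT runs.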