[GENERAL] Two instances of Postgres with single data directory
All, I need to know whether it is possible to run two instances of Postgres with a single data directory shared between the two instances. This is to find out whether we can achieve an Oracle RAC-like cluster for Postgres. Regards Dina -- View this message in context: http://postgresql.nabble.com/Two-instances-of-Postgres-with-single-data-directory-tp5826788.html Sent from the PostgreSQL - general mailing list archive at Nabble.com. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] After insert trigger not work
Hello! I use PostgreSQL 9.3.5 on Windows 7 x64. Trigger should update data in table:

CREATE TABLE trassa.ram_free_stat (
    id serial NOT NULL,
    device integer NOT NULL,
    min_value integer NOT NULL,
    avg_value integer NOT NULL DEFAULT 0,
    max_value integer NOT NULL,
    last_update timestamp without time zone NOT NULL DEFAULT now(),
    CONSTRAINT ram_free_stat_pk PRIMARY KEY (id),
    CONSTRAINT ram_free_stat_device_fk FOREIGN KEY (device)
        REFERENCES trassa.devices (id) MATCH SIMPLE
        ON UPDATE NO ACTION ON DELETE NO ACTION,
    CONSTRAINT ram_free_stat_max_fk FOREIGN KEY (max_value)
        REFERENCES trassa.ram (id) MATCH SIMPLE
        ON UPDATE NO ACTION ON DELETE NO ACTION,
    CONSTRAINT ram_free_stat_min_fk FOREIGN KEY (min_value)
        REFERENCES trassa.ram (id) MATCH SIMPLE
        ON UPDATE NO ACTION ON DELETE NO ACTION
)

My trigger:

CREATE OR REPLACE FUNCTION trassa.update_ram_free_stat()
RETURNS trigger AS
$BODY$
DECLARE
    device_id INTEGER DEFAULT 0;
    min_id INTEGER DEFAULT 0;
    avg_val INTEGER DEFAULT 0;
    max_id INTEGER DEFAULT 0;
BEGIN
    SELECT id INTO device_id FROM trassa.ram_free_stat
        WHERE device = NEW.device;
    SELECT free_ram, id INTO min_id FROM trassa.ram
        WHERE device = NEW.device ORDER BY free_ram LIMIT 1;
    SELECT free_ram, id INTO max_id FROM trassa.ram
        WHERE device = NEW.device ORDER BY free_ram DESC LIMIT 1;
    SELECT CEIL(AVG(free_ram)) INTO avg_val FROM trassa.ram
        WHERE device = NEW.device;
    IF device_id <> 0 THEN
        UPDATE trassa.ram_free_stat
            SET min_value = min_id, avg_value = avg_val, max_value = max_id
            WHERE id = device_id;
    ELSE
        INSERT INTO trassa.ram_free_stat (device, min_value, avg_value, max_value)
            VALUES (NEW.device, min_id, avg_val, max_id);
    END IF;
    RETURN NULL;
END;
$BODY$
LANGUAGE plpgsql VOLATILE SECURITY DEFINER COST 100;

I add this trigger on another table:

CREATE TRIGGER update_ram_free_stat_trigger
    AFTER INSERT ON trassa.ram
    FOR EACH ROW EXECUTE PROCEDURE trassa.update_ram_free_stat();

All executed without any error, but data in trassa.ram_free_stat is not updated. Can you help me solve this problem?
Thank you and excuse me for my bad english. -- Best regards, Brilliantov Kirill Vladimirovich
Re: [GENERAL] Two instances of Postgres with single data directory
On Thu, Nov 13, 2014 at 9:17 PM, dineshkaarthick dineshkaarth...@yahoo.co.uk wrote: All, I need to know is it possible to execute two instances of Postgres with single data directory shared between the two instances. No, a server instance cannot run on a data folder being used by an existing instance. -- Michael
Re: [GENERAL] sepgsql where are the security labels
This table maintains information about the context of postgresql objects not the data in tables. On Wed, Nov 12, 2014 at 5:56 PM, Adrian Klaver adrian.kla...@aklaver.com wrote: On 11/12/2014 02:45 PM, Ted Toth wrote: I'm running selinux mls policy I've got labeled ipsec working and my postgresql configured to load sepgsql. I've created a db, run the sepgsql.sql script on it, created tables and inserted data. How do I query the security labels on the data? I do not use SECURITY LABELS, but it seems they can be queried here: http://www.postgresql.org/docs/9.3/interactive/view-pg-seclabels.html As best I can tell there is no security_context column on either of the tables I've created that I see? How does the system column security_context get added to tables? I've read everything I can find on the web but a lot of it is dated. Here's how I'm creating my db and tables: -- Adrian Klaver adrian.kla...@aklaver.com
Re: [GENERAL] After insert trigger not work
Melvin Davidson wrote on 11/13/2014 05:29 PM: I suspect your problem is because you have 6 columns that are NOT NULL, but on INSERT you are only supplying values for 4 columns plus the id(serial). Therefore, the INSERT will fail. Perhaps if you supplied a value for last_update it will work a lot better. Hello, Melvin! Why is this a problem, if the last_update column is created with a default value? Trigger should update data in table: CREATE TABLE trassa.ram_free_stat ( id serial NOT NULL, device integer NOT NULL, min_value integer NOT NULL, avg_value integer NOT NULL DEFAULT 0, max_value integer NOT NULL, last_update timestamp without time zone NOT NULL DEFAULT now(), CONSTRAINT ram_free_stat_pk PRIMARY KEY (id), CONSTRAINT ram_free_stat_device_fk FOREIGN KEY (device) REFERENCES trassa.devices (id) MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION, CONSTRAINT ram_free_stat_max_fk FOREIGN KEY (max_value) REFERENCES trassa.ram (id) MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION, CONSTRAINT ram_free_stat_min_fk FOREIGN KEY (min_value) REFERENCES trassa.ram (id) MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION ) -- Best regards, Brilliantov Kirill Vladimirovich
[GENERAL] Data corruption
We appear to have had some corruption on a customer's postgres cluster. They are on 9.0.17, 32-bit, Windows Server 2003 - Service pack 2, Intel Xeon 2.66GHz, 4GB memory. The RAID doesn't look too well: it currently has status Degraded, and on the Segments tab it's showing Segment 1 (Missing), so I guess we can assume the issue is down to hardware. An engineer has been dispatched to replace the hardware and we are arranging to have the cluster shut down and backed up to a separate storage device. Their postgresql.conf file is pretty much as it comes, with only the following line added to the end:

custom_variable_classes = 'user_vars'

Everything was fine until 13:28 on 7th November, when there were a number of these entries in the log:

2014-11-07 13:28:45 GMT WARNING: worker took too long to start; cancelled

After that the log file was cycled and it started with:

2014-11-07 14:15:19 GMT FATAL: the database system is starting up
2014-11-07 14:15:20 GMT FATAL: the database system is starting up
2014-11-07 14:15:20 GMT LOG: database system was interrupted; last known up at 2014-11-07 13:28:42 GMT
2014-11-07 14:15:21 GMT FATAL: the database system is starting up
2014-11-07 14:15:22 GMT FATAL: the database system is starting up
2014-11-07 14:15:23 GMT FATAL: the database system is starting up
2014-11-07 14:15:23 GMT LOG: database system was not properly shut down; automatic recovery in progress
2014-11-07 14:15:23 GMT LOG: record with zero length at 5/7B4CAC0
2014-11-07 14:15:23 GMT LOG: redo is not required
2014-11-07 14:15:24 GMT FATAL: the database system is starting up
2014-11-07 14:15:25 GMT FATAL: the database system is starting up
2014-11-07 14:15:25 GMT LOG: database system is ready to accept connections
2014-11-07 14:15:25 GMT LOG: autovacuum launcher started
2014-11-07 14:15:33 GMT LOG: unexpected EOF on client connection

Since then, whenever trying to write to or query one particular table, we receive the following:

2014-11-07 15:13:57 GMT ERROR: invalid page header in block 29838 of relation base/16392/640564

It's always the same error (block and relation) as far as I can tell. So the question is, what next? We may have lost data as it couldn't be written, but it's not the end of the world. The more important bit would be to stop any further data loss.

Regards,
Russell Keane
INPS
Tel: +44 (0)20 7501 7277
Follow us on twitter (https://twitter.com/INPSnews) | visit www.inps.co.uk
Registered name: In Practice Systems Ltd. Registered address: The Bread Factory, 1a Broughton Street, London, SW8 3QJ. Registered Number: 1788577. Registered in England.
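[Editorial note, not from the thread: once the hardware is stable and a verified file-level backup exists, the commonly suggested way past a reproducible "invalid page header" error is the zero_damaged_pages setting. A sketch; the table name is a placeholder, and this irrevocably destroys whatever was on the damaged page:]

```sql
-- DESTRUCTIVE: any rows on the damaged page are lost for good.
-- Requires superuser; take a verified backup first.
SET zero_damaged_pages = on;
VACUUM damaged_table;        -- placeholder name; reads every page, zeroing the bad one
SET zero_damaged_pages = off;
```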
Re: [GENERAL] After insert trigger not work
On 11/13/2014 04:27 AM, Brilliantov Kirill Vladimirovich wrote: Hello! I use postgre-9.3.5 on windows7 x64. Trigger should update data in table: CREATE TABLE trassa.ram_free_stat ( id serial NOT NULL, device integer NOT NULL, min_value integer NOT NULL, avg_value integer NOT NULL DEFAULT 0, max_value integer NOT NULL, last_update timestamp without time zone NOT NULL DEFAULT now(), CONSTRAINT ram_free_stat_pk PRIMARY KEY (id), CONSTRAINT ram_free_stat_device_fk FOREIGN KEY (device) REFERENCES trassa.devices (id) MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION, CONSTRAINT ram_free_stat_max_fk FOREIGN KEY (max_value) REFERENCES trassa.ram (id) MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION, CONSTRAINT ram_free_stat_min_fk FOREIGN KEY (min_value) REFERENCES trassa.ram (id) MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION ) My trigger: CREATE OR REPLACE FUNCTION trassa.update_ram_free_stat() RETURNS trigger AS $BODY$ DECLARE device_id INTEGER DEFAULT 0; min_id INTEGER DEFAULT 0; avg_val INTEGER DEFAULT 0; max_id INTEGER DEFAULT 0; BEGIN SELECT id INTO device_id FROM trassa.ram_free_stat WHERE device = NEW.device; SELECT free_ram, id INTO min_id FROM trassa.ram WHERE device = NEW.device ORDER BY free_ram LIMIT 1;

In the above and below you are selecting two column values into one integer variable, so you may not be getting what you think you are:

test=# DO $$
DECLARE
    var_1 integer DEFAULT 0;
BEGIN
    SELECT 10, 1 INTO var_1;
    RAISE NOTICE '%', var_1;
END;
$$ LANGUAGE plpgsql;
NOTICE: 10
DO

Or maybe you are; it is not clear what the variables are supposed to hold. From the name I would say the device id, from how they are used below I would say the free ram values.
SELECT free_ram, id INTO max_id FROM trassa.ram WHERE device = NEW.device ORDER BY free_ram DESC LIMIT 1; SELECT CEIL(AVG(free_ram)) INTO avg_val FROM trassa.ram WHERE device = NEW.device; IF device_id <> 0 THEN UPDATE trassa.ram_free_stat SET min_value = min_id, avg_value = avg_val, max_value = max_id WHERE id = device_id; ELSE INSERT INTO trassa.ram_free_stat (device, min_value, avg_value, max_value) VALUES(NEW.device, min_id, avg_val, max_id); END IF; RETURN NULL; END; $BODY$ LANGUAGE plpgsql VOLATILE SECURITY DEFINER COST 100; I add this trigger on another table: CREATE TRIGGER update_ram_free_stat_trigger AFTER INSERT ON trassa.ram FOR EACH ROW EXECUTE PROCEDURE trassa.update_ram_free_stat(); All executed without any error, but data in trassa.ram_free_stat not updated. Can you help me solve this problem? You might want to put some RAISE NOTICEs in your function to track what is going on: http://www.postgresql.org/docs/9.3/interactive/plpgsql-errors-and-messages.html Thank you and excuse me for my bad english. -- Adrian Klaver adrian.kla...@aklaver.com
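[Editorial note: if the intent of those two selects was to store the ids of the trassa.ram rows with the smallest and largest free_ram, matching the min_value/max_value foreign keys into trassa.ram, a minimal corrected sketch of just those statements would be:]

```sql
SELECT id INTO min_id FROM trassa.ram
    WHERE device = NEW.device ORDER BY free_ram ASC LIMIT 1;
SELECT id INTO max_id FROM trassa.ram
    WHERE device = NEW.device ORDER BY free_ram DESC LIMIT 1;
```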
Re: [GENERAL] sepgsql where are the security labels
On 11/13/2014 05:58 AM, Ted Toth wrote: This table maintains information about the context of postgresql objects not the data in tables. http://www.slideshare.net/kaigai/label-based-mandatory-access-control-on-postgresql Slide 23 On Wed, Nov 12, 2014 at 5:56 PM, Adrian Klaver adrian.kla...@aklaver.com wrote: On 11/12/2014 02:45 PM, Ted Toth wrote: I'm running selinux mls policy I've got labeled ipsec working and my postgresql configured to load sepgsql. I've created a db, run the sepgsql.sql script on it, created tables and inserted data. How do I query the security labels on the data? I do not use SECURITY LABELS, but it seems they can be queried here: http://www.postgresql.org/docs/9.3/interactive/view-pg-seclabels.html As best I can tell there is no security_context column on either of the tables I've created that I see? How does the system column security_context get added to tables? I've read everything I can find on the web but a lot of it is dated. Here's how I'm creating my db and tables: -- Adrian Klaver adrian.kla...@aklaver.com -- Adrian Klaver adrian.kla...@aklaver.com
Re: [GENERAL] sepgsql where are the security labels
On 11/13/2014 05:58 AM, Ted Toth wrote: This table maintains information about the context of postgresql objects not the data in tables. To follow up, an expanded explanation of the security_label column: https://wiki.postgresql.org/wiki/SEPostgreSQL_Architecture#The_security_label_system_column -- Adrian Klaver adrian.kla...@aklaver.com
Re: [GENERAL] sepgsql where are the security labels
Exactly what I'm talking about ... but unfortunately that appears to have been based on KaiGai's branch and is not in 9.3. The current discussion/work is around row-level security with patches to 9.5, which is not much help to me now :( On Thu, Nov 13, 2014 at 9:26 AM, Adrian Klaver adrian.kla...@aklaver.com wrote: On 11/13/2014 05:58 AM, Ted Toth wrote: This table maintains information about the context of postgresql objects not the data in tables. http://www.slideshare.net/kaigai/label-based-mandatory-access-control-on-postgresql Slide 23 On Wed, Nov 12, 2014 at 5:56 PM, Adrian Klaver adrian.kla...@aklaver.com wrote: On 11/12/2014 02:45 PM, Ted Toth wrote: I'm running selinux mls policy I've got labeled ipsec working and my postgresql configured to load sepgsql. I've created a db, run the sepgsql.sql script on it, created tables and inserted data. How do I query the security labels on the data? I do not use SECURITY LABELS, but it seems they can be queried here: http://www.postgresql.org/docs/9.3/interactive/view-pg-seclabels.html As best I can tell there is no security_context column on either of the tables I've created that I see? How does the system column security_context get added to tables? I've read everything I can find on the web but a lot of it is dated. Here's how I'm creating my db and tables: -- Adrian Klaver adrian.kla...@aklaver.com -- Adrian Klaver adrian.kla...@aklaver.com
Re: [GENERAL] sepgsql where are the security labels
On 11/13/2014 07:37 AM, Ted Toth wrote: Exactly what I'm talking about ... but unfortunately that appears to have been based on KaiGai's branch and is not in 9.3. The current discussion/work is around row-level security with patches to 9.5 which is not much help to me now :( Then my previous post would not be of much help either. I do not have --selinux on my instances, so I have no way of testing. I'm afraid I am out of ideas. -- Adrian Klaver adrian.kla...@aklaver.com
Re: [GENERAL] Modeling Friendship Relationships
On Nov 11, 2014, at 5:38 PM, Robert DiFalco wrote: Thoughts? Do I just choose one or is there a clear winner? TIA! I prefer this model:

user_id__a INT NOT NULL REFERENCES user(id),
user_id__b INT NOT NULL REFERENCES user(id),
is_reciprocal BOOLEAN,
primary key (user_id__a, user_id__b)

if a relationship is confirmed (or dropped) I toggle is_reciprocal. having that value saves a lot of work doing joins or analyzing friendship sets. if you have multiple relationship types, then things get tricky. you can either:

- treat the row as a triplet (user_id__a, user_id__b, relationship_type_id) [i still recommend the reciprocal bool]
- if you have a finite set of relationship types, you could just use each one as a bool column within the a2b row

I've tried doing the one row per relationship approach, and didn't like it. the time savings on simple searches were marginally faster, but the sql was increasingly more complex and slower to execute as we leveraged the table into other queries.
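[Editorial note: the columns above, written out as complete DDL. The table name is an assumption, and `user` must be quoted since it is a reserved word in Postgres:]

```sql
CREATE TABLE friendship (
    user_id__a    INT NOT NULL REFERENCES "user"(id),
    user_id__b    INT NOT NULL REFERENCES "user"(id),
    is_reciprocal BOOLEAN NOT NULL DEFAULT false,
    PRIMARY KEY (user_id__a, user_id__b)
);

-- when b confirms (or drops) the relationship initiated by a:
UPDATE friendship
   SET is_reciprocal = true                -- or false on a drop
 WHERE user_id__a = 1 AND user_id__b = 2;  -- ids are placeholders
```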
[GENERAL] troubleshooting a database that keeps locking up
I have a database that has started to constantly hang after a brief period of activity. Looking at `select * from pg_stat_activity;` I roughly see the following each time:

process 1 | IDLE
process 2 | IDLE in transaction
process 3 | IDLE in transaction
process 4 | IDLE
process 5 | IDLE
process 6 | IDLE
process 7 | INSERT INTO table_a RETURNING id

occasionally I'll see:

process 8 | UPDATE table_b

Does anyone have tips on how I can troubleshoot this? I was hoping there would be some way to show the history of the IDLE in transaction processes, but I couldn't find them. I was also wondering if the RETURNING id might have something to do with this. I'd appreciate any pointers in trying to figure out what is causing this.
Re: [GENERAL] troubleshooting a database that keeps locking up
Jonathan Vanasco postg...@2xlp.com writes: I have a database that has started to constantly hang after a brief period of activity looking at `select * from pg_stat_activity;` I roughly see the following each time: process 1 | IDLE process 2 | IDLE in transaction process 3 | IDLE in transaction process 4 | IDLE process 5 | IDLE process 6 | IDLE process 7 | INSERT INTO table_a RETURNING id Does anyone have tips on how I can troubleshoot this. It's a reasonable bet that process 7 is blocked waiting for a lock that's held by one of the idle-in-transaction processes. You could look into pg_locks to confirm that. I was hoping there would be some way to show the history of the IDLE in transaction processes, but I couldn't find them. No, PG doesn't track that. You could turn on log_statement = all and then look to the postmaster log to see what those processes had been doing. (You'll probably want to set log_line_prefix to at least %p if you haven't already.) regards, tom lane
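[Editorial note: a sketch of the pg_locks check Tom suggests, pairing ungranted lock requests with sessions holding a matching granted lock. It assumes 9.2+, where pg_stat_activity has pid/state/query columns:]

```sql
SELECT waiting.pid        AS blocked_pid,
       waiting_act.query  AS blocked_query,
       holder.pid         AS blocking_pid,
       holder_act.state   AS blocking_state
FROM pg_locks waiting
JOIN pg_locks holder
  ON  holder.locktype = waiting.locktype
  AND holder.database      IS NOT DISTINCT FROM waiting.database
  AND holder.relation      IS NOT DISTINCT FROM waiting.relation
  AND holder.transactionid IS NOT DISTINCT FROM waiting.transactionid
  AND holder.pid <> waiting.pid
JOIN pg_stat_activity waiting_act ON waiting_act.pid = waiting.pid
JOIN pg_stat_activity holder_act  ON holder_act.pid  = holder.pid
WHERE NOT waiting.granted
  AND holder.granted;
```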
[GENERAL] Are there any downsides to using postgres' data directory on a dedicated drive/partition / filesystem?
I'm running postgres on a virtual server. I was wondering if there were any known issues with moving the data directory to another mounted partition / filesystem.
Re: [GENERAL] Are there any downsides to using postgres' data directory on a dedicated drive/partition / filesystem?
On Thu, 13 Nov 2014 14:44:22 -0500 Jonathan Vanasco postg...@2xlp.com wrote: I'm running postgres on a virtual server I was wondering if there were any known issues with moving the data directory to another mounted partition / filesystem. At my previous job the company standard was that databases went under /var/db, so all our PostgreSQL servers used /var/db/postgres /var/db was also usually a dedicated mount point connected to a fast RAID-10 drive array. Never had any trouble with it. -- Bill Moran I need your help to succeed: http://gamesbybill.com
Re: [GENERAL] Are there any downsides to using postgres' data directory on a dedicated drive/partition / filesystem?
On 11/13/2014 11:44 AM, Jonathan Vanasco wrote: I'm running postgres on a virtual server I was wondering if there were any known issues with moving the data directory to another mounted partition / filesystem. I do that all the time. I avoid using nfs/smb type shares for data directories, but any SAN or direct attach file system is fine. I'll either mount the dedicated data partition as /var/lib/pgsql/9.3/data, or I'll update the startup configuration to force the data dir to be in another path, like /u01/pgsql/9.3/data -- john r pierce 37N 122W somewhere on the middle of the left coast
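[Editorial note: on packaged installs, the "startup configuration" route is often just the data_directory setting, which takes effect on server restart. A sketch; the config file path varies by packaging and is an assumption, the data path is John's example:]

```
# in postgresql.conf (e.g. /etc/postgresql/9.3/main/postgresql.conf
# on Debian-style installs); requires a server restart
data_directory = '/u01/pgsql/9.3/data'
```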
[GENERAL] Re: Are there any downsides to using postgres' data directory on a dedicated drive/partition / filesystem?
Jonathan Vanasco-7 wrote I'm running postgres on a virtual server I was wondering if there were any known issues with moving the data directory to another mounted partition / filesystem. Define "moving". Also, it is recommended to separate system logs, WAL, and data onto separate drives where possible. Are you wanting to just move data or all of these things? David J.
Re: [GENERAL] Re: Are there any downsides to using postgres' data directory on a dedicated drive/partition / filesystem?
On 11/13/2014 1:08 PM, David G Johnston wrote: Also, it is recommended to separate system logs, WAL, and data onto separate drives where possible. Are you wanting to just move data or all of these things? I find putting all the disks in one big happy raid10 and leaving everything on the same file system works just fine and achieves evenly balanced IO. -- john r pierce 37N 122W somewhere on the middle of the left coast
Re: [GENERAL] Are there any downsides to using postgres' data directory on a dedicated drive/partition / filesystem?
Jonathan Vanasco postg...@2xlp.com writes: I'm running postgres on a virtual server I was wondering if there were any known issues with moving the data directory to another mounted partition / filesystem. You can put the database directory anywhere you want. Obviously, it has to be trustworthy storage, and in that respect I'd caution against soft-mounted storage. There's a disaster story in the PG list archives somewhere (quite some time back, maybe 10 years ago) about somebody who had their database on NFS storage. One day the NFS server was a bit slow to mount after a reboot and didn't come up until after the postgres server start script had launched. This was one of those helpful vendor-supplied start scripts that would automatically run initdb if the data directory wasn't there. So it started up, looked at $PGDATA and didn't find anything, and launched initdb. initdb faithfully checked that $PGDATA was empty (still true) and set about creating files. Right after that, the NFS server finally comes online ... and now initdb is scribbling over the system catalogs of the live database. Needless to say, that did not end well. Disconnecting storage from under an already-running postmaster will possibly spoil your day too, although it should be reasonably survivable as long as it wasn't the WAL logs you dismounted. So, while you can do what you like, it behooves you to think about possible failure modes anytime you choose to put the database directory (or a tablespace directory) on any dismountable storage. regards, tom lane
[GENERAL] hstore, but with fast range comparisons?
I want to do something that is perfectly satisfied by an hstore column. *Except* that I want to be able to do fast (ie indexed) <, >, etc comparisons, not just equality. From what I can tell, there isn’t really any way to get hstore to do this, so I’ll have to go to a key-value table. But I thought I’d ask just in case I’m missing something.
Re: [GENERAL] Are there any downsides to using postgres' data directory on a dedicated drive/partition / filesystem?
Thanks, everyone! For now this will be provisioning a physical drive for a box -- and everything will be there for now. So OS on one drive, and DB on another. I've run into programs before (mostly on Mac/Win) that are exceedingly unhappy if they're run on a drive other than the OS. Since many people partition data and services under pg, I figured it would be okay -- but I couldn't find anything in the docs and wanted to check.
Re: [GENERAL] hstore, but with fast range comparisons?
On 11/13/2014 3:46 PM, Guyren Howe wrote: I want to do something that is perfectly satisfied by an hstore column. *Except* that I want to be able to do fast (ie indexed) <, >, etc comparisons, not just equality. From what I can tell, there isn’t really any way to get hstore to do this, so I’ll have to go to a key-value table. But I thought I’d ask just in case I’m missing something. I think you're missing something. Is it one field in the hstore? Did you try an expression index? create index cars_mph on cars ( (data->'mph') ); thats a btree index, which should support < and >. (Although I've never tried it) -Andy
Re: [GENERAL] hstore, but with fast range comparisons?
Andy Colson wrote On 11/13/2014 3:46 PM, Guyren Howe wrote: I want to do something that is perfectly satisfied by an hstore column. *Except* that I want to be able to do fast (ie indexed) <, >, etc comparisons, not just equality. From what I can tell, there isn’t really any way to get hstore to do this, so I’ll have to go to a key-value table. But I thought I’d ask just in case I’m missing something. I think you're missing something. Is it one field in the hstore? Did you try an expression index? create index cars_mph on cars ( (data->'mph') ); thats a btree index, which should support < and >. (Although I've never tried it) With the one caveat that everything in hstore is a string so you'd probably want to add an appropriate cast to the expression. David J.
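[Editorial note: putting Andy's expression index and David's cast caveat together; cars, data, and 'mph' are the hypothetical names from upthread:]

```sql
-- hstore values are text, so cast inside the index expression
CREATE INDEX cars_mph ON cars ( ((data->'mph')::int) );

-- use the same cast in queries so the btree index can be matched
SELECT * FROM cars WHERE (data->'mph')::int > 150;
```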
[GENERAL] Performance issue with libpq prepared queries on 9.3 and 9.4
I have hit a rather odd issue with prepared queries on both pg 9.3 and 9.4 beta. I have this table (copy at http://samsaffron.com/testing.db.gz) with a very odd performance profile: When I run the following prepared query it is running significantly slower than the raw counterpart:

```
select * from topics where archetype = $1 limit 1
```

Full test harness here:

```
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <sys/time.h>
#include <time.h>
#include <libpq-fe.h>

static void
exit_nicely(PGconn *conn)
{
    PQfinish(conn);
    exit(1);
}

static double
get_wall_time()
{
    struct timeval time;
    if (gettimeofday(&time, NULL)) {
        // Handle error
        return 0;
    }
    return (double)time.tv_sec + (double)time.tv_usec * .000001;
}

int
main(int argc, char **argv)
{
    const char *conninfo;
    PGconn     *conn;
    PGresult   *res;
    PGresult   *stmt;
    int         i;
    Oid         textOid = 25;
    char       *paramValues[1];
    int         paramLengths[1];

    paramLengths[0] = 6;
    paramValues[0] = "banner";

    if (argc > 1)
        conninfo = argv[1];
    else
        conninfo = "dbname = testing";

    printf("connecting database\n");

    /* Make a connection to the database */
    conn = PQconnectdb(conninfo);

    /* Check to see that the backend connection was successfully made */
    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "Connection to database failed: %s",
                PQerrorMessage(conn));
        exit_nicely(conn);
    }

    stmt = PQprepare(conn, "test",
                     "select * from topics where archetype = $1 limit 1",
                     1, &textOid);
    printf("prepared statement\n");

    double start = get_wall_time();
    for (i = 0; i < 2000; i++)
    {
        res = PQexecPrepared(conn, "test", 1, paramValues, paramLengths,
                             NULL, 0);
        if (PQresultStatus(res) != PGRES_TUPLES_OK)
        {
            fprintf(stderr, "command failed: %s", PQerrorMessage(conn));
            PQclear(res);
            exit_nicely(conn);
        }
        PQclear(res);
    }
    double finish = get_wall_time();
    fprintf(stderr, "Prepared %f \n", (finish - start));

    start = get_wall_time();
    for (i = 0; i < 2000; i++)
    {
        res = PQexec(conn,
                     "select * from topics where archetype = 'banner' limit 1");
        if (PQresultStatus(res) != PGRES_TUPLES_OK)
        {
            fprintf(stderr, "command failed: %s", PQerrorMessage(conn));
            PQclear(res);
            exit_nicely(conn);
        }
        PQclear(res);
    }
    finish = get_wall_time();
    fprintf(stderr, "raw %f \n", (finish - start));

    /* close the connection to the database and cleanup */
    PQfinish(conn);
    return 0;
}
```

Results:

```text
sam@ubuntu pq_play % cc -o play -I/usr/include/postgresql play.c -lpq -L/usr/include/postgresql/libpq
./play
connecting database
prepared statement
Prepared 9.936938
raw 1.369071
```

So my prepared counterpart is running at an 8th the speed of the raw version. I had a nightmare of a time generating a workload that exhibits the issue via script but managed to anonymise the data, which is linked above. Very strangely, when I run the query in psql it does not exhibit the issue:

```text
sam@ubuntu pq_play % psql testing
psql (9.3.5)
Type "help" for help.

testing=# prepare test as select * from topics where archetype = $1 limit 1;
PREPARE
testing=# explain analyze execute test('banner');
                                  QUERY PLAN
 Limit  (cost=0.29..651.49 rows=1 width=520) (actual time=0.983..0.983 rows=0 loops=1)
   ->  Index Scan using idx11 on topics  (cost=0.29..651.49 rows=1 width=520) (actual time=0.980..0.980 rows=0 loops=1)
         Index Cond: ((archetype)::text = 'banner'::text)
 Total runtime: 1.037 ms
(4 rows)

testing=# explain analyze select * from topics where archetype = 'banner' limit 1;
                                  QUERY PLAN
 Limit  (cost=0.29..651.49 rows=1 width=520) (actual time=0.642..0.642 rows=0 loops=1)
   ->  Index Scan using idx11 on topics  (cost=0.29..651.49 rows=1 width=520) (actual time=0.641..0.641 rows=0 loops=1)
         Index Cond: ((archetype)::text = 'banner'::text)
 Total runtime: 0.673 ms
(4 rows)
```

Something about running this from libpq is causing it to scan the table as opposed to scanning the index. Any idea why is this happening? (note: Rails 4.2 is moving to a pattern of using prepared statements far more often which is why I discovered this issue) ps.
also posted on our meta to keep track of it: https://meta.discourse.org/t/performance-issue-with-prepared-queries/22141

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Performance issue with libpq prepared queries on 9.3 and 9.4
Sam Saffron sam.saff...@gmail.com writes:
> I have hit a rather odd issue with prepared queries on both pg 9.3 and
> 9.4 beta. I have this table (copy at http://samsaffron.com/testing.db.gz)
> with a very odd performance profile:

Interesting case. The issue seems to be that your statistics look like this:

```text
select * from pg_stats where tablename = 'topics' and attname = 'archetype';
-[ RECORD 1 ]----------+--------------------------
schemaname             | public
tablename              | topics
attname                | archetype
inherited              | f
null_frac              | 0
avg_width              | 12
n_distinct             | 2
most_common_vals       | {private_message,regular}
most_common_freqs      | {0.604957,0.395043}
histogram_bounds       |
correlation            | 0.612985
most_common_elems      |
most_common_elem_freqs |
elem_count_histogram   |
```

That is, archetype consists of 60% 'private_message', 40% 'regular', and absolutely nothing else. So the condition archetype = 'banner' is very selective, and a plan built knowing that that is the parameter value will use the index:

```text
# explain select * from topics where archetype = 'banner' limit 1;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=0.29..651.49 rows=1 width=520)
   ->  Index Scan using idx11 on topics  (cost=0.29..651.49 rows=1 width=520)
         Index Cond: ((archetype)::text = 'banner'::text)
(3 rows)
```

However, that's still a pretty darn expensive indexscan, mainly because archetype is not the leading key ... if you care about the performance of this query, why don't you have an index to match?
```text
# create index on topics(archetype);
CREATE INDEX
# explain select * from topics where archetype = 'banner' limit 1;
                                         QUERY PLAN
--------------------------------------------------------------------------------------------
 Limit  (cost=0.29..6.80 rows=1 width=520)
   ->  Index Scan using topics_archetype_idx on topics  (cost=0.29..6.80 rows=1 width=520)
         Index Cond: ((archetype)::text = 'banner'::text)
(3 rows)
```

However, just fixing the index availability actually makes the performance ratio even worse, because the prepared query still doesn't use the index:

```text
# explain execute foo('banner');
                            QUERY PLAN
------------------------------------------------------------------
 Limit  (cost=0.00..0.11 rows=1 width=520)
   ->  Seq Scan on topics  (cost=0.00..1158.19 rows=10088 width=520)
         Filter: ((archetype)::text = $1)
(3 rows)
```

(Yes, you can get this result in psql, you just need to repeat the EXECUTE half a dozen times until it shifts to a generic plan.)

The problem here is that without knowledge of the comparison value, the planner assumes that it will probably be one of the two values that make up the table content. (After all, why would you query for something else?) On that basis, a seqscan will probably hit a matching row in no time, and so (with the LIMIT 1) it looks like a better deal than the indexscan.

We've talked about this type of problem before. Part of the issue is that costing of LIMIT doesn't apply any penalty for a bad worst-case scenario, and part of it is that the heuristics for choosing between custom and generic plans don't consider the possibility that the generic plan's estimated cost is way wrong for lack of knowledge of the comparison value. It's not real obvious how to improve either heuristic without probably making some cases worse.

One thing that occurs to me is that if the generic plan estimate comes out much cheaper than the custom one, maybe we should assume that the generic's cost estimate is bogus. Right offhand I can't think of a reason for a custom plan to look worse than a generic one, unless there's a statistical quirk like this one.
In the meantime, I assume that your real data contains a small percentage of values other than these two? If so, maybe cranking up the statistics target would help. If the planner knows that there are more than two values in the column, I think it would be less optimistic about assuming that the comparison value is one of the big two.

			regards, tom lane
Re: [HACKERS] Re: [GENERAL] Performance issue with libpq prepared queries on 9.3 and 9.4
On Thu, Nov 13, 2014 at 5:47 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> David G Johnston david.g.johns...@gmail.com writes:
>> Tom Lane-2 wrote
>>> In the meantime, I assume that your real data contains a small
>>> percentage of values other than these two? If so, maybe cranking up
>>> the statistics target would help. If the planner knows that there are
>>> more than two values in the column, I think it would be less
>>> optimistic about assuming that the comparison value is one of the big
>>> two.
>
>> Is there any value (or can value be added) in creating a partial index
>> of the form: archetype IN ('banner','some other rare value') such that
>> the planner will see that such a value is possible but infrequent and
>> will, in the presence of a plan using a value contained in the partial
>> index, refuse to use a generic plan knowing that it will be unable to
>> use the very specific index that the user created?
>
> The existence of such an index wouldn't alter the planner's statistics.
> In theory we could make it do so, but I seriously doubt the cost-benefit
> ratio is attractive, either as to implementation effort or the added
> planning cost.

[adding -general back in...]

While planner hints come to mind... on the SQL side, can we extend the PREPARE command with two additional keywords?

    PREPARE name [ ( data_type [, ...] ) ] [ [NO] GENERIC ] AS statement

I was originally thinking this could attach to EXECUTE, and maybe it could there as well. If EXECUTE is bare, whatever the PREPARE used would be in effect (a bare PREPARE exhibiting the current dynamic behavior). If EXECUTE and PREPARE disagree, EXECUTE wins and the current call is (re-)prepared as requested.

We have introduced intelligence to PREPARE/EXECUTE that is not always favorable, but provide little way to override it if the user has superior knowledge. The dual role of prepared statements, to both prevent SQL injection and create cacheable generic plans, further complicates things.
In effect by supplying NO GENERIC on the PREPARE the caller is saying they only wish to make use of the SQL-injection aspect of prepared statements. Adding the EXECUTE piece allows for the same plan to be used in injection-prevention mode if the caller knows that the user-supplied value does not play well with the generic plan. David J.
Re: [HACKERS] Re: [GENERAL] Performance issue with libpq prepared queries on 9.3 and 9.4
David Johnston david.g.johns...@gmail.com writes:
> While planner hints come to mind... on the SQL side, can we extend the
> PREPARE command with two additional keywords?

>     PREPARE name [ ( data_type [, ...] ) ] [ [NO] GENERIC ] AS statement

Don't really see the point. The OP's problem is that the prepare is being driven by a client-side library, which would have even less clue than the backend as to whether a generic plan is more appropriate than a custom plan. The right thing here IMO is to improve the heuristics on the backend side. I'm sure we can do better, it's just going to take some thought.

			regards, tom lane
Re: [GENERAL] Performance issue with libpq prepared queries on 9.3 and 9.4
Thank you so much!

So to recap, the general way to reproduce this issue is:

```text
create table products(id int primary key, type varchar);
insert into products select generate_series(1,10000), 'aaa';
insert into products select generate_series(10001,20000), 'bbb';
create index idx on products(type);
prepare stmt as select * from products where type = $1 limit 1;
```

Which quickly devolves into:

```text
explain analyze execute stmt ('ccc');
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..0.03 rows=1 width=8) (actual time=1.821..1.821 rows=0 loops=1)
   ->  Seq Scan on products  (cost=0.00..339.00 rows=10000 width=8) (actual time=1.819..1.819 rows=0 loops=1)
         Filter: ((type)::text = $1)
         Rows Removed by Filter: 20000
 Total runtime: 1.843 ms
(5 rows)
```

So if I am searching for 'ccc', eventually the prepared plan "optimises" to the mechanism the statistics suggest is better: just scanning the table to find the first hit. However, a fairly common pattern I use is to check for the lack of presence of a value, for example: if the product type 'ccc' is not in the table, do this. Unfortunately the optimiser deoptimises this class of operation.

I tried the exact example above with an int instead of a varchar in the type column and was not able to reproduce the issue; I wonder if there is some sort of different handling for strings vs numbers.

Unfortunately my actual table in play has a rather bad schema; the archetype column really should be an int. That said, we only have 2 general archetypes at the moment (private message and topic) and the occasional single banner outlier, which may or may not be there. So the data modelling is pretty hostile to performance.

I have some ideas on how to solve my particular problem, but I do have some general concerns. Ruby on Rails is just about to ship a new version that heavily changes the mechanics of query execution. For example, Product.where(name: "foo").first will now result in a prepared query, whereas in the past it would just send the raw query.
Overall this approach is better and saner, but my general concern is that our API offers no escape hatch for these outlier conditions. You can disable prepared statements globally, but not just for a single call. I will raise this particular concern to the team.

My second question/concern is that I feel I am totally misunderstanding the changes to 'plancache.c': I thought that the decision of which plan to use was purely based on the value sent in to the prepared query. However, it seems that the planner completely ignores the value in some steps. (So, for example, I was thinking that 'aaa' and 'ccc' would result in completely different plans.)

Thank you so much for your time, patience and general awesomeness.

On Fri, Nov 14, 2014 at 11:34 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> Sam Saffron sam.saff...@gmail.com writes:
>> I have hit a rather odd issue with prepared queries on both pg 9.3 and
>> 9.4 beta. I have this table (copy at http://samsaffron.com/testing.db.gz)
>> with a very odd performance profile:
>
> Interesting case. The issue seems to be that your statistics look like
> this: [...] That is, archetype consists of 60% 'private_message', 40%
> 'regular', and absolutely nothing else.
> [...]